- Newsgroups: alt.lang.awk
- Path: sparky!uunet!cs.utexas.edu!news.uta.edu!hermes.chpc.utexas.edu!rshouman
- From: rshouman@chpc.utexas.edu (Radey Shouman)
- Subject: Re: tolower() function?
- Message-ID: <1993Jan22.012739.26965@chpc.utexas.edu>
- Organization: The University of Texas System - CHPC
- References: <1993Jan19.152725.3510@trentu.ca> <EMCOOP.93Jan19111755@bcars148.bnr.ca> <C184vr.p5q@austin.ibm.com>
- Date: Fri, 22 Jan 93 01:27:39 GMT
- Lines: 72
-
- In article <C184vr.p5q@austin.ibm.com> tbates@austin.ibm.com (Tom Bates)
- writes:
-
- >Quoting "The AWK Programming Language",
- >
- > "The cleanest way to do case conversion in awk is with an array that
- >maps each letter...it's better to use...tr"
- >
- >I often use tr for this very reason. For example:
- >
- > echo "HeLlO, wOrLd!" | tr [:upper:] [:lower:]
- >
- >This gives you:
- >
- > hello, world!
- >
- >To use tr from awk:
- >
- >#!/bin/awk -f
- > {
- > "echo "$0" | tr [:upper:] [:lower:]" | getline
- > print
- >}
-
- Note that this will give squirrelly results if handed any characters
- special to /bin/sh, like quotes, or <>|, or backslash, or backtick,
- or $, or ...
-
- In order to do this generally, one would have to quote the argument to
- echo in single quotes, e.g.:
-
- gsub(/'/, "'\\''", $0);
- "echo '"$0"' | tr [A-Z] [a-z]" | getline
-
- Doing this inside a shell script would make it look particularly ugly.
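For what it's worth, here is a rough sketch of the quoting approach wrapped in a shell function. The function name lower_via_tr is made up for illustration, printf stands in for echo (an assumption on my part, since some echo implementations interpret backslashes), and the awk program writes the single quote as the octal escape \047 so that it can sit inside shell single quotes. The close() call is there so the pipe is not left open after each line.

```sh
#!/bin/sh
# Hypothetical helper: lower-case each input line by handing it to tr
# from inside awk, single-quoting the line so /bin/sh leaves it alone.
lower_via_tr() {
    awk '
    BEGIN { q = "\047" }    # q holds one single-quote character
    {
        line = $0
        # Turn each single quote into: close quote, escaped quote, reopen quote.
        gsub(q, q "\\\\" q q, line)
        # Build: printf "%s\n" <quoted line> | tr "[:upper:]" "[:lower:]"
        cmd = "printf \"%s\\n\" " q line q " | tr \"[:upper:]\" \"[:lower:]\""
        if ((cmd | getline out) > 0)
            print out
        close(cmd)          # one subshell per line, closed again right away
    }'
}
```

Something like `printf '%s\n' "It's a TEST" | lower_via_tr` should then survive the embedded quote, though it still forks a subshell per line.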
-
- This approach, while undeniably clever, has a few other drawbacks:
-
- 1) It's really slow, if it has to be done repeatedly, since we fork a
- new subshell for each string we want to downcase.
-
- 2) It's very Un*x specific.
-
-
- tolower and toupper are such innocuously handy functions that I have
- trouble understanding why they weren't included in the original nawk.
-
- Here, just for fun, is an alternative implementation of tolower, in awk.
- It's probably slower than one based on a mapping array, but it was faster
- for me to type:
-
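- BEGIN {
-     # Each upper-case letter is immediately followed by its lower-case form.
-     Aa = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";
- }
-
- function tolower(str, s, i, c) {
-     i = 1;
-     s = str;
-     # Find the next upper-case letter at or after position i.
-     while (match(substr(s, i), /[A-Z]/))
-     {
-         i += RSTART - 1;
-         c = substr(s, i, 1);
-         # Look the letter up in Aa and take its lower-case neighbour.
-         c = substr(Aa, index(Aa,c) + 1, 1);
-         s = substr(s, 1, i - 1) c substr(s, i + 1);
-         i++;
-     }
-     return s;
- }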
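
For comparison, the mapping-array approach mentioned in the quoted passage might look roughly like the sketch below. The names downcase_lines, downcase, and map are invented for illustration, and the function is deliberately not called tolower, because POSIX awk and current implementations provide tolower as a built-in and will not accept a user function of that name. It handles only the 26 ASCII letters.

```sh
#!/bin/sh
# Hypothetical helper: lower-case each input line with a per-letter
# lookup table built once in BEGIN.
downcase_lines() {
    awk '
    BEGIN {
        upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        lower = "abcdefghijklmnopqrstuvwxyz"
        # map["A"] = "a", map["B"] = "b", and so on.
        for (k = 1; k <= 26; k++)
            map[substr(upper, k, 1)] = substr(lower, k, 1)
    }
    # Extra parameters after str serve as local variables.
    function downcase(str,    out, i, c) {
        out = ""
        for (i = 1; i <= length(str); i++) {
            c = substr(str, i, 1)
            # Characters without a table entry pass through unchanged.
            out = out ((c in map) ? map[c] : c)
        }
        return out
    }
    { print downcase($0) }'
}
```

Running `echo "HeLlO, wOrLd!" | downcase_lines` should print the lower-cased line without starting any subshell per string.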
-
- --Radey
- --
- Radey Shouman rshouman@chpc.utexas.edu
-