- Newsgroups: comp.lang.rexx
- Path: sparky!uunet!mcsun!sunic!fuw.edu.pl!cocos!jt
- From: jt@fuw.edu.pl (Jerzy Tarasiuk)
- Subject: Re: Lower-case alphabetic set
- In-Reply-To: Jack Hamilton's message of Thu, 7 Jan 1993 13:05:37 PST
- Message-ID: <JT.93Jan21152802@fizyk1.fuw.edu.pl>
- Sender: news@fuw.edu.pl
- Nntp-Posting-Host: fizyk1
- Organization: Warsaw University Physics Dept.
- References: <9301072105.AA26425@netcom.netcom.com>
- Date: Thu, 21 Jan 1993 14:28:02 GMT
- Lines: 33
-
- >>>>> On Thu, 7 Jan 1993 13:05:37 PST, Jack Hamilton <jfh@NETCOM.NETCOM.COM> said:
- Jack> Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Jack> Comments: To: REXXLIST@OHSTVMA.ACS.OHIO-STATE.EDU
-
- Jack> Jim McMaster wrote:
- >
- >Your technique would work in an ASCII system, because all upper-case
- >alphabetics are in the range X'41'-X'5A' (with no intermixed
- >characters), and lower-case alphabetics are X'61'-X'7A'.
-
- Jack> Only for business English. I don't think it would be true for poetic
- Jack> English, which uses some special accented characters (double-dot over the O
- Jack> in coordinate, backward slash over the last e in despised), and it's not
- Jack> true for many European languages with accented characters as part of the
- Jack> regular character set.
-
- Jack> Is there a generic name for those character sets, other than "8-bit"?
-
- In fact, translate() cannot be used on Double Byte Character Set
- strings (if that is the right name). You need to detect the character
- set escape character and handle what follows it separately. The simple
- expression containing two translate() calls and two xrange('a','z')
- should be limited to strings which don't contain double-byte
- characters. For best speed use:
-     parse var inp tmp 'code'x inp
- to detect the character set escape (I don't know what code it has;
- 'code'x stands for its hex value); then translate tmp:
-     out=out''translate(tmp,xrange('a','z'),translate(xrange('a','z')))
- Then use parse to split inp into one character and the remaining
- string, translate that one character (what are the rules for converting
- its case?) and add it, preceded by the escape code, to out. Repeat
- until all characters of inp have been processed.
- Note that if the last character of the input string is the escape
- code, parse gives the same result as when there is no escape code at
- all: you must check length(inp) and count the characters used...
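-
- The loop described above could be sketched roughly as follows. This is
- only an illustration under stated assumptions: the escape value '1B'x
- is a placeholder (the real code is not known here), exactly one
- character is assumed to follow each escape, and that character is
- copied unchanged because its case-conversion rules are unknown. The
- name lower_esc is made up for this sketch. It uses pos() instead of
- parse, so "no escape left" and "escape is the last character" can be
- told apart without counting characters.
-
```rexx
/* Lower-case a string, passing escaped characters through unchanged. */
/* ASSUMPTION: '1B'x is a placeholder; the real escape code is unknown. */
esc = '1B'x
say lower_esc('HELLO' || esc || 'X' || 'WORLD', esc)
exit

lower_esc: procedure
  parse arg inp, esc
  lower = xrange('a', 'z')
  upper = translate(lower)      /* one-argument translate() upper-cases */
  out = ''
  do while inp \== ''
    p = pos(esc, inp)
    if p = 0 then do            /* no escape left: translate the rest */
      out = out || translate(inp, lower, upper)
      leave
    end
    /* translate the plain text in front of the escape */
    out = out || translate(left(inp, p - 1), lower, upper)
    /* copy the escape and (if present) the one character after it;   */
    /* min() avoids blank padding when the escape is the last char    */
    out = out || substr(inp, p, min(2, length(inp) - p + 1))
    inp = substr(inp, p + 2)    /* empty string when nothing remains */
  end
  return out
```
-
- In the sample call the X that follows the escape stays upper-case,
- while HELLO and WORLD are lower-cased.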
-
- Jerzy Tarasiuk <jt@zfja-gate.fuw.edu.pl>
-