- Path: sparky!uunet!not-for-mail
- From: avg@rodan.UU.NET (Vadim Antonov)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Date: 1 Jan 1993 16:56:34 -0500
- Organization: UUNET Technologies Inc, Falls Church, VA
- Lines: 44
- Message-ID: <1i2emiINN2td@rodan.UU.NET>
- References: <1992Dec31.203101.5447@prl.dec.com> <1i0s05INNnfn@rodan.UU.NET> <1993Jan1.114158.17149@prl.dec.com>
- NNTP-Posting-Host: rodan.uu.net
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
-
- In article <1993Jan1.114158.17149@prl.dec.com> boyd@prl.dec.com (Boyd Roberts) writes:
- >In article <1i0s05INNnfn@rodan.UU.NET>, avg@rodan.UU.NET (Vadim Antonov) writes:
- >>
- >> A good encoding should support easy (i'd say natural) localization.
- >> It should provide simple algorithms for simple functions
- >> like getting string length, searching a character, case-insensitive
- >> comparison, lexicographical comparison.
- >>
- >
- >Well that's where you're wrong. The characters and how they are used
- >are distinct problems.
-
- Don't you realize that making even trivial programs ask which language
- they are operating in effectively defeats the entire purpose of
- Unicode? Should my shell ask me the language of every [a-z] in my
- commands? If it shouldn't, then it has to get that information somewhere,
- right? And if the information is kept outside the text (file names, in this case),
- then why do i need all those extra bits -- my program *already* knows the exact
- (small) alphabet.
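
To make the [a-z] point concrete, here is a minimal sketch (not part of the original post) of how raw code-point order and a code-point range behave on a small word list. The `fold` helper is a hypothetical stand-in for the language knowledge a program would have to get from somewhere outside the text; it is not real locale collation, only an accent-and-case-insensitive key built from the Python standard library.

```python
import re
import unicodedata


def fold(s: str) -> str:
    """Hypothetical sort key: drop combining accents, then ignore case."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# "\u00e9" is the precomposed e-acute, written as an escape to avoid ambiguity.
words = ["zebra", "Apple", "\u00e9clair", "apple"]

# Plain code-point order: the accented word lands after "zebra".
by_code_point = sorted(words)

# Order under the hypothetical folded key: the accented word sorts with "e".
by_folded_key = sorted(words, key=fold)

# A code-point range such as [a-z] skips both the capital and the accented letter.
matches_a_to_z = [w for w in words if re.fullmatch(r"[a-z]+", w)]

print(by_code_point)
print(by_folded_key)
print(matches_a_to_z)
```

The folded key is only one possible rule; a different language would want a different one, which is exactly the information the encoding by itself does not carry.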
-
- "Unicode -- a code for texts which will never be sorted!" Great.
-
- >UNICODE is
- >a good example of this: not only does it specify the code -> glyph
- >mapping (ie the encoding) it has support for left -> right, right -> left
- >writing styles and a bunch of other stuff, and this part of UNICODE is a mess.
-
- Yuck. Right->left is nothing more than a character with negative width.
-
- >Problem 2 (localisation) is damn hard.
-
- Tell me about it. I've spent ten years doing *real* localization and i know
- the price of ill-thought-out solutions at the ground level (aka character
- set ordering).
-
- >Should Problem 1 cater for the fact I type `localisation' whereas
- >you type `localization'? We're both using English, typed on American
- >keyboards (I guess, oops mine's made in West Germany) so where are you
- >going to draw the line. Is this Problem 1? I say it's Problem 2.
-
- The example is artificial and has nothing to do with character sets.
- As you are well aware, these are different words in the same alphabet.
-
- --vadim
-