- Path: sparky!uunet!not-for-mail
- From: avg@rodan.UU.NET (Vadim Antonov)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Date: 1 Jan 1993 02:31:17 -0500
- Organization: UUNET Technologies Inc, Falls Church, VA
- Lines: 29
- Message-ID: <1i0s05INNnfn@rodan.UU.NET>
- References: <8490@charon.cwi.nl> <1992Dec31.171450.1513@klaava.Helsinki.FI> <1992Dec31.203101.5447@prl.dec.com>
- NNTP-Posting-Host: rodan.uu.net
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
-
- In article <1992Dec31.203101.5447@prl.dec.com> boyd@prl.dec.com (Boyd Roberts) writes:
- >There are two problems:
- > 1. Getting an encoding of the characters.
- > 2. Getting local conventions right.
- >Problem 2 is hard. Problem 1 should not address problem 2.
-
- Oops. Nice try. Come again.
-
- The ONLY reason people invent character encoding standards is to
- "get local conventions right". If you've got your own machine which
- does not communicate with others you can choose your own arbitrary
- encoding.
-
- A good encoding should support easy (I'd say natural) localization.
- It should provide simple algorithms for simple functions
- like getting string length, searching for a character, case-insensitive
- comparison, and lexicographical comparison.
-
- Unicode (and for that matter Plan 9 UTF) does not support the last
- two mentioned functions. I have yet to see Plan 9 _sort_ which will
- sort Russian strings without being told explicitly that it is Russian.
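
To make the sorting complaint concrete, here is a minimal sketch in Python (not Plan 9's sort). It assumes that a sort with no locale information simply compares code point values, and the word list and the `russian_key` helper are hypothetical names invented for the illustration. In Unicode the letter "ё" is encoded at U+0451, after "я" (U+044F), although in the Russian alphabet it follows "е".

```python
# Russian alphabet in its conventional order (33 lowercase letters).
RUSSIAN_ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RANK = {letter: index for index, letter in enumerate(RUSSIAN_ALPHABET)}


def russian_key(word):
    """Hypothetical sort key: rank letters by their alphabet position.

    Characters outside the alphabet are placed after all letters,
    ordered by code point (an arbitrary choice for this sketch).
    """
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in word.lower()]


words = ["яблоко", "ёлка", "арбуз"]

# Plain code point order: "ёлка" lands after "яблоко".
by_code_point = sorted(words)

# Order obtained only after telling the sort that the text is Russian.
by_alphabet = sorted(words, key=russian_key)

print(by_code_point)
print(by_alphabet)
```

The second call gets the right order only because the alphabet table was supplied from outside the encoding, which is the point being argued.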
-
- >Plan 9 utf solves Problem 1.
-
- UTF does not solve problem 1 -- it is merely a way to encode
- 16-bit unsigned integers in a way that (supposedly) will not
- aggravate the ASCII world.
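
A rough sketch of what "encoding 16-bit unsigned integers" amounts to, written in Python. It assumes the UTF in question uses the byte layout now known as UTF-8, restricted to values below 0x10000; the function name `encode_utf16bit` is made up for this illustration. Values below 0x80 come out as the same single byte they have in ASCII, which is the sense in which the ASCII world is left alone.

```python
def encode_utf16bit(value):
    """Encode one 16-bit unsigned integer as 1 to 3 bytes.

    Assumed layout (the one used by UTF-8):
      0x0000-0x007F  ->  0xxxxxxx
      0x0080-0x07FF  ->  110xxxxx 10xxxxxx
      0x0800-0xFFFF  ->  1110xxxx 10xxxxxx 10xxxxxx
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    if value < 0x80:
        return bytes([value])
    if value < 0x800:
        return bytes([0xC0 | (value >> 6), 0x80 | (value & 0x3F)])
    return bytes([
        0xE0 | (value >> 12),
        0x80 | ((value >> 6) & 0x3F),
        0x80 | (value & 0x3F),
    ])


print(encode_utf16bit(0x41).hex())    # ASCII "A" stays a single byte
print(encode_utf16bit(0x0416).hex())  # Cyrillic capital ZHE, two bytes
print(encode_utf16bit(0x65E5).hex())  # a Han character, three bytes
```

Nothing in this transformation knows which script or language a value belongs to; it only repackages the integer.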
-
- --vadim
-