- Path: sparky!uunet!mcsun!sun4nl!cwi.nl!dik
- From: dik@cwi.nl (Dik T. Winter)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
- Message-ID: <8492@charon.cwi.nl>
- Date: 1 Jan 93 01:25:20 GMT
- References: <1hu9v5INNbp1@rodan.UU.NET> <8490@charon.cwi.nl> <1hvu79INN4qf@rodan.UU.NET>
- Sender: news@cwi.nl
- Organization: CWI, Amsterdam
- Lines: 47
-
- In article <1hvu79INN4qf@rodan.UU.NET> avg@rodan.UU.NET (Vadim Antonov) writes:
- > Dik, I never insisted that all European languages belong to
- > the single group -- how many are the ISO Latin-X sets?
- > My point was that there obviously are identifiable meta-alphabets
- > covering several languages.
- I do think that the number meant by "several" is very small.
- >
- > >Or Dutch, where the letter combination ij is sorted either
- > >amongst i as a double letter, or amongst y as a single letter, or
- > >between y and z as a single letter, depending on who does the sorting?
- >
- > If a combination of letters is treated as a letter IT IS A LETTER.
- > Then add it to the alphabet and let the keyboard driver (which surely
- > knows the language -- simply because there are different keyboard
- > layouts) handle the matter.
- Still wrong. Take the Dutch ij. I have one typewriter that has the ij
- on a single key, but none of the typewriters sold in the last 20 years
- and none of the computer keyboards sold in the Netherlands are
- specifically Dutch. I would be surprised if even a large number of the
- computer keyboards sold were anything other than US, UK or German. So how
- would the keyboard driver deal with the 'ij' combination? When I enter
- the combination it can be either the single letter ij (some Dutch people
- say there is no such single letter) or two letters, depending on context.
- So must the keyboard driver look at the context (e.g. it is a French
- loanword like bijoux, so that ij is really two letters), or what?
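
To make the ambiguity concrete, here is a minimal sketch in Python of the
three Dutch orderings for ij mentioned above. The function names and the
word list are invented for illustration. Each key folds every "ij" blindly,
without context, which is all a keyboard driver could do.

```python
# Hypothetical sketch: three ways to sort the Dutch "ij".
# Every "ij" is folded blindly; no context is consulted.

def key_as_two_letters(word):
    # ij is simply i followed by j.
    return word.lower()

def key_as_y(word):
    # ij is a single letter sorted amongst y.
    return word.lower().replace("ij", "y")

def key_between_y_and_z(word):
    # ij is a single letter between y and z: y becomes "y0", ij becomes "y1".
    w = word.lower().replace("ij", "\x00")  # protect ij with a placeholder
    w = w.replace("y", "y0")
    return w.replace("\x00", "y1")

words = ["ijs", "yuppie", "iets", "zee", "inkt", "yoghurt"]

by_two = sorted(words, key=key_as_two_letters)
by_y = sorted(words, key=key_as_y)
by_yz = sorted(words, key=key_between_y_and_z)

print(by_two)  # ['iets', 'ijs', 'inkt', 'yoghurt', 'yuppie', 'zee']
print(by_y)    # ['iets', 'inkt', 'yoghurt', 'ijs', 'yuppie', 'zee']
print(by_yz)   # ['iets', 'inkt', 'yoghurt', 'yuppie', 'ijs', 'zee']

# The blind rule misfires on a loanword, where ij is really two letters:
print(key_as_y("bijoux"))  # 'byoux'
```

The same six words come out in three different orders, and the last line
shows the folding applied where it should not be.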
-
- Sorting is extremely context sensitive, even in a single language. As
- another person already mentioned, in English you sort McNeill as if it
- were MacNeill. Similarly for the abbreviation St., which can be either
- Street or Saint. (Moreover, when sorting names I would prefer to sort
- C. van der Bilt under V if he is an American and under B if he is a
- Dutchman ;-).)
-
- To me it appears very silly to put more than superficial sorting
- information in the encoding. The remainder must be handled by the
- applications (through library programs). And indeed, that may require
- table look-up.
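
As a rough illustration of what such a library routine might look like, here
is a small Python sketch of a sort key built by table look-up. The tables,
the function name surname_key and the "convention" parameter are all
assumptions made up for this example, not any existing library's interface.

```python
# Hypothetical sketch of application-level collation by table look-up.
# Tables and the "convention" parameter are invented for illustration.

PREFIX_TABLE = {"mc": "mac"}  # sort McNeill as if it were MacNeill
PARTICLES = ("van der ", "van den ", "van de ", "van ", "de ")

def surname_key(surname, convention="nl"):
    s = surname.lower()
    if convention == "nl":
        # Dutch convention: skip a leading particle, file under the main name.
        for particle in PARTICLES:
            if s.startswith(particle):
                s = s[len(particle):]
                break
    # Expand prefixes that sort as if spelled out in full.
    for prefix, expansion in PREFIX_TABLE.items():
        if s.startswith(prefix):
            s = expansion + s[len(prefix):]
            break
    return s

names = ["van der Bilt", "McNeill", "Mabry", "Madsen", "Winter"]

dutch_order = sorted(names, key=lambda n: surname_key(n, "nl"))
american_order = sorted(names, key=lambda n: surname_key(n, "us"))

print(dutch_order)
# ['van der Bilt', 'Mabry', 'McNeill', 'Madsen', 'Winter']
print(american_order)
# ['Mabry', 'McNeill', 'Madsen', 'van der Bilt', 'Winter']
```

None of this knowledge lives in the character encoding. The convention is
passed in by the application, and a real routine would have to pick it per
name, since the text alone does not say whether the bearer is American or
Dutch.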
- >
- > The idea of visual encoding (and one letter-one glyph is nothing more
- > than a compressed image of the text) is simply wrong because it
- > drops valuable information readily available at the point of the CREATION
- > of the text but not later.
- But as I said, such information is not readily available at the point of
- creation, unless the system asks every time. That would be silly, as most
- text is never sorted anyway.
- --
- dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland
- home: bovenover 215, 1025 jn amsterdam, nederland; e-mail: dik@cwi.nl
-