- Path: sparky!uunet!not-for-mail
- From: avg@rodan.UU.NET (Vadim Antonov)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Date: 1 Jan 1993 16:43:55 -0500
- Organization: UUNET Technologies Inc, Falls Church, VA
- Lines: 48
- Message-ID: <1i2durINN2pj@rodan.UU.NET>
- References: <8492@charon.cwi.nl> <1i0vnmINN352@rodan.UU.NET> <8494@charon.cwi.nl>
- NNTP-Posting-Host: rodan.uu.net
- Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
-
- In article <8494@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
- >As somebody else mentioned already: should a spelling-checker allow a
- >German A-umlaut in a Swedish word?
-
- No, it should not. Nor should it allow similar-looking
- hieroglyphs of the mumbo-jumbo tribe. Think of them as different letters
- which happen to look remarkably alike -- like I and l. Moreover, introducing
- a foreign letter in the middle of a word requires deliberate action and in practice
- is a really rare occurrence (I have a keyboard with Cyrillic, with a lot
- of similar-looking letters - o, e, k, x, c, m, E, T, O, p, P, A, D, H, K,
- X, C, B, M with TWO codes in KOI-8) -- ever seen one in my postings here?
- And I do not use spell-checkers.
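
One way a checker could notice a look-alike letter from another alphabet is to flag any word whose letters come from more than one script. The sketch below is only an illustration: the function names are made up, and the script is guessed from the first word of each character's Unicode name, which is a heuristic rather than a real script-property lookup.

```python
import unicodedata

def scripts_in(word):
    """Return the set of script names guessed for the letters of `word`.

    Heuristic: take the first word of the Unicode character name,
    e.g. 'LATIN' or 'CYRILLIC'. Non-letters are ignored.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in word
        if ch.isalpha()
    }

def is_mixed_script(word):
    """True if the letters of `word` appear to come from several scripts."""
    return len(scripts_in(word)) > 1

# "c" + Cyrillic "о" (U+043E) + "t" looks like "cot" but mixes two scripts.
suspect = "c\u043et"
print(is_mixed_script(suspect), is_mixed_script("cot"))
```

A checker built this way would reject the suspect word without needing any table of look-alike pairs.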
-
- >If so, what are the consequences,
-
- A good spell-checker will suggest replacing the letter with the
- correct one.
-
- >Moreover, one question: how would you encode the German A-umlaut such that
- >it sorts properly (i.e. as if it is the letter combination AE)?
-
- The sorting order should be strict -- if two words are identical except
- that one has a-umlaut and the other has ae in the middle, are they the
- same word? If they are, then ae IS a variation of a-umlaut and should
- always be treated as a single letter.
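
Treating ae and a-umlaut as one letter can be sketched as a sort key that folds both spellings to the same string. This is a minimal illustration, not a full German collation: the table and the name `sort_key` are assumptions, and extending the rule to o-umlaut and u-umlaut is my addition.

```python
# Hypothetical folding table: each umlaut is keyed as its two-letter spelling.
FOLD = {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue"}

def sort_key(word):
    """Key under which a-umlaut and 'ae' compare as the same letter."""
    return "".join(FOLD.get(ch, ch) for ch in word.lower())

words = ["Bauer", "B\u00e4r", "Bad"]
print(sorted(words, key=sort_key))
```

Words that differ only in the spelling of the umlaut get equal keys, so a stable sort keeps them in their input order.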
-
- >Even in
- >a single language (German) you can not come up with a coding that gives
- >proper sorting.
-
- You can come up with a reasonable approximation anyway.
-
- >And also how to code the German eszet such that when
- >uppercased it becomes the *two* letters 'SS' (and how do you lowercase
- >that again? Things are not as simple as you appear to think.
-
- Then define an uppercase eszet which looks like SS. An input program
- will map the keyboard accordingly.
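
The difference between the two approaches can be shown in Python. Its built-in `str.upper()` expands eszet into two letters, so the round trip loses it. A one-to-one mapping keeps it. The sketch assumes the capital sharp s at U+1E9E, which Unicode added long after this discussion, stands in for the proposed uppercase code; `upper_one_to_one` is a made-up name.

```python
ESZETT = "\u00df"        # lowercase sharp s
UPPER_ESZETT = "\u1e9e"  # capital sharp s, a later Unicode addition

def upper_one_to_one(text):
    """Uppercase `text`, mapping eszet to a single uppercase code."""
    return "".join(
        UPPER_ESZETT if ch == ESZETT else ch.upper() for ch in text
    )

word = "stra\u00dfe"
two_letter = word.upper()             # eszet becomes the two letters 'SS'
one_letter = upper_one_to_one(word)   # eszet becomes one uppercase code

# Only the one-to-one form lowercases back to the original word.
print(two_letter.lower() == word, one_letter.lower() == word)
```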
-
- Forget about "traditions" -- users do not care which code is inside if
- it looks like their usual stuff.
-
- Basically it is a purely mathematical problem -- you've got a number
- of orderings and map them into a single (partial) ordering by merging
- as many elements as possible. The number of meta-alphabets is generally
- determined by the criterion of minimal duplication.
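
Merging orderings can be sketched as a topological sort. Each alphabet contributes "this letter comes before that one" constraints, shared letters are merged into one node, and any linear order that satisfies every constraint is acceptable. This is an illustration under my own assumptions, using toy alphabets and the standard `graphlib` module (Python 3.9+); it is not a worked-out coding scheme.

```python
from graphlib import TopologicalSorter

def merge_orderings(orderings):
    """Merge several alphabet orderings into one consistent linear order.

    Each ordering is a sequence of letters. Letters that appear in more
    than one ordering are treated as the same element. Raises
    graphlib.CycleError if the orderings contradict each other.
    """
    sorter = TopologicalSorter()
    for order in orderings:
        for letter in order:
            sorter.add(letter)          # register letters with no constraints
        for before, after in zip(order, order[1:]):
            sorter.add(after, before)   # 'before' must precede 'after'
    return list(sorter.static_order())

# Two toy alphabets sharing 'b' and 'c'; 'a' and 'x' are unique to one each.
merged = merge_orderings(["abc", "bxc"])
print(merged)
```

Where the alphabets disagree about the relative order of two shared letters, the merge fails, and those letters would need separate codes in separate meta-alphabets.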
-
- --vadim
-