- Path: sparky!uunet!pipex!bnr.co.uk!uknet!mcsun!sunic!seunet!enea!sommar
- From: sommar@enea.se (Erland Sommarskog)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Message-ID: <1993Jan2.230101.20871@enea.se>
- Date: 2 Jan 93 23:01:01 GMT
- References: <1hvu79INN4qf@rodan.UU.NET> <1993Jan1.115424.27258@enea.se> <1i2gpvINN3lm@rodan.UU.NET>
- Organization: Enea Data AB
- Lines: 110
-
- Vadim Antonov (avg@rodan.UU.NET) writes:
- >In article <1993Jan1.115424.27258@enea.se> sommar@enea.se (Erland Sommarskog) writes:
- >>So if I type a C, then a million key presses later put in
- >>an H after the C, how can the keyboard driver handle that? It might
- >>not even be the same driver that sees the two!
- >
- >Aw, don't be silly. It's trivial.
-
- When you can't explain, write off the problem as trivial.
-
- OK, I confess I'm silly. I am even plain fucking stupid, because I
- don't understand anything. Could you, divine guru, explain how the
- keyboard could correct what is going on in my editor? Clearly you don't
- mean that the editor is to be fixed - you want to save the applications
- from keeping track of which language I'm using, don't you?
-
- >>>why on the Earth do i need to spare bits for encoding glyphs if
- >>>i already know the language and 8 (or 16 for oriental languages) bits
- >>>is quite enough to map the alphabet. Don't you see this gap in
- >>>the logic nullifying all benefits of 10646?
- >>
- >>What the hell has the number of bits to do with anything? Do computers
- >>exist for the programmers or the users?
- >
- >Look, you've missed the logic completely. Read it please again. I also
- >explained it several times in other postings.
-
- What logic? I want to be able to write and read text in European
- languages. Period. Then how many bits you use is not my issue, as
- long as you give me something which I consider user-friendly. (Being
- forced to keep track whether a certain dotted "a" is German or
- Swedish is not.) How many bits you use is completely irrelevant.
- (But since I want more than 256 symbols, you will have a pain if
- you stay with eight bits.)
-
- >What do you tell the poor user when he has a database with English
- >and Russian company names (a case from my practice, to be real) --
- >in both upper and lower case and the smart guys (apparently Erlands
- >pupils) made a terminal which converts cyrillic codes for the letters
- >of the same shape as latin to the latin codes? Go get a rope?
-
- Yes, that is precisely the confusion which is likely to happen when
- you assign the same character different codes depending on language,
- and when the application program is not smart enough to equate them.
-
- We had this discussion on the 10646 list, since this problem is present
- in 10646 too: thanks to floating diacritics, you can represent many
- characters in more than one way. The general agreement was that a
- good program would equate the two without notice. In that discussion
- I stated that if the spell-checker complains because I'm using the
- wrong sort of dotted A, I would scream "you fascist!" and throw the
- machine out of the window. Of course this applies to Vadim Antonov's
- wretched system as well.
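
To make "equate the two without notice" concrete, here is a minimal sketch in Python. It assumes the `unicodedata` module from the Python standard library, which postdates this discussion; the helper name `same_text` is made up for the illustration, and this is not what the 10646 list actually specified.

```python
import unicodedata

def same_text(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC).

    Hypothetical helper: a spell-checker or search routine could call
    something like this instead of comparing raw code sequences.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00c4"   # dotted A as one precomposed character
floating = "A\u0308"     # plain A followed by a floating (combining) diaeresis

print(precomposed == floating)           # raw comparison sees two different strings
print(same_text(precomposed, floating))  # normalized comparison equates them
```

The point is that the equating lives in one routine that every program can call, so the user never has to care which representation ended up in the text.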
-
- >The basic ASCII principles (after reordering and replacing several
- >characters) remained the same -- there is a way to convert upper<->lower
- >case and there is a way to sort without asking which language every word
- >came from (it's known a priori).
-
- Nope. Not with German. Look in a German dictionary. Then look in a
- German phonebook. Then you will find that the dotted suckers are
- sorted differently in the two places. If you want to support both,
- you have to know what the user wants. Or do you suggest that the
- user should specify that on input by choosing the correct set of
- dotted characters? What if another user wants the other order?
- Sure, if he has write access to the text he could filter it first,
- but then what's the difference from informing the program with an
- environment variable or a click on the appropriate window item?
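
A rough sketch of the two German orders, in Python. It assumes the usual simplification that the dictionary order treats a dotted vowel like its base letter while the phonebook order treats it like the base letter followed by "e". The names `DICTIONARY`, `PHONEBOOK` and `german_sort` are invented for the example, and the `order` parameter stands in for the environment variable or window item mentioned above.

```python
# Simplified mapping tables; real collation rules have more cases.
DICTIONARY = {"ä": "a", "ö": "o", "ü": "u", "ß": "ss"}
PHONEBOOK = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

def sort_key(word, table):
    lowered = word.lower()
    # Primary key: the word with dotted letters expanded per the table.
    primary = "".join(table.get(ch, ch) for ch in lowered)
    # The original spelling only breaks ties between equal primary keys.
    return (primary, lowered)

def german_sort(words, order="dictionary"):
    table = PHONEBOOK if order == "phonebook" else DICTIONARY
    return sorted(words, key=lambda w: sort_key(w, table))

names = ["Müller", "Mueller", "Muller", "Mutter"]
print(german_sort(names, "dictionary"))
print(german_sort(names, "phonebook"))
```

The stored text is identical in both calls. Only the reader's choice of order changes, and it places Müller after Muller in one case and before it in the other.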
-
- The problem with your idea is that you believe that everything is
- known at input time. It isn't. If you have a list of names which
- is to be used in Sweden, Norway, Denmark and Finland, the list will
- sort differently depending on the reader, not on who is entering the
- text. The Swedish and Finnish alphabets end with A-ring, A-dots,
- O-dots. The Danish and Norwegian end with AE-ligature, O-slash,
- A-ring. Looks trivial for a simple bit-order sort? Nope. Because
- the dotted A is equivalent to the AE ligature, and so are dotted O
- and O-slash. Thus Danish and Norwegian names with slashed O should
- appear together with Swedish and Finnish names with dotted O. So
- the sort algorithm must make no distinction between the two, except
- when everything else is the same. And the sort algorithm must know
- in which order the user wants the text to be presented.
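
The requirement can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a full collation routine: the names `LETTER_CLASS`, `TAIL_ORDER`, `nordic_key` and `nordic_sort` and the reader codes "sv" and "da" are made up here, ordinary letters are just compared by code value, and details such as "aa" in Danish are ignored.

```python
# Dotted A and AE-ligature share one class; so do dotted O and O-slash.
LETTER_CLASS = {"å": "aa", "ä": "ae", "æ": "ae", "ö": "oe", "ø": "oe"}

# Where the three classes go at the end of the alphabet, per reader.
TAIL_ORDER = {
    "sv": ["aa", "ae", "oe"],  # Swedish and Finnish: A-ring, A-dots, O-dots
    "da": ["ae", "oe", "aa"],  # Danish and Norwegian: AE, O-slash, A-ring
}

def nordic_key(word, reader):
    tail = TAIL_ORDER[reader]
    primary = []
    for ch in word.lower():
        cls = LETTER_CLASS.get(ch)
        if cls is None:
            primary.append((0, ord(ch)))          # ordinary letter
        else:
            primary.append((1, tail.index(cls)))  # sorts after all ordinary letters
    # The exact spelling matters only when everything else is the same.
    return (primary, word.lower())

def nordic_sort(names, reader):
    return sorted(names, key=lambda n: nordic_key(n, reader))

names = ["Östlund", "Østby", "Åberg", "Öberg", "Zetterberg"]
print(nordic_sort(names, "sv"))
print(nordic_sort(names, "da"))
```

In both orders the names with dotted O and slashed O are interleaved (Öberg, Østby, Östlund). What moves is Åberg, which comes before them for a Swedish reader and after them for a Danish one. The order is chosen by the reader argument, not by whoever typed the names.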
-
- This is a simple end-user requirement which your proposal is not
- incapable of handling. But it requires the same solutions as 10646
- (or Latin-1) requires. And your proposal gives me a lot more mess
- with other things, which do conflict with end-user requirements
- in ways that 10646 does not.
-
-
- >As i already said there is no easy way around -- you have to deal
- >with those issues somewhere and it's better to have it solved on
- >the elementary level -- otherwise EVERY program will be forced to
- >keep track of the language
-
- Yes, on an elementary level. Just like any device-independent program
- must be able to handle all sorts of terminals. Can you say "routine library"?
- I know you could.
-
- >which is not easy and sometimes ruins the whole logic of the
- >program (see shell globbing example in my previous posting or
- >tr example before).
-
- You've talked a lot about regular expressions etc. Frankly I
- don't give a damn about those. The main bulk of computer users
- are not programmers and don't know what a regular expression
- is, so why focus on such specific issues?
- --
- Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
- Jag gav en k{ck tjeck en check.
-