- Path: sparky!uunet!pipex!bnr.co.uk!uknet!mcsun!sunic!seunet!enea!sommar
- From: sommar@enea.se (Erland Sommarskog)
- Newsgroups: comp.std.internat
- Subject: Re: Language tagging
- Message-ID: <1993Jan2.231703.21201@enea.se>
- Date: 2 Jan 93 23:17:03 GMT
- References: <1i2m57INN4vr@rodan.UU.NET> <1993Jan2.020512.3287@klaava.Helsinki.FI> <1321@blue.cis.pitt.edu>
- Organization: Enea Data AB
- Lines: 40
-
- David J Birnbaum (djbpitt+@pitt.edu) writes:
- >I think the objections to the input-related problems of Vadim's proposal
- >are misdirected, since both Vadim's proposal and Unicode require the
- >user to input language identifying information during data entry if this
- >information will be needed for later processing. Under Vadim's
- >proposal, what would be input would be an instruction (not part of the
- >stored text stream) to shift to the appropriate subset of characters.
- >Under a system built on a Unicode character set, what would be input
- >would be some sort of language or locale tagging that would be entered
- >into the text at a higher level than character set.
- >
- >In both cases, if you want language-specific data in your text stream,
- >you have to say so during input. If I need to insert Bulgarian words
- >into a Russian text stream I can do so without indicating a change,
- >as long as I understand that the consequence will be that the Bulgarian
- >data will be treated like Russian.
-
- Then I have to ask you to explain something about Unicode that I
- don't know. It is true that if you are using language-dependent
- features such as spell-checking and hyphenation while inputting the
- text, then you have to know what you are doing with Unicode. But once
- you're done with it, it doesn't matter any longer, until the next
- time you want to process the text in some way. With Unicode you
- can decide then whether to treat the text as Bulgarian or Russian;
- with Vadim's system you're stuck unless you convert it. (Well,
- maybe you'll get away with Bulgarian and Russian, but not with
- Swedish and German. Or, as I demonstrated in another article,
- Swedish and Danish.)
-
- But of course, if I have CCCP in a Swedish text and copy the word
- to a Russian text, I will see funny things both with Vadim Antonov's
- system and with 10646. (And I am sure this confusion will cost someone
- a couple of wasted work hours in finding the error.) But at least
- shifting scripts is more obvious than changing languages. If I
- switch from Swedish to Russian I would probably change the keyboard
- set-up, but not if I switch from Swedish to German - there is no
- reason to.
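
The CCCP case can be sketched roughly as follows. This is only an illustration, assuming Python 3 and its standard `unicodedata` module; the names `latin`, `cyrillic` and `describe` are made up for the example. It shows that the word typed on a Latin keyboard and the word typed on a Cyrillic keyboard look alike but are stored as different characters, so a search or spell-check in the Russian text would not match the copied word.

```python
import unicodedata

# "CCCP" as typed with a Latin (e.g. Swedish) keyboard set-up.
latin = "CCCP"
# The Russian abbreviation: Cyrillic ES, ES, ES, ER.
cyrillic = "\u0421\u0421\u0421\u0420"


def describe(text):
    """Return the Unicode character name of every character in text."""
    return [unicodedata.name(ch) for ch in text]


# The two strings may render identically but do not compare equal.
print(latin == cyrillic)
print(describe(latin))
print(describe(cyrillic))
```

Nothing in the stored text says which language a word belongs to; only the script of each character is recorded, which is why a change of script is easier to spot than a change of language.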
- --
- Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
- Jag gav en k{ck tjeck en check.
-