NetNews Usenet Archive 1992 #31

Text File | 1992-12-29 | 4.1 KB | 104 lines

Newsgroups: comp.os.linux
Path: sparky!uunet!math.fu-berlin.de!news.th-darmstadt.de!minnie!wilbur!ckurs-2
From: ckurs-2@wilbur.uni-mainz.de (Teilnehmer am C-Kurs WS 1992_93)
Subject: Unicode (was RE: DUMB ....)
Message-ID: <OZ09MMB@minnie.zdv.uni-mainz.de>
Keywords: Unicode UCS
Sender: ckurs-2@wilbur (Teilnehmer am C-Kurs WS 1992_93)
Nntp-Posting-Host: wilbur
Organization: Johannes Gutenberg Universitaet Mainz
Date: Tue, 29 Dec 1992 11:11:26 GMT
Lines: 91

After reading all those flames on dumb people and standards, I
remembered an article about UCS and Unicode which was published by the
German magazine c't in its Sept. 1992 issue.

So I have to make some corrections to what was posted:

1. Unicode is only a subset of UCS, which calls for 32-bit chars.

2. Implementing UCS cannot be that simple, since the standard documents
   need over 700 pages to describe the Unicode subset alone!

BUT: I think we can take a first step towards internationalization by
committing ourselves to an internal 32-bit char format in all our
applications. That should cause no trouble, because the first 256 chars
of Unicode/UCS are those of the ISO Latin-1 charset. If we act that way,
there should be no problem in upgrading the code to Unicode/UCS (I hope
so!).

Somebody pointed out the difficulties East Asian users have with the
Unicode set. Well, I cannot judge that, but I think that if we use the
32-bit UCS there should be no problem at all, and surely the East Asian
countries will come up with a code plane of their own to solve that
problem.

A very important point about Unicode/UCS is its support for typographic
symbols, markers for different writing directions, and standard format
symbols. That should end the confusion over document formats rather
soon. There is also an easy way to detect whether a document was created
on a little-endian or a big-endian machine (which is very important when
you use multi-byte characters!).
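
To make the two practical points above concrete, here is a minimal
sketch in C. It assumes a C99 compiler (for <stdint.h>); the names
ucs4_t, latin1_to_ucs4 and detect_byte_order16 are made up for
illustration and do not come from any standard or library. The byte
order check also assumes that the "easy way" is the byte order mark
U+FEFF written at the start of a 16-bit Unicode text; the sources cited
in this post may describe it differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for a 32-bit character cell. */
typedef uint32_t ucs4_t;

/* Widen ISO Latin-1 bytes to 32-bit cells. The first 256 Unicode/UCS
   code points equal Latin-1, so every byte value is copied unchanged.
   Returns the number of cells written. */
size_t latin1_to_ucs4(const unsigned char *src, size_t n, ucs4_t *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        dst[i] = (ucs4_t)src[i];
    return n;
}

enum byte_order { ORDER_UNKNOWN, ORDER_BIG, ORDER_LITTLE };

/* Guess the byte order of a 16-bit Unicode text from its first two
   bytes. Assumption: the writer stored U+FEFF as the first character.
   Without that mark the order cannot be told this way. */
enum byte_order detect_byte_order16(const unsigned char *buf, size_t n)
{
    if (n < 2)
        return ORDER_UNKNOWN;
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return ORDER_BIG;      /* high byte stored first */
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return ORDER_LITTLE;   /* low byte stored first */
    return ORDER_UNKNOWN;
}
```

Widening costs four bytes per character, but existing Latin-1 text
needs no table lookup at all, which is why the step looks cheap. A
reader that gets ORDER_UNKNOWN back has to fall back on a convention or
ask the user.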

Before the flames continue, I recommend that all interested parties
read one or more of the following sources:

First, the official documents (I haven't read them :-( ):

  The Unicode Consortium:
    The Unicode Standard 1.0
    Addison-Wesley, Reading, Mass. 1991

  ISO/IEC:
    Information technology - Universal Multiple-Octet Coded
    Character Set (UCS)
    Part 1: Architecture and Basic Multilingual Plane,
    (DIS 10646-1.2)
    Beuth-Verlag, Berlin 1992

Second, publications in computer magazines:

  Kenneth M. Sheldon:
    ASCII Goes Global
    Byte, July 1991, p. 108

  Bernd Behr:
    Welt der Zeichen - Neuer Zeichensatzstandard der ISO
    c't - magazin für computertechnik, September 1992, p. 241

  Zeichen im Wandel - Textkonverter für 16-Bit- und andere
    Zeichensätze
    c't - magazin für computertechnik, September 1992, p. 234

The last article describes a charcode converter between:

  Amiga, Archimedes, Atari-ST, NeXT (Adobe fonts), Macintosh,
  IBM-PC (codepage 437), Windows-ANSI, OS/2 (codepage 850),
  ASCII 7-Bit and Unicode (well, not the complete Unicode, but
  enough for Europe/America)

This code converter is implemented in GfA-BASIC 3.07, and once I have
solved my keyboard problems I will try to port it to gcc/Linux.

All c't and iX listings are available via anonymous FTP from:

  ftp.uni-paderborn.de
  ftp.uni-regensburg.de
  clio.rz.uni-duesseldorf.de (only after 18.00 h Central European Time!)
  ftp.zrz.tu-berlin.de

(Sorry, I don't know the exact locations :-()

The c't magazine (and therefore possibly the author of those 2
articles?) can be reached by email at:

  ct@ix.de

Happy New Year!

Dominik

+-----------------------------------------------------------------------------+
| eMail to: kubla@MZDMZA.ZDV.UNI-MAINZ.DE                                     |
| sMail to: Dominik Kubla, Steinsberg 34, 5428 Nastaetten, GERMANY            |
+-----------------------------------------------------------------------------+
|                                                                             |
|                    I don't *like* LINUX, I *LOVE* it ...                    |
|                                                                             |
+-----------------------------------------------------------------------------+