- Newsgroups: comp.os.linux
- Path: sparky!uunet!math.fu-berlin.de!news.th-darmstadt.de!minnie!wilbur!ckurs-2
- From: ckurs-2@wilbur.uni-mainz.de (Teilnehmer am C-Kurs WS 1992_93)
- Subject: Unicode (was RE: DUMB ....)
- Message-ID: <OZ09MMB@minnie.zdv.uni-mainz.de>
- Keywords: Unicode UCS
- Sender: ckurs-2@wilbur (Teilnehmer am C-Kurs WS 1992_93)
- Nntp-Posting-Host: wilbur
- Organization: Johannes Gutenberg Universitaet Mainz
- Date: Tue, 29 Dec 1992 11:11:26 GMT
- Lines: 91
-
- After reading all those flames about dumb people and standards, I remembered an
- article about UCS and Unicode that the German magazine c't published in its
- September 1992 issue.
-
- So I have to correct some of what was posted:
-
- 1. Unicode is only a subset of UCS, which calls for 32-bit chars.
- 2. Implementing UCS cannot be that simple, since the standard documents need over
- 700 pages to describe the Unicode subset alone!
-
- BUT:
-
- I think we can take a first step towards internationalization by committing
- ourselves to an internal 32-bit char format in all our applications. That
- should cause no trouble, because the first 256 chars of Unicode/UCS are those of
- the ISO Latin-1 charset. If we do that, upgrading the code to Unicode/UCS
- should be no problem (I hope so!).
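-
- A minimal sketch in C of why the Latin-1 case is cheap, assuming a 32-bit
- internal character type. The names ucs4_t and latin1_to_ucs4 are made up for
- this illustration and come from no standard or library. Because the Latin-1
- byte values 0..255 coincide with the first 256 Unicode/UCS code points,
- widening needs no lookup table:
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit character type; the name is an assumption. */
typedef uint32_t ucs4_t;

/* Widen n ISO Latin-1 bytes from src into 32-bit code points in dst.
 * Each byte value is already its own Unicode/UCS code point, so the
 * conversion is a plain zero-extension. Returns the number of
 * characters written (always n). dst must have room for n entries. */
size_t latin1_to_ucs4(const unsigned char *src, size_t n, ucs4_t *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        dst[i] = (ucs4_t)src[i];
    return n;
}
```
-
- Going the other way is lossy: any code point above 0xFF has no Latin-1
- equivalent and needs a substitution policy of its own.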
-
- Somebody pointed out the difficulties the Unicode set poses for East Asian users.
- Well, I cannot judge that, but I think that with the 32-bit UCS there should
- be no problem at all, and surely the East Asian countries will come up with a code
- plane of their own to solve that problem.
-
- A very important point about Unicode/UCS is its support for typographic
- symbols, markers for different writing directions, and standard format symbols.
- That should end the confusion over document formats rather soon. There is also
- an easy way to detect whether a document was created on a little-endian or a
- big-endian machine (which is very important when you use multi-byte characters!).
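-
- The byte order detection presumably refers to the byte order mark, the code
- U+FEFF placed at the start of a text. Read with the wrong byte order it shows
- up as 0xFFFE, which is not a valid character. A rough sketch in C for 16-bit
- text, where the enum and the function name are my own invention and a missing
- mark is simply reported as unknown:
-
```c
#include <assert.h>
#include <stddef.h>

enum byte_order { ORDER_UNKNOWN, ORDER_BIG_ENDIAN, ORDER_LITTLE_ENDIAN };

/* Inspect the first two bytes of a 16-bit Unicode text for the byte
 * order mark U+FEFF. The bytes FE FF mean the high byte comes first
 * (big endian); FF FE mean the low byte comes first (little endian).
 * Anything else, or fewer than two bytes, is reported as unknown. */
enum byte_order detect_byte_order(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 2)
        return ORDER_UNKNOWN;
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return ORDER_BIG_ENDIAN;
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return ORDER_LITTLE_ENDIAN;
    return ORDER_UNKNOWN;
}
```
-
- A reader would call this once on the start of the file and then swap the two
- bytes of every following character if the order differs from its own.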
-
- Before the flames continue, I recommend that all interested parties read one
- or more of the following sources:
-
- First, the official documents (I haven't read them :-( )
-
- The Unicode Consortium:
- The Unicode Standard 1.0
- Addison-Wesley, Reading, Mass. 1991
-
- ISO/IEC:
- Information technology - Universal Multiple-Octet Coded Character Set (UCS)
- Part 1: Architecture and Basic Multilingual Plane, (DIS 10646-1.2)
- Beuth-Verlag, Berlin 1992
-
- Second, publications in computer magazines:
-
- Kenneth M. Sheldon:
- ASCII Goes Global
- Byte, July 1991, p. 108
-
- Bernd Behr:
- Welt der Zeichen - Neuer Zeichensatzstandard der ISO
- c't - magazin für computertechnik, September 1992, p. 241
-
- Zeichen im Wandel - Textkonverter für 16-Bit- und andere Zeichensätze
- c't - magazin für computertechnik, September 1992, p. 234
-
- The last article describes a character code converter between:
- Amiga,
- Archimedes,
- Atari-ST,
- NeXT (adobe fonts),
- Macintosh,
- IBM-PC (codepage 437),
- Windows-ANSI,
- OS/2 (codepage 850),
- ASCII 7-Bit and
- Unicode (well not the complete Unicode, but enough for Europe/America)
-
- This code converter is implemented in GfA-BASIC 3.07, and once I have solved my
- keyboard problems, I will try to port it to gcc/Linux.
- All c't and iX listings are available via anonymous FTP from:
- ftp.uni-paderborn.de
- ftp.uni-regensburg.de
- clio.rz.uni-duesseldorf.de (only after 18:00 Central European Time!)
- ftp.zrz.tu-berlin.de
- (Sorry, I don't know the exact locations :-()
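-
- I have not seen the GfA-BASIC listing, so the following is not the c't code,
- only a rough C sketch of how one direction of such a converter could look:
- IBM-PC code page 437 to 16-bit Unicode. The function name is made up, only a
- few German letters are mapped, and every other byte above 0x7F becomes the
- replacement character U+FFFD. A real converter would carry a full 128-entry
- table per charset.
-
```c
#include <stdint.h>

/* Convert one code page 437 byte to a 16-bit Unicode value.
 * Bytes below 0x80 are treated as ASCII and pass through unchanged.
 * Only a handful of German letters are mapped in this sketch; any
 * other byte from 0x80 upwards yields U+FFFD (replacement character). */
uint16_t cp437_to_unicode(unsigned char c)
{
    if (c < 0x80)
        return c;
    switch (c) {
    case 0x81: return 0x00FC; /* small u with diaeresis */
    case 0x84: return 0x00E4; /* small a with diaeresis */
    case 0x8E: return 0x00C4; /* capital A with diaeresis */
    case 0x94: return 0x00F6; /* small o with diaeresis */
    case 0x99: return 0x00D6; /* capital O with diaeresis */
    case 0x9A: return 0x00DC; /* capital U with diaeresis */
    case 0xE1: return 0x00DF; /* sharp s */
    default:   return 0xFFFD; /* not covered by this sketch */
    }
}
```
-
- Converting between two 8-bit charsets then becomes two steps: source charset
- to Unicode, then Unicode to the target charset, so each charset needs only
- one pair of tables instead of one per partner.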
-
- The c't magazine (and therefore possibly the author of those two articles?)
- can be reached by email at: ct@ix.de
-
- Happy New Year!
-
- Dominik
- +-----------------------------------------------------------------------------+
- | eMail to: kubla@MZDMZA.ZDV.UNI-MAINZ.DE |
- | sMail to: Dominik Kubla, Steinsberg 34, 5428 Nastaetten, GERMANY |
- +-----------------------------------------------------------------------------+
- | |
- | I don't *like* LINUX, I *LOVE* it ... |
- | |
- +-----------------------------------------------------------------------------+
-