NetNews Usenet Archive 1993 #3

Internet Message Format | 1993-01-26 | 5.2 KB

Path: sparky!uunet!mcsun!sun4nl!cwi.nl!dik
From: dik@cwi.nl (Dik T. Winter)
Newsgroups: comp.std.internat
Subject: Re: Alphabets
Message-ID: <8732@charon.cwi.nl>
Date: 26 Jan 93 02:02:43 GMT
References: <1jutusINNlfa@life.ai.mit.edu> <8719@charon.cwi.nl> <1k100eINNs9n@life.ai.mit.edu>
Sender: news@cwi.nl
Organization: CWI, Amsterdam
Lines: 81

In article <1k100eINNs9n@life.ai.mit.edu> glenn@wheat-chex.ai.mit.edu (Glenn A. Adams) writes:
> In article <8719@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
> >"What is Unicode encoding?". Scripts? Writing system?
> Unicode encodes scripts, and not writing systems (alphabets).
Good, but see later.
>
> >Suetterlin. Is that a different font? Many would think the latter not
> >predominantly derived from the symbols used in the Roman alphabet. I see
> >them as being more derived from the Germanic Runes. Still, there is a 1<->1
> >correspondence between the symbols in the Suetterlin script and the German
> >version of the Latin script (I think).
> I assume you refer to the written form developed by von Ludwig Suetterlin
> (1865-1917). I don't have any detailed information on it, so I can't say
> for sure. Without knowing any details, I would be willing to say it was
> a distinct script to the extent that Suetterlin created new forms or even
> borrowed forms from other scripts, perhaps modifying them in the process.
True enough.
>
> Aside from the issue of encoding utility, I would say that abstracting the
> forms of two or more alphabets into a single script should take into account
> historical derivation, formal similarity, and perhaps even functional
> similarity, although I would give this much less priority than the former two
> criteria.
In my opinion you cannot give absolute criteria. While Suetterlin is not
derived from the Latin script and there is no formal similarity, it is
functionally equivalent. So, although it is in fact a different script,
it may just as well be viewed as a different font.
I do not think that Unicode (where it is extended to extinct scripts)
should reserve codepoints for this script. And I think many German
people would agree. This is where you cannot ignore the culture and
where functional equivalence takes priority. (I may note that this
script has been used very extensively and that it was still taught in
schools in the sixties.)
>
> >I think you should add that unification of different scripts is possible
> >iff the scripts can be viewed as just being font changes (although the
> >derivation of the scripts can be completely different).
>
> I think you may be confusing "script" as I am using it with "handwriting
> form" or possibly "written form." Clearly the latter would be a matter of
> only font changes, and nothing more.
No, I think not. What I indicated was that functional equivalence (as
you correctly stated it) might be the most important deciding factor for
some scripts.
> The general process used in Unicode is to identify an alphabet (i.e., the
> symbols used in a particular writing system) with some historically known
> collection of symbols (a script), attempt to unify the alphabet with
> this collection, and, then, to the extent that the unification is successful
> and doesn't interfere with basic processing tasks, replace the script with
> the (unified) union of the original script and the forms of the new alphabet.
But I think that the identification process may ignore the actual form.
When Suetterlin was used it was mixed with the normal Roman form without
change of meaning. It was more or less viewed as a different font,
although the letter forms are completely different.
>
>>So while you can unify the Suetterlin and the Latin script, you can not unify
>>Latin and Greek script although Latin is derived from Greek.
>
> You could unify Latin and Greek if you want, but it would require radical
> unification of both form and function.
So it would go against two of the three criteria you gave.
(Formal and functional similarity. The derivation is there.) I think
the functional dissimilarity is the more important one here; the formal
dissimilarity is not so very great. The same holds for Latin and
Cyrillic.
> And it wouldn't buy much as far
> as encoding is concerned.
But that is only an afterthought. How about the Turkish I with and
without dot? It would not have cost much to give them separate coding
points. (Yes, I understand the compatibility reasons. Are there other
reasons?)

What I think about scripts is that the LGC (Latin, Greek, Cyrillic)
scripts have two distinctive forms (majuscule and minuscule) that must
be observed. When coding an LGC symbol you actually require two coding
points, and when looking for similarities you have to observe both
variants. So because there are not many symbols involved, it pays to
*not* unify LGC (there are only a few letters that are similar in both
cases). Now I do not know how far the Kannada and Telugu scripts are
unified (as I said, the book was sold out when I tried to buy it), but I
think those have the same distinctions as Latin and Greek.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland
home: bovenover 215, 1025 jn amsterdam, nederland; e-mail: dik@cwi.nl