NetNews Usenet Archive 1993 #3

Internet Message Format | 1993-01-25 | 4.5 KB

Path: sparky!uunet!gatech!paladin.american.edu!howland.reston.ans.net!zaphod.mps.ohio-state.edu!uwm.edu!ogicse!mintaka.lcs.mit.edu!ai-lab!wheat-chex!glenn
From: glenn@wheat-chex.ai.mit.edu (Glenn A. Adams)
Newsgroups: comp.std.internat
Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
Message-ID: <1k1c8tINN8q2@life.ai.mit.edu>
Date: 25 Jan 93 18:41:33 GMT
Article-I.D.: life.1k1c8tINN8q2
References: <ISHIKAWA.93Jan22211810@ds5200.personal-media.co.jp> <1js0l3INN3f1@life.ai.mit.edu> <ISHIKAWA.93Jan25211508@ds5200.personal-media.co.jp>
Organization: MIT Artificial Intelligence Laboratory
Lines: 71
NNTP-Posting-Host: wheat-chex.ai.mit.edu

In article <ISHIKAWA.93Jan25211508@ds5200.personal-media.co.jp> ishikawa@personal-media.co.jp writes:
>In Japan, if a student put, say, a bar in slightly misplaced
>position in a character or put the bar slightly touching other part of
>the character when it should (NOTE the usage of SHOULD!) not, a student
>can fail a test and required to practice until he/she gets the RIGHT
>(CORRECT) writing.

Yes, I know all of this. I learned Chinese calligraphy from a
well-known calligrapher of the old school. Stroke weight, speed,
taper, angle, joins, and, yes, stroke order, were absolutely essential
to produce the "correct" shape (glyph) of the character.

However, you keep missing an essential point here. Characters !=
Glyphs. A character is an abstraction of shape in which *all*
non-distinguishing characteristics are ignored. A font represents a
collection of glyphs; it does not represent a character set. A font
is used in the display of characters in a character set. For example,
I might choose a vertical form of an open parenthesis character if I
am displaying Japanese in vertical mode, or I might choose a
horizontal form if I am displaying in horizontal mode. Nothing in
Unicode tells me which of these to use.

>But, I can't understand where their priority was during the CJK meeting.
>Wonder why they didn't voice these concerns... My guess is that they really
>didn't bother to think about the mixing of different country's
>character sets. Print company's delegate, for example, would never
>think of using a "standard UNICODE" font at his company, of course.

You are wrong. They very much understood the issues involved in
performing unification. First of all, they were not defining a font.
There is no such thing as a "standard UNICODE font." A font is a
collection of glyphs which has no necessary relation to a character
set.

>I agree that the rich text approach is vital and INDISPENSABLE here. I
>agree on this point. But in assuming the use of additional information
>UNICODE certainly loses some attractiveness in the presence of
>apparent lack of rich text standard that every one uses. Your points
>about the use of rich text are well taken.

If you want to display Unicode-encoded Japanese text on a simple
device which does not support rich text, then all you have to do is
use a font which displays the expected Japanese glyph when displaying
Unicode Han characters which represent Kanji data. That's all there
is to it; nothing about this is complicated.

>But, then, I think it is best to say that rich text IS NECESSARY instead of
>assuming the reading skill of modern Japanese.

No, it is not necessary. Use a Japanese font. Sure, if you are going
to mix a text with Chinese and Japanese, then you will need to
designate fonts and/or languages. But, tell me, how many existing
devices (or software) do you know of which will display a mixture of
CJK and use the correct fonts without using any kind of font
attributes, language attributes, or some other kind of rich text?

I do know that the MULE version of GNU Emacs will support this, but it
marks each character in a buffer according to the character set it
belongs to, and it encodes CJK in separate character sets. So it does
use a kind of rich text (character set attribute).
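The point that glyph choice comes from a font or language attribute, and not from the character code, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Run` type, the `choose_fonts` helper, and the font names are hypothetical and do not come from Unicode or from any real rendering system.

```python
from dataclasses import dataclass

# Hypothetical font names, for illustration only.
FONT_FOR_LANGUAGE = {"ja": "ExampleMincho", "zh": "ExampleSong"}

@dataclass
class Run:
    text: str      # Unicode text; Han characters are unified code points
    language: str  # rich-text attribute carried alongside the text

def choose_fonts(runs, default_font="ExampleMincho"):
    """Return (text, font) pairs. The font is taken from the run's
    language attribute, never from the code points themselves."""
    return [(r.text, FONT_FOR_LANGUAGE.get(r.language, default_font))
            for r in runs]

# U+76F4 is a unified Han character whose customary glyph differs
# between Japanese and Chinese typography.
runs = [Run("\u76f4", "ja"), Run("\u76f4", "zh")]
fonts = choose_fonts(runs)
```

Both runs hold the same code point; only the attribute decides which font is used. A device with no attributes at all corresponds to always taking `default_font`, which is the "use a Japanese font" case above.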
Furthermore, if you try to search for the Kanji "nihongo" in a mixed
CJK MULE buffer, you aren't going to get a match on the Hanzi
"ribenyu" as one might expect. By not unifying, MULE can distinguish
which font/language applies, but it prevents searching, sorting, and
other operations which might best operate on a unified character set.

CJK unification in Unicode derives the benefit of aiding many common
text processing tasks, while not detracting from current display
technology. If you want a full multilingual CJK system to do
typographically correct display, then you are going to need rich text
no matter which character set or sets you choose to use.

Glenn Adams
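The search behaviour described above can be illustrated with a toy model in Python. It is a sketch under assumptions, not MULE's actual internal representation: each character in the non-unified buffer is modelled as a hypothetical (charset, character) pair, and the charset names are made up for the example.

```python
# The three Han characters read "nihongo" in Japanese and "ribenyu"
# in Chinese (traditional forms); in Unicode they share code points.
WORD = "\u65e5\u672c\u8a9e"

def tag(text, charset):
    """Model a non-unified buffer: every character carries its charset."""
    return [(charset, ch) for ch in text]

def find(haystack, needle):
    """Naive subsequence search; returns the first index or -1."""
    n = len(needle)
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1

# Buffer text entered as Chinese, search pattern entered as Japanese.
buffer_tagged = tag("AB", "ascii") + tag(WORD, "chinese")
pattern_tagged = tag(WORD, "japanese")
tagged_hit = find(buffer_tagged, pattern_tagged)   # -1: charsets differ

# Dropping the charset attribute gives the unified view.
buffer_unified = [ch for _, ch in buffer_tagged]
unified_hit = find(buffer_unified, list(WORD))     # 2: code points match
```

In the tagged model the match fails only because the charset attribute is compared along with the character; over unified code points the same search succeeds.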