- Path: sparky!uunet!gatech!paladin.american.edu!howland.reston.ans.net!zaphod.mps.ohio-state.edu!uwm.edu!ogicse!mintaka.lcs.mit.edu!ai-lab!wheat-chex!glenn
- From: glenn@wheat-chex.ai.mit.edu (Glenn A. Adams)
- Newsgroups: comp.std.internat
- Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
- Message-ID: <1k1c8tINN8q2@life.ai.mit.edu>
- Date: 25 Jan 93 18:41:33 GMT
- Article-I.D.: life.1k1c8tINN8q2
- References: <ISHIKAWA.93Jan22211810@ds5200.personal-media.co.jp> <1js0l3INN3f1@life.ai.mit.edu> <ISHIKAWA.93Jan25211508@ds5200.personal-media.co.jp>
- Organization: MIT Artificial Intelligence Laboratory
- Lines: 71
- NNTP-Posting-Host: wheat-chex.ai.mit.edu
-
- In article <ISHIKAWA.93Jan25211508@ds5200.personal-media.co.jp> ishikawa@personal-media.co.jp writes:
- >In Japan, if a student put, say, a bar in slightly misplaced
- >position in a character or put the bar slightly touching other part of
- >the character when it should (NOTE the usage of SHOULD!) not, a student
- >can fail a test and required to practice until he/she gets the RIGHT
- >(CORRECT) writing.
-
- Yes, I know all of this. I learned Chinese calligraphy from a well-known
- calligrapher of the old school. Stroke weight, speed, taper, angle,
- joins, and, yes, stroke order, were absolutely essential to produce the
- "correct" shape (glyph) of the character.
-
- However, you keep missing an essential point here. Characters != Glyphs.
- A character is an abstraction of shape in which *all* non-distinguishing
- characteristics are ignored.
-
- A font represents a collection of glyphs; it does not represent a character
- set. A font is used to display the characters of a character set. For
- example, I might choose a vertical form of an open parenthesis character if
- I am displaying Japanese in vertical mode, or a horizontal form if I am
- displaying in horizontal mode. Nothing in Unicode tells me which of these
- to use.
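-
- As a rough sketch of that separation (the glyph names and the table are
- hypothetical, not taken from any real font), the choice between the two
- parenthesis forms can live entirely in the font/display layer while the
- encoded character stays the same:

```python
# Hypothetical glyph table for one font: (code point, writing mode) -> glyph name.
# The glyph names are made up for illustration.
GLYPHS = {
    (0x0028, "horizontal"): "parenleft",
    (0x0028, "vertical"): "parenleft.vert",
}


def select_glyph(ch, mode):
    """Pick a glyph for one character; the character code never changes."""
    code = ord(ch)
    if (code, mode) in GLYPHS:
        return GLYPHS[(code, mode)]
    # Fall back to the horizontal form, then to a "missing glyph" marker.
    return GLYPHS.get((code, "horizontal"), ".notdef")


print(select_glyph("(", "horizontal"))
print(select_glyph("(", "vertical"))
```

- The text holds only U+0028 in both cases; the writing mode is display
- information supplied outside the character set.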
-
- >But, I can't understand where their priority was during the CJK meeting.
- >Wonder why they didn't voice these concerns... My guess is that they really
- >didn't bother to think about the mixing of different country's
- >character sets. Print company's delegate, for example, would never
- >think of using a "standard UNICODE" font at his company, of course.
-
- You are wrong. They very much understood the issues involved in performing
- unification. First of all, they were not defining a font. There is no
- such thing as a "standard UNICODE font." A font is a collection of glyphs
- which has no necessary relation to a character set.
-
- >I agree that the rich text approach is vital and INDISPENSABLE here. I
- >agree on this point. But in assuming the use of additional information
- >UNICODE certainly loses some attractiveness in the presense of
- >apparent lack of rich text standard that every one uses. Your points
- >about the use of rich text are well taken.
-
- If you want to display Unicode-encoded Japanese text on a simple device
- that does not support rich text, then all you have to do is use a font
- that displays the expected Japanese glyph for the Unicode Han characters
- that represent Kanji data. That's all there is to it; nothing about this
- is complicated.
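-
- A minimal sketch of this, assuming two hypothetical fonts modeled as
- code-point-to-glyph tables (the glyph names are invented; U+76F4 is used
- because it is a commonly cited character whose preferred shape differs
- between Japanese and Chinese typography):

```python
# One unified Han character, stored as plain text with no attributes.
TEXT = "\u76f4"

# Hypothetical fonts: each maps the same code point to its own glyph.
JAPANESE_FONT = {0x76F4: "uni76F4.ja"}
CHINESE_FONT = {0x76F4: "uni76F4.zh"}


def render(text, font):
    """Map each character to a glyph in the single font the device uses."""
    return [font.get(ord(ch), ".notdef") for ch in text]


# A simple Japanese device loads only the Japanese font.
print(render(TEXT, JAPANESE_FONT))
```

- The plain text is identical in both cases; only the font installed on the
- device decides which shape appears.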
-
- >But, then, I think it is best to say that rich text IS NECESSARY instead of
- >assuming the reading skill of modern Japanese.
-
- No, it is not necessary. Use a Japanese font. Sure, if you are going to
- mix Chinese and Japanese in one text, then you will need to designate fonts
- and/or languages. But, tell me, how many existing devices (or software) do
- you know of which will display a mixture of CJK and use the correct fonts
- without using any kind of font attributes, language attributes, or some
- other kind of rich text? I do know that the MULE version of GNU Emacs will
- support this, but it marks each character in a buffer according to the
- character set it belongs to, and it encodes CJK in separate character sets.
- So it does use a kind of rich text (a character set attribute). Furthermore,
- if you try to search for the Kanji "nihongo" in a mixed CJK MULE buffer,
- you aren't going to get a match on the Hanzi "ribenyu" as one might expect.
- By not unifying, MULE can tell which font/language applies to each
- character, but it prevents searching, sorting, and other operations that
- might best operate on a unified character set.
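-
- The search behavior can be sketched with a toy buffer model. This is only
- an illustration, not MULE's actual internals: the buffer is assumed to be a
- list of (charset, character) cells, and the charset names are made up.

```python
def tag(charset, text):
    """Build buffer cells, each tagged with a (hypothetical) charset name."""
    return [(charset, ch) for ch in text]


def find(buffer, pattern, unified):
    """Return start indexes where pattern occurs in buffer.

    unified=False compares (charset, char) pairs, as a non-unified buffer would.
    unified=True compares characters only, as with a unified character set.
    """
    def key(cell):
        charset, ch = cell
        return ch if unified else (charset, ch)

    b = [key(c) for c in buffer]
    p = [key(c) for c in pattern]
    return [i for i in range(len(b) - len(p) + 1) if b[i:i + len(p)] == p]


word = "\u65e5\u672c\u8a9e"  # read "nihongo" in Japanese, "ribenyu" in Chinese
buffer = tag("japanese", word) + tag("chinese", word)
pattern = tag("japanese", word)

print(find(buffer, pattern, unified=False))  # [0]
print(find(buffer, pattern, unified=True))   # [0, 3]
```

- With per-charset tagging the Chinese occurrence is missed; comparing on the
- unified character alone finds both.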
-
- CJK unification in Unicode aids many common text processing tasks without
- detracting from current display technology. If you want a full multilingual
- CJK system to do typographically correct display, then you are going to
- need rich text no matter which character set or sets you choose to use.
-
- Glenn Adams
-