- Path: sparky!uunet!paladin.american.edu!howland.reston.ans.net!spool.mu.edu!yale.edu!yale!mintaka.lcs.mit.edu!ai-lab!muesli!glenn
- From: glenn@muesli.ai.mit.edu (Glenn A. Adams)
- Newsgroups: comp.std.internat
- Subject: Re: Radicals Instead of Characters
- Date: 21 Jan 1993 08:39:57 GMT
- Organization: MIT Artificial Intelligence Laboratory
- Lines: 40
- Message-ID: <1jlngtINNqnk@life.ai.mit.edu>
- References: <1jhvstINNbcq@flop.ENGR.ORST.EDU> <1993Jan20.233154.19733@fcom.cc.utah.edu> <MELBY.93Jan21144739@dove.yk.fujitsu.co.jp>
- NNTP-Posting-Host: muesli.ai.mit.edu
-
- In article <MELBY.93Jan21144739@dove.yk.fujitsu.co.jp> melby@dove.yk.fujitsu.co.jp (John B. Melby) writes:
- >Looking at Han characters in a probabilistic sense probably is not going
- >to help much, since the positioning of radicals varies widely between
- >characters.
-
- The idea being discussed for Han decomposition would have different
- combining radicals for each of the possible positions the radical
- could take; e.g. MAN-LEFT, MAN-TOP, MAN-BOTTOM, etc.
-
- >(1) some rare characters cannot be expressed in this manner,
-
- Characters which could not be decomposed in this manner would be
- represented in their entirety (i.e., as non-decomposed symbols).
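
A minimal sketch of that encoding rule, for illustration only. The radical names, the Private Use Area code points standing in for them, and the decomposition table are all hypothetical; no such combining radicals are assigned in Unicode.

```python
# Hypothetical position-specific combining radicals. Private Use Area
# code points are used as stand-ins, since no real assignments exist.
MAN_LEFT = "\uE000"
TREE_RIGHT = "\uE001"

# Hypothetical decomposition table: precomposed character -> radical sequence.
DECOMPOSITIONS = {
    "\u4F11": MAN_LEFT + TREE_RIGHT,  # U+4F11: person on the left, tree on the right
}


def encode(text):
    """Replace each decomposable character with its radical sequence.

    Characters with no entry in the table are kept in their entirety.
    """
    return "".join(DECOMPOSITIONS.get(ch, ch) for ch in text)
```

Here `encode("\u4F11\u9F8D")` decomposes the first character and leaves U+9F8D, which has no table entry, as a single non-decomposed symbol.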
-
- >(2) allowing the display of arbitrary characters using this sort of
- >composition does not mean that their components will be aesthetically
- >spaced.
-
- A system that displayed such decomposed symbols would most likely
- employ a font which either (1) contained glyphs that represented the
- entire symbol; or (2) contained internal instructions that would allow
- it to position the radical properly. In both cases, the correct
- display geometry would be used. The display engine would have to
- map multiple coded character elements to single glyph references
- or multiple glyph references as appropriate.
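
The two font strategies can be sketched roughly as follows. This is an illustrative assumption, not a description of any real display engine: the glyph names, both lookup tables, and the Private Use Area stand-ins for the radicals are invented for the example.

```python
# Hypothetical combining radicals (Private Use Area stand-ins).
MAN_LEFT, TREE_RIGHT, WORD_LEFT = "\uE000", "\uE001", "\uE002"

# Case (1): the font has one glyph representing the entire symbol.
WHOLE_GLYPHS = {
    (MAN_LEFT, TREE_RIGHT): "glyph.rest",
}

# Case (2): the font has one glyph per radical and is assumed to carry
# its own positioning instructions for them.
PART_GLYPHS = {
    MAN_LEFT: "glyph.man.left",
    TREE_RIGHT: "glyph.tree.right",
    WORD_LEFT: "glyph.word.left",
}


def to_glyphs(sequence):
    """Map a sequence of coded elements to a list of glyph references."""
    key = tuple(sequence)
    if key in WHOLE_GLYPHS:
        # Many coded elements -> a single glyph reference.
        return [WHOLE_GLYPHS[key]]
    # Many coded elements -> multiple glyph references, one per element.
    return [PART_GLYPHS[code] for code in sequence]
```

With these tables, `to_glyphs(MAN_LEFT + TREE_RIGHT)` yields one glyph reference, while `to_glyphs(WORD_LEFT + TREE_RIGHT)` yields two, which the font would then have to position.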
-
- >A 16 bit font is insufficient for encoding rare characters, whichever way
- >you look at it, although having 16-bit CJK unification and a user-defined
- >character facility may be sufficient for an average user.
-
- Keep in mind that there is no necessary relation between a 16-bit character
- encoding and a 16-bit font. One can have a 16-bit character encoding like
- Unicode (with 20,902 precomposed Han characters, and possibly a collection
- of combining radical characters) and display with a 16-bit font that contains
- 2^16 Han glyphs, or even with a 24-bit font, a 32-bit font, etc. The
- relation of Unicode character code to font code is not defined by the
- Unicode display model.
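
To illustrate the separation, here is a small sketch in which the same 16-bit character code is looked up in two fonts with different glyph index widths. The mapping tables and index values are made up, and returning 0 for a missing glyph is a convention assumed for the example.

```python
# Two hypothetical fonts, each with its own character-code-to-glyph-index
# table. The character code is 16-bit in both cases.
SMALL_FONT = {0x4F11: 812}        # glyph index fits in 16 bits
LARGE_FONT = {0x4F11: 0x012345}   # glyph index needs more than 16 bits


def glyph_index(font, code):
    """Look up the font's glyph index for a character code.

    Returns 0 when the font has no glyph for the code (assumed convention).
    """
    return font.get(code, 0)
```

The character code 0x4F11 is the same in both lookups; only the font decides how wide its glyph indexes are and which index the code maps to.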
-
- Glenn Adams
-
-