NetNews Usenet Archive 1993 #3

Text File | 1993-01-24 | 3.1 KB | 72 lines

Newsgroups: comp.std.internat
Path: sparky!uunet!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!usenet.ins.cwru.edu!agate!dog.ee.lbl.gov!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: Cleanicode
Message-ID: <1993Jan25.001633.29534@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University (Ogden, UT)
References: <ISHIKAWA.93Jan20182546@ds5200.personal-media.co.jp> <1993Jan21.001303.20834@fcom.cc.utah.edu> <ISHIKAWA.93Jan21204416@ds5200.personal-media.co.jp>
Date: Mon, 25 Jan 93 00:16:33 GMT
Lines: 60

In article <ISHIKAWA.93Jan21204416@ds5200.personal-media.co.jp> ishikawa@personal-media.co.jp writes:
>   The problem is one of an existing character set having multiple
>   possible reverse translations of a single code point.
>
>   A similar condition would be a Japanese/Chinese character set standard
>   which had separate code points for characters unified by Unicode; if
>   such existed, then there would be no round-trip for characters translated
>   from that character set to the Unicode character set -- the correct code
>   point within the Japanese/Chinese combined set could not be identified
>   by the Unicode character itself.  No such example exists.
>
>I am an ignorant programmer.  So bear with me.  There is no such
>example, is there?  [From what I heard, strictly hearsay, mind you,
>there seem to be a few characters that would look slightly differently
>on printed paper and yet were put into the same code point.  This
>slight difference might be big enough for some to scream and small
>enough for others to ignore.  If someone knowledgeable could shed some
>light on this matter, I would be grateful.]

This is only true if the "round-tripping" were done from a character
recognition standpoint, where the input character could not be recognized
because it was drawn differently.
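
For anyone who wants the round-trip condition from the quoted text in
concrete terms, here is a minimal sketch.  The table is entirely
hypothetical: the source code points and the unified values are made up
for illustration and are not taken from any real standard, and the names
LEGACY_TO_UNIFIED, reverse_candidates and round_trips are my own.  It only
shows that once two source code points map to one unified code point, the
reverse translation has more than one candidate.

```python
# Hypothetical source-set -> unified-set table. Every value here is
# invented for illustration; none comes from a real character set standard.
LEGACY_TO_UNIFIED = {
    0x9915: 0x4E09,  # "Japanese" form
    0x9272: 0x4E09,  # "Chinese" form, unified with the entry above
    0x8140: 0x3000,  # a character with exactly one source code point
}


def reverse_candidates(table):
    """Group source code points by the unified code point they map to."""
    reverse = {}
    for src, dst in table.items():
        reverse.setdefault(dst, []).append(src)
    return reverse


def round_trips(table, src):
    """True if src is the only source code point for its unified value."""
    return reverse_candidates(table)[table[src]] == [src]


print(round_trips(LEGACY_TO_UNIFIED, 0x8140))  # True
print(round_trips(LEGACY_TO_UNIFIED, 0x9915))  # False
```

The second call fails the check because the reverse table holds two
candidates for the same unified value, which is exactly the "multiple
possible reverse translations of a single code point" case.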

While I can think of several examples here, they are pretty much
non-applicable because of the separation of the rendering engine technology
from the storage and process coding of the data.

A particular example would be a font with both Chinese and Japanese
characters which did not share code points, i.e.:

	--------	--------
	----		----
	--------	--------

	0x9915		0x9272
	(Japanese)	(Chinese)

I don't believe such a font exists.  It's possible to create one, but
unless it ends up widely accepted, I expect that this would not impact
future revisions of Unicode.

>Wonder if there was such combined
>Latin/Cyrillic/Greek standard at the time of Unicode design.

ISO 8859-5 and ISO 8859-7 seem to me to qualify.


					Terry Lambert
					terry@icarus.weber.edu
					terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of
my present or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------