<oXygen/> User Guide

Unicode Character Encoding

The table below maps each encoding's common name and its name in XML files to the name that the Java Encoder displays when it cannot identify the encoding.

Table A.1. Unicode to Java Name Matrix

| Common Name | Name in XML files | Name Type | Java Encoder Name |
|---|---|---|---|
| 8-bit Unicode | UTF-8 | IANA | UTF8 |
| 16-bit Unicode | UTF-16 | IANA | Unicode |
| 16-bit Unicode little endian | UTF-16LE | IANA | UnicodeLittle |
| 16-bit Unicode big endian | UTF-16BE | IANA | UnicodeBig |
| ISO Latin 1 | ISO-8859-1 | MIME | ISO-8859-1 |
| ISO Latin 2 | ISO-8859-2 | MIME | ISO-8859-2 |
| ISO Latin 3 | ISO-8859-3 | MIME | ISO-8859-3 |
| ISO Latin 4 | ISO-8859-4 | MIME | ISO-8859-4 |
| ISO Latin Cyrillic | ISO-8859-5 | MIME | ISO-8859-5 |
| ISO Latin Arabic | ISO-8859-6 | MIME | ISO-8859-6 |
| ISO Latin Greek | ISO-8859-7 | MIME | ISO-8859-7 |
| ISO Latin Hebrew | ISO-8859-8 | MIME | ISO-8859-8 |
| ISO Latin 5 | ISO-8859-9 | MIME | ISO-8859-9 |
| EBCDIC: US | ebcdic-cp-us | IANA | cp037 |
| EBCDIC: Canada | ebcdic-cp-ca | IANA | cp037 |
| EBCDIC: Netherlands | ebcdic-cp-nl | IANA | cp037 |
| EBCDIC: Denmark | ebcdic-cp-dk | IANA | cp277 |
| EBCDIC: Norway | ebcdic-cp-no | IANA | cp277 |
| EBCDIC: Finland | ebcdic-cp-fi | IANA | cp278 |
| EBCDIC: Sweden | ebcdic-cp-se | IANA | cp278 |
| EBCDIC: Italy | ebcdic-cp-it | IANA | cp280 |
| EBCDIC: Spain, Latin America | ebcdic-cp-es | IANA | cp284 |
| EBCDIC: Great Britain | ebcdic-cp-gb | IANA | cp285 |
| EBCDIC: France | ebcdic-cp-fr | IANA | cp297 |
| EBCDIC: Arabic | ebcdic-cp-ar1 | IANA | cp420 |
| EBCDIC: Hebrew | ebcdic-cp-he | IANA | cp424 |
| EBCDIC: Switzerland | ebcdic-cp-ch | IANA | cp500 |
| EBCDIC: Roece | ebcdic-cp-roece | IANA | cp870 |
| EBCDIC: Yugoslavia | ebcdic-cp-yu | IANA | cp870 |
| EBCDIC: Iceland | ebcdic-cp-is | IANA | cp871 |
| EBCDIC: Urdu | ebcdic-cp-ar2 | IANA | cp918 |
| Chinese for PRC, mixed 1/2 byte | gb2312 | MIME | GB2312 |
| Extended Unix Code, packed for Japanese | euc-jp | MIME | eucjis |
| Japanese: iso-2022-jp | iso-2022-jp | MIME | JIS |
| Japanese: Shift JIS | Shift_JIS | MIME | SJIS |
| Chinese: Big5 | Big5 | MIME | Big5 |
| Extended Unix Code, packed for Korean | euc-kr | MIME | iso2022kr |
| Cyrillic | koi8-r | MIME | koi8-r |
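
Because the Java Encoder name is what appears when an encoding cannot be identified, the table is typically read from right to left: given the Java name, find the name to declare in the XML file. The following is a minimal sketch of that reverse lookup, not part of the product. It copies only a few rows of Table A.1, and the names `ROWS`, `xml_names_for`, and `java_name_for` are hypothetical helpers introduced for illustration. Case-insensitive matching is also an assumption made here for convenience.

```python
from collections import defaultdict

# A few rows copied from Table A.1 as (name in XML files, Java encoder name).
# This is a partial, illustrative subset, not the full table.
ROWS = [
    ("UTF-8", "UTF8"),
    ("UTF-16", "Unicode"),
    ("UTF-16LE", "UnicodeLittle"),
    ("UTF-16BE", "UnicodeBig"),
    ("ebcdic-cp-us", "cp037"),
    ("ebcdic-cp-ca", "cp037"),
    ("ebcdic-cp-nl", "cp037"),
    ("euc-jp", "eucjis"),
    ("Shift_JIS", "SJIS"),
]


def build_reverse_index(rows):
    """Group XML names by Java encoder name (lower-cased for lookup)."""
    index = defaultdict(list)
    for xml_name, java_name in rows:
        index[java_name.lower()].append(xml_name)
    return dict(index)


JAVA_TO_XML = build_reverse_index(ROWS)


def xml_names_for(java_name):
    """Return the XML names listed for a Java encoder name, or [] if unknown."""
    return JAVA_TO_XML.get(java_name.lower(), [])


def java_name_for(xml_name):
    """Return the Java encoder name listed for an XML name, or None if unknown."""
    for name, java_name in ROWS:
        if name.lower() == xml_name.lower():
            return java_name
    return None


print(xml_names_for("UnicodeLittle"))  # ['UTF-16LE']
print(xml_names_for("cp037"))          # ['ebcdic-cp-us', 'ebcdic-cp-ca', 'ebcdic-cp-nl']
print(java_name_for("Shift_JIS"))      # SJIS
```

The reverse index returns a list because several XML names can share one Java encoder name, as the three `cp037` rows show. In that case, the Java name alone does not determine which XML name the document originally declared.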