List of CJK Unified Ideographs

 
CJK ideograph 次 in Simplified and Traditional Chinese, Japanese Kanji and Korean Hanja.

Unicode defines a total of 74,394 CJK Unified Ideographs, split across five blocks.

The terms ideographs or ideograms may be misleading, since the Chinese script is not strictly a picture writing system.

CJK Unified Ideographs block

The CJK Unified Ideographs block (4E00-9FFF) contains 20,940 basic Chinese characters: not only those used in the Chinese writing system but also the Kanji used in the Japanese writing system and the Hanja, whose use is diminishing in Korea. Many characters in this block are used in all three writing systems, while others are used in only one or two of the three. Chinese characters were also used in the Vietnamese Chữ Nôm script (now obsolete). The first 20,902 characters in the block are arranged according to the Kangxi Dictionary ordering of radicals. In this system, the characters written with the fewest strokes are listed first within each radical. The remaining characters were added later, and so are not in radical sequence.

The block is the result of Han unification[1], which was somewhat controversial in the Far East.[2] Since Chinese, Japanese and Korean characters were coded in the same location, the appearance of a selected glyph could depend on the particular font being used. However, the source separation rule states that characters encoded separately in an earlier character set would remain separate in the new Unicode encoding.[3]

Using variation selectors,[4] it is possible to specify certain variant CJK ideograms within Unicode. The Adobe-Japan1 character set proposal, which calls for 14,658 ideographic variation sequences,[4] is an extreme example of the use of variation selectors.[5]
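An ideographic variation sequence is a base ideograph followed by a variation selector. The Python sketch below illustrates the idea under stated assumptions: it takes the ideographic variation selectors to be the code points U+E0100 to U+E01EF, and the helper name `make_ivs` is hypothetical, not part of any library. Whether a given sequence is registered, and how it is displayed, depends on the variation database and the font in use.

```python
# Assumed range of the ideographic variation selectors (VS17..VS256).
VS_FIRST = 0xE0100
VS_LAST = 0xE01EF

def make_ivs(base: str, selector_index: int) -> str:
    """Return base followed by the selector at VS_FIRST + selector_index."""
    if len(base) != 1:
        raise ValueError("base must be a single character")
    selector = VS_FIRST + selector_index
    if not VS_FIRST <= selector <= VS_LAST:
        raise ValueError("selector index out of range")
    return base + chr(selector)

# U+845B is used here only as a sample base character.
seq = make_ivs("\u845b", 0)
print([f"U+{ord(c):04X}" for c in seq])
```

The result is a two-code-point string; software that does not support variation sequences will usually fall back to the default glyph of the base character.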

The following tables list the characters of the CJK Unified Ideographs block (4E00-9FFF), though without the official Unicode names and descriptions of each. For space reasons the character glyphs are divided among four separate articles.

  1. CJK Unified Ideographs, 4E00-62FF
  2. CJK Unified Ideographs, 6300-77FF
  3. CJK Unified Ideographs, 7800-8CFF
  4. CJK Unified Ideographs, 8D00-9FFF

The first 20,902 characters (4E00-9FA5) have been defined since Unicode version 1.0 (1991). 22 characters (9FA6-9FBB) were added in Unicode 4.1 (2005); 8 characters (9FBC-9FC3) were added in Unicode 5.1 (2008); and 8 characters (9FC4-9FCB) were added in Unicode 5.2 (2009).
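The counts above follow directly from the code point ranges. A minimal Python sketch that checks the arithmetic; the `additions` table and the `count` helper are illustrative names, not part of any Unicode API:

```python
# Code point ranges quoted in the text, as (first, last, Unicode version).
additions = [
    (0x4E00, 0x9FA5, "1.0"),
    (0x9FA6, 0x9FBB, "4.1"),
    (0x9FBC, 0x9FC3, "5.1"),
    (0x9FC4, 0x9FCB, "5.2"),
]

def count(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    return last - first + 1

sizes = [count(first, last) for first, last, _ in additions]
total = sum(sizes)
print(sizes, total)
```

The four sizes come out as 20,902, 22, 8 and 8, which add up to the 20,940 characters stated for the block.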

CJK Unified Ideographs Extension A

The CJK Unified Ideographs Extension A block (3400-4DBF) comprises 6,582 less common characters that were added in Unicode 3.0 (1999).

CJK Unified Ideographs Extension B

The CJK Unified Ideographs Extension B block (20000-2A6DF) comprises 42,711 characters that were added in Unicode 3.1 (2001). These include most of the characters used in the Kangxi Dictionary that are not in the basic CJK Unified Ideographs block, as well as many Chữ Nôm characters that were historically used for writing the Vietnamese language.

CJK Unified Ideographs Extension C

The CJK Unified Ideographs Extension C block (2A700-2B73F) comprises 4,149 characters that were added in Unicode 5.2 (2009).

CJK Compatibility Ideographs

The CJK Compatibility Ideographs block (F900-FAFF) includes twelve characters that, despite their names and location, are in fact classified as unified ideographs: FA0E, FA0F, FA11, FA13, FA14, FA1F, FA21, FA23, FA24, FA27, FA28 and FA29.
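The block ranges and counts given in this article can be combined into a simple lookup. The following Python sketch is illustrative only: the names `BLOCKS`, `COMPAT_UNIFIED` and `unified_block` are hypothetical, and the check uses whole block ranges, so it also accepts the unassigned code points at the end of each block.

```python
# Block ranges and character counts as quoted in this article.
BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF, 20940),
    "CJK Unified Ideographs Extension A": (0x3400, 0x4DBF, 6582),
    "CJK Unified Ideographs Extension B": (0x20000, 0x2A6DF, 42711),
    "CJK Unified Ideographs Extension C": (0x2A700, 0x2B73F, 4149),
}

# The twelve unified ideographs located in the compatibility block.
COMPAT_UNIFIED = {
    0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
    0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29,
}

def unified_block(ch: str):
    """Return the name of the block holding ch, or None if not unified."""
    cp = ord(ch)
    if cp in COMPAT_UNIFIED:
        return "CJK Compatibility Ideographs"
    for name, (first, last, _) in BLOCKS.items():
        if first <= cp <= last:
            return name
    return None

# Sum of the per-block counts plus the twelve compatibility-block characters.
total = sum(n for _, _, n in BLOCKS.values()) + len(COMPAT_UNIFIED)

print(unified_block("次"))       # a character in the basic block
print(unified_block("\ufa0e"))   # one of the twelve listed above
print(unified_block("\uf900"))   # compatibility ideograph, not unified
print(total)
```

Adding the four block counts to the twelve compatibility-block characters reproduces the overall total of 74,394.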





Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "List of CJK Unified Ideographs".