CJK Unified Ideographs
- For unified CJK characters, see Han Unification.
CJK Unified Ideographs is a range of Unicode code points assigned to the ideographs (Han characters) used in the Chinese, Japanese and Korean writing systems. Since their introduction in Unicode 1.0, CJK ideographs have been extended across multiple blocks.
Unicode ranges
These ideographic characters appear in the following blocks:
- CJK Unified Ideographs (4E00–9FFF) (chart)
- CJK Unified Ideographs Extension A (3400–4DBF) (chart)
- CJK Unified Ideographs Extension B (20000–2A6DF)
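As a rough illustration of how the three ranges above can be used, the following sketch reports which unified-ideograph block a character falls in. The table name `UNIFIED_BLOCKS` and the helper `unified_block` are hypothetical names chosen for this example. The check is by block range only, so it also accepts code points inside a block that have no character assigned.

```python
# Block boundaries copied from the list above.
# UNIFIED_BLOCKS and unified_block are illustrative names, not a standard API.
UNIFIED_BLOCKS = [
    ("CJK Unified Ideographs Extension A", 0x3400, 0x4DBF),
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
    ("CJK Unified Ideographs Extension B", 0x20000, 0x2A6DF),
]


def unified_block(ch):
    """Return the name of the unified-ideograph block containing ch, or None."""
    cp = ord(ch)
    for name, first, last in UNIFIED_BLOCKS:
        if first <= cp <= last:
            return name
    return None


print(unified_block("中"))          # U+4E2D, in the main block
print(unified_block("\u3400"))      # first code point of Extension A
print(unified_block("\U00020000"))  # first code point of Extension B
print(unified_block("A"))           # not an ideograph, so None
```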
Unicode also supports CJKV radicals, strokes, punctuation, marks and symbols in the following blocks:
- CJK Radicals Supplement (2E80–2EFF)
- CJK Symbols and Punctuation (3000–303F) (chart)
- CJK Strokes (31C0–31EF)
- Ideographic Description Characters (2FF0–2FFF)
Additional compatibility (discouraged use) characters appear in these blocks:
- Kangxi Radicals (2F00–2FDF)
- Enclosed CJK Letters and Months (3200–32FF) (chart)
- CJK Compatibility (3300–33FF) (chart)
- CJK Compatibility Ideographs (F900–FAFF) (chart)
- CJK Compatibility Ideographs Supplement (2F800–2FA1F)
- CJK Compatibility Forms (FE30–FE4F) (chart)
These characters are included for compatibility with legacy text-handling systems and other legacy character sets. They include forms of characters for vertical text layout and rich-text characters that Unicode recommends handling through other means.
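One common way for software to cope with such characters is Unicode normalization, which maps many of them back to their ordinary counterparts. The sketch below uses Python's standard `unicodedata` module on two sample code points chosen for this example: a Kangxi radical and a CJK compatibility ideograph. It is only an illustration; normalization discards the distinction between the compatibility character and its replacement, which may not suit every application.

```python
import unicodedata

# Sample characters picked for illustration.
kangxi_one = "\u2F00"        # KANGXI RADICAL ONE
compat_ideograph = "\uF900"  # CJK COMPATIBILITY IDEOGRAPH-F900

# The Kangxi radical has a compatibility mapping, so NFKC is needed to fold it.
folded_radical = unicodedata.normalize("NFKC", kangxi_one)

# The compatibility ideograph has a canonical mapping, so NFC already folds it.
folded_ideograph = unicodedata.normalize("NFC", compat_ideograph)

print(f"U+{ord(folded_radical):04X}")    # U+4E00
print(f"U+{ord(folded_ideograph):04X}")  # U+8C48
```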
Version history
| Unicode version | Addition | Plane | Characters | Total Characters |
|---|---|---|---|---|
| 1.0 | CJK Unified Ideographs | Basic Multilingual Plane (BMP) | 20,902 | 20,914 |
| 1.0 | CJK Compatibility Ideographs | BMP | 12 | 20,914 |
| 3.0 | CJK Unified Ideographs Extension A | BMP | 6,582 | 27,496 |
| 3.1 | CJK Unified Ideographs Extension B | Supplementary Ideographic Plane (SIP) | 42,711 | 70,207 |
| 4.1 | CJK Unified Ideographs: Ideographs from HKSCS-2004 and GB 18030-2000 not in ISO 10646 | BMP | 22 | 70,229 |
| 5.1 (expected) | CJK Unified Ideographs Extension C | SIP | 4,251 | 74,480 |
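The Total Characters column is a running sum of the Characters column, taken per Unicode version. The sketch below recomputes it from the figures in the table; the names `ADDITIONS`, `running_totals` and `TOTALS` are made up for this example.

```python
# (version, addition, characters added), transcribed from the table above.
# The short labels in the second field are abbreviations used only here.
ADDITIONS = [
    ("1.0", "CJK Unified Ideographs", 20902),
    ("1.0", "CJK Compatibility Ideographs", 12),
    ("3.0", "Extension A", 6582),
    ("3.1", "Extension B", 42711),
    ("4.1", "HKSCS-2004 and GB 18030-2000 additions", 22),
    ("5.1 (expected)", "Extension C", 4251),
]


def running_totals(additions):
    """Map each version to the cumulative character count after that version."""
    totals = {}
    total = 0
    for version, _addition, count in additions:
        total += count
        totals[version] = total  # later rows of the same version overwrite earlier ones
    return totals


TOTALS = running_totals(ADDITIONS)
for version, total in TOTALS.items():
    print(f"{version}: {total:,}")
```

Run as written, the computed totals agree with the table: 20,914 for 1.0, 27,496 for 3.0, 70,207 for 3.1, 70,229 for 4.1 and 74,480 for the expected 5.1 figure.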
Sources
CJK Unified Ideographs
The code points in this block were assigned under the Source Separation Rule. The characters come from the following sources:
PRC
| Code | Standard | Character count | note |
|---|---|---|---|
| G0 | GB 2312-80 | 6763 | |
| G1 | GB 12345-90 | 2352 | |
| G3 | GB 7589-87 traditional Chinese | 7237 | |
| G5 | GB 7590-87 traditional Chinese | 7039 | |
| G7 | Modern Chinese general character chart | 642 | |
| G8 | GB 8565-89 | 290 | |
Taiwan
| Code | Standard | Character count | note |
|---|---|---|---|
| T1 | CNS 11643-1986 plane 1 | 5401+9 | |
| T2 | CNS 11643-1986 plane 2 | 7650 | |
| TE | CNS 11643-1986 plane 14 | 6319+239+10 | 239 from CCCII, 10 from XCCS |
Japan
| Code | Standard | Character count | note |
|---|---|---|---|
| J0 | JIS X 0208-90 | 6335+1 | |
| J1 | JIS X 0212-90 | 5801 | |
South Korea
| Code | Standard | Character count | note |
|---|---|---|---|
| K0 | KS C 5601-87 | 4888 | includes 268 duplicates |
| K1 | KS C 5657-91 | 2856 | |
Others
- ANSI Z39.64-1989
- Big5
- CCCII plane 1
- GB 12052-89
- JEF
- Chinese telegraph code
- Taiwan telegraph code
- Xerox Chinese
In Unicode 4.1, 14 HKSCS-2004 characters and 8 GB 18030 characters were assigned to code points U+9FA6 through U+9FBB.
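The size of that range can be checked against the two character counts with a little arithmetic. This is a minimal sketch; `inclusive_span` is a hypothetical helper written for this example.

```python
def inclusive_span(first, last):
    """Number of code points from first to last, counting both ends."""
    return last - first + 1


HKSCS_2004_COUNT = 14  # from the sentence above
GB_18030_COUNT = 8     # from the sentence above

span = inclusive_span(0x9FA6, 0x9FBB)
print(span)                                       # 22
print(span == HKSCS_2004_COUNT + GB_18030_COUNT)  # True
```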
CJK Unified Ideographs Extension A
PRC
| Code | Standard |
|---|---|
| GE | GB 16500-95 |
| GS | Singapore CJK ideographs |
Taiwan
| Code | Standard | note |
|---|---|---|
| T3 | CNS 11643-1992 plane 3 | |
| T4 | CNS 11643-1992 plane 4 | |
| T5 | CNS 11643-1992 plane 5 | |
| T6 | CNS 11643-1992 plane 6 | |
| T7 | CNS 11643-1992 plane 7 | |
| TF | CNS 11643-1992 plane 15 | |
Japan
| Code | Standard | note |
|---|---|---|
| JA | Unified Japanese IT Vendors Contemporary Ideographs, 1993 | |
South Korea
| Code | Standard | note |
|---|---|---|
| K2 | | |
| K3 | | |
Vietnam
| Code | Standard | note |
|---|---|---|
| V0 | | |
| V1 | | |
CJK Unified Ideographs Extension B
- Kangxi dictionary
- Hanyu character dictionary
- Ciyuan
- Cihai
- Hanyu word dictionary
- Encyclopedia of China
- Beijing University Founder DTP
- Siku Quanshu
- HKSCS
- JIS X 0213 planes 3 and 4
- KPS 9566-97, KPS 10721-2000
- CNS 11643 planes 4-7, 15
- TCVN
CJK Unified Ideographs Extension C
Under the current proposal, 4251 characters will be added in the successor to Unicode 5.0, at code points U+2A6E0 through U+2B77A. The characters come from the following sources:
PRC
- Encyclopedia of China
- Beijing University Founder DTP
- Hanyu character dictionary
- Hanyu word dictionary
- Old hanyu word dictionary
- Commercial Press Ideographs
- Xiandaihanyu Cidian
- Cihai
- Kangxi dictionary
- Chinese Academy of Surveying & Mapping
- Modern Chinese Dialect Encyclopedia
- Yinzhou jinwen jicheng yinde (殷周金文集成引得)
Macau
Japan
- Japanese KOKUJI Collection
South Korea
- Korean IRG Hanja Character Set 5th Edition: 2001
North Korea
- KPS 10721:2003
Vietnam
- Từ điển chữ Nôm (喃字典), Nguyễn Quang Hồng, 2006
- Từ điển chữ Nôm Tày, Hoàng Triều Ân, 2003
- Bảng tra chữ Nôm miền Nam, Vũ Văn Kính, 1994
UTC
- ABC Chinese-English Dictionary, John DeFrancis (德范克), et al., eds., 2nd edition (1998), Honolulu: University of Hawaii Press
- The Church of Jesus Christ of Latter-Day Saints Hong Kong division
- Mathews' Chinese-English Dictionary, Robert H. Mathews (1975), Cambridge: Harvard University Press
- Guangyun
- Chinese bird system index (中国鸟类系统检索), Zheng Zuoxin (郑作新), et al. (2000), Beijing: Science Press (科学出版社) (www.sciencep.com)
- Annotated Shuowen Jiezi, Duan Yucai
CJK Unified Ideographs Extension D
According to the CJK editorial group report ISO/IEC JTC1/SC2/WG2/IRG N1266, the repertoire includes at least the characters from the following:
Taiwan
- TD-454E
- TC-5036
- TD-624C
- TD-5352
- TC-4139
- TC-4A76
- TD-5C26
Korea
- K5H00535
- K5H00222
- K5H00297
- KP1-73E1
- KP1-712E
- KP1-70BE
- KP1-6752
- KP1-672B
- KP1-6651
- KP1-4B50
- KP1-487E
- KP1-4731
Vietnam
- V04-5073
Unicode
- UTC00103
Others
- CJK Unified Ideographs Extension C Remainder list
- Macao SAR (IRGN1249 with minor adjustment)
- Unicode (IRGN1256 and IRGN1257, 472 characters)
- China (IRGN1264, 57 characters)
CJK Compatibility Ideographs