KS X 1001

MIME / IANA	ks_c_5601-1987
Alias(es)	KS C 5601
Language(s)	Korean, English, Russian; partial support: Greek, Japanese
Standard	KS X 1001
Classification	ISO-2022-compatible DBCS, CJK encoding
Encoding formats	EUC-KR, ISO 2022, UHC, Johab
Preceded by	N-byte Hangul code (KS C 5601-1974)
Other related encoding(s)	KS X 1002, KPS 9566, JIS X 0208, GB 2312, GB 12052

KS X 1001, "Code for Information Interchange (Hangul and Hanja)",[lower-alpha 1][1] formerly called KS C 5601, is a South Korean coded character set standard to represent hangul and hanja characters on a computer.

KS X 1001 is encoded by the most common legacy (pre-Unicode) character encodings for Korean, including EUC-KR and Microsoft's Unified Hangul Code (UHC). It contains Korean Hangul syllables, CJK ideographs (Hanja), Greek, Cyrillic, Japanese (Hiragana and Katakana) and some other characters.

KS X 1001 is arranged as a 94×94 table, following the structure of 2-byte code words in ISO 2022 and EUC. Therefore, its code points are pairs of integers 1–94. However, some encodings (UHC and Johab), in addition to providing codes for every code point, provide additional codes for characters otherwise representable only as code point sequences.

History

This standard was previously known as KS C 5601. It has been revised several times, including in 1987, 1992, 1998 and 2002.

The present, double-byte, Wansung (완성, Wanseong, 'precomposing')[1] character set was standardised by the third edition of KS C 5601,[2] which was published in 1986.[3] It is an ISO 2022 compatible encoding, typically used in EUC form, which assigns double-byte codes for non-Hangul, Hangul jamo, and the most common Hangul syllables, in contrast to Johab (조합, Johap, 'combining')[1] which assigns double-byte codes to all Hangul syllables using modern jamo. Wansung is technically a variable-length encoding, allowing other syllables to be represented with eight-byte sequences (using the jamo and Hangul Filler character), but this feature is not always implemented.[4]

The earliest edition of KS C 5601, published in 1974,[2] defined a variable-length[2] 7-bit character set which assigned single-byte code points to 51[3] basic Hangul jamo, somewhat analogously to JIS C 6220, in an encoding known as "N-byte Hangul".[5] The second edition, published in 1982, retained the main character set from the 1974 edition but defined two supplementary sets, including Johab. Neither edition was adopted as widely as intended.[2]

Wansung was kept unchanged in the 1987 and 1992 editions. In the 1992 edition, additional annex material was added,[3] including the definition of the Johab encoding[6] in annex 3, and the older N-byte Hangul encoding in annex 4.[1][5] The Johab definition was published in response to industry use of Johab as an encoding competing with Wansung; at the time, Johab was used by Hangul Word Processor. Following the introduction of Unified Hangul Code by Microsoft in Windows 95, and Hangul Word Processor abandoning Johab in favour of Unicode in 2000, Johab ceased to be commonly used.[2]

Encodings

Various CJK encodings, including four based on KS X 1001, supported by Mozilla Firefox as of 2004. (This support has been reduced in later versions to avoid certain cross site scripting attacks.)

Encoding schemes of KS X 1001 include EUC-KR (in both ASCII and ISO 646-KR based variants, the latter of which includes a won currency sign (₩) at byte 0x5C rather than a backslash) and ISO-2022-KR,[7] as well as ISO-2022-JP-2 (which also encodes JIS X 0208 and JIS X 0212). These all share the drawback that they assign codes only to the 2350 precomposed Hangul syllables which have their own KS X 1001 code points (out of 11172 in total, not counting those using obsolete jamo), and require all others to use eight-byte composition sequences, which are not supported by some partial implementations of the standard.[4]
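The 2350 versus 11172 figures can be checked empirically. The following is a minimal sketch, assuming Python's built-in `euc_kr` and `cp949` codecs as stand-ins for EUC-KR and UHC; the helper name `two_byte_count` is hypothetical. It counts only syllables that receive a plain two-byte code, so a codec that emits a longer composition sequence, or refuses the character outright, does not inflate the count.

```python
# Count modern Hangul syllables (U+AC00..U+D7A3) that get a plain
# two-byte code in a given codec. "euc_kr" and "cp949" are the Python
# codec names assumed here for EUC-KR and UHC respectively.

def two_byte_count(codec: str) -> int:
    count = 0
    for cp in range(0xAC00, 0xD7A4):
        try:
            encoded = chr(cp).encode(codec)
        except UnicodeEncodeError:
            continue  # the codec has no representation for this syllable
        if len(encoded) == 2:
            count += 1
    return count


if __name__ == "__main__":
    print("euc_kr:", two_byte_count("euc_kr"))
    print("cp949: ", two_byte_count("cp949"))
```
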

The Johab encoding (stipulated in annex 3 of the 1992 version of the standard) and the EUC-KR superset known as Unified Hangul Code (UHC, also called Windows-949) provide single codes for all 11172 Hangul syllables.[7][6] ISO-2022-KR and Johab are rarely used. Some operating systems extend this standard in other non-uniform ways, e.g. the EUC-KR extensions MacKorean on the classic Mac OS, and IBM-949 by IBM.

Hangul Filler

The Hangul Filler character is used to introduce eight-byte Hangul composition sequences[8][9] and to stand in for an absent element (usually an empty final) in such a sequence.[9]
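As an illustration, the sketch below builds such a sequence in EUC-KR form for a Unicode Hangul syllable. It assumes the order filler, initial, vowel, final (with a second filler standing in for an empty final) and relies on the row 4 layout, in which cell n corresponds to U+3130 + n. The function names are hypothetical, and the arithmetic used to split a syllable into jamo indices is the standard Unicode one rather than anything defined by KS X 1001.

```python
# Build an eight-byte KS X 1001 composition sequence (EUC-KR form).
# Assumed layout: Hangul Filler, initial, vowel, final-or-filler,
# each as a two-byte row 4 code.

INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"               # 19 initials
FINALS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"   # 27 finals
FIRST_VOWEL = 0x314F   # ㅏ; the 21 vowels are contiguous from here
FILLER = "\u3164"      # Hangul Filler, KS X 1001 code point 4-52


def row4_bytes(jamo: str) -> bytes:
    """EUC-KR bytes for a compatibility jamo in KS X 1001 row 4."""
    cell = ord(jamo) - 0x3130
    if not 1 <= cell <= 94:
        raise ValueError(f"not a row 4 character: {jamo!r}")
    return bytes([0xA4, 0xA0 + cell])


def composition_sequence(syllable: str) -> bytes:
    """Eight-byte sequence for a modern Hangul syllable."""
    index = ord(syllable) - 0xAC00
    if not 0 <= index < 11172:
        raise ValueError("not a modern Hangul syllable")
    initial, rest = divmod(index, 21 * 28)
    vowel, final = divmod(rest, 28)
    parts = [
        FILLER,
        INITIALS[initial],
        chr(FIRST_VOWEL + vowel),
        FINALS[final - 1] if final else FILLER,  # filler = empty final
    ]
    return b"".join(row4_bytes(p) for p in parts)


if __name__ == "__main__":
    print(composition_sequence("뢔").hex(" "))
```

For 뢔, which has no code point of its own in KS X 1001, this yields the filler, ㄹ, ㅙ and a second filler for the absent final.
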

Unicode includes the Wansung code Hangul Filler in the Hangul Compatibility Jamo block for round-trip compatibility, but uses its own system (with its own, differently used, filler characters) for composing Hangul. The KS X 1001 Hangul composition system is not used in Unicode, and the filler renders merely as an empty space; KS X 1001 composition sequences using modern jamo may be mapped to precomposed characters in Unicode.[9] This is not usually done with Unified Hangul Code.

For round-trip compatibility, Unicode also includes the N-byte Hangul code Hangul Filler separately in the Halfwidth and Fullwidth Forms block, named the "Halfwidth Hangul Filler".

N-byte Hangul code

This is the N-byte Hangul code,[5] as specified by KS C 5601-1974 and by annex 4 of KS C 5601-1992. The second half of IBM's Code page 1040[10] is a superset of this, assigning the characters ¢¬\~ (although not £) to the same locations as in Code page 1041. Character 0x40/0xC0 is a Hangul Filler (see above), used in combining sequences.

Similarly to its Japanese counterpart JIS C 6220 (JIS X 0201), N-byte Hangul code could be used as a 7-bit encoding, with character allocations over the range 0x40 through 0x7C.[5] The chart below shows the code in an 8-bit environment with the high bit set (i.e. over 0xC0 through 0xFC), as it is used in e.g. code page 1040.

KS C 5601-1974 / N-byte Hangul[11]
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
8_ 128
9_ 144
A_ 160
B_ 176
C_ 192	HWHF FFA0	ﾡ FFA1	ﾢ FFA2	ﾣ FFA3	ﾤ FFA4	ﾥ FFA5	ﾦ FFA6	ﾧ FFA7	ﾨ FFA8	ﾩ FFA9	ﾪ FFAA	ﾫ FFAB	ﾬ FFAC	ﾭ FFAD	ﾮ FFAE	ﾯ FFAF
D_ 208	ﾰ FFB0	ﾱ FFB1	ﾲ FFB2	ﾳ FFB3	ﾴ FFB4	ﾵ FFB5	ﾶ FFB6	ﾷ FFB7	ﾸ FFB8	ﾹ FFB9	ﾺ FFBA	ﾻ FFBB	ﾼ FFBC	ﾽ FFBD	ﾾ FFBE
E_ 224			ￂ FFC2	ￃ FFC3	ￄ FFC4	ￅ FFC5	ￆ FFC6	ￇ FFC7			ￊ FFCA	ￋ FFCB	ￌ FFCC	ￍ FFCD	ￎ FFCE	ￏ FFCF
F_ 240			ￒ FFD2	ￓ FFD3	ￔ FFD4	ￕ FFD5	ￖ FFD6	ￗ FFD7			ￚ FFDA	ￛ FFDB	ￜ FFDC
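The Unicode values in the chart above follow a simple pattern: each defined byte maps to U+FFA0 plus its offset from 0xC0. A minimal sketch of that mapping follows; the function name and the `seven_bit` flag are hypothetical, and the defined ranges are read directly off the chart.

```python
from typing import Optional

# Byte ranges (8-bit form) that the chart above assigns to characters.
DEFINED_RANGES = [
    (0xC0, 0xDE),  # Hangul Filler and consonants
    (0xE2, 0xE7),  # vowels
    (0xEA, 0xEF),
    (0xF2, 0xF7),
    (0xFA, 0xFC),
]


def nbyte_to_unicode(byte: int, seven_bit: bool = False) -> Optional[str]:
    """Map one N-byte Hangul code to its halfwidth Unicode character.

    With seven_bit=True the byte is taken from the 0x40-0x7C form and
    the high bit is set before the lookup. Undefined bytes give None.
    """
    if seven_bit:
        byte |= 0x80
    for low, high in DEFINED_RANGES:
        if low <= byte <= high:
            return chr(0xFFA0 + (byte - 0xC0))
    return None


if __name__ == "__main__":
    defined = [b for b in range(0x100) if nbyte_to_unicode(b) is not None]
    print(len(defined), "defined codes")
```

The count of defined codes covers the filler plus the basic jamo.
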

Wansung code charts

Following are the code charts for KS X 1001 in Wansung layout. Where a pair of hexadecimal numbers is given, the smaller is used when the set is encoded over GL (0x21–0x7E), as in ISO-2022-KR after shifting to the Korean set, and the larger is used in the more typical case of encoding over GR (0xA1–0xFE), as in EUC-KR or UHC. Johab changes the arrangement to encode all 11172 Hangul clusters separately and in order.
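The relationship between a row-cell code point and its GL or GR byte pair is a fixed offset, as the following sketch shows. The function name `code_point_to_bytes` is hypothetical; the final line uses Python's built-in `euc_kr` codec only as a cross-check.

```python
def code_point_to_bytes(row: int, cell: int, gr: bool = True) -> bytes:
    """Convert a KS X 1001 row-cell pair (each 1-94) to a byte pair.

    gr=True gives the GR form (0xA1-0xFE) used by EUC-KR and UHC;
    gr=False gives the GL form (0x21-0x7E) used by ISO-2022-KR.
    """
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    offset = 0xA0 if gr else 0x20
    return bytes([row + offset, cell + offset])


if __name__ == "__main__":
    gr_form = code_point_to_bytes(16, 1)            # first Hangul syllable
    gl_form = code_point_to_bytes(16, 1, gr=False)
    print(gr_form.hex(), gl_form.hex(), gr_form.decode("euc_kr"))
```
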

Character set 0x21 / 0xA1 (row number 1, special characters)

This set contains punctuation and other symbols, excluding punctuation present in KS X 1003 (which is included in row 3). Encodings which combine KS X 1001 with single-byte ASCII may use alternative Unicode mapping to the Halfwidth and Fullwidth Forms block for the backslash. Unicode mapping of the wave dash (tilde dash) also differs between vendors, and may be U+301C (favoured by IBM and Apple)[12][13][14] or U+223C (favoured by Microsoft).[15][16] Compare the similar but not identical handling of the JIS wave dash, and the handling of the tilde in the next row.

Except for the backslash, if two mappings are shown below, the first is used by Apple and the second is used by Microsoft.[14][16]

KS X 1001 (prefixed with 0x21 / 0xA1)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		IDSP 3000 1-1	、 3001 1-2	。 3002 1-3	· 00B7 1-4	‥ 2025 1-5	… 2026 1-6	¨ 00A8 1-7	〃 3003 1-8	–/SHY 2013/00AD 1-9	—/― 2014/2015 1-10	‖/∥ 2016/2225 1-11	\/＼ 005C/FF3C 1-12	〜/∼ 301C/223C 1-13	‘ 2018 1-14	’ 2019 1-15
3_/B_	“ 201C 1-16	” 201D 1-17	〔 3014 1-18	〕 3015 1-19	〈 3008 1-20	〉 3009 1-21	《 300A 1-22	》 300B 1-23	「 300C 1-24	」 300D 1-25	『 300E 1-26	』 300F 1-27	【 3010 1-28	】 3011 1-29	± 00B1 1-30	× 00D7 1-31
4_/C_	÷ 00F7 1-32	≠ 2260 1-33	≤ 2264 1-34	≥ 2265 1-35	∞ 221E 1-36	∴ 2234 1-37	° 00B0 1-38	′ 2032 1-39	″ 2033 1-40	℃ 2103 1-41	Å 212B 1-42	¢/￠ 00A2/FFE0 1-43	£/￡ 00A3/FFE1 1-44	¥/￥ 00A5/FFE5 1-45	♂ 2642 1-46	♀ 2640 1-47
5_/D_	∠ 2220 1-48	⊥ 22A5 1-49	⌒ 2312 1-50	∂ 2202 1-51	∇ 2207 1-52	≡ 2261 1-53	≒ 2252 1-54	§ 00A7 1-55	※ 203B 1-56	☆ 2606 1-57	★ 2605 1-58	○ 25CB 1-59	● 25CF 1-60	◎ 25CE 1-61	◇ 25C7 1-62	◆ 25C6 1-63
6_/E_	□ 25A1 1-64	■ 25A0 1-65	△ 25B3 1-66	▲ 25B2 1-67	▽ 25BD 1-68	▼ 25BC 1-69	→ 2192 1-70	← 2190 1-71	↑ 2191 1-72	↓ 2193 1-73	↔ 2194 1-74	〓 3013 1-75	≪ 226A 1-76	≫ 226B 1-77	√ 221A 1-78	∽ 223D 1-79
7_/F_	∝ 221D 1-80	∵ 2235 1-81	∫ 222B 1-82	∬ 222C 1-83	∈ 2208 1-84	∋ 220B 1-85	⊆ 2286 1-86	⊇ 2287 1-87	⊂ 2282 1-88	⊃ 2283 1-89	∪ 222A 1-90	∩ 2229 1-91	∧ 2227 1-92	∨ 2228 1-93	¬/￢ 00AC/FFE2 1-94


Character set 0x22 / 0xA2 (row number 2, special characters)

This set contains additional punctuation and symbols. Similarly to the tilde character in the previous row, different mappings are used by Apple and Microsoft for the tilde character in this row (U+02DC by Apple, U+FF5E by Microsoft),[14][16] which is intended to be shown as a raised tilde, whereas the tilde in the previous row is intended to be shown in-line at dash height.[17] Mapping of the circled dot also differs.[14][16]

The euro and registered trademark sign were added in 1998, while the postal mark (㉾) was added in 2002.[1]

KS X 1001 (prefixed with 0x22 / 0xA2)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		⇒ 21D2 2-1	⇔ 21D4 2-2	∀ 2200 2-3	∃ 2203 2-4	´ 00B4 2-5	˜/～ 02DC/FF5E 2-6	ˇ 02C7 2-7	˘ 02D8 2-8	˝ 02DD 2-9	˚ 02DA 2-10	˙ 02D9 2-11	¸ 00B8 2-12	˛ 02DB 2-13	¡ 00A1 2-14	¿ 00BF 2-15
3_/B_	ː 02D0 2-16	∮ 222E 2-17	∑ 2211 2-18	∏ 220F 2-19	¤ 00A4 2-20	℉ 2109 2-21	‰ 2030 2-22	◁ 25C1 2-23	◀ 25C0 2-24	▷ 25B7 2-25	▶ 25B6 2-26	♤ 2664 2-27	♠ 2660 2-28	♡ 2661 2-29	♥ 2665 2-30	♧ 2667 2-31
4_/C_	♣ 2663 2-32	◉/⊙ 25C9/2299 2-33	◈ 25C8 2-34	▣ 25A3 2-35	◐ 25D0 2-36	◑ 25D1 2-37	▒ 2592 2-38	▤ 25A4 2-39	▥ 25A5 2-40	▨ 25A8 2-41	▧ 25A7 2-42	▦ 25A6 2-43	▩ 25A9 2-44	♨ 2668 2-45	☏ 260F 2-46	☎ 260E 2-47
5_/D_	☜ 261C 2-48	☞ 261E 2-49	¶ 00B6 2-50	† 2020 2-51	‡ 2021 2-52	↕ 2195 2-53	↗ 2197 2-54	↙ 2199 2-55	↖ 2196 2-56	↘ 2198 2-57	♭ 266D 2-58	♩ 2669 2-59	♪ 266A 2-60	♬ 266C 2-61	㉿ 327F 2-62	㈜ 321C 2-63
6_/E_	№ 2116 2-64	㏇ 33C7 2-65	™ 2122 2-66	㏂ 33C2 2-67	㏘ 33D8 2-68	℡ 2121 2-69	€ 20AC 2-70	® 00AE 2-71	㉾ 327E 2-72	2-73	2-74	2-75	2-76	2-77	2-78	2-79
7_/F_	2-80	2-81	2-82	2-83	2-84	2-85	2-86	2-87	2-88	2-89	2-90	2-91	2-92	2-93	2-94

Character set 0x23 / 0xA3 (row number 3, basic Latin / ISO 646-KR)

This set corresponds to KS X 1003 (the ISO 646 variant for Korean, a similar set to ASCII), but as two-byte codes preceded by 0x23 (or 0xA3 in GR-delegated (EUC) form). It includes the English alphabet / Basic Latin alphabet, western Arabic numerals and punctuation.

Compare the Roman set of JIS X 0201, which differs by including a Yen sign rather than a Won sign. Contrast the third rows of KPS 9566 and of JIS X 0208, which follow the ISO 646 layout but only include letters and digits.

KS X 1001 (prefixed with 0x23 / 0xA3); non-fullwidth mappings
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		! 0021 3-1	" 0022 3-2	# 0023 3-3	$ 0024 3-4	% 0025 3-5	& 0026 3-6	' 0027 3-7	( 0028 3-8	) 0029 3-9	* 002A 3-10	+ 002B 3-11	, 002C 3-12	- 002D 3-13	. 002E 3-14	/ 002F 3-15
3_/B_	0 0030 3-16	1 0031 3-17	2 0032 3-18	3 0033 3-19	4 0034 3-20	5 0035 3-21	6 0036 3-22	7 0037 3-23	8 0038 3-24	9 0039 3-25	: 003A 3-26	; 003B 3-27	< 003C 3-28	= 003D 3-29	> 003E 3-30	? 003F 3-31
4_/C_	@ 0040 3-32	A 0041 3-33	B 0042 3-34	C 0043 3-35	D 0044 3-36	E 0045 3-37	F 0046 3-38	G 0047 3-39	H 0048 3-40	I 0049 3-41	J 004A 3-42	K 004B 3-43	L 004C 3-44	M 004D 3-45	N 004E 3-46	O 004F 3-47
5_/D_	P 0050 3-48	Q 0051 3-49	R 0052 3-50	S 0053 3-51	T 0054 3-52	U 0055 3-53	V 0056 3-54	W 0057 3-55	X 0058 3-56	Y 0059 3-57	Z 005A 3-58	[ 005B 3-59	₩ 20A9 3-60	] 005D 3-61	^ 005E 3-62	_ 005F 3-63
6_/E_	` 0060 3-64	a 0061 3-65	b 0062 3-66	c 0063 3-67	d 0064 3-68	e 0065 3-69	f 0066 3-70	g 0067 3-71	h 0068 3-72	i 0069 3-73	j 006A 3-74	k 006B 3-75	l 006C 3-76	m 006D 3-77	n 006E 3-78	o 006F 3-79
7_/F_	p 0070 3-80	q 0071 3-81	r 0072 3-82	s 0073 3-83	t 0074 3-84	u 0075 3-85	v 0076 3-86	w 0077 3-87	x 0078 3-88	y 0079 3-89	z 007A 3-90	{ 007B 3-91	\| 007C 3-92	} 007D 3-93	‾ 203E 3-94

Encodings such as EUC-KR and UHC combine KS X 1001 with single-byte ASCII or KS X 1003, and hence use alternative Unicode mappings to the Halfwidth and Fullwidth Forms block for the double-byte representations of these characters.

KS X 1001 (prefixed with 0x23 / 0xA3); fullwidth mappings
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		！ FF01 3-1	＂ FF02 3-2	＃ FF03 3-3	＄ FF04 3-4	％ FF05 3-5	＆ FF06 3-6	＇ FF07 3-7	（ FF08 3-8	） FF09 3-9	＊ FF0A 3-10	＋ FF0B 3-11	， FF0C 3-12	－ FF0D 3-13	． FF0E 3-14	／ FF0F 3-15
3_/B_	０ FF10 3-16	１ FF11 3-17	２ FF12 3-18	３ FF13 3-19	４ FF14 3-20	５ FF15 3-21	６ FF16 3-22	７ FF17 3-23	８ FF18 3-24	９ FF19 3-25	： FF1A 3-26	； FF1B 3-27	＜ FF1C 3-28	＝ FF1D 3-29	＞ FF1E 3-30	？ FF1F 3-31
4_/C_	＠ FF20 3-32	Ａ FF21 3-33	Ｂ FF22 3-34	Ｃ FF23 3-35	Ｄ FF24 3-36	Ｅ FF25 3-37	Ｆ FF26 3-38	Ｇ FF27 3-39	Ｈ FF28 3-40	Ｉ FF29 3-41	Ｊ FF2A 3-42	Ｋ FF2B 3-43	Ｌ FF2C 3-44	Ｍ FF2D 3-45	Ｎ FF2E 3-46	Ｏ FF2F 3-47
5_/D_	Ｐ FF30 3-48	Ｑ FF31 3-49	Ｒ FF32 3-50	Ｓ FF33 3-51	Ｔ FF34 3-52	Ｕ FF35 3-53	Ｖ FF36 3-54	Ｗ FF37 3-55	Ｘ FF38 3-56	Ｙ FF39 3-57	Ｚ FF3A 3-58	［ FF3B 3-59	￦ FFE6 3-60	］ FF3D 3-61	＾ FF3E 3-62	＿ FF3F 3-63
6_/E_	｀ FF40 3-64	ａ FF41 3-65	ｂ FF42 3-66	ｃ FF43 3-67	ｄ FF44 3-68	ｅ FF45 3-69	ｆ FF46 3-70	ｇ FF47 3-71	ｈ FF48 3-72	ｉ FF49 3-73	ｊ FF4A 3-74	ｋ FF4B 3-75	ｌ FF4C 3-76	ｍ FF4D 3-77	ｎ FF4E 3-78	ｏ FF4F 3-79
7_/F_	ｐ FF50 3-80	ｑ FF51 3-81	ｒ FF52 3-82	ｓ FF53 3-83	ｔ FF54 3-84	ｕ FF55 3-85	ｖ FF56 3-86	ｗ FF57 3-87	ｘ FF58 3-88	ｙ FF59 3-89	ｚ FF5A 3-90	｛ FF5B 3-91	｜ FF5C 3-92	｝ FF5D 3-93	￣ FFE3 3-94
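Both charts for row 3 reduce to an offset from the cell number, with two exceptions (cells 60 and 94). A minimal sketch follows; the function name `row3_to_unicode` is hypothetical and the values are read directly off the two charts above.

```python
def row3_to_unicode(cell: int, fullwidth: bool = True) -> str:
    """Unicode character for KS X 1001 code point 3-<cell>.

    fullwidth=True follows the fullwidth chart (as used alongside a
    single-byte set); fullwidth=False follows the non-fullwidth chart.
    """
    if not 1 <= cell <= 94:
        raise ValueError("cell must be in 1..94")
    if cell == 60:   # won sign instead of a backslash
        return "\uffe6" if fullwidth else "\u20a9"
    if cell == 94:   # overline instead of a tilde
        return "\uffe3" if fullwidth else "\u203e"
    return chr((0xFF00 if fullwidth else 0x0020) + cell)


if __name__ == "__main__":
    print(row3_to_unicode(33), row3_to_unicode(33, fullwidth=False))
```
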

Character set 0x24 / 0xA4 (row number 4, Hangul jamo)

This set includes modern Hangul consonants, followed by vowels, both ordered by South Korean collation customs, followed by obsolete consonants. When used individually, these characters map to the Unicode Hangul Compatibility Jamo block, and do not have a one-to-one mapping with the position-specific characters in the Hangul Jamo block. Compare with row 4 of the North Korean KPS 9566. Character 4-52 is a Hangul Filler (see above), used in combining sequences.

KS X 1001 (prefixed with 0x24 / 0xA4)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		ㄱ 3131 4-1	ㄲ 3132 4-2	ㄳ 3133 4-3	ㄴ 3134 4-4	ㄵ 3135 4-5	ㄶ 3136 4-6	ㄷ 3137 4-7	ㄸ 3138 4-8	ㄹ 3139 4-9	ㄺ 313A 4-10	ㄻ 313B 4-11	ㄼ 313C 4-12	ㄽ 313D 4-13	ㄾ 313E 4-14	ㄿ 313F 4-15
3_/B_	ㅀ 3140 4-16	ㅁ 3141 4-17	ㅂ 3142 4-18	ㅃ 3143 4-19	ㅄ 3144 4-20	ㅅ 3145 4-21	ㅆ 3146 4-22	ㅇ 3147 4-23	ㅈ 3148 4-24	ㅉ 3149 4-25	ㅊ 314A 4-26	ㅋ 314B 4-27	ㅌ 314C 4-28	ㅍ 314D 4-29	ㅎ 314E 4-30	ㅏ 314F 4-31
4_/C_	ㅐ 3150 4-32	ㅑ 3151 4-33	ㅒ 3152 4-34	ㅓ 3153 4-35	ㅔ 3154 4-36	ㅕ 3155 4-37	ㅖ 3156 4-38	ㅗ 3157 4-39	ㅘ 3158 4-40	ㅙ 3159 4-41	ㅚ 315A 4-42	ㅛ 315B 4-43	ㅜ 315C 4-44	ㅝ 315D 4-45	ㅞ 315E 4-46	ㅟ 315F 4-47
5_/D_	ㅠ 3160 4-48	ㅡ 3161 4-49	ㅢ 3162 4-50	ㅣ 3163 4-51	HF 3164 4-52	ㅥ 3165 4-53	ㅦ 3166 4-54	ㅧ 3167 4-55	ㅨ 3168 4-56	ㅩ 3169 4-57	ㅪ 316A 4-58	ㅫ 316B 4-59	ㅬ 316C 4-60	ㅭ 316D 4-61	ㅮ 316E 4-62	ㅯ 316F 4-63
6_/E_	ㅰ 3170 4-64	ㅱ 3171 4-65	ㅲ 3172 4-66	ㅳ 3173 4-67	ㅴ 3174 4-68	ㅵ 3175 4-69	ㅶ 3176 4-70	ㅷ 3177 4-71	ㅸ 3178 4-72	ㅹ 3179 4-73	ㅺ 317A 4-74	ㅻ 317B 4-75	ㅼ 317C 4-76	ㅽ 317D 4-77	ㅾ 317E 4-78	ㅿ 317F 4-79
7_/F_	ㆀ 3180 4-80	ㆁ 3181 4-81	ㆂ 3182 4-82	ㆃ 3183 4-83	ㆄ 3184 4-84	ㆅ 3185 4-85	ㆆ 3186 4-86	ㆇ 3187 4-87	ㆈ 3188 4-88	ㆉ 3189 4-89	ㆊ 318A 4-90	ㆋ 318B 4-91	ㆌ 318C 4-92	ㆍ 318D 4-93	ㆎ 318E 4-94

Character set 0x25 / 0xA5 (row number 5, Roman numerals and Greek)

This set contains Roman numerals and basic support for the Greek alphabet, without diacritics or the final sigma.

Contrast row 6 of KPS 9566, which includes the same characters but in a different layout.

KS X 1001 (prefixed with 0x25 / 0xA5)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		ⅰ 2170 5-1	ⅱ 2171 5-2	ⅲ 2172 5-3	ⅳ 2173 5-4	ⅴ 2174 5-5	ⅵ 2175 5-6	ⅶ 2176 5-7	ⅷ 2177 5-8	ⅸ 2178 5-9	ⅹ 2179 5-10	5-11	5-12	5-13	5-14	5-15
3_/B_	Ⅰ 2160 5-16	Ⅱ 2161 5-17	Ⅲ 2162 5-18	Ⅳ 2163 5-19	Ⅴ 2164 5-20	Ⅵ 2165 5-21	Ⅶ 2166 5-22	Ⅷ 2167 5-23	Ⅸ 2168 5-24	Ⅹ 2169 5-25	5-26	5-27	5-28	5-29	5-30	5-31
4_/C_	5-32	Α 0391 5-33	Β 0392 5-34	Γ 0393 5-35	Δ 0394 5-36	Ε 0395 5-37	Ζ 0396 5-38	Η 0397 5-39	Θ 0398 5-40	Ι 0399 5-41	Κ 039A 5-42	Λ 039B 5-43	Μ 039C 5-44	Ν 039D 5-45	Ξ 039E 5-46	Ο 039F 5-47
5_/D_	Π 03A0 5-48	Ρ 03A1 5-49	Σ 03A3 5-50	Τ 03A4 5-51	Υ 03A5 5-52	Φ 03A6 5-53	Χ 03A7 5-54	Ψ 03A8 5-55	Ω 03A9 5-56	5-57	5-58	5-59	5-60	5-61	5-62	5-63
6_/E_	5-64	α 03B1 5-65	β 03B2 5-66	γ 03B3 5-67	δ 03B4 5-68	ε 03B5 5-69	ζ 03B6 5-70	η 03B7 5-71	θ 03B8 5-72	ι 03B9 5-73	κ 03BA 5-74	λ 03BB 5-75	μ 03BC 5-76	ν 03BD 5-77	ξ 03BE 5-78	ο 03BF 5-79
7_/F_	π 03C0 5-80	ρ 03C1 5-81	σ 03C3 5-82	τ 03C4 5-83	υ 03C5 5-84	φ 03C6 5-85	χ 03C7 5-86	ψ 03C8 5-87	ω 03C9 5-88	5-89	5-90	5-91	5-92	5-93	5-94

Character set 0x26 / 0xA6 (row number 6, box drawing)

KS X 1001 (prefixed with 0x26 / 0xA6)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		─ 2500 6-1	│ 2502 6-2	┌ 250C 6-3	┐ 2510 6-4	┘ 2518 6-5	└ 2514 6-6	├ 251C 6-7	┬ 252C 6-8	┤ 2524 6-9	┴ 2534 6-10	┼ 253C 6-11	━ 2501 6-12	┃ 2503 6-13	┏ 250F 6-14	┓ 2513 6-15
3_/B_	┛ 251B 6-16	┗ 2517 6-17	┣ 2523 6-18	┳ 2533 6-19	┫ 252B 6-20	┻ 253B 6-21	╋ 254B 6-22	┠ 2520 6-23	┯ 252F 6-24	┨ 2528 6-25	┷ 2537 6-26	┿ 253F 6-27	┝ 251D 6-28	┰ 2530 6-29	┥ 2525 6-30	┸ 2538 6-31
4_/C_	╂ 2542 6-32	┒ 2512 6-33	┑ 2511 6-34	┚ 251A 6-35	┙ 2519 6-36	┖ 2516 6-37	┕ 2515 6-38	┎ 250E 6-39	┍ 250D 6-40	┞ 251E 6-41	┟ 251F 6-42	┡ 2521 6-43	┢ 2522 6-44	┦ 2526 6-45	┧ 2527 6-46	┩ 2529 6-47
5_/D_	┪ 252A 6-48	┭ 252D 6-49	┮ 252E 6-50	┱ 2531 6-51	┲ 2532 6-52	┵ 2535 6-53	┶ 2536 6-54	┹ 2539 6-55	┺ 253A 6-56	┽ 253D 6-57	┾ 253E 6-58	╀ 2540 6-59	╁ 2541 6-60	╃ 2543 6-61	╄ 2544 6-62	╅ 2545 6-63
6_/E_	╆ 2546 6-64	╇ 2547 6-65	╈ 2548 6-66	╉ 2549 6-67	╊ 254A 6-68	6-69	6-70	6-71	6-72	6-73	6-74	6-75	6-76	6-77	6-78	6-79
7_/F_	6-80	6-81	6-82	6-83	6-84	6-85	6-86	6-87	6-88	6-89	6-90	6-91	6-92	6-93	6-94

Character set 0x27 / 0xA7 (row number 7, unit symbols)

KS X 1001 (prefixed with 0x27 / 0xA7)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		㎕ 3395 7-1	㎖ 3396 7-2	㎗ 3397 7-3	ℓ 2113 7-4	㎘ 3398 7-5	㏄ 33C4 7-6	㎣ 33A3 7-7	㎤ 33A4 7-8	㎥ 33A5 7-9	㎦ 33A6 7-10	㎙ 3399 7-11	㎚ 339A 7-12	㎛ 339B 7-13	㎜ 339C 7-14	㎝ 339D 7-15
3_/B_	㎞ 339E 7-16	㎟ 339F 7-17	㎠ 33A0 7-18	㎡ 33A1 7-19	㎢ 33A2 7-20	㏊ 33CA 7-21	㎍ 338D 7-22	㎎ 338E 7-23	㎏ 338F 7-24	㏏ 33CF 7-25	㎈ 3388 7-26	㎉ 3389 7-27	㏈ 33C8 7-28	㎧ 33A7 7-29	㎨ 33A8 7-30	㎰ 33B0 7-31
4_/C_	㎱ 33B1 7-32	㎲ 33B2 7-33	㎳ 33B3 7-34	㎴ 33B4 7-35	㎵ 33B5 7-36	㎶ 33B6 7-37	㎷ 33B7 7-38	㎸ 33B8 7-39	㎹ 33B9 7-40	㎀ 3380 7-41	㎁ 3381 7-42	㎂ 3382 7-43	㎃ 3383 7-44	㎄ 3384 7-45	㎺ 33BA 7-46	㎻ 33BB 7-47
5_/D_	㎼ 33BC 7-48	㎽ 33BD 7-49	㎾ 33BE 7-50	㎿ 33BF 7-51	㎐ 3390 7-52	㎑ 3391 7-53	㎒ 3392 7-54	㎓ 3393 7-55	㎔ 3394 7-56	Ω 2126 7-57	㏀ 33C0 7-58	㏁ 33C1 7-59	㎊ 338A 7-60	㎋ 338B 7-61	㎌ 338C 7-62	㏖ 33D6 7-63
6_/E_	㏅ 33C5 7-64	㎭ 33AD 7-65	㎮ 33AE 7-66	㎯ 33AF 7-67	㏛ 33DB 7-68	㎩ 33A9 7-69	㎪ 33AA 7-70	㎫ 33AB 7-71	㎬ 33AC 7-72	㏝ 33DD 7-73	㏐ 33D0 7-74	㏓ 33D3 7-75	㏃ 33C3 7-76	㏉ 33C9 7-77	㏜ 33DC 7-78	㏆ 33C6 7-79
7_/F_	7-80	7-81	7-82	7-83	7-84	7-85	7-86	7-87	7-88	7-89	7-90	7-91	7-92	7-93	7-94

Character set 0x28 / 0xA8 (row number 8, extended Latin, encircled, fractions)

KS X 1001 (prefixed with 0x28 / 0xA8)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		Æ 00C6 8-1	Ð 00D0 8-2	ª 00AA 8-3	Ħ 0126 8-4	8-5	Ĳ 0132 8-6	8-7	Ŀ 013F 8-8	Ł 0141 8-9	Ø 00D8 8-10	Œ 0152 8-11	º 00BA 8-12	Þ 00DE 8-13	Ŧ 0166 8-14	Ŋ 014A 8-15
3_/B_	8-16	㉠ 3260 8-17	㉡ 3261 8-18	㉢ 3262 8-19	㉣ 3263 8-20	㉤ 3264 8-21	㉥ 3265 8-22	㉦ 3266 8-23	㉧ 3267 8-24	㉨ 3268 8-25	㉩ 3269 8-26	㉪ 326A 8-27	㉫ 326B 8-28	㉬ 326C 8-29	㉭ 326D 8-30	㉮ 326E 8-31
4_/C_	㉯ 326F 8-32	㉰ 3270 8-33	㉱ 3271 8-34	㉲ 3272 8-35	㉳ 3273 8-36	㉴ 3274 8-37	㉵ 3275 8-38	㉶ 3276 8-39	㉷ 3277 8-40	㉸ 3278 8-41	㉹ 3279 8-42	㉺ 327A 8-43	㉻ 327B 8-44	ⓐ 24D0 8-45	ⓑ 24D1 8-46	ⓒ 24D2 8-47
5_/D_	ⓓ 24D3 8-48	ⓔ 24D4 8-49	ⓕ 24D5 8-50	ⓖ 24D6 8-51	ⓗ 24D7 8-52	ⓘ 24D8 8-53	ⓙ 24D9 8-54	ⓚ 24DA 8-55	ⓛ 24DB 8-56	ⓜ 24DC 8-57	ⓝ 24DD 8-58	ⓞ 24DE 8-59	ⓟ 24DF 8-60	ⓠ 24E0 8-61	ⓡ 24E1 8-62	ⓢ 24E2 8-63
6_/E_	ⓣ 24E3 8-64	ⓤ 24E4 8-65	ⓥ 24E5 8-66	ⓦ 24E6 8-67	ⓧ 24E7 8-68	ⓨ 24E8 8-69	ⓩ 24E9 8-70	① 2460 8-71	② 2461 8-72	③ 2462 8-73	④ 2463 8-74	⑤ 2464 8-75	⑥ 2465 8-76	⑦ 2466 8-77	⑧ 2467 8-78	⑨ 2468 8-79
7_/F_	⑩ 2469 8-80	⑪ 246A 8-81	⑫ 246B 8-82	⑬ 246C 8-83	⑭ 246D 8-84	⑮ 246E 8-85	½ 00BD 8-86	⅓ 2153 8-87	⅔ 2154 8-88	¼ 00BC 8-89	¾ 00BE 8-90	⅛ 215B 8-91	⅜ 215C 8-92	⅝ 215D 8-93	⅞ 215E 8-94

Character set 0x29 / 0xA9 (row number 9, extended Latin, encircled, superscript and subscript)

KS X 1001 (prefixed with 0x29 / 0xA9)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		æ 00E6 9-1	đ 0111 9-2	ð 00F0 9-3	ħ 0127 9-4	ı 0131 9-5	ĳ 0133 9-6	ĸ 0138 9-7	ŀ 0140 9-8	ł 0142 9-9	ø 00F8 9-10	œ 0153 9-11	ß 00DF 9-12	þ 00FE 9-13	ŧ 0167 9-14	ŋ 014B 9-15
3_/B_	ŉ 0149 9-16	㈀ 3200 9-17	㈁ 3201 9-18	㈂ 3202 9-19	㈃ 3203 9-20	㈄ 3204 9-21	㈅ 3205 9-22	㈆ 3206 9-23	㈇ 3207 9-24	㈈ 3208 9-25	㈉ 3209 9-26	㈊ 320A 9-27	㈋ 320B 9-28	㈌ 320C 9-29	㈍ 320D 9-30	㈎ 320E 9-31
4_/C_	㈏ 320F 9-32	㈐ 3210 9-33	㈑ 3211 9-34	㈒ 3212 9-35	㈓ 3213 9-36	㈔ 3214 9-37	㈕ 3215 9-38	㈖ 3216 9-39	㈗ 3217 9-40	㈘ 3218 9-41	㈙ 3219 9-42	㈚ 321A 9-43	㈛ 321B 9-44	⒜ 249C 9-45	⒝ 249D 9-46	⒞ 249E 9-47
5_/D_	⒟ 249F 9-48	⒠ 24A0 9-49	⒡ 24A1 9-50	⒢ 24A2 9-51	⒣ 24A3 9-52	⒤ 24A4 9-53	⒥ 24A5 9-54	⒦ 24A6 9-55	⒧ 24A7 9-56	⒨ 24A8 9-57	⒩ 24A9 9-58	⒪ 24AA 9-59	⒫ 24AB 9-60	⒬ 24AC 9-61	⒭ 24AD 9-62	⒮ 24AE 9-63
6_/E_	⒯ 24AF 9-64	⒰ 24B0 9-65	⒱ 24B1 9-66	⒲ 24B2 9-67	⒳ 24B3 9-68	⒴ 24B4 9-69	⒵ 24B5 9-70	⑴ 2474 9-71	⑵ 2475 9-72	⑶ 2476 9-73	⑷ 2477 9-74	⑸ 2478 9-75	⑹ 2479 9-76	⑺ 247A 9-77	⑻ 247B 9-78	⑼ 247C 9-79
7_/F_	⑽ 247D 9-80	⑾ 247E 9-81	⑿ 247F 9-82	⒀ 2480 9-83	⒁ 2481 9-84	⒂ 2482 9-85	¹ 00B9 9-86	² 00B2 9-87	³ 00B3 9-88	⁴ 2074 9-89	ⁿ 207F 9-90	₁ 2081 9-91	₂ 2082 9-92	₃ 2083 9-93	₄ 2084 9-94

Character set 0x2A / 0xAA (row number 10, Hiragana)

This set contains Hiragana for writing the Japanese language.

Compare row 10 of KPS 9566, which uses the same layout. Compare and contrast row 4 of JIS X 0208, which also uses the same layout, but in a different row.

KS X 1001 (prefixed with 0x2A / 0xAA)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		ぁ 3041 10-1	あ 3042 10-2	ぃ 3043 10-3	い 3044 10-4	ぅ 3045 10-5	う 3046 10-6	ぇ 3047 10-7	え 3048 10-8	ぉ 3049 10-9	お 304A 10-10	か 304B 10-11	が 304C 10-12	き 304D 10-13	ぎ 304E 10-14	く 304F 10-15
3_/B_	ぐ 3050 10-16	け 3051 10-17	げ 3052 10-18	こ 3053 10-19	ご 3054 10-20	さ 3055 10-21	ざ 3056 10-22	し 3057 10-23	じ 3058 10-24	す 3059 10-25	ず 305A 10-26	せ 305B 10-27	ぜ 305C 10-28	そ 305D 10-29	ぞ 305E 10-30	た 305F 10-31
4_/C_	だ 3060 10-32	ち 3061 10-33	ぢ 3062 10-34	っ 3063 10-35	つ 3064 10-36	づ 3065 10-37	て 3066 10-38	で 3067 10-39	と 3068 10-40	ど 3069 10-41	な 306A 10-42	に 306B 10-43	ぬ 306C 10-44	ね 306D 10-45	の 306E 10-46	は 306F 10-47
5_/D_	ば 3070 10-48	ぱ 3071 10-49	ひ 3072 10-50	び 3073 10-51	ぴ 3074 10-52	ふ 3075 10-53	ぶ 3076 10-54	ぷ 3077 10-55	へ 3078 10-56	べ 3079 10-57	ぺ 307A 10-58	ほ 307B 10-59	ぼ 307C 10-60	ぽ 307D 10-61	ま 307E 10-62	み 307F 10-63
6_/E_	む 3080 10-64	め 3081 10-65	も 3082 10-66	ゃ 3083 10-67	や 3084 10-68	ゅ 3085 10-69	ゆ 3086 10-70	ょ 3087 10-71	よ 3088 10-72	ら 3089 10-73	り 308A 10-74	る 308B 10-75	れ 308C 10-76	ろ 308D 10-77	ゎ 308E 10-78	わ 308F 10-79
7_/F_	ゐ 3090 10-80	ゑ 3091 10-81	を 3092 10-82	ん 3093 10-83	10-84	10-85	10-86	10-87	10-88	10-89	10-90	10-91	10-92	10-93	10-94

Character set 0x2B / 0xAB (row number 11, Katakana)

This set contains Katakana for writing the Japanese language. However, the Japanese long vowel mark, which is used in katakana text and included in row 1 of JIS X 0208, is not included.[18]

Compare row 11 of KPS 9566, which uses the same layout. Compare and contrast row 5 of JIS X 0208, which also uses the same layout, but in a different row.

KS X 1001 (prefixed with 0x2B / 0xAB)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		ァ 30A1 11-1	ア 30A2 11-2	ィ 30A3 11-3	イ 30A4 11-4	ゥ 30A5 11-5	ウ 30A6 11-6	ェ 30A7 11-7	エ 30A8 11-8	ォ 30A9 11-9	オ 30AA 11-10	カ 30AB 11-11	ガ 30AC 11-12	キ 30AD 11-13	ギ 30AE 11-14	ク 30AF 11-15
3_/B_	グ 30B0 11-16	ケ 30B1 11-17	ゲ 30B2 11-18	コ 30B3 11-19	ゴ 30B4 11-20	サ 30B5 11-21	ザ 30B6 11-22	シ 30B7 11-23	ジ 30B8 11-24	ス 30B9 11-25	ズ 30BA 11-26	セ 30BB 11-27	ゼ 30BC 11-28	ソ 30BD 11-29	ゾ 30BE 11-30	タ 30BF 11-31
4_/C_	ダ 30C0 11-32	チ 30C1 11-33	ヂ 30C2 11-34	ッ 30C3 11-35	ツ 30C4 11-36	ヅ 30C5 11-37	テ 30C6 11-38	デ 30C7 11-39	ト 30C8 11-40	ド 30C9 11-41	ナ 30CA 11-42	ニ 30CB 11-43	ヌ 30CC 11-44	ネ 30CD 11-45	ノ 30CE 11-46	ハ 30CF 11-47
5_/D_	バ 30D0 11-48	パ 30D1 11-49	ヒ 30D2 11-50	ビ 30D3 11-51	ピ 30D4 11-52	フ 30D5 11-53	ブ 30D6 11-54	プ 30D7 11-55	ヘ 30D8 11-56	ベ 30D9 11-57	ペ 30DA 11-58	ホ 30DB 11-59	ボ 30DC 11-60	ポ 30DD 11-61	マ 30DE 11-62	ミ 30DF 11-63
6_/E_	ム 30E0 11-64	メ 30E1 11-65	モ 30E2 11-66	ャ 30E3 11-67	ヤ 30E4 11-68	ュ 30E5 11-69	ユ 30E6 11-70	ョ 30E7 11-71	ヨ 30E8 11-72	ラ 30E9 11-73	リ 30EA 11-74	ル 30EB 11-75	レ 30EC 11-76	ロ 30ED 11-77	ヮ 30EE 11-78	ワ 30EF 11-79
7_/F_	ヰ 30F0 11-80	ヱ 30F1 11-81	ヲ 30F2 11-82	ン 30F3 11-83	ヴ 30F4 11-84	ヵ 30F5 11-85	ヶ 30F6 11-86	11-87	11-88	11-89	11-90	11-91	11-92	11-93	11-94

Character set 0x2C / 0xAC (row number 12, Cyrillic)

This set contains the modern Russian alphabet, and is not necessarily sufficient to represent other forms of the Cyrillic script.

Compare row 5 of KPS 9566 and row 7 of JIS X 0208, which use the same layout (but in a different row).

KS X 1001 (prefixed with 0x2C / 0xAC)
	_0	_1	_2	_3	_4	_5	_6	_7	_8	_9	_A	_B	_C	_D	_E	_F
2_/A_		А 0410 12-1	Б 0411 12-2	В 0412 12-3	Г 0413 12-4	Д 0414 12-5	Е 0415 12-6	Ё 0401 12-7	Ж 0416 12-8	З 0417 12-9	И 0418 12-10	Й 0419 12-11	К 041A 12-12	Л 041B 12-13	М 041C 12-14	Н 041D 12-15
3_/B_	О 041E 12-16	П 041F 12-17	Р 0420 12-18	С 0421 12-19	Т 0422 12-20	У 0423 12-21	Ф 0424 12-22	Х 0425 12-23	Ц 0426 12-24	Ч 0427 12-25	Ш 0428 12-26	Щ 0429 12-27	Ъ 042A 12-28	Ы 042B 12-29	Ь 042C 12-30	Э 042D 12-31
4_/C_	Ю 042E 12-32	Я 042F 12-33	12-34	12-35	12-36	12-37	12-38	12-39	12-40	12-41	12-42	12-43	12-44	12-45	12-46	12-47
5_/D_	12-48	а 0430 12-49	б 0431 12-50	в 0432 12-51	г 0433 12-52	д 0434 12-53	е 0435 12-54	ё 0451 12-55	ж 0436 12-56	з 0437 12-57	и 0438 12-58	й 0439 12-59	к 043A 12-60	л 043B 12-61	м 043C 12-62	н 043D 12-63
6_/E_	о 043E 12-64	п 043F 12-65	р 0440 12-66	с 0441 12-67	т 0442 12-68	у 0443 12-69	ф 0444 12-70	х 0445 12-71	ц 0446 12-72	ч 0447 12-73	ш 0448 12-74	щ 0449 12-75	ъ 044A 12-76	ы 044B 12-77	ь 044C 12-78	э 044D 12-79
7_/F_	ю 044E 12-80	я 044F 12-81	12-82	12-83	12-84	12-85	12-86	12-87	12-88	12-89	12-90	12-91	12-92	12-93	12-94

Precomposed Hangul sets (rows number 16 through 40)

Code points for precomposed Hangul are included in a continuous sorted block between code points 16-01 and 40-94 inclusive. Not all possible syllable clusters are included in this range. Compare the different ordering and availability in KPS 9566.

Note that the initial+vowel+final syllables 뢨, 썅, 쏀, 쓩, and 쭁 are included but their initial+vowel counterparts 뢔, 쌰, 쎼, 쓔, and 쬬 are not. This used to cause input problems, because an input method has to pass through the initial+vowel syllable before it can produce an initial+vowel+final syllable (e.g. ㅎ → 하 → 한).

Those which are not listed here may be represented using eight-byte composition sequences. All other modern-jamo clusters are assigned codes elsewhere by UHC. All possible modern-jamo clusters are assigned codes by Johab.
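Whether a given syllable has its own code point in rows 16 through 40 can be checked with a short lookup. The sketch below assumes Python's built-in `euc_kr` codec and a hypothetical helper name; it treats anything other than a plain two-byte GR code as "not present", so longer composition sequences do not count.

```python
from typing import Optional, Tuple


def ksx1001_code_point(ch: str) -> Optional[Tuple[int, int]]:
    """Return the (row, cell) of a character, or None if it has no
    two-byte KS X 1001 code in the euc_kr codec."""
    try:
        data = ch.encode("euc_kr")
    except UnicodeEncodeError:
        return None
    if len(data) != 2 or data[0] < 0xA1:
        return None
    return data[0] - 0xA0, data[1] - 0xA0


if __name__ == "__main__":
    # Pairs named in the text: with a final, then without.
    for with_final, without_final in ["뢨뢔", "썅쌰", "쏀쎼", "쓩쓔", "쭁쬬"]:
        print(with_final, ksx1001_code_point(with_final),
              without_final, ksx1001_code_point(without_final))
```
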

Row 16: 가 각 간 갇 갈 갉 갊 감 갑 값 갓 갔 강 갖 갗 같 갚 갛 개 객 갠 갤 갬 갭 갯 갰 갱 갸 갹 갼 걀 걋 걍 걔 걘 걜 거 걱 건 걷 걸 걺 검 겁 것 겄 겅 겆 겉 겊 겋 게 겐 겔 겜 겝 겟 겠 겡 겨 격 겪 견 겯 결 겸 겹 겻 겼 경 곁 계 곈 곌 곕 곗 고 곡 곤 곧 골 곪 곬 곯 곰 곱 곳 공 곶 과 곽 관 괄 괆
Row 17: 괌 괍 괏 광 괘 괜 괠 괩 괬 괭 괴 괵 괸 괼 굄 굅 굇 굉 교 굔 굘 굡 굣 구 국 군 굳 굴 굵 굶 굻 굼 굽 굿 궁 궂 궈 궉 권 궐 궜 궝 궤 궷 귀 귁 귄 귈 귐 귑 귓 규 균 귤 그 극 근 귿 글 긁 금 급 긋 긍 긔 기 긱 긴 긷 길 긺 김 깁 깃 깅 깆 깊 까 깍 깎 깐 깔 깖 깜 깝 깟 깠 깡 깥 깨 깩 깬 깰 깸
Row 18: 깹 깻 깼 깽 꺄 꺅 꺌 꺼 꺽 꺾 껀 껄 껌 껍 껏 껐 껑 께 껙 껜 껨 껫 껭 껴 껸 껼 꼇 꼈 꼍 꼐 꼬 꼭 꼰 꼲 꼴 꼼 꼽 꼿 꽁 꽂 꽃 꽈 꽉 꽐 꽜 꽝 꽤 꽥 꽹 꾀 꾄 꾈 꾐 꾑 꾕 꾜 꾸 꾹 꾼 꿀 꿇 꿈 꿉 꿋 꿍 꿎 꿔 꿜 꿨 꿩 꿰 꿱 꿴 꿸 뀀 뀁 뀄 뀌 뀐 뀔 뀜 뀝 뀨 끄 끅 끈 끊 끌 끎 끓 끔 끕 끗 끙
Row 19: 끝 끼 끽 낀 낄 낌 낍 낏 낑 나 낙 낚 난 낟 날 낡 낢 남 납 낫 났 낭 낮 낯 낱 낳 내 낵 낸 낼 냄 냅 냇 냈 냉 냐 냑 냔 냘 냠 냥 너 넉 넋 넌 널 넒 넓 넘 넙 넛 넜 넝 넣 네 넥 넨 넬 넴 넵 넷 넸 넹 녀 녁 년 녈 념 녑 녔 녕 녘 녜 녠 노 녹 논 놀 놂 놈 놉 놋 농 높 놓 놔 놘 놜 놨 뇌 뇐 뇔 뇜 뇝
Row 20: 뇟 뇨 뇩 뇬 뇰 뇹 뇻 뇽 누 눅 눈 눋 눌 눔 눕 눗 눙 눠 눴 눼 뉘 뉜 뉠 뉨 뉩 뉴 뉵 뉼 늄 늅 늉 느 늑 는 늘 늙 늚 늠 늡 늣 능 늦 늪 늬 늰 늴 니 닉 닌 닐 닒 님 닙 닛 닝 닢 다 닥 닦 단 닫 달 닭 닮 닯 닳 담 답 닷 닸 당 닺 닻 닿 대 댁 댄 댈 댐 댑 댓 댔 댕 댜 더 덕 덖 던 덛 덜 덞 덟 덤 덥
Row 21: 덧 덩 덫 덮 데 덱 덴 델 뎀 뎁 뎃 뎄 뎅 뎌 뎐 뎔 뎠 뎡 뎨 뎬 도 독 돈 돋 돌 돎 돐 돔 돕 돗 동 돛 돝 돠 돤 돨 돼 됐 되 된 될 됨 됩 됫 됴 두 둑 둔 둘 둠 둡 둣 둥 둬 뒀 뒈 뒝 뒤 뒨 뒬 뒵 뒷 뒹 듀 듄 듈 듐 듕 드 득 든 듣 들 듦 듬 듭 듯 등 듸 디 딕 딘 딛 딜 딤 딥 딧 딨 딩 딪 따 딱 딴 딸
Row 22: 땀 땁 땃 땄 땅 땋 때 땍 땐 땔 땜 땝 땟 땠 땡 떠 떡 떤 떨 떪 떫 떰 떱 떳 떴 떵 떻 떼 떽 뗀 뗄 뗌 뗍 뗏 뗐 뗑 뗘 뗬 또 똑 똔 똘 똥 똬 똴 뙈 뙤 뙨 뚜 뚝 뚠 뚤 뚫 뚬 뚱 뛔 뛰 뛴 뛸 뜀 뜁 뜅 뜨 뜩 뜬 뜯 뜰 뜸 뜹 뜻 띄 띈 띌 띔 띕 띠 띤 띨 띰 띱 띳 띵 라 락 란 랄 람 랍 랏 랐 랑 랒 랖 랗
Row 23: 래 랙 랜 랠 램 랩 랫 랬 랭 랴 략 랸 럇 량 러 럭 런 럴 럼 럽 럿 렀 렁 렇 레 렉 렌 렐 렘 렙 렛 렝 려 력 련 렬 렴 렵 렷 렸 령 례 롄 롑 롓 로 록 론 롤 롬 롭 롯 롱 롸 롼 뢍 뢨 뢰 뢴 뢸 룀 룁 룃 룅 료 룐 룔 룝 룟 룡 루 룩 룬 룰 룸 룹 룻 룽 뤄 뤘 뤠 뤼 뤽 륀 륄 륌 륏 륑 류 륙 륜 률 륨 륩
Row 24: 륫 륭 르 륵 른 를 름 릅 릇 릉 릊 릍 릎 리 릭 린 릴 림 립 릿 링 마 막 만 많 맏 말 맑 맒 맘 맙 맛 망 맞 맡 맣 매 맥 맨 맬 맴 맵 맷 맸 맹 맺 먀 먁 먈 먕 머 먹 먼 멀 멂 멈 멉 멋 멍 멎 멓 메 멕 멘 멜 멤 멥 멧 멨 멩 며 멱 면 멸 몃 몄 명 몇 몌 모 목 몫 몬 몰 몲 몸 몹 못 몽 뫄 뫈 뫘 뫙 뫼
Row 25: 묀 묄 묍 묏 묑 묘 묜 묠 묩 묫 무 묵 묶 문 묻 물 묽 묾 뭄 뭅 뭇 뭉 뭍 뭏 뭐 뭔 뭘 뭡 뭣 뭬 뮈 뮌 뮐 뮤 뮨 뮬 뮴 뮷 므 믄 믈 믐 믓 미 믹 민 믿 밀 밂 밈 밉 밋 밌 밍 및 밑 바 박 밖 밗 반 받 발 밝 밞 밟 밤 밥 밧 방 밭 배 백 밴 밸 뱀 뱁 뱃 뱄 뱅 뱉 뱌 뱍 뱐 뱝 버 벅 번 벋 벌 벎 범 법 벗
Row 26: 벙 벚 베 벡 벤 벧 벨 벰 벱 벳 벴 벵 벼 벽 변 별 볍 볏 볐 병 볕 볘 볜 보 복 볶 본 볼 봄 봅 봇 봉 봐 봔 봤 봬 뵀 뵈 뵉 뵌 뵐 뵘 뵙 뵤 뵨 부 북 분 붇 불 붉 붊 붐 붑 붓 붕 붙 붚 붜 붤 붰 붸 뷔 뷕 뷘 뷜 뷩 뷰 뷴 뷸 븀 븃 븅 브 븍 븐 블 븜 븝 븟 비 빅 빈 빌 빎 빔 빕 빗 빙 빚 빛 빠 빡 빤
Row 27: 빨 빪 빰 빱 빳 빴 빵 빻 빼 빽 뺀 뺄 뺌 뺍 뺏 뺐 뺑 뺘 뺙 뺨 뻐 뻑 뻔 뻗 뻘 뻠 뻣 뻤 뻥 뻬 뼁 뼈 뼉 뼘 뼙 뼛 뼜 뼝 뽀 뽁 뽄 뽈 뽐 뽑 뽕 뾔 뾰 뿅 뿌 뿍 뿐 뿔 뿜 뿟 뿡 쀼 쁑 쁘 쁜 쁠 쁨 쁩 삐 삑 삔 삘 삠 삡 삣 삥 사 삭 삯 산 삳 살 삵 삶 삼 삽 삿 샀 상 샅 새 색 샌 샐 샘 샙 샛 샜 생 샤
Row 28: 샥 샨 샬 샴 샵 샷 샹 섀 섄 섈 섐 섕 서 석 섞 섟 선 섣 설 섦 섧 섬 섭 섯 섰 성 섶 세 섹 센 셀 셈 셉 셋 셌 셍 셔 셕 션 셜 셤 셥 셧 셨 셩 셰 셴 셸 솅 소 속 솎 손 솔 솖 솜 솝 솟 송 솥 솨 솩 솬 솰 솽 쇄 쇈 쇌 쇔 쇗 쇘 쇠 쇤 쇨 쇰 쇱 쇳 쇼 쇽 숀 숄 숌 숍 숏 숑 수 숙 순 숟 술 숨 숩 숫 숭
Row 29: 숯 숱 숲 숴 쉈 쉐 쉑 쉔 쉘 쉠 쉥 쉬 쉭 쉰 쉴 쉼 쉽 쉿 슁 슈 슉 슐 슘 슛 슝 스 슥 슨 슬 슭 슴 습 슷 승 시 식 신 싣 실 싫 심 십 싯 싱 싶 싸 싹 싻 싼 쌀 쌈 쌉 쌌 쌍 쌓 쌔 쌕 쌘 쌜 쌤 쌥 쌨 쌩 썅 써 썩 썬 썰 썲 썸 썹 썼 썽 쎄 쎈 쎌 쏀 쏘 쏙 쏜 쏟 쏠 쏢 쏨 쏩 쏭 쏴 쏵 쏸 쐈 쐐 쐤 쐬 쐰
Row 30: 쐴 쐼 쐽 쑈 쑤 쑥 쑨 쑬 쑴 쑵 쑹 쒀 쒔 쒜 쒸 쒼 쓩 쓰 쓱 쓴 쓸 쓺 쓿 씀 씁 씌 씐 씔 씜 씨 씩 씬 씰 씸 씹 씻 씽 아 악 안 앉 않 알 앍 앎 앓 암 압 앗 았 앙 앝 앞 애 액 앤 앨 앰 앱 앳 앴 앵 야 약 얀 얄 얇 얌 얍 얏 양 얕 얗 얘 얜 얠 얩 어 억 언 얹 얻 얼 얽 얾 엄 업 없 엇 었 엉 엊 엌 엎
Row 31: 에 엑 엔 엘 엠 엡 엣 엥 여 역 엮 연 열 엶 엷 염 엽 엾 엿 였 영 옅 옆 옇 예 옌 옐 옘 옙 옛 옜 오 옥 온 올 옭 옮 옰 옳 옴 옵 옷 옹 옻 와 왁 완 왈 왐 왑 왓 왔 왕 왜 왝 왠 왬 왯 왱 외 왹 왼 욀 욈 욉 욋 욍 요 욕 욘 욜 욤 욥 욧 용 우 욱 운 울 욹 욺 움 웁 웃 웅 워 웍 원 월 웜 웝 웠 웡 웨
Row 32: 웩 웬 웰 웸 웹 웽 위 윅 윈 윌 윔 윕 윗 윙 유 육 윤 율 윰 윱 윳 융 윷 으 윽 은 을 읊 음 읍 읏 응 읒 읓 읔 읕 읖 읗 의 읜 읠 읨 읫 이 익 인 일 읽 읾 잃 임 입 잇 있 잉 잊 잎 자 작 잔 잖 잗 잘 잚 잠 잡 잣 잤 장 잦 재 잭 잰 잴 잼 잽 잿 쟀 쟁 쟈 쟉 쟌 쟎 쟐 쟘 쟝 쟤 쟨 쟬 저 적 전 절 젊
Row 33: 점 접 젓 정 젖 제 젝 젠 젤 젬 젭 젯 젱 져 젼 졀 졈 졉 졌 졍 졔 조 족 존 졸 졺 좀 좁 좃 종 좆 좇 좋 좌 좍 좔 좝 좟 좡 좨 좼 좽 죄 죈 죌 죔 죕 죗 죙 죠 죡 죤 죵 주 죽 준 줄 줅 줆 줌 줍 줏 중 줘 줬 줴 쥐 쥑 쥔 쥘 쥠 쥡 쥣 쥬 쥰 쥴 쥼 즈 즉 즌 즐 즘 즙 즛 증 지 직 진 짇 질 짊 짐 집 짓
Row 34: 징 짖 짙 짚 짜 짝 짠 짢 짤 짧 짬 짭 짯 짰 짱 째 짹 짼 쨀 쨈 쨉 쨋 쨌 쨍 쨔 쨘 쨩 쩌 쩍 쩐 쩔 쩜 쩝 쩟 쩠 쩡 쩨 쩽 쪄 쪘 쪼 쪽 쫀 쫄 쫌 쫍 쫏 쫑 쫓 쫘 쫙 쫠 쫬 쫴 쬈 쬐 쬔 쬘 쬠 쬡 쭁 쭈 쭉 쭌 쭐 쭘 쭙 쭝 쭤 쭸 쭹 쮜 쮸 쯔 쯤 쯧 쯩 찌 찍 찐 찔 찜 찝 찡 찢 찧 차 착 찬 찮 찰 참 찹 찻
Row 35: 찼 창 찾 채 책 챈 챌 챔 챕 챗 챘 챙 챠 챤 챦 챨 챰 챵 처 척 천 철 첨 첩 첫 첬 청 체 첵 첸 첼 쳄 쳅 쳇 쳉 쳐 쳔 쳤 쳬 쳰 촁 초 촉 촌 촐 촘 촙 촛 총 촤 촨 촬 촹 최 쵠 쵤 쵬 쵭 쵯 쵱 쵸 춈 추 축 춘 출 춤 춥 춧 충 춰 췄 췌 췐 취 췬 췰 췸 췹 췻 췽 츄 츈 츌 츔 츙 츠 측 츤 츨 츰 츱 츳 층
Row 36: 치 칙 친 칟 칠 칡 침 칩 칫 칭 카 칵 칸 칼 캄 캅 캇 캉 캐 캑 캔 캘 캠 캡 캣 캤 캥 캬 캭 컁 커 컥 컨 컫 컬 컴 컵 컷 컸 컹 케 켁 켄 켈 켐 켑 켓 켕 켜 켠 켤 켬 켭 켯 켰 켱 켸 코 콕 콘 콜 콤 콥 콧 콩 콰 콱 콴 콸 쾀 쾅 쾌 쾡 쾨 쾰 쿄 쿠 쿡 쿤 쿨 쿰 쿱 쿳 쿵 쿼 퀀 퀄 퀑 퀘 퀭 퀴 퀵 퀸 퀼
Row 37: 큄 큅 큇 큉 큐 큔 큘 큠 크 큭 큰 클 큼 큽 킁 키 킥 킨 킬 킴 킵 킷 킹 타 탁 탄 탈 탉 탐 탑 탓 탔 탕 태 택 탠 탤 탬 탭 탯 탰 탱 탸 턍 터 턱 턴 털 턺 텀 텁 텃 텄 텅 테 텍 텐 텔 템 텝 텟 텡 텨 텬 텼 톄 톈 토 톡 톤 톨 톰 톱 톳 통 톺 톼 퇀 퇘 퇴 퇸 툇 툉 툐 투 툭 툰 툴 툼 툽 툿 퉁 퉈 퉜
Row 38: 퉤 튀 튁 튄 튈 튐 튑 튕 튜 튠 튤 튬 튱 트 특 튼 튿 틀 틂 틈 틉 틋 틔 틘 틜 틤 틥 티 틱 틴 틸 팀 팁 팃 팅 파 팍 팎 판 팔 팖 팜 팝 팟 팠 팡 팥 패 팩 팬 팰 팸 팹 팻 팼 팽 퍄 퍅 퍼 퍽 펀 펄 펌 펍 펏 펐 펑 페 펙 펜 펠 펨 펩 펫 펭 펴 편 펼 폄 폅 폈 평 폐 폘 폡 폣 포 폭 폰 폴 폼 폽 폿 퐁
Row 39: 퐈 퐝 푀 푄 표 푠 푤 푭 푯 푸 푹 푼 푿 풀 풂 품 풉 풋 풍 풔 풩 퓌 퓐 퓔 퓜 퓟 퓨 퓬 퓰 퓸 퓻 퓽 프 픈 플 픔 픕 픗 피 픽 핀 필 핌 핍 핏 핑 하 학 한 할 핥 함 합 핫 항 해 핵 핸 핼 햄 햅 햇 했 행 햐 향 허 헉 헌 헐 헒 험 헙 헛 헝 헤 헥 헨 헬 헴 헵 헷 헹 혀 혁 현 혈 혐 협 혓 혔 형 혜 혠
Row 40: 혤 혭 호 혹 혼 홀 홅 홈 홉 홋 홍 홑 화 확 환 활 홧 황 홰 홱 홴 횃 횅 회 획 횐 횔 횝 횟 횡 효 횬 횰 횹 횻 후 훅 훈 훌 훑 훔 훗 훙 훠 훤 훨 훰 훵 훼 훽 휀 휄 휑 휘 휙 휜 휠 휨 휩 휫 휭 휴 휵 휸 휼 흄 흇 흉 흐 흑 흔 흖 흗 흘 흙 흠 흡 흣 흥 흩 희 흰 흴 흼 흽 힁 히 힉 힌 힐 힘 힙 힛 힝

Hanja sets

Johab encoding

Diagram of Johab encoding layout

Since 1992, KS X 1001 has also defined an alternative encoding known as Johab. It represents a hangul syllable as three five-bit values (initial, vowel and final), packed most significant bit first into the low 15 bits of two 8-bit bytes. The most significant bit of the lead byte is always set (allowing combination with single-byte ASCII or KS X 1003). The same scheme encodes the modern jamo from row 4 of KS X 1001, using the filler values for the absent components. The Johab encoding for hangul is shown in the table below.[19]

Johab encodes the remainder of KS X 1001 using lead bytes which do not correspond to an initial jamo (0xE0–0xF9 for hanja and 0xD9–0xDE[20] for non-hanja, excluding hangul syllables and modern jamo), with trail bytes in the ranges 0x31–0x7E and 0x91–0xFE.[19] These codes are algorithmically mapped from the characters' KS X 1001 code points,[20] with two KS X 1001 rows per lead byte (compare and contrast Shift JIS).
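As a rough illustration of the "two rows per lead byte" mapping, the following Python sketch converts a KS X 1001 row and cell number into Johab bytes. The function name and the packing order are assumptions: the arithmetic is inferred from the lead and trail byte ranges quoted above by packing rows sequentially, and is not taken from the standard. The sketch does not special-case the modern jamo in row 4, which Johab encodes through the hangul scheme instead.

```python
def johab_non_hangul(row: int, cell: int) -> bytes:
    """Map a KS X 1001 (row, cell) pair outside the hangul rows to Johab bytes.

    Hypothetical helper: offsets are inferred from the byte ranges given in
    the text (lead 0xD9-0xDE and 0xE0-0xF9, trail 0x31-0x7E and 0x91-0xFE).
    """
    if not 1 <= cell <= 94:
        raise ValueError("cell must be in 1..94")
    if 1 <= row <= 12:        # non-hanja rows -> lead bytes 0xD9..0xDE
        lead_base, first_row = 0xD9, 1
    elif 42 <= row <= 93:     # hanja rows -> lead bytes 0xE0..0xF9
        lead_base, first_row = 0xE0, 42
    else:
        raise ValueError("row is not covered by this sketch")

    offset = row - first_row
    lead = lead_base + offset // 2
    # Two 94-cell rows share one lead byte, giving 188 trail positions.
    index = (offset % 2) * 94 + (cell - 1)
    # The first 78 positions use 0x31..0x7E, the remaining 110 use 0x91..0xFE.
    trail = index + 0x31 if index < 0x4E else index + 0x43
    return bytes([lead, trail])


print(johab_non_hangul(1, 1).hex())   # d931
print(johab_non_hangul(42, 1).hex())  # e031
```

The two trail ranges together hold 78 + 110 = 188 values, exactly two rows of 94 cells, which is why consecutive row pairs fit under a single lead byte.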

Five-bit sequence	As initial	As vowel	As final
00000	Not used	Not used[lower-alpha 2]	Not used
00001	Filler	Not used[lower-alpha 3]	Filler (empty final)
00010	ㄱ	Filler	ㄱ
00011	ㄲ	ㅏ	ㄲ
00100	ㄴ	ㅐ	ㄳ
00101	ㄷ	ㅑ	ㄴ
00110	ㄸ	ㅒ	ㄵ
00111	ㄹ	ㅓ	ㄶ
01000	ㅁ	Not used[lower-alpha 2]	ㄷ
01001	ㅂ	Not used[lower-alpha 3]	ㄹ
01010	ㅃ	ㅔ	ㄺ
01011	ㅅ	ㅕ	ㄻ
01100	ㅆ	ㅖ	ㄼ
01101	ㅇ	ㅗ	ㄽ
01110	ㅈ	ㅘ	ㄾ
01111	ㅉ	ㅙ	ㄿ
10000	ㅊ	Not used[lower-alpha 2]	ㅀ
10001	ㅋ	Not used[lower-alpha 3]	ㅁ
10010	ㅌ	ㅚ	Not used
10011	ㅍ	ㅛ	ㅂ
10100	ㅎ	ㅜ	ㅄ
10101	Not used	ㅝ	ㅅ
10110	Non-Hangul lead bytes	ㅞ	ㅆ
10111	Non-Hangul lead bytes	ㅟ	ㅇ
11000	Non-Hangul lead bytes	Not used[lower-alpha 2]	ㅈ
11001	Non-Hangul lead bytes	Not used[lower-alpha 3]	ㅊ
11010	Non-Hangul lead bytes	ㅠ	ㅋ
11011	Non-Hangul lead bytes	ㅡ	ㅌ
11100	Non-Hangul lead bytes	ㅢ	ㅍ
11101	Non-Hangul lead bytes	ㅣ	ㅎ
11110	Non-Hangul lead bytes	Not used	Not used
11111	Not used	Not used	Not used
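The bit packing described by the table above can be sketched in Python as follows. The function and list names are illustrative rather than part of any standard. The five-bit codes are transcribed from the table and listed in Unicode jamo order, so that a precomposed syllable (U+AC00 to U+D7A3) can be decomposed with the usual Unicode index arithmetic.

```python
# Five-bit Johab codes from the table, listed in Unicode jamo order.
INITIAL_CODES = list(range(2, 21))                       # ㄱ (00010) .. ㅎ (10100)
VOWEL_CODES = [3, 4, 5, 6, 7,                            # ㅏ ㅐ ㅑ ㅒ ㅓ
               10, 11, 12, 13, 14, 15,                   # ㅔ ㅕ ㅖ ㅗ ㅘ ㅙ
               18, 19, 20, 21, 22, 23,                   # ㅚ ㅛ ㅜ ㅝ ㅞ ㅟ
               26, 27, 28, 29]                           # ㅠ ㅡ ㅢ ㅣ
FINAL_CODES = [1] + list(range(2, 18)) + list(range(19, 30))  # filler, ㄱ..ㅁ, ㅂ..ㅎ


def johab_encode_syllable(ch: str) -> bytes:
    """Encode one precomposed hangul syllable as two Johab bytes."""
    index = ord(ch) - 0xAC00
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError("not a precomposed hangul syllable")
    initial = INITIAL_CODES[index // (21 * 28)]
    vowel = VOWEL_CODES[(index // 28) % 21]
    final = FINAL_CODES[index % 28]
    # Top bit always set, then three five-bit fields, most significant first.
    value = 0x8000 | (initial << 10) | (vowel << 5) | final
    return value.to_bytes(2, "big")


print(johab_encode_syllable("가").hex())  # 8861
print(johab_encode_syllable("한").hex())  # d065
```

For example, 한 is ㅎ (10100), ㅏ (00011) and ㄴ (00101); prefixed with the set top bit this gives 1 10100 00011 00101, or 0xD065.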

Footnotes

[lower-alpha 1] Korean: 정보 교환용 부호계 (한글 및 한자), romanized: Jeongbo Gyohwan'yong Buhogye (Hangeul mit Hanja)
[lower-alpha 2] If this value were used, the trail byte would fall in the C0 control code range.
[lower-alpha 3] If this value were used, the trail bytes would fall in the 0x2_ and 0x3_ rows of ASCII. Like most common legacy CJK encodings (compare Shift JIS, GBK, Big5), Johab does not use the 0x2_ row for trail bytes.

References

Lunde, Ken (2009). "Chapter 3: Character Set Standards". CJKV Information Processing. pp. 143–148. ISBN 978-0596514471.
Hwang, Jinsang (2005). The Social Shaping of ICTs Standards: A Case of National Coded Character Set Standards Controversy in Korea (PDF). University of Edinburgh.
Lunde, Ken (1995-12-18). "2.4.6: Obsolete Standards". CJK.INF Version 1.9.
Shin, Jungshik. "What are KS X 1001(KS C 5601) and other Hangul codes?". Hangul & Internet in Korea FAQ.
Lunde, Ken (1995-12-18). "3.3.6: N-byte Hangul". CJK.INF Version 1.9.
"INFO: Hangul (Korean) Character Sets", Microsoft Support, Microsoft
Zsigri, Gyula (2002-06-18). "KSC and UHC".
Chang, Hye-Shik. "cpython/Modules/cjkcodecs/_codecs_kr.c (revision d3faf43)". CPython source tree. Python Software Foundation.
Chung, Jaemin (2017-03-30). Proposal to add an informative note to U+3164 HANGUL FILLER (PDF). Unicode Consortium. UTC L2/17-081.
"Code Page 01040" (PDF). IBM. Archived from the original (PDF) on 2015-07-08.
"KSRI-87-37-IR: 항을・한자 코드 표준화에 관한 예연구: A Study on Standardization of Hangul and Hanja Codes" (PDF) (in Korean). Ministry of Science and Technology. 1987. p. 68. Archived from the original (PDF) on 2019-03-01.
"ibm-1363_P110-1997 (lead byte A1)". ICU Demonstration - Converter Explorer. International Components for Unicode.
"euc-kr (lead byte A1)". ICU Demonstration - Converter Explorer. International Components for Unicode.
"Map (external version) from Mac OS Korean encoding to Unicode 3.2 and later". Apple.
"windows-949-2000 (lead byte A1)". ICU Demonstration - Converter Explorer. International Components for Unicode.
"Lead Byte A1-A2 (Code page 949)". MSDN. Microsoft.
Korea Bureau of Standards (1988-10-01). Korean Graphic Character Set for Information Interchange (PDF). ITSCJ/IPSJ. ISO-IR-149.
Lunde, Ken (2009). "Seemingly Missing Characters". CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing (2nd ed.). Sebastopol, CA: O'Reilly. p. 180. ISBN 978-0-596-51447-1.
Lunde, Ken (2008). "Chapter 4: Encoding Methods (§ Johab Encoding—KS X 1001:2004)". CJKV Information Processing (2nd ed.). Sebastopol, California: O'Reilly Media. pp. 268–273. ISBN 978-0-596-51447-1.
Shin, Jungshik (2011-10-14) [1999-08-16]. Johab to Unicode table. Unicode Consortium.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
