KOI8-T
KOI8-T is an 8-bit, single-byte extended ASCII character encoding that adapts KOI8 to cover the Tajik Cyrillic alphabet.[1] Michael Davis introduced it as an interim solution for representing Tajik Cyrillic text interchangeably on the web, bridging the gap between the existing font-specific encodings, which were not interoperable, and the eventual wide adoption of Unicode.[2] The GNU C Library uses it as its default encoding for Tajik.[3]
Language(s) | Tajik Cyrillic, Russian, Bulgarian |
---|---|
Created by | Michael Davis |
Classification | 8-bit KOI, extended ASCII |
Extends | KOI8-B |
Cyrillic letters that are also used in Russian keep their KOI8-R positions, which makes the encoding a superset of KOI8-B. Punctuation mostly follows the layout of Windows-1251 and Windows-1252, as applicable.[2]
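The layout described above can be checked against an existing implementation. The following is a minimal sketch, assuming Python 3.5 or later, whose standard library includes a `koi8_t` codec alongside `koi8_r` and `cp1252`. The helper name `decode_byte` is hypothetical and introduced only for illustration.

```python
# Compare KOI8-T with KOI8-R and Windows-1252, one byte at a time.
# Assumption: Python 3.5+, which ships the "koi8_t" codec.

def decode_byte(value, encoding):
    """Return the character for one byte, or None if the byte is unassigned."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None

# Letters shared with Russian: the range 0xC0-0xFF should match KOI8-R.
letters_match = all(
    decode_byte(b, "koi8_t") == decode_byte(b, "koi8_r")
    for b in range(0xC0, 0x100)
)

# Bytes in 0x80-0xBF that KOI8-T assigns to the same character as Windows-1252.
shared_with_cp1252 = [
    b for b in range(0x80, 0xC0)
    if decode_byte(b, "koi8_t") is not None
    and decode_byte(b, "koi8_t") == decode_byte(b, "cp1252")
]

print(letters_match)
print([hex(b) for b in shared_with_cp1252])
```

The second list contains only punctuation and symbols, since the Tajik-specific letters occupy positions that Windows-1252 uses for other characters or leaves unassigned.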
Character set
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ 0 | | | | | | | | | | | | | | | | |
| 1_ 16 | | | | | | | | | | | | | | | | |
| 2_ 32 | SP 0020 | ! 0021 | " 0022 | # 0023 | $ 0024 | % 0025 | & 0026 | ' 0027 | ( 0028 | ) 0029 | * 002A | + 002B | , 002C | - 002D | . 002E | / 002F |
| 3_ 48 | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | : 003A | ; 003B | < 003C | = 003D | > 003E | ? 003F |
| 4_ 64 | @ 0040 | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F |
| 5_ 80 | P 0050 | Q 0051 | R 0052 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | [ 005B | \ 005C | ] 005D | ^ 005E | _ 005F |
| 6_ 96 | ` 0060 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F |
| 7_ 112 | p 0070 | q 0071 | r 0072 | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | { 007B | \| 007C | } 007D | ~ 007E | |
| 8_ 128 | қ 049B | ғ 0493 | ‚ 201A | Ғ 0492 | „ 201E | … 2026 | † 2020 | ‡ 2021 | | ‰ 2030 | ҳ 04B3 | ‹ 2039 | Ҳ 04B2 | ҷ 04B7 | Ҷ 04B6 | |
| 9_ 144 | Қ 049A | ‘ 2018 | ’ 2019 | “ 201C | ” 201D | • 2022 | – 2013 | — 2014 | | ™ 2122 | | › 203A | | | | |
| A_ 160 | | ӯ 04EF | Ӯ 04EE | ё 0451 | ¤ 00A4 | ӣ 04E3 | ¦ 00A6 | § 00A7 | | | | « 00AB | ¬ 00AC | SHY 00AD | ® 00AE | |
| B_ 176 | ° 00B0 | ± 00B1 | ² 00B2 | Ё 0401 | | Ӣ 04E2 | ¶ 00B6 | · 00B7 | | № 2116 | | » 00BB | | | | © 00A9 |
| C_ 192 | ю 044E | а 0430 | б 0431 | ц 0446 | д 0434 | е 0435 | ф 0444 | г 0433 | х 0445 | и 0438 | й 0439 | к 043A | л 043B | м 043C | н 043D | о 043E |
| D_ 208 | п 043F | я 044F | р 0440 | с 0441 | т 0442 | у 0443 | ж 0436 | в 0432 | ь 044C | ы 044B | з 0437 | ш 0448 | э 044D | щ 0449 | ч 0447 | ъ 044A |
| E_ 224 | Ю 042E | А 0410 | Б 0411 | Ц 0426 | Д 0414 | Е 0415 | Ф 0424 | Г 0413 | Х 0425 | И 0418 | Й 0419 | К 041A | Л 041B | М 041C | Н 041D | О 041E |
| F_ 240 | П 041F | Я 042F | Р 0420 | С 0421 | Т 0422 | У 0423 | Ж 0416 | В 0412 | Ь 042C | Ы 042B | З 0417 | Ш 0428 | Э 042D | Щ 0429 | Ч 0427 | Ъ 042A |
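Each entry gives the character followed by its Unicode code point in hexadecimal. Rows 0_ and 1_ and position 7F hold the ASCII control characters, which are not shown; blank positions in rows 8_ to B_ are undefined.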
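As an illustration of how the chart maps text to bytes, the sketch below encodes a Tajik word and checks the Tajik-specific letters against their byte values. It assumes Python 3.5 or later and its `koi8_t` codec; the names `TAJIK_LETTERS`, `table_agrees`, `word`, and `data` are hypothetical and chosen only for this example.

```python
# Encode Tajik text with the KOI8-T codec.
# Assumption: Python 3.5+, which ships the "koi8_t" codec.

# Tajik-specific letters keyed by their KOI8-T byte value.
TAJIK_LETTERS = {
    0x80: "\u049b",  # қ
    0x81: "\u0493",  # ғ
    0x83: "\u0492",  # Ғ
    0x8A: "\u04b3",  # ҳ
    0x8C: "\u04b2",  # Ҳ
    0x8D: "\u04b7",  # ҷ
    0x8E: "\u04b6",  # Ҷ
    0x90: "\u049a",  # Қ
    0xA1: "\u04ef",  # ӯ
    0xA2: "\u04ee",  # Ӯ
    0xA5: "\u04e3",  # ӣ
    0xB5: "\u04e2",  # Ӣ
}

# Each letter should encode to exactly the single byte listed above.
table_agrees = all(
    ch.encode("koi8_t") == bytes([byte]) for byte, ch in TAJIK_LETTERS.items()
)

word = "Тоҷикистон"  # "Tajikistan" in Tajik Cyrillic
data = word.encode("koi8_t")

print(table_agrees)
print(data.hex())                    # f4cf8dc9cbc9d3d4cfce
print(data.decode("koi8_t") == word)
```

The third byte, 8d, is the Tajik letter ҷ; the remaining bytes fall in the C_ to F_ rows shared with KOI8-R.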
See also
- Mac OS Turkic Cyrillic, encodes Tajik amongst other languages.
References
1. Flohr, Guido. "Locale::RecodeData::KOI8_T - Conversion routines for KOI8-T". libintl-perl-1.31. CPAN.
2. Davis, Michael (2000-11-21). "Tajiki TrueType fonts for the Web: Frequently Asked Questions". Travel Tajikistan. Archived from the original on 2001-10-05.
3. Storchaka, Serhiy (2014-10-20). "Add support of KOI8-T encoding". Python Bug Tracker.