Version 5.2 of the Unicode standard was released on October 1, 2009. The previous version was Unicode 5.1 and the next version is Unicode 6.0. All the gory details can be found at http://www.unicode.org/versions/Unicode5.2.0/
Unicode 5.2 adds 6,648 new characters, including 26 new code blocks, to support 7 new contemporary scripts and 6 new historic scripts.
Seven new contemporary scripts have been added: Bamum, Javanese, Lisu, Meetei Mayek, Samaritan, Tai Tham, and Tai Viet. New character additions to existing scripts now provide greater support for Abkhaz, Canadian Aboriginal Syllabics, Coptic, Devanagari, Khamti Shan, Malayalam, and Myanmar.
Unicode 5.2 now supports the Gardiner set of Egyptian Hieroglyphs as well as other important historic scripts: Imperial Aramaic, Avestan, Kaithi, Old South Arabian, and Old Turkic.
Unicode 5.2 has exactly the same character assignments as ISO/IEC 10646:2003 plus Amendments 1 through 6.
New Code Blocks
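26 new code blocks were added in 5.2. Each entry gives the block's code point range, its name, and the number of assigned characters out of the block's total size.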
U+0800 to U+083F Samaritan 61/64
U+18B0 to U+18FF Unified Canadian Aboriginal Syllabics Extended 70/80
U+1A20 to U+1AAF Tai Tham 127/144
U+1CD0 to U+1CFF Vedic Extensions 35/48
U+A4D0 to U+A4FF Lisu 48/48
U+A6A0 to U+A6FF Bamum 88/96
U+A830 to U+A83F Common Indic Number Forms 10/16
U+A8E0 to U+A8FF Devanagari Extended 28/32
U+A960 to U+A97F Hangul Jamo Extended A 29/32
U+A980 to U+A9DF Javanese 91/96
U+AA60 to U+AA7F Myanmar Extended A 28/32
U+AA80 to U+AADF Tai Viet 72/96
U+ABC0 to U+ABFF Meetei Mayek 56/64
U+D7B0 to U+D7FF Hangul Jamo Extended B 72/80
U+10840 to U+1085F Imperial Aramaic 31/32
U+10A60 to U+10A7F Old South Arabian 32/32
U+10B00 to U+10B3F Avestan 61/64
U+10B40 to U+10B5F Inscriptional Parthian 30/32
U+10B60 to U+10B7F Inscriptional Pahlavi 27/32
U+10C00 to U+10C4F Old Turkic 73/80
U+10E60 to U+10E7F Rumi Numeral Symbols 31/32
U+11080 to U+110CF Kaithi 66/80
U+13000 to U+1342F Egyptian Hieroglyphs 1071/1072
U+1F100 to U+1F1FF Enclosed Alphanumeric Supplement 63/256
U+1F200 to U+1F2FF Enclosed Ideographic Supplement 44/256
U+2A700 to U+2B73F CJK Unified Ideographs Extension C 4149/4160
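The second number in each entry can be checked from the range itself, since a block's size is simply last minus first plus one. The following is a minimal sketch of that arithmetic; the helper name `block_size` is made up for illustration, and the three rows are copied from the list above.

```python
def block_size(first: str, last: str) -> int:
    """Count the code points in an inclusive range such as U+0800 to U+083F."""
    start = int(first.replace("U+", ""), 16)
    end = int(last.replace("U+", ""), 16)
    return end - start + 1

# Three rows copied from the list above: (first, last, name, assigned).
rows = [
    ("U+0800", "U+083F", "Samaritan", 61),
    ("U+13000", "U+1342F", "Egyptian Hieroglyphs", 1071),
    ("U+2A700", "U+2B73F", "CJK Unified Ideographs Extension C", 4149),
]

for first, last, name, assigned in rows:
    size = block_size(first, last)
    print(f"{name}: {assigned}/{size} assigned, {size - assigned} unassigned")
```

The difference between the two numbers is the count of code points the block still reserves for future additions.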
New Characters
Excluding those in the new code blocks, 155 new characters were added in Unicode 5.2.
Number of characters in each General Category:
Letter, Uppercase Lu : 6
Letter, Lowercase Ll : 3
Letter, Other Lo : 40
Mark, Non-Spacing Mn : 7
Mark, Spacing Combining Mc : 4
Number, Decimal Digit Nd : 1
Number, Other No : 6
Punctuation, Dash Pd : 1
Punctuation, Other Po : 1
Symbol, Currency Sc : 4
Symbol, Other So : 82
Number of characters in each Bidirectional Category:
Left To Right L : 70
Right To Left R : 2
European Number Terminator ET : 4
Non Spacing Mark NSM : 7
Other Neutral ON : 72
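As a quick consistency check, both breakdowns should add up to the same 155 characters. The sketch below just re-enters the counts from the two lists above (the dictionary names are illustrative) and sums them.

```python
# Counts copied from the General Category list above.
general_category = {
    "Lu": 6, "Ll": 3, "Lo": 40, "Mn": 7, "Mc": 4, "Nd": 1,
    "No": 6, "Pd": 1, "Po": 1, "Sc": 4, "So": 82,
}

# Counts copied from the Bidirectional Category list above.
bidi_category = {"L": 70, "R": 2, "ET": 4, "NSM": 7, "ON": 72}

gc_total = sum(general_category.values())
bidi_total = sum(bidi_category.values())
print(gc_total, bidi_total)
```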
The columns below should be interpreted as:
- The Unicode code for the character
- The character in question
- The Unicode name for the character
- The Unicode General Category for the character
- The Unicode Bidirectional Category for the character
If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.
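The five columns described above can also be read back programmatically. The following is a rough sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration. Note that the module reports data for whichever Unicode version ships with the interpreter, not 5.2 specifically, so a property that changed in a later version may differ from the lists on this page.

```python
import unicodedata

def describe(code_point: int) -> tuple:
    """Return (code, character, name, General Category, Bidirectional Category)."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        ch,
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )

# Two characters taken from the lists below.
for cp in (0x0524, 0x20B8):
    print(*describe(cp))
```

The name is returned in upper case, as in the Unicode Character Database, whereas this page shows names in mixed case.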
Cyrillic Supplement
Abkhaz letters
- U+0524 Ԥ Cyrillic capital letter pe with descender Lu L
- U+0525 ԥ Cyrillic small letter pe with descender Ll L
- * used in modern Abkhaz orthography
- ref U+04A7 ҧ Cyrillic small letter pe with middle hook (Cyrillic)
Devanagari
Various signs
- U+0900 ऀ Devanagari sign inverted candrabindu Mn NSM
- aka vaidika adhomukha candrabindu
Archaic dependent vowel sign
- U+094E ॎ Devanagari vowel sign prishthamatra e Mc L
- * character has historic use only
- * combines with E to form AI, with AA to form O, and with O to form AU
Accent marks
- U+0955 ॕ Devanagari vowel sign candra long e Mn NSM
- * used in transliteration of Avestan
Additional consonants
- U+0979 ॹ Devanagari letter zha Lo L
- * used in transliteration of Avestan
- U+097A ॺ Devanagari letter heavy ya Lo L
- * used for an affricated glide JJYA
Bengali
Bengali-specific additions
- U+09FB ৻ Bengali ganda mark Sc ET
Tibetan
Religious symbols
- U+0FD5 ࿕ right facing svasti sign So L
- aka gyung drung nang -khor
- * symbol of good luck and well-being in India
- ref U+5350 卐 CJK Ideograph 5350 (CJK Unified Ideographs)
- U+0FD6 ࿖ left facing svasti sign So L
- aka gyung drung phyi -khor
- ref U+534D 卍 CJK Ideograph 534D (CJK Unified Ideographs)
- U+0FD7 ࿗ right facing svasti sign with dots So L
- aka gyung drung nang -khor bzhi mig can
- U+0FD8 ࿘ left facing svasti sign with dots So L
- aka gyung drung phyi -khor bzhi mig can
Myanmar
Extensions for Khamti Shan
- U+109A ႚ Myanmar sign Khamti tone 1 Mc L
- U+109B ႛ Myanmar sign Khamti tone 3 Mc L
Extensions for Aiton and Phake
- U+109C ႜ Myanmar vowel sign Aiton a Mc L
- U+109D ႝ Myanmar vowel sign Aiton ai Mn NSM
Hangul Jamo
Initial consonants
- U+115A ᅚ Hangul choseong kiyeok tikeut Lo L
- U+115B ᅛ Hangul choseong nieun sios Lo L
- U+115C ᅜ Hangul choseong nieun cieuc Lo L
- U+115D ᅝ Hangul choseong nieun hieuh Lo L
- U+115E ᅞ Hangul choseong tikeut rieul Lo L
Medial vowels
- U+11A3 ᆣ Hangul jungseong a eu Lo L
- U+11A4 ᆤ Hangul jungseong ya u Lo L
- U+11A5 ᆥ Hangul jungseong yeo ya Lo L
- U+11A6 ᆦ Hangul jungseong o ya Lo L
- U+11A7 ᆧ Hangul jungseong o yae Lo L
Final consonants
- U+11FA ᇺ Hangul jongseong kiyeok nieun Lo L
- U+11FB ᇻ Hangul jongseong kiyeok pieup Lo L
- U+11FC ᇼ Hangul jongseong kiyeok chieuch Lo L
- U+11FD ᇽ Hangul jongseong kiyeok khieukh Lo L
- U+11FE ᇾ Hangul jongseong kiyeok hieuh Lo L
- U+11FF ᇿ Hangul jongseong ssangnieun Lo L
Unified Canadian Aboriginal Syllabics
Punctuation
- U+1400 ᐀ Canadian syllabics hyphen Pd ON
Syllables
- U+1677 ᙷ Canadian syllabics Woods Cree thwee Lo L
- U+1678 ᙸ Canadian syllabics Woods Cree thwi Lo L
- U+1679 ᙹ Canadian syllabics Woods Cree thwii Lo L
- U+167A ᙺ Canadian syllabics Woods Cree thwo Lo L
- U+167B ᙻ Canadian syllabics Woods Cree thwoo Lo L
- U+167C ᙼ Canadian syllabics Woods Cree thwa Lo L
- U+167D ᙽ Canadian syllabics Woods Cree thwaa Lo L
- U+167E ᙾ Canadian syllabics Woods Cree final th Lo L
- U+167F ᙿ Canadian syllabics Blackfoot w Lo L
New Tai Lue
Consonants
- U+19AA ᦪ New Tai Lue letter high sua Lo L
- U+19AB ᦫ New Tai Lue letter low sua Lo L
Digits
- U+19DA ᧚ New Tai Lue tham digit one Nd L
Combining Diacritical Marks Supplement
Miscellaneous mark
- U+1DFD ᷽ combining almost equal to below Mn NSM
Currency Symbols
Currency symbols
A number of currency symbols are found in other blocks. Fullwidth versions of some currency symbols are found in the Halfwidth and Fullwidth Forms block.
see also U+0024 $ dollar sign (Basic Latin)
see also U+00A2 ¢ cent sign (Latin-1 Supplement)
see also U+00A3 £ pound sign (Latin-1 Supplement)
see also U+00A4 ¤ currency sign (Latin-1 Supplement)
see also U+00A5 ¥ yen sign (Latin-1 Supplement)
see also U+0192 ƒ Latin small letter F with hook (Latin Extended B)
see also U+060B ؋ afghani sign (Arabic)
see also U+09F2 ৲ Bengali rupee mark (Bengali)
see also U+09F3 ৳ Bengali rupee sign (Bengali)
see also U+0AF1 ૱ Gujarati rupee sign (Gujarati)
see also U+0BF9 ௹ Tamil rupee sign (Tamil)
see also U+0E3F ฿ Thai currency symbol baht (Thai)
see also U+17DB ៛ Khmer currency symbol riel (Khmer)
see also U+2133 ℳ script capital m (Letterlike Symbols)
see also U+5143 元 CJK Ideograph 5143 (CJK Unified Ideographs)
see also U+5186 円 CJK Ideograph 5186 (CJK Unified Ideographs)
see also U+5706 圆 CJK Ideograph 5706 (CJK Unified Ideographs)
see also U+5713 圓 CJK Ideograph 5713 (CJK Unified Ideographs)
see also U+FDFC ﷼ rial sign (Arabic Presentation Forms A)
- U+20B6 ₶ livre tournois sign Sc ET
- * used in France from 13th-18th centuries
- U+20B7 ₷ spesmilo sign Sc ET
- * historical international currency associated with Esperanto
- U+20B8 ₸ tenge sign Sc ET
- * Kazakhstan
- ref U+2351 ⍑ APL functional symbol up tack overbar (Miscellaneous Technical)
- ref U+2564 ╤ box drawings down single and horizontal double (Box Drawing)
- ref U+3012 〒 postal mark (CJK Symbols and Punctuation)
Number Forms
Fractions
Other fraction number forms are found in the Latin-1 Supplement block.
see also U+00BC ¼ vulgar fraction one quarter (Latin-1 Supplement)
see also U+00BD ½ vulgar fraction one half (Latin-1 Supplement)
see also U+00BE ¾ vulgar fraction three quarters (Latin-1 Supplement)
- U+2150 ⅐ vulgar fraction one seventh No ON
- U+2151 ⅑ vulgar fraction one ninth No ON
- U+2152 ⅒ vulgar fraction one tenth No ON
Fraction
- U+2189 ↉ vulgar fraction zero thirds No ON
- * used in baseball scoring, from ARIB STD B24
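Each of these fraction characters carries a numeric value in the Unicode Character Database. As a small sketch, assuming the interpreter's bundled Unicode data includes these characters, Python's `unicodedata.numeric` returns that value as a float, which can be turned back into an exact fraction; the helper name `fraction_value` is illustrative only.

```python
import unicodedata
from fractions import Fraction

def fraction_value(ch: str) -> Fraction:
    """Recover the exact value of a fraction character from its float numeric value."""
    # numeric() returns a float, so snap it to the nearest small-denominator fraction.
    return Fraction(unicodedata.numeric(ch)).limit_denominator(1000)

# The four fraction characters listed above.
for cp in (0x2150, 0x2151, 0x2152, 0x2189):
    print(f"U+{cp:04X}", fraction_value(chr(cp)))
```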
Miscellaneous Technical
Miscellaneous technical
- U+23E8 ⏨ decimal exponent symbol So ON
- * Algol-60 token for scientific notation literals
Miscellaneous Symbols
Symbols for closed captioning from ARIB STD B24
- U+269E ⚞ three lines converging right So ON
- aka someone speaking
- U+269F ⚟ three lines converging left So ON
- aka background speaking
Sports symbols
- U+26BD ⚽ soccer ball So ON
- U+26BE ⚾ baseball So ON
Miscellaneous symbol from ARIB STD B24
- U+26BF ⚿ squared key So ON
- aka parental lock
Weather symbols from ARIB STD B24
- U+26C4 ⛄ snowman without snow So ON
- aka light snow
- U+26C5 ⛅ sun behind cloud So ON
- aka partly cloudy
- U+26C6 ⛆ rain So ON
- aka rainy weather
- U+26C7 ⛇ black snowman So ON
- aka heavy snow
- U+26C8 ⛈ thunder cloud and rain So ON
- aka thunderstorm
Game symbols from ARIB STD B24
- U+26C9 ⛉ turned white shogi piece So ON
- U+26CA ⛊ turned black shogi piece So ON
- U+26CB ⛋ white diamond in square So ON
- ref U+233A ⌺ APL functional symbol quad diamond (Miscellaneous Technical)
Traffic signs from ARIB STD B24
- U+26CC ⛌ crossing lanes So ON
- aka accident
- ref U+292C ⤬ falling diagonal crossing rising diagonal (Supplemental Arrows B)
- U+26CD ⛍ disabled car So ON
- U+26CF ⛏ pick So ON
- aka under construction
- U+26D0 ⛐ car sliding So ON
- aka icy road
- U+26D1 ⛑ helmet with white cross So ON
- aka maintenance
- U+26D2 ⛒ circled crossing lanes So ON
- aka road closed
- U+26D3 ⛓ chains So ON
- aka tyre chains required
- U+26D4 ⛔ no entry So ON
- U+26D5 ⛕ alternate one way left way traffic So ON
- * left side traffic
- U+26D6 ⛖ black two way left way traffic So ON
- * left side traffic
- U+26D7 ⛗ white two way left way traffic So ON
- * left side traffic
- U+26D8 ⛘ black left lane merge So ON
- * left side traffic
- U+26D9 ⛙ white left lane merge So ON
- * left side traffic
- U+26DA ⛚ drive slow sign So ON
- U+26DB ⛛ heavy white down pointing triangle So ON
- aka drive slow
- ref U+25BD ▽ white down pointing triangle (Geometric Shapes)
- U+26DC ⛜ left closed entry So ON
- U+26DD ⛝ squared saltire So ON
- aka closed entry
- ref U+22A0 ⊠ squared times (Mathematical Operators)
- U+26DE ⛞ falling diagonal in white circle in black square So ON
- aka closed to large vehicles
- U+26DF ⛟ black truck So ON
- aka black lorry
- aka closed to large vehicles, alternate
- U+26E0 ⛠ restricted left entry 1 So ON
- U+26E1 ⛡ restricted left entry 2 So ON
Dictionary and map symbols from ARIB STD B24
- U+26E3 ⛣ heavy circle with stroke and two dots above So ON
- aka public office
- U+26E8 ⛨ black cross on shield So ON
- aka hospital
- U+26E9 ⛩ Shinto shrine So ON
- aka torii
- U+26EA ⛪ church So ON
- U+26EB ⛫ castle So ON
- U+26EC ⛬ historic site So ON
- U+26ED ⛭ gear without hub So ON
- aka factory
- ref U+2699 ⚙ gear (Miscellaneous Symbols)
- U+26EE ⛮ gear with handles So ON
- aka power plant, power substation
- U+26EF ⛯ map symbol for lighthouse So ON
- U+26F0 ⛰ mountain So ON
- U+26F1 ⛱ umbrella on ground So ON
- aka bathing beach
- U+26F2 ⛲ fountain So ON
- aka park
- U+26F3 ⛳ flag in hole So ON
- aka golf course
- U+26F4 ⛴ ferry So ON
- aka ferry boat terminal
- U+26F5 ⛵ sailboat So ON
- aka marina or yacht harbour
- U+26F6 ⛶ square four corners So ON
- aka intersection
- U+26F7 ⛷ skier So ON
- aka ski resort
- U+26F8 ⛸ ice skate So ON
- aka ice skating rink
- U+26F9 ⛹ person with ball So ON
- aka track and field, gymnasium
- U+26FA ⛺ tent So ON
- aka camping site
- U+26FB ⛻ Japanese bank symbol So ON
- U+26FC ⛼ headstone graveyard symbol So ON
- aka graveyard, memorial park, cemetery
- U+26FD ⛽ fuel pump So ON
- aka petrol station, gas station
- U+26FE ⛾ cup on black square So ON
- aka drive-in restaurant
- U+26FF ⛿ white flag with horizontal middle black stripe So ON
- aka Japanese self-defence force site
Dingbats
Miscellaneous
- U+2757 ❗ heavy exclamation mark symbol So ON
- aka obstacles on the road, ARIB STD B24
Miscellaneous Symbols and Arrows
Traffic sign from ARIB STD B24
- U+2B55 ⭕ heavy large circle So ON
- aka basic symbol for speed limit
- ref U+25EF ◯ large circle (Geometric Shapes)
Dictionary and map symbols from ARIB STD B24
- U+2B56 ⭖ heavy oval with oval inside So ON
- aka prefectural office
- U+2B57 ⭗ heavy circle with circle inside So ON
- aka municipal office
- ref U+25CE ◎ bullseye (Geometric Shapes)
- U+2B58 ⭘ heavy circle So ON
- aka town or village office
- ref U+25CB ○ white circle (Geometric Shapes)
- U+2B59 ⭙ heavy circled saltire So ON
- aka police station
- ref U+2A02 ⨂ n ary circled times operator (Supplemental Mathematical Operators)
Latin Extended C
Miscellaneous additions
- U+2C70 Ɒ Latin capital letter turned alpha Lu L
- * lowercase is U+0252
Additions for Shona
- U+2C7E Ȿ Latin capital letter S with swash tail Lu L
- * lowercase is U+023F
- U+2C7F Ɀ Latin capital letter Z with swash tail Lu L
- * lowercase is U+0240
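The lowercase notes above describe case mappings that can be exercised directly. The following sketch assumes a Python interpreter whose bundled Unicode data is at version 5.2 or later; the `case_pairs` table simply restates the three notes above, and `lowercase_matches` is a made-up helper name.

```python
# Capital letter -> lowercase counterpart, as stated in the notes above.
case_pairs = {
    0x2C70: 0x0252,  # turned alpha
    0x2C7E: 0x023F,  # S with swash tail
    0x2C7F: 0x0240,  # Z with swash tail
}

def lowercase_matches(upper: int, lower: int) -> bool:
    """Check that str.lower() maps the capital letter to the stated lowercase letter."""
    return chr(upper).lower() == chr(lower)

for upper, lower in case_pairs.items():
    print(f"U+{upper:04X} -> U+{lower:04X}:", lowercase_matches(upper, lower))
```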
Coptic
Cryptogrammic letters
- U+2CEB Ⳬ Coptic capital letter cryptogrammic shei Lu L
- U+2CEC ⳬ Coptic small letter cryptogrammic shei Ll L
- U+2CED Ⳮ Coptic capital letter cryptogrammic gangia Lu L
- U+2CEE ⳮ Coptic small letter cryptogrammic gangia Ll L
Combining marks
- U+2CEF ⳯ Coptic combining ni above Mn NSM
- * this mark is used in final position and extends above the following character (usually a space)
- U+2CF0 ⳰ Coptic combining spiritus asper Mn NSM
- ref U+0314 ̔ combining reversed comma above (Combining Diacritical Marks)
- ref U+0485 ҅ combining Cyrillic dasia pneumata (Cyrillic)
- U+2CF1 ⳱ Coptic combining spiritus lenis Mn NSM
- ref U+0313 ̓ combining comma above (Combining Diacritical Marks)
- ref U+0486 ҆ combining Cyrillic psili pneumata (Cyrillic)
Supplemental Punctuation
Historic punctuation
- U+2E31 ⸱ word separator middle dot Po ON
- * used in Avestan, Samaritan, ...
- ref U+00B7 · middle dot (Latin-1 Supplement)
Enclosed CJK Letters and Months
Circled ideographs from ARIB STD B24
- U+3244 ㉄ circled ideograph question So L
- U+3245 ㉅ circled ideograph kindergarten So L
- U+3246 ㉆ circled ideograph school So L
- U+3247 ㉇ circled ideograph koto So L
Circled numbers on black squares from ARIB STD B24
- U+3248 ㉈ circled number ten on black square So L
- aka speed limit 10 km/h
- U+3249 ㉉ circled number twenty on black square So L
- aka speed limit 20 km/h
- U+324A ㉊ circled number thirty on black square So L
- aka speed limit 30 km/h
- U+324B ㉋ circled number forty on black square So L
- aka speed limit 40 km/h
- U+324C ㉌ circled number fifty on black square So L
- aka speed limit 50 km/h
- U+324D ㉍ circled number sixty on black square So L
- aka speed limit 60 km/h
- U+324E ㉎ circled number seventy on black square So L
- aka speed limit 70 km/h
- U+324F ㉏ circled number eighty on black square So L
- aka speed limit 80 km/h
CJK Unified Ideographs
- U+9FC4 鿄 CJK Ideograph 9FC4 Lo L
- U+9FC5 鿅 CJK Ideograph 9FC5 Lo L
- U+9FC6 鿆 CJK Ideograph 9FC6 Lo L
- U+9FC7 鿇 CJK Ideograph 9FC7 Lo L
- U+9FC8 鿈 CJK Ideograph 9FC8 Lo L
- U+9FC9 鿉 CJK Ideograph 9FC9 Lo L
- U+9FCA 鿊 CJK Ideograph 9FCA Lo L
- U+9FCB 鿋 CJK Ideograph 9FCB Lo L
CJK Compatibility Ideographs
ARIB compatibility ideographs
- U+FA6B 恵 CJK compatibility ideograph FA6B Lo L
- U+FA6C 𤋮 CJK compatibility ideograph FA6C Lo L
- U+FA6D 舘 CJK compatibility ideograph FA6D Lo L
Phoenician
Numbers
- U+1091A 𐤚 Phoenician number two No R
- U+1091B 𐤛 Phoenician number three No R
Altered Characters
In addition, 7 characters were altered in Unicode 5.2
A total of 2 characters changed their General Category
2 characters changed their General Category from Letter, Lowercase to Letter, Modifier
A total of 5 characters changed their Bidirectional Category
5 characters changed their Bidirectional Category from Left To Right to Other Neutral
Superscripts and Subscripts
U+2071 ⁱ superscript Latin small letter I had its General Category changed from Letter, Lowercase to Letter, Modifier
U+207F ⁿ superscript Latin small letter N had its General Category changed from Letter, Lowercase to Letter, Modifier
Mathematical Alphanumeric Symbols
U+1D6DB 𝛛 mathematical bold partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D715 𝜕 mathematical italic partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D74F 𝝏 mathematical bold italic partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D789 𝞉 mathematical sans serif bold partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D7C3 𝟃 mathematical sans serif bold italic partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
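The altered properties listed above can be spot-checked against a current interpreter. This is only a sketch: it assumes that none of these seven characters has had its General Category or Bidirectional Category changed again since Unicode 5.2, and the list names are illustrative.

```python
import unicodedata

# Characters whose General Category changed from Ll to Lm.
changed_category = [chr(cp) for cp in (0x2071, 0x207F)]

# Characters whose Bidirectional Category changed from L to ON.
changed_bidi = [chr(cp) for cp in (0x1D6DB, 0x1D715, 0x1D74F, 0x1D789, 0x1D7C3)]

for ch in changed_category:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch))

for ch in changed_bidi:
    print(f"U+{ord(ch):05X}", unicodedata.bidirectional(ch))
```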
http://unicode.org
Some prose may have been lifted verbatim from unicode.org,
as is permitted by their terms of use at http://www.unicode.org/copyright.html