Unicode 4.1 was released in 2005. The previous version was Unicode 4.0 and the next is Unicode 5.0.
All the gory details can be found at http://www.unicode.org/versions/Unicode4.1.0/
Unicode 4.1 adds 1273 new characters, most of which are in the 20 new code blocks.
There are eight new scripts supported in 4.1: New Tai Lue, Buginese, Glagolitic, Coptic, Tifinagh, Syloti Nagri, Old Persian and Kharoshthi. Coptic used to be lumped in with Greek in the Greek and Coptic code block, but now has its own block.
The changes from 4.0.1 include the following.
New Code Blocks
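20 new code blocks were added in 4.1. Each entry below gives the block's code point range, its name, and the number of assigned characters out of the block's total size.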
U+0750 to U+077F Arabic Supplement 30/48
U+1380 to U+139F Ethiopic Supplement 26/32
U+1980 to U+19DF New Tai Lue 80/96
U+1A00 to U+1A1F Buginese 30/32
U+1D80 to U+1DBF Phonetic Extensions Supplement 64/64
U+1DC0 to U+1DFF Combining Diacritical Marks Supplement 4/64
U+2C00 to U+2C5F Glagolitic 94/96
U+2C80 to U+2CFF Coptic 114/128
U+2D00 to U+2D2F Georgian Supplement 38/48
U+2D30 to U+2D7F Tifinagh 55/80
U+2D80 to U+2DDF Ethiopic Extended 79/96
U+2E00 to U+2E7F Supplemental Punctuation 26/128
U+31C0 to U+31EF CJK Strokes 16/48
U+A700 to U+A71F Modifier Tone Letters 23/32
U+A800 to U+A82F Syloti Nagri 44/48
U+FE10 to U+FE1F Vertical Forms 10/16
U+10140 to U+1018F Ancient Greek Numbers 75/80
U+103A0 to U+103DF Old Persian 50/64
U+10A00 to U+10A5F Kharoshthi 65/96
U+1D200 to U+1D24F Ancient Greek Musical Notation 70/80
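As a quick consistency check on the table above, the following sketch recomputes each block's size from its code point range and totals the assigned characters. The tuples are transcribed by hand from the list, in the same order, as (first code point, last code point, assigned, size). The names `BLOCKS` and `block_size` are illustrative only, and 280 is the count of new characters outside the new blocks quoted on this page.

```python
# Rows transcribed from the table above, in the same order:
# (first code point, last code point, assigned characters, block size)
BLOCKS = [
    (0x0750, 0x077F, 30, 48),
    (0x1380, 0x139F, 26, 32),
    (0x1980, 0x19DF, 80, 96),
    (0x1A00, 0x1A1F, 30, 32),
    (0x1D80, 0x1DBF, 64, 64),
    (0x1DC0, 0x1DFF, 4, 64),
    (0x2C00, 0x2C5F, 94, 96),
    (0x2C80, 0x2CFF, 114, 128),
    (0x2D00, 0x2D2F, 38, 48),
    (0x2D30, 0x2D7F, 55, 80),
    (0x2D80, 0x2DDF, 79, 96),
    (0x2E00, 0x2E7F, 26, 128),
    (0x31C0, 0x31EF, 16, 48),
    (0xA700, 0xA71F, 23, 32),
    (0xA800, 0xA82F, 44, 48),
    (0xFE10, 0xFE1F, 10, 16),
    (0x10140, 0x1018F, 75, 80),
    (0x103A0, 0x103DF, 50, 64),
    (0x10A00, 0x10A5F, 65, 96),
    (0x1D200, 0x1D24F, 70, 80),
]


def block_size(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1


# Rows whose stated size disagrees with the range, or that claim more
# assigned characters than the block can hold.
mismatched = [
    (hex(first), hex(last))
    for first, last, assigned, size in BLOCKS
    if block_size(first, last) != size or assigned > size
]

total_assigned = sum(assigned for _, _, assigned, _ in BLOCKS)

print(len(BLOCKS), mismatched, total_assigned, total_assigned + 280)
# prints: 20 [] 993 1273
```

The 993 characters in new blocks plus the 280 added to existing blocks give the 1273 new characters quoted for this version.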
New Characters
Excluding those in the new code blocks, there were 280 new characters added in Unicode 4.1.
Number of characters in each General Category:
Letter, Uppercase Lu : 9
Letter, Lowercase Ll : 30
Letter, Modifier Lm : 7
Letter, Other Lo : 142
Mark, Non-Spacing Mn : 16
Number, Decimal Digit Nd : 1
Punctuation, Open Ps : 1
Punctuation, Close Pe : 1
Punctuation, Other Po : 14
Symbol, Math Sm : 5
Symbol, Currency Sc : 5
Symbol, Other So : 49
Number of characters in each Bidirectional Category:
Left To Right L : 192
Right To Left R : 1
Right To Left Arabic AL : 2
European Number Terminator ET : 4
Non Spacing Mark NSM : 16
Other Neutral ON : 65
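Both tallies above should account for the same 280 characters. A minimal sketch of that check follows, with the counts copied from the two lists; the dictionary names are illustrative only.

```python
# Counts copied from the General Category list above.
GENERAL_CATEGORY = {
    "Lu": 9, "Ll": 30, "Lm": 7, "Lo": 142, "Mn": 16, "Nd": 1,
    "Ps": 1, "Pe": 1, "Po": 14, "Sm": 5, "Sc": 5, "So": 49,
}

# Counts copied from the Bidirectional Category list above.
BIDI_CATEGORY = {"L": 192, "R": 1, "AL": 2, "ET": 4, "NSM": 16, "ON": 65}

print(sum(GENERAL_CATEGORY.values()))  # prints: 280
print(sum(BIDI_CATEGORY.values()))     # prints: 280

# The 16 non-spacing marks (Mn) line up with the 16 NSM entries.
print(GENERAL_CATEGORY["Mn"] == BIDI_CATEGORY["NSM"])  # prints: True
```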
The columns below should be interpreted as:
- The Unicode code for the character
- The character in question
- The Unicode name for the character
- The Unicode General Category for the character
- The Unicode Bidirectional Category for the character
If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.
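The five columns can be reproduced with Python's standard `unicodedata` module, as in the sketch below. The helper name `describe` is hypothetical. Note that the module reports names in upper case and reflects whichever Unicode version is bundled with the Python build, not 4.1, so properties that changed in later versions may differ from the lists on this page.

```python
import unicodedata


def describe(code_point):
    """Return (code, character, name, General Category, Bidirectional Category)."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        ch,
        unicodedata.name(ch, "<unnamed>"),  # fallback for unassigned code points
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


# Three characters taken from the lists on this page.
for cp in (0x0237, 0x0358, 0x20B2):
    print(" ".join(describe(cp)))
```

The category and bidirectional values come back as the two- or three-letter abbreviations used in the last two columns, such as Ll, Mn, NSM and ET.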
Latin Extended B
- U+0237 ȷ Latin small letter dotless j Ll L
- U+0238 ȸ Latin small letter db digraph Ll L
- U+0239 ȹ Latin small letter qp digraph Ll L
- U+023A Ⱥ Latin capital letter A with stroke Lu L
- U+023B Ȼ Latin capital letter C with stroke Lu L
- U+023C ȼ Latin small letter C with stroke Ll L
- U+023D Ƚ Latin capital letter L with bar Lu L
- U+023E Ⱦ Latin capital letter T with diagonal stroke Lu L
- U+023F ȿ Latin small letter S with swash tail Ll L
- U+0240 ɀ Latin small letter Z with swash tail Ll L
- U+0241 Ɂ Latin capital letter glottal stop Lu L
Combining Diacritical Marks
- U+0358 ͘ combining dot above right Mn NSM
- U+0359 ͙ combining asterisk below Mn NSM
- U+035A ͚ combining double ring below Mn NSM
- U+035B ͛ combining zigzag above Mn NSM
- U+035C ͜ combining double breve below Mn NSM
Greek and Coptic
- U+03FC ϼ Greek rho with stroke symbol Ll ON
- U+03FD Ͻ Greek capital reversed lunate sigma symbol Lu L
- U+03FE Ͼ Greek capital dotted lunate sigma symbol Lu L
- U+03FF Ͽ Greek capital reversed dotted lunate sigma symbol Lu L
Cyrillic
Extended Cyrillic
- U+04F6 Ӷ Cyrillic capital letter ghe with descender Lu L
- U+04F7 ӷ Cyrillic small letter ghe with descender Ll L
Hebrew
Cantillation marks
- U+05A2 ֢ Hebrew accent atnah hafukh Mn NSM
Points and punctuation
- U+05BA ֺ Hebrew point qamats qatan Mn NSM
- U+05C5 ׅ Hebrew mark lower dot Mn NSM
- U+05C6 ׆ Hebrew punctuation nun hafukha Po R
Arabic
- U+060B ؋ afghani sign Sc AL
Punctuation
- U+061E ؞ Arabic triple dot punctuation mark Po AL
- U+0659 ٙ Arabic zwarakay Mn NSM
- U+065A ٚ Arabic vowel sign small v above Mn NSM
- U+065B ٛ Arabic vowel sign inverted small v above Mn NSM
- U+065C ٜ Arabic vowel sign dot below Mn NSM
- U+065D ٝ Arabic reversed damma Mn NSM
- U+065E ٞ Arabic fatha with two dots Mn NSM
Devanagari
- U+097D ॽ Devanagari letter glottal stop Lo L
Bengali
Various signs
- U+09CE ৎ Bengali letter khanda ta Lo L
Tamil
Consonants
- U+0BB6 ஶ Tamil letter sha Lo L
Digits
- U+0BE6 ௦ Tamil digit zero Nd L
- ref U+0030 0 digit zero (Basic Latin)
Tibetan
- U+0FD0 ࿐ Tibetan mark bska shog gi mgo rgyan Po L
- U+0FD1 ࿑ Tibetan mark mnyam yig gi mgo rgyan Po L
Georgian
- U+10F9 ჹ Georgian letter turned gan Lo L
- U+10FA ჺ Georgian letter ain Lo L
- U+10FC ჼ modifier letter Georgian nar Lm L
Ethiopic
Syllables
- U+1207 ሇ Ethiopic syllable hoa Lo L
- U+1247 ቇ Ethiopic syllable qoa Lo L
- U+1287 ኇ Ethiopic syllable xoa Lo L
- U+12AF ኯ Ethiopic syllable koa Lo L
- U+12CF ዏ Ethiopic syllable woa Lo L
- U+12EF ዯ Ethiopic syllable yoa Lo L
- U+130F ጏ Ethiopic syllable goa Lo L
- U+131F ጟ Ethiopic syllable ggwaa Lo L
- U+1347 ፇ Ethiopic syllable tzoa Lo L
- U+135F ፟ Ethiopic combining gemination mark Mn NSM
- U+1360 ፠ Ethiopic section mark Po L
Phonetic Extensions
- U+1D6C ᵬ Latin small letter B with middle tilde Ll L
- U+1D6D ᵭ Latin small letter D with middle tilde Ll L
- U+1D6E ᵮ Latin small letter F with middle tilde Ll L
- U+1D6F ᵯ Latin small letter M with middle tilde Ll L
- U+1D70 ᵰ Latin small letter N with middle tilde Ll L
- U+1D71 ᵱ Latin small letter P with middle tilde Ll L
- U+1D72 ᵲ Latin small letter R with middle tilde Ll L
- U+1D73 ᵳ Latin small letter R with fishhook and middle tilde Ll L
- U+1D74 ᵴ Latin small letter S with middle tilde Ll L
- U+1D75 ᵵ Latin small letter T with middle tilde Ll L
- U+1D76 ᵶ Latin small letter Z with middle tilde Ll L
- U+1D77 ᵷ Latin small letter turned g Ll L
- U+1D78 ᵸ modifier letter Cyrillic en Lm L
- U+1D79 ᵹ Latin small letter insular g Ll L
- U+1D7A ᵺ Latin small letter th with strikethrough Ll L
- U+1D7B ᵻ Latin small capital letter I with stroke Ll L
- U+1D7C ᵼ Latin small letter iota with stroke Ll L
- U+1D7D ᵽ Latin small letter P with stroke Ll L
- U+1D7E ᵾ Latin small capital letter U with stroke Ll L
- U+1D7F ᵿ Latin small letter upsilon with stroke Ll L
General Punctuation
General punctuation
- U+2055 ⁕ flower punctuation mark Po ON
- U+2056 ⁖ three dot punctuation Po ON
- U+2058 ⁘ four dot punctuation Po ON
- U+2059 ⁙ five dot punctuation Po ON
- U+205A ⁚ two dot punctuation Po ON
- U+205B ⁛ four dot mark Po ON
- U+205C ⁜ dotted cross Po ON
- U+205D ⁝ tricolon Po ON
- U+205E ⁞ vertical four dots Po ON
Superscripts and Subscripts
- U+2090 ₐ Latin subscript small letter A Lm L
- U+2091 ₑ Latin subscript small letter E Lm L
- U+2092 ₒ Latin subscript small letter O Lm L
- U+2093 ₓ Latin subscript small letter X Lm L
- U+2094 ₔ Latin subscript small letter schwa Lm L
Currency Symbols
- U+20B2 ₲ guarani sign Sc ET
- U+20B3 ₳ austral sign Sc ET
- U+20B4 ₴ hryvnia sign Sc ET
- U+20B5 ₵ cedi sign Sc ET
Combining Diacritical Marks for Symbols
- U+20EB ⃫ combining long double solidus overlay Mn NSM
Letterlike Symbols
Additional letterlike symbols
- U+213C ℼ double struck small pi Ll L
- U+214C ⅌ per sign So ON
Miscellaneous Technical
- U+23D1 ⏑ metrical breve So ON
- U+23D2 ⏒ metrical long over short So ON
- U+23D3 ⏓ metrical short over long So ON
- U+23D4 ⏔ metrical long over two shorts So ON
- U+23D5 ⏕ metrical two shorts over long So ON
- U+23D6 ⏖ metrical two shorts joined So ON
- U+23D7 ⏗ metrical triseme So ON
- U+23D8 ⏘ metrical tetraseme So ON
- U+23D9 ⏙ metrical pentaseme So ON
- U+23DA ⏚ earth ground So ON
- U+23DB ⏛ fuse So ON
Miscellaneous Symbols
- U+2618 ☘ shamrock So ON
- U+267E ♾ permanent paper sign So ON
- U+267F ♿ wheelchair symbol So ON
- U+2692 ⚒ hammer and pick So ON
- U+2693 ⚓ anchor So ON
- U+2694 ⚔ crossed swords So ON
- U+2695 ⚕ staff of aesculapius So ON
- U+2696 ⚖ scales So ON
- U+2697 ⚗ alembic So ON
- U+2698 ⚘ flower So ON
- U+2699 ⚙ gear So ON
- U+269A ⚚ staff of hermes So ON
- U+269B ⚛ atom symbol So ON
- U+269C ⚜ fleur de lis So ON
- U+26A2 ⚢ doubled female sign So ON
- U+26A3 ⚣ doubled male sign So ON
- U+26A4 ⚤ interlocked female and male sign So ON
- U+26A5 ⚥ male and female sign So ON
- U+26A6 ⚦ male with stroke sign So ON
- U+26A7 ⚧ male with stroke and male and female sign So ON
- U+26A8 ⚨ vertical male with stroke sign So ON
- U+26A9 ⚩ horizontal male with stroke sign So ON
- U+26AA ⚪ medium white circle So ON
- U+26AB ⚫ medium black circle So ON
- U+26AC ⚬ medium small white circle So L
- U+26AD ⚭ marriage symbol So ON
- U+26AE ⚮ divorce symbol So ON
- U+26AF ⚯ unmarried partnership symbol So ON
- U+26B0 ⚰ coffin So ON
- U+26B1 ⚱ funeral urn So ON
Miscellaneous Mathematical Symbols A
- U+27C0 ⟀ three dimensional angle Sm ON
- U+27C1 ⟁ white triangle containing small white triangle Sm ON
- U+27C2 ⟂ perpendicular Sm ON
- U+27C3 ⟃ open subset Sm ON
- U+27C4 ⟄ open superset Sm ON
- U+27C5 ⟅ left s shaped bag delimiter Ps ON
- U+27C6 ⟆ right s shaped bag delimiter Pe ON
Miscellaneous Symbols and Arrows
Arrows
Other white and black arrows to complete this set can be found in Supplemental Arrows-B and Dingbats.
- U+2B0E ⬎ rightwards arrow with tip downwards So ON
- U+2B0F ⬏ rightwards arrow with tip upwards So ON
- U+2B10 ⬐ leftwards arrow with tip downwards So ON
- U+2B11 ⬑ leftwards arrow with tip upwards So ON
- U+2B12 ⬒ square with top half black So ON
- U+2B13 ⬓ square with bottom half black So ON
Enclosed CJK Letters and Months
- U+327E ㉾ circled Hangul ieung u So ON
Mathematical Alphanumeric Symbols
- U+1D6A4 𝚤 mathematical italic small dotless i Ll L
- U+1D6A5 𝚥 mathematical italic small dotless j Ll L
Altered Characters
In addition, 14 characters were altered in 4.1.
A total of 12 characters changed their General Category:
1 character changed its General Category from Letter, Other to Letter, Modifier
9 characters changed their General Category from Number, Decimal Digit to Number, Other
2 characters changed their General Category from Punctuation, Connector to Punctuation, Other
A total of 2 characters changed their Bidirectional Category:
1 character changed its Bidirectional Category from European Number Separator to Common Number Separator
1 character changed its Bidirectional Category from Whitespace to Common Number Separator
Ethiopic
- U+1369 ፩ Ethiopic digit one had its General Category changed from Number, Decimal Digit to Number, Other
- U+136A ፪ Ethiopic digit two had its General Category changed from Number, Decimal Digit to Number, Other
- U+136B ፫ Ethiopic digit three had its General Category changed from Number, Decimal Digit to Number, Other
- U+136C ፬ Ethiopic digit four had its General Category changed from Number, Decimal Digit to Number, Other
- U+136D ፭ Ethiopic digit five had its General Category changed from Number, Decimal Digit to Number, Other
- U+136E ፮ Ethiopic digit six had its General Category changed from Number, Decimal Digit to Number, Other
- U+136F ፯ Ethiopic digit seven had its General Category changed from Number, Decimal Digit to Number, Other
- U+1370 ፰ Ethiopic digit eight had its General Category changed from Number, Decimal Digit to Number, Other
- U+1371 ፱ Ethiopic digit nine had its General Category changed from Number, Decimal Digit to Number, Other
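The practical effect of moving the Ethiopic digits out of Number, Decimal Digit can be seen in software that treats only decimal digits as parseable numbers. The sketch below uses Python's `unicodedata` module, which reflects the Unicode version bundled with the interpreter rather than 4.1 itself. The helper `parses_as_int` is hypothetical and only wraps the built-in `int()`.

```python
import unicodedata

# U+1369 to U+1371: Ethiopic digits one to nine.
ETHIOPIC_DIGITS = [chr(cp) for cp in range(0x1369, 0x1372)]


def parses_as_int(text):
    """True if the built-in int() accepts the text as a decimal number."""
    try:
        int(text)
        return True
    except ValueError:
        return False


for ch in ETHIOPIC_DIGITS:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),  # general category abbreviation
        unicodedata.numeric(ch),   # numeric value is still available
        ch.isdecimal(),            # but the character is not a decimal digit
        parses_as_int(ch),
    )
```

With Unicode data from 4.1 or later, each row should report category No, a numeric value from 1.0 to 9.0, and False for both the `isdecimal()` and `int()` checks.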
General Punctuation
- U+202F narrow no break space had its Bidirectional Category changed from Whitespace to Common Number Separator
Katakana
- U+30FB ・ Katakana middle dot had its General Category changed from Punctuation, Connector to Punctuation, Other
Yi Syllables
- U+A015 ꀕ Yi syllable wu had its General Category changed from Letter, Other to Letter, Modifier
Halfwidth and Fullwidth Forms
- U+FF0F ／ fullwidth solidus had its Bidirectional Category changed from European Number Separator to Common Number Separator
- U+FF65 ･ halfwidth Katakana middle dot had its General Category changed from Punctuation, Connector to Punctuation, Other
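The remaining altered characters can be checked the same way. The sketch below compares the post-4.1 values listed above against Python's `unicodedata` module, using the standard abbreviations CS (Common Number Separator), Po (Punctuation, Other) and Lm (Letter, Modifier). The table and function names are illustrative, and the result depends on the Unicode version bundled with the Python build.

```python
import unicodedata

# Values after the 4.1 changes, as listed above.
EXPECTED_BIDI = {0x202F: "CS", 0xFF0F: "CS"}
EXPECTED_CATEGORY = {0x30FB: "Po", 0xA015: "Lm", 0xFF65: "Po"}


def mismatches():
    """Map each disagreeing code point to (expected, found)."""
    found = {}
    for cp, want in EXPECTED_BIDI.items():
        got = unicodedata.bidirectional(chr(cp))
        if got != want:
            found[f"U+{cp:04X}"] = (want, got)
    for cp, want in EXPECTED_CATEGORY.items():
        got = unicodedata.category(chr(cp))
        if got != want:
            found[f"U+{cp:04X}"] = (want, got)
    return found


# An empty dict means the bundled data agrees with the values above.
print(mismatches())
```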
http://unicode.org