Unicode 5.0 was released in 2006. The previous version was Unicode 4.1; at the time of writing, the current version is Unicode 5.1.

All the gory details can be found at http://www.unicode.org/versions/Unicode5.0.0/

Unicode 5.0 adds 1369 new characters, most of which are in the 9 new code blocks : NKo, Balinese, Latin Extended-C, Latin Extended-D, Phags-pa, Phoenician, Cuneiform, Cuneiform Numbers and Punctuation, and Counting Rod Numerals.

The changes from 4.1 include the following.


New Code Blocks

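9 new code blocks were added in 5.0. Each entry below gives the block's code point range, its name, and the number of characters assigned in 5.0 out of the total number of positions in the block.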


U+07C0 to U+07FF   NKo 59/64
U+1B00 to U+1B7F   Balinese 121/128
U+2C60 to U+2C7F   Latin Extended-C 17/32
U+A720 to U+A7FF   Latin Extended-D 2/224
U+A840 to U+A87F   Phags-pa 56/64
U+10900 to U+1091F   Phoenician 27/32
U+12000 to U+123FF   Cuneiform 879/1024
U+12400 to U+1247F   Cuneiform Numbers and Punctuation 103/128
U+1D360 to U+1D37F   Counting Rod Numerals 18/32
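
As a cross-check on the figures above, the following Python sketch recomputes each block's size from its range and sums the assigned counts. The list NEW_BLOCKS simply restates the table; its name and the helper names block_size and block_of are illustrative only and not part of any Unicode tooling.

```python
# Ranges and assigned counts restated from the table above.
NEW_BLOCKS = [
    (0x07C0, 0x07FF, "NKo", 59),
    (0x1B00, 0x1B7F, "Balinese", 121),
    (0x2C60, 0x2C7F, "Latin Extended-C", 17),
    (0xA720, 0xA7FF, "Latin Extended-D", 2),
    (0xA840, 0xA87F, "Phags-pa", 56),
    (0x10900, 0x1091F, "Phoenician", 27),
    (0x12000, 0x123FF, "Cuneiform", 879),
    (0x12400, 0x1247F, "Cuneiform Numbers and Punctuation", 103),
    (0x1D360, 0x1D37F, "Counting Rod Numerals", 18),
]


def block_size(first, last):
    """Number of code positions in an inclusive range."""
    return last - first + 1


def block_of(code_point):
    """Return the name of the new block containing code_point, or None."""
    for first, last, name, _assigned in NEW_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


# Characters assigned inside the new blocks.
in_new_blocks = sum(assigned for _, _, _, assigned in NEW_BLOCKS)

for first, last, name, assigned in NEW_BLOCKS:
    print("%-35s %4d/%d" % (name, assigned, block_size(first, last)))
print("Total in new blocks:", in_new_blocks)
```

Adding the 87 characters that fall outside the new blocks to this total gives the 1369 quoted at the top of the page.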

 

New Characters

Excluding those in the new code blocks, 87 new characters were added in Unicode 5.0.

Number of characters in each General Category :

Letter, Uppercase  Lu : 14
Letter, Lowercase  Ll : 18
Letter, Modifier   Lm :  4
Letter, Other      Lo :  4
Mark, Non-Spacing  Mn : 16
Symbol, Math       Sm : 10
Symbol, Other      So : 21

Number of characters in each Bidirectional Category :

Left To Right       L : 36
Non Spacing Mark  NSM : 16
Other Neutral      ON : 35

The columns below should be interpreted as :

  1. The Unicode code for the character
  2. The character in question
  3. The Unicode name for the character
  4. The Unicode General Category for the character
  5. The Unicode Bidirectional Category for the character
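
The same five columns can be reproduced for any code point with Python's standard unicodedata module, as in the minimal sketch below. The helper name describe is hypothetical. Note two assumptions: unicodedata reports the Unicode version bundled with the interpreter, which is later than 5.0, so values could differ for any character whose properties changed afterwards; and names are lowercased here for readability, whereas the listing on this page keeps capitals on script names.

```python
import unicodedata


def describe(code_point):
    """Return (code, character, name, General Category, Bidirectional Category)."""
    ch = chr(code_point)
    return (
        "U+%04X" % code_point,
        ch,
        # Fall back to a placeholder if the interpreter has no name for it.
        unicodedata.name(ch, "<unnamed>").lower(),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


print(describe(0x0242))
print(describe(0x05BA))
```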

If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.

 

Latin Extended-B

     Miscellaneous additions

U+0242   ɂ   Latin small letter glottal stop Ll L
* casing use in Chipewyan, Dogrib, Slavey (Canadian aboriginal orthographies)
ref U+0294   ʔ   Latin letter glottal stop (IPA Extensions)
ref U+02C0   ˀ   modifier letter glottal stop (Spacing Modifier Letters)
U+0243   Ƀ   Latin capital letter B with stroke Lu L
* lowercase is 0180
U+0244   Ʉ   Latin capital letter U bar Lu L
* lowercase is 0289
U+0245   Ʌ   Latin capital letter turned v Lu L
* lowercase is 028C
U+0246   Ɇ   Latin capital letter E with stroke Lu L
U+0247   ɇ   Latin small letter E with stroke Ll L
U+0248   Ɉ   Latin capital letter J with stroke Lu L
U+0249   ɉ   Latin small letter J with stroke Ll L
U+024A   Ɋ   Latin capital letter small q with hook tail Lu L
U+024B   ɋ   Latin small letter Q with hook tail Ll L
U+024C   Ɍ   Latin capital letter R with stroke Lu L
U+024D   ɍ   Latin small letter R with stroke Ll L
U+024E   Ɏ   Latin capital letter Y with stroke Lu L
U+024F   ɏ   Latin small letter Y with stroke Ll L
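
The "lowercase is" notes above describe case mappings, which can be checked with ordinary string methods. This is only a sketch: the dictionary CASE_PAIRS restates the three notes, the helper lowercase_matches is hypothetical, and the result depends on the Unicode data shipped with the Python interpreter in use.

```python
# Uppercase -> lowercase pairs restated from the notes above.
CASE_PAIRS = {
    0x0243: 0x0180,  # B with stroke
    0x0244: 0x0289,  # U bar
    0x0245: 0x028C,  # turned V
}


def lowercase_matches(upper_cp, lower_cp):
    """True if lowercasing the first code point yields the second."""
    return chr(upper_cp).lower() == chr(lower_cp)


for upper_cp, lower_cp in CASE_PAIRS.items():
    print("U+%04X -> U+%04X : %s"
          % (upper_cp, lower_cp, lowercase_matches(upper_cp, lower_cp)))
```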

 

Greek and Coptic

     Lowercase of editorial symbols

U+037B   ͻ   Greek small reversed lunate sigma symbol Ll L
U+037C   ͼ   Greek small dotted lunate sigma symbol Ll L
U+037D   ͽ   Greek small reversed dotted lunate sigma symbol Ll L

 

Cyrillic

     Extended Cyrillic

U+04CF   ӏ   Cyrillic small letter palochka Ll L

     Additions for Nivkh

U+04FA   Ӻ   Cyrillic capital letter ghe with stroke and hook Lu L
U+04FB   ӻ   Cyrillic small letter ghe with stroke and hook Ll L
U+04FC   Ӽ   Cyrillic capital letter ha with hook Lu L
U+04FD   ӽ   Cyrillic small letter ha with hook Ll L
U+04FE   Ӿ   Cyrillic capital letter ha with stroke Lu L
U+04FF   ӿ   Cyrillic small letter ha with stroke Ll L

 

Cyrillic Supplement

     Cyrillic extensions

U+0510   Ԑ   Cyrillic capital letter reversed ze Lu L
U+0511   ԑ   Cyrillic small letter reversed ze Ll L
U+0512   Ԓ   Cyrillic capital letter el with hook Lu L
U+0513   ԓ   Cyrillic small letter el with hook Ll L

 

Hebrew

     Points and punctuation

U+05BA   ֺ   Hebrew point holam haser for vav Mn NSM

 

Devanagari

     Sindhi implosives
These are added from Amendment 3 to 10646:2003.

U+097B   ॻ   Devanagari letter gga Lo L
U+097C   ॼ   Devanagari letter jja Lo L

U+097E   ॾ   Devanagari letter ddda Lo L
U+097F   ॿ   Devanagari letter bba Lo L

 

Kannada

     Additional vowels for Sanskrit

U+0CE2   ೢ   Kannada vowel sign vocalic l Mn NSM
U+0CE3   ೣ   Kannada vowel sign vocalic ll Mn NSM

     Various signs

U+0CF1   ೱ   Kannada sign jihvamuliya So ON
U+0CF2   ೲ   Kannada sign upadhmaniya So ON

 

Combining Diacritical Marks Supplement

     Contour tone marks

U+1DC4   ᷄   combining macron acute Mn NSM
U+1DC5   ᷅   combining grave macron Mn NSM
U+1DC6   ᷆   combining macron grave Mn NSM
U+1DC7   ᷇   combining acute macron Mn NSM
U+1DC8   ᷈   combining grave acute grave Mn NSM
U+1DC9   ᷉   combining acute grave acute Mn NSM

     Miscellaneous mark

U+1DCA   ᷊   combining Latin small letter R below Mn NSM

     Additional marks for UPA

U+1DFE   ᷾   combining left arrowhead above Mn NSM
U+1DFF   ᷿   combining right arrowhead and down arrowhead below Mn NSM

 

Combining Diacritical Marks for Symbols

     Additional diacritical marks for symbols

U+20EC   ⃬   combining rightwards harpoon with barb downwards Mn NSM
U+20ED   ⃭   combining leftwards harpoon with barb downwards Mn NSM
U+20EE   ⃮   combining left arrow below Mn NSM
U+20EF   ⃯   combining right arrow below Mn NSM

 

Letterlike Symbols

     Additional letterlike symbols

U+214D   ⅍   aktieselskab So ON
ref U+2101   ℁   addressed to the subject (Letterlike Symbols)

     Lowercase Claudian letter
Claudian letters in inscriptions are uppercase, but may be transcribed by scholars in lowercase.

U+214E   ⅎ   turned small f Ll L
* uppercase is 2132
ref U+03DD   ϝ   Greek small letter digamma (Greek and Coptic)

 

Number Forms

     Lowercase Claudian letter
Claudian letters in inscriptions are uppercase, but may be transcribed by scholars in lowercase.

U+2184   ↄ   Latin small letter reversed c Ll L
ref U+037B   ͻ   Greek small reversed lunate sigma symbol (Greek and Coptic)

 

Miscellaneous Technical

     Horizontal brackets
These are intended for bracketing terms of mathematical expressions, where their glyph extends to accommodate the width of the bracketed expression.

U+23DC   ⏜   top parenthesis Sm ON
ref U+FE35   ︵   presentation form for vertical left parenthesis (CJK Compatibility Forms)
U+23DD   ⏝   bottom parenthesis Sm ON
ref U+FE36   ︶   presentation form for vertical right parenthesis (CJK Compatibility Forms)
U+23DE   ⏞   top curly bracket Sm ON
ref U+FE37   ︷   presentation form for vertical left curly bracket (CJK Compatibility Forms)
U+23DF   ⏟   bottom curly bracket Sm ON
ref U+FE38   ︸   presentation form for vertical right curly bracket (CJK Compatibility Forms)
U+23E0   ⏠   top tortoise shell bracket Sm ON
ref U+FE39   ︹   presentation form for vertical left tortoise shell bracket (CJK Compatibility Forms)
U+23E1   ⏡   bottom tortoise shell bracket Sm ON
ref U+FE3A   ︺   presentation form for vertical right tortoise shell bracket (CJK Compatibility Forms)

     Miscellaneous technical

U+23E2   ⏢   white trapezium So ON

     Chemistry symbol

U+23E3   ⏣   benzene ring with circle So ON

     Miscellaneous technical

U+23E4   ⏤   straightness So ON
U+23E5   ⏥   flatness So ON
U+23E6   ⏦   ac current So ON
U+23E7   ⏧   electrical intersection So ON

 

Miscellaneous Symbols

     Gender symbol

U+26B2   ⚲   neuter So ON

 

Miscellaneous Mathematical Symbols-A

     Miscellaneous symbols

U+27C7   ⟇   or with dot inside Sm ON
U+27C8   ⟈   reverse solidus preceding subset Sm ON
U+27C9   ⟉   superset preceding solidus Sm ON

     Vertical line operator

U+27CA   ⟊   vertical bar with horizontal stroke Sm ON
ref U+2AF2   ⫲   parallel with horizontal stroke (Supplemental Mathematical Operators)
ref U+2AF5   ⫵   triple vertical bar with horizontal stroke (Supplemental Mathematical Operators)

 

Miscellaneous Symbols and Arrows

     Squares

U+2B14   ⬔   square with upper right diagonal half black So ON
U+2B15   ⬕   square with lower left diagonal half black So ON

     Diamonds

U+2B16   ⬖   diamond with left half black So ON
U+2B17   ⬗   diamond with right half black So ON
U+2B18   ⬘   diamond with top half black So ON
U+2B19   ⬙   diamond with bottom half black So ON

     Square

U+2B1A   ⬚   dotted square So ON

     Pentagon

U+2B20   ⬠   white pentagon So ON

     Hexagons

U+2B21   ⬡   white hexagon So ON
U+2B22   ⬢   black hexagon So ON
U+2B23   ⬣   horizontal black hexagon So ON

 

Modifier Tone Letters

     Chinantec tone marks

U+A717   ꜗ   modifier letter dot vertical bar Lm ON
U+A718   ꜘ   modifier letter dot slash Lm ON
U+A719   ꜙ   modifier letter dot horizontal bar Lm ON
U+A71A   ꜚ   modifier letter lower right corner angle Lm ON

 

Mathematical Alphanumeric Symbols

     Additional bold Greek symbols

U+1D7CA   𝟊   mathematical bold capital digamma Lu L
U+1D7CB   𝟋   mathematical bold small digamma Ll L

 

Altered Characters


In addition, 13 characters were altered in 5.0.

A total of 8 characters changed their General Category
1 character changed its General Category from Letter, Lowercase to Letter, Other
1 character changed its General Category from Letter, Other to Number, Letter
1 character changed its General Category from Number, Letter to Letter, Uppercase
1 character changed its General Category from Punctuation, Open to Symbol, Other
1 character changed its General Category from Punctuation, Close to Symbol, Other
1 character changed its General Category from Punctuation, Other to Symbol, Other
1 character changed its General Category from Symbol, Other to Letter, Uppercase
1 character changed its General Category from Symbol, Other to Punctuation, Other
 
A total of 6 characters changed their Bidirectional Category
6 characters changed their Bidirectional Category from Other Neutral to Left To Right
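
The totals of 8 and 6 add up to 14 rather than 13 because one character, U+2132, appears in both lists. A small Python sketch using the code points from the per-block sections below makes the overlap explicit; the set names are illustrative only.

```python
# Code points restated from the per-block sections that follow.
GENERAL_CATEGORY_CHANGED = {
    0x0294, 0x2132, 0x2183, 0x23B4, 0x23B5, 0x23B6, 0x10341, 0x103D0,
}
BIDI_CHANGED = {
    0x2132, 0x103D1, 0x103D2, 0x103D3, 0x103D4, 0x103D5,
}

# Distinct characters that were altered in either way.
altered = GENERAL_CATEGORY_CHANGED | BIDI_CHANGED
# Characters that were altered in both ways.
both = GENERAL_CATEGORY_CHANGED & BIDI_CHANGED

print(len(GENERAL_CATEGORY_CHANGED), len(BIDI_CHANGED), len(altered))
print(["U+%04X" % cp for cp in sorted(both)])
```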

 

IPA Extensions


U+0294   ʔ   Latin letter glottal stop had its General Category changed from Letter, Lowercase to Letter, Other

 

Letterlike Symbols


U+2132   Ⅎ   turned capital f had its General Category changed from Symbol, Other to Letter, Uppercase
U+2132   Ⅎ   turned capital f had its Bidirectional Category changed from Other Neutral to Left To Right

 

Number Forms


U+2183   Ↄ   Roman numeral reversed one hundred had its General Category changed from Number, Letter to Letter, Uppercase

 

Miscellaneous Technical


U+23B4   ⎴   top square bracket had its General Category changed from Punctuation, Open to Symbol, Other
U+23B5   ⎵   bottom square bracket had its General Category changed from Punctuation, Close to Symbol, Other
U+23B6   ⎶   bottom square bracket over top square bracket had its General Category changed from Punctuation, Other to Symbol, Other

 

Gothic


U+10341   𐍁   Gothic letter ninety had its General Category changed from Letter, Other to Number, Letter

 

Old Persian


U+103D0   𐏐   Old Persian word divider had its General Category changed from Symbol, Other to Punctuation, Other
U+103D1   𐏑   Old Persian number one had its Bidirectional Category changed from Other Neutral to Left To Right
U+103D2   𐏒   Old Persian number two had its Bidirectional Category changed from Other Neutral to Left To Right
U+103D3   𐏓   Old Persian number ten had its Bidirectional Category changed from Other Neutral to Left To Right
U+103D4   𐏔   Old Persian number twenty had its Bidirectional Category changed from Other Neutral to Left To Right
U+103D5   𐏕   Old Persian number hundred had its Bidirectional Category changed from Other Neutral to Left To Right

Source: http://unicode.org
Some prose may have been lifted verbatim from unicode.org, as is permitted by their terms of use at http://www.unicode.org/copyright.html
