Unicode 3.1 was released in March 2001, updated to Unicode 3.1.1 in August 2001, and later updated once more to Unicode 3.1.1 with Corrigendum. The previous version was Unicode 3.0 and the next version is Unicode 3.2.

Unicode 3.1.1 with Corrigendum

This is exactly Unicode 3.1.1, with the addition of Corrigendum #3: U+F951 Normalization (http://www.unicode.org/versions/corrigendum3.html), which states:
The canonical decomposition mapping for U+F951 was recently found to be in error. The correct mapping is to U+964B. This was printed correctly in Unicode 2.0, but was mistakenly entered as U+96FB in the UnicodeData.txt file, and remained uncorrected in successive versions. This corrigendum fixes that error.
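
The corrected mapping can be observed with a short check. This is a minimal sketch that assumes Python's standard unicodedata module; its data comes from a much later Unicode version than 3.1.1 and so already includes the fix. It does not reproduce the erroneous 3.1.1 data.

```python
import unicodedata

compat = "\uF951"  # CJK COMPATIBILITY IDEOGRAPH-F951

# Canonical decomposition mapping, returned as a string of hex code points.
mapping = unicodedata.decomposition(compat)
print(mapping)  # "964B" with the corrigendum applied, not "96FB"

# Canonical normalization replaces the compatibility ideograph
# with the target of its decomposition mapping.
normalized = unicodedata.normalize("NFD", compat)
print(f"U+{ord(normalized):04X}")
```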

Unicode 3.1.1

The Unicode Standard, Version 3.1.1 is defined by: The Unicode Standard, Version 3.0 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5), as amended by the Unicode Standard Annex #27: Unicode 3.1 (http://www.unicode.org/reports/tr27/) and the Unicode 3.1.1 Update Notice (http://www.unicode.org/versions/Unicode3.1.1.html).

Unicode 3.1.1 contains no character additions or major normative changes, only very subtle changes in a few secondary data files.

Unicode 3.1

The Unicode Standard, Version 3.1.0, is defined by: The Unicode Standard, Version 3.0 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5), as amended by the Unicode Standard Annex #27: Unicode 3.1 (http://www.unicode.org/reports/tr27/).

Unicode 3.1 adds many characters, and is the first Unicode version to assign characters to the supplementary planes (i.e. code points U+10000 and above, outside the original 2-byte limit). Specifically,

The Supplementary Multilingual Plane, or Plane 1, contains several historic scripts, and several sets of symbols: Old Italic, Gothic, Deseret, Byzantine Musical Symbols, (Western) Musical Symbols, and Mathematical Alphanumeric Symbols. Together these comprise 1,594 newly encoded characters.

The Supplementary Ideographic Plane, or Plane 2, contains a very large collection of additional unified Han ideographs known as Vertical Extension B, comprising 42,711 characters, as well as 542 additional CJK Compatibility ideographs.

The Supplementary Special Purpose Plane, or Plane 14, contains a set of tag characters, 97 in all.
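
The relationship between a code point and its plane can be sketched as follows. The helper names plane and utf16_units are hypothetical, chosen for illustration. The surrogate-pair arithmetic is the standard UTF-16 scheme, shown here only to illustrate why these code points do not fit in a single 2-byte unit.

```python
def plane(cp: int) -> int:
    """Return the plane number (0 to 16) of a code point."""
    return cp >> 16


def utf16_units(cp: int) -> list[int]:
    """Return the UTF-16 code units for a code point.

    BMP code points need one 16-bit unit; supplementary code points
    need a high and a low surrogate.
    """
    if cp < 0x10000:
        return [cp]
    offset = cp - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


# First code points of blocks introduced in Unicode 3.1.
for cp in (0x10300, 0x20000, 0xE0000):
    units = " ".join(f"{u:04X}" for u in utf16_units(cp))
    print(f"U+{cp:04X}: plane {plane(cp)}, UTF-16 units {units}")
```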

Counting the additions to the three supplementary planes and the two characters on the BMP, Unicode 3.1 adds 44,946 new encoded characters. Together with the 49,194 already existing characters in Unicode 3.0, that comes to a grand total of 94,140 encoded characters in Unicode 3.1.

Of those 94,140 characters, 70,207 are unified Han ideographs, and an additional 832 are CJK Compatibility ideographs -- slightly more than 75% of the encoded characters in the standard.
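
The counts quoted in the two paragraphs above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the figures as stated here; the dictionary keys are descriptive labels, not official names.

```python
# Characters added in Unicode 3.1, as stated in the text.
added_by_area = {
    "Plane 1 (historic scripts and symbols)": 1594,
    "Plane 2 (unified Han, Extension B)": 42711,
    "Plane 2 (CJK Compatibility ideographs)": 542,
    "Plane 14 (tag characters)": 97,
    "BMP": 2,
}

added = sum(added_by_area.values())
total = 49194 + added  # 49,194 characters already in Unicode 3.0

# Share of the repertoire made up of Han ideographs
# (unified plus CJK Compatibility).
han_share = (70207 + 832) / total

print(added, total, f"{han_share:.1%}")  # 44946 94140 75.5%
```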

There are 34 specific code points in Unicode 3.0 that are characterized as noncharacters: U+nFFFE and U+nFFFF, where n is from 0 to hex 10. Unicode 3.1 adds an additional 32 noncharacters to the BMP at code points U+FDD0 to U+FDEF, for a total of 66.
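
The full set of noncharacters can be enumerated directly from that description. A minimal sketch, with noncharacters as a hypothetical helper name:

```python
def noncharacters() -> list[int]:
    """List every noncharacter code point as of Unicode 3.1."""
    # The last two code points of each of the 17 planes: U+nFFFE and U+nFFFF.
    plane_final = [
        (n << 16) | low
        for n in range(0x11)
        for low in (0xFFFE, 0xFFFF)
    ]
    # The 32 BMP noncharacters added in Unicode 3.1: U+FDD0 to U+FDEF.
    bmp_range = list(range(0xFDD0, 0xFDEF + 1))
    return sorted(plane_final + bmp_range)


nonchars = noncharacters()
print(len(nonchars))  # 66
```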

Unicode Technical Reports #11: East Asian Width, #13: Unicode Newline Guidelines, #14: Line Breaking Properties, and #15: Unicode Normalization Forms have been promoted to the status of Unicode Standard Annex (UAX) and are thus officially part of the Unicode Standard.

Some of the differences between Unicode 3.0 and Unicode 3.1 include:

New Code Blocks

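11 new code blocks were added in 3.1. Each entry gives the block's range and name, followed by the number of characters in the block out of the number of code points it spans.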

U+10300 to U+1032F   Old Italic 35/48
U+10330 to U+1034F   Gothic 27/32
U+10400 to U+1044F   Deseret 76/80
U+1D000 to U+1D0FF   Byzantine Musical Symbols 246/256
U+1D100 to U+1D1FF   Musical Symbols 219/256
U+1D400 to U+1D7FF   Mathematical Alphanumeric Symbols 991/1024
U+20000 to U+2A6DF   CJK Unified Ideographs Extension B 42711/42720
U+2F800 to U+2FA1F   CJK Compatibility Ideographs Supplement 542/544
U+E0000 to U+E007F   Tags 97/128
U+F0000 to U+FFFFF   Supplementary Private Use Area A 65534/65536
U+100000 to U+10FFFF   Supplementary Private Use Area B 65534/65536
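
The block sizes in the list above follow from the ranges, since a block from first to last spans last - first + 1 code points. A short sketch that recomputes them from the figures given here (the BLOCKS table simply restates the list):

```python
# (first, last, name, characters) restated from the list above.
BLOCKS = [
    (0x10300, 0x1032F, "Old Italic", 35),
    (0x10330, 0x1034F, "Gothic", 27),
    (0x10400, 0x1044F, "Deseret", 76),
    (0x1D000, 0x1D0FF, "Byzantine Musical Symbols", 246),
    (0x1D100, 0x1D1FF, "Musical Symbols", 219),
    (0x1D400, 0x1D7FF, "Mathematical Alphanumeric Symbols", 991),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B", 42711),
    (0x2F800, 0x2FA1F, "CJK Compatibility Ideographs Supplement", 542),
    (0xE0000, 0xE007F, "Tags", 97),
    (0xF0000, 0xFFFFF, "Supplementary Private Use Area A", 65534),
    (0x100000, 0x10FFFF, "Supplementary Private Use Area B", 65534),
]

sizes = [last - first + 1 for first, last, _, _ in BLOCKS]
for (first, last, name, chars), size in zip(BLOCKS, sizes):
    print(f"U+{first:04X} to U+{last:04X}  {name} {chars}/{size}")

# Private-use code points are not encoded characters, so they are
# excluded when totalling the characters added in the new blocks.
encoded = sum(chars for _, _, name, chars in BLOCKS if "Private Use" not in name)
print(encoded)  # 44944
```

Adding the 2 new BMP characters to the 44,944 computed here gives the 44,946 additions quoted earlier.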


New Characters

Excluding those in the new code blocks, there were 2 new characters added in Unicode 3.1.

Number of characters in each General Category:

Letter, Uppercase          Lu :  1
Letter, Lowercase          Ll :  1

All the characters in this set are in bidirectional category Left-to-Right (L).

The columns below should be interpreted as:

  1. The Unicode code for the character
  2. The character in question
  3. The Unicode name for the character
  4. The Unicode General Category for the character

If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.


Greek and Coptic

     Greek symbols

U+03F4  ϴ Greek capital theta symbol Lu
ref U+0472 Cyrillic capital letter fita (Cyrillic)
U+03F5  ϵ Greek lunate epsilon symbol Ll
aka straight epsilon
ref U+220A small element of (Mathematical Operators)
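
The properties listed for these two characters can be looked up programmatically. A minimal sketch, assuming Python's standard unicodedata module (which reflects the Unicode version bundled with the interpreter, not 3.1 specifically); describe is a hypothetical helper name:

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str, str]:
    """Return (code, name, General Category, bidirectional category)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


for ch in ("\u03F4", "\u03F5"):
    print(*describe(ch))
```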


Altered Characters

In addition, 3 characters were altered in 3.1:



U+16EE     Runic arlaug symbol had its General Category changed from Number, Other (No) to Number, Letter (Nl)
U+16EF     Runic tvimadur symbol had its General Category changed from Number, Other (No) to Number, Letter (Nl)
U+16F0     Runic belgthor symbol had its General Category changed from Number, Other (No) to Number, Letter (Nl)
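
The post-change category of these three characters can be confirmed in the same way. Again this assumes Python's unicodedata module, whose data is from a later Unicode version; it shows only the current value, not the earlier Number, Other assignment.

```python
import unicodedata

RUNIC_SYMBOLS = ("\u16EE", "\u16EF", "\u16F0")

# Map each character name to its current General Category.
categories = {unicodedata.name(ch): unicodedata.category(ch) for ch in RUNIC_SYMBOLS}

for name, category in categories.items():
    print(f"{name}: {category}")
```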