Unicode 3.0 was released in February, 2000, updated to Unicode 3.0.1 in August, 2000, and later updated once again to Unicode 3.0.1 with Corrigendum. The previous version was Unicode 2.1 and the next version is Unicode 3.1.

Unicode 3.0.1 with Corrigendum

Unicode 3.0.1 with Corrigendum is defined by: The Unicode Standard, Version 3.0 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5) as amended by the Unicode 3.0.1 Update Notice (http://www.unicode.org/versions/Unicode3.0.1.html), Corrigendum #1: UTF-8 Shortest Form (http://www.unicode.org/versions/corrigendum1.html) and Corrigendum #2: Yod with Hiriq Normalization (http://www.unicode.org/versions/corrigendum2.html).

Corrigendum #1: UTF-8 Shortest Form states:

The conformance clause C12 in The Unicode Standard, Version 3.0 forbids the generation of "non-shortest form" UTF-8, and forbids the interpretation of illegal sequences, but not the interpretation of "non-shortest form". Where software does interpret the non-shortest forms, security issues can arise. For example:
  1. Process A performs security checks, but does not check for non-shortest forms.
  2. Process B accepts the byte sequence from process A, and transforms it into UTF-16 while interpreting non-shortest forms.
  3. The UTF-16 text may then contain characters that should have been filtered out by process A.
To address this issue, the Unicode Technical Committee has modified the definition of UTF-8 to forbid conformant implementations from interpreting non-shortest forms for BMP characters, and clarified some of the conformance clauses.
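The scenario in the three steps above can be sketched in Python. This is only an illustration under stated assumptions: `contains_slash` and `naive_decode_2byte` are hypothetical stand-ins for process A and process B, and the lenient decoder handles only one- and two-byte sequences. Python's built-in UTF-8 codec is used as an example of a decoder that refuses non-shortest forms.

```python
def contains_slash(data: bytes) -> bool:
    """Hypothetical 'process A': a byte-level check that only knows
    the shortest form of U+002F '/' (the single byte 0x2F)."""
    return b"/" in data


def naive_decode_2byte(data: bytes) -> str:
    """Hypothetical 'process B': a lenient decoder for 1- and 2-byte
    sequences that does NOT reject non-shortest forms."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # Combine 5 payload bits of the lead byte with 6 of the trail byte,
            # without checking that the result really needs two bytes.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


# 0xC0 0xAF is a non-shortest (two-byte) encoding of U+002F '/'.
overlong_slash = b"..\xc0\xaf.."

filter_saw_slash = contains_slash(overlong_slash)    # process A misses it
lenient_result = naive_decode_2byte(overlong_slash)  # process B produces '/'

# A decoder that follows the corrigendum rejects the sequence instead.
try:
    overlong_slash.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

The point of the sketch is the mismatch: the filter and the decoder disagree about which byte sequences mean "/", which is the gap the corrigendum closes.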
Corrigendum #2: Yod with Hiriq Normalization states:
In the production of the normalization tables for Unicode 3.0, the character U+FB1D Hebrew letter yod with hiriq was mistakenly omitted from Composition Exclusions. During the public review period, this mistake was reported, but the report was misinterpreted and thus overlooked. In Unicode 3.1, this character is now included in Composition Exclusions.
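The effect of a composition exclusion can be observed with Python's `unicodedata` module. This is a small sketch, assuming the interpreter's Unicode database is a later version than 3.0 and therefore already carries the corrected exclusion (which is the case for the versions shipped with Python 3).

```python
import unicodedata

precomposed = "\uFB1D"        # HEBREW LETTER YOD WITH HIRIQ
decomposed = "\u05D9\u05B4"   # HEBREW LETTER YOD + HEBREW POINT HIRIQ

# NFD always expands the presentation form into its two components.
nfd = unicodedata.normalize("NFD", precomposed)

# Because U+FB1D is a composition exclusion, NFC never recombines the
# pair, so both inputs end up in the decomposed form.
nfc_from_precomposed = unicodedata.normalize("NFC", precomposed)
nfc_from_decomposed = unicodedata.normalize("NFC", decomposed)
```

Without the exclusion, NFC would have recomposed the pair into U+FB1D, which is the behaviour the corrigendum rules out.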

Version 3.0.1

Version 3.0.1 is defined by: The Unicode Standard, Version 3.0 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5) as amended by the Unicode 3.0.1 Update Notice (http://www.unicode.org/versions/Unicode3.0.1.html ).

Unicode 3.0.1 does not contain character additions or major normative changes.

Three new data files have been added to the Unicode 3.0.1 release:

BidiMirroring.txt (see UAX #9: The Bidirectional Algorithm)
Informative properties for substituting characters in an implementation of bidirectional mirroring.
CaseFolding.txt (see UTR #21: Case Mappings)
Informative file mapping characters to their case-folded form.
NormalizationTest.txt (see UAX #15 Unicode Normalization Forms)
Normative test file for conformance to Unicode Normalization Forms.
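As an illustration of how NormalizationTest.txt can be used, the sketch below checks a single data line against Python's `unicodedata.normalize`. It assumes the five-column layout found in current versions of the file (source; NFC; NFD; NFKC; NFKD, with `#` starting a comment). The helper names `parse_test_line` and `check_line` are hypothetical, and the sample line is written in that format for illustration rather than read from the 3.0.1 file.

```python
import unicodedata


def parse_test_line(line: str) -> list:
    """Split one data line into its five columns, decoded to strings."""
    data = line.split("#", 1)[0]               # drop the trailing comment
    fields = data.strip().split(";")[:5]       # c1..c5 as hex code point lists
    return ["".join(chr(int(cp, 16)) for cp in f.split()) for f in fields]


def check_line(line: str) -> bool:
    """Return True if the normalizer agrees with every column of the line."""
    c1, c2, c3, c4, c5 = parse_test_line(line)

    def norm(form, s):
        return unicodedata.normalize(form, s)

    return (
        all(norm("NFC", c) == c2 for c in (c1, c2, c3))
        and all(norm("NFC", c) == c4 for c in (c4, c5))
        and all(norm("NFD", c) == c3 for c in (c1, c2, c3))
        and all(norm("NFD", c) == c5 for c in (c4, c5))
        and all(norm("NFKC", c) == c4 for c in (c1, c2, c3, c4, c5))
        and all(norm("NFKD", c) == c5 for c in (c1, c2, c3, c4, c5))
    )


# Illustrative line: LATIN CAPITAL LETTER D WITH DOT ABOVE
sample = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"
sample_ok = check_line(sample)
```

A full conformance run would apply `check_line` to every data line of the file and report the lines that fail.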

Version 3.0

Version 3.0 is defined by: The Unicode Standard, Version 3.0 (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5).

The Unicode Standard, Version 3.0 contains descriptions and properties for many new characters, and is synchronized with ISO/IEC 10646-1, second edition.

The following technical reports are approved and upgraded to the status of Unicode Standard Annex (UAX) and thus considered part of the Unicode Standard, Version 3.0. These reports may contain either normative or informative material, or both. Any reference to version 3.0 of the standard automatically includes these technical reports.

UAX #09: The Bidirectional Algorithm
UAX #11: East Asian Character Width
UAX #13: Unicode Newline Guidelines
UAX #14: Line Breaking Properties
UAX #15: Unicode Normalization Forms

 

The most significant additions to the standard include the following:

Transformation Formats
The precise definitions of the common Unicode Transformation Formats are provided, including UTF-8, UTF-16, UTF-16BE, and UTF-16LE. The relations between abstract characters, code points (scalar values) and code units (8, 16 or 32 bit) are clarified.
Bidirectional properties
Bidirectional properties are now more consistent with the General Category property, and new bidirectional properties were created. See UAX #09: The Bidirectional Algorithm.
Case
Case properties have been extended for those situations where there is a mapping to multiple characters and where case is locale dependent.
Combining classes
These were updated significantly to resolve problems of normalization and decomposition for Indic scripts in particular.
Decomposition and Composition
Unicode character decompositions have been significantly updated to fix errors in the original assignments, to allow correct collation weighting, and to make decompositions consistent for normalization. Certain characters are excluded from composition, and the precise algorithm for composition is provided. See UAX #15: Unicode Normalization Forms.
General Category
A series of general category changes were made to assist the convergence of the Unicode definition of identifier with ISO TR 10176.
Newlines
Line handling characteristics have been documented more fully for Unicode environments. See UAX #13: Unicode Newline Guidelines.
Linebreak properties
Linebreaking properties (normative and informative) are added to the standard to support consistent linebreaking behavior over all Unicode characters. See UAX #14: Line Breaking Properties.
East-Asian width properties
Properties for supporting correct choice of full-width vs. half-width glyphs in an East-Asian context are provided. See UAX #11: East Asian Character Width.
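Several of the properties in the list above can be inspected from Python. The sketch below is only an illustration: the `unicodedata` module reflects whichever later Unicode version ships with the interpreter, not Unicode 3.0 itself, and the sample characters were chosen here for demonstration.

```python
import unicodedata

ch = "\u0219"  # LATIN SMALL LETTER S WITH COMMA BELOW (new in 3.0)

# Transformation formats: one code point, different code units.
utf8 = ch.encode("utf-8")          # two 8-bit code units
utf16be = ch.encode("utf-16-be")   # one 16-bit code unit, big-endian
utf16le = ch.encode("utf-16-le")   # the same code unit, little-endian

# Case: a mapping from one character to multiple characters.
upper_sharp_s = "\u00DF".upper()   # LATIN SMALL LETTER SHARP S

# Combining class, decomposition and General Category.
acute_class = unicodedata.combining("\u0301")   # COMBINING ACUTE ACCENT
decomposition = unicodedata.decomposition(ch)
category = unicodedata.category(ch)

# East Asian width: fullwidth vs. halfwidth forms.
fullwidth_a = unicodedata.east_asian_width("\uFF21")  # FULLWIDTH LATIN CAPITAL LETTER A
halfwidth_a = unicodedata.east_asian_width("\uFF71")  # HALFWIDTH KATAKANA LETTER A
```

The three encodings show the distinction the standard draws between a code point (U+0219) and its code units, which differ in number and byte order from one transformation format to another.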
The major differences from Unicode 2.1 to Unicode 3.0 include:

New Code Blocks

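19 new code blocks were added in 3.0. Each entry below gives the block's code point range, its name, and the number of assigned characters out of the total positions in the block: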


U+0700 to U+074F   Syriac 71/80
U+0780 to U+07BF   Thaana 49/64
U+0D80 to U+0DFF   Sinhala 80/128
U+1000 to U+109F   Myanmar 78/160
U+1200 to U+137F   Ethiopic 345/384
U+13A0 to U+13FF   Cherokee 85/96
U+1400 to U+167F   Unified Canadian Aboriginal Syllabics 630/640
U+1680 to U+169F   Ogham 29/32
U+16A0 to U+16FF   Runic 81/96
U+1780 to U+17FF   Khmer 103/128
U+1800 to U+18AF   Mongolian 155/176
U+2800 to U+28FF   Braille Patterns 256/256
U+2E80 to U+2EFF   CJK Radicals Supplement 115/128
U+2F00 to U+2FDF   Kangxi Radicals 214/224
U+2FF0 to U+2FFF   Ideographic Description Characters 12/16
U+31A0 to U+31BF   Bopomofo Extended 24/32
U+3400 to U+4DBF   CJK Unified Ideographs Extension A 6582/6592
U+A000 to U+A48F   Yi Syllables 1165/1168
U+A490 to U+A4CF   Yi Radicals 50/64
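The second number in each row is simply the size of the range. A minimal sketch that parses a few of the rows above and checks this arithmetic follows; `ROW_RE` and `parse_row` are hypothetical helpers written for this listing's layout, not part of any Unicode tool.

```python
import re

# A few rows copied from the table above.
ROWS = """\
U+0700 to U+074F   Syriac 71/80
U+1200 to U+137F   Ethiopic 345/384
U+2800 to U+28FF   Braille Patterns 256/256
U+3400 to U+4DBF   CJK Unified Ideographs Extension A 6582/6592
"""

ROW_RE = re.compile(r"U\+([0-9A-F]{4}) to U\+([0-9A-F]{4})\s+(.+?) (\d+)/(\d+)$")


def parse_row(row: str) -> dict:
    """Turn one table row into a dict of range, name and counts."""
    m = ROW_RE.match(row)
    if m is None:
        raise ValueError(f"unrecognised row: {row!r}")
    return {
        "first": int(m.group(1), 16),
        "last": int(m.group(2), 16),
        "name": m.group(3),
        "assigned": int(m.group(4)),
        "size": int(m.group(5)),
    }


blocks = [parse_row(r) for r in ROWS.splitlines()]

# The block size equals last - first + 1; the rest is unassigned space.
sizes_consistent = all(b["last"] - b["first"] + 1 == b["size"] for b in blocks)
unassigned = {b["name"]: b["size"] - b["assigned"] for b in blocks}
```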

 

New Bidirectional Categories

8 new Bidirectional Categories were added in 3.0:

  • LeftToRightEmbedding (LRE)
  • LeftToRightOverride (LRO)
  • RightToLeftArabic (AL)
  • RightToLeftEmbedding (RLE)
  • RightToLeftOverride (RLO)
  • PopDirectionalFormat (PDF)
  • NonSpacingMark (NSM)
  • BoundaryNeutral (BN)
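Each of these categories can be looked up with `unicodedata.bidirectional`, which returns the abbreviation shown in parentheses. The sample characters below were picked for illustration, and the values come from the interpreter's (later) Unicode database rather than from the 3.0 data files.

```python
import unicodedata

# One representative character per category listed above.
samples = {
    "LRE": "\u202A",  # LEFT-TO-RIGHT EMBEDDING
    "LRO": "\u202D",  # LEFT-TO-RIGHT OVERRIDE
    "AL":  "\u0627",  # ARABIC LETTER ALEF
    "RLE": "\u202B",  # RIGHT-TO-LEFT EMBEDDING
    "RLO": "\u202E",  # RIGHT-TO-LEFT OVERRIDE
    "PDF": "\u202C",  # POP DIRECTIONAL FORMATTING
    "NSM": "\u0301",  # COMBINING ACUTE ACCENT
    "BN":  "\u200B",  # ZERO WIDTH SPACE
}

looked_up = {abbr: unicodedata.bidirectional(ch) for abbr, ch in samples.items()}
```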

 

New Characters

Excluding those in the new code blocks, there were 183 new characters added in Unicode 3.0.

Number of characters in each General Category:

Letter, Uppercase          Lu : 21
Letter, Lowercase          Ll : 30
Letter, Modifier           Lm :  1
Letter, Other              Lo :  9
Mark, Non-Spacing          Mn : 22
Mark, Enclosing            Me :  4
Number, Letter             Nl :  4
Punctuation, Dash          Pd :  1
Punctuation, Other         Po :  6
Symbol, Currency           Sc :  3
Symbol, Modifier           Sk :  5
Symbol, Other              So : 73
Separator, Space           Zs :  1
Other, Format              Cf :  3

Number of characters in each Bidirectional Category:

LeftToRight                L : 73
RightToLeft                R :  1
RightToLeftArabic         AL :  9
EuropeanNumberTerminator  ET :  3
NonSpacingMark           NSM : 26
BoundaryNeutral           BN :  3
Whitespace                WS :  1
OtherNeutrals             ON : 67
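Both breakdowns cover the same 183 characters, so each set of counts should add up to that figure. A quick check, with the numbers copied from the two tables above:

```python
# Counts per General Category, as listed above.
by_general_category = {
    "Lu": 21, "Ll": 30, "Lm": 1, "Lo": 9, "Mn": 22, "Me": 4, "Nl": 4,
    "Pd": 1, "Po": 6, "Sc": 3, "Sk": 5, "So": 73, "Zs": 1, "Cf": 3,
}

# Counts per Bidirectional Category, as listed above.
by_bidi_category = {
    "L": 73, "R": 1, "AL": 9, "ET": 3, "NSM": 26, "BN": 3, "WS": 1, "ON": 67,
}

total_general = sum(by_general_category.values())
total_bidi = sum(by_bidi_category.values())
```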

The columns below should be interpreted as:

  1. The Unicode code for the character
  2. The character in question
  3. The Unicode name for the character
  4. The Unicode General Category for the character
  5. The Unicode Bidirectional Category for the character
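For readers who want to process the listing mechanically, the five columns can be pulled apart with a regular expression. This is a rough sketch for this page's layout only: `Entry`, `ENTRY_RE` and `parse_entry` are hypothetical names, and entries whose character column is invisible or is itself whitespace (such as U+202F) will not match.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    code: int         # column 1, as an integer code point
    char: str         # column 2
    name: str         # column 3
    category: str     # column 4, General Category
    bidi: str         # column 5, Bidirectional Category


ENTRY_RE = re.compile(
    r"^U\+([0-9A-F]{4,6})\s+(\S+)\s+(.+?)\s+([A-Z][a-z])\s+([A-Z]{1,3})$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one character line; annotation lines ('*', 'aka', 'ref') give None."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    return Entry(int(m.group(1), 16), m.group(2), m.group(3), m.group(4), m.group(5))


entry = parse_entry("U+01F6  Ƕ Latin capital letter hwair Lu L")
annotation = parse_entry("* lowercase is 0195")
```

Comparing `chr(entry.code)` with `entry.char` is a cheap way to spot rows where the displayed character and the code point have drifted apart.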

If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.

 

Latin Extended B

     Additions

U+01F6  Ƕ Latin capital letter hwair Lu L
* lowercase is 0195
U+01F7  Ƿ Latin capital letter wynn Lu L
aka wen
* lowercase is 01BF
U+01F8  Ǹ Latin capital letter N with grave Lu L
U+01F9  ǹ Latin small letter N with grave Ll L
* Pinyin

     Additions for Romanian

U+0218  Ș Latin capital letter S with comma below Lu L
U+0219  ș Latin small letter S with comma below Ll L
* Romanian, when distinct comma below form is required
ref U+015F Latin small letter S with cedilla (Latin Extended A)
U+021A  Ț Latin capital letter T with comma below Lu L
U+021B  ț Latin small letter T with comma below Ll L
* Romanian, when distinct comma below form is required
ref U+0163 Latin small letter T with cedilla (Latin Extended A)

     Miscellaneous additions

U+021C  Ȝ Latin capital letter yogh Lu L
ref U+01B7 Latin capital letter ezh (Latin Extended B)
U+021D  ȝ Latin small letter yogh Ll L
* Middle English, Scots
ref U+0292 Latin small letter ezh (IPA Extensions)
ref U+2125 ounce sign (Letterlike Symbols)
U+021E  Ȟ Latin capital letter H with caron Lu L
U+021F  ȟ Latin small letter H with caron Ll L
* Finnish Romany
U+0222  Ȣ Latin capital letter ou Lu L
U+0223  ȣ Latin small letter ou Ll L
* Algonquin, Huron
ref U+0038 digit eight (Basic Latin)
U+0224  Ȥ Latin capital letter Z with hook Lu L
U+0225  ȥ Latin small letter Z with hook Ll L
* Middle High German
U+0226  Ȧ Latin capital letter A with dot above Lu L
U+0227  ȧ Latin small letter A with dot above Ll L
* Uralicist usage
U+0228  Ȩ Latin capital letter E with cedilla Lu L
U+0229  ȩ Latin small letter E with cedilla Ll L

     Additions for Livonian

U+022A  Ȫ Latin capital letter O with diaeresis and macron Lu L
U+022B  ȫ Latin small letter O with diaeresis and macron Ll L
* Livonian
U+022C  Ȭ Latin capital letter O with tilde and macron Lu L
U+022D  ȭ Latin small letter O with tilde and macron Ll L
* Livonian
U+022E  Ȯ Latin capital letter O with dot above Lu L
U+022F  ȯ Latin small letter O with dot above Ll L
* Livonian
U+0230  Ȱ Latin capital letter O with dot above and macron Lu L
U+0231  ȱ Latin small letter O with dot above and macron Ll L
* Livonian
U+0232  Ȳ Latin capital letter Y with macron Lu L
U+0233  ȳ Latin small letter Y with macron Ll L
* Livonian, Cornish

 

IPA Extensions

     IPA characters for disordered speech

U+02A9  ʩ Latin small letter feng digraph Ll L
* velopharyngeal fricative
U+02AA  ʪ Latin small letter ls digraph Ll L
* lateral alveolar fricative (lisp)
U+02AB  ʫ Latin small letter lz digraph Ll L
* voiced lateral alveolar fricative
U+02AC  ʬ Latin letter bilabial percussive Ll L
* audible lip smack
U+02AD  ʭ Latin letter bidental percussive Ll L
* audible teeth gnashing

 

Spacing Modifier Letters

     Additions based on 1989 IPA

U+02DF  ˟ modifier letter cross accent Sk ON
* Swedish grave accent

     Tone letters

U+02EA  ˪ modifier letter yin departing tone mark Sk ON
U+02EB  ˫ modifier letter yang departing tone mark Sk ON

     IPA modifiers

U+02EC  ˬ modifier letter voicing Sk ON
U+02ED  ˭ modifier letter unaspirated Sk ON

     Other modifier letters

U+02EE  ˮ modifier letter double apostrophe Lm L
* Nenets

 

Combining Diacritical Marks

     Additions for IPA

U+0346  ͆ combining bridge above Mn NSM
* IPA: dentolabial
ref U+20E9 combining wide bridge above (Combining Diacritical Marks for Symbols)
U+0347  ͇ combining equals sign below Mn NSM
* IPA: alveolar
U+0348  ͈ combining double vertical line below Mn NSM
* IPA: strong articulation
U+0349  ͉ combining left angle below Mn NSM
* IPA: weak articulation
U+034A  ͊ combining not tilde above Mn NSM
* IPA: denasal

     IPA diacritics for disordered speech

U+034B  ͋ combining homothetic above Mn NSM
* IPA: nasal escape
U+034C  ͌ combining almost equal to above Mn NSM
* IPA: velopharyngeal friction
U+034D  ͍ combining left right arrow below Mn NSM
* IPA: labial spreading
U+034E  ͎ combining upwards arrow below Mn NSM
* IPA: whistled articulation

     Double diacritics

U+0362  ͢ combining double rightwards arrow below Mn NSM
* IPA: sliding articulation

 

Greek and Coptic

     Variant letterforms

U+03D7  ϗ Greek kai symbol Ll L
* used as an ampersand

     Archaic letters

U+03DB  ϛ Greek small letter stigma Ll L
ref U+03C2 Greek small letter final sigma (Greek and Coptic)
U+03DD  ϝ Greek small letter digamma Ll L
* used as a symbol with a numeric value of 6
U+03DF  ϟ Greek small letter koppa Ll L
* used in modern Greek as a symbol with a numeric value of 90, as in the dating of legal documentation
U+03E1  ϡ Greek small letter sampi Ll L
* used as a symbol with a numeric value of 900

 

Cyrillic

     Cyrillic extensions

U+0400  Ѐ Cyrillic capital letter ie with grave Lu L
U+040D  Ѝ Cyrillic capital letter I with grave Lu L

     Cyrillic extensions

U+0450  ѐ Cyrillic small letter ie with grave Ll L
* Macedonian
U+045D  ѝ Cyrillic small letter I with grave Ll L
* Macedonian

     Historic miscellaneous

U+0488  ҈ combining Cyrillic hundred thousands sign Me NSM
U+0489  ҉ combining Cyrillic millions sign Me NSM

     Extended Cyrillic

U+048C  Ҍ Cyrillic capital letter semisoft sign Lu L
U+048D  ҍ Cyrillic small letter semisoft sign Ll L
* Kildin Sami
U+048E  Ҏ Cyrillic capital letter er with tick Lu L
U+048F  ҏ Cyrillic small letter er with tick Ll L
* Kildin Sami
U+04EC  Ӭ Cyrillic capital letter E with diaeresis Lu L
U+04ED  ӭ Cyrillic small letter E with diaeresis Ll L
* Kildin Sami

 

Armenian

     Punctuation

U+058A  ֊ Armenian hyphen Pd ON
aka yentamna

 

Arabic

     Combining maddah and hamza

U+0653  ٓ Arabic maddah above Mn NSM
U+0654  ٔ Arabic hamza above Mn NSM
U+0655  ٕ Arabic hamza below Mn NSM

     Extended Arabic letters

U+06B8  ڸ Arabic letter lam with three dots below Lo AL
U+06B9  ڹ Arabic letter noon with dot below Lo AL
U+06BF  ڿ Arabic letter tcheh with dot above Lo AL
U+06CF  ۏ Arabic letter waw with dot above Lo AL

     Extended Arabic letters

U+06FA  ۺ Arabic letter sheen with dot below Lo AL
U+06FB  ۻ Arabic letter dad with dot below Lo AL
U+06FC  ۼ Arabic letter ghain with dot below Lo AL

     Signs for Sindhi

U+06FD  ۽ Arabic sign sindhi ampersand So AL
U+06FE  ۾ Arabic sign sindhi postposition men So AL

 

Tibetan

     Consonants

U+0F6A  ཪ Tibetan letter fixed form ra Lo L
* used only in transliteration and transcription

     Subjoined consonants

U+0F96  ྖ Tibetan subjoined letter cha Mn NSM
U+0FAE  ྮ Tibetan subjoined letter zha Mn NSM
U+0FAF  ྯ Tibetan subjoined letter za Mn NSM
U+0FB0  ྰ Tibetan subjoined letter a Mn NSM
aka a-chung
* rare, only used for full-sized subjoined letter
ref U+0F71 Tibetan vowel sign aa (Tibetan)
U+0FB8  ྸ Tibetan subjoined letter A Mn NSM

     Fixed-form subjoined consonants

U+0FBA  ྺ Tibetan subjoined letter fixed form wa Mn NSM
U+0FBB  ྻ Tibetan subjoined letter fixed form ya Mn NSM
U+0FBC  ྼ Tibetan subjoined letter fixed form ra Mn NSM

     Signs

U+0FBE  ྾ Tibetan ku ru kha So L
* often repeated three times; indicates a refrain
U+0FBF  ྿ Tibetan ku ru kha bzhi mig can So L
* marks point of text insertion or annotation
ref U+203B reference mark (General Punctuation)

     Cantillation signs

U+0FC0  ࿀ Tibetan cantillation sign heavy beat So L
* marks a heavy drum beat
U+0FC1  ࿁ Tibetan cantillation sign light beat So L
* marks a light drum beat
U+0FC2  ࿂ Tibetan cantillation sign cang te u So L
* symbol of a small Tibetan hand drum
U+0FC3  ࿃ Tibetan cantillation sign sbub chal So L
* symbol of a Tibetan cymbal

     Symbols

U+0FC4  ࿄ Tibetan symbol dril bu So L
* symbol of a Tibetan hand bell
U+0FC5  ࿅ Tibetan symbol rdo rje So L
U+0FC6  ࿆ Tibetan symbol padma gdan Mn NSM
U+0FC7  ࿇ Tibetan symbol rdo rje rgya gram So L
U+0FC8  ࿈ Tibetan symbol phur pa So L
U+0FC9  ࿉ Tibetan symbol nor bu So L
U+0FCA  ࿊ Tibetan symbol nor bu nyis khyil So L
* the double body symbol
ref U+262F yin yang (Miscellaneous Symbols)
U+0FCB  ࿋ Tibetan symbol nor bu gsum khyil So L
* the tri-kaya or triple body symbol
U+0FCC  ࿌ Tibetan symbol nor bu bzhi khyil So L
* the quadruple body symbol, a form of the swastika
ref U+534D CJK Ideograph U+534D (CJK Unified Ideographs)

     Astrological sign

U+0FCF  ࿏ Tibetan sign rdel nag gsum So L

 

General Punctuation

     Formatting characters

U+202F    narrow no break space Zs WS
ref U+00A0 no break space (Latin-1 Supplement)

     Double punctuation for vertical text

U+2048  ⁈ question exclamation mark Po ON
U+2049  ⁉ exclamation question mark Po ON

     General punctuation

U+204A  ⁊ tironian sign et Po ON
* Irish Gaelic, ...
U+204B  ⁋ reversed pilcrow sign Po ON
ref U+00B6 pilcrow sign (Latin-1 Supplement)
U+204C  ⁌ black leftwards bullet Po ON
U+204D  ⁍ black rightwards bullet Po ON

 

Currency Symbols

     Currency symbols

U+20AD  ₭ kip sign Sc ET
* Laos
U+20AE  ₮ tugrik sign Sc ET
* Mongolia
* also transliterated as tugrug, tugric, tugrog, togrog, tögrög
U+20AF  ₯ drachma sign Sc ET
* Greece

 

Combining Diacritical Marks for Symbols

     Additional enclosing diacritics

U+20E2  ⃢ combining enclosing screen Me NSM
ref U+239A clear screen symbol (Miscellaneous Technical)
U+20E3  ⃣ combining enclosing keycap Me NSM

 

Letterlike Symbols

     Additional letterlike symbols

U+2139  ℹ information source Ll L
* intended for use with 20DD
U+213A  ℺ rotated capital q So ON
* a binding signature mark

 

Number Forms

     Roman numerals

U+2183  Ↄ Roman numeral reversed one hundred Nl L
aka apostrophic C
* used in combination with C and I to form large numbers

 

Arrows

     Arrows

U+21EB  ⇫ upwards white arrow on pedestal So ON
aka level 2 lock
U+21EC  ⇬ upwards white arrow on pedestal with horizontal bar So ON
aka caps lock
U+21ED  ⇭ upwards white arrow on pedestal with vertical bar So ON
aka numerics lock
U+21EE  ⇮ upwards white double arrow So ON
aka level 3 select
U+21EF  ⇯ upwards white double arrow on pedestal So ON
aka level 3 lock
U+21F0  ⇰ rightwards white arrow from wall So ON
aka group lock
U+21F1  ⇱ north west arrow to corner So ON
aka home
U+21F2  ⇲ south east arrow to corner So ON
aka end
U+21F3  ⇳ up down white arrow So ON
aka scrolling

 

Miscellaneous Technical

     Miscellaneous technical

U+2301  ⌁ electric arrow So ON
* from ISO 2047
* symbol for End of Transmission

     Graphics for control codes

U+237B  ⍻ not check mark So ON
* from ISO 2047
* symbol for Negative Acknowledge

     Graphics for control codes

U+237D  ⍽ shouldered open box So ON
* from ISO 9995-7
* keyboard symbol for No Break Space
U+237E  ⍾ bell symbol So ON
* from ISO 2047
U+237F  ⍿ vertical line with middle dot So ON
* from ISO 2047
* symbol for End of Medium

     Keyboard symbols from ISO 9995-7

U+2380  ⎀ insertion symbol So ON
U+2381  ⎁ continuous underline symbol So ON
U+2382  ⎂ discontinuous underline symbol So ON
U+2383  ⎃ emphasis symbol So ON
U+2384  ⎄ composition symbol So ON
U+2385  ⎅ white square with centre vertical line So ON
U+2386  ⎆ enter symbol So ON
U+2387  ⎇ alternative key symbol So ON
U+2388  ⎈ helm symbol So ON
aka control
ref U+2638 wheel of dharma (Miscellaneous Symbols)
U+2389  ⎉ circled horizontal bar with notch So ON
U+238A  ⎊ circled triangle down So ON
U+238B  ⎋ broken circle with northwest arrow So ON
U+238C  ⎌ undo symbol So ON

     Electrotechnical symbols from IR 181

U+238D  ⎍ monostable symbol So ON
U+238E  ⎎ hysteresis symbol So ON
U+238F  ⎏ open circuit output h type symbol So ON
U+2390  ⎐ open circuit output l type symbol So ON
U+2391  ⎑ passive pull down output symbol So ON
U+2392  ⎒ passive pull up output symbol So ON
U+2393  ⎓ direct current symbol form two So ON
U+2394  ⎔ software function symbol So ON

     APL

U+2395  ⎕ APL functional symbol quad So L
ref U+2337 APL functional symbol squish quad (Miscellaneous Technical)
ref U+25AF white vertical rectangle (Geometric Shapes)

     Keyboard symbols from ISO 9995-7

U+2396  ⎖ decimal separator key symbol So ON
U+2397  ⎗ previous page So ON
U+2398  ⎘ next page So ON
U+2399  ⎙ print screen symbol So ON
U+239A  ⎚ clear screen symbol So ON
ref U+20E2 combining enclosing screen (Combining Diacritical Marks for Symbols)

 

Control Pictures

     Keyboard symbol

U+2425  ␥ symbol for delete form two So ON
* from ISO 9995-7
* keyboard symbol for undoable delete

     Specific symbol for control code

U+2426  ␦ symbol for substitute form two So ON
* from ISO 2047
ref U+061F Arabic question mark (Arabic)

 

Geometric Shapes

     Control code graphics

U+25F0  ◰ white square with upper left quadrant So ON
U+25F1  ◱ white square with lower left quadrant So ON
U+25F2  ◲ white square with lower right quadrant So ON
U+25F3  ◳ white square with upper right quadrant So ON
U+25F4  ◴ white circle with upper left quadrant So ON
U+25F5  ◵ white circle with lower left quadrant So ON
U+25F6  ◶ white circle with lower right quadrant So ON
U+25F7  ◷ white circle with upper right quadrant So ON

 

Miscellaneous Symbols

     Miscellaneous symbol

U+2619  ☙ reversed rotated floral heart bullet So ON
* a binding signature mark
ref U+2767 rotated floral heart bullet (Dingbats)

     Syriac cross symbols

U+2670  ♰ west Syriac cross So ON
U+2671  ♱ east Syriac cross So ON

 

CJK Symbols and Punctuation

     Additional Suzhou numerals

U+3038  〸 Hangzhou numeral ten Nl L
U+3039  〹 Hangzhou numeral twenty Nl L
U+303A  〺 Hangzhou numeral thirty Nl L

     Special CJK indicators

U+303E  〾 ideographic variation indicator So ON
* visual indicator that the following ideograph is to be taken as a variant of the intended character

 

Alphabetic Presentation Forms

     Hebrew presentation forms

U+FB1D  יִ Hebrew letter yod with hiriq Lo R

 

Specials

     Interlinear annotation

U+FFF9   interlinear annotation anchor Cf BN
* marks start of annotated text
U+FFFA   interlinear annotation separator Cf BN
* marks start of annotating character(s)
U+FFFB   interlinear annotation terminator Cf BN
* marks end of annotation block

 

Altered Characters

In addition to more than a thousand General Category changes, and the addition of eight new Bidirectional Categories, 8 characters had their Bidirectional Category altered in 3.0:

 

Basic Latin


U+000C     form feed had its Bidirectional Category changed from ParagraphSeparator to Whitespace

 

Latin-1 Supplement


U+0085     next line had its Bidirectional Category changed from OtherNeutrals to ParagraphSeparator

 

Spacing Modifier Letters


U+02D0   ː   modifier letter triangular colon had its Bidirectional Category changed from OtherNeutrals to LeftToRight
U+02D1   ˑ   modifier letter half triangular colon had its Bidirectional Category changed from OtherNeutrals to LeftToRight

 

Letterlike Symbols


U+2118     script capital p had its Bidirectional Category changed from LeftToRight to OtherNeutrals
U+212E     estimated symbol had its Bidirectional Category changed from LeftToRight to EuropeanNumberTerminator

 

Katakana


U+30FB     Katakana middle dot had its Bidirectional Category changed from LeftToRight to OtherNeutrals

 

Halfwidth and Fullwidth Forms


U+FF65     halfwidth Katakana middle dot had its Bidirectional Category changed from LeftToRight to OtherNeutrals
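The eight changes above can be compared with the values reported by a present-day Unicode database. The sketch below uses Python's `unicodedata`. The abbreviation B for ParagraphSeparator is the standard short name but does not appear elsewhere on this page, and the looked-up values reflect whatever later Unicode version the interpreter ships with.

```python
import unicodedata

# Code point -> Bidirectional Category after the 3.0 change, as listed above.
# B = ParagraphSeparator, WS = Whitespace, L = LeftToRight,
# ON = OtherNeutrals, ET = EuropeanNumberTerminator.
changed_in_3_0 = {
    0x000C: "WS",  # form feed
    0x0085: "B",   # next line
    0x02D0: "L",   # modifier letter triangular colon
    0x02D1: "L",   # modifier letter half triangular colon
    0x2118: "ON",  # script capital p
    0x212E: "ET",  # estimated symbol
    0x30FB: "ON",  # Katakana middle dot
    0xFF65: "ON",  # halfwidth Katakana middle dot
}

current = {cp: unicodedata.bidirectional(chr(cp)) for cp in changed_in_3_0}

# Code points whose category has changed again since 3.0, if any.
changed_since = sorted(cp for cp, cat in changed_in_3_0.items() if current[cp] != cat)

for cp, cat in changed_in_3_0.items():
    print(f"U+{cp:04X}  3.0: {cat:<3} now: {current[cp]}")
```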
http://unicode.org
