The Thai script is used to write Thai and other Southeast Asian languages such as Kuy, Lanna and Pali. It is a member of the Indic family of scripts descended from Brahmi. Thai extensions to the Brahmi character set include tone marks derived from superscript digits. The Thai script lacks the conjunct consonants and independent vowels found in most Brahmi-derived scripts. Thai is written left to right.

The Thai layout in Unicode is based on the Thai Industrial Standard 620-2529 and its updated version 620-2533.

In common with Indic scripts, each Thai letter is a consonant possessing an inherent vowel sound. Thai letters further feature inherent tones. The inherent vowel and tone can be modified with vowel signs and tone marks. Most Thai vowel signs are rendered by full-letter-sized inline glyphs placed before, after or around the glyph for the base consonant. When the vowel's glyph is before the consonant, it is encoded as a separate character before the consonant, that is, in visual order rather than in the order of pronunciation. This differs from most other Indic scripts, but is necessary to comply with the Thai Industrial Standard.
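The visual-order encoding of a leading vowel can be illustrated with a minimal Python sketch that uses only the standard unicodedata module. The sample syllable (sara e followed by ko kai) and the helper name describe are illustrative choices, not taken from the standard.

```python
import unicodedata

def describe(text):
    """Return (code point, name) pairs in the order the characters are stored."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Illustrative syllable: the vowel sign sara e is drawn to the left of the
# consonant ko kai, and it is also stored first in the string.
syllable = "\u0E40\u0E01"

for code, name in describe(syllable):
    print(code, name)
# U+0E40 THAI CHARACTER SARA E
# U+0E01 THAI CHARACTER KO KAI
```

The stored order matches what the reader sees, so no reordering is needed at display time for such vowels.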

There are several punctuation marks particular to Thai :

U+0E4F    Thai character fongman   is the Thai bullet, used to mark items in lists or appearing at the beginning of a verse, sentence, paragraph or other textual segment.

U+0E46    Thai character maiyamok   is used to mark repetition of preceding letters.

U+0E2F    Thai character paiyannoi   is used to indicate elision or abbreviation of letters. It is also used as a regular letter, such as in the Thai name for Bangkok. Paiyannoi is also used in combination (U+0E2F U+0E25 U+0E2F) to create a construct called paiyanyai, which means "et cetera" and is comparable to U+17D8    Khmer sign beyyal.

U+0E5A    Thai character angkhankhu   is used to mark the end of a long segment of text. It can be followed by U+0E30    Thai character sara a   to mark even longer segments of text, such as at the end of a verse in poetry.

U+0E5B    Thai character khomut   marks the end of a chapter or document, where it always follows the angkhankhu + sara a combination.

The angkhankhu + sara a combination is closely related to U+17D4    Khmer sign khan   and U+17D5    Khmer sign bariyoosan   which are themselves ultimately related to the Devanagari characters U+0964    Devanagari danda   and U+0965    Devanagari double danda.
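The sequences described above can be written down as plain string constants. The following Python sketch is only an illustration: the constant names, the labels and the function classify_ending are hypothetical, and the logic simply follows the descriptions given in this section.

```python
PAIYANNOI = "\u0E2F"
LO_LING = "\u0E25"
SARA_A = "\u0E30"
ANGKHANKHU = "\u0E5A"
KHOMUT = "\u0E5B"

# Constructs named in the text above.
PAIYANYAI = PAIYANNOI + LO_LING + PAIYANNOI   # "et cetera"
VERSE_END = ANGKHANKHU + SARA_A               # end of a longer segment or verse
DOCUMENT_END = VERSE_END + KHOMUT             # end of a chapter or document

def classify_ending(text):
    """Name the closing mark at the end of text, or return None.

    The labels are hypothetical and chosen for this example only.
    """
    # Check the longest sequence first so a shorter one cannot shadow it.
    candidates = (
        ("document end", DOCUMENT_END),
        ("verse end", VERSE_END),
        ("section end", ANGKHANKHU),
    )
    for label, sequence in candidates:
        if text.endswith(sequence):
            return label
    return None

print(classify_ending("\u0E01" + DOCUMENT_END))  # document end
```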

Thai words are not separated by spaces, but spaces are introduced where Western typography might use a comma or period. To mark a word boundary (e.g., for line breaking), use U+200B    zero width space.
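A minimal Python sketch of marking word boundaries with U+200B follows. It assumes the text has already been split into words; finding the boundaries is a separate problem that is not shown here. The function names and the sample words are illustrative.

```python
ZWSP = "\u200B"  # zero width space

def mark_word_boundaries(words):
    """Join already-segmented words with invisible break opportunities."""
    return ZWSP.join(words)

def visible_text(text):
    """Strip the zero width spaces again, e.g. before comparing strings."""
    return text.replace(ZWSP, "")

# Illustrative input: two words that together mean "Thai language".
words = ["ภาษา", "ไทย"]
marked = mark_word_boundaries(words)

print(len(visible_text(marked)), len(marked))  # 7 8
```

The marked string looks identical on screen, but it is one character longer, so code that compares or searches Thai text may want to remove U+200B first.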


Unicode's Thai code block reserves the 128 code points from U+0E00 to U+0E7F, of which 87 are currently assigned.
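The figures above can be checked with a short Python sketch built on the standard unicodedata module. The result depends on the Unicode database bundled with the Python interpreter in use, and the names THAI_BLOCK, in_thai_block and assigned_thai_code_points are illustrative.

```python
import unicodedata

# The block runs from U+0E00 to U+0E7F inclusive.
THAI_BLOCK = range(0x0E00, 0x0E80)

def in_thai_block(ch):
    """True if the single character ch lies inside the Thai code block."""
    return ord(ch) in THAI_BLOCK

def assigned_thai_code_points():
    """List code points in the block that are assigned a character."""
    # General Category "Cn" is what unicodedata reports for unassigned code points.
    return [cp for cp in THAI_BLOCK if unicodedata.category(chr(cp)) != "Cn"]

print(len(THAI_BLOCK), len(assigned_thai_code_points()))
```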


All the characters in this code block were added in Unicode 1.1.

Number of characters in each General Category :

Letter, Modifier       Lm :  1
Letter, Other          Lo : 56
Mark, Non-Spacing      Mn : 16
Number, Decimal Digit  Nd : 10
Punctuation, Other     Po :  3
Symbol, Currency       Sc :  1

Number of characters in each Bidirectional Category :

Left To Right                 L : 70
European Number Terminator   ET :  1
Non Spacing Mark            NSM : 16
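Both tallies above can be recomputed from the Unicode database that ships with Python. This is a sketch under that assumption; the function name thai_property_counts is illustrative, and a different Unicode version could in principle give different numbers.

```python
import unicodedata
from collections import Counter

def thai_property_counts():
    """Count General Category and Bidirectional Category values in the Thai block."""
    categories = Counter()
    bidi = Counter()
    for cp in range(0x0E00, 0x0E80):
        ch = chr(cp)
        category = unicodedata.category(ch)
        if category == "Cn":
            # Skip unassigned code points.
            continue
        categories[category] += 1
        bidi[unicodedata.bidirectional(ch)] += 1
    return categories, bidi

categories, bidi = thai_property_counts()
for name, count in sorted(categories.items()):
    print(name, count)
for name, count in sorted(bidi.items()):
    print(name, count)
```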

The columns below should be interpreted as :

  1. The Unicode code for the character
  2. The character in question
  3. The Unicode name for the character
  4. The Unicode General Category for the character
  5. The Unicode Bidirectional Category for the character
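As an illustration of how the five columns fit together, the following Python sketch rebuilds one row from the standard unicodedata module. The function name format_row and the exact spacing are assumptions chosen to resemble the listing, not a description of how this page was generated.

```python
import unicodedata

def format_row(ch):
    """Build a 'code, character, name, category, bidi' row for one character."""
    code = f"U+{ord(ch):04X}"
    # Unicode names are upper case; capitalize() leaves only the first letter upper.
    name = unicodedata.name(ch).capitalize()
    category = unicodedata.category(ch)
    bidi = unicodedata.bidirectional(ch)
    return f"{code}   {ch}   {name} {category} {bidi}"

print(format_row("\u0E01"))  # U+0E01   ก   Thai character ko kai Lo L
```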

If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.

 

Thai

     Based on TIS 620-2533

U+0E01   ก   Thai character ko kai Lo L
U+0E02   ข   Thai character kho khai Lo L
U+0E03   ฃ   Thai character kho khuat Lo L
U+0E04   ค   Thai character kho khwai Lo L
U+0E05   ฅ   Thai character kho khon Lo L
U+0E06   ฆ   Thai character kho rakhang Lo L
U+0E07   ง   Thai character ngo ngu Lo L
U+0E08   จ   Thai character cho chan Lo L
U+0E09   ฉ   Thai character cho ching Lo L
U+0E0A   ช   Thai character cho chang Lo L
U+0E0B   ซ   Thai character so so Lo L
U+0E0C   ฌ   Thai character cho choe Lo L
U+0E0D   ญ   Thai character yo ying Lo L
U+0E0E   ฎ   Thai character do chada Lo L
U+0E0F   ฏ   Thai character to patak Lo L
U+0E10   ฐ   Thai character tho than Lo L
U+0E11   ฑ   Thai character tho nangmontho Lo L
U+0E12   ฒ   Thai character tho phuthao Lo L
U+0E13   ณ   Thai character no nen Lo L
U+0E14   ด   Thai character do dek Lo L
U+0E15   ต   Thai character to tao Lo L
U+0E16   ถ   Thai character tho thung Lo L
U+0E17   ท   Thai character tho thahan Lo L
U+0E18   ธ   Thai character tho thong Lo L
U+0E19   น   Thai character no nu Lo L
U+0E1A   บ   Thai character bo baimai Lo L
U+0E1B   ป   Thai character po pla Lo L
U+0E1C   ผ   Thai character pho phung Lo L
U+0E1D   ฝ   Thai character fo fa Lo L
U+0E1E   พ   Thai character pho phan Lo L
U+0E1F   ฟ   Thai character fo fan Lo L
U+0E20   ภ   Thai character pho samphao Lo L
U+0E21   ม   Thai character mo ma Lo L
U+0E22   ย   Thai character yo yak Lo L
U+0E23   ร   Thai character ro rua Lo L
U+0E24   ฤ   Thai character ru Lo L
* independent vowel letter used to write Sanskrit
U+0E25   ล   Thai character lo ling Lo L
U+0E26   ฦ   Thai character lu Lo L
* independent vowel letter used to write Sanskrit
U+0E27   ว   Thai character wo waen Lo L
U+0E28   ศ   Thai character so sala Lo L
U+0E29   ษ   Thai character so rusi Lo L
U+0E2A   ส   Thai character so sua Lo L
U+0E2B   ห   Thai character ho hip Lo L
U+0E2C   ฬ   Thai character lo chula Lo L
U+0E2D   อ   Thai character o ang Lo L
U+0E2E   ฮ   Thai character ho nokhuk Lo L
aka ho nok huk

     Sign

U+0E2F   ฯ   Thai character paiyannoi Lo L
aka paiyan noi
* ellipsis, abbreviation

     Vowels

U+0E30   ะ   Thai character sara a Lo L
U+0E31   ั   Thai character mai han akat Mn NSM
U+0E32   า   Thai character sara aa Lo L
ref U+0E45   ๅ   Thai character lakkhangyao (Thai)
U+0E33   ำ   Thai character sara am Lo L
U+0E34   ิ   Thai character sara i Mn NSM
U+0E35   ี   Thai character sara ii Mn NSM
U+0E36   ึ   Thai character sara ue Mn NSM
U+0E37   ื   Thai character sara uee Mn NSM
aka sara uue
U+0E38   ุ   Thai character sara u Mn NSM
U+0E39   ู   Thai character sara uu Mn NSM
U+0E3A   ฺ   Thai character phinthu Mn NSM
* Pali virama

     Currency symbol

U+0E3F   ฿   Thai currency symbol baht Sc ET

     Vowels

U+0E40   เ   Thai character sara e Lo L
U+0E41   แ   Thai character sara ae Lo L
U+0E42   โ   Thai character sara o Lo L
U+0E43   ใ   Thai character sara ai maimuan Lo L
aka sara ai mai muan
U+0E44   ไ   Thai character sara ai maimalai Lo L
aka sara ai mai malai
U+0E45   ๅ   Thai character lakkhangyao Lo L
aka lakkhang yao
* special vowel length indication used with 0E24 or 0E26
ref U+0E32   า   Thai character sara aa (Thai)

     Sign

U+0E46   ๆ   Thai character maiyamok Lm L
aka mai yamok
* repetition

     Vowel

U+0E47   ็   Thai character maitaikhu Mn NSM
aka mai taikhu

     Tone marks

U+0E48   ่   Thai character mai ek Mn NSM
U+0E49   ้   Thai character mai tho Mn NSM
U+0E4A   ๊   Thai character mai tri Mn NSM
U+0E4B   ๋   Thai character mai chattawa Mn NSM

     Signs

U+0E4C   ์   Thai character thanthakhat Mn NSM
* cancellation mark
U+0E4D   ํ   Thai character nikhahit Mn NSM
aka nikkhahit
* final nasal
U+0E4E   ๎   Thai character yamakkan Mn NSM
U+0E4F   ๏   Thai character fongman Po L
* used as a bullet
ref U+17D9   ៙   Khmer sign phnaek muan (Khmer)

     Digits

U+0E50   ๐   Thai digit zero Nd L
U+0E51   ๑   Thai digit one Nd L
U+0E52   ๒   Thai digit two Nd L
U+0E53   ๓   Thai digit three Nd L
U+0E54   ๔   Thai digit four Nd L
U+0E55   ๕   Thai digit five Nd L
U+0E56   ๖   Thai digit six Nd L
U+0E57   ๗   Thai digit seven Nd L
U+0E58   ๘   Thai digit eight Nd L
U+0E59   ๙   Thai digit nine Nd L

     Signs

U+0E5A   ๚   Thai character angkhankhu Po L
* used to mark end of long sections
* used in combination with 0E30 to mark end of a verse
U+0E5B   ๛   Thai character khomut Po L
* used to mark end of chapter or document
ref U+17DA   ៚   Khmer sign koomuut (Khmer)

http://unicode.org
Some prose may have been lifted verbatim from unicode.org,
as is permitted by their terms of use at http://www.unicode.org/copyright.html