Unicode encoded as two bytes per character. The obvious way to do this is to put the bottom 16 bits into the two bytes (high byte first so sorting order is preserved), and this is called UCS-2. When people realized (due to Chinese, mostly) that more than 65,536 characters were needed, they came up with this bastard encoding rather than using UTF-8, which is a sensible encoding. Microsoft uses this encoding in their stuff, sigh.

UTF-16 can encode Unicode code points up to 0x10ffff. All code points up to 0xffff, except those in the range 0xd800-0xdfff, are encoded directly as a single 16-bit value, high byte first, low byte second.
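As an illustration, here is a minimal C sketch of writing one of these code points out high byte first; the function name and error check are just for this example:

    #include <stdio.h>

    /* Write code point c (must be <= 0xffff and outside 0xd800-0xdfff)
       into out[0..1], high byte first.  Returns 0 on bad input. */
    static int utf16be_encode_bmp(unsigned int c, unsigned char out[2]) {
        if (c > 0xffff || (c >= 0xd800 && c <= 0xdfff)) return 0;
        out[0] = (unsigned char)(c >> 8);   /* high byte first... */
        out[1] = (unsigned char)(c & 0xff); /* ...then low byte */
        return 1;
    }

    int main(void) {
        unsigned char buf[2];
        utf16be_encode_bmp(0x4e2d, buf);       /* a CJK character */
        printf("%02x %02x\n", buf[0], buf[1]); /* prints "4e 2d" */
        return 0;
    }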

The "characters" 0xd800-0xdfff are called "surrogate characters" and must appear in pairs. These are combined in a complex way to produce the characters in the range 0x10000 through 0x10ffff. They also defeat the only plausible advantage of UTF-16, which is that the characters are the same size!

Don't use this, it is just proof that the standards people have their heads up their asses. Use UTF-8 instead.