This is a continuation of the story about the Unicode standard. Read The ASCII table and The 8th bit if you haven’t already.


Last time I finished with a horrible story of overlapping encoding standards and general confusion in the software world. It was a world where a French scientist was unable to talk with a German friend.

The hero who solved all of these problems was the Unicode standard. It started with a simple idea: collect all the symbols, create a big table, and assign a number to each of them. Oh, and try to be compatible with existing standards as much as possible.
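To make the "big table" idea concrete, here is a minimal Python sketch. It relies only on the built-in `ord` and `chr` functions. The sample characters are my own picks for illustration. The number assigned to a character is usually called its code point, and the two checks at the end show the compatibility with ASCII and with ISO-8859-1 (Latin-1).

```python
# Each character has one number in the table (its code point); ord() looks it up.
# The sample characters are arbitrary picks for this sketch.
samples = ["A", "\u00e9", "\u00df", "\u20ac", "\u266b"]  # A, é, ß, €, ♫
code_points = {ch: ord(ch) for ch in samples}

for ch, cp in code_points.items():
    print(f"{ch!r} -> {cp} (U+{cp:04X})")

# Compatibility with ASCII: the first 128 code points are the ASCII table.
ascii_compatible = all(chr(n).encode("ascii")[0] == n for n in range(128))

# Compatibility with ISO-8859-1 (Latin-1): the first 256 code points match it.
latin1_compatible = bytes(range(256)).decode("latin-1") == "".join(
    chr(n) for n in range(256)
)

print("ASCII compatible:", ascii_compatible)
print("Latin-1 compatible:", latin1_compatible)
```

Note that the table only assigns numbers. How those numbers are stored as bytes is a separate question, answered by encodings such as UTF-8.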

When I said all the symbols, I really meant all the symbols. Musical notes and emoji characters are only some of the more bizarre examples. Even Klingon was proposed, although that proposal was rejected. Collecting all these symbols is much easier said than done, so there is a dedicated organization, the Unicode Consortium, that is responsible for the whole standard.
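If you want to poke at the table yourself, Python ships a copy of the Unicode character database in the standard `unicodedata` module. The sketch below looks up the official names of three characters I picked as examples (two musical symbols and one emoji).

```python
import unicodedata

# Example characters chosen for this sketch: two musical symbols and an emoji.
examples = ["\u266b", "\U0001d11e", "\U0001f600"]

# Map each code point (written in the usual U+XXXX form) to its official name.
names = {f"U+{ord(ch):04X}": unicodedata.name(ch) for ch in examples}

for code_point, name in names.items():
    print(code_point, name)
```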

Representation #

There are various ways to represent Unicode characters in different contexts. Here is a list of common representations for the symbol below:


[Image: screenshot of the symbol]
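As a rough illustration, here is a small Python sketch that prints several common representations of a single character. The euro sign is a stand-in I chose for this example, not necessarily the symbol from the screenshot, and the set of representations is my own selection of ones you are likely to meet in practice.

```python
import urllib.parse

# Stand-in symbol for this sketch (EURO SIGN); an assumption, not the post's symbol.
symbol = "\u20ac"
cp = ord(symbol)


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


representations = {
    "code point": f"U+{cp:04X}",
    "decimal": str(cp),
    "HTML entity (hex)": f"&#x{cp:X};",
    "HTML entity (decimal)": f"&#{cp};",
    "Python escape": symbol.encode("unicode_escape").decode("ascii"),
    "UTF-8 bytes": hex_bytes(symbol.encode("utf-8")),
    "UTF-16 (big endian) bytes": hex_bytes(symbol.encode("utf-16-be")),
    "URL encoding": urllib.parse.quote(symbol),
}

for label, value in representations.items():
    print(f"{label:26} {value}")
```

The first few entries are just different spellings of the same number from the table. The byte sequences depend on the encoding you pick, which is why the same character takes three bytes in UTF-8 and two in UTF-16.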

Interesting facts #

And to finish, here is the list of all the Unicode characters.

Happy encoding!

