This is a continuation of the story about the Unicode standard. Read The ASCII table and The 8th bit if you haven’t already.
Last time I finished with a horrible story of overlapping encoding standards and general confusion in the software world. It was a world where a French scientist was unable to talk with a German friend.
The hero who solved all of the problems was the Unicode standard. It started with a simple idea. Collect all the symbols, create a big table, and assign a number to each of them. Oh, and try to be compatible with existing standards as much as possible.
When I said all the symbols, I really meant all the symbols. Musical notes and emoji characters are only some of the more bizarre examples, and even Klingon was proposed for inclusion, although that proposal was rejected. Collecting all these symbols is much easier said than done, so there is an organization, the Unicode Consortium, that is responsible for the whole standard.
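There are various ways to represent a Unicode character, depending on the context. The letter A, for example, is the code point U+0041, can be written as &#65; or &#x41; in HTML, and as the escape sequence \u0041 in many programming languages.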
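To make the idea of multiple representations a bit more concrete, here is a minimal Python sketch that prints a few of them for a single character. The helper name `describe` and the choice of the euro sign are my own, purely for illustration, and the set of representations shown is only a sample of what exists.

```python
import unicodedata


def describe(symbol: str) -> dict:
    """Return a few common representations of a single character.

    `describe` is a hypothetical helper written for this example.
    """
    code_point = ord(symbol)  # the number assigned to the symbol in the big table
    return {
        "name": unicodedata.name(symbol),
        "code point": f"U+{code_point:04X}",
        "HTML entity (decimal)": f"&#{code_point};",
        "HTML entity (hex)": f"&#x{code_point:X};",
        "Python escape": symbol.encode("unicode_escape").decode("ascii"),
    }


for label, value in describe("€").items():
    print(f"{label}: {value}")
```

For the euro sign this reports the name EURO SIGN, the code point U+20AC, and the HTML entities &#8364; and &#x20AC;. All of them point at the same entry in the table; only the notation differs.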
To sum up:
- The Unicode standard can be viewed as an extension of the ASCII table, because the two are identical in the first 128 entries
- There are well over 100,000 characters in Unicode, and the number grows with every new version of the standard
- It goes far beyond alphabets and includes things like musical notes and emoji
- It can be encoded with various standards such as UTF-8 and UTF-16
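Two of these points are easy to check for yourself. The Python sketch below first confirms that the first 128 code points come out as exactly the same bytes in ASCII and in UTF-8, and then compares how many bytes a few sample characters need in UTF-8 and UTF-16. The helper name `encoded_sizes` and the sample characters are my own choices for the example.

```python
# The first 128 code points are the ASCII table, and UTF-8 stores each of
# them as the same single byte that ASCII uses.
for code_point in range(128):
    symbol = chr(code_point)
    assert symbol.encode("utf-8") == symbol.encode("ascii") == bytes([code_point])


def encoded_sizes(text: str) -> dict:
    """Return the number of bytes `text` needs in UTF-8 and in UTF-16.

    `encoded_sizes` is a hypothetical helper written for this example.
    "utf-16-le" is used so that no byte order mark is added to the count.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# A Latin letter, an accented letter, the euro sign, and an emoji.
for sample in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(sample):04X}", encoded_sizes(sample))
```

The same code point can take a different number of bytes depending on the encoding: the letter A needs one byte in UTF-8 but two in UTF-16, while the euro sign needs three bytes in UTF-8 and two in UTF-16.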
And finally, here is the list of all the Unicode characters.