awesome-codepoints

Unicode characters catalog

Curated list of interesting Unicode characters with unique features



Awesome Code Points / Code Points that Affect Others / Breaking and Gluing other characters

U+00A0 NO-BREAK SPACE - force adjacent characters to stick together. Well known as &nbsp; in HTML
U+00AD SOFT HYPHEN - (in HTML: &shy;) like ZERO WIDTH SPACE, but shows a hyphen if (and only if) a break occurs
U+200B ZERO WIDTH SPACE - the inverse to U+00A0: create no space, but allow word breaking
U+200D ZERO WIDTH JOINER - force adjacent characters to be joined together (e.g., Arabic characters or supported emoji). Apple uses this to compose some emoji, such as the different family variants
U+2060 WORD JOINER - the same as U+00A0, but completely invisible. Good for writing on Twitter
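
The entries above can be inspected with Python's standard `unicodedata` module. This is a minimal sketch, not part of the original list: the dictionary labels and the `describe` helper are made up for illustration, and the family sequence is just one example of an emoji ZWJ sequence.

```python
import unicodedata

# The five code points from the list above, keyed by a short illustrative label.
GLUE_AND_BREAK = {
    "NBSP": "\u00A0",
    "SHY": "\u00AD",
    "ZWSP": "\u200B",
    "ZWJ": "\u200D",
    "WJ": "\u2060",
}


def describe(ch: str) -> tuple[str, str, bool]:
    """Return (name, general category, str.isspace()) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch), ch.isspace()


for label, ch in GLUE_AND_BREAK.items():
    print(label, f"U+{ord(ch):04X}", *describe(ch))

# A family emoji is three emoji glued together by two ZERO WIDTH JOINERs.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man, ZWJ, woman, ZWJ, girl
family_parts = [unicodedata.name(c) for c in family]
print(len(family), family_parts)
```

Only NO-BREAK SPACE counts as whitespace here; the other four are format characters (category `Cf`), which is why they stay invisible yet still influence line breaking and joining.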

Awesome Code Points / Record Holders and Extremes

U+0000 <control> - first code point
U+10FFFF <noncharacter> - last code point. Apart from U+10FFFE, the rest of its plane, the code points in the range U+100000-U+10FFFD, are private use characters, guaranteed never to be assigned by a future Unicode standard
U+1F402 OX - shortest name
U+FDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM - longest decomposition form: 18 characters
U+5146 and U+16B61 PAHAWH HMONG NUMBER TRILLIONS - code points that represent the highest “single-digit” number. In both cases that’s 1,000,000,000,000, a trillion
U+0F33 TIBETAN DIGIT HALF ZERO - code point that represents the lowest “single-digit” number and at the same time the only negative one, -½
U+0080 - The trophy for most useless code points goes to U+0080, U+0081 and U+0099. These so-called C1 control characters are more or less unspecified. They got into Unicode because they were present in the very first version of what would later become ISO 10646, the ISO-standardized version of Unicode. They were meant for uses that never came to be
A close second place in this regard goes to twelve CJK unified ideographs. These so-called ghost characters came to Unicode via the Japanese JIS standard, where they were added because they were misread or misinterpreted from other characters when JIS was compiled from original printed text sources
U+006F LATIN SMALL LETTER O - leads the list of characters with confusable shapes. Of all the possible mappings in the Unicode confusables list, the small “o” leads with a whopping 73 entries of similar-looking glyphs, followed by LATIN SMALL LETTER L with 70 entries
U+1F4C0 DVD - only code point name without any vowel
U+3106C CJK UNIFIED IDEOGRAPH-3106C - the character with the most strokes: 84. Take your time to write this one!
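
Several of these records can be checked against the Unicode data that ships with Python. The snippet below is a sketch under assumptions: the variable names are illustrative, the private use range is taken to be plane 16 (U+100000 to U+10FFFD), and the results depend on the Unicode version bundled with the interpreter.

```python
import sys
import unicodedata

# Last code point: Python strings cover the full range up to U+10FFFF.
last = chr(sys.maxunicode)
last_is_noncharacter = unicodedata.category(last) == "Cn"

# Assumed range: every code point from U+100000 to U+10FFFD is private use ("Co").
plane16_private = all(
    unicodedata.category(chr(cp)) == "Co" for cp in range(0x100000, 0x10FFFE)
)

# Shortest name and longest compatibility decomposition.
shortest_name = unicodedata.name("\U0001F402")
longest_decomposition = unicodedata.normalize("NFKD", "\uFDFA")

# The only negative numeric value.
half_zero = unicodedata.numeric("\u0F33")

# C1 control characters have no character name at all; the default is returned.
c1_name = unicodedata.name("\u0080", None)

print(shortest_name, len(longest_decomposition), half_zero, c1_name)
```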

Awesome Code Points / For Funsies

U+1680 OGHAM SPACE MARK - a space that looks like a dash. Great for driving programmers close to madness
U+037E GREEK QUESTION MARK - a look-alike to the semicolon. Also a fun way to annoy developers
U+1DD2 COMBINING US ABOVE - this is the most romantic code point
U+F8FF PRIVATE USE CODEPOINT - this private use code point is rendered as the Apple logo on many Apple devices
U+1F574 MAN IN BUSINESS SUIT LEVITATING - a rather curious character that only made it into Unicode, for reasons of backwards compatibility, because it appears in the Webdings font
U+1F596 RAISED HAND WITH PART BETWEEN MIDDLE AND RING FINGERS - the Vulcan salute. Live long and prosper! 🖖
U+1F918 SIGN OF THE HORNS - Rock on! 🤘
U+2800 BRAILLE PATTERN BLANK - a Braille pattern that has none of its six or eight dots filled in. According to the standard: “while this character is imaged as a fixed-width blank in many fonts, it does not act as a space”. Essentially it is rendered as white-space, but since it is not designated as white-space, it isn't matched by white-space-validating regular expressions. This can be used to bypass all kinds of validation that disallows or trims white-space
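
The whitespace and look-alike behaviour described in this list can be reproduced with a few lines of Python. This is a rough sketch: `is_blank_naive` and `is_blank_strict` are hypothetical validators written for this example, not functions from any library.

```python
import re
import unicodedata

OGHAM_SPACE = "\u1680"
GREEK_QUESTION_MARK = "\u037E"
BRAILLE_BLANK = "\u2800"


def is_blank_naive(text: str) -> bool:
    """Hypothetical validator: text is blank if it consists only of whitespace."""
    return re.fullmatch(r"\s*", text) is not None


def is_blank_strict(text: str) -> bool:
    """Hypothetical stricter variant that also treats BRAILLE PATTERN BLANK as blank."""
    return all(ch.isspace() or ch == BRAILLE_BLANK for ch in text)


# OGHAM SPACE MARK is real whitespace, even though it may render as a dash.
print(OGHAM_SPACE.isspace(), is_blank_naive(OGHAM_SPACE))

# BRAILLE PATTERN BLANK looks empty but is a symbol, so it slips through.
print(unicodedata.category(BRAILLE_BLANK), is_blank_naive(BRAILLE_BLANK))
print(is_blank_strict(" " + BRAILLE_BLANK))

# GREEK QUESTION MARK differs from ";" until the text is normalized.
print(GREEK_QUESTION_MARK == ";", unicodedata.normalize("NFC", GREEK_QUESTION_MARK) == ";")
```

Normalizing input and checking for known blank look-alikes before validation is one way to close these gaps, though the set of characters to reject depends on the application.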

Awesome Code Points / For Funsies / Games

Chess figures
Card suits and even a whole deck of cards, complete with joker and back of card
Die faces
Go pieces
Draughts (or checkers) pieces
Shogi pieces
Domino tiles
Mahjong tiles
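
A few of these game symbols sit in small contiguous ranges, so they are easy to enumerate. The sketch below assumes the ranges shown (chess U+2654 to U+265F, card suits U+2660 to U+2667, die faces U+2680 to U+2685); the `GAME_RANGES` table and `names_in` helper are illustrative names only.

```python
import unicodedata

# Inclusive code point ranges; the labels exist only for this illustration.
GAME_RANGES = {
    "chess": (0x2654, 0x265F),
    "card suits": (0x2660, 0x2667),
    "dice": (0x2680, 0x2685),
}


def names_in(first: int, last: int) -> list[str]:
    """Return the character names for every code point in first..last."""
    return [unicodedata.name(chr(cp)) for cp in range(first, last + 1)]


for label, (first, last) in GAME_RANGES.items():
    glyphs = "".join(chr(cp) for cp in range(first, last + 1))
    print(f"{label}: {glyphs}")

# The full deck lives in a separate block, including the card back and a joker.
card_back = unicodedata.name("\U0001F0A0")
joker = unicodedata.name("\U0001F0CF")
print(card_back, joker)
```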

Awesome Code Points / Other Lists of Code Points

Cross-platform terminal characters - a list of characters that work on most terminals
