Roland's homepage

My random knot in the Web

Unicode characters

Unicode, and UTF-8 in particular, is becoming more widely used. In my favorite editor, Emacs, I use an input method based on RFC 1345. This covers a lot of symbols, but also leaves a lot out. In Emacs you can input any Unicode character whose code point you know with CTRL-x 8 ENTER followed by the hexadecimal code point. In Vim you can use CTRL-k followed by the RFC 1345 mnemonic. But if you're writing, say, a comment on a blog, you usually don't have access to those input methods.
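When no such input method is at hand, a few lines of Python can turn a hexadecimal code point into the character itself, ready to copy. This is only a minimal sketch; the function name `char_from_hex` is made up for illustration.

```python
def char_from_hex(code_point: str) -> str:
    """Return the character for a hexadecimal code point.

    Accepts plain hex ('2300') as well as the 'U+2300' and '0x2300' spellings.
    """
    text = code_point.strip().upper()
    if text.startswith("U+") or text.startswith("0X"):
        text = text[2:]
    return chr(int(text, 16))


if __name__ == "__main__":
    for cp in ("2300", "U+00B1", "0x221A"):
        ch = char_from_hex(cp)
        # Print the input, the character, and its UTF-8 bytes in hex.
        print(cp, ch, ch.encode("utf-8").hex(" "))
```

The UTF-8 bytes are printed alongside each character because that is the form in which it ends up in a file or a web form.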

So I put together a file of Unicode characters encoded in UTF-8 for easy copy and paste. At the top is a line of the characters that I use the most:

⌀ § ° ± µ · × √ ≈ ß ÷ ℃ ℉ ¶ © « » ‰ ‘ ’ “ ” ﬀ ﬁ ﬂ ﬃ ﬄ ﬆ ☺

(Note that the first symbol is the diameter sign [0x2300] and not the empty set [∅, 0x2205].)
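Look-alikes such as these are easy to mix up by eye. One way to tell them apart is to ask Python's standard `unicodedata` module for the official character name. The helper name `describe` below is just an illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # Diameter sign, empty set, and the similar-looking letter from the accent row.
    for ch in ("\u2300", "\u2205", "\u00d8"):
        print(ch, describe(ch))
```

Pasting a doubtful character into `describe` shows which of the three it really is.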

Whether you see all characters correctly depends on the fonts your browser has access to. I'm using the freely available and excellent DejaVu fonts, which cover a significant part of the character set; see the samples.

accent: àáâãäåæç èéêë ìíîï ðñòóôõö øùúûüýþÿ ÀÁÂÃÄÅ Ç ÈÉÊË ÌÍÎÏ ÐÑ ÒÓÔÕÖ ØÙÚÛÜÝÞß

punctuation: ‘ ’ “ ” ‚ ‘ „ “ ¿ ¡ « » ‹ › ¶ § ‐ ‑ ‒ – — ― …

units: Ω ℃ ℉ Å ‰

currencies: € ¢ £ ¤ ¥ ₠

ligatures: ﬀ ﬁ ﬂ ﬃ ﬄ ﬆ

greek: αβγδ εζηθ ικλμ νξοπ ρςτυ φχψω ΑΒΓΔ ΕΖΗΘ ΙΚΛΜ ΝΞΟΠ ΡΣΤΥ ΦΧΨΩ
dingbats: ◆ ◇ ◊ ○ ◎ ● ◐ ◑ ◘ ◙ ◢ ◣ ★ ☆ ☎ ☏ ☜ ☞ ☺ ☻ ☼ ♀ ♂ ✓ ✗ ✠ † ‡
fractions: ¼ ½ ¾ ⅓ ⅔ ⅕ ⅖ ⅗ ⅘ ⅙ ⅚ ⅛ ⅜ ⅝ ⅞ ⅐ ⅑ ⅒
superscripts: ⁰ ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁺ ⁻ ⁼ ⁽ ⁾ ⁿ ª º
subscripts: ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ ₊ ₋ ₌ ₍ ₎
marks: © ® ™ ℗ ℠ ℞ ㏂ ㏘
roman: Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ Ⅵ Ⅶ Ⅷ Ⅸ Ⅹ Ⅺ Ⅻ Ⅼ Ⅽ Ⅾ Ⅿ
maths: ∂ ∀ ∃ ∆ ∇ ∈ ∋ ∏ ∑ ∓ √ ∛ ∜ ∝ ∞ ∥ ∧ ∨ ∩ ∪ ∫ ∬ ∮ ∴ ∵ ≃ ≈ ≠ ≤ ≥ ≪ ≫ ′ ″ ‴ ∼
sets: ℕ ℤ ℚ ℝ ℂ
comp: ⌘ ⌥ ‸ ⇧ ⌤ ↑ ↓ → ← ⇞ ⇟ ↖ ↘ ⌫ ⌦ ⎋ ⏏ ↶ ↷ ◀ ▶ ▲ ▼ ◁ ▷ △ ▽ ⇄ ⇤ ⇥ ↹ ↵ ↩ ⏎ ⌧ ⌨
comp2: ␣ ⌶ ⎗ ⎘ ⎙ ⎚ ⌚ ⌛ ✂ ✄ ✉ ✍
arrows: ← → ↑ ↓ ↔ ↖ ↗ ↙ ↘ ⇐ ⇒ ⇑ ⇓ ⇔ ⇗ ⇦ ⇨ ⇧ ⇩ ↞ ↠ ↟ ↡ ↺ ↻ ☞ ☜ ☝ ☟
ocr: ␣ ⑀ ⑁ ⑂ ⑃ ⑆ ⑇ ⑈ ⑉
signs: ☠ ☢ ☣ ☤ ♲ ♳ ⌬ ♨ ♿ ⚠ ⚡ ☡
technical: ⌒ ⌓ ⌔ ⌕ ⌭ ⌮ ⌯ ⌰ ⌱ ⌲ ⌳ ⚒ ⚓ ⚔ ⚖ ⚗
recycling: ♲ ♳ ♴ ♵ ♶ ♷ ♸ ♹ ♺ ♻ ♼ ♽ ♾

There are lots of other characters that could have been added to this list. If you read Coptic, Georgian or Cyrillic, or if you use the IPA, you could add those characters to the list.
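If the characters you want sit in one contiguous range of code points, a row in the same "label: characters" style can be generated rather than typed. The sketch below is not how the list above was made; the function name `row` and the chosen ranges (the basic Cyrillic capitals, U+0410 to U+042F, and the recycling symbols, U+2672 to U+267E) are only examples.

```python
def row(label: str, first: int, last: int) -> str:
    """Build one 'label: c c c ...' line from an inclusive code point range."""
    chars = (chr(cp) for cp in range(first, last + 1))
    return f"{label}: " + " ".join(chars)


if __name__ == "__main__":
    # Basic Cyrillic capital letters.
    print(row("cyrillic", 0x0410, 0x042F))
    # The recycling symbols from the list above.
    print(row("recycling", 0x2672, 0x267E))
```

Redirecting the output to a file (in a UTF-8 locale) gives a list that can be pasted from, just like the one on this page.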