Every character you type has a number behind it. That's not a metaphor — it's literally how computers store text. The letter "A" is 65. A space is 32. The exclamation point you might add to a password is 33. This mapping between characters and numbers is called ASCII, and understanding it explains a surprising amount about how computers handle text, why certain bugs happen, and how to debug encoding issues when they show up.
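You can see this mapping for yourself. Here's a quick sketch in Python (any language with a character-to-code function works the same way), using the built-in `ord()` and `chr()` functions:

```python
# ord() gives the ASCII code behind a character; chr() goes the other way.
for ch in ("A", " ", "!"):
    print(repr(ch), "->", ord(ch))
# 'A' -> 65
# ' ' -> 32
# '!' -> 33

print(chr(65))  # -> A
```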
Why the Letter Codes Are Arranged the Way They Are
Ever wonder why uppercase A is 65 and lowercase a is 97? The gap is exactly 32, and that's not a coincidence. In binary, 65 is 01000001 and 97 is 01100001. The only difference is bit 5, the bit with value 32 (the third from the left in the 8-bit form shown). This means you can toggle between uppercase and lowercase by flipping a single bit, or by adding or subtracting 32. That's an elegant design that made case-insensitive comparisons very cheap to compute in early systems.
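The bit-flip trick is a one-liner. A small Python sketch (the helper name `toggle_case` is mine, for illustration):

```python
def toggle_case(ch: str) -> str:
    """Flip the case of a single ASCII letter by XOR-ing bit 5 (value 32)."""
    return chr(ord(ch) ^ 0x20)

print(toggle_case("A"))  # -> a
print(toggle_case("a"))  # -> A
```

Note that XOR with 0x20 only makes sense for letters; applied to other characters it just produces a different character (for example, it turns a space into a NUL byte), which is why real case-conversion routines check the range first.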
It also means you can check if a character is uppercase by testing whether its code falls in the range 65–90, and check for lowercase with 97–122. Digits 0–9 are 48–57. Knowing these ranges lets you manually validate character types without any library functions — useful in embedded systems or anywhere you want absolute minimal overhead.
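Those range checks translate directly into code. A minimal sketch (function names are illustrative, not from any library):

```python
def is_ascii_upper(ch: str) -> bool:
    return 65 <= ord(ch) <= 90   # 'A'..'Z'

def is_ascii_lower(ch: str) -> bool:
    return 97 <= ord(ch) <= 122  # 'a'..'z'

def is_ascii_digit(ch: str) -> bool:
    return 48 <= ord(ch) <= 57   # '0'..'9'

print(is_ascii_upper("Q"))  # -> True
print(is_ascii_digit("7"))  # -> True
print(is_ascii_lower("!"))  # -> False
```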
Extended ASCII and Why 128 Wasn't Enough
The original 128 ASCII characters worked fine for English text. But the moment you needed to write French, German, Spanish, Portuguese — any language with accented characters — ASCII fell apart. The accented e (é), the German ß, the Spanish ñ — none of these exist in standard ASCII.
The solution, for a while, was Extended ASCII. Different manufacturers used the remaining space in an 8-bit byte (codes 128–255) to add characters relevant to their target market. IBM's Code Page 437 added box-drawing characters and some Western European letters. ISO 8859-1 (Latin-1) covered most Western European languages. But since different systems used that 128–255 range differently, a document created on one system looked wrong on another. Sound familiar?
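You can reproduce that "looks wrong on another system" problem directly: decode the same byte under two different code pages and you get two different characters. A small Python sketch:

```python
# The single byte 0xE9 means different things under different
# legacy "extended ASCII" encodings.
b = bytes([0xE9])

print(b.decode("latin-1"))  # -> é   (ISO 8859-1)
print(b.decode("cp437"))    # -> Θ   (IBM Code Page 437)
```

Same byte on disk, two different characters on screen, and neither encoding records which one was intended. That ambiguity is exactly the mess Unicode set out to fix.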
That's ultimately why Unicode was developed — a single standard that aims to include every character in every human language. Unicode assigns each character a "code point" (like U+00E9 for é), and UTF-8 is the encoding that stores those code points as sequences of bytes. The first 128 code points (U+0000 through U+007F) are exactly the ASCII codes, which is why ASCII knowledge transfers directly to understanding UTF-8.
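That ASCII-compatibility is easy to verify: encode an ASCII character to UTF-8 and you get back the single ASCII byte, while a character outside the first 128 code points becomes a multi-byte sequence. For example:

```python
# ASCII characters are one byte in UTF-8, identical to their ASCII codes.
print("A".encode("utf-8"))   # -> b'A' (one byte, value 65)

# é is code point U+00E9, stored as two bytes in UTF-8.
print("é".encode("utf-8"))   # -> b'\xc3\xa9'
print(hex(ord("é")))         # -> 0xe9 (the code point itself)
```

This is why a pure-ASCII file is already valid UTF-8 with no conversion needed.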