To write the lowercase letter "a" in binary using the widely used ASCII encoding, you write 01100001. For the uppercase letter "A", the binary representation is 01000001. These are 8-bit codes, or one byte, which is the standard unit for storing a single text character in many modern computer systems.
How is a character converted to binary?
A computer stores letters and other characters as numbers, using a character encoding standard to map each character to a unique number. The binary version is simply the base-2 representation of that number. The following steps illustrate this process for the lowercase "a" using the ASCII standard.
Step 1: Find the decimal value
The American Standard Code for Information Interchange (ASCII) is a 7-bit standard that assigns decimal values to 128 characters.
- The ASCII value for the lowercase letter "a" is 97.
- The ASCII value for the uppercase letter "A" is 65.
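These lookups can be sketched in Python, where the built-in ord() returns a character's code value and chr() goes the other way:

```python
# ord() maps a character to its numeric code; chr() maps the number back.
lower = ord("a")
upper = ord("A")
print(lower)    # 97
print(upper)    # 65
print(chr(97))  # a
print(chr(65))  # A
```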
Step 2: Convert the decimal number to binary
To convert the decimal number to binary, you repeatedly divide the number by 2 and record the remainder. The binary number is the sequence of remainders read from bottom to top.
For the decimal value 97 (for "a"):
- 97 ÷ 2 = 48, remainder 1
- 48 ÷ 2 = 24, remainder 0
- 24 ÷ 2 = 12, remainder 0
- 12 ÷ 2 = 6, remainder 0
- 6 ÷ 2 = 3, remainder 0
- 3 ÷ 2 = 1, remainder 1
- 1 ÷ 2 = 0, remainder 1
Reading the remainders in reverse order (from bottom to top) gives 1100001.
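The steps above can be sketched as a minimal Python implementation of the division-remainder method:

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        remainders.append(str(n % 2))  # record the remainder (0 or 1)
        n //= 2                        # integer-divide by 2
    # Remainders come out least-significant bit first, so reverse them.
    return "".join(reversed(remainders))

print(to_binary(97))  # 1100001
print(to_binary(65))  # 1000001
```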
Step 3: Add the leading zero
For an 8-bit system, a leading zero is added to the 7-bit ASCII code to fill the full byte, resulting in 01100001.
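In Python, the zero-padding to a full byte can be done directly with a format specifier ("08b" means binary, padded to 8 digits), which reproduces the result above:

```python
# "08b" = binary representation, left-padded with zeros to 8 digits.
print(format(97, "08b"))  # 01100001  ("a")
print(format(65, "08b"))  # 01000001  ("A")
```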
Other encoding standards
While ASCII is fundamental, most modern systems use Unicode, which is a far more extensive encoding standard. Unicode includes ASCII as its first 128 characters and expands to cover a vast range of languages and symbols.
Unicode (UTF-8)
UTF-8 is a variable-width encoding that can represent any Unicode character. Because UTF-8 is backward-compatible with ASCII, the UTF-8 representation of "a" is identical to its 8-bit ASCII equivalent: 01100001.
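A quick Python check illustrates this: encoding "a" as UTF-8 produces a single byte whose bit pattern matches the ASCII value:

```python
encoded = "a".encode("utf-8")
print(len(encoded))               # 1  (ASCII characters take one byte in UTF-8)
print(format(encoded[0], "08b"))  # 01100001
```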
Extended ASCII
Historically, various "extended ASCII" schemes used the eighth bit to add 128 extra, often country-specific characters. Because these schemes were not standardized, a file written under one could appear as gibberish when read under another; they have largely been replaced by Unicode.
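This ambiguity can be demonstrated in Python by decoding the same byte under two historical 8-bit code pages; latin-1 and IBM's cp437 are used here as examples:

```python
raw = bytes([0xE9])               # one byte with the high bit set
print(raw.decode("latin-1"))      # é
print(raw.decode("cp437"))        # a different character (a Greek letter in this code page)
```

The same 8 bits, two different characters, depending on which code page the reader assumes.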
Hexadecimal
Sometimes you will see binary values expressed in hexadecimal for compactness.
- The hexadecimal value for lowercase "a" is 61.
- The hexadecimal value for uppercase "A" is 41.
Each hexadecimal digit corresponds directly to four binary bits, so the two digits concatenate to form the byte.
- For "a": 6 (0110) followed by 1 (0001) gives 01100001
- For "A": 4 (0100) followed by 1 (0001) gives 01000001
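In Python, the hexadecimal and decimal forms convert back and forth with format() and int():

```python
# "02x" = hexadecimal, left-padded with zeros to 2 digits.
print(format(97, "02x"))  # 61  ("a")
print(format(65, "02x"))  # 41  ("A")
print(int("61", 16))      # 97  (parse a hex string back to decimal)
```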
A note on context
When you see a binary sequence like 01100001, it's just a pattern of 0s and 1s. A computer only knows what this pattern means based on the context provided by the character encoding standard used to interpret it. A different program using a different standard might interpret 01100001 as a completely different character. For text, ASCII and Unicode (specifically UTF-8) are the most common contexts.