Computer number systems (idea) by Jason W

This node is an introduction to common number systems used in computers. It is designed to help the uninitiated get familiar with how they work.

Binary

Binary numbers are composed of only 0s and 1s. A 1 means "on" or True, and a 0 means "off" or False. A single binary digit is called a bit. Binary is most commonly used in fixed-length strings of bits, such as 8-bit, 16-bit, and 32-bit. An 8-bit example would look like this:

00000000

In our example, we'll assume the bit on the left end <- is the "highest order bit" and the bit farthest right -> is the "lowest order bit". This refers to the magnitude each bit carries. The sequence starts on the right with 1 and doubles with each step to the left, so the values are successive powers of 2. Here is a graphical translation of that for 8 bits:

  0   0   0   0   0   0   0   0
128  64  32  16   8   4   2   1

So the binary number

01000001

is 65. The 7th bit is 64 and the 1st bit is 1; add them together to get 65. If you know the positions of the on bits, counting from 1 on the right (m, n, ...; here 7 and 1), another way to compute the total is 2^(m-1) + 2^(n-1) + ... In this case 2^(7-1) + 2^(1-1) = 2^6 + 2^0 = 64 + 1 = 65.

Now to find what the total range of a fixed length of bits is, you could set all the bits to 1 and add them up:

11111111

That will give you 255, the largest value 8 bits can hold. The total number of values is 256, because 0 is also a possible value. Note that 255 - 128 = 127, so the bits below the highest order bit add up to one less than the highest order bit. The easier way to calculate the total range is just 2^m, where m is the number of bits, 8 in this case. 2^8 = 256.

Time for some real world examples of range. Your graphical display has a setting that specifies the total number of colors available. This setting is usually 8-bit, 16-bit, 24-bit, or 32-bit. If you're on an 8-bit display you have 2^8 colors, or 256. If you're on a 24-bit display, you have 2^24 colors, or 16777216. This is the often quoted "16 million colors". Another example is the text you read or write. Plain text is defined by the ASCII standard, which assigns 128 numbers to characters. Since there are only 128 characters, only 7 bits are needed (we'll add the eighth bit here anyway). The above example of 65 translates to a capital A. Here is a snippet from 'man ascii' showing a few of the assignments:

Oct   Dec   Hex   Char           Oct   Dec   Hex   Char
001   1     01    SOH            101   65    41    A
002   2     02    STX            102   66    42    B
003   3     03    ETX            103   67    43    C
004   4     04    EOT            104   68    44    D
...
040   32    20    SPACE          140   96    60    `
041   33    21    !              141   97    61    a
042   34    22    "              142   98    62    b
043   35    23    #              143   99    63    c
044   36    24    $              144   100   64    d

The decimal value (Dec column) is the one you're used to. Try to spell out AaBbCc in binary using the table above (really, TRY IT, it's the best way to learn. Pencil and paper are good for this, but whatever tickles your fancy).

01000001        A
01100001        a
01000010        B
01100010        b
01000011        C
01100011        c

It should be obvious that to get from A to B to C or a to b to c you just add one. What you might not have noticed is that to get from A to a, B to b, and C to c, you only have to turn one extra bit on, the 6th bit. The 6th bit is 2^(6-1) = 32. Looking at the decimal values for the letters, this makes sense: 97 - 65 = 32, 98 - 66 = 32, 99 - 67 = 32. According to 'man ascii', having to change only one bit to convert between upper and lower case made the conversion easier to do by hand.

A final note on text conversions. Other standards, such as those in ISO 8859, use 8 bits (256 possible values). For the values that fit in the lower 7 bits (0-127), they use the ASCII assignments. These are called character sets, and there are many different ones to accommodate the characters used in different parts of the world. The current list from 'man iso_8859-1':

ISO 8859-1    west European languages (Latin-1)
ISO 8859-2    east European languages (Latin-2)
ISO 8859-3    southeast European and miscellaneous languages (Latin-3)
ISO 8859-4    Scandinavian/Baltic languages (Latin-4)
ISO 8859-5    Latin/Cyrillic
ISO 8859-6    Latin/Arabic
ISO 8859-7    Latin/Greek
ISO 8859-8    Latin/Hebrew
ISO 8859-9    Latin-1 modification for Turkish (Latin-5)
ISO 8859-10   Lappish/Nordic/Eskimo languages (Latin-6)
ISO 8859-11   Thai
ISO 8859-13   Baltic Rim languages (Latin-7)
ISO 8859-14   Celtic (Latin-8)
ISO 8859-15   west European languages (Latin-9)

Octal

Understanding binary can be useful for understanding octal. Octal is simply base 8. Nothing more, nothing less. If you understand how to use the different bases, this will just be review. However, considering we are indoctrinated with base 10 from our first days in the schoolhouse, it is no wonder many people have trouble with octal. First, here is an example of base 10:

1425

What is that number? 1425 you say? It depends on what base you're in. Let's expand it in base 10 form:

   1    4    2    5
1000  100   10    1

You may have learned to do addition like this in elementary school. As you move from right to left, each digit gets a zero added to it:

1000 + 400 + 20 + 5 = 1425

Neat, huh? Except you're not really adding zeros, you're multiplying each digit by ten raised to the power of the digit's position, counting from 0 on the right. 5 is in position 0, 2 in position 1, 4 in position 2, and 1 in position 3. So what you're actually doing is:

5 * 10^0 = 5 * 1 = 5
2 * 10^1 = 2 * 10 = 20
4 * 10^2 = 4 * 100 = 400
1 * 10^3 = 1 * 1000 = 1000

Make sense? If not, re-read it until you understand it. Now in comes octal, or base 8. The "base" literally means the base of the exponential operation, the number raised to the position of the digit. So taking the sequence 1425, we get:

5 * 8^0 = 5 * 1 = 5
2 * 8^1 = 2 * 8 = 16
4 * 8^2 = 4 * 64 = 256
1 * 8^3 = 1 * 512 = 512

Which yields 789. So now you understand that 1425 is not the number 1425, but rather a sequence of digits before being operated on. As one would expect, since 8 < 10, the resulting 789 < 1425. Note that in base 10 the available digits are 0-9. In octal, only 0-7 are used.

There are several interesting things in the relationship between binary and octal. To demonstrate this, we'll use the operator &, which (in C at least) performs a bitwise AND of two binary strings. A bit is on in the result only if it is on in both operands (the things before and after the &); bits that are off in both, or on in only one, are turned off. For example (note that this is pseudo code: you can't actually use plain binary strings in C):

00100011 & 00100010 = 00100010

In C, octal values are differentiated from decimal values by preceding the number with a 0, i.e. 0654. Say you want to keep just the three lowest order bits of a binary string and turn all the others off. You would need to know what the three lowest order bits add up to.

000000111

Using the chart above, we can see that the on bits are 1, 2, 4, which add up to 1 + 2 + 4 = 7. So, taking 001010101 as a sample string (it could be any binary string; only its three lowest order bits survive):

001010101 & 7 = 000000101

Great, that was easy. Now let's keep just the next three bits, 4, 5, 6 (the & turns off every bit outside the mask). We can see that 2^(4-1) + 2^(5-1) + 2^(6-1) = 2^3 + 2^4 + 2^5 = 8 + 16 + 32 = 56. So

001010101 & 56 = 000010000

And for the next three it would be 448. After that it gets really messy, more than anyone would want to deal with. Now octal enters the picture again. Instead of using 7, 56, 448, etc., we can just use octal 7 in different positions.

001010101 & 07 = 000000101
001010101 & 070 = 000010000
001010101 & 0700 = 001000000

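And so on. If we examine this, we can clearly see why this is. Each octal digit covers exactly three bits, since 8 = 2^3, so moving the 7 one position to the left moves the three on bits up by three places:

07 = 7 * 8^0 = 7 * 1 = 7
070 = 7 * 8^1 = 7 * 8 = 56
0700 = 7 * 8^2 = 7 * 64 = 448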

This is a very common use of octal in the computer world. The most prevalent example in the Unix world is file permissions. There are separate permissions for owner, group, and other. These permissions are read, write, and execute.

Owner Group Other
000   000   000
RWE   RWE   RWE

Thus, to give all permissions to the owner, we would use 0700. For owner and other it would be 0707, and to give everyone all permissions it would be 0777. So now you know better than to 'chmod 777 world', right?

Hexadecimal

Hexadecimal is base 16. It is used in the same fashion as octal, just for larger bit strings: each hex digit covers four bits, since 16 = 2^4. A good exercise for the reader would be to experiment with hex and see what sorts of bit patterns can be produced. To use hex in C, precede the hex number with 0x, i.e. 0x330. Taking the sequence 1425 once more, this time in base 16:

5 * 16^0 = 5 * 1 = 5
2 * 16^1 = 2 * 16 = 32
4 * 16^2 = 4 * 256 = 1024
1 * 16^3 = 1 * 4096 = 4096

Which yields 5157.

As with octal, the available digits are different than with base 10. In this case, the digit values 0-15 are used. However, a digit written as 10 would be impossible to distinguish from a 1 followed by a 0, so the values 10-15 are represented by the letters A-F.

Decimal Hexadecimal
10      A
11      B
12      C
13      D
14      E
15      F

0x3F8 therefore expands to:

8 * 16^0 = 8 * 1 = 8
F * 16^1 = 15 * 16 = 240
3 * 16^2 = 3 * 256 = 768

Which yields 1016.

Example

Here is a simple C program that takes a sequence of digits, interprets it in a given base, and prints the resulting base 10 value. Because the digits are peeled off in decimal, it only works for bases 2 through 10. Compile with 'gcc -o base base.c'.

#include <stdio.h>

int main (void)
{
        /* Base to work in (2 through 10) */
        int base = 8;

        /* Digit sequence to interpret, written as a decimal literal */
        int number = 1425;

        /* Value of the current digit position: base^0, base^1, ... */
        int place = 1;

        int final = 0;

        /* Get each digit, multiply by (base ^ position) */
        while (number)
        {
                final += (number % 10) * place;
                place *= base;
                number = number / 10;
        }

        /* Prints 789 for base 8 and the digits 1425 */
        printf ("%d\n", final);
        return 0;
}
