Internal Numeric Representation

December 10, 2018 By Rosa

Now that we know how numeric systems work, it would be interesting to see what we can do with them. You can do surprisingly much with just one bit, maybe in more ways than you could imagine. Let’s take a look at how the computer internally represents numeric values.

A bit is the smallest unit of data available in a computer system. It can represent any two different values, and I mean literally any two different values. Below are a few examples:

True / False
On / Off
1 / 0
Right / Wrong
Left / Right
1024 / Red

That last example might look a little strange. How can one bit stand for 1024 and Red? As I said, it can be literally any two different values. As long as the two values differ, you can represent them using one bit: you simply agree on which value 0 stands for and which value 1 stands for. However, a single bit only ever gives you two values, which is very limiting.
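To make the "any two values" idea a bit more concrete, here is a minimal sketch in Python. The function name decode_bit and its parameters are my own invention for illustration; the point is only that the meaning of a bit is an agreement, not something stored in the bit itself.

```python
# Hypothetical helper: one bit selects one of two agreed-upon values.
def decode_bit(bit, value_if_zero, value_if_one):
    """Return the value that a single bit (0 or 1) stands for."""
    if bit not in (0, 1):
        raise ValueError("a bit can only be 0 or 1")
    return value_if_one if bit == 1 else value_if_zero

# The same bit value means different things under different agreements.
print(decode_bit(1, False, True))   # True
print(decode_bit(1, "Red", 1024))   # 1024
print(decode_bit(0, "Red", 1024))   # Red
```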

Bit String

To represent more than two values, the computer combines bits into bit strings. A bit string is a sequence of multiple bits, as the name suggests. This gives us the opportunity to achieve other representations, such as hexadecimal digits or ASCII characters. However, you cannot just decide for yourself what the length of your bit string is. The CPU works with a few fixed lengths. So, let’s put them in a table.

Bit String Name    Length
Nibble             4-bit
Byte               8-bit
Word               16-bit
Double Word        32-bit
“quad word”        64-bit

Unlike the byte, the 64-bit bit string has no universally agreed name; in 80x86 documentation it is called a quad word, because it is four 16-bit words long. As you can see, these are the commonly used bit strings. Even so, the CPU often still splits such bit strings into bytes for memory access. Why is that?

A byte is the smallest addressable data item on the CPU: every memory address refers to exactly one byte, so anything larger is stored as several consecutive bytes. It is also a size that the CPU can retrieve efficiently. The individual bits in a byte are identified by bit numbers, from bit 0 up to bit 7. This numbering makes it possible to give some bits more weight than others.

A general rule of thumb is that an n-bit bit string has 2^n different values. This number grows very quickly as n gets bigger. The table below repeats the bit strings from the table above, together with the number of different values each one can hold. Just for fun, I’ve also added the 128-bit bit string to give you some more perspective, although it is not supported everywhere yet.

Bits                    Values
Nibble (4-bit)          16
Byte (8-bit)            256
Word (16-bit)           65,536
Double Word (32-bit)    4,294,967,296
64-bit                  18,446,744,073,709,551,616
128-bit                 340,282,366,920,938,463,463,374,607,431,768,211,456
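If you want to check these numbers yourself, the rule of thumb is easy to sketch in Python, whose integers can grow as large as needed. The function name value_count is just a name I picked for this example.

```python
# Hypothetical helper: how many different values fit in an n-bit bit string?
def value_count(n_bits):
    """An n-bit bit string has 2 to the power n different values."""
    return 2 ** n_bits

# Print the count for each bit string length from the table.
for name, n_bits in [("Nibble", 4), ("Byte", 8), ("Word", 16),
                     ("Double Word", 32), ("64-bit", 64), ("128-bit", 128)]:
    print(f"{name:12} {value_count(n_bits):,}")
```

Every extra bit doubles the count, which is why the numbers explode so quickly.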

As you can see, the number of unique values gets big very quickly. What should you do with all those bits? Well, for example, you can give some of them a certain priority within the bit string. How does the CPU do that?

In a byte, bit 0 is the least significant bit. It is also called the LO bit, or low order bit. Where there is a low order, there is also a high order. As you can probably imagine, the bit with the highest bit number is the high order (HO) bit, which in the case of a byte is bit 7. What we can do with high and low order bits, we can also do with bytes when using words.

A word (16-bit) consists of two bytes in the 80x86 CPU architecture. (This could differ in another architecture, but since this is the most commonly used architecture we take it as our standard.) Since we have two bytes to work with, the high order bit moves from bit 7 to bit 15. Bits 7 to 0 now form the low order byte and bits 15 to 8 form the high order byte. With a double word, yes, you guessed it, byte 3 (bits 31 to 24) becomes the high order byte and byte 0 (bits 7 to 0) stays the low order byte. The same applies to a bit string of any length.
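Here is a small sketch of how you could pick individual bits and bytes out of a bit string with shifts and masks. The helper names get_bit and get_byte, and the sample values, are my own choices for illustration; bit and byte numbers start at 0 on the low order side.

```python
# Hypothetical helpers for picking bits and bytes out of a bit string.
def get_bit(value, position):
    """Return bit number `position` (0 = low order bit) of value."""
    return (value >> position) & 1

def get_byte(value, index):
    """Return byte number `index` (0 = low order byte) of value."""
    return (value >> (8 * index)) & 0xFF

word = 0x12AB                      # 0001_0010_1010_1011
print(get_bit(word, 0))            # LO bit of the word: 1
print(get_bit(word, 15))           # HO bit of the word: 0
print(hex(get_byte(word, 0)))      # LO byte (bits 7 to 0): 0xab
print(hex(get_byte(word, 1)))      # HO byte (bits 15 to 8): 0x12

double_word = 0xDEADBEEF
print(hex(get_byte(double_word, 3)))  # HO byte (bits 31 to 24): 0xde
```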

Signed Numbers

This is all well and good, but what do you do with high and low order bits? A popular use case for the high order bit is to indicate whether a signed number is negative or positive. With n bits you can represent signed values from -2^(n-1) to 2^(n-1)-1. The CPU uses the high order bit as the sign bit. So, if the HO bit is 1 the value is negative, otherwise it is zero or positive. So, for example:

0xFFFF = 1111_1111_1111_1111 => Starts with 1, thus is a negative number.
0x7FFF = 0111_1111_1111_1111 => Starts with 0, thus is a positive number.
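The sign-bit check can be sketched in a few lines of Python. The names is_negative and to_signed are hypothetical, and to_signed assumes the two’s complement interpretation that this article turns to next: a pattern whose HO bit is set stands for the pattern minus 2^n.

```python
# Hypothetical helpers, assuming n-bit two's complement values.
def is_negative(pattern, n_bits=16):
    """True when the high order (sign) bit of the pattern is 1."""
    return ((pattern >> (n_bits - 1)) & 1) == 1

def to_signed(pattern, n_bits=16):
    """Interpret an unsigned n-bit pattern as a signed value."""
    if is_negative(pattern, n_bits):
        return pattern - (1 << n_bits)
    return pattern

print(is_negative(0xFFFF), to_signed(0xFFFF))  # True -1
print(is_negative(0x7FFF), to_signed(0x7FFF))  # False 32767
print(to_signed(0x8000))                       # -32768
```

The last two lines show both ends of the 16-bit range: 32767 is 2^15-1 and -32768 is -2^15.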

To represent signed numbers, the CPU uses the two’s complement numbering system. Getting the two’s complement of a binary number takes two steps. Say we have the binary number 0000_0101 (5) and we want to know its two’s complement. First, we invert all the zeroes to ones and all the ones to zeroes. In our case, we get 1111_1010. To finish it off, we add one to the inverted number. This gets us 1111_1011, which is -5. Let’s try that once more:

16384:
0100_0000_0000_0000 ; Invert
1011_1111_1111_1111 ; Add one
1100_0000_0000_0000 ; Final result
-16384
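The same two steps, invert and add one, can be written as a short Python sketch. The function names twos_complement and grouped are mine, and the mask is there because Python integers have no fixed width, so I have to cut the result back to n bits by hand.

```python
# Hypothetical helpers that follow the invert-and-add-one recipe.
def twos_complement(pattern, n_bits):
    """Return the two's complement of an n-bit pattern."""
    mask = (1 << n_bits) - 1        # n ones, e.g. 0xFFFF for 16 bits
    inverted = ~pattern & mask      # step 1: flip every bit
    return (inverted + 1) & mask    # step 2: add one, keep n bits

def grouped(pattern, n_bits):
    """Format a pattern as binary digits in groups of four."""
    bits = format(pattern, f"0{n_bits}b")
    return "_".join(bits[i:i + 4] for i in range(0, n_bits, 4))

print(grouped(twos_complement(0b0000_0101, 8), 8))   # 1111_1011
print(grouped(twos_complement(16384, 16), 16))       # 1100_0000_0000_0000
```

Applying the function twice gives back the original pattern, which is a handy way to check your work by hand as well.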

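One very important note when working with the two’s complement numbering system:

You cannot negate the smallest negative value! With 16 bits, for example, the smallest value is -32768, but the largest positive value is only 32767, so +32768 does not fit.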
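You can see this for yourself with a minimal sketch that applies the invert-and-add-one steps to the smallest 16-bit value. The function name negate is hypothetical, and the mask imitates a fixed 16-bit width, which Python integers do not have on their own.

```python
# Hypothetical helper: negate an n-bit pattern by inverting and adding one.
def negate(pattern, n_bits):
    mask = (1 << n_bits) - 1
    return ((~pattern & mask) + 1) & mask

smallest = 0x8000                 # 1000_0000_0000_0000, i.e. -32768
print(hex(negate(smallest, 16)))  # 0x8000

# For comparison, an ordinary value negates as expected.
print(hex(negate(0x0001, 16)))    # 0xffff
```

Negating 0x8000 gives 0x8000 again, so the result still has its sign bit set and is still -32768.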