Numbers & Computers
January 2026
Bases
Our number systems tend to be positional with respect to some base.
For example, 137 represents:
Binary Numbers
Representation
Binary numbers are represented positionally, just like denary numbers:
| $2^7$ | $2^6$ | $2^5$ | $2^4$ | $2^3$ | $2^2$ | $2^1$ | $2^0$ |
|---|---|---|---|---|---|---|---|
| $128$ | $64$ | $32$ | $16$ | $8$ | $4$ | $2$ | $1$ |
| $1$ | $0$ | $0$ | $0$ | $1$ | $0$ | $0$ | $1$ |
(The above shows you a representation of 137 in binary)
Padding
Binary numbers can be padded with leading zeroes much like denary numbers.
This is often done as registers often have a fixed width.
Binary Arithmetic
Addition
$15 + 4$ in binary:
15 = 1111
04 = 0100
15: 01111
04: 00100
-----
result: 10011
carry: 1
Subtraction
$15 - 4$ in binary:
15 = 1111
04 = 0100
15: 01111
04: 00100
-----
result: 01011
carry:
Multiplication
$15 \times 4$ in binary:
15 = 1111
04 = 0100
15: 1111
04: 0100
--------
0000
00000
111100
+ 0000000
-----------
0111100
Alternative Representations
There are alternative representations for numbers in binary, ie 2-of-5 code and others used in barcodes.
Words & Bytes
A computer’s memory is often divided into words, often 16, 32 or 64 bits in width.
The largest number that a 64-bit word can hold is $2^64 - 1$.
Sometimes a word can hold multiple numbers, ie, in instructions.
Numbers can also be represented with multiple words, when they exceed the length of a single word.
A bit represents a single binary digit, ie: 1 or 0.
A byte represents 8 bits.
A nibble represents half a byte, or 4 bits.
In programming languages, ints tend to represent 32-bit numbers, with longs representing 64 bits.short is used to represent 4 bit numbers (nibbles).
Sign-Magnitude Representation
So far we have only covered unsigned integers, that is to say, they have no sign.signed integers allow us to discern whether or not they are positive or negative, and we can do so in different ways.
Sign-magnitude representation uses the first bit in a binary number to represent the sign, typically:
0represents $+$1represents $-$
For example:
0110would represent61110would represent-6
Excess (offset binary) representation
An excess representation starts at $-2^{n-1}$ and goes up to $2^{n-1} - 1$
For example, with an 8-bit excess representatoin:00000000 = -12800000001 = -12711111110 = 12611111111 = 127
| Bit Pattern | Signed Integer Value |
|---|---|
0000 | -8 |
0001 | -7 |
0010 | -6 |
0011 | -5 |
0100 | -4 |
0101 | -3 |
0110 | -2 |
0111 | -1 |
1000 | 0 |
1001 | 1 |
1010 | 2 |
1011 | 3 |
1100 | 4 |
1101 | 5 |
1110 | 6 |
1111 | 7 |
In excess representation, the bias is defined as $2^{n-1}$
Conversion
To convert excess binary representation to denary:
- Convert it as a standard positionally encoded binary string
- Subtract $2^{K-1}$
Arithmetic
Arithmetic in this form of encoding is nontrivial.
A number $n$ represented in K-bit excess is the positional binary representation of $n + 2^{K - 1}$.
That is to say, if we add numbers $n$ and $m$ positionally whilst they are encoded under the excess representation, the result will be:
This means that in order to get the correct answer, we need to subtract $2^{K-1}$ from the answer.
(We only need to subtract $2^{K-1}$ once because the final result should still have the offset to be represented in excess)
That said, this does not work if the sum is out of range.
Two’s Complement Representation
In two’s complement, the sign of the first bit is flipped, such that:
| $-2^7$ | $2^6$ | $2^5$ | $2^4$ | $2^3$ | $2^2$ | $2^1$ | $2^0$ |
|---|---|---|---|---|---|---|---|
| $-128$ | $64$ | $32$ | $16$ | $8$ | $4$ | $2$ | $1$ |
| $1$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ |
The above shows you how -128 is represented in two’s complement.
Formal Definition
Formally, for a 2’s complement bit string $b_{K - 1} … b_1b_0$, the number represented is:
$b_{K-1}2^{K-1} + … + b_12^1 + b_02^0$ if $b_{K-1} = 0$ and
$(b_{K-1} + … + b_12^1 + b_02^0) - 2^K$ if $b_{K-1} = 1$
Notice how this allows you to determine whether a number is negative or positive from the first digit alone.
Finding Negations
To convert a number to two’s complement:
- Get the number as an unsigned binary number
- Invert all the bits up to the right-most 1
- Add
1to the binary number
Arithmetic
Two’s complement arithmetic works as it does with unsigned numbers as per modular arithmetic.
Ranges & Overflow
Overflow occurs if the result of an arithmetic operation is outside the range of numbers which can be represented in the chosen representation.
When an operation is performed and it overflows, the left-most bits that exceed the range of the datatype will be discarded, as such the result will not match the correct answer.
Real Numbers
In standard base-10, real numbers are represented with a decimal point separating the fractional component from the whole component, as such:
137.512 is equal to:
Likewise, the same can be done for binary
Fixed Point
Fixed point binary numbers have the decimal point put in a fixed location in the number, ie, before the last 4 digits:
1101.1010
| $2^3$ | $2^2$ | $2^1$ | $2^0$ | $2^{-1}$ | $2^{-2}$ | $2^{-3}$ | $2^{-4}$ |
|---|---|---|---|---|---|---|---|
| $8$ | $4$ | $2$ | $1$ | $1/2$ | $1/4$ | $1/8$ | $1/16$ |
| $1$ | $1$ | $0$ | $1$ | $1$ | $0$ | $1$ | $0$ |
Fixed point representations have limited precision, unfortunately, and gives the same level of precision to all numbers stored with it, it is inflexible.
Floating POint
With denary, numbers can be written as such:
Likewise, the same can be done with base 2:
Floating point representations are as such:
In binary:
- The first bit represents the sign,
0for+and1for- - The next
kdigits are the exponent in excess representation - The rest
ndigits is the mantissa in unsigned representation
In IEEE-754, a 32-bit float is defined as:
1 10000110 11110001100000000000000- The first bit is the sign bit
- The next 8 bits are the exponent in excess representation
- The next 23 bits are the mantissa
For 8-bit representations, it is common that k=3 and n=4.
For 16-bit representations, as mentioned, k=5 and n=10.
For 32-bit representations, as mentioned, k=8 and n=23.
For 64-bit representations (often called a double), k=11 and n=52
Examples
00101001
- Sign is
0so the number is positive - The exponent is
010(-2) - The mantissa is
1001 - The number is therefore equal to:
11101100
- Sign is
1so the number is negative - The exponent is
110(2since110is6and it’s excess representaiton so subtract4) - The mantissa is
1100 - The number is therefore equal to:
0011011100100000
- Sign is
0so the number is positive - The exponent is
01101(-3since01101is13and it’s excess representaiton so subtract16) - The mantissa is
1100100000(25/32) - The number is therefore equal to:
Normalised Representation
A floating point representation is normalised if the leading bit of the mantissa is 1.
All representations (except for 0) should be normalised as this allows for the most precision.
Examples
+7/2 as an 8-bit float
- The sign bit is
0
- So the exponent is
2(110in excess representation) - The mantissa is
1110 - The final number is
0110110