Numbers & Computers

Bases

Our number systems tend to be positional with respect to some base.

For example, 137 represents:

\[
(1 \times 10^2) + (3 \times 10^1) + (7 \times 10^0)
\]

Binary Numbers

Representation

Binary numbers are represented positionally, just like denary numbers:

$2^7$	$2^6$	$2^5$	$2^4$	$2^3$	$2^2$	$2^1$	$2^0$
$128$	$64$	$32$	$16$	$8$	$4$	$2$	$1$
$1$	$0$	$0$	$0$	$1$	$0$	$0$	$1$

(The above shows you a representation of 137 in binary)

Padding

Binary numbers can be padded with leading zeroes much like denary numbers.
This is often done as registers often have a fixed width.

Binary Arithmetic

Addition

$15 + 4$ in binary:

15 = 1111 04 = 0100

15:       01111
04:       00100
          -----
result:   10011
carry:     1

Subtraction

$15 - 4$ in binary:

15 = 1111 04 = 0100

15:       01111
04:       00100
          -----
result:   01011
carry:

Multiplication

$15 \times 4$ in binary:

15 = 1111 04 = 0100

15:           1111
04:           0100
          --------
              0000
             00000
            111100
       +   0000000
       -----------
           0111100

Alternative Representations

There are alternative representations for numbers in binary, ie 2-of-5 code and others used in barcodes.

Words & Bytes

A computer’s memory is often divided into words, often 16, 32 or 64 bits in width.
The largest number that a 64-bit word can hold is $2^64 - 1$.

Sometimes a word can hold multiple numbers, ie, in instructions.
Numbers can also be represented with multiple words, when they exceed the length of a single word.

A bit represents a single binary digit, ie: 1 or 0.
A byte represents 8 bits.
A nibble represents half a byte, or 4 bits.

In programming languages, ints tend to represent 32-bit numbers, with longs representing 64 bits.
short is used to represent 4 bit numbers (nibbles).

The exact width of data types in a lanugage depends on the implementation - be sure to check this.

Sign-Magnitude Representation

So far we have only covered unsigned integers, that is to say, they have no sign.
signed integers allow us to discern whether or not they are positive or negative, and we can do so in different ways.

Sign-magnitude representation uses the first bit in a binary number to represent the sign, typically:

0 represents $+$
1 represents $-$

For example:

0110 would represent 6
1110 would represent -6

Excess (offset binary) representation

An excess representation starts at $-2^{n-1}$ and goes up to $2^{n-1} - 1$

For example, with an 8-bit excess representatoin:
00000000 = -128
00000001 = -127
11111110 = 126
11111111 = 127

Bit Pattern	Signed Integer Value
`0000`	`-8`
`0001`	`-7`
`0010`	`-6`
`0011`	`-5`
`0100`	`-4`
`0101`	`-3`
`0110`	`-2`
`0111`	`-1`
`1000`	`0`
`1001`	`1`
`1010`	`2`
`1011`	`3`
`1100`	`4`
`1101`	`5`
`1110`	`6`
`1111`	`7`

In excess representation, the bias is defined as $2^{n-1}$

Conversion

To convert excess binary representation to denary:

Convert it as a standard positionally encoded binary string
Subtract $2^{K-1}$

Arithmetic

Arithmetic in this form of encoding is nontrivial.

A number $n$ represented in K-bit excess is the positional binary representation of $n + 2^{K - 1}$.

That is to say, if we add numbers $n$ and $m$ positionally whilst they are encoded under the excess representation, the result will be:

\[
n + 2^{K-1} + m + 2^{K-1}
\]

This means that in order to get the correct answer, we need to subtract $2^{K-1}$ from the answer.
(We only need to subtract $2^{K-1}$ once because the final result should still have the offset to be represented in excess)
That said, this does not work if the sum is out of range.

Two’s Complement Representation

In two’s complement, the sign of the first bit is flipped, such that:

$-2^7$	$2^6$	$2^5$	$2^4$	$2^3$	$2^2$	$2^1$	$2^0$
$-128$	$64$	$32$	$16$	$8$	$4$	$2$	$1$
$1$	$0$	$0$	$0$	$0$	$0$	$0$	$0$

The above shows you how -128 is represented in two’s complement.

Formal Definition

Formally, for a 2’s complement bit string $b_{K - 1} … b_1b_0$, the number represented is:
$b_{K-1}2^{K-1} + … + b_12^1 + b_02^0$ if $b_{K-1} = 0$ and
$(b_{K-1} + … + b_12^1 + b_02^0) - 2^K$ if $b_{K-1} = 1$

Notice how this allows you to determine whether a number is negative or positive from the first digit alone.

Finding Negations

To convert a number to two’s complement:

Get the number as an unsigned binary number
Invert all the bits up to the right-most 1
Add 1 to the binary number

Arithmetic

Two’s complement arithmetic works as it does with unsigned numbers as per modular arithmetic.

Ranges & Overflow

Overflow occurs if the result of an arithmetic operation is outside the range of numbers which can be represented in the chosen representation.

When an operation is performed and it overflows, the left-most bits that exceed the range of the datatype will be discarded, as such the result will not match the correct answer.

Real Numbers

In standard base-10, real numbers are represented with a decimal point separating the fractional component from the whole component, as such:

137.512 is equal to:

\[
(1 \times 10^2) + (3 \times 10^1) + (7 \times 10^0) + (5 \times 10^{-1}) + (1 \times 10^{-2}) + (2 \times 10^{-3})
\]

Likewise, the same can be done for binary

Fixed Point

Fixed point binary numbers have the decimal point put in a fixed location in the number, ie, before the last 4 digits:

1101.1010

$2^3$	$2^2$	$2^1$	$2^0$	$2^{-1}$	$2^{-2}$	$2^{-3}$	$2^{-4}$
$8$	$4$	$2$	$1$	$1/2$	$1/4$	$1/8$	$1/16$
$1$	$1$	$0$	$1$	$1$	$0$	$1$	$0$

Fixed point representations have limited precision, unfortunately, and gives the same level of precision to all numbers stored with it, it is inflexible.

Floating POint

With denary, numbers can be written as such:

\[
137.512 = 10^3 \times .137512
\]

Likewise, the same can be done with base 2:

\[
11011010 = 2^4 \times .11011010
\]

Floating point representations are as such:

\[
\textrm{number} = \pm \textrm{base}^{\textrm{exponent}} \times \textrm{mantissa}
\]

In binary:

The first bit represents the sign, 0 for + and 1 for -
The next k digits are the exponent in excess representation
The rest n digits is the mantissa in unsigned representation

In IEEE-754, a 32-bit float is defined as:

1 10000110 11110001100000000000000

The first bit is the sign bit
The next 8 bits are the exponent in excess representation
The next 23 bits are the mantissa

For 8-bit representations, it is common that k=3 and n=4.
For 16-bit representations, as mentioned, k=5 and n=10.
For 32-bit representations, as mentioned, k=8 and n=23.
For 64-bit representations (often called a double), k=11 and n=52

Examples

00101001

Sign is 0 so the number is positive
The exponent is 010 (-2)
The mantissa is 1001
The number is therefore equal to:

\[
00101001 = 2^{-2} \times .1001\\
= .001001\\
= 9/64 \textrm{ in denary}
\]

11101100

Sign is 1 so the number is negative
The exponent is 110 (2 since 110 is 6 and it’s excess representaiton so subtract 4)
The mantissa is 1100
The number is therefore equal to:

\[
11101100 = -2^{2} \times .1100\\
= - 11.00\\
= -3 \textrm{ in denary}
\]

0011011100100000

Sign is 0 so the number is positive
The exponent is 01101 (-3 since 01101 is 13 and it’s excess representaiton so subtract 16)
The mantissa is 1100100000 (25/32)
The number is therefore equal to:

\[
11101100 = 2^{-3} \times 25/32\\
= 2^{-2} \times 25/64\\
= 2^{-1} \times 25/128\\
= 25/256
\]

Normalised Representation

A floating point representation is normalised if the leading bit of the mantissa is 1.
All representations (except for 0) should be normalised as this allows for the most precision.

Examples

+7/2 as an 8-bit float

The sign bit is 0

\[
\frac{7}{2} = 3 + \frac{1}{2}\\
\\
= 2^1 + 2^0 + 2^-1\\
\\
= 2^2 \times (2^{-1} + 2^{-2} + 2^{-3})
\]

So the exponent is 2 (110 in excess representation)
The mantissa is 1110
The final number is 0110110

Numbers & Computers

January 2026

Bases

Binary Numbers

Representation

Padding

Binary Arithmetic

Addition

Subtraction

Multiplication

Alternative Representations

Words & Bytes

Sign-Magnitude Representation

Excess (offset binary) representation

Conversion

Arithmetic

Two’s Complement Representation

Formal Definition

Finding Negations

Arithmetic

Ranges & Overflow

Real Numbers

Fixed Point

Floating POint

Examples

Normalised Representation

Examples