Bases

Our number systems tend to be positional with respect to some base.

For example, 137 represents:

\[ (1 \times 10^2) + (3 \times 10^1) + (7 \times 10^0) \]

Binary Numbers

Representation

Binary numbers are represented positionally, just like denary numbers:

$2^7$$2^6$$2^5$$2^4$$2^3$$2^2$$2^1$$2^0$
$128$$64$$32$$16$$8$$4$$2$$1$
$1$$0$$0$$0$$1$$0$$0$$1$

(The above shows you a representation of 137 in binary)

Padding

Binary numbers can be padded with leading zeroes much like denary numbers.
This is often done as registers often have a fixed width.

Binary Arithmetic

Addition

$15 + 4$ in binary:

15 = 1111 04 = 0100

15:       01111
04:       00100
          -----
result:   10011
carry:     1

Subtraction

$15 - 4$ in binary:

15 = 1111 04 = 0100

15:       01111
04:       00100
          -----
result:   01011
carry:

Multiplication

$15 \times 4$ in binary:

15 = 1111 04 = 0100

15:           1111
04:           0100
          --------
              0000
             00000
            111100
       +   0000000
       -----------
           0111100

Alternative Representations

There are alternative representations for numbers in binary, ie 2-of-5 code and others used in barcodes.

Words & Bytes

A computer’s memory is often divided into words, often 16, 32 or 64 bits in width.
The largest number that a 64-bit word can hold is $2^64 - 1$.

Sometimes a word can hold multiple numbers, ie, in instructions.
Numbers can also be represented with multiple words, when they exceed the length of a single word.

A bit represents a single binary digit, ie: 1 or 0.
A byte represents 8 bits.
A nibble represents half a byte, or 4 bits.

In programming languages, ints tend to represent 32-bit numbers, with longs representing 64 bits.
short is used to represent 4 bit numbers (nibbles).

The exact width of data types in a lanugage depends on the implementation - be sure to check this.

Sign-Magnitude Representation

So far we have only covered unsigned integers, that is to say, they have no sign.
signed integers allow us to discern whether or not they are positive or negative, and we can do so in different ways.

Sign-magnitude representation uses the first bit in a binary number to represent the sign, typically:

  • 0 represents $+$
  • 1 represents $-$

For example:

  • 0110 would represent 6
  • 1110 would represent -6

Excess (offset binary) representation

An excess representation starts at $-2^{n-1}$ and goes up to $2^{n-1} - 1$

For example, with an 8-bit excess representatoin:
00000000 = -128
00000001 = -127
11111110 = 126
11111111 = 127

Bit PatternSigned Integer Value
0000-8
0001-7
0010-6
0011-5
0100-4
0101-3
0110-2
0111-1
10000
10011
10102
10113
11004
11015
11106
11117

In excess representation, the bias is defined as $2^{n-1}$

Conversion

To convert excess binary representation to denary:

  • Convert it as a standard positionally encoded binary string
  • Subtract $2^{K-1}$

Arithmetic

Arithmetic in this form of encoding is nontrivial.

A number $n$ represented in K-bit excess is the positional binary representation of $n + 2^{K - 1}$.

That is to say, if we add numbers $n$ and $m$ positionally whilst they are encoded under the excess representation, the result will be:

\[ n + 2^{K-1} + m + 2^{K-1} \]

This means that in order to get the correct answer, we need to subtract $2^{K-1}$ from the answer.
(We only need to subtract $2^{K-1}$ once because the final result should still have the offset to be represented in excess)
That said, this does not work if the sum is out of range.

Two’s Complement Representation

In two’s complement, the sign of the first bit is flipped, such that:

$-2^7$$2^6$$2^5$$2^4$$2^3$$2^2$$2^1$$2^0$
$-128$$64$$32$$16$$8$$4$$2$$1$
$1$$0$$0$$0$$0$$0$$0$$0$

The above shows you how -128 is represented in two’s complement.

Formal Definition

Formally, for a 2’s complement bit string $b_{K - 1} … b_1b_0$, the number represented is:
$b_{K-1}2^{K-1} + … + b_12^1 + b_02^0$ if $b_{K-1} = 0$ and
$(b_{K-1} + … + b_12^1 + b_02^0) - 2^K$ if $b_{K-1} = 1$

Notice how this allows you to determine whether a number is negative or positive from the first digit alone.

Finding Negations

To convert a number to two’s complement:

  1. Get the number as an unsigned binary number
  2. Invert all the bits up to the right-most 1
  3. Add 1 to the binary number

Arithmetic

Two’s complement arithmetic works as it does with unsigned numbers as per modular arithmetic.

Ranges & Overflow

Overflow occurs if the result of an arithmetic operation is outside the range of numbers which can be represented in the chosen representation.

When an operation is performed and it overflows, the left-most bits that exceed the range of the datatype will be discarded, as such the result will not match the correct answer.

Real Numbers

In standard base-10, real numbers are represented with a decimal point separating the fractional component from the whole component, as such:

137.512 is equal to:

\[ (1 \times 10^2) + (3 \times 10^1) + (7 \times 10^0) + (5 \times 10^{-1}) + (1 \times 10^{-2}) + (2 \times 10^{-3}) \]

Likewise, the same can be done for binary

Fixed Point

Fixed point binary numbers have the decimal point put in a fixed location in the number, ie, before the last 4 digits:

1101.1010

$2^3$$2^2$$2^1$$2^0$$2^{-1}$$2^{-2}$$2^{-3}$$2^{-4}$
$8$$4$$2$$1$$1/2$$1/4$$1/8$$1/16$
$1$$1$$0$$1$$1$$0$$1$$0$

Fixed point representations have limited precision, unfortunately, and gives the same level of precision to all numbers stored with it, it is inflexible.

Floating POint

With denary, numbers can be written as such:

\[ 137.512 = 10^3 \times .137512 \]

Likewise, the same can be done with base 2:

\[ 11011010 = 2^4 \times .11011010 \]

Floating point representations are as such:

\[ \textrm{number} = \pm \textrm{base}^{\textrm{exponent}} \times \textrm{mantissa} \]

In binary:

  • The first bit represents the sign, 0 for + and 1 for -
  • The next k digits are the exponent in excess representation
  • The rest n digits is the mantissa in unsigned representation


In IEEE-754, a 32-bit float is defined as:

1 10000110 11110001100000000000000
  • The first bit is the sign bit
  • The next 8 bits are the exponent in excess representation
  • The next 23 bits are the mantissa



For 8-bit representations, it is common that k=3 and n=4.
For 16-bit representations, as mentioned, k=5 and n=10.
For 32-bit representations, as mentioned, k=8 and n=23.
For 64-bit representations (often called a double), k=11 and n=52

Examples

00101001

  • Sign is 0 so the number is positive
  • The exponent is 010 (-2)
  • The mantissa is 1001
  • The number is therefore equal to:
\[ 00101001 = 2^{-2} \times .1001\\ = .001001\\ = 9/64 \textrm{ in denary} \]

11101100

  • Sign is 1 so the number is negative
  • The exponent is 110 (2 since 110 is 6 and it’s excess representaiton so subtract 4)
  • The mantissa is 1100
  • The number is therefore equal to:
\[ 11101100 = -2^{2} \times .1100\\ = - 11.00\\ = -3 \textrm{ in denary} \]

0011011100100000

  • Sign is 0 so the number is positive
  • The exponent is 01101 (-3 since 01101 is 13 and it’s excess representaiton so subtract 16)
  • The mantissa is 1100100000 (25/32)
  • The number is therefore equal to:
\[ 11101100 = 2^{-3} \times 25/32\\ = 2^{-2} \times 25/64\\ = 2^{-1} \times 25/128\\ = 25/256 \]

Normalised Representation

A floating point representation is normalised if the leading bit of the mantissa is 1.
All representations (except for 0) should be normalised as this allows for the most precision.

Examples

+7/2 as an 8-bit float

  • The sign bit is 0
\[ \frac{7}{2} = 3 + \frac{1}{2}\\ \\ = 2^1 + 2^0 + 2^-1\\ \\ = 2^2 \times (2^{-1} + 2^{-2} + 2^{-3}) \]
  • So the exponent is 2 (110 in excess representation)
  • The mantissa is 1110
  • The final number is 0110110