Fast multiplication circuits
The operands are two signed 8-bit integers:
The multiplication we are looking for is signed 16-bit integer:
Or more verbosely:
i |
15 |
14 |
13 |
12 |
11 |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Weight |
32k |
16k |
8k |
4k |
2k |
1k |
512 |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
A |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
||||||||
B |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
||||||||
A*B |
1 |
1 |
1 |
0 |
0 |
1 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
1 |
0 |
0 |
In the following chapters we show multiplication of the operands using various methods:
Signed Radix-2 Booth multiplication
Signed Radix-4 Booth multiplication
Signed Radix-16 Booth multiplication
Multiplexer based multiplication
Note that most modern processors use high-radix Booth algorithm to speed up the multiplication.
Signed Radix-2 Booth multiplication
In Radix-2 Booth multiplication the bits are grouped in:
We'll assume following selection table:
a(i) |
a(i-1) |
D(i) |
0 |
0 |
0 |
0 |
1 |
1 |
1 |
0 |
-1 |
1 |
1 |
0 |
Encoding the multiplier digits with the selection table results in high-radix multiplier:
i |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
|
Weight |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
|
B |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
|
D(0) |
0 |
0 |
|||||||
D(1) |
0 |
0 |
|||||||
D(2) |
1 |
0 |
|||||||
D(3) |
0 |
1 |
|||||||
D(4) |
0 |
0 |
|||||||
D(5) |
1 |
0 |
|||||||
D(6) |
0 |
1 |
|||||||
D(7) |
1 |
0 |
|||||||
D |
-1 |
1 |
-1 |
0 |
1 |
-1 |
0 |
0 |
All possible variations of the multiplicand A can be precalculated in hardware using bit shifting and addition:
i |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Weight |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
D(i)=0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
D(i)=1 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
D(i)=-1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
Multiplying A with encoded values of multiplier yields in:
i |
15 |
14 |
13 |
12 |
11 |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
|
Weight |
32k |
16k |
8k |
4k |
2k |
1k |
512 |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
|
A |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
|||||||||
D |
-1 |
1 |
-1 |
0 |
1 |
-1 |
0 |
0 |
Expl. |
||||||||
Z1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
+0 |
Z2 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
+0 |
|
Z3 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
-A<<2 |
||
Z4 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
+A<<3 |
|||
Z5 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
+0 |
||||
Z6 |
1 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
-A<<5 |
|||||
Z7 |
0 |
0 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
+A<<6 |
||||||
Z8 |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
-A<<7 |
|||||||
Sum |
1 |
1 |
1 |
0 |
0 |
1 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
1 |
0 |
0 |
Steps explained:
Z1 - Ignore, since D(0) is zero
Z2 - Ignore, since D(1) is zero
Z3 - Subtract A shifted left by 2 bits since D(2) is -1
Z4 - Add A shifted left by 3 bits since D(3) is 1
Z5 - Ignore, since D(4) is zero
Z6 - Subtract A shifted left by 5 bits since D(5) is -1
Z7 - Add A shifted left by 6 bits since D(6) is 1
Z8 - Subtract A shifted left by 7 bits since D(7) is -1
The steps can also easily be verified on a Python command line:
>>> A=77
>>> B=-92
>>> A*B
-7084
>>> -(A<<2)+(A<<3)-(A<<5)+(A<<6)-(A<<7)
-7084
Thus encoding multiplicand (B) digits with Booth encoding (D) we can drastically simplify multiplication on a modern processor since bit shift is inexpensive operation which in circuit can be implemented with zero delay.
Signed Radix-4 Booth multiplication
Bit grouping in Radix-4:
Selector table:
a(i+1) |
a(i) |
a(i-1) |
D(i) |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
0 |
1 |
0 |
1 |
0 |
1 |
1 |
2 |
1 |
0 |
0 |
-2 |
1 |
0 |
1 |
-1 |
1 |
1 |
0 |
-1 |
1 |
1 |
1 |
0 |
Encoding the multiplier digits yields in:
i |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
|
Weight |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
|
B |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
|
D(0) |
0 |
0 |
0 |
||||||
D(2) |
0 |
1 |
0 |
||||||
D(4) |
1 |
0 |
0 |
||||||
D(6) |
1 |
0 |
1 |
||||||
D |
-1 |
-2 |
1 |
0 |
Precalculate possible variations of the multiplicand A:
i |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Weight |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
D(i)=0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
D(i)=1 |
0 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
D(i)=2 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
0 |
D(i)=-1 |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
D(i)=-2 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
0 |
Multiplying A with encoded values of multiplier yields in:
i |
15 |
14 |
13 |
12 |
11 |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
|
Weight |
32k |
16k |
8k |
4k |
2k |
1k |
512 |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
|
A |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
|||||||||
D |
-1 |
-2 |
1 |
0 |
Expl. |
||||||||||||
Z1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
+0 |
Z2 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
+A<<2 |
||
Z3 |
1 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
0 |
-2A<<4 |
||||
Z4 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
-A<<6 |
||||||
A*B |
1 |
1 |
1 |
0 |
0 |
1 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
1 |
0 |
0 |
Steps explained:
Z1 - Ignore, since D(0) is zero
Z2 - Add A shifted left by 2 bits
Z3 - Subtract 2 times A shifted left by 4 bits
Z4 - Subtract A shifted left by 6 bits
Again we can easily verify that the steps used were correct:
>>> A=77
>>> B=-92
>>> A*B
-7084
>>> (A<<2) -((2*A)<<4) - (A<<6)
-7084
Signed Radix-16 Booth multiplication
Bit grouping in Radix-16:
Selector table:
a(i+3) |
a(i+2) |
a(i+1) |
a(i) |
a(i-1) |
D(i) |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
0 |
0 |
1 |
1 |
2 |
0 |
0 |
1 |
0 |
0 |
2 |
0 |
0 |
1 |
0 |
1 |
3 |
0 |
0 |
1 |
1 |
0 |
3 |
0 |
0 |
1 |
1 |
1 |
4 |
0 |
1 |
0 |
0 |
0 |
4 |
0 |
1 |
0 |
0 |
1 |
5 |
0 |
1 |
0 |
1 |
0 |
5 |
0 |
1 |
0 |
1 |
1 |
6 |
0 |
1 |
1 |
0 |
0 |
6 |
0 |
1 |
1 |
0 |
1 |
7 |
0 |
1 |
1 |
1 |
0 |
7 |
0 |
1 |
1 |
1 |
1 |
8 |
1 |
0 |
0 |
0 |
0 |
-8 |
1 |
0 |
0 |
0 |
1 |
-7 |
1 |
0 |
0 |
1 |
0 |
-7 |
1 |
0 |
0 |
1 |
1 |
-6 |
1 |
0 |
1 |
0 |
0 |
-6 |
1 |
0 |
1 |
0 |
1 |
-5 |
1 |
0 |
1 |
1 |
0 |
-5 |
1 |
0 |
1 |
1 |
1 |
-4 |
1 |
1 |
0 |
0 |
0 |
-4 |
1 |
1 |
0 |
0 |
1 |
-3 |
1 |
1 |
0 |
1 |
0 |
-3 |
1 |
1 |
0 |
1 |
1 |
-2 |
1 |
1 |
1 |
0 |
0 |
-2 |
1 |
1 |
1 |
0 |
1 |
-1 |
1 |
1 |
1 |
1 |
0 |
-1 |
1 |
1 |
1 |
1 |
1 |
0 |
Encoding the multiplier digits yields in:
i |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
|
Weight |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
|
B |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
|
D(0) |
0 |
1 |
0 |
0 |
0 |
||||
D(4) |
1 |
0 |
1 |
0 |
0 |
||||
D |
-6 |
4 |
Precalculate only interesting variations of the multiplicand A:
i |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Weight |
512 |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
D(i)=4 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
0 |
0 |
D(i)=-6 |
1 |
0 |
0 |
0 |
1 |
1 |
0 |
0 |
1 |
0 |
Multiplying A with encoded values of multiplier yields in:
i |
15 |
14 |
13 |
12 |
11 |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
|
Weight |
32k |
16k |
8k |
4k |
2k |
1k |
512 |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
|
A |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
|||||||||
D |
-6 |
4 |
Expl. |
||||||||||||||
Z1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
0 |
0 |
+4A |
Z1 |
1 |
1 |
1 |
0 |
0 |
0 |
1 |
1 |
0 |
0 |
1 |
0 |
-6A<<4 |
||||
A*B |
1 |
1 |
1 |
0 |
0 |
1 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
1 |
0 |
0 |
Steps explained:
Z1 - Add 4 times A
Z2 - Subtract 6 times A shifted left by 4 bits
Again we can easily verify that the steps used were correct:
>>> A=77
>>> B=-92
>>> A*B
-7084
>>> (4*A)-((6*A)<<4)
-7084
Multiplexer based multiplication
In following multiplexer based solution signed integers are not supported, therefore we have to substitute the variables used earlier.
The operands are two unsigned 8-bit integers:
The multiplication we are looking for is unsigned 16-bit integer:
Multiplexer approach uses muxers to select either one of the operands and bit shift them on each step:
i |
15 |
14 |
13 |
12 |
11 |
10 |
9 |
8 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Weight |
32k |
16k |
8k |
4k |
2k |
1k |
512 |
256 |
128 |
64 |
32 |
16 |
8 |
4 |
2 |
1 |
A(i) |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
||||||||
B(i) |
1 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
||||||||
A(i/2)&B(i/2) |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
0 |
0 |
Z1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
|
Z2 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
||
Z3 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
|||
Z4 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
||||
Z5 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
0 |
1 |
|||||
Z6 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
1 |
0 |
0 |
||||||
Z7 |
0 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
|||||||
Sum |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
1 |
0 |
1 |
0 |
0 |
Steps explained:
A(i/2)&B(i/2) - Spread conjunction bits
Z1 - Fill with zeroes, because A(1) = B(1) = 0
Z2 - Sum bits of A and B ignoring carry bits, because A(2) = B(2) = 1
Z3 - Use bits from B, because A(3) = 1 and B(3) = 0
Z4 - Fill with zeroes, because A(4) = B(4) = 0
Z5 - Sum bits of A and B ignoring carry bits, because A(5) = B(5) = 1
Z6 - Use bits from B, because A(6) = 1 and B(6) = 0
Z7 - Use bits from A, because A(7) = 0 and B(7) = 1