Construction of an ALU.


The picocomputer has a very primitive ALU. It consisted of just an Adder with the ability to subtract using 2’s compliment arithmetic. There was no Logic in it at all!


The symbol for an ALU is normally written:


Where the Function input selects the desired combination of the two inputs.


If the Function input consists of 2 control lines, 4 functions are possible:

eg:

00 Add C = A + B

01 Subtract C = A - B

10 And C = A and B

11 Or C = A or B


If we increase the number to 3 control lines then further functions (8 in total) are possible - eg Xor

ALU’s are constructed by creating each of the functions separately, and then using a multiplexor to select the desired output. The Function input is used for the input address lines of the mux.

This ALU can be constructed using a 74153 4-input mux, a 7483 4-bit adder, and and or gates. The subtract function needs a little work. It has to generate the inverse of B, and the carry-in signal to the adder, as well as being used by the address lines of the mux.



Adders


There is one small problem with designing an n-bit adder. It is normal to design a 1-bit adder and then use n of them to handle n-bits - a ripple-carry adder


The problem is to do with the carry. Every time a signal has to go through a gate, it incurs a small delay. This gate delay is ~5 ns. The adder itself can be made from 2 levels of gates. The first level is an and array, this is then fed into the second level which is an or array.


To add 2 bits together takes 2 gate delays.


If we chain several 1-bit adders together to make an n-bit adder, we have to wait a considerable time for the carry out to propagate through. The carry out takes 2-gate delays for each single adder, and so will take 2*n gate delays to appear at the final bit.


For a 32-bit adder this will increase the time from 10ns to 320ns - 32 times slower!

It is possible to speed up this process. Indeed, it is always possible to design a circuit that has only two levels of logic. But for a 32-bit adder, this is extremely hardware intensive.


Cin1 (the carry in of bit 1 ) =

Cout0 (the carry out of bit 0)


Cout0 = (B0.Cin0) + (A0.Cin0) + (A0.B0)


=>


Cin1 = (B0.Cin0) + (A0.Cin0) + (A0.B0)


Cin2 (the carry in of bit 2) =

Cout1 (the carry out of bit 1)


Cout1 = (B1.Cin1) + (A1.Cin1) + (A1.B1)


= (B1.B0.Cin0) + (B1.A0.Cin0) + (B1.A0.B0)

+ (A1.B0.Cin0) + (A1.A0.Cin0) + (A1.A0.B0)

+ (A1.B1)



This complexity can be simplified somewhat, by substituting


gi = ai.bi


which is true if bit i of the adder generates a carry out independent of carry in.


and

pi = ai+bi


which is true if bit i of the adder propagates a carry in to carry out.



c1 = g0 + (p0.c0)

c2 = g1 + (p1.g0) + (p1.p0.c0)

c3 = g2 + (p2.g1) + (p2.p1.g0) + (p2.p1.p0.c0)

c4 = g3 + (p3.g2) + (p3.p2.g1) + (p3.p2.p1.g0)

+ (p3.p2.p1.p0.c0)


A carry is generated if carry in for some earlier bit is a one, and all intermediate bits propagate a carry. This is called a carry-lookahead adder


Now we have groups of 4-bits with the carry being handled as quickly as the addition.


To extend this scheme to more than 4-bits we can either:


block them together as before and let the carry ripple through


a partial carry-lookahead adder


or


use a higher level carry lookahead on each block of 4-bits


P0 = p3.p2.p1.p0


G0 = g3 + (p3.g2) + (p3.p2.g1) + (p3.p2.p1.g0)


C1 = G0 + (P0.c0)


and similarly for C2, C3 and C4

Multipliers - Booth’s Algorithm


Booth observed that if we allow the ’adder’ to subtract as well as add, we can multiply numbers differently. When he invented this approach, shifting right was quicker than adding. If the multiplier contains sequences of 1’s or 0’s then we can reduce the number of additions.


Instead of just looking at the LOBIT of the shift register, we look at the LOBIT and the 'previous LOBIT'


The new algorithm is now:


Depending on the LOBIT and previous LOBIT do one of the following:


00: Middle of a string of 0’s - do nothing


01: End of a string of 1’s - add the multiplicand to the high order

result


10: Beginning of a string of 1’s - sub multiplicand


11: Middle of a string of 1’s - do nothing


After this perform a one bit arithmetic right shift, and repeat n times.


An ASR shifts 1 bit to the right, but the high order bit inserted is the same as the previous high order bit


eg 0111 => 0011

1100 => 1110


The very first previous LOBIT is 0.

Example of 2 * 6


Multiplicand Step Shift Regs

0010 Initial values 0000 0110 0

1 0010 00: do nothing 0000 0110 0

0010 ASR 0000 0011 0

2 0010 10: sub mcand 1110 0011 0

0010 ASR 1111 0001 1

3 0010 11: do nothing 1111 0001 1

0010 ASR 1111 1000 1

4 0010 01: add mcand 0001 1000 1

0010 ASR 0000 1100 0



One side-effect of Booth’s algorithm is that it handles negative as well as positive numbers.


Using the previous algorithm, it is necessary to remember the initial signs, convert both numbers to be positive, and negate the final result if the two initial signs were different.


159.233 - Lecture 14 - 1