59.304 Assignment 3

Due 17th October 1997 - Your coursework mark will be the best two of the three assignments. Please do the assignment though because it could be good revision (hint).
There will probably be a penalty for late submission.

This assignment is concerned with the design of a forwarding unit for a 16 bit pipelined CPU.
The CPU has 4 general purpose registers (r0, r1, r2, r3) plus PC, MAR, FLAGS and IR. The pipeline consists of five cycles: Fetch, Decode, Execute, Memory and Write.
The instruction set consists of the following instructions. 
ADD rd,rs,rt       ADD rd,rs,imm6
SUB rd,rs,rt       SUB rd,rs,imm6
AND rd,rs,rt       AND rd,rs,imm6
OR rd,rs,rt        OR rd,rs,imm6
XOR rd,rs,rt       XOR rd,rs,imm6
LW rd,[rs+imm9]    SW rd,[rs+imm9]
B dest             BE dest
BGT dest           BLT dest
rd is the destination register (except for SW, where it is the source) and rs and rt are source registers. imm6, imm9 and dest are signed constants.
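Since imm6, imm9 and dest are signed, a field extracted from the instruction word has to be sign-extended before it is used in 16 bit arithmetic. The following is a minimal sketch of that step in C. The function name sign_extend and its parameters are made up for illustration and are not part of asm.c or the supplied design.

```c
#include <stdint.h>

/*
 * Sign-extend the low `bits` bits of `field` to a signed 16 bit value.
 * Hypothetical helper for illustration; assumes 1 <= bits <= 16.
 */
int16_t sign_extend(uint16_t field, unsigned bits)
{
    uint16_t mask = (uint16_t)((1u << bits) - 1u);   /* keep only the field  */
    uint16_t sign = (uint16_t)(1u << (bits - 1u));   /* top bit of the field */
    uint16_t v = field & mask;

    /* Flipping the sign bit and subtracting it maps the field onto
       the range -2^(bits-1) .. 2^(bits-1)-1. */
    return (int16_t)((int)(v ^ sign) - (int)sign);
}
```

For example, sign_extend(0x3F, 6) gives -1 and sign_extend(0x100, 9) gives -256.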
B always branches (relative to PC+1), BE branches if the zero flag is set, BLT branches if the carry flag is set and BGT branches if the carry flag is clear. The instructions are coded into a 16 bit word as follows (lsb on right).
0 = ALU    3 reg operands
1 = ALUi   2 regs and an immediate
2 = LW     load word from memory
3 = SW     store word to memory
4 = B      branch
5 = BE     branch if equal
6 = BLT    branch if less than
7 = BGT    branch if greater than
[Instruction format diagram not reproduced here; the only field label that survives is "rt (2)".]

The possible values for op are: 0 = ADD, 1 = SUB, 2 = AND, 3 = OR, 4 = XOR.
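The branch rules and instruction codes above can be written down as a small behavioural model, which may be handy for checking test programs by hand. This is only a sketch: the names insn_type, flags, branch_taken and next_pc are invented for illustration, and the 16 bit wrap-around of the PC is an assumption, not something taken from stallcpu.lgf.

```c
#include <stdbool.h>
#include <stdint.h>

/* Instruction codes 0..7 as listed in the handout. */
enum insn_type { T_ALU = 0, T_ALUI = 1, T_LW = 2, T_SW = 3,
                 T_B = 4, T_BE = 5, T_BLT = 6, T_BGT = 7 };

/* Only the two flags the branches look at. */
struct flags { bool zero; bool carry; };

/* True if an instruction of this type branches with these flags. */
bool branch_taken(enum insn_type type, struct flags f)
{
    switch (type) {
    case T_B:   return true;      /* always branches       */
    case T_BE:  return f.zero;    /* zero flag set         */
    case T_BLT: return f.carry;   /* carry flag set        */
    case T_BGT: return !f.carry;  /* carry flag clear      */
    default:    return false;     /* not a branch at all   */
    }
}

/* Next PC; dest is the already sign-extended offset, relative to PC+1.
   Assumption: the PC wraps modulo 2^16. */
uint16_t next_pc(uint16_t pc, enum insn_type type, struct flags f, int16_t dest)
{
    uint16_t fallthrough = (uint16_t)(pc + 1u);
    if (branch_taken(type, f))
        return (uint16_t)(fallthrough + (uint16_t)dest);
    return fallthrough;
}
```

For example, a B at address 10 with dest = -3 gives a next PC of 8, while a BE at address 10 with the zero flag clear simply falls through to 11.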

The flags are set by all the ALU operations.
The file stallcpu.lgf on the H: drive contains a working implementation of this CPU. To work with it you also need the file cpu.gat, and you must add the line "gates  + cpu.gat" to your log.cnf file. cpu.gat contains two useful devices: a 16 bit ALU and a 16 bit register. Instead of using multiplexers this design uses tri-state devices; the "oe" pin on a register floats the register outputs when it is zero. The implementation contains a hardwired control unit (no microcode!) and a hazard detection unit. When a hazard is detected the pipeline is stalled.
Your task is to change the hardware so that rather than stall the pipeline, data is forwarded to the ALU.  The design already has the necessary routes in the datapath for simple forwarding.
nfwdalutoa puts the output from the previous ALU operation onto the "a" input of the ALU.
nfwdmemtoa puts the value currently in the datapath for the "mem" cycle onto the "a" input of the ALU.
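As a way of thinking about when each route should be used, the choice for the "a" input can be modelled in software before it is drawn as gates. The sketch below is a generic forwarding rule, not the logic in stallcpu.lgf. It makes several assumptions: that the "a" input carries rs, that the "alu" route holds the result of the instruction one ahead and the "mem" route the result of the instruction two ahead, and that a loaded value is not yet available on the "alu" route. All names (stage_info, a_source, select_a_input) are hypothetical, and the polarity of the real control signals is not modelled.

```c
#include <stdbool.h>

/* Hypothetical summary of an instruction further down the pipeline. */
struct stage_info {
    bool writes_reg;  /* true if it will write a register (not SW or a branch) */
    bool is_load;     /* true for LW: its ALU output is an address, not data   */
    unsigned rd;      /* destination register number, 0..3                     */
};

/* Where the "a" input of the ALU should take its value from. */
enum a_source { A_FROM_REGFILE, A_FROM_ALU, A_FROM_MEM, A_STALL };

/*
 * rs    : source register read by the instruction about to execute
 * newer : the instruction one ahead of it (assumed to feed the "alu" route)
 * older : the instruction two ahead of it (assumed to feed the "mem" route)
 */
enum a_source select_a_input(unsigned rs, struct stage_info newer,
                             struct stage_info older)
{
    /* The most recent writer of rs wins, so check the newer one first. */
    if (newer.writes_reg && newer.rd == rs)
        return newer.is_load ? A_STALL : A_FROM_ALU;

    if (older.writes_reg && older.rd == rs)
        return A_FROM_MEM;

    /* No pending write to rs: the register file value is current. */
    return A_FROM_REGFILE;
}
```

Checking the newer instruction first matters when both instructions ahead write the same register, since only the later write is the value the program expects to read.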
The faster you can make the CPU the more marks you will get. You can do this assignment in stages: first try to work out nfwdalutoa, then try nfwdmemtoa. After this there are still some stalls that can be eliminated. The program run by the CPU is called prog.asm, and there is also a very simple assembler, asm.c, so that you can write your own test programs. Some marks will be awarded for interesting test programs.

As before, submit a single file containing your design.  No paper submissions will be accepted (but you can write on the diagram by pressing the 'l' key).

M Johnson 1997