Pipeline Hazards

Using a single memory introduces a structural hazard
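The conflict can be illustrated with a minimal sketch. It assumes a classic 5-stage pipeline (IF, ID, EX, MEM, WB), one instruction entering per cycle, and a single memory port shared by instruction fetch and data access. The function name and the opcode list are illustrative assumptions.

```python
# Sketch: find cycles in which a single-ported memory is needed twice.
# Assumes a 5-stage pipeline (IF, ID, EX, MEM, WB), one instruction
# fetched per cycle and no stalls, so instruction k is fetched in
# cycle k and reaches MEM in cycle k + 3.

MEM_OPS = {"LW", "SW"}  # opcodes that access data memory (assumption)


def structural_conflicts(opcodes):
    """Return (cycle, mem_instr_index, fetched_instr_index) triples."""
    conflicts = []
    for k, op in enumerate(opcodes):
        mem_cycle = k + 3      # cycle in which instruction k is in MEM
        fetched = mem_cycle    # index of the instruction fetched that cycle
        if op in MEM_OPS and fetched < len(opcodes):
            conflicts.append((mem_cycle, k, fetched))
    return conflicts


program = ["LW", "ADD", "SUB", "AND", "OR"]
print(structural_conflicts(program))  # [(3, 0, 3)]
```

In this sketch the load's data access and the fetch of the fourth instruction both need the memory in cycle 3, so one of them has to wait unless instruction and data memories are separate.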

Data Hazard on r1

Option 1: Stall to Resolve Data Hazard


Option 2: Compiler inserts independent instructions

Worst case is NOPs


Option 3: The data we want is already available! Forward it to where it is needed.

HW Change for Forwarding (Bypassing):
 
Forwarding reduces Data Hazard for lw to 1 cycle:
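A rough way to count the stall cycles is sketched below. The model is an assumption, not taken from the slides: a 5-stage pipeline, a register file that is written in the first half of a cycle and read in the second half, and bypass paths from the end of EX (ALU results) and the end of MEM (load results) into the consumer's EX stage. The function `stalls` and its parameters are hypothetical names.

```python
# Stage indices for an assumed 5-stage pipeline.
IF, ID, EX, MEM, WB = range(5)


def stalls(producer, distance, forwarding):
    """Stall cycles for a consumer issued `distance` slots after `producer`.

    producer: "alu" or "load". The producer enters IF in cycle 0.
    """
    if forwarding:
        # Value is ready at the end of EX (ALU) or MEM (load) and is
        # bypassed to the consumer's EX stage in the following cycle.
        ready_cycle = (EX if producer == "alu" else MEM) + 1
        need_stage = EX
    else:
        # Value must pass through the register file: written in WB,
        # read in ID during the same cycle (split-cycle register file).
        ready_cycle = WB
        need_stage = ID
    unstalled_cycle = distance + need_stage
    return max(0, ready_cycle - unstalled_cycle)


for producer in ("alu", "load"):
    for fwd in (False, True):
        label = "forwarding" if fwd else "no forwarding"
        print(producer, label, stalls(producer, distance=1, forwarding=fwd))
```

Under these assumptions a back-to-back ALU dependence needs no stall with forwarding, while a load followed immediately by a use of its result still costs one cycle.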

Software Scheduling to Avoid Load Hazards

e.g. perform the following operations:
a = b + c;
d = e - f;
Assume a, b, c, d, e, and f are in memory.

Slow code:

LW Rb,[b] 
LW Rc,[c] 
ADD Ra,Rb,Rc 
SW [a],Ra
LW Re,[e] 
LW Rf,[f] 
SUB Rd,Re,Rf
SW [d],Rd
Fast code:
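As a sketch of the idea, the Python below encodes the slow sequence and one possible reordering, then counts load-use stalls in each. The reordering shown is an illustrative assumption (other valid schedules exist), as is the model of one stall cycle whenever an instruction uses the result of the load immediately before it.

```python
# Each instruction: (opcode, destination register or None, source registers).
SLOW = [
    ("LW",  "Rb", []),
    ("LW",  "Rc", []),
    ("ADD", "Ra", ["Rb", "Rc"]),
    ("SW",  None, ["Ra"]),
    ("LW",  "Re", []),
    ("LW",  "Rf", []),
    ("SUB", "Rd", ["Re", "Rf"]),
    ("SW",  None, ["Rd"]),
]

# One possible reordering (an assumption): every load is separated from
# the first use of its result by at least one independent instruction.
FAST = [
    ("LW",  "Rb", []),
    ("LW",  "Rc", []),
    ("LW",  "Re", []),
    ("ADD", "Ra", ["Rb", "Rc"]),
    ("LW",  "Rf", []),
    ("SW",  None, ["Ra"]),
    ("SUB", "Rd", ["Re", "Rf"]),
    ("SW",  None, ["Rd"]),
]


def load_use_stalls(program):
    """Count 1-cycle stalls where an instruction reads the result of the
    load immediately before it (forwarding assumed for everything else)."""
    count = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_srcs = cur
        if prev_op == "LW" and prev_dest in cur_srcs:
            count += 1
    return count


print("slow:", load_use_stalls(SLOW), "fast:", load_use_stalls(FAST))
```

With this model the slow order stalls twice (ADD after the load of Rc, SUB after the load of Rf) and the reordered sequence does not stall at all.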

Compiler Statistics:

Control Hazard on Branches

Branch stall impact: 3 cycles if the write happens in cycle 5. However, the branch can be resolved during the Decode cycle.

Branch delay is now 1 clock cycle.
The branch can stall for 1 cycle, or we can use delayed branches.
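The effect of the shorter branch delay on performance can be estimated with a simple CPI calculation. The sketch below assumes an ideal base CPI of 1 and that every branch pays the full penalty. The 30% branch frequency is a hypothetical value chosen for illustration.

```python
def effective_cpi(base_cpi, branch_fraction, branch_penalty_cycles):
    """Average cycles per instruction when every branch stalls."""
    return base_cpi + branch_fraction * branch_penalty_cycles


BRANCH_FRACTION = 0.3  # hypothetical workload mix, for illustration only

late = effective_cpi(1.0, BRANCH_FRACTION, 3)   # 3-cycle branch stall
early = effective_cpi(1.0, BRANCH_FRACTION, 1)  # branch resolved in Decode
print(f"3-cycle penalty: CPI = {late:.2f}")
print(f"1-cycle penalty: CPI = {early:.2f}")
```

Under these assumed numbers, resolving the branch in Decode lowers the CPI from 1.90 to 1.30. A delayed branch can recover some of the remaining cycle when the compiler finds a useful instruction for the delay slot.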

When is pipelining hard?

Hazard Detection

Suppose instruction i is about to be issued and a predecessor instruction j is in the instruction pipeline.

Rregs ( i ) = Registers read by instruction i
Wregs ( i ) = Registers written by instruction i

° A RAW hazard exists on register r if there exists a j such that r ∈ Rregs( i ) ∩ Wregs( j )

- Keep a record of pending writes (for instructions in the pipe) and compare with the operand registers of the current instruction.
- When an instruction issues, reserve its result register.
- When an operation completes, remove its write reservation.
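The bookkeeping described above can be sketched as a small class. The class and method names are hypothetical, and the sketch assumes at most one pending write per register, which holds when all writes happen in order in the same stage.

```python
class PendingWrites:
    """Tracks result registers reserved by instructions still in the pipe."""

    def __init__(self):
        self.reserved = set()

    def has_raw_hazard(self, rregs):
        """True if some register read by the new instruction i lies in
        Rregs(i) ∩ Wregs(j) for a pending instruction j."""
        return bool(set(rregs) & self.reserved)

    def issue(self, wregs):
        """Reserve the result registers of an issuing instruction."""
        self.reserved |= set(wregs)

    def complete(self, wregs):
        """Remove the write reservation when the operation completes."""
        self.reserved -= set(wregs)


pipe = PendingWrites()
pipe.issue({"r1"})                         # e.g. a load into r1 is in flight
print(pipe.has_raw_hazard({"r1", "r2"}))   # instruction reads r1: hazard
pipe.complete({"r1"})
print(pipe.has_raw_hazard({"r1", "r2"}))   # reservation removed: no hazard
```

An issue stage built this way would hold instruction i until `has_raw_hazard` returns False, or let it proceed when a bypass path can supply the operand.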

RISC Pipelines

° All instructions follow same pipeline order ("static schedule").
° Register write in last stage
- Avoid WAW hazards
° All register reads performed in first stage after issue.
- Avoid WAR hazards
° Memory access in stage 4
- Avoid all memory hazards
° Control hazards resolved by delayed branch.
° RAW hazards resolved by bypass, except on load results which are resolved by delayed load or stall.

Substantial pipelining with very little cost or complexity.