
Multicycle Datapath

As an added bonus, we can eliminate some of the extra hardware from the single-cycle datapath. We will restrict ourselves to using each functional unit once per cycle, just like before. But since instructions require multiple cycles, we could reuse some units in a different cycle during the execution of a single instruction. For example, we could use the same ALU to increment the PC (first clock cycle) and to perform arithmetic operations (third clock cycle).

Proposed execution stages:
1. Instruction fetch and PC increment
2. Reading sources from the register file
3. Performing an ALU computation
4. Reading or writing (data) memory
5. Storing data back to the register file

Two extra adders

Our original single-cycle datapath had an ALU and two adders. The arithmetic-logic unit had two responsibilities: doing an operation on two registers for arithmetic instructions, and adding a register to a sign-extended constant to compute effective addresses for lw and sw instructions. One of the extra adders incremented the PC by computing PC + 4. The other adder computed branch targets, by adding a sign-extended, shifted offset to (PC + 4).

The extra single-cycle adders

[Figure: the single-cycle datapath with the two extra adders highlighted. One adds 4 to the PC; the other adds the sign-extended offset, shifted left by 2, to PC + 4, and PCSrc selects between the two results.]

Our new adder setup

We can eliminate both extra adders in a multicycle datapath, and instead use just one ALU, with multiplexers to select the proper inputs. A 2-to-1 mux ALUSrcA sets the first ALU input to be the PC or a register. A 4-to-1 mux ALUSrcB selects the second ALU input from among: the register file (for arithmetic operations), a constant 4 (to increment the PC), a sign-extended constant (for effective addresses), and a sign-extended and shifted constant (for branch targets).

This permits a single ALU to perform all of the necessary functions: arithmetic operations on two register operands, incrementing the PC, computing effective addresses for lw and sw, and adding a sign-extended, shifted offset to (PC + 4) for branches.
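The two input multiplexers can be sketched in a few lines of Python. This is only an illustrative model, not the hardware itself: the function names are hypothetical, and the select encodings (0 = PC and 1 = register for ALUSrcA; 0 to 3 for ALUSrcB in the order listed above) are an assumption.

```python
def sign_extend16(value):
    """Treat the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_inputs(alu_src_a, alu_src_b, pc, a, b, imm16):
    """Return the (first, second) ALU operands picked by the two muxes."""
    # 2-to-1 mux: assumed encoding 0 = PC, 1 = register A.
    first = pc if alu_src_a == 0 else a
    # 4-to-1 mux: assumed encoding follows the order listed in the text.
    choices = (
        b,                          # 0: register file operand
        4,                          # 1: constant 4
        sign_extend16(imm16),       # 2: sign-extended constant
        sign_extend16(imm16) << 2,  # 3: sign-extended, shifted constant
    )
    return first, choices[alu_src_b]
```

With these encodings, alu_inputs(0, 1, ...) feeds the ALU the operands for PC + 4, while alu_inputs(0, 3, ...) feeds it the operands for a branch target.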

The multicycle adder setup highlighted

[Figure: the multicycle datapath with the single ALU and its two input muxes, ALUSrcA and ALUSrcB, highlighted.]

Eliminating a memory

Similarly, we can get by with one unified memory, which will store both program instructions and data (a Princeton architecture). This memory is used in both the instruction fetch and data access stages, and the address could come from either the PC (when we're fetching an instruction) or the ALU output (for the effective address of a lw or sw). We add another 2-to-1 mux, IorD, to decide whether the memory is being accessed for instructions or for data.

Proposed execution stages:
1. Instruction fetch and PC increment
2. Reading sources from the register file
3. Performing an ALU computation
4. Reading or writing (data) memory
5. Storing data back to the register file
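As a rough illustration of the IorD mux, the sketch below models the unified memory as a Python dictionary. The function name, the encoding (0 = PC, 1 = ALU output) and the memory contents are assumptions made for the example.

```python
def memory_address(ior_d, pc, alu_out):
    """IorD mux: assumed encoding 0 = PC (instruction), 1 = ALU output (data)."""
    return pc if ior_d == 0 else alu_out


# One unified store holding both instructions and data (example contents).
memory = {0x0000: 0x8FA40010, 0x7FF0: 1234}

fetched = memory[memory_address(0, pc=0x0000, alu_out=0x7FF0)]  # instruction access
loaded = memory[memory_address(1, pc=0x0000, alu_out=0x7FF0)]   # data access
```

The same memory object serves both accesses; only the address source changes.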

The new memory setup highlighted

[Figure: the multicycle datapath with the unified memory and the IorD address mux highlighted.]

Intermediate registers

Sometimes we need the output of a functional unit in a later clock cycle during the execution of one instruction. The instruction word fetched in stage 1 determines the destination of the register write in stage 5. The ALU result for an address computation in stage 3 is needed as the memory address for lw or sw in stage 4. These outputs will have to be stored in intermediate registers for future use. Otherwise they would probably be lost by the next clock cycle.

The instruction read in stage 1 is saved in the Instruction register. Register file outputs from stage 2 are saved in registers A and B. The ALU output will be stored in a register ALUOut. Any data fetched from memory in stage 4 is kept in the Memory data register, also called MDR.

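The final multicycle datapath

[Figure: the complete multicycle datapath, showing the PC, the unified memory with the IorD mux, the instruction register IR (fields [31-26], [25-21], [20-16], [15-11], [15-0]), the memory data register, the register file, intermediate registers A and B, the sign-extend and shift-left-2 units, the ALU with muxes ALUSrcA and ALUSrcB, ALUOut, and the RegDst, MemToReg and PCSource muxes.]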

Register write control signals

We have to add a few more control signals to the datapath. Since instructions now take a variable number of cycles to execute, we cannot update the PC on each cycle. Instead, a PCWrite signal controls the loading of the PC. The instruction register also has a write signal, IRWrite. We need to keep the instruction word for the duration of its execution, and must explicitly re-load the instruction register when needed. The other intermediate registers, MDR, A, B and ALUOut, will store data for only one clock cycle at most, and do not need write control signals.
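The difference between the two kinds of register can be sketched as follows. The Register class and its clock method are hypothetical names for this illustration; registers such as ALUOut are modeled by simply leaving the enable at its default.

```python
class Register:
    """A register that loads its input on a clock edge only when enabled."""

    def __init__(self, value=0):
        self.value = value

    def clock(self, data_in, write_enable=True):
        if write_enable:
            self.value = data_in
        return self.value


pc, ir, alu_out = Register(), Register(), Register()

# Cycle 1 (fetch): PCWrite and IRWrite are asserted.
ir.clock(0x012A4820, write_enable=True)
pc.clock(pc.value + 4, write_enable=True)

# Cycle 2: PCWrite and IRWrite are off, so PC and IR hold their values,
# while ALUOut (no write control) loads whatever the ALU produced.
ir.clock(0, write_enable=False)
pc.clock(999, write_enable=False)
alu_out.clock(42)
```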

Summary of Multicycle Datapath

A single-cycle CPU has two main disadvantages: the cycle time is limited by the worst case latency, and it requires more hardware than necessary. A multicycle processor splits instruction execution into several stages. Instructions only execute as many stages as required. Each stage is relatively simple, so the clock cycle time is reduced. Functional units can be reused on different cycles.

We made several modifications to the single-cycle datapath. The two extra adders and one memory were removed. Multiplexers were inserted so the ALU and memory can be used for different purposes in different execution stages. New registers are needed to store intermediate results. Next time, we'll look at controlling this datapath.

Controlling the multicycle datapath

Now we talk about how to control this datapath.

[Figure: the final multicycle datapath, repeated.]

Multicycle control unit

The control unit is responsible for producing all of the control signals. Each instruction requires a sequence of control signals, generated over multiple clock cycles. This implies that we need a state machine. The datapath control signals will be outputs of the state machine. Different instructions require different sequences of steps. This implies the instruction word is an input to the state machine. The next state depends upon the exact instruction being executed. After we finish executing one instruction, we'll have to repeat the entire process again to execute the next instruction.

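Finite-state machine for the control unit

[Figure: state diagram of the control unit.] Each bubble is a state, which holds the control signals for a single cycle. Note: all instructions do the same things during the first two cycles.

States and transitions shown in the diagram:
Instruction fetch and PC increment -> Register fetch and branch computation
Register fetch and branch computation -> R-type execution (Op = R-type) -> R-type writeback
Register fetch and branch computation -> Branch completion (Op = BEQ)
Register fetch and branch computation -> Effective address computation (Op = LW/SW)
Effective address computation -> Memory write (Op = SW)
Effective address computation -> Memory read (Op = LW) -> Register write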

Stage 1: Instruction Fetch

Stage 1 includes two actions which use two separate functional units: the memory and the ALU.

Fetch the instruction from memory and store it in IR.
    IR = Mem[PC]
Use the ALU to increment the PC by 4.
    PC = PC + 4


Stage 1: Instruction fetch and PC increment

[Figure: the multicycle datapath with the stage 1 paths highlighted. The memory path is labeled IR = Mem[PC] and the ALU path is labeled PC = PC + 4.]

Stage 1 control signals

Instruction fetch: IR = Mem[PC]

Signal     Value   Description
MemRead    1       Read from memory
IorD       0       Use PC as the memory read address
IRWrite    1       Save memory contents to instruction register

Increment the PC: PC = PC + 4

Signal     Value   Description
ALUSrcA    0       Use PC as the first ALU operand
ALUSrcB    1       Use constant 4 as the second ALU operand
ALUOp      ADD     Perform addition
PCWrite    1       Change PC
PCSource   0       Update PC from the ALU output

We'll assume that all control signals not listed are implicitly set to 0.
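A minimal sketch of stage 1 in Python, assuming a dictionary-based memory. The names instruction_fetch, STAGE1_SIGNALS and signal are hypothetical, and the signal values use the common textbook encoding, which is an assumption here.

```python
# Assumed encodings; any signal not listed is treated as 0.
STAGE1_SIGNALS = {
    "MemRead": 1, "IorD": 0, "IRWrite": 1,
    "ALUSrcA": 0, "ALUSrcB": 1, "ALUOp": "ADD",
    "PCWrite": 1, "PCSource": 0,
}


def signal(name):
    """Look up a stage 1 control signal, defaulting to 0 when not listed."""
    return STAGE1_SIGNALS.get(name, 0)


def instruction_fetch(pc, memory):
    """Return (IR, new PC); both are computed from the PC before the clock edge."""
    ir = memory[pc]   # memory unit: IR = Mem[PC]
    new_pc = pc + 4   # ALU: PC = PC + 4
    return ir, new_pc


memory = {0x400: 0x012A4820, 0x404: 0x8FA40010}  # example instruction words
ir, pc = instruction_fetch(0x400, memory)
```

Both actions happen in the same cycle because they use different functional units.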

Stage 2: Read registers

Stage 2 is much simpler. Read the contents of source registers rs and rt, and store them in the intermediate registers A and B. (Remember the rs and rt fields come from the instruction register IR.)
    A = Reg[IR[25-21]]
    B = Reg[IR[20-16]]
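The field extraction behind these two register reads can be sketched like this. The helper names bits and register_fetch are hypothetical, and the register contents are made-up example values.

```python
def bits(word, hi, lo):
    """Extract bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def register_fetch(ir, regs):
    """Stage 2: return (A, B), the contents of source registers rs and rt."""
    rs = bits(ir, 25, 21)
    rt = bits(ir, 20, 16)
    return regs[rs], regs[rt]


# Encoding of add $t1, $t1, $t2 ($t1 is register 9, $t2 is register 10).
ir = (9 << 21) | (10 << 16) | (9 << 11) | 0x20
regs = [0] * 32
regs[9], regs[10] = 5, 7   # example register contents
a, b = register_fetch(ir, regs)
```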

Stage 2: Register File Read

[Figure: the multicycle datapath with the register file read path into intermediate registers A and B highlighted.]

Executing Arithmetic Instructions: Stages 3 & 4

We'll start with R-type instructions like add $t1, $t1, $t2.

Stage 3 for an arithmetic instruction is simply ALU computation.
    ALUOut = A op B
A and B are the intermediate registers holding the source operands. The ALU operation is determined by the instruction's func field and could be one of add, sub, and, or, slt.

Stage 4, the final R-type stage, is to store the ALU result generated in the previous cycle into the destination register rd.
    Reg[IR[15-11]] = ALUOut

Stage 3 (R-type): instruction execution

[Figure: the multicycle datapath with the stage 3 paths highlighted. Annotations: "Do some computation on two source registers" and "Save the result in ALUOut".]

Stage 4 (R-type): write back

[Figure: the multicycle datapath with the stage 4 paths highlighted. Annotations: "Take the ALU result from the last cycle..." and "...and store it to rd".]

Stages 3-4 (R-type) control signals

Stage 3 (execution): ALUOut = A op B

Signal     Value   Description
ALUSrcA    1       Use A as the first ALU operand
ALUSrcB    0       Use B as the second ALU operand
ALUOp      func    Do the operation specified in the func field

Stage 4 (writeback): Reg[IR[15-11]] = ALUOut

Signal     Value   Description
RegWrite   1       Write to the register file
RegDst     1       Use field rd as the destination register
MemToReg   0       ALUOut contains the data to write
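The two R-type stages can be sketched in Python as below. This is a simplified model: the operation is named by a string instead of being decoded from the func field, 32-bit overflow is not modeled, and all function names are hypothetical.

```python
ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "slt": lambda a, b: 1 if a < b else 0,
}


def r_type_stage3(a, b, op):
    """Stage 3: ALUOut = A op B."""
    return ALU_OPS[op](a, b)


def r_type_stage4(regs, ir, alu_out):
    """Stage 4: Reg[IR[15-11]] = ALUOut."""
    rd = (ir >> 11) & 0x1F
    regs[rd] = alu_out


regs = [0] * 32
regs[9], regs[10] = 5, 7                         # example contents of $t1, $t2
ir = (9 << 21) | (10 << 16) | (9 << 11) | 0x20   # add $t1, $t1, $t2
alu_out = r_type_stage3(regs[9], regs[10], "add")
r_type_stage4(regs, ir, alu_out)
```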

Executing a beq instruction

We can execute a branch instruction in three stages or clock cycles. But it requires a little cleverness.

Stage 1 involves instruction fetch and PC increment.
    IR = Mem[PC]
    PC = PC + 4
Stage 2 is register fetch and branch target computation.
    A = Reg[IR[25-21]]
    B = Reg[IR[20-16]]
Stage 3 is the final cycle needed for executing a branch instruction. Assuming we have the branch target available:
    if (A == B) then PC = branch_target

When should we compute the branch target?

We need the ALU to do the computation. When is the ALU not busy?

Cycle   ALU activity
1       PC = PC + 4
2       Not busy: compute the branch target here
3       Comparing A & B

Optimistic execution

But we don't know whether or not the branch is taken in cycle 2! That's okay; we can still go ahead and compute the branch target first. The book calls this optimistic execution. The ALU is otherwise free during this clock cycle. Nothing is harmed by doing the computation early. If the branch is not taken, we can just ignore the ALU result.

This idea is also used in more advanced CPU design techniques. Modern CPUs perform branch prediction, which we'll discuss in a few lectures in the context of pipelining.

Stage 2 Revisited: Compute the branch target

To Stage 2, we'll add the computation of the branch target. Compute the branch target address by adding the new PC (the original PC + 4) to the sign-extended, shifted constant from IR.
    ALUOut = PC + (sign-extend(IR[15-0]) << 2)
We save the target address in ALUOut for now, since we don't know yet if the branch should be taken. What about R-type instructions that always go to PC+4?

Stage 2: Register fetch & branch target computation

[Figure: the multicycle datapath with the stage 2 paths highlighted. Annotations: "Read source registers" and "Compute branch target address".]

Stage 2 control signals

No control signals need to be set for the register reading operations A = Reg[IR[25-21]] and B = Reg[IR[20-16]]. IR[25-21] and IR[20-16] are already applied to the register file. Registers A and B are already written on every clock cycle.

Branch target computation: ALUOut = PC + (sign-extend(IR[15-0]) << 2)

Signal     Value   Description
ALUSrcA    0       Use PC as the first ALU operand
ALUSrcB    3       Use (sign-extend(IR[15-0]) << 2) as second operand
ALUOp      ADD     Add and save the result in ALUOut

ALUOut is also written automatically on each clock cycle.
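To make the branch target formula concrete, here is a small worked sketch. The function names are hypothetical, the result is wrapped to 32 bits, and the PC and offset values are arbitrary examples.

```python
def sign_extend16(value):
    """Treat the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc_plus_4, ir):
    """ALUOut = PC + (sign-extend(IR[15-0]) << 2), with PC already incremented."""
    offset = sign_extend16(ir & 0xFFFF)
    return (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF


forward = branch_target(0x1004, 0x0003)    # offset +3 words
backward = branch_target(0x1004, 0xFFFE)   # offset -2 words
```

Here forward is 0x1010 (0x1004 + 12) and backward is 0x0FFC (0x1004 - 8).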

Branch completion

Stage 3 is the final cycle needed for executing a branch instruction.
    if (A == B) then PC = ALUOut
Remember that A and B are compared by subtracting and testing for a result of 0, so we must use the ALU again in this stage.

Stage 3 (beq): Branch completion

[Figure: the multicycle datapath with the stage 3 branch paths highlighted. Annotations: "Check for equality of register contents" and "Use the target address computed in stage 2".]

Stage 3 (beq) control signals

Comparison: if (A == B)...

Signal     Value   Description
ALUSrcA    1       Use A as the first ALU operand
ALUSrcB    0       Use B as the second ALU operand
ALUOp      SUB     Subtract, so Zero will be set if A = B

Branch: ...then PC = ALUOut

Signal     Value   Description
PCWrite    Zero    Change PC only if Zero is true (i.e., A = B)
PCSource   1       Update PC from the ALUOut register

ALUOut contains the ALU result from the previous cycle, which would be the branch target. We can write that to the PC, even though the ALU is doing something different (comparing A and B) during the current cycle.
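Branch completion can be sketched as a single function. The name branch_completion is hypothetical; the point is that the ALU performs the subtraction while the target comes from ALUOut, which was filled in the previous cycle.

```python
def branch_completion(pc, a, b, alu_out):
    """Stage 3 of beq: return the next PC."""
    zero = (a - b) == 0    # ALU subtracts; Zero is set when A equals B
    # PCWrite is driven by Zero; the PC is loaded from ALUOut when it is set.
    return alu_out if zero else pc


taken = branch_completion(pc=0x1004, a=5, b=5, alu_out=0x1010)
not_taken = branch_completion(pc=0x1004, a=5, b=6, alu_out=0x1010)
```

When the registers differ, the PC keeps the PC + 4 value written in stage 1 and the precomputed target is ignored.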

Executing a sw instruction

A store instruction, like sw $a0, 16($sp), also shares the same first two stages as the other instructions.
Stage 1: instruction fetch and PC increment.
Stage 2: register fetch and branch target computation.

Stage 3 computes the effective memory address using the ALU.
    ALUOut = A + sign-extend(IR[15-0])
A contains the base register (like $sp), and IR[15-0] is the 16-bit constant offset from the instruction word, which is not shifted.

Stage 4 saves the register contents (here, $a0) into memory.
    Mem[ALUOut] = B
Remember that the second source register rt was already read in Stage 2 (and again in Stage 3), and its contents are in intermediate register B.

Stage 3 (sw): effective address computation

[Figure: the multicycle datapath with the stage 3 paths highlighted. Annotation: "Compute an effective address and store it in ALUOut".]

Stage 4 (sw): memory write

[Figure: the multicycle datapath with the stage 4 paths highlighted. Annotations: "Use the effective address from stage 3...", "...to store data from one of the registers..." and "...into memory."]

Stages 3-4 (sw) control signals

Stage 3 (address computation): ALUOut = A + sign-extend(IR[15-0])

Signal     Value   Description
ALUSrcA    1       Use A as the first ALU operand
ALUSrcB    2       Use sign-extend(IR[15-0]) as the second operand
ALUOp      ADD     Add and store the resulting address in ALUOut

Stage 4 (memory write): Mem[ALUOut] = B

Signal     Value   Description
MemWrite   1       Write to the memory
IorD       1       Use ALUOut as the memory address

The memory's data input always comes from the B intermediate register, so no selection is needed.
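The two sw stages can be illustrated with the example instruction sw $a0, 16($sp). This is a sketch with hypothetical function names; the register contents are made up, and memory is modeled as a dictionary keyed by address.

```python
def sign_extend16(value):
    """Treat the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sw_stage3(a, ir):
    """Stage 3: ALUOut = A + sign-extend(IR[15-0]); the offset is not shifted."""
    return a + sign_extend16(ir & 0xFFFF)


def sw_stage4(memory, alu_out, b):
    """Stage 4: Mem[ALUOut] = B."""
    memory[alu_out] = b


regs = [0] * 32
regs[29], regs[4] = 0x7FF0, 99    # example contents of $sp (29) and $a0 (4)
ir = (0x2B << 26) | (29 << 21) | (4 << 16) | 16   # sw $a0, 16($sp)
a, b = regs[29], regs[4]          # read in stage 2
alu_out = sw_stage3(a, ir)
memory = {}
sw_stage4(memory, alu_out, b)
```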

Executing a lw instruction

Finally, lw is the most complex instruction, requiring five stages. The first two are like all the other instructions.
Stage 1: instruction fetch and PC increment.
Stage 2: register fetch and branch target computation.

The third stage is the same as for sw, since we have to compute an effective memory address in both cases.
Stage 3: compute the effective memory address.

Stages 4-5 (lw): memory read and register write

Stage 4 is to read from the effective memory address, and to store the value in the intermediate register MDR (memory data register).
    MDR = Mem[ALUOut]
Stage 5 stores the contents of MDR into the destination register.
    Reg[IR[20-16]] = MDR
Remember that the destination register for lw is field rt (bits 20-16) and not field rd (bits 15-11).

Stage 4 (lw): memory read

[Figure: the multicycle datapath with the stage 4 paths highlighted. Annotations: "Use the effective address from stage 3...", "...to read data from memory..." and "...into MDR."]

Stage 5 (lw): register write

[Figure: the multicycle datapath with the stage 5 paths highlighted. Annotations: "Take MDR..." and "...and store it in rt."]

Stages 4-5 (lw) control signals

Stage 4 (memory read): MDR = Mem[ALUOut]

Signal     Value   Description
MemRead    1       Read from memory
IorD       1       Use ALUOut as the memory address

The memory contents will be automatically written to MDR.

Stage 5 (writeback): Reg[IR[20-16]] = MDR

Signal     Value   Description
RegWrite   1       Store new data in the register file
RegDst     0       Use field rt as the destination register
MemToReg   1       Write data from MDR (from memory)
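A matching sketch for the last two lw stages, using lw $a0, 16($sp) as a made-up example. The function names are hypothetical and the effective address is assumed to have been computed in stage 3.

```python
def lw_stage4(memory, alu_out):
    """Stage 4: MDR = Mem[ALUOut]."""
    return memory[alu_out]


def lw_stage5(regs, ir, mdr):
    """Stage 5: Reg[IR[20-16]] = MDR; the destination is field rt, not rd."""
    rt = (ir >> 16) & 0x1F
    regs[rt] = mdr


memory = {0x8000: 99}             # example data word
regs = [0] * 32
ir = (0x23 << 26) | (29 << 21) | (4 << 16) | 16   # lw $a0, 16($sp)
alu_out = 0x8000                  # effective address from stage 3 (assumed)
mdr = lw_stage4(memory, alu_out)
lw_stage5(regs, ir, mdr)
```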

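Finite-state machine for the control unit

[Figure: the state diagram again, now annotated with the control signals set in each state.]

State                                    Control signals
Instruction fetch and PC increment       IorD = 0, MemRead = 1, IRWrite = 1, ALUSrcA = 0, ALUSrcB = 1, ALUOp = ADD, PCSource = 0, PCWrite = 1
Register fetch and branch computation    ALUSrcA = 0, ALUSrcB = 3, ALUOp = ADD
R-type execution (Op = R-type)           ALUSrcA = 1, ALUSrcB = 0, ALUOp = func
R-type writeback                         RegWrite = 1, RegDst = 1, MemToReg = 0
Branch completion (Op = BEQ)             ALUSrcA = 1, ALUSrcB = 0, ALUOp = SUB, PCWrite = Zero, PCSource = 1
Effective address computation (LW/SW)    ALUSrcA = 1, ALUSrcB = 2, ALUOp = ADD
Memory write (Op = SW)                   MemWrite = 1, IorD = 1
Memory read (Op = LW)                    MemRead = 1, IorD = 1
Register write                           RegWrite = 1, RegDst = 0, MemToReg = 1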

Implementing the FSM

This can be translated into a state table; here are the first two states. The columns from PCWrite onward are the outputs (control signals), and X marks a don't-care value.

Current State  Input (Op)  Next State        PCWrite  IorD  MemRead  MemWrite  IRWrite  RegDst  MemToReg  RegWrite  ALUSrcA  ALUSrcB  ALUOp  PCSource
Instr Fetch    X           Reg Fetch         1        0     1        0         1        X       X         0         0        1        ADD    0
Reg Fetch      BEQ         Branch compl      0        X     0        0         0        X       X         0         0        3        ADD    X
Reg Fetch      R-type      R-type execute    0        X     0        0         0        X       X         0         0        3        ADD    X
Reg Fetch      LW/SW       Compute eff addr  0        X     0        0         0        X       X         0         0        3        ADD    X

You can implement this the hard way. Represent the current state using flip-flops or a register. Find equations for the next state and (control signal) outputs in terms of the current state and input (instruction word).

Or you can use the easy way. Stick the whole state table into a memory, like a ROM. This would be much easier, since you don't have to derive equations.
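The "easy way" can be imitated in software with a lookup table standing in for the ROM. The sketch below covers only the next-state column; the state labels, the "*" wildcard and the function names are hypothetical choices for this illustration.

```python
# (current state, opcode class) -> next state; "*" matches any opcode.
NEXT_STATE = {
    ("InstrFetch", "*"): "RegFetch",
    ("RegFetch", "R-type"): "RExecute",
    ("RegFetch", "BEQ"): "BranchCompletion",
    ("RegFetch", "LW"): "EffAddr",
    ("RegFetch", "SW"): "EffAddr",
    ("RExecute", "*"): "RWriteback",
    ("RWriteback", "*"): "InstrFetch",
    ("BranchCompletion", "*"): "InstrFetch",
    ("EffAddr", "LW"): "MemRead",
    ("EffAddr", "SW"): "MemWrite",
    ("MemRead", "*"): "RegWrite",
    ("MemWrite", "*"): "InstrFetch",
    ("RegWrite", "*"): "InstrFetch",
}


def next_state(state, op):
    """Look up the next state, trying the exact opcode before the wildcard."""
    result = NEXT_STATE.get((state, op), NEXT_STATE.get((state, "*")))
    if result is None:
        raise ValueError(f"no transition from {state} for {op}")
    return result


def cycles(op):
    """Count the clock cycles one instruction spends before the next fetch."""
    state, count = "InstrFetch", 0
    while True:
        state = next_state(state, op)
        count += 1
        if state == "InstrFetch":
            return count
```

Walking the table reproduces the cycle counts described earlier: three for beq, four for R-type and sw, and five for lw.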

Summary

Now you know how to build a multicycle controller! Each instruction takes several cycles to execute. Different instructions require different control signals and a different number of cycles. We have to provide the control signals in the right sequence.