4.13. An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations

Size: px
Start display at page:

Download "4.13. An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations"

Transcription

1 .3 An Introdction to Digital Design Using a Hardware Design Langage to Describe and odel a Pipeline and ore Pipelining Illstrations This online section covers hardware description langages and then gives a dozen eamples of pipeline diagrams, starting on page.3-8. As mentioned in Appi B, Verilog can describe processors for simlation or with the intention that the Verilog specification be synthesized. To achieve acceptable synthesis s in size and speed, and a behavioral specification inted for synthesis mst careflly delineate the highly combinational portions of the design, sch as a path, from the control. The path can then be synthesized sing available libraries. A Verilog specification inted for synthesis is sally longer and more comple. We start with a behavioral model of the five-stage pipeline. To illstrate the dichotomy between behavioral and synthesizable designs, we then give two Verilog descriptions of a mltiple-cycle-per-instrction IPS processor: one inted solely for simlations and one sitable for synthesis. This IPS architectre is satisfactory to demonstrate Verilog, bt if any reader bilds one for LEGv8 and wants to share it with others, please contact the pblisher. Using Verilog for Behavioral Specification with Simlation for the Five-Stage Pipeline Figre.3. shows a Verilog behavioral description of the pipeline that handles instrctions as well as loads and stores. It does not accommodate branches (even incorrectly!), which we postpone inclding ntil later in the chapter. Becase Verilog lacks the ability to define registers with named fields sch as strctres in C, we se several indepent registers for each pipeline register. We name these registers with a prefi sing the same convention; hence, IFIDIR is the IR portion of the IFID pipeline register. This version is a behavioral description not inted for synthesis. s take the same nmber of clock cycles as or hardware design, bt the control is done in a simpler fashion by repeatedly decoding fields of the instrction in each pipe stage. Becase of this difference, the instrction register (IR) is needed throghot the pipeline, and the entire IR is passed from pipe stage to pipe stage. As yo read the Verilog descriptions in this chapter, remember that the actions in the always block all occr in parallel on every clock cycle. Since there are no blocking assignments, the order of the events within the always block is arbitrary.

2 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-3 modle CPU (clock); // opcodes parameter LW = 6 b, SW = 6 b, BEQ = 6 b, no-op = 3 b_, op = 6 b; inpt clock; reg[3:], Regs[:3], Iemory[:3], Demory[:3], // separate memories IFIDIR, IDA, IDB, IDIR, EIR, EB, // pipeline registers EOt, EVale, EIR; // pipeline registers wire [:] IDrs, IDrt, Erd, Erd, Ert; // Access register fields wire [5:] Eop, Eop, IDop; // Access opcodes wire [3:] Ain, Bin; // the inpts // These assignments define fields from the pipeline registers assign IDrs = IDIR[5:]; // rs field assign IDrt = IDIR[:6]; // rt field assign Erd = EIR[5:]; // rd field assign Erd = EIR[5:]; //rd field assign Ert = EIR[:6]; //rt field--sed for loads assign Eop = EIR[3:6]; // the opcode assign Eop = EIR[3:6]; // the opcode assign IDop = IDIR[3:6]; // the opcode // Inpts to the come directly from the pipeline registers assign Ain = IDA; assign Bin = IDB; reg [5:] i; //sed to initialize registers initial begin = ; IFIDIR = no-op; IDIR = no-op; EIR = no-op; EIR = no-op; // pt no-ops in pipeline registers for (i=;i<=3;i=i+) Regs[i] = i; //initialize registers--jst so they aren t cares (posedge clock) begin // Remember that ALL these actions happen every pipe stage and with the se of <= they happen in parallel! // first instrction in the pipeline is being fetched IFIDIR <= Iemory[>>]; <= + ; // Fetch & increment // second instrction in pipeline is fetching registers IDA <= Regs[IFIDIR[5:]]; IDB <= Regs[IFIDIR[:6]]; // get two registers IDIR <= IFIDIR; //pass along IR--can happen anywhere, since this affects net stage only! // third instrction is doing address calclation or operation if ((IDop==LW) (IDop==SW)) // address calclation EOt <= IDA +{{6{IDIR[5]}}, IDIR[5:]}; else if (IDop==op) case (IDIR[5:]) //case for the varios R-type instrctions 3: EOt <= Ain + Bin; //add operation defalt: ; //other R-type operations: sbtract, SLT, etc. case FIGURE.3. A Verilog behavioral model for the IPS five-stage pipeline, ignoring branch and hazards. As in the design earlier in Chapter, we se separate instrction and memories, which wold be implemented sing separate caches as we describe in Chapter 5.

3 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage EIR <= IDIR; EB <= IDB; //pass along the IR & B register //em stage of pipeline if (Eop==op) EVale <= EOt; //pass along else if (Eop == LW) EVale <= Demory[EOt>>]; else if (Eop == SW) Demory[EOt>>] <=EB; //store EIR <= EIR; //pass along IR // the stage if ((Eop==op) & (Erd!= )) // pdate registers if operation and destination not Regs[Erd] <= EVale; // operation else if ((Eop == LW)& (Ert!= )) // Update registers if load and destination not Regs[Ert] <= EVale; modle FIGURE.3. A Verilog behavioral model for the IPS five-stage pipeline, ignoring branch and hazards. (Contined). Implementing Forwarding in Verilog To et the Verilog model frther, Figre.3. shows the addition of forwarding logic for the case when the sorce and destination are instrctions. Neither load stalls nor branches are handled; we will add these shortly. The changes from the earlier Verilog description are highlighted. Check Yorself Someone has proposed moving the write for a from an instrction from the to the E stage, pointing ot that this wold redce the maimm length of forwards from an instrction by one cycle. Which of the following is accrate reasons not to consider sch a change?. It wold not actally change the forwarding logic, so it has no advantage.. It is impossible to implement this change nder any circmstance since the write for the mst stay in the same pipe stage as the write for a load. 3. oving the write for instrctions wold create the possibility of writes occrring from two different instrctions dring the same clock cycle. Either an etra write port wold be reqired on the register file or a strctral hazard wold be created.. The of an instrction is not available in time to do the write dring E.

4 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-5 modle CPU (clock); parameter LW = 6 b, SW = 6 b, BEQ = 6 b, no-op = 3 b_, op = 6 b; inpt clock; reg[3:], Regs[:3], Iemory[:3], Demory[:3], // separate memories IFIDIR, IDA, IDB, IDIR, EIR, EB, // pipeline registers EOt, EVale, EIR; // pipeline registers wire [:] IDrs, IDrt, Erd, Erd, Ert; //hold register fields wire [5:] Eop, Eop, IDop; Hold opcodes wire [3:] Ain, Bin; // declare the bypass signals wire bypassafrome, bypassafromin,bypassbfrome, bypassbfromin, bypassafromlwin, bypassbfromlwin; assign IDrs = IDIR[5:]; assign IDrt = IDIR[5:]; assign Erd = EIR[5:]; assign Erd = EIR[:6]; assign Eop = EIR[3:6]; assign Ert = EIR[5:]; assign Eop = EIR[3:6]; assign IDop = IDIR[3:6]; // The bypass to inpt A from the E stage for an operation assign bypassafrome = (IDrs == Erd) & (IDrs!=) & (Eop==op); // yes, bypass // The bypass to inpt B from the E stage for an operation assign bypassbfrome = (IDrt == Erd)&(IDrt!=) & (Eop==op); // yes, bypass // The bypass to inpt A from the stage for an operation assign bypassafromin =( IDrs == Erd) & (IDrs!=) & (Eop==op); // The bypass to inpt B from the stage for an operation assign bypassbfromin = (IDrt == Erd) & (IDrt!=) & (Eop==op); / // The bypass to inpt A from the stage for an LW operation assign bypassafromlwin =( IDrs == EIR[:6]) & (IDrs!=) & (Eop==LW); // The bypass to inpt B from the stage for an LW operation assign bypassbfromlwin = (IDrt == EIR[:6]) & (IDrt!=) & (Eop==LW); // The A inpt to the is bypassed from E if there is a bypass there, // Otherwise from if there is a bypass there, and otherwise comes from the ID register assign Ain = bypassafrome? EOt : (bypassafromin bypassafromlwin)? EVale : IDA; // The B inpt to the is bypassed from E if there is a bypass there, // Otherwise from if there is a bypass there, and otherwise comes from the ID register assign Bin = bypassbfrome? EOt : (bypassbfromin bypassbfromlwin)? EVale: IDB; reg [5:] i; //sed to initialize registers initial begin = ; IFIDIR = no-op; IDIR = no-op; EIR = no-op; EIR = no-op; // pt no-ops in pipeline registers for (i = ;i<=3;i = i+) Regs[i] = i; //initialize registers--jst so they aren t cares (posedge clock) begin // first instrction in the pipeline is being fetched IFIDIR <= Iemory[>>]; <= + ; // Fetch & increment FIGURE.3. A behavioral definition of the five-stage IPS pipeline with bypassing to operations and address calclations. The code added to Figre.3. to handle bypassing is highlighted. Becase these bypasses only reqire changing where the inpts come from, the only changes reqired are in the combinational logic responsible for selecting the inpts. (contines on net page)

5 An Introdction to Digital Design Using a Hardware Design Langage // second instrction is in register fetch IDA <= Regs[IFIDIR[5:]]; IDB <= Regs[IFIDIR[:6]]; // get two registers IDIR <= IFIDIR; //pass along IR--can happen anywhere, since this affects net stage only! // third instrction is doing address calclation or operation if ((IDop==LW) (IDop==SW)) // address calclation & copy B EOt <= IDA +{{6{IDIR[5]}}, IDIR[5:]}; else if (IDop==op) case (IDIR[5:]) //case for the varios R-type instrctions 3: EOt <= Ain + Bin; //add operation defalt: ; //other R-type operations: sbtract, SLT, etc. case EIR <= IDIR; EB <= IDB; //pass along the IR & B register //em stage of pipeline if (Eop==op) EVale <= EOt; //pass along else if (Eop == LW) EVale <= Demory[EOt>>]; else if (Eop == SW) Demory[EOt>>] <=EB; //store EIR <= EIR; //pass along IR // the stage if ((Eop==op) & (Erd!= )) Regs[Erd] <= EVale; // operation else if ((Eop == LW)& (Ert!= )) Regs[Ert] <= EVale; modle FIGURE.3. A behavioral definition of the five-stage IPS pipeline with bypassing to operations and address calclations. (Contined) The Behavioral Verilog with Stall Detection If we ignore branches, stalls for hazards in the IPS pipeline are confined to one simple case: loads whose s are crrently in the clock stage. Ths, eting the Verilog to handle a load with a destination that is either an instrction or an effective address calclation is reasonably straightforward, and Figre.3.3 shows the few additions needed. Check Yorself Someone has asked abot the possibility of hazards occrring throgh, contrary to throgh a register. Which of the following statements abot sch hazards is tre?. Since accesses only occr in the E stage, all operations are done in the same order as instrction eection, making sch hazards impossible in this pipeline.. Sch hazards are possible in this pipeline; we jst have not discssed them yet. 3. No pipeline can ever have a hazard involving, since it is the programmer s job to keep the order of references accrate.. emory hazards may be possible in some pipelines, bt they cannot occr in this particlar pipeline. 5. Althogh the pipeline control wold be obligated to maintain ordering among references to avoid hazards, it is impossible to design a pipeline where the references cold be ot of order.

6 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-7 modle CPU (clock); parameter LW = 6 b, SW = 6 b, BEQ = 6 b, no-op = 3 b_, op = 6 b; inpt clock; reg[3:], Regs[:3], Iemory[:3], Demory[:3], // separate memories IFIDIR, IDA, IDB, IDIR, EIR, EB, // pipeline registers EOt, EVale, EIR; // pipeline registers wire [:] IDrs, IDrt, Erd, Erd, Ert; //hold register fields wire [5:] Eop, Eop, IDop; Hold opcodes wire [3:] Ain, Bin; // declare the bypass signals wire stall, bypassafrome, bypassafromin,bypassbfrome, bypassbfromin, bypassafromlwin, bypassbfromlwin; assign IDrs = IDIR[5:]; assign IDrt = IDIR[5:]; assign Erd = EIR[5:]; assign Erd = EIR[:6]; assign Eop = EIR[3:6]; assign Ert = EIR[5:]; assign Eop = EIR[3:6]; assign IDop = IDIR[3:6]; // The bypass to inpt A from the E stage for an operation assign bypassafrome = (IDrs == Erd) & (IDrs!=) & (Eop==op); // yes, bypass // The bypass to inpt B from the E stage for an operation assign bypassbfrome = (IDrt== Erd)&(IDrt!=) & (Eop==op); // yes, bypass // The bypass to inpt A from the stage for an operation assign bypassafromin =( IDrs == Erd) & (IDrs!=) & (Eop==op); // The bypass to inpt B from the stage for an operation assign bypassbfromin = (IDrt==Erd) & (IDrt!=) & (Eop==op); / // The bypass to inpt A from the stage for an LW operation assign bypassafromlwin =( IDrs ==EIR[:6]) & (IDrs!=) & (Eop==LW); // The bypass to inpt B from the stage for an LW operation assign bypassbfromlwin = (IDrt==EIR[:6]) & (IDrt!=) & (Eop==LW); // The A inpt to the is bypassed from E if there is a bypass there, // Otherwise from if there is a bypass there, and otherwise comes from the ID register assign Ain = bypassafrome? EOt : (bypassafromin bypassafromlwin)? EVale : IDA; // The B inpt to the is bypassed from E if there is a bypass there, // Otherwise from if there is a bypass there, and otherwise comes from the ID register assign Bin = bypassbfrome? EOt : (bypassbfromin bypassbfromlwin)? EVale: IDB; // The signal for detecting a stall based on the se of a from LW assign stall = (EIR[3:6]==LW) && // sorce instrction is a load ((((IDop==LW) (IDop==SW)) && (IDrs==Erd)) // stall for address calc ((IDop==op) && ((IDrs==Erd) (IDrt==Erd)))); // se reg [5:] i; //sed to initialize registers initial begin = ; IFIDIR = no-op; IDIR = no-op; EIR = no-op; EIR = no-op; // pt no-ops in pipeline registers for (i = ;i<=3;i = i+) Regs[i] = i; //initialize registers--jst so they aren t cares (posedge clock) begin if (~stall) begin // the first three pipeline stages stall if there is a load hazard FIGURE.3.3 A behavioral definition of the five-stage IPS pipeline with stalls for loads when the destination is an instrction or effective address calclation. The changes from Figre.3. are highlighted. (contines on net page)

7 An Introdction to Digital Design Using a Hardware Design Langage // first instrction in the pipeline is being fetched IFIDIR <= Iemory[>>]; <= + ; IDIR <= IFIDIR; //pass along IR--can happen anywhere, since this affects net stage only! // second instrction is in register fetch IDA <= Regs[IFIDIR[5:]]; IDB <= Regs[IFIDIR[:6]]; // get two registers // third instrction is doing address calclation or operation if ((IDop==LW) (IDop==SW)) // address calclation & copy B EOt <= IDA +{{6{IDIR[5]}}, IDIR[5:]}; else if (IDop==op) case (IDIR[5:]) //case for the varios R-type instrctions 3: EOt <= Ain + Bin; //add operation defalt: ; //other R-type operations: sbtract, SLT, etc. case EIR <= IDIR; EB <= IDB; //pass along the IR & B register else EIR <= no-op; /Freeze first three stages of pipeline; inject a nop into the otpt //em stage of pipeline if (Eop==op) EVale <= EOt; //pass along else if (Eop == LW) EVale <= Demory[EOt>>]; else if (Eop == SW) Demory[EOt>>] <=EB; //store EIR <= EIR; //pass along IR // the stage if ((Eop==op) & (Erd!= )) Regs[Erd] <= EVale; // operation else if ((Eop == LW)& (Ert!= )) Regs[Ert] <= EVale; modle FIGURE.3.3 A behavioral definition of the five-stage IPS pipeline with stalls for loads when the destination is an instrction or effective address calclation. (Contined) Implementing the Branch Hazard Logic in Verilog We can et or Verilog behavioral model to implement the control for branches. We add the code to model branch eqal sing a predict not taken strategy. The Verilog code is shown in Figre.3.. It implements the branch hazard by detecting a taken branch in ID and sing that signal to sqash the instrction in IF (by setting the IR to, which is an effective no-op in IPS-3); in addition, the is assigned to the branch target. Note that to prevent an nepected latch, it is important that the is clearly assigned on every path throgh the always block; hence, we assign the in a single if statement. Lastly, note that althogh Figre.3. incorporates the basic logic for branches and control hazards, spporting branches reqires additional bypassing and hazard detection, which we have not inclded.

8 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-9 modle CPU (clock); parameter LW = 6 b, SW = 6 b, BEQ = 6 b, no-op = 3 b, op = 6 b; inpt clock; reg[3:], Regs[:3], Iemory[:3], Demory[:3], // separate memories IFIDIR, IDA, IDB, IDIR, EIR, EB, // pipeline registers EOt, EVale, EIR; // pipeline registers wire [:] IDrs, IDrt, Erd, Erd; //hold register fields wire [5:] Eop, Eop, IDop; Hold opcodes wire [3:] Ain, Bin; // declare the bypass signals wire takebranch, stall, bypassafrome, bypassafromin,bypassbfrome, bypassbfromin, bypassafromlwin, bypassbfromlwin; assign IDrs = IDIR[5:]; assign IDrt = IDIR[5:]; assign Erd = EIR[5:]; assign Erd = EIR[:6]; assign Eop = EIR[3:6]; assign Eop = EIR[3:6]; assign IDop = IDIR[3:6]; // The bypass to inpt A from the E stage for an operation assign bypassafrome = (IDrs == Erd) & (IDrs!=) & (Eop==op); // yes, bypass // The bypass to inpt B from the E stage for an operation assign bypassbfrome = (IDrt == Erd)&(IDrt!=) & (Eop==op); // yes, bypass // The bypass to inpt A from the stage for an operation assign bypassafromin =( IDrs == Erd) & (IDrs!=) & (Eop==op); // The bypass to inpt B from the stage for an operation assign bypassbfromin = (IDrt == Erd) & (IDrt!=) & (Eop==op); / // The bypass to inpt A from the stage for an LW operation assign bypassafromlwin =( IDrs == EIR[:6]) & (IDrs!=) & (Eop==LW); // The bypass to inpt B from the stage for an LW operation assign bypassbfromlwin = (IDrt == EIR[:6]) & (IDrt!=) & (Eop==LW); // The A inpt to the is bypassed from E if there is a bypass there, // Otherwise from if there is a bypass there, and otherwise comes from the ID register assign Ain = bypassafrome? EOt : (bypassafromin bypassafromlwin)? EVale : IDA; // The B inpt to the is bypassed from E if there is a bypass there, // Otherwise from if there is a bypass there, and otherwise comes from the ID register assign Bin = bypassbfrome? EOt : (bypassbfromin bypassbfromlwin)? EVale: IDB; // The signal for detecting a stall based on the se of a from LW assign stall = (EIR[3:6]==LW) && // sorce instrction is a load ((((IDop==LW) (IDop==SW)) && (IDrs==Erd)) // stall for address calc ((IDop==op) && ((IDrs==Erd) (IDrt==Erd)))); // se FIGURE.3. A behavioral definition of the five-stage IPS pipeline with stalls for loads when the destination is an instrction or effective address calclation. The changes from Figre.3. are highlighted. (contines on net page)

9 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage // Signal for a taken branch: instrction is BEQ and registers are eqal assign takebranch = (IFIDIR[3:6]==BEQ) && (Regs[IFIDIR[5:]]== Regs[IFIDIR[:6]]); reg [5:] i; //sed to initialize registers initial begin = ; IFIDIR = no-op; IDIR = no-op; EIR = no-op; EIR = no-op; // pt no-ops in pipeline registers for (i = ;i<=3;i = i+) Regs[i] = i; //initialize registers--jst so they aren t don t cares (posedge clock) begin if (~stall) begin // the first three pipeline stages stall if there is a load hazard if (~takebranch) begin // first instrction in the pipeline is being fetched normally IFIDIR <= Iemory[>>]; <= + ; else begin // a taken branch is in ID; instrction in IF is wrong; insert a no-op and reset the IFDIR <= no-op; <= + + ({{6{IFIDIR[5]}}, IFIDIR[5:]}<<); // second instrction is in register fetch IDA <= Regs[IFIDIR[5:]]; IDB <= Regs[IFIDIR[:6]]; // get two registers // third instrction is doing address calclation or operation IDIR <= IFIDIR; //pass along IR if ((IDop==LW) (IDop==SW)) // address calclation & copy B EOt <= IDA +{{6{IDIR[5]}}, IDIR[5:]}; else if (IDop==op) case (IDIR[5:]) //case for the varios R-type instrctions 3: EOt <= Ain + Bin; //add operation defalt: ; //other R-type operations: sbtract, SLT, etc. case EIR <= IDIR; EB <= IDB; //pass along the IR & B register else EIR <= no-op; /Freeze first three stages of pipeline; inject a nop into the otpt //em stage of pipeline if (Eop==op) EVale <= EOt; //pass along else if (Eop == LW) EVale <= Demory[EOt>>]; else if (Eop == SW) Demory[EOt>>] <=EB; //store // the stage EIR <= EIR; //pass along IR if ((Eop==op) & (Erd!= )) Regs[Erd] <= EVale; // operation else if ((Eop == LW)& (EIR[:6]!= )) Regs[EIR[:6]] <= EVale; modle FIGURE.3. A behavioral definition of the five-stage IPS pipeline with stalls for loads when the destination is an instrction or effective address calclation. (Contined)

10 .3 An Introdction to Digital Design Using a Hardware Design Langage.3- Using Verilog for Behavioral Specification with Synthesis To demonstrate the contrasting types of Verilog, we show two descriptions of a different, nonpipelined implementation style of IPS that ses mltiple clock cycles per instrction. (Since some instrctors make a synthesizable description of the IPS pipeline project for a class, we chose not to inclde it here. It wold also be long.) Figre.3.5 gives a behavioral specification of a mlticycle implementation of the IPS processor. Becase of the se of behavioral operations, it wold be difficlt to synthesize a separate path and control nit with any reasonable efficiency. This version demonstrates another approach to the control by sing a ealy finite-state machine (see discssion in Section A. of Appi A). The se of a ealy machine, which allows the otpt to dep both on inpts and the crrent state, allows s to decrease the total nmber of states. Since a version of the IPS design inted for synthesis is considerably more comple, we have relied on a nmber of Verilog modles that were specified in Appi A, inclding the following: The -to- mltipleor shown in Figre A.., and the 3-to- mltipleor that can be trivially derived based on the -to- mltipleor. The IPS shown in Figre A.5.5. The IPS control defined in Figre A.5.6. The IPS register file defined in Figre A.8.. Now, let s look at a Verilog version of the IPS processor inted for synthesis. Figre.3.6 shows the strctral version of the IPS path. Figre.3.7 ses the path modle to specify the IPS CPU. This version also demonstrates another approach to implementing the control nit, as well as some optimizations that rely on relationships between varios control signals. Observe that the state machine specification only provides the seqencing actions. The setting of the control lines is done with a series of assign statements that dep on the state as well as the opcode field of the instrction register. If one were to fold the setting of the control into the state specification, this wold look like a ealy-style finite-state control nit. Becase the setting of the control lines is specified sing assign statements otside of the always block, most logic synthesis systems will generate a small implementation of a finite-state machine that determines the setting of the state register and then ses eternal logic to derive the control inpts to the path. In writing this version of the control, we have also taken advantage of a nmber of insights abot the relationship between varios control signals as well as sitations where we don t care abot the control signal vale; some eamples of these are given in the following elaboration.

11 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage modle CPU (clock); parameter LW = 6 b, SW = 6 b, BEQ=6 b, J=6 d; inpt clock; //the clock is an eternal inpt // The architectrally visible registers and scratch registers for implementation reg [3:], Regs[:3], emory [:3], IR, Ot, DR, A, B; reg [:] state; // processor state wire [5:] opcode; //se to get opcode easily wire [3:] SignEt,Offset; //sed to get sign-eted offset field assign opcode = IR[3:6]; //opcode is pper 6 bits assign SignEt = {{6{IR[5]}},IR[5:]}; //sign etension of lower 6 bits of instrction assign Offset = SignEt << ; // offset is shifted // set the to and start the control in state initial begin = ; state = ; //The state machine--triggered on a rising clock clock) begin Regs[] = ; //make R //shortct way to make sre R is always case (state) //action deps on the state : begin // first step: fetch the instrction, increment, go to net state IR <= emory[>>]; <= + ; state = ; //net state : begin // second step: decode, register fetch, also compte branch address A <= Regs[IR[5:]]; B <= Regs[IR[:6]]; state = 3; Ot <= + Offset; // compte -relative branch target 3: begin // third step: Load-store eection, eection, Branch completion state = ; // defalt net state if ((opcode==lw) (opcode==sw)) Ot <= A + SignEt; //compte effective address else if (opcode==6 b) case (IR[5:]) //case for the varios R-type instrctions 3: Ot = A + B; //add operation defalt: Ot = A; //other R-type operations: sbtract, SLT, etc. case FIGURE.3.5 A behavioral specification of the mlticycle IPS design. Th is has the same cycle behavior as the mlticycle design, bt is prely for simlation and specification. It cannot be sed for synthesis. (contines on net page)

12 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-3 else if (opcode == BEQ) begin if (A==B) <= Ot; // branch taken--pdate state = ; else if (opocde=j) begin = {[3:8], IR[5:], b}; // the branch target state = ; //Jmps else ; // other opcodes or eception for ndefined instrction wold go here : begin if (opcode==6 b) begin // Operation Regs[IR[5:]] <= Ot; // write the state = ; //R-type finishes else if (opcode == LW) begin // load instrction DR <= emory[ot>>]; // read the state = 5; // net state else if (opcode == LW) begin emory[ot>>] <= B; // write the state = ; // retrn to state //store finishes else ; // other instrctions go here 5: begin // LW is the only instrction still in eection Regs[IR[:6]] = DR; // write the DR to the register state = ; //complete an LW instrction case modle FIGURE.3.5 A behavioral specification of the mlticycle IPS design. (Contined)

13 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage modle path (Op, RegDst, emtoreg, em, em, IorD, Reg, IR,, Cond, SrcA, SrcB, Sorce, opcode, clock); // the control inpts + clock inpt [:] Op, SrcB, Sorce; // -bit control signals inpt RegDst, emtoreg, em, em, IorD, Reg, IR,, Cond, SrcA, clock; // -bit control signals otpt [5:] opcode ;// opcode is needed as an otpt by control reg [3:], emory [:3], DR,IR, Ot; // CPU state + some temporaries wire [3:] A,B,SignEtOffset, Offset, ResltOt, Vale, Jmpr,, Ain, Bin,emOt; / these are signals derived from registers wire [3:] Ctl; //. the control lines wire Zero; the Zero ot signal from the wire[:] reg;// the signal sed to commnicate the destination register initial = ; //start the at //Combinational signals sed in the path // sing word address with either Ot or as the address sorce assign emot = em? emory[(iord? Ot : )>>]:; assign opcode = IR[3:6];// opcode shortct // Get the write register address from one of two fields deping on RegDst assign reg = RegDst? IR[5:]: IR[:6]; // Get the write register either from the Ot or from the DR assign = emtoreg? DR : Ot; // Sign-et the lower half of the IR from load/store/branch offsets assign SignEtOffset = {{6{IR[5]}},IR[5:]}; //sign-et lower 6 bits; // The branch offset is also shifted to make it a word offset assign Offset = SignEtOffset << ; // The A inpt to the is either the rs register or the assign Ain = SrcA? A : ; // inpt is or A // Compose the Jmp address assign Jmpr = {[3:8], IR[5:], b}; //The jmp address FIGURE.3.6 A Verilog version of the mlticycle IPS path that is appropriate for synthesis. This path relies on several nits from Appi A. Initial statements do not synthesize, and a version sed for synthesis wold have to incorporate a reset signal that had this effect. Also note that resetting R to on every clock is not the best way to ensre that R stays at ; instead, modifying the register file modle to prodce whenever R is read and to ignore writes to R wold be a more efficient soltion. (contines on net page)

14 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-5 // Creates an instance of the control nit (see the modle defined in Figre B.5.6 // Inpt Op is control-nit set and sed to describe the instrction class as in Chapter // Inpt IR[5:] is the fnction code field for an instrction // Otpt Ctl are the actal control bits as in Chapter alcontroller (Op,IR[5:],Ctl); // control nit // Creates a 3-to- mltipleor sed to select the sorce of the net // Inpts are ResltOt (the incremented ), Ot (the branch address), the jmp target address // Sorce is the selector inpt and Vale is the mltipleor otpt lt3to src (ResltOt,Ot,Jmpr, Sorce, Vale); // Creates a -to- mltipleor sed to select the B inpt of the // Inpts are register B,constant, sign-eted lower half of IR, sign-eted lower half of IR << // SrcB is the selector inpt // Bin is the mltipleor otpt ltto Binpt (B,3 d,signetoffset,offset,srcb,bin); // Creates a IPS // Inpts are Ctl (the control), vale inpts (Ain, Bin) // Otpts are ResltOt (the 3-bit otpt) and Zero (zero detection otpt) IPS (Ctl, Ain, Bin, ResltOt,Zero); //the // Creates a IPS register file // Inpts are // the rs and rt fields of the IR sed to specify which registers to read, // reg (the write register nmber), (the to be written), Reg (indicates a write), the clock // Otpts are A and B, the registers read registerfile regs (IR[5:],IR[:6],reg,,Reg,A,B,clock); //Register file // The clock-triggered actions of the path clock) begin if (em) emory[ot>>] <= B; // --mst be a store Ot <= ResltOt; //Save the for se on a later clock cycle if (IR) IR <= emot; // the IR if an instrction fetch DR <= emot; // Always save the read vale // The is written both conditionally (controlled by ) and nconditionally if ( (Cond & Zero)) <=Vale; modle FIGURE.3.6 A Verilog version of the mlticycle IPS path that is appropriate for synthesis.

15 An Introdction to Digital Design Using a Hardware Design Langage modle CPU (clock); parameter LW = 6 b, SW = 6 b, BEQ = 6 b, J = 6 d; //constants inpt clock; reg [:] state; wire [:] Op, SrcB, Sorce; wire [5:] opcode; wire RegDst, em, em, IorD, Reg, IR,, Cond, SrcA, emoryop, IRWwrite, emreg; // Create an instance of the IPS path, the inpts are the control signals; opcode is only otpt path IPSDP (Op,RegDst,emReg, em, em, IorD, Reg, IR,, Cond, SrcA, SrcB, Sorce, opcode, clock); initial begin state = ; // start the state machine in state // These are the definitions of the control signals assign IR = (state==); assign emreg = ~ RegDst; assign emoryop = (opcode==lw) (opcode==sw); // a operation assign Op = ((state==) (state==) ((state==3)&emoryop))? b : // add ((state==3)&(opcode==beq))? b : b; // sbtract or se fnction code assign RegDst = ((state==)&(opcode==))? : ; assign em = (state==) ((state==)&(opcode==lw)); assign em = (state==)&(opcode==sw); assign IorD = (state==)? : (state==)? : X; assign Reg = (state==5) ((state==) &(opcode==)); assign = (state==) ((state==3)&(opcode==j)); assign Cond = (state==3)&(opcode==beq); assign SrcA = ((state==) (state==))? :; assign SrcB = ((state==) ((state==3)&(opcode==beq)))? b : (state==)? b : ((state==3)&emoryop)? b : b; // operation or other assign Sorce = (state==)? b : ((opcode==beq)? b : b); // Here is the state machine, which only has to seqence states clock) begin // all state pdates on a positive clock edge case (state) : state = ; //nconditional net state : state = 3; //nconditional net state 3: // third step: jmps and branches complete state = ((opcode==beq) (opcode==j))? : ;// branch or jmp go back else net state : state = (opcode==lw)? 5 : ; //R-type and SW finish 5: state = ; // go back case modle FIGURE.3.7 The IPS CPU sing the path from Figre.3.6.

16 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-7 Elaboration When specifying control, designers often take advantage of knowledge of the control so as to simplify or shorten the control specifi cation. Here are a few eamples from the specification in Figres.3.6 and emtoreg is set only in two cases, and then it is always the inverse of RegLoc, so we jst se the inverse of RegLoc.. IR is set only in state. 3. The does not operate in every state and, when nsed, can safely do anything.. RegLoc is in only one case and can elsewhere be set to. In practice, it might be better to set it eplicitly when needed and otherwise set it to X, as we do for IorD. First, it allows frther logic optimization possibilities throgh the eploitation of don t-care terms (see Appi A for frther discssion and eamples). Second, it is a more precise specification, and this allows the simlation to more closely model the hardware, possibly ncovering additional errors in the specification. ore Illstrations of Eection on the Hardware To redce the cost of this book, starting with the third edition, we moved sections and figres that were sed by a minority of instrctors online. This sbsection recaptres those figres for readers who wold like more spplemental material to nderstand pipelining better. These are all single-clock-cycle pipeline diagrams, which take many figres to illstrate the eection of a seqence of instrctions. The three eamples are respectively for code with no hazards, an eample of forwarding on the pipelined implementation, and an eample of bypassing on the pipelined implementation. No Hazard Illstrations On page 38, we gave the eample code seqence LDUR SUB ADD LDUR ADD X, [X,#] X, X, X3 X, X3, X X3, [X,#8] X, X5, X6 Figres. and.3 showed the mltiple-clock-cycle pipeline diagrams for this two-instrction seqence eecting across si clock cycles. Figres.3.8 throgh.3. show the corresponding single-clock-cycle pipeline diagrams for these two instrctions. Note that the order of the instrctions differs between these two types of diagrams: the newest instrction is at the bottom and to the right of the mltiple-clock-cycle pipeline diagram, and it is on the left in the single-clock-cycle pipeline diagram.

17 An Introdction to Digital Design Using a Hardware Design Langage LDUR X, [X, ] fetch /E E/ ress register register register 3 Signet 6 Shift left Zero ress Clock SUB X, X, X3 LDUR X, [X, ] fetch decode /E E/ ress register register register 3 Signet 6 Shift left Zero ress Clock FIGURE.3.8 Single-cycle pipeline diagrams for clock cycles (top diagram) and (bottom diagram). This style of pipeline representation is a snapshot of every instrction eecting dring one clock cycle. Or eample has bt two instrctions, so at most two stages are identified in each clock cycle; normally, all five stages are occpied. The highlighted portions of the path are active in that clock cycle. The load is fetched in clock cycle and decoded in clock cycle, with the sbtract fetched in the second clock cycle. To make the figres easier to nderstand, the other pipeline stages are empty, bt normally there is an instrction in every pipeline stage.

18 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-9 SUB X, X, X3 decode LDUR X, [X, ] Eection /E E/ ress register register register 3 Signet 6 Shift left Zero ress Clock 3 SUB X, X, X3 Eection LDUR X, [X, ] emory /E E/ ress register register register Shift left Zero ress 3 Signet 6 Clock FIGURE.3.9 Single-cycle pipeline diagrams for clock cycles 3 (top diagram) and (bottom diagram). In the third clock cycle in the top diagram, LDUR enters the stage. At the same time, SUB enters ID. In the forth clock cycle (bottom path), LDUR moves into E stage, reading sing the address fond in /E at the beginning of clock cycle. At the same time, the sbtracts and then places the difference into /E at the of the clock cycle.

19 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage SUB X, X, X3 emory LDUR X, [X, ] back /E E/ ress register register register 3 Signet 6 Shift left Zero ress Clock 5 SUB X, X, X3 back /E E/ ress register register register 3 Signet 6 Shift left Zero ress Clock 6 FIGURE.3. Single-cycle pipeline diagrams for clock cycles 5 (top diagram) and 6 (bottom diagram). In clock cycle 5, LDUR completes by writing the in E/ into register, and SUB ss the difference in /E to E/. In the net clock cycle, SUB writes the vale in E/ to register.

20 .3 An Introdction to Digital Design Using a Hardware Design Langage.3- IF:LDUR X, [X, ] ID: before<> : before<> E: before<3> : before<> /E E/ ress Clock Reg register register register [3 ] [ ] Shift left [3 ] Signet control Src Zero Op Branch em ress em emtoreg IF: SUB X, X, X3 ID: LDUR X, [X, ] : before<> E: before<> : before<3> /E E/ LDUR ress Clock X Reg register register register [3 ] [3 ] [ ] X X Signet 986 Shift left control Src Zero Op Branch em ress em emtoreg FIGURE.3. Clock cycles and. The phrase before < i> means the i th instrction before LDUR. The LDUR instrction in the top path is in the IF stage. At the of the clock cycle, the LDUR instrction is in the pipeline registers. In the second clock cycle, seen in the bottom path, the LDUR moves to the ID stage, and SUB enters in the IF stage. Note that the vales of the instrction fields and the selected sorce registers are shown in the ID stage. Hence, register X and the constant, the operands of LDUR, are written into the pipeline register. The nmber, representing the destination register nmber of LDUR, is also placed in. The top of the pipeline register shows the control vales for LDUR to be sed in the remaining stages. These control vales can be read from the LDUR row of the table in Figre.8.

21 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage IF: AND X, X, X5 ID: SUB X, X, X3 : LDUR X E: before<> : before<> SUB /E E/ ress Clock 3 3 Reg register register register [3 ] [3 ] [ ] X X3 Signet X 6 Shift left X 986 Src Zero control Op Branch em ress em emtoreg IF: ORR X3, X6, X7 ID: AND X, X, X5 : SUB X,... E: LDUR X,... : before<> /E E/ AND ress Clock 5 Reg register register register [3 ] [3 ] [ ] Signet X X X5 Shift left X X3 6 Src control Zero Op Branch em ress em emtoreg FIGURE.3. Clock cycles 3 and. In the top diagram, LDUR enters the stage in the third clock cycle, adding X and to form the address in the /E pipeline register. (The LDUR instrction is written LDUR X, pon reaching, becase the identity of instrction operands is not needed by or the sbseqent stages. In this version of the pipeline, the actions of, E, and dep only on the instrction and its destination register or its target address.) At the same time, SUB enters ID, reading registers X and X3, and the and instrction starts IF. In the forth clock cycle (bottom path), LDUR moves into E stage, reading sing the vale in /E as the address. In the same clock cycle, the sbtracts X3 from X and places the difference into /E, reads registers X and X5 dring ID, and the or instrction enters IF. The two diagrams show the control signals being created in the ID stage and peeled off as they are sed in sbseqent pipe stages.

22 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-3 IF: ADD X, X8, X9 ID: ORR X3, X6, X7 : AND X,... E: SUB X,... : LDUR X,.. /E E/ ORR ress Clock Reg register register register [3 ] [3 ] [5 ] X6 X7 Signet X 36 Shift left X X5 3 Src Zero control Op Branch em ress em emtoreg IF: after<> ID: ADD X, X8, X9 : ORR X3,... E: AND X,... : SUB X,. /E E/ ADD ress Clock Reg register register register [3 ] [3 ] [5 ] Signet X X8 X9 Shift left X6 X7 36 Src control Zero Op Branch em 3 em ress emtoreg FIGURE.3.3 Clock cycles 5 and 6. Wit h ADD, the final instrction in this eample, entering IF in the top path, all instrctions are engaged. By writing the in E/ into register, LDUR completes; both the and the register nmber are in E/. In the same clock cycle, SUB ss the difference in /E to E/, and the rest of the instrctions move forward. In the net clock cycle, SUB selects the vale in E/ to write to register nmber, again fond in E/. The remaining instrctions play follow-theleader: the calclates the OR of X6 and X7 for the ORR instrction in the stage, and registers X8 and X9 are read in the ID stage for the ADD instrction. The instrctions after ADD are shown as inactive jst to emphasize what occrs for the five instrctions in the eample. The phrase after < i> means the i th instrction after ADD.

23 .3-.3 An Introdction to Digital Design Using a Hardware Design Langage IF: after<> ID: after<> : ADD X,... E: ORR X3,... : AND X,. /E E/ ress Clock 7 Reg register register register [3 ] [ ] Shift left X8 X9 Src Zero Op Branch em ress em 3 emtoreg IF: after<3> ID: after<> : after<> E: ADD X,... : ORR X3,.. /E E/ ress Clock 8 3 Reg register register register [3 ] [3 ] [3 ] Signet control Signet Shift left Src control Zero Branch em Op [ ] 3 em ress emtoreg FIGURE.3. Clock cycles 7 and 8. In the top path, the ADD instrction brings p the rear, adding the vales corresponding to registers X8 and X9 dring the stage. The of the or instrction is passed from /E to E/ in the E stage, and the stage writes the of the and instrction in E/ to register X. Note that the control signals are deasserted (set to ) in the ID stage, since no instrction is being eected. In the following clock cycle (lower drawing), the stage writes the to register X3, thereby completing ORR, and the E stage passes the sm from the ADD in /E to E/. The instrctions after ADD are shown as inactive for pedagogical reasons.

24 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-5 IF: after<> ID: after<3> : after<> E: after<> : ADD X,. /E E/ ress Clock 9 Reg register register register [3 ] [3 ] Signet Shift left control Src Zero Branch em Op [ ] em ress emtoreg FIGURE.3.5 Clock cycle 9. The stage writes the sm in E/ into register X, completing ADD and the five-instrction seqence. The instrctions after ADD are shown as inactive for pedagogical reasons. ore Eamples To nderstand how pipeline control works, let s consider these five instrctions going throgh the pipeline: LDUR SUB AND ORR ADD X, [X,#] X, X, X3 X, X, X5 X3, X6, X7 X, X8, X9 Figres.3. throgh.3.5 show these instrctions proceeding throgh the nine clock cycles it takes them to complete eection, highlighting what is active in a stage and identifying the instrction associated with each stage dring a clock cycle. If yo eamine them careflly, yo may notice: In Figre.3.3 yo can see the seqence of the destination register nmbers from left to right at the bottom of the pipeline registers. The nmbers advance

25 An Introdction to Digital Design Using a Hardware Design Langage to the right dring each clock cycle, with the E/ pipeline register spplying the nmber of the register written dring the stage. When a stage is inactive, the vales of control lines that are deasserted are shown as or X (for don t care). Seqencing of control is embedded in the pipeline strctre itself. First, all instrctions take the same nmber of clock cycles, so there is no special control for instrction dration. Second, all control information is compted dring instrction decode and then passed along by the pipeline registers. Forwarding Illstrations We can se the single-clock-cycle pipeline diagrams to show how forwarding operates, as well as how the control activates the forwarding paths. Consider the following code seqence in which the depences have been highlighted: SUB X, X, X3 AND X, X, X5 ORR X, X, X ADD X9, X, X Figres.3.6 and.3.7 show the events in clock cycles 3 6 in the eection of these instrctions. Ths, in clock cycle 5, the forwarding nit selects the /E pipeline register for the pper inpt to the and the E/ pipeline register for the lower inpt to the. The following ADD instrction reads both register X, the target of the AND instrction, and register X, which the SUB instrction has already written. Notice that the prior two instrctions both write register X, so the forwarding nit mst pick the immediately preceding one (E stage). In clock cycle 6, the forwarding nit ths selects the /E pipeline register, containing the of the or instrction, for the pper inpt bt ses the non-forwarding register vale for the lower inpt to the. Illstrating Pipelines with Stalls and Forwarding We can se the single-clock-cycle pipeline diagrams to show how the control for stalls works. Figres.3.8 throgh.3. show the single-cycle diagram for clocks throgh 7 for the following code seqence (depences highlighted): LDUR X,[X,#] AND X, X,X5 ORR X, X, X ADD X9, X, X

26 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-7 ORR X, X, X AND X, X, X5 SUB X, X, X3 before<> before<> /E E/ X X 5 X5 X3 5 3 Forwarding nit Clock 3 ADD X9, X, X ORR X, X, X AND X, X, X5 SUB X,... before<> /E E/ X X X X5 5 Forwarding nit Clock FIGURE.3.6 Clock cycles 3 and of the instrction seqence on page.3-6. The bold lines are those active in a clock cycle, and the italicized register nmbers in color indicate a hazard. The forwarding nit is highlighted by shading it when it is forwarding to the. The instrctions before SUB are shown as inactive jst to emphasize what occrs for the for instrctions in the eample. Operand names are sed in for control of forwarding; ths they are inclded in the instrction label for. Operand names are not needed in E or, so is sed. Compare this with Figres.3. throgh.3.5, which show the path withot forwarding where ID is the last stage to need operand information.

27 An Introdction to Digital Design Using a Hardware Design Langage after<> ADD X9, X, X ORR X, X, X AND X,... SUB X,.. /E E/ X X X X 9 Forwarding nit Clock 5 after<> after<> ADD X9, X, X ORR X,... AND X,.. /E E/ X X 9 Forwarding nit Clock 6 FIGURE.3.7 Clock cycles 5 and 6 of the instrction seqence on page.3-6. The forwarding nit is highlighted when it is forwarding to the. The two instrctions after ADD are shown as inactive jst to emphasize what occrs for the for instrctions in the eample. The bold lines are those active in a clock cycle, and the italicized register nmbers in color indicate a hazard.

28 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-9 AND X, X, X5 LDUR X, [X, ] before<> before<> before<3> X Hazard detection nit.em /E E/ X X XX X.RegisterRd Forwarding nit Clock ORR X, X, X and X, X, X5 5 Hazard detection nit.em LDUR X, [X, ] before<> before<> /E E/ 5 X X5 X XX 5 X.RegisterRd Forwarding nit Clock 3 FIGURE.3.8 Clock cycles and 3 of the instrction seqence on page.3-6 with a load replacing SUB. The bold lines are those active in a clock cycle, the italicized register nmbers in color indicate a hazard, and the in the place of operands means that their identity is information not needed by that stage. The vales of the significant control lines, registers, and register nmbers are labeled in the figres. The AND instrction wants to read the vale created by the LDUR instrction in clock cycle 3, so the hazard detection nit stalls the AND and ORR instrctions. Hence, the hazard detection nit is highlighted.

29 An Introdction to Digital Design Using a Hardware Design Langage ORR X, X, X AND X, X, X5 Bbble LDUR X,... before<> 5 Hazard detection nit.em /E E/ 5 X X5 X X5 5 5.RegisterRd Forwarding nit Clock ADD X9, X, X ORR X, X, X 5 Hazard detection nit.em AND X, X, X5 Bbble LDUR X,... /E E/ X X X X5 5.RegisterRd Forwarding nit Clock 5 FIGURE.3.9 Clock cycles and 5 of the instrction seqence on page.3-6 with a load replacing SUB. The bbble is inserted in the pipeline in clock cycle, and then the and instrction is allowed to proceed in clock cycle 5. The forwarding nit is highlighted in clock cycle 5 becase it is forwarding from LDUR to the. Note that in clock cycle, the forwarding nit forwards the address of the lw as if it were the contents of register X ; this is rered harmless by the insertion of the bbble. The bold lines are those active in a clock cycle, and the italicized register nmbers in color indicate a hazard.

30 .3 An Introdction to Digital Design Using a Hardware Design Langage.3-3 after<> ADD X9, X, X ORR X, X, X AND X,... Bbble Hazard detection nit.em /E E/ X X X X 9.RegisterRd Forwarding nit Clock 6 after<> after<> ADD X9, X, X ORR X,... AND X,... Hazard detection nit.em /E E/ X X 9.RegisterRd Forwarding nit Clock 7 FIGURE.3. Clock cycles 6 and 7 of the instrction seqence on page.3-6 with a load replacing SUB. Note that nlike in Figre.3.7, the stall allows the LDUR to complete, and so there is no forwarding from E/ in clock cycle 6. Register X for the ADD in the stage still deps on the from or in /E, so the forwarding nit passes the to the. The bold lines show inpt lines active in a clock cycle, and the italicized register nmbers indicate a hazard. The instrctions after ADD are shown as inactive for pedagogical reasons.

4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language 345.e1

4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language 345.e1 .3 Advanced Topic: An Introdction to Digital Design Using a Hardware Design Langage 35.e.3 Advanced Topic: An Introdction to Digital Design Using a Hardware Design Langage to Describe and odel a Pipeline

More information

The extra single-cycle adders

The extra single-cycle adders lticycle Datapath As an added bons, we can eliminate some of the etra hardware from the single-cycle path. We will restrict orselves to sing each fnctional nit once per cycle, jst like before. Bt since

More information

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read.

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read. The final path PC 4 Add Reg Shift left 2 Add PCSrc Instrction [3-] Instrction I [25-2] I [2-6] I [5 - ] register register 2 register 2 Registers ALU Zero Reslt ALUOp em Data emtor RegDst ALUSrc em I [5

More information

The single-cycle design from last time

The single-cycle design from last time lticycle path Last time we saw a single-cycle path and control nit for or simple IPS-based instrction set. A mlticycle processor fies some shortcomings in the single-cycle CPU. Faster instrctions are not

More information

Review Multicycle: What is Happening. Controlling The Multicycle Design

Review Multicycle: What is Happening. Controlling The Multicycle Design Review lticycle: What is Happening Reslt Zero Op SrcA SrcB Registers Reg Address emory em Data Sign etend Shift left Sorce A B Ot [-6] [5-] [-6] [5-] [5-] Instrction emory IR RegDst emtoreg IorD em em

More information

PART I: Adding Instructions to the Datapath. (2 nd Edition):

PART I: Adding Instructions to the Datapath. (2 nd Edition): EE57 Instrctor: G. Pvvada ===================================================================== Homework #5b De: check on the blackboard =====================================================================

More information

Review: Computer Organization

Review: Computer Organization Review: Compter Organization Pipelining Chans Y Landry Eample Landry Eample Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 3 mintes A B C D Dryer takes 3 mintes

More information

Review. A single-cycle MIPS processor

Review. A single-cycle MIPS processor Review If three instrctions have opcodes, 7 and 5 are they all of the same type? If we were to add an instrction to IPS of the form OD $t, $t2, $t3, which performs $t = $t2 OD $t3, what wold be its opcode?

More information

Using a Hardware Description Language to Design and Simulate a Processor 5.8

Using a Hardware Description Language to Design and Simulate a Processor 5.8 5.8 Using a Hardware Description Language to Design and Simulate a Processor 5.8 As mentioned in Appix B, Verilog can describe processors for simulation or with the intention that the Verilog specification

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse

More information

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind Pipelining hink of sing machines in landry services Chapter 6 nhancing Performance with Pipelining 6 P 7 8 9 A ime ask A B C ot pipelined Assme 3 min. each task wash, dry, fold, store and that separate

More information

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction Review Friday the 2st of October Real world eamples of pipelining? How does pipelining pp inflence instrction latency? How does pipelining inflence instrction throghpt? What are the three types of hazard

More information

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM Lectre (Wed /5/28) Lab # Hardware De Fri Oct 7 HW #2 IPS programming, de Wed Oct 22 idterm Fri Oct 2 IorD The mlticycle path SrcA Today s objectives: icroprogramming Etending the mlti-cycle path lti-cycle

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and

More information

Solutions for Chapter 6 Exercises

Solutions for Chapter 6 Exercises Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the

More information

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University Compter Architectre Chapter 5 Fall 25 Department of Compter Science Kent State University The Processor: Datapath & Control Or implementation of the MIPS is simplified memory-reference instrctions: lw,

More information

Exceptions and interrupts

Exceptions and interrupts Eceptions and interrpts An eception or interrpt is an nepected event that reqires the CPU to pase or stop the crrent program. Eception handling is the hardware analog of error handling in software. Classes

More information

What do we have so far? Multi-Cycle Datapath

What do we have so far? Multi-Cycle Datapath What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining

More information

Chapter 6: Pipelining

Chapter 6: Pipelining CSE 322 COPUTER ARCHITECTURE II Chapter 6: Pipelining Chapter 6: Pipelining Febrary 10, 2000 1 Clothes Washing CSE 322 COPUTER ARCHITECTURE II The Assembly Line Accmlate dirty clothes in hamper Place in

More information

Hardware Design Tips. Outline

Hardware Design Tips. Outline Hardware Design Tips EE 36 University of Hawaii EE 36 Fall 23 University of Hawaii Otline Verilog: some sbleties Simlators Test Benching Implementing the IPS Actally a simplified 6 bit version EE 36 Fall

More information

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13 Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign

More information

Pipelining. Chapter 4

Pipelining. Chapter 4 Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation (cwli@twins.ee.nct.ed.tw) 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction

More information

1048: Computer Organization

1048: Computer Organization 8: Compter Organization Lectre 6 Pipelining Lectre6 - pipelining (cwli@twins.ee.nct.ed.tw) 6- Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards

More information

Enhanced Performance with Pipelining

Enhanced Performance with Pipelining Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the

More information

Lecture 7. Building A Simple Processor

Lecture 7. Building A Simple Processor Lectre 7 Bilding A Simple Processor Christos Kozyrakis Stanford University http://eeclass.stanford.ed/ee8b C. Kozyrakis EE8b Lectre 7 Annoncements Upcoming deadlines Lab is de today Demo by 5pm, report

More information

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION

EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage

More information

CS 251, Spring 2018, Assignment 3.0 3% of course mark

CS 251, Spring 2018, Assignment 3.0 3% of course mark CS 25, Spring 28, Assignment 3. 3% of corse mark De onday, Jne 25th, 5:3 P. (5 points) Consider the single-cycle compter shown on page 6 of this assignment. Sppose the circit elements take the following

More information

Prof. Kozyrakis. 1. (10 points) Consider the following fragment of Java code:

Prof. Kozyrakis. 1. (10 points) Consider the following fragment of Java code: EE8 Winter 25 Homework #2 Soltions De Thrsday, Feb 2, 5 P. ( points) Consider the following fragment of Java code: for (i=; i

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5B - mlticycle implementation (cwli@twins.ee.nct.ed.tw) 5B- Recap: A Single-Cycle Processor PCSrc 4 Add Shift left 2 Add ALU reslt PC address

More information

EEC 483 Computer Organization. Branch (Control) Hazards

EEC 483 Computer Organization. Branch (Control) Hazards EEC 483 Compter Organization Section 4.8 Branch Hazards Section 4.9 Exceptions Chans Y Branch (Control) Hazards While execting a previos branch, next instrction address might not yet be known. s n i o

More information

Overview of Pipelining

Overview of Pipelining EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 83 Compter Organization Chapter.6 A Pipelined path Chans Y Pipelined Approach 2 - Cycle time, No. stages - Resorce conflict E E A B C D 3 E E 5 E 2 3 5 2 6 7 8 9 c.y9@csohio.ed Resorces sed in 5 Stages

More information

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion . (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA

More information

Chapter 6: Pipelining

Chapter 6: Pipelining Chapter 6: Pipelining Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards and stalls Branch hazards Eceptions Sperscalar and dynamic pipelining

More information

Quiz #1 EEC 483, Spring 2019

Quiz #1 EEC 483, Spring 2019 Qiz # EEC 483, Spring 29 Date: Jan 22 Name: Eercise #: Translate the following instrction in C into IPS code. Eercise #2: Translate the following instrction in C into IPS code. Hint: operand C is stored

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 3.. 3% of corse mark De onday, Febrary 26th, 4:3 P Lates accepted ntil : A, Febrary 27th with a 5% penalty. IEEE 754 Floating Point ( points): (a) (4 points) Complete the following

More information

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation EXAINATIONS 2003 COP203 END-YEAR Compter Organisation Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. There are 180 possible marks on the eam. Calclators and foreign langage dictionaries

More information

Lab 8 (All Sections) Prelab: ALU and ALU Control

Lab 8 (All Sections) Prelab: ALU and ALU Control Lab 8 (All Sections) Prelab: and Control Name: Sign the following statement: On my honor, as an Aggie, I have neither given nor received nathorized aid on this academic work Objective In this lab yo will

More information

Winter 2013 MIDTERM TEST #2 Wednesday, March 20 7:00pm to 8:15pm. Please do not write your U of C ID number on this cover page.

Winter 2013 MIDTERM TEST #2 Wednesday, March 20 7:00pm to 8:15pm. Please do not write your U of C ID number on this cover page. page of 7 University of Calgary Departent of Electrical and Copter Engineering ENCM 369: Copter Organization Lectre Instrctors: Steve Noran and Nor Bartley Winter 23 MIDTERM TEST #2 Wednesday, March 2

More information

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control CSE-45432 Introdction to Compter Architectre Chapter 5 The Processor: Datapath & Control Dr. Izadi Data Processor Register # PC Address Registers ALU memory Register # Register # Address Data memory Data

More information

MIPS Architecture. Fibonacci (C) Fibonacci (Assembly) Another Example: MIPS. Example: subset of MIPS processor architecture

MIPS Architecture. Fibonacci (C) Fibonacci (Assembly) Another Example: MIPS. Example: subset of MIPS processor architecture Another Eample: IPS From the Harris/Weste book Based on the IPS-like processor from the Hennessy/Patterson book IPS Architectre Eample: sbset of IPS processor architectre Drawn from Patterson & Hennessy

More information

Animating the Datapath. Animating the Datapath: R-type Instruction. Animating the Datapath: Load Instruction. MIPS Datapath I: Single-Cycle

Animating the Datapath. Animating the Datapath: R-type Instruction. Animating the Datapath: Load Instruction. MIPS Datapath I: Single-Cycle nimating the atapath PS atapath : Single-Cycle npt is either (-type) or sign-etended lower half of instrction (load/store) op offset/immediate W egister File 6 6 + from instrction path beq,, offset if

More information

Computer Architecture

Computer Architecture Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Spring 25 Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics

More information

Computer Architecture. Lecture 6: Pipelining

Computer Architecture. Lecture 6: Pipelining Compter Architectre Lectre 6: Pipelining Dr. Ahmed Sallam Based on original slides by Prof. Onr tl Agenda for Today & Net Few Lectres Single-cycle icroarchitectres lti-cycle and icroprogrammed icroarchitectres

More information

Lecture 13: Exceptions and Interrupts

Lecture 13: Exceptions and Interrupts 18 447 Lectre 13: Eceptions and Interrpts S 10 L13 1 James C. Hoe Dept of ECE, CU arch 1, 2010 Annoncements: Handots: Spring break is almost here Check grades on Blackboard idterm 1 graded Handot #9: Lab

More information

Tdb: A Source-level Debugger for Dynamically Translated Programs

Tdb: A Source-level Debugger for Dynamically Translated Programs Tdb: A Sorce-level Debgger for Dynamically Translated Programs Naveen Kmar, Brce R. Childers, and Mary Lo Soffa Department of Compter Science University of Pittsbrgh Pittsbrgh, Pennsylvania 15260 {naveen,

More information

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons

PIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons Pipelining: Natral Phenomenon Landry Eample: nn, rian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 mintes C D Dryer takes 0 mintes PIPELINING Folder takes 20 mintes

More information

Computer Architecture

Computer Architecture Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics (Lectres

More information

POWER-OF-2 BOUNDARIES

POWER-OF-2 BOUNDARIES Warren.3.fm Page 5 Monday, Jne 17, 5:6 PM CHAPTER 3 POWER-OF- BOUNDARIES 3 1 Ronding Up/Down to a Mltiple of a Known Power of Ronding an nsigned integer down to, for eample, the net smaller mltiple of

More information

The Disciplined Flood Protocol in Sensor Networks

The Disciplined Flood Protocol in Sensor Networks The Disciplined Flood Protocol in Sensor Networks Yong-ri Choi and Mohamed G. Goda Department of Compter Sciences The University of Texas at Astin, U.S.A. fyrchoi, godag@cs.texas.ed Hssein M. Abdel-Wahab

More information

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade CSE 141 Compter Architectre Smmer Session I, 2004 Lectres 10 Advanced Topics, emory Hierarchy and Cache Pramod V. Argade CSE141: Introdction to Compter Architectre Instrctor: TA: Pramod V. Argade (p2argade@cs.csd.ed)

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Harware Organization an Design ectre 11: Introction to IPs path apte from Compter Organization an Design, Patterson & Hennessy, CB IPS-lite processor Compter Want to bil a processor for a sbset

More information

Pavlin and Daniel D. Corkill. Department of Computer and Information Science University of Massachusetts Amherst, Massachusetts 01003

Pavlin and Daniel D. Corkill. Department of Computer and Information Science University of Massachusetts Amherst, Massachusetts 01003 From: AAAI-84 Proceedings. Copyright 1984, AAAI (www.aaai.org). All rights reserved. SELECTIVE ABSTRACTION OF AI SYSTEM ACTIVITY Jasmina Pavlin and Daniel D. Corkill Department of Compter and Information

More information

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites,

More information

Multiple-Choice Test Chapter Golden Section Search Method Optimization COMPLETE SOLUTION SET

Multiple-Choice Test Chapter Golden Section Search Method Optimization COMPLETE SOLUTION SET Mltiple-Choice Test Chapter 09.0 Golden Section Search Method Optimization COMPLETE SOLUTION SET. Which o the ollowing statements is incorrect regarding the Eqal Interval Search and Golden Section Search

More information

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University 8 447 Lectre 6: icroprogrammed lti Cycle Implementation James C. Hoe Department of ECE Carnegie ellon University 8 447 S8 L06 S, James C. Hoe, CU/ECE/CALC, 208 Yor goal today Hosekeeping nderstand why

More information

Lecture 9: Microcontrolled Multi-Cycle Implementations

Lecture 9: Microcontrolled Multi-Cycle Implementations 8-447 Lectre 9: icroled lti-cycle Implementations James C. Hoe Dept of ECE, CU Febrary 8, 29 S 9 L9- Annoncements: P&H Appendi D Get started t on Lab Handots: Handot #8: Project (on Blackboard) Single-Cycle

More information

Lecture 10: Pipelined Implementations

Lecture 10: Pipelined Implementations U 8-7 S 9 L- 8-7 Lectre : Pipelined Implementations James. Hoe ept of EE, U Febrary 23, 29 nnoncements: Project is de this week idterm graded, d reslts posted Handots: H9 Homework 3 (on lackboard) Graded

More information

CS 153 Design of Operating Systems Spring 18

CS 153 Design of Operating Systems Spring 18 CS 153 Design of Operating Systems Spring 18 Lectre 15: Virtal Address Space Instrctor: Chengy Song Slide contribtions from Nael Ab-Ghazaleh, Harsha Madhyvasta and Zhiyn Qian OS Abstractions Applications

More information

Resolving Linkage Anomalies in Extracted Software System Models

Resolving Linkage Anomalies in Extracted Software System Models Resolving Linkage Anomalies in Extracted Software System Models Jingwei W and Richard C. Holt School of Compter Science University of Waterloo Waterloo, Canada j25w, holt @plg.waterloo.ca Abstract Program

More information

Pipelined Datapath. One register file is enough

Pipelined Datapath. One register file is enough ipelined path The goal of pipelining is to allow multiple instructions execute at the same time We may need to perform several operations in a cycle Increment the and add s at the same time. Fetch one

More information

Review. How to represent real numbers

Review. How to represent real numbers PCWrite PC IorD Review ALUSrcA emread Address Write data emory emwrite em Data IRWrite [3-26] [25-2] [2-6] [5-] [5-] RegDst Read register Read register 2 Write register Write data RegWrite Read data Read

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Computer User s Guide 4.0

Computer User s Guide 4.0 Compter User s Gide 4.0 2001 Glenn A. Miller, All rights reserved 2 The SASSI Compter User s Gide 4.0 Table of Contents Chapter 1 Introdction...3 Chapter 2 Installation and Start Up...5 System Reqirements

More information

MIPS Architecture. An Example: MIPS. From the Harris/Weste book Based on the MIPS-like processor from the Hennessy/Patterson book

MIPS Architecture. An Example: MIPS. From the Harris/Weste book Based on the MIPS-like processor from the Hennessy/Patterson book An Eample: IPS From the Harris/Weste book Based on the IPS-like processor from the Hennessy/Patterson book IPS Architectre w Eample: sbset of IPS processor architectre n Drawn from Patterson & Hennessy

More information

Networks An introduction to microcomputer networking concepts

Networks An introduction to microcomputer networking concepts Behavior Research Methods& Instrmentation 1978, Vol 10 (4),522-526 Networks An introdction to microcompter networking concepts RALPH WALLACE and RICHARD N. JOHNSON GA TX, Chicago, Illinois60648 and JAMES

More information

CS 153 Design of Operating Systems

CS 153 Design of Operating Systems CS 153 Design of Operating Systems Spring 18 Lectre 3: OS model and Architectral Spport Instrctor: Chengy Song Slide contribtions from Nael Ab-Ghazaleh, Harsha Madhyvasta and Zhiyn Qian Last time/today

More information

Instruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time.

Instruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time. Pipelining Pipelining is the se of pipelining to allow more than one instrction to be in some stage of eection at the same time. Ferranti ATLAS (963): Pipelining redced the average time per instrction

More information

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W12-M

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W12-M CSE 22 Computer Organization Hugh Chesser, CSEB 2U W2- Graphical Representation Time 2 6 8 add $s, $t, $t IF ID E E Decode / Execute emory Back fetch from / stage into the instruction register file. Shading

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

Evaluating Influence Diagrams

Evaluating Influence Diagrams Evalating Inflence Diagrams Where we ve been and where we re going Mark Crowley Department of Compter Science University of British Colmbia crowley@cs.bc.ca Agst 31, 2004 Abstract In this paper we will

More information

Computer Architecture Lecture 6: Multi-cycle Microarchitectures. Prof. Onur Mutlu Carnegie Mellon University Spring 2012, 2/6/2012

Computer Architecture Lecture 6: Multi-cycle Microarchitectures. Prof. Onur Mutlu Carnegie Mellon University Spring 2012, 2/6/2012 8-447 Compter Architectre Lectre 6: lti-cycle icroarchitectres Prof. Onr tl Carnegie ellon University Spring 22, 2/6/22 Reminder: Homeworks Homework soltions Check and stdy the soltions! Learning now is

More information

CS 153 Design of Operating Systems Spring 18

CS 153 Design of Operating Systems Spring 18 CS 153 Design of Operating Systems Spring 18 Lectre 11: Semaphores Instrctor: Chengy Song Slide contribtions from Nael Ab-Ghazaleh, Harsha Madhyvasta and Zhiyn Qian Last time Worked throgh software implementation

More information

CSSE232 Computer Architecture I. Mul5cycle Datapath

CSSE232 Computer Architecture I. Mul5cycle Datapath CSSE232 Compter Architectre I Ml5cycle Datapath Class Stats Next 3 days : Ml5cycle datapath ing Ml5cycle datapath is not in the book! How long do instrc5ons take? ALU 2ns Mem 2ns Reg File 1ns Everything

More information

CS 153 Design of Operating Systems Spring 18

CS 153 Design of Operating Systems Spring 18 CS 153 Design of Operating Systems Spring 18 Lectre 9: Synchronization (1) Instrctor: Chengy Song Slide contribtions from Nael Ab-Ghazaleh, Harsha Madhyvasta and Zhiyn Qian Cooperation between Threads

More information

On the Computational Complexity and Effectiveness of N-hub Shortest-Path Routing

On the Computational Complexity and Effectiveness of N-hub Shortest-Path Routing 1 On the Comptational Complexity and Effectiveness of N-hb Shortest-Path Roting Reven Cohen Gabi Nakibli Dept. of Compter Sciences Technion Israel Abstract In this paper we stdy the comptational complexity

More information

CS 153 Design of Operating Systems Spring 18

CS 153 Design of Operating Systems Spring 18 CS 53 Design of Operating Systems Spring 8 Lectre 2: Virtal Memory Instrctor: Chengy Song Slide contribtions from Nael Ab-Ghazaleh, Harsha Madhyvasta and Zhiyn Qian Recap: cache Well-written programs exhibit

More information

Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control

Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control Chapter 5 The Processor: Datapath and Control Big Picture: Where are We Now? Performance of a

More information

5 Performance Evaluation

5 Performance Evaluation 5 Performance Evalation his chapter evalates the performance of the compared to the MIP, and FMIP individal performances. We stdy the packet loss and the latency to restore the downstream and pstream of

More information

Lecture 5: The Processor

Lecture 5: The Processor Lecture 5: The Processor CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and

More information

Single-Cycle Examples, Multi-Cycle Introduction

Single-Cycle Examples, Multi-Cycle Introduction Single-Cycle Examples, ulti-cycle Introduction 1 Today s enu Single cycle examples Single cycle machines vs. multi-cycle machines Why multi-cycle? Comparative performance Physical and Logical Design of

More information

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W9-W

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W9-W CSE 22 Computer Organization Hugh Chesser, CSEB 2U Agenda Topics:. Single Cycle Review (Sample Exam/Quiz Q) 2. ultiple cycle implementation Patterson: Section 4.5 Reminder: Quiz #2 Next Wednesday (November

More information

Real-time mean-shift based tracker for thermal vision systems

Real-time mean-shift based tracker for thermal vision systems 9 th International Conference on Qantitative InfraRed Thermography Jly -5, 008, Krakow - Poland Real-time mean-shift based tracker for thermal vision systems G. Bieszczad* T. Sosnowski** * Military University

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19 CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be

More information

Functions of Combinational Logic

Functions of Combinational Logic CHPTER 6 Fnctions of Combinational Logic CHPTER OUTLINE 6 6 6 6 6 5 6 6 6 7 6 8 6 9 6 6 Half and Fll dders Parallel inary dders Ripple Carry and Look-head Carry dders Comparators Decoders Encoders Code

More information

This chapter is based on the following sources, which are all recommended reading:

This chapter is based on the following sources, which are all recommended reading: Bioinformatics I, WS 09-10, D. Hson, December 7, 2009 105 6 Fast String Matching This chapter is based on the following sorces, which are all recommended reading: 1. An earlier version of this chapter

More information

CAPL Scripting Quickstart

CAPL Scripting Quickstart CAPL Scripting Qickstart CAPL (Commnication Access Programming Langage) For CANalyzer and CANoe V1.01 2015-12-03 Agenda Important information before getting started 3 Visal Seqencer (GUI based programming

More information

Isilon InsightIQ. Version 2.5. User Guide

Isilon InsightIQ. Version 2.5. User Guide Isilon InsightIQ Version 2.5 User Gide Pblished March, 2014 Copyright 2010-2014 EMC Corporation. All rights reserved. EMC believes the information in this pblication is accrate as of its pblication date.

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Decoders. 3-Input Full Decoder

Decoders. 3-Input Full Decoder ecoders A fll decoder with n inpts has 2 n otpts. Let inpts be labeled In, In, In 2,..., In n, and let otpts be labeled Ot, Ot,..., Ot 2 n. A fll decoder fnctions as follows: Only one of otpts has vale

More information

Master for Co-Simulation Using FMI

Master for Co-Simulation Using FMI Master for Co-Simlation Using FMI Jens Bastian Christoph Claß Ssann Wolf Peter Schneider Franhofer Institte for Integrated Circits IIS / Design Atomation Division EAS Zenerstraße 38, 69 Dresden, Germany

More information

Multi-lingual Multi-media Information Retrieval System

Multi-lingual Multi-media Information Retrieval System Mlti-lingal Mlti-media Information Retrieval System Shoji Mizobchi, Sankon Lee, Fmihiko Kawano, Tsyoshi Kobayashi, Takahiro Komats Gradate School of Engineering, University of Tokshima 2-1 Minamijosanjima,

More information

CS 153 Design of Operating Systems

CS 153 Design of Operating Systems CS 53 Design of Operating Systems Spring 8 Lectre 9: Locality and Cache Instrctor: Chengy Song Slide contribtions from Nael Ab-Ghazaleh, Harsha Madhyvasta and Zhiyn Qian Some slides modified from originals

More information

LDAP Configuration Guide

LDAP Configuration Guide LDAP Configration Gide Content Content LDAP directories on Gigaset phones............................................... 3 Configration.....................................................................

More information

Local Run Manager Generate FASTQ Analysis Module

Local Run Manager Generate FASTQ Analysis Module Local Rn Manager Generate FASTQ Analysis Modle Workflow Gide For Research Use Only. Not for se in diagnostic procedres. Overview 3 Set Parameters 3 Analysis Methods 5 View Analysis Reslts 5 Analysis Report

More information