Computer Architecture Lecture 6: Multi-cycle Microarchitectures. Prof. Onur Mutlu Carnegie Mellon University Spring 2012, 2/6/2012
Transcription
1 18-447 Computer Architecture Lecture 6: Multi-cycle Microarchitectures. Prof. Onur Mutlu, Carnegie Mellon University, Spring 2012, 2/6/2012
2 Reminder: Homeworks
- Homework 1 solutions: check and study the solutions! Learning now is better than rushing later.
- Homework 2: already out, due February 13. Covers ISA concepts, ISA vs. microarchitecture, microcoded machines.
3 Reminder: Lab Assignment 2
- Lab Assignment 1.5: Verilog practice. Not to be turned in.
- Lab Assignment 2: due Friday, Feb 17, at the end of the lab. Individual assignment. No collaboration; please respect the honor code.
4 Extra Credit for Lab Assignment 2
- Complete your normal (single-cycle) implementation first, and get it checked off in lab.
- Then, implement the MIPS core using a microcoded approach similar to what we are discussing in class.
- We are not specifying any particular details of the microcode format or the microarchitecture; you should be creative.
- For the extra credit, the microcoded implementation should execute the same programs that your ordinary implementation does, and you should demo it by the normal lab deadline.
5 Feedback on Lab Assignment 1
- Chris, Lavanya, and Abeer are working hard on grading.
- We will have very comprehensive tests for all labs. Lab tests exercise every case of each instruction as well as long programs (e.g., REP MOVS).
- We will release test cases and register dumps.
- Be thorough and test all possible cases.
- Follow directions: they are there for a reason. No modifications to shell code! No unaligned accesses to memory. Remove all your debugging printf's before handing in code.
- Do the extra credit work if the lab is too easy!
6 Readings for Today
- P&P, Revised Appendix C: Microarchitecture of the LC-3b. Appendix A (LC-3b ISA) will be useful in following this.
- P&H, Appendix D: Mapping Control to Hardware.
- Optional: Maurice Wilkes, "The Best Way to Design an Automatic Calculating Machine," Manchester Univ. Computer Inaugural Conf., 1951.
7 Readings for Next Lecture
- Pipelining: P&H Chapter
8 Review of Last Lecture: Single-Cycle Uarch
- What phases of the instruction processing cycle does the MIPS JAL instruction exercise?
- How many cycles does it take to process an instruction in the single-cycle microarchitecture? What determines the clock cycle time?
- What is the difference between datapath and control logic? What about combinational vs. sequential control?
- What is the semantics of a delayed branch? Why this is so will become clear when we cover pipelining.
9 Review: Instruction Processing "Cycle"
- Instructions are processed under the direction of a "control unit" step by step.
- Instruction cycle: sequence of steps to process an instruction.
- Fundamentally, there are six phases: Fetch, Decode, Evaluate Address, Fetch Operands, Execute, Store Result.
- Not all instructions require all six stages (see P&P Ch. 4).
10 Review: Datapath vs. Control Logic
- Instructions transform Data (AS) to Data' (AS').
- This transformation is done by functional units: units that "operate" on data. These units need to be told what to do to the data.
- An instruction processing engine consists of two components.
- Datapath: consists of hardware elements that deal with and transform data signals:
  - functional units that operate on data
  - hardware structures (e.g. wires and muxes) that enable the flow of data into the functional units and registers
  - storage units that store data (e.g., registers)
- Control logic: consists of hardware elements that determine control signals, i.e., signals that specify what the datapath elements should do to the data.
11 Today's Agenda
- Finish single-cycle microarchitectures
- Critical path
- Microarchitecture design principles
- Performance evaluation primer
- Multi-cycle microarchitectures
- Microprogrammed control
12 A Note: How to Make the Best Out of 447?
- Do the readings: P&P Appendixes A and C, Wilkes 1951 paper. Today's lecture will be easy to understand if you read these, and you can ask more in-depth questions and learn more.
- Do the assignments early. You can do things for extra credit if you finish early. We will describe what to do for extra credit.
- Study the material and buzzwords daily: lecture notes, videos. Buzzwords: take notes during class.
13 Review: The Full Single-Cycle Datapath
[Figure: the full single-cycle MIPS datapath with its control unit; JAL, JR, JALR omitted. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
14 Single-Cycle Datapath for Arithmetic and Logical Instructions
15 Review: Datapath for R and I-Type ALU Insts.
[Figure: datapath for R-type and I-type ALU instructions, annotated with the stages IF, ID, EX, MEM, WB and "combinational state update logic"; RegDest and ALUSrc are driven by isItype. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
- Semantics: if MEM[PC] == ADDI rt rs immediate, then GPR[rt] ← GPR[rs] + sign-extend(immediate) and PC ← PC + 4
16 Single-Cycle Datapath for Data Movement Instructions
17 Review: Datapath for Non-Control-Flow Insts.
[Figure: datapath for non-control-flow instructions; RegDest and ALUSrc are driven by isItype, RegWrite by !isStore, MemWrite by isStore, MemRead and MemtoReg by isLoad. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
18 Single-Cycle Datapath for Control Flow Instructions
19 Review: Unconditional Jump Instructions
- Assembly: J immediate26
- Machine encoding (J-type): J (6-bit) | immediate (26-bit)
- Semantics: if MEM[PC] == J immediate26, then target = { PC[31:28], immediate26, 2'b00 } and PC ← target
20 Review: Unconditional Jump Datapath
[Figure: jump datapath; isJ drives PCSrc to select the concatenated jump target, and the ALU-side controls are don't-cares (X). Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
- if MEM[PC] == J immediate26, then PC = { PC[31:28], immediate26, 2'b00 }
- What about JR, JAL, JALR?
21 Conditional Branch Instructions
- Assembly (e.g., branch if equal): BEQ rs_reg rt_reg immediate16
- Machine encoding (I-type): BEQ (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
- Semantics (assuming no branch delay slot): if MEM[PC] == BEQ rs rt immediate16, then target = PC + 4 + sign-extend(immediate) x 4; if GPR[rs] == GPR[rt] then PC ← target else PC ← PC + 4
22 Conditional Branch Datapath (For You to Fix)
[Figure: conditional branch datapath, marked "watch out"; the branch target adder takes PC + 4 from the instruction datapath and the sign-extended, shifted immediate, and the ALU produces bcond for the branch control logic. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
- How to uphold the delayed branch semantics?
23 Putting It All Together
[Figure: the full single-cycle MIPS datapath with its control unit; JAL, JR, JALR omitted. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
24 Single-Cycle Control Logic
25 Single-Cycle Hardwired Control
- Control as a combinational function of Inst = MEM[PC]
- R-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
- I-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
- J-type: opcode (6-bit) | immediate (26-bit)
- Consider: all R-type and I-type ALU instructions; LW and SW; BEQ, BNE, BLEZ, BGTZ; J, JR, JAL, JALR
26 Single-Bit Control Signals
- RegDest. When de-asserted: GPR write select according to rt, i.e., inst[20:16]. When asserted: GPR write select according to rd, i.e., inst[15:11]. Equation: opcode==0
- ALUSrc. When de-asserted: 2nd ALU input from 2nd GPR read port. When asserted: 2nd ALU input from sign-extended 16-bit immediate. Equation: (opcode!=0) && (opcode!=BEQ) && (opcode!=BNE)
- MemtoReg. When de-asserted: steer ALU result to GPR write port. When asserted: steer memory load to GPR write port. Equation: opcode==LW
- JAL and JALR require additional RegDest and MemtoReg options
27 Single-Bit Control Signals
- MemRead. When de-asserted: memory read disabled. When asserted: memory read port returns load value. Equation: opcode==LW
- MemWrite. When de-asserted: memory write disabled. When asserted: memory write enabled. Equation: opcode==SW
- PCSrc1. When de-asserted: according to PCSrc2. When asserted: next PC is based on 26-bit immediate jump target. Equation: (opcode==J) || (opcode==JAL)
- PCSrc2. When de-asserted: next PC = PC + 4. When asserted: next PC is based on 16-bit immediate branch target. Equation: (opcode==Bxx) && "bcond is satisfied"
- JR and JALR require additional PCSrc options
28 ALU Control
- case opcode:
  - 0: select operation according to funct
  - ALUi: select operation according to opcode
  - LW: select addition
  - SW: select addition
  - Bxx: select bcond generation function
  - otherwise: don't care
- Example ALU operations: ADD, SUB, AND, OR, XOR, NOR, etc.; bcond on equal, not equal, LE zero, GT zero, etc.
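The control equations on the three slides above can be read as a pure function from the opcode (plus the ALU's bcond output) to the single-bit control signals. The following is a minimal Python sketch of that reading, not the course's implementation: the opcode constants are the standard MIPS encodings, while the function name `control_signals` and the dictionary layout are assumptions made for illustration. JR, JAL, and JALR are not handled beyond what the slide equations state.

```python
# Sketch: single-cycle hardwired control as a combinational function of the opcode.
# Opcode values are standard MIPS encodings; the function name and dict layout are
# hypothetical, chosen only for this illustration.
OP_RTYPE, OP_J, OP_JAL = 0x00, 0x02, 0x03
OP_BEQ, OP_BNE, OP_BLEZ, OP_BGTZ = 0x04, 0x05, 0x06, 0x07
OP_LW, OP_SW = 0x23, 0x2B
BRANCH_OPCODES = {OP_BEQ, OP_BNE, OP_BLEZ, OP_BGTZ}


def control_signals(opcode, bcond=False):
    """Return the single-bit control signals for one instruction.

    bcond is the branch-condition output of the ALU; it only matters for branches.
    """
    return {
        # Write register comes from rd for R-type, from rt otherwise.
        "RegDest": opcode == OP_RTYPE,
        # Second ALU input is the sign-extended immediate (slide equation, verbatim).
        "ALUSrc": opcode != OP_RTYPE and opcode not in (OP_BEQ, OP_BNE),
        # Loads steer memory data, not the ALU result, to the register write port.
        "MemtoReg": opcode == OP_LW,
        "MemRead": opcode == OP_LW,
        "MemWrite": opcode == OP_SW,
        # Jumps select the 26-bit immediate target.
        "PCSrc1": opcode in (OP_J, OP_JAL),
        # Taken branches select the 16-bit immediate branch target.
        "PCSrc2": opcode in BRANCH_OPCODES and bcond,
    }


if __name__ == "__main__":
    for name, op in [("R-type", OP_RTYPE), ("LW", OP_LW), ("SW", OP_SW), ("J", OP_J)]:
        print(name, control_signals(op))
```

Because every signal depends only on the current instruction, no state is kept between calls, which is what "combinational" means here.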
29 R-Type ALU
[Figure: the single-cycle datapath with the control settings for an R-type ALU instruction; ALU control selects the operation by funct. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
30 I-Type ALU
[Figure: the single-cycle datapath with the control settings for an I-type ALU instruction; ALU control selects the operation by opcode. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
31 LW
[Figure: the single-cycle datapath with the control settings for LW; ALU control selects Add. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
32 SW
[Figure: the single-cycle datapath with the control settings for SW; ALU control selects Add; don't-care signals are marked *. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
33 Branch Not Taken
[Figure: the single-cycle datapath with the control settings for a not-taken branch; ALU control selects bcond; don't-care signals are marked *. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
34 Branch Taken
[Figure: the single-cycle datapath with the control settings for a taken branch; ALU control selects bcond; don't-care signals are marked *. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
35 Jump
[Figure: the single-cycle datapath with the control settings for a jump; don't-care signals are marked *. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
36 What is in That Control Box?
- Combinational logic: hardwired control. Idea: control signals are generated combinationally based on the instruction.
- Sequential logic: sequential/microprogrammed control. Idea: a memory structure (the control store) contains the control signals associated with an instruction.
37 Evaluating the Single-Cycle Microarchitecture
38 A Single-Cycle Microarchitecture
- Is this a good idea/design? When is this a good design? When is this a bad design?
- How can we design a better microarchitecture?
39 A Single-Cycle Microarchitecture: Analysis
- Every instruction takes 1 cycle to execute: CPI (cycles per instruction) is strictly 1.
- How long each instruction takes is determined by how long the slowest instruction takes to execute, even though many instructions do not need that long to execute.
- Clock cycle time of the microarchitecture is determined by how long it takes to complete the slowest instruction.
- Critical path of the design is determined by the processing time of the slowest instruction.
40 What is the Slowest Instruction to Process?
- Let's go back to the basics. All six phases of the instruction processing cycle take a single machine clock cycle to complete: Fetch, Decode, Evaluate Address, Fetch Operands, Execute, Store Result.
  1. Instruction fetch (IF)
  2. Instruction decode and register operand fetch (ID/RF)
  3. Execute/Evaluate memory address (EX/AG)
  4. Memory operand fetch (MEM)
  5. Store/writeback result (WB)
- Does each of the above phases take the same time (latency) for all instructions?
41 Single-Cycle Datapath Analysis
- Assume memory units (read or write): 200 ps; ALU and adders: 100 ps; register file (read or write): 50 ps; other combinational logic: 0 ps
steps      IF    ID    EX    MEM   WB    Delay
resources  mem   RF    ALU   mem   RF    (ps)
R-type     200   50    100         50    400
I-type     200   50    100         50    400
LW         200   50    100   200   50    600
SW         200   50    100   200         550
Branch     200   50    100               350
Jump       200                           200
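The per-instruction delays in this analysis are just sums of the resource latencies each instruction type touches, and the single-cycle clock period is the maximum of those sums. A small Python sketch of that arithmetic follows; it assumes the latencies of 200 ps for memory, 50 ps for the register file, 100 ps for the ALU, and 0 ps for other logic, and the names `DELAY_PS`, `STEPS`, `instruction_delay`, and `critical_path` are made up for this illustration.

```python
# Sketch: per-instruction delay and critical path for the single-cycle datapath.
# Latencies are the assumed values from the analysis; all names are hypothetical.
DELAY_PS = {"mem": 200, "RF": 50, "ALU": 100}

# Resources used by each instruction type, in order (IF, ID, EX, MEM, WB as applicable).
STEPS = {
    "R-type": ["mem", "RF", "ALU", "RF"],
    "I-type": ["mem", "RF", "ALU", "RF"],
    "LW": ["mem", "RF", "ALU", "mem", "RF"],
    "SW": ["mem", "RF", "ALU", "mem"],
    "Branch": ["mem", "RF", "ALU"],
    "Jump": ["mem"],
}


def instruction_delay(kind):
    """Total combinational delay in ps for one instruction type."""
    return sum(DELAY_PS[resource] for resource in STEPS[kind])


def critical_path():
    """Return (slowest instruction type, its delay in ps)."""
    slowest = max(STEPS, key=instruction_delay)
    return slowest, instruction_delay(slowest)


if __name__ == "__main__":
    for kind in STEPS:
        print(f"{kind:7s} {instruction_delay(kind)} ps")
    print("critical path:", critical_path())
```

Under these assumptions LW sets the clock period at 600 ps, so a jump that needs only 200 ps still occupies a full 600 ps cycle.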
42 Let's Find the Critical Path
[Figure: the full single-cycle MIPS datapath with its control unit. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
43 R-Type and I-Type ALU
[Figure: critical path for R-type and I-type ALU instructions; cumulative delay 200 ps after instruction fetch, 250 ps after register read, 350 ps after the ALU, 400 ps after register write. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
44 LW
[Figure: critical path for LW; cumulative delay 200 ps after instruction fetch, 250 ps after register read, 350 ps after the ALU, 550 ps after the data memory read, 600 ps after register write. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
45 SW
[Figure: critical path for SW; cumulative delay 200 ps after instruction fetch, 250 ps after register read, 350 ps after the ALU, 550 ps after the data memory write. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
46 Branch Taken
[Figure: critical path for a taken branch; cumulative delay 200 ps after instruction fetch, 250 ps after register read, 350 ps after the ALU produces bcond. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
47 Jump
[Figure: critical path for a jump; cumulative delay 200 ps after instruction fetch. Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
48 What About Control Logic?
- How does that affect the critical path?
- Food for thought for you: Can control logic be on the critical path? A note on CDC 5600: control store access too long.
49 What is the Slowest Instruction to Process?
- Memory is not magic. What if memory sometimes takes 100 ms to access?
- Does it make sense to have a simple register-to-register add or jump take {100 ms + all else to do a memory operation}?
- And what if you need to access memory more than once to process an instruction? Which instructions need this? (VAX INDEX instruction.) Do you provide multiple ports to memory?
50 Single Cycle uArch: Complexity
- Contrived: all instructions run as slow as the slowest instruction.
- Inefficient:
  - All instructions run as slow as the slowest instruction.
  - Must provide worst-case combinational resources in parallel as required by any instruction.
  - Need to replicate a resource if it is needed more than once by an instruction during different parts of the instruction processing cycle.
- Not necessarily the simplest way to implement an ISA: single-cycle implementation of REP MOVS, INDEX, POLY?
- Not easy to optimize/improve performance: optimizing the common case does not work (e.g. common instructions); need to optimize the worst case all the time.
51 Microarchitecture Design Principles
- Critical path design: find the maximum combinational logic delay and decrease it.
- Bread and butter (common case) design: spend time and resources on where it matters, i.e., improve what the machine is really designed to do. Common case vs. uncommon case.
- Balanced design: balance instruction/data flow through hardware components; balance the hardware needed to accomplish the work.
- How does a single-cycle microarchitecture fare in light of these principles?
52 Multi-cycle Microarchitectures
53 Multi-cycle Microarchitectures
- Goal: let each instruction take (close to) only as much time as it really needs.
- Idea: determine clock cycle time independently of instruction processing time. Each instruction takes as many clock cycles as it needs to take.
  - Multiple state transitions per instruction
  - The states followed by each instruction are different
54 Remember: The "Process instruction" Step
- ISA specifies abstractly what A' should be, given an instruction and A. It defines an abstract finite state machine where State = programmer-visible state and Next-state logic = instruction execution specification.
- From the ISA point of view, there are no "intermediate states" between A and A' during instruction execution: one state transition per instruction.
- Microarchitecture implements how A is transformed to A'. There are many choices in implementation. We can have programmer-invisible state to optimize the speed of instruction execution: multiple state transitions per instruction.
  - Choice 1: AS → AS' (transform A to A' in a single clock cycle)
  - Choice 2: AS → AS+MS1 → AS+MS2 → AS+MS3 → AS' (take multiple clock cycles to transform AS to AS')
55 Multi-cycle Microarchitecture
- AS = Architectural (programmer visible) state at the beginning of an instruction
- Step 1: Process part of instruction in one clock cycle
- Step 2: Process part of instruction in the next clock cycle
- AS' = Architectural (programmer visible) state at the end of a clock cycle
56 Benefits of Multi-cycle Design
- Critical path design: can keep reducing the critical path independently of the worst-case processing time of any instruction.
- Bread and butter (common case) design: can optimize the number of states it takes to execute "important" instructions that make up much of the execution time.
- Balanced design: no need to provide more capability or resources than really needed. An instruction that needs resource X multiple times does not require multiple X's to be implemented. Leads to more efficient hardware: can reuse hardware components needed multiple times for an instruction.
57 Performance Analysis
- Execution time of an instruction: {CPI} x {clock cycle time}
- Execution time of a program: sum over all instructions [{CPI} x {clock cycle time}] = {# of instructions} x {Average CPI} x {clock cycle time}
- Single-cycle microarchitecture performance: CPI = 1; clock cycle time = long
- Multi-cycle microarchitecture performance: CPI = different for each instruction (average CPI hopefully small); clock cycle time = short
- Now, we have two degrees of freedom to optimize independently.
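The execution-time formula above can be tried out on a small made-up program. The Python sketch below is only an illustration: the instruction counts, the per-instruction cycle counts, and the 600 ps and 200 ps clock periods are all assumed numbers (the periods follow from an assumed 200 ps memory, 100 ps ALU, and 50 ps register file), and the function names are hypothetical.

```python
# Sketch: program execution time = (# of instructions) x (average CPI) x (clock cycle time).
# Every number below is an assumption for illustration, not measured data.

def total_cycles(counts, cycles_per_instr):
    """Sum of cycles over all dynamic instructions."""
    return sum(counts[kind] * cycles_per_instr[kind] for kind in counts)


def average_cpi(counts, cycles_per_instr):
    """Average cycles per instruction for the given instruction mix."""
    return total_cycles(counts, cycles_per_instr) / sum(counts.values())


def execution_time_ps(counts, cycles_per_instr, cycle_ps):
    """Execution time in picoseconds."""
    return total_cycles(counts, cycles_per_instr) * cycle_ps


# Hypothetical dynamic instruction counts for a 100-instruction program.
COUNTS = {"R-type": 50, "LW": 20, "SW": 10, "Branch": 15, "Jump": 5}

# Single-cycle: every instruction takes 1 cycle of 600 ps (the slowest instruction).
SINGLE_CYCLE = {kind: 1 for kind in COUNTS}

# Multi-cycle: one cycle per step used, each cycle 200 ps (the slowest step).
MULTI_CYCLE = {"R-type": 4, "LW": 5, "SW": 4, "Branch": 3, "Jump": 1}

if __name__ == "__main__":
    print("single-cycle:", execution_time_ps(COUNTS, SINGLE_CYCLE, 600), "ps")
    print("multi-cycle: ", execution_time_ps(COUNTS, MULTI_CYCLE, 200), "ps",
          "average CPI", average_cpi(COUNTS, MULTI_CYCLE))
```

With these particular numbers the multi-cycle design is not faster, because every state pays for the slowest step; the point of the sketch is only that CPI and clock cycle time become separate knobs, so a different mix or a better-balanced cycle changes the outcome.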
58 An Aside: CPI vs. Frequency
- CPI vs. clock cycle time: at odds with each other. Reducing one increases the other for a single instruction. Why?
- Average CPI can be amortized/reduced via concurrent processing of multiple instructions: the same "cycle" is devoted to multiple instructions. Example: pipelining, superscalar execution.
59 A Multi-cycle Microarchitecture: A Closer Look
60 How Do We Implement This?
- Maurice Wilkes, "The Best Way to Design an Automatic Calculating Machine," Manchester Univ. Computer Inaugural Conf., 1951.
- The concept of microcoded/microprogrammed machines.
- Realization: one can implement the "process instruction" step as a finite state machine that sequences between states and eventually returns back to the "fetch instruction" state.
- A state is defined by the control signals asserted in it. Control signals for the next state are determined in the current state.
61 The Instruction Processing Cycle
- Fetch, Decode, Evaluate Address, Fetch Operands, Execute, Store Result
62 A Basic Multi-cycle Microarchitecture
- Instruction processing cycle divided into "states". A stage in the instruction processing cycle can take multiple states.
- A multi-cycle microarchitecture sequences from state to state to process an instruction. The behavior of the machine in a state is completely determined by control signals in that state.
- The behavior of the entire processor is specified fully by a finite state machine.
- In a state (clock cycle), control signals control how the datapath should process the data and how to generate the control signals for the next clock cycle.
63 Microprogrammed Control Terminology
- Microinstruction: control signals associated with the current state.
- Microsequencing: act of transitioning from one state to another; determining the next state and the microinstruction for the next state.
- Control store: stores control signals for every possible state; store for microinstructions for the entire FSM.
- Microsequencer: determines which set of control signals will be used in the next clock cycle (i.e. next state).
64 What Happens In A Clock Cycle?
- The control signals (microinstruction) for the current state control processing in the datapath and generation of control signals (microinstruction) for the next cycle. See Supplemental Fig 1.
- Datapath and microsequencer operate concurrently.
- Question: why not generate control signals for the current cycle in the current cycle? This will lengthen the clock cycle. Why would it lengthen the clock cycle? See Supplemental Fig 2.
65 A Clock Cycle
66 A Bad Clock Cycle!
67 A Simple LC-3b Control and Datapath
68 What Determines Next-State Control Signals?
- What is happening in the current clock cycle: see the 9 control signals coming from the "Control" block. What are these for?
- The instruction that is being executed: IR[15:11] coming from the Data Path.
- Whether the condition of a branch is met, if the instruction being processed is a branch: BEN bit coming from the datapath.
- Whether the memory operation is completing in the current cycle, if one is in progress: R bit coming from memory.
69 A Simple LC-3b Control and Datapath
70 The State Machine for Multi-cycle Processing
- The behavior of the LC-3b uarch is completely determined by the 35 control signals and additional 7 bits that go into the control logic from the datapath.
- 35 control signals completely describe the state of the control structure.
- We can completely describe the behavior of the LC-3b as a state machine, i.e. a directed graph of nodes (one corresponding to each state) and arcs (showing flow from each state to the next state(s)).
71 An LC-3b State Machine
- Patt and Patel, App C, Figure C.2
- Each state must be uniquely specified, done by means of state variables.
- 31 distinct states in this LC-3b state machine, encoded with 6 state variables.
- Examples: states 18, 19 correspond to the beginning of the instruction processing cycle. Fetch phase: state 18, 19 → state 33 → state 35. Decode phase: state 32.
72 [Figure C.2: the LC-3b state machine (Patt and Patel, App C). Fetch begins in state 18, 19 (MAR ← PC, PC ← PC + 2), continues in state 33 (MDR ← M, repeated until R is asserted) and state 35 (IR ← MDR). State 32 computes BEN ← IR[11] & N + IR[10] & Z + IR[9] & P and dispatches on IR[15:12] to the per-instruction states (ADD, AND, XOR, TRAP, SHF, LEA, LDB, LDW, STW, STB, JSR, JMP, BR, RTI), each of which eventually returns to state 18. Notes from the figure: B+off6 is Base + SEXT[offset6]; PC+off9 is PC + SEXT[offset9]; OP2 may be SR2 or SEXT[imm5]; byte accesses use [15:8] or [7:0] depending on MAR[0].]
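The fetch and decode part of this state machine can be walked through in a few lines. The Python sketch below lists the states visited from fetch through decode when memory needs a given number of cycles; it assumes that state 33 is repeated once per memory cycle and that the ready bit R is asserted in the last of those cycles. The function name `fetch_decode_states` and the parameter `mem_latency_cycles` are made up for this illustration.

```python
# Sketch: states visited by the LC-3b FSM from fetch through decode.
# Assumption: state 33 is occupied once per memory cycle and R is asserted in the
# last one. Function and parameter names are hypothetical.

def fetch_decode_states(mem_latency_cycles):
    """Return the list of state numbers visited, one entry per clock cycle."""
    if mem_latency_cycles < 1:
        raise ValueError("memory needs at least one cycle")
    states = [18]                       # MAR ← PC, PC ← PC + 2
    states += [33] * mem_latency_cycles  # MDR ← M; wait here until R is asserted
    states.append(35)                   # IR ← MDR
    states.append(32)                   # compute BEN, dispatch on IR[15:12]
    return states


if __name__ == "__main__":
    print(fetch_decode_states(1))  # single-cycle memory
    print(fetch_decode_states(5))  # slower memory: more cycles spent in state 33
```

The number of cycles spent on an instruction therefore grows with memory latency without changing the clock period, which is the property a single-cycle design lacks.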
73 LC-3b State Machine: Some Questions
- How many cycles does the fastest instruction take?
- How many cycles does the slowest instruction take?
- Why does the BR take as long as it takes in the FSM?
- What determines the clock cycle?
- Is this a Mealy machine or a Moore machine?
74 LC-3b Datapath
- Patt and Patel, App C, Figure C.3
- Single-bus datapath design: at any point only one value can be "gated" on the bus (i.e., can be driving the bus).
  - Advantage: low hardware cost (one bus).
  - Disadvantage: reduced concurrency. If an instruction needs the bus twice for two different things, these need to happen in different states.
- Control signals (26 of them) determine what happens in the datapath in one clock cycle. Patt and Patel, App C, Table C.1
75
76 We did not cover the following slides in lecture. These are for your preparation for the next lecture.
77 C.4. THE CONTROL STRUCTURE
[Figure C.6: Additional logic required to provide control signals: (a) DRMUX, producing DR from IR[11:9]; (b) SR1MUX, producing SR1 from IR[11:9] or IR[8:6]; (c) logic producing BEN from IR[11:9] and N, Z, P.]
78
79 LC-3b Datapath: Some Questions
- How does instruction fetch happen in this datapath according to the state machine?
- What is the difference between gating and loading?
- Is this the smallest hardware you can design?
80 LC-3b Microprogrammed Control Structure
- Patt and Patel, App C, Figure C.4
- Three components: microinstruction, control store, microsequencer.
- Microinstruction: control signals that control the datapath (26 of them) and determine the next state (9 of them).
- Each microinstruction is stored in a unique location in the control store (a special memory structure). Unique location: address of the state corresponding to the microinstruction. Remember each state corresponds to one microinstruction.
- Microsequencer determines the address of the next microinstruction (i.e., next state).
81 [Figure: the microsequencer takes R, IR[15:11], and BEN as inputs and supplies a 6-bit address to the control store; the microinstruction read out consists of 9 bits (J, COND, IRD) and 26 bits.]
82 APPENDIX C. THE MICROARCHITECTURE OF THE LC-3B, BASIC MACHINE
[Figure C.5: The microsequencer of the LC-3b base machine. Inputs: COND1, COND0, BEN (Branch), R (Ready), IR[11] (Addr. Mode), J[5] to J[0], IRD, and 0,0,IR[15:12]; output: the 6-bit address of the next state.]
83 [Figure: control store contents, one microinstruction per state (State 0 through State 63), with the fields J, IRD, Cond, LD.MDR, LD.IR, LD.BEN, LD.REG, LD.CC, LD.MAR, GatePC, GateMDR, GateALU, LD.PC, GateMARMUX, GateSHF, PCMUX, DRMUX, SR1MUX, ADDR1MUX, ADDR2MUX, MARMUX, ALUK, MIO.EN, R.W, DATA.SIZE, LSHF1.]
84 LC-3b Microsequencer
- Patt and Patel, App C, Figure C.5
- The purpose of the microsequencer is to determine the address of the next microinstruction (i.e., next state).
- Next address depends on 9 control signals.
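A compact way to see how the 9 sequencing bits (J, COND, IRD) and the 7 datapath bits (IR[15:11], BEN, R) combine is to write the next-address logic as a function. The Python sketch below follows my reading of Patt and Patel's Figure C.5 and should be treated as an assumption rather than a definitive transcription: COND = 1 ORs R into J[1], COND = 2 ORs BEN into J[2], COND = 3 ORs IR[11] into J[0], and IRD selects 00 concatenated with IR[15:12]. The function name `next_state` and its argument order are hypothetical.

```python
# Sketch of the LC-3b microsequencer's next-state address logic.
# Assumed encoding (my reading of Figure C.5): COND 0 = unconditional,
# 1 = memory ready, 2 = branch, 3 = addressing mode. Names are hypothetical.

def next_state(j, cond, ird, ir, ben, r):
    """Return the 6-bit address of the next microinstruction.

    j    : 6-bit J field of the current microinstruction
    cond : 2-bit COND field
    ird  : 1 if the next state is selected by the opcode (decode state)
    ir   : 16-bit instruction register value
    ben  : branch enable bit from the datapath
    r    : memory ready bit
    """
    if ird:
        return (ir >> 12) & 0xF           # 0,0,IR[15:12]: 16-way dispatch on opcode
    address = j & 0x3F
    if cond == 1 and r:
        address |= 0b000010               # memory ready sets J[1]
    elif cond == 2 and ben:
        address |= 0b000100               # branch enable sets J[2]
    elif cond == 3 and (ir >> 11) & 1:
        address |= 0b000001               # IR[11] (addressing mode) sets J[0]
    return address


if __name__ == "__main__":
    # State 33 waits on memory: J = 33, COND = 1.
    print(next_state(33, 1, 0, 0x0000, 0, 0))  # stays in 33 while R = 0
    print(next_state(33, 1, 0, 0x0000, 0, 1))  # moves to 35 once R = 1
```

ORing a condition bit into one bit of J is why paired states are numbered so that they differ in exactly that bit, for example 33 and 35, or 18 and 22 for an untaken and taken BR.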
85 APPENDIX C. THE MICROARCHITECTURE OF THE LC-3B, BASIC MACHINE
[Figure C.5: The microsequencer of the LC-3b base machine. Inputs: COND1, COND0, BEN (Branch), R (Ready), IR[11] (Addr. Mode), J[5] to J[0], IRD, and 0,0,IR[15:12]; output: the 6-bit address of the next state.]
86 The Microsequencer: Some Questions
- When is the IRD signal asserted?
- What happens if an illegal instruction is decoded?
- What are condition (COND) bits for?
- How is variable latency memory handled?
- How do you do the state encoding? Minimize number of state variables. Start with the 16-way branch. Then determine constraint tables and states dependent on COND.
87 Variable-Latency Memory
- The ready signal (R) enables memory read/write to execute correctly.
- Example: the transition from state 33 to state 35 is controlled by the R bit, asserted by memory when memory data is available.
- Could we have done this in a single-cycle microarchitecture?
88 The Microsequencer: Advanced Questions
- What happens if the machine is interrupted?
- What if an instruction generates an exception?
- How can you implement a complex instruction using this control structure? Think REP MOVS.
More informationReview Multicycle: What is Happening. Controlling The Multicycle Design
Review lticycle: What is Happening Reslt Zero Op SrcA SrcB Registers Reg Address emory em Data Sign etend Shift left Sorce A B Ot [-6] [5-] [-6] [5-] [5-] Instrction emory IR RegDst emtoreg IorD em em
More information1048: Computer Organization
48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation (cwli@twins.ee.nct.ed.tw) 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory
More informationCMSC Computer Architecture Lecture 4: Single-Cycle uarch and Pipelining. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 4: Single-Cycle uarch and Pipelining Prof. Yanjing Li University of Chicago Administrative Stuff! Lab1 due at 11:59pm today! Lab2 out " Pipeline ARM simulator "
More informationComputer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University
Compter Architectre Chapter 5 Fall 25 Department of Compter Science Kent State University The Processor: Datapath & Control Or implementation of the MIPS is simplified memory-reference instrctions: lw,
More informationCS 251, Winter 2018, Assignment % of course mark
CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the
More informationEXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION
EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage
More informationQuiz #1 EEC 483, Spring 2019
Qiz # EEC 483, Spring 29 Date: Jan 22 Name: Eercise #: Translate the following instrction in C into IPS code. Eercise #2: Translate the following instrction in C into IPS code. Hint: operand C is stored
More informationEEC 483 Computer Organization
EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and
More informationEXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation
EXAINATIONS 2003 COP203 END-YEAR Compter Organisation Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. There are 180 possible marks on the eam. Calclators and foreign langage dictionaries
More informationReview: Computer Organization
Review: Compter Organization Pipelining Chans Y Landry Eample Landry Eample Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 3 mintes A B C D Dryer takes 3 mintes
More informationThe multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM
Lectre (Wed /5/28) Lab # Hardware De Fri Oct 7 HW #2 IPS programming, de Wed Oct 22 idterm Fri Oct 2 IorD The mlticycle path SrcA Today s objectives: icroprogramming Etending the mlti-cycle path lti-cycle
More informationReview. A single-cycle MIPS processor
Review If three instrctions have opcodes, 7 and 5 are they all of the same type? If we were to add an instrction to IPS of the form OD $t, $t2, $t3, which performs $t = $t2 OD $t3, what wold be its opcode?
More informationPART I: Adding Instructions to the Datapath. (2 nd Edition):
EE57 Instrctor: G. Pvvada ===================================================================== Homework #5b De: check on the blackboard =====================================================================
More informationCS 251, Winter 2019, Assignment % of course mark
CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse
More informationWhat do we have so far? Multi-Cycle Datapath
What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining
More informationProf. Kozyrakis. 1. (10 points) Consider the following fragment of Java code:
EE8 Winter 25 Homework #2 Soltions De Thrsday, Feb 2, 5 P. ( points) Consider the following fragment of Java code: for (i=; i
More informationComputer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures
18-447 Computer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 1/28/2015 Agenda for Today & Next Few Lectures Single-cycle
More informationExceptions and interrupts
Eceptions and interrpts An eception or interrpt is an nepected event that reqires the CPU to pase or stop the crrent program. Eception handling is the hardware analog of error handling in software. Classes
More information1048: Computer Organization
8: Compter Organization Lectre 6 Pipelining Lectre6 - pipelining (cwli@twins.ee.nct.ed.tw) 6- Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards
More informationCS 251, Spring 2018, Assignment 3.0 3% of course mark
CS 25, Spring 28, Assignment 3. 3% of corse mark De onday, Jne 25th, 5:3 P. (5 points) Consider the single-cycle compter shown on page 6 of this assignment. Sppose the circit elements take the following
More informationEEC 483 Computer Organization
EEC 83 Compter Organization Chapter.6 A Pipelined path Chans Y Pipelined Approach 2 - Cycle time, No. stages - Resorce conflict E E A B C D 3 E E 5 E 2 3 5 2 6 7 8 9 c.y9@csohio.ed Resorces sed in 5 Stages
More informationEnhanced Performance with Pipelining
Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the
More informationCSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control
CSE-45432 Introdction to Compter Architectre Chapter 5 The Processor: Datapath & Control Dr. Izadi Data Processor Register # PC Address Registers ALU memory Register # Register # Address Data memory Data
More information1048: Computer Organization
48: Compter Organization Lectre 5 Datapath and Control Lectre5B - mlticycle implementation (cwli@twins.ee.nct.ed.tw) 5B- Recap: A Single-Cycle Processor PCSrc 4 Add Shift left 2 Add ALU reslt PC address
More informationTDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction
Review Friday the 2st of October Real world eamples of pipelining? How does pipelining pp inflence instrction latency? How does pipelining inflence instrction throghpt? What are the three types of hazard
More informationCS 251, Winter 2018, Assignment % of course mark
CS 25, Winter 28, Assignment 3.. 3% of corse mark De onday, Febrary 26th, 4:3 P Lates accepted ntil : A, Febrary 27th with a 5% penalty. IEEE 754 Floating Point ( points): (a) (4 points) Complete the following
More informationLecture 7. Building A Simple Processor
Lectre 7 Bilding A Simple Processor Christos Kozyrakis Stanford University http://eeclass.stanford.ed/ee8b C. Kozyrakis EE8b Lectre 7 Annoncements Upcoming deadlines Lab is de today Demo by 5pm, report
More informationChapter 6: Pipelining
CSE 322 COPUTER ARCHITECTURE II Chapter 6: Pipelining Chapter 6: Pipelining Febrary 10, 2000 1 Clothes Washing CSE 322 COPUTER ARCHITECTURE II The Assembly Line Accmlate dirty clothes in hamper Place in
More informationChapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts
CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction
More informationLecture 3: Single Cycle Microarchitecture. James C. Hoe Department of ECE Carnegie Mellon University
8 447 Lecture 3: Single Cycle Microarchitecture James C. Hoe Department of ECE Carnegie Mellon University 8 447 S8 L03 S, James C. Hoe, CMU/ECE/CALCM, 208 Your goal today Housekeeping first try at implementing
More informationSolutions for Chapter 6 Exercises
Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the
More informationPS Midterm 2. Pipelining
PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry
More informationDesign of Digital Circuits Lecture 13: Multi-Cycle Microarch. Prof. Onur Mutlu ETH Zurich Spring April 2017
Design of Digital Circuits Lecture 3: Multi-Cycle Microarch. Prof. Onur Mutlu ETH Zurich Spring 27 6 April 27 Agenda for Today & Next Few Lectures! Single-cycle Microarchitectures! Multi-cycle and Microprogrammed
More informationLab 8 (All Sections) Prelab: ALU and ALU Control
Lab 8 (All Sections) Prelab: and Control Name: Sign the following statement: On my honor, as an Aggie, I have neither given nor received nathorized aid on this academic work Objective In this lab yo will
More informationComp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13
Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign
More informationLecture 10: Pipelined Implementations
U 8-7 S 9 L- 8-7 Lectre : Pipelined Implementations James. Hoe ept of EE, U Febrary 23, 29 nnoncements: Project is de this week idterm graded, d reslts posted Handots: H9 Homework 3 (on lackboard) Graded
More informationOverview of Pipelining
EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase
More informationHardware Design Tips. Outline
Hardware Design Tips EE 36 University of Hawaii EE 36 Fall 23 University of Hawaii Otline Verilog: some sbleties Simlators Test Benching Implementing the IPS Actally a simplified 6 bit version EE 36 Fall
More informationChapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind
Pipelining hink of sing machines in landry services Chapter 6 nhancing Performance with Pipelining 6 P 7 8 9 A ime ask A B C ot pipelined Assme 3 min. each task wash, dry, fold, store and that separate
More informationLecture 13: Exceptions and Interrupts
18 447 Lectre 13: Eceptions and Interrpts S 10 L13 1 James C. Hoe Dept of ECE, CU arch 1, 2010 Annoncements: Handots: Spring break is almost here Check grades on Blackboard idterm 1 graded Handot #9: Lab
More informationAnimating the Datapath. Animating the Datapath: R-type Instruction. Animating the Datapath: Load Instruction. MIPS Datapath I: Single-Cycle
nimating the atapath PS atapath : Single-Cycle npt is either (-type) or sign-etended lower half of instrction (load/store) op offset/immediate W egister File 6 6 + from instrction path beq,, offset if
More informationInstruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion
. (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA
More informationChapter 6: Pipelining
Chapter 6: Pipelining Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards and stalls Branch hazards Eceptions Sperscalar and dynamic pipelining
More informationEEC 483 Computer Organization. Branch (Control) Hazards
EEC 483 Compter Organization Section 4.8 Branch Hazards Section 4.9 Exceptions Chans Y Branch (Control) Hazards While execting a previos branch, next instrction address might not yet be known. s n i o
More informationSingle-Cycle Examples, Multi-Cycle Introduction
Single-Cycle Examples, ulti-cycle Introduction 1 Today s enu Single cycle examples Single cycle machines vs. multi-cycle machines Why multi-cycle? Comparative performance Physical and Logical Design of
More informationCSEN 601: Computer System Architecture Summer 2014
CSEN 601: Computer System Architecture Summer 2014 Practice Assignment 5 Solutions Exercise 5-1: (Midterm Spring 2013) a. What are the values of the control signals (except ALUOp) for each of the following
More informationCSSE232 Computer Architecture I. Mul5cycle Datapath
CSSE232 Compter Architectre I Ml5cycle Datapath Class Stats Next 3 days : Ml5cycle datapath ing Ml5cycle datapath is not in the book! How long do instrc5ons take? ALU 2ns Mem 2ns Reg File 1ns Everything
More informationLecture 7 Pipelining. Peng Liu.
Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt
More informationPIPELINING. Pipelining: Natural Phenomenon. Pipelining. Pipelining Lessons
Pipelining: Natral Phenomenon Landry Eample: nn, rian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 mintes C D Dryer takes 0 mintes PIPELINING Folder takes 20 mintes
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control
ELEC 52/62 Computer Architecture and Design Spring 217 Lecture 4: Datapath and Control Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849
More informationETH, Design of Digital Circuits, SS17 Practice Exercises III
ETH, Design of Digital Circuits, SS17 Practice Exercises III Instructors: Prof. Onur Mutlu, Prof. Srdjan Capkun TAs: Jeremie Kim, Minesh Patel, Hasan Hassan, Arash Tavakkol, Der-Yeuan Yu, Francois Serre,
More informationMIPS Architecture. Fibonacci (C) Fibonacci (Assembly) Another Example: MIPS. Example: subset of MIPS processor architecture
Another Eample: IPS From the Harris/Weste book Based on the IPS-like processor from the Hennessy/Patterson book IPS Architectre Eample: sbset of IPS processor architectre Drawn from Patterson & Hennessy
More informationLecture 8: Data Hazard and Resolution. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 8: Data Hazard and Resolution James C. Hoe Department of ECE Carnegie ellon University 18 447 S18 L08 S1, James C. Hoe, CU/ECE/CALC, 2018 Your goal today Housekeeping detect and resolve
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationComputer and Information Sciences College / Computer Science Department The Processor: Datapath and Control
Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control Chapter 5 The Processor: Datapath and Control Big Picture: Where are We Now? Performance of a
More informationLecture 9: Microcontrolled Multi-Cycle Implementations. Who Am I?
18-447 Lecture 9: Microcontrolled Multi-Cycle Implementations S 10 L9-1 James C. Hoe José F. Martínez Electrical & Computer Engineering Carnegie Mellon University February 1, 2010 Who Am I? S 10 L9-2 Associate
More informationCSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade
CSE 141 Compter Architectre Smmer Session I, 2004 Lectres 10 Advanced Topics, emory Hierarchy and Cache Pramod V. Argade CSE141: Introdction to Compter Architectre Instrctor: TA: Pramod V. Argade (p2argade@cs.csd.ed)
More information4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language 345.e1
.3 Advanced Topic: An Introdction to Digital Design Using a Hardware Design Langage 35.e.3 Advanced Topic: An Introdction to Digital Design Using a Hardware Design Langage to Describe and odel a Pipeline
More informationCC 311- Computer Architecture. The Processor - Control
CC 311- Computer Architecture The Processor - Control Control Unit Functions: Instruction code Control Unit Control Signals Select operations to be performed (ALU, read/write, etc.) Control data flow (multiplexor
More informationFull Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI
CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Full Datapath Branch Target Instruction Fetch Immediate 4 Today s Contents We have looked
More informationMark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control
EE 37 Unit Single-Cycle CPU path and Control CPU Organization Scope We will build a CPU to implement our subset of the MIPS ISA Memory Reference Instructions: Load Word (LW) Store Word (SW) Arithmetic
More informationDepartment of Electrical and Computer Engineering The University of Texas at Austin
Department of Electrical and Computer Engineering The University of Texas at Austin EE 360N, Spring 2003 Yale Patt, Instructor Hyesoon Kim, Onur Mutlu, Moinuddin Qureshi, Santhosh Srinath, TAs Exam 1,
More informationCSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content
3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design
More informationDesign of Digital Circuits Lecture 15: Pipelining. Prof. Onur Mutlu ETH Zurich Spring April 2017
Design of Digital Circuits Lecture 5: Pipelining Prof. Onur Mutlu ETH Zurich Spring 27 3 April 27 Agenda for Today & Next Few Lectures! Single-cycle Microarchitectures! Multi-cycle and Microprogrammed
More informationLecture 5: The Processor
Lecture 5: The Processor CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and
More informationReview: Abstract Implementation View
Review: Abstract Implementation View Split memory (Harvard) model - single cycle operation Simplified to contain only the instructions: memory-reference instructions: lw, sw arithmetic-logical instructions:
More informationCPE 335 Computer Organization. Basic MIPS Architecture Part I
CPE 335 Computer Organization Basic MIPS Architecture Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s8/index.html CPE232 Basic MIPS Architecture
More informationInf2C - Computer Systems Lecture Processor Design Single Cycle
Inf2C - Computer Systems Lecture 10-11 Processor Design Single Cycle Boris Grot School of Informatics University of Edinburgh Previous lectures Combinational circuits Combinations of gates (INV, AND, OR,
More informationInstruction Pipelining is the use of pipelining to allow more than one instruction to be in some stage of execution at the same time.
Pipelining Pipelining is the se of pipelining to allow more than one instrction to be in some stage of eection at the same time. Ferranti ATLAS (963): Pipelining redced the average time per instrction
More informationCS 152 Computer Architecture and Engineering Lecture 4 Pipelining
CS 152 Computer rchitecture and Engineering Lecture 4 Pipelining 2014-1-30 John Lazzaro (not a prof - John is always OK) T: Eric Love www-inst.eecs.berkeley.edu/~cs152/ Play: 1 otorola 68000 Next week
More informationWinter 2013 MIDTERM TEST #2 Wednesday, March 20 7:00pm to 8:15pm. Please do not write your U of C ID number on this cover page.
page of 7 University of Calgary Departent of Electrical and Copter Engineering ENCM 369: Copter Organization Lectre Instrctors: Steve Noran and Nor Bartley Winter 23 MIDTERM TEST #2 Wednesday, March 2
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationMIPS-Lite Single-Cycle Control
MIPS-Lite Single-Cycle Control COE68: Computer Organization and Architecture Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University Overview Single cycle
More informationCENG 3420 Lecture 06: Datapath
CENG 342 Lecture 6: Datapath Bei Yu byu@cse.cuhk.edu.hk CENG342 L6. Spring 27 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified to contain only: memory-reference
More informationImproving Performance: Pipelining
Improving Performance: Pipelining Memory General registers Memory ID EXE MEM WB Instruction Fetch (includes PC increment) ID Instruction Decode + fetching values from general purpose registers EXE EXEcute
More informationLecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
More informationChapter 4. The Processor. Computer Architecture and IC Design Lab
Chapter 4 The Processor Introduction CPU performance factors CPI Clock Cycle Time Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS
More informationCPE 335. Basic MIPS Architecture Part II
CPE 335 Computer Organization Basic MIPS Architecture Part II Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Architecture
More informationCS2214 COMPUTER ARCHITECTURE & ORGANIZATION SPRING 2014
CS COPTER ARCHITECTRE & ORGANIZATION SPRING DE : TA HOEWORK IV READ : i) Related portions of Chapter (except Sections. through.) ii) Related portions of Appendix A iii) Related portions of Appendix iv)
More informationProcessor (I) - datapath & control. Hwansoo Han
Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two
More informationCENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu
CENG 342 Computer Organization and Design Lecture 6: MIPS Processor - I Bei Yu CEG342 L6. Spring 26 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified
More informationCOMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions
COP33 - Computer Architecture Lecture ulti-cycle Design & Exceptions Single Cycle Datapath We designed a processor that requires one cycle per instruction RegDst busw 32 Clk RegWr Rd ux imm6 Rt 5 5 Rs
More informationLecture 10 Multi-Cycle Implementation
Lecture 10 ulti-cycle Implementation 1 Today s enu ulti-cycle machines Why multi-cycle? Comparative performance Physical and Logical Design of Datapath and Control icroprogramming 2 ulti-cycle Solution
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationDepartment of Electrical and Computer Engineering The University of Texas at Austin
Department of Electrical and Computer Engineering The University of Texas at Austin EE 360N, Fall 003 Yale Patt, Instructor Santhosh Srinath, Danny Lynch, TAs Exam 1, October 0, 003 Name: Problem 1 (0
More information361 control.1. EECS 361 Computer Architecture Lecture 9: Designing Single Cycle Control
36 control. EECS 36 Computer Architecture Lecture 9: Designing Single Cycle Control Recap: The MIPS Subset ADD and subtract add rd, rs, rt sub rd, rs, rt OR Imm: ori rt, rs, imm6 3 3 26 2 6 op rs rt rd
More informationDepartment of Electrical and Computer Engineering The University of Texas at Austin
Department of Electrical and Computer Engineering The University of Texas at Austin EE 60N, Fall 00 Yale Patt, Instructor Santhosh Srinath, Danny Lynch, TAs Exam, November 9, 00 Name: Problem (0 points):
More informationCOMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 4 The Processor: A Based on P&H Introduction We will examine two MIPS implementations A simplified version A more realistic pipelined
More informationChapter 5 Solutions: For More Practice
Chapter 5 Solutions: For More Practice 1 Chapter 5 Solutions: For More Practice 5.4 Fetching, reading registers, and writing the destination register takes a total of 300ps for both floating point add/subtract
More information