Design of Digital Circuits Lecture 13: Multi-Cycle Microarch. Prof. Onur Mutlu ETH Zurich Spring April 2017
|
|
- Rafe Clarke
- 5 years ago
- Views:
Transcription
1 Design of Digital Circuits Lecture 3: Multi-Cycle Microarch. Prof. Onur Mutlu ETH Zurich Spring 27 6 April 27
2 Agenda for Today & Next Few Lectures! Single-cycle Microarchitectures! Multi-cycle and Microprogrammed Microarchitectures! Pipelining! Issues in Pipelining: Control & Data Dependence Handling, State Maintenance and Recovery,! Out-of-Order Execution! Issues in OoO Execution: Load-Store Handling, 2
3 Readings for Today! P&P, Chapter 4, 5 " Microarchitecture! P&P, Revised Appendix C " Microarchitecture of the LC-3b " Appendix A (LC-3b ISA) will be useful in following this! H&H, Chapter 7.4 (keep reading)! Optional " Maurice Wilkes, The Best Way to Design an Automatic Calculating Machine, Manchester Univ. Computer Inaugural Conf., 95. 3
4 Implementing the ISA: Microarchitecture Basics (Review) 4
5 How Does a Machine Process Instructions?! What does processing an instruction mean?! We will assume the von Neumann model (for now) AS = Architectural (programmer visible) state before an instruction is processed Process instruction AS = Architectural (programmer visible) state after an instruction is processed! Processing an instruction: Transforming AS to AS according to the ISA specification of the instruction 5
6 The Von Neumann Model/Architecture! Also called stored program computer (instructions in memory). Two key properties:! Stored program " Instructions stored in a linear memory array " Memory is unified between instructions and data! The interpretation of a stored value depends on the control signals When is a value interpreted as an instruction?! Sequential instruction processing " One instruction processed (fetched, executed, and completed) at a time " Program counter (instruction pointer) identifies the current instr. " Program counter is advanced sequentially except for control transfer instructions 6
7 The Von Neumann Model/Architecture! Recommended reading " Burks, Goldstein, von Neumann, Preliminary discussion of the logical design of an electronic computing instrument, 946.! Required reading " Patt and Patel book, Chapter 4, The von Neumann Model! Stored program! Sequential instruction processing 7
8 The Von Neumann Model (of a Computer) MEMORY Mem Addr Reg Mem Data Reg INPUT PROCESSING UNIT ALU TEMP OUTPUT CONTROL UNIT IP Inst Register 8
9 The Von Neumann Model (of a Computer)! Q: Is this the only way that a computer can operate?! A: No.! Qualified Answer: But, it has been the dominant way " i.e., the dominant paradigm for computing " for N decades 9
10 The Dataflow Model (of a Computer)! Von Neumann model: An instruction is fetched and executed in control flow order " As specified by the instruction pointer " Sequential unless explicit control flow instruction! Dataflow model: An instruction is fetched and executed in data flow order " i.e., when its operands are ready " i.e., there is no instruction pointer " Instruction ordering specified by data flow dependence! Each instruction specifies who should receive the result! An instruction can fire whenever all operands are received " Potentially many instructions can execute at the same time! Inherently more parallel
11 Von Neumann vs Dataflow! Consider a Von Neumann program " What is the significance of the program order? " What is the significance of the storage locations? a b v <= a + b; w <= b * 2; x <= v - w + *2 y <= v + w z <= x * y Sequential - + * Dataflow! Which model is more natural to you as a programmer? z
12 More on Data Flow! In a data flow machine, a program consists of data flow nodes " A data flow node fires (fetched and executed) when all it inputs are ready! i.e. when all inputs have tokens! Data flow node and its ISA representation 2
13 Data Flow Nodes 3
14 An Example Data Flow Program OUT 4
15 ISA-level Tradeoff: Instruction Pointer! Do we need an instruction pointer in the ISA? " Yes: Control-driven, sequential execution! An instruction is executed when the IP points to it! IP automatically changes sequentially (except for control flow instructions) " No: Data-driven, parallel execution! An instruction is executed when all its operand values are available (data flow)! Tradeoffs: MANY high-level ones " Ease of programming (for average programmers)? " Ease of compilation? " Performance: Extraction of parallelism? " Hardware complexity? 5
16 ISA vs. Microarchitecture Level Tradeoff! A similar tradeoff (control vs. data-driven execution) can be made at the microarchitecture level! ISA: Specifies how the programmer sees instructions to be executed " Programmer sees a sequential, control-flow execution order vs. " Programmer sees a data-flow execution order! Microarchitecture: How the underlying implementation actually executes instructions " Microarchitecture can execute instructions in any order as long as it obeys the semantics specified by the ISA when making the instruction results visible to software! Programmer should see the order specified by the ISA 6
17 The Process Instruction Step! ISA specifies abstractly what AS should be, given an instruction and AS " It defines an abstract finite state machine where! State = programmer-visible state! Next-state logic = instruction execution specification " From ISA point of view, there are no intermediate states between AS and AS during instruction execution! One state transition per instruction! Microarchitecture implements how AS is transformed to AS " There are many choices in implementation " We can have programmer-invisible state to optimize the speed of instruction execution: multiple state transitions per instruction! Choice : AS # AS (transform AS to AS in a single clock cycle)! Choice 2: AS # AS+MS # AS+MS2 # AS+MS3 # AS (take multiple clock cycles to transform AS to AS ) 7
18 A Very Basic Instruction Processing Engine! Each instruction takes a single clock cycle to execute! Only combinational logic is used to implement instruction execution " No intermediate, programmer-invisible state updates AS = Architectural (programmer visible) state at the beginning of a clock cycle Process instruction in one clock cycle AS = Architectural (programmer visible) state at the end of a clock cycle 8
19 A Very Basic Instruction Processing Engine! Single-cycle machine Combina.onal Logic AS Sequen.al Logic (State) AS! What is the clock cycle time determined by?! What is the critical path of the combinational logic determined by? 9
20 Remember: Programmer Visible (Architectural) State M[] M[] M[2] M[3] M[4] Registers - given special names in the ISA (as opposed to addresses) - general vs. special purpose M[N-] Memory array of storage loca.ons indexed by an address Program Counter memory address of the current instruc.on Instruc.ons (and programs) specify how to transform the values of programmer visible state 2
21 Single-cycle vs. Multi-cycle Machines! Single-cycle machines " Each instruction takes a single clock cycle " All state updates made at the end of an instruction s execution " Big disadvantage: The slowest instruction determines cycle time # long clock cycle time! Multi-cycle machines " Instruction processing broken into multiple cycles/stages " State updates can be made during an instruction s execution " Architectural state updates made only at the end of an instruction s execution " Advantage over single-cycle: The slowest stage determines cycle time! Both single-cycle and multi-cycle machines literally follow the von Neumann model at the microarchitecture level 2
22 Instruction Processing Cycle! Instructions are processed under the direction of a control unit step by step.! Instruction cycle: Sequence of steps to process an instruction! Fundamentally, there are six steps:! Fetch! Decode! Evaluate Address! Fetch Operands! Execute! Store Result! Not all instructions require all six steps (see P&P Ch. 4) 22
23 Instruction Processing Cycle vs. Machine Clock Cycle! Single-cycle machine: " All six phases of the instruction processing cycle take a single machine clock cycle to complete! Multi-cycle machine: " All six phases of the instruction processing cycle can take multiple machine clock cycles to complete " In fact, each phase can take multiple clock cycles to complete 23
24 Instruction Processing Viewed Another Way! Instructions transform Data (AS) to Data (AS )! This transformation is done by functional units " Units that operate on data! These units need to be told what to do to the data! An instruction processing engine consists of two components " Datapath: Consists of hardware elements that deal with and transform data signals! functional units that operate on data! hardware structures (e.g. wires and muxes) that enable the flow of data into the functional units and registers! storage units that store data (e.g., registers) " Control logic: Consists of hardware elements that determine control signals, i.e., signals that specify what the datapath elements should do to the data 24
25 Single-cycle vs. Multi-cycle: Control & Data! Single-cycle machine: " Control signals are generated in the same clock cycle as the one during which data signals are operated on " Everything related to an instruction happens in one clock cycle (serialized processing)! Multi-cycle machine: " Control signals needed in the next cycle can be generated in the current cycle " Latency of control processing can be overlapped with latency of datapath operation (more parallelism)! We will see the difference clearly in microprogrammed multi-cycle microarchitectures 25
26 Many Ways of Datapath and Control Design! There are many ways of designing the data path and control logic! Single-cycle, multi-cycle, pipelined datapath and control! Single-bus vs. multi-bus datapaths! Hardwired/combinational vs. microcoded/microprogrammed control " Control signals generated by combinational logic versus " Control signals stored in a memory structure! Control signals and structure depend on the datapath design 26
27 Performance Analysis! Execution time of an instruction " {CPI} x {clock cycle time}! Execution time of a program " Sum over all instructions [{CPI} x {clock cycle time}] " {# of instructions} x {Average CPI} x {clock cycle time}! Single cycle microarchitecture performance " CPI = " Clock cycle time = long! Multi-cycle microarchitecture performance " CPI = different for each instruction! Average CPI # hopefully small " Clock cycle time = short Now, we have two degrees of freedom to optimize independently 27
28 Review: Complete Single-Cycle Processor SignImm A RD Instruction Memory + 4 A A3 WD3 RD2 RD WE3 A2 Sign Extend Register File A RD Data Memory WD WE PC PC' Instr 25:2 2:6 5: 5: SrcB 2:6 5: <<2 + ALUResult ReadData WriteData SrcA PCPlus4 PCBranch WriteReg 4: Result 3:26 RegDst Branch MemWrite MemtoReg ALUSrc RegWrite Op Funct Control Unit Zero PCSrc ALUControl 2: ALU
29 Another Example Single-Cycle Processor Instruction [25 ] Shift Jump address [3 ] left PCSrc =Jump 4 Add PC+4 [3 28] Instruction [3 26] Control RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Shift left 2 Add ALU result M u x M u x PCSrc 2 =Br Taken PC Read address Instruction memory Instruction [3 ] Instruction [25 2] Instruction [2 6] Instruction [5 ] M u x Read register Read data Read register 2 Registers Read Write data 2 register Write data M u x bcond Zero ALU ALU result Address Write data Data memory Read data M u x Instruction [5 ] 6 Sign 32 extend ALU control ALU opera.on Instruction [5 ] **Based on original figure from [P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] JAL, JR, JALR omiaed 29
30 Evaluating the Single-Cycle Microarchitecture 3
31 A Single-Cycle Microarchitecture! Is this a good idea/design?! When is this a good design?! When is this a bad design?! How can we design a better microarchitecture? 3
32 A Single-Cycle Microarchitecture: Analysis! Every instruction takes cycle to execute " CPI (Cycles per instruction) is strictly! How long each instruction takes is determined by how long the slowest instruction takes to execute " Even though many instructions do not need that long to execute! Clock cycle time of the microarchitecture is determined by how long it takes to complete the slowest instruction " Critical path of the design is determined by the processing time of the slowest instruction 32
33 What is the Slowest Instruction to Process?! Let s go back to the basics! All six phases of the instruction processing cycle take a single machine clock cycle to complete " Fetch " Decode " Evaluate Address " Fetch Operands " Execute " Store Result. Instruction fetch (IF) 2. Instruction decode and register operand fetch (ID/RF) 3. Execute/Evaluate memory address (EX/AG) 4. Memory operand fetch (MEM) 5. Store/writeback result (WB)! Do each of the above phases take the same time (latency) for all instructions? 33
34 Example Single-Cycle Datapath Analysis! Assume (for the design in Slide 29) " memory units (read or write): 2 ps " ALU and adders: ps " register file (read or write): 5 ps " other combinational logic: ps steps IF ID EX MEM WB resources mem RF ALU mem RF Delay R-type I-type LW SW Branch Jump
35 Let s Find the Critical Path Instruction [25 ] Shift Jump address [3 ] left PCSrc =Jump 4 Add PC+4 [3 28] Instruction [3 26] Control RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Shift left 2 Add ALU result M u x M u x PCSrc 2 =Br Taken PC Read address Instruction memory Instruction [3 ] Instruction [25 2] Instruction [2 6] Instruction [5 ] M u x Read register Read data Read register 2 Registers Read Write data 2 register Write data M u x bcond Zero ALU ALU result Address Write data Data memory Read data M u x Instruction [5 ] 6 Sign 32 extend ALU control ALU opera.on Instruction [5 ] [Based on original figure from P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] 35
36 R-Type and I-Type ALU Instruction [25 ] Shift Jump address [3 ] left PCSrc =Jump ps 4 Add ps PC+4 [3 28] Instruction [3 26] Control RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Shift left 2 Add ALU result M u x M u x PCSrc 2 =Br Taken PC Read address Instruction memory Instruction [3 ] 2ps Instruction [25 2] Instruction [2 6] Instruction [5 ] M u x Read register Read data Read register 2 Registers Read Write data 2 register Write data 4ps 25ps M u x bcond Zero ALU ALU result Address 35ps Write data Data memory Read data M u x Instruction [5 ] 6 Sign 32 extend ALU control ALU opera.on Instruction [5 ] [Based on original figure from P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] 36
37 LW Instruction [25 ] Shift Jump address [3 ] left PCSrc =Jump ps 4 Add ps PC+4 [3 28] Instruction [3 26] Control RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Shift left 2 Add ALU result M u x M u x PCSrc 2 =Br Taken PC Read address Instruction memory Instruction [3 ] 2ps Instruction [25 2] Instruction [2 6] Instruction [5 ] M u x Read register Read data Read register 2 Registers Read Write data 2 register Write 6ps data 25ps M u x bcond Zero ALU ALU result Address 35ps Write data Data memory Read data 55ps M u x Instruction [5 ] 6 Sign 32 extend ALU control ALU opera.on Instruction [5 ] [Based on original figure from P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] 37
38 SW Instruction [25 ] Shift Jump address [3 ] left PCSrc =Jump ps 4 Add ps PC+4 [3 28] Instruction [3 26] Control RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Shift left 2 Add ALU result M u x M u x PCSrc 2 =Br Taken PC Read address Instruction memory Instruction [3 ] 2ps Instruction [25 2] Instruction [2 6] Instruction [5 ] M u x Read register Read data Read register 2 Registers Read Write data 2 register Write data 25ps M u x bcond Zero ALU ALU result 35ps Address Read data Data Write 55ps memory data M u x Instruction [5 ] 6 Sign 32 extend ALU control ALU opera.on Instruction [5 ] [Based on original figure from P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] 38
39 Branch Taken 35ps PC 4 Read address Instruction memory Add Instruction [3 ] ps Instruction [25 ] Shift Jump address [3 ] left ps PC+4 [3 28] Instruction [3 26] Instruction [25 2] Instruction [2 6] Instruction [5 ] Instruction [5 ] Control M u x RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Read register Read data Read register 2 Registers Read Write data 2 register Write data 6 Sign 32 extend Shift left 2 25ps M u x ALU control 2ps Add ALU result bcond Zero ALU ALU result 35ps ALU opera.on Write data Address PCSrc =Jump M u x Data memory M u x PCSrc 2 =Br Taken Read data M u x Instruction [5 ] [Based on original figure from P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] 39
40 Jump Instruction [25 ] Shift Jump address [3 ] left PCSrc =Jump 2ps 4 Add ps PC+4 [3 28] Instruction [3 26] Control RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite Shift left 2 Add ALU result M u x M u x PCSrc 2 =Br Taken PC Read address Instruction memory Instruction [3 ] 2ps Instruction [25 2] Instruction [2 6] Instruction [5 ] M u x Read register Read data Read register 2 Registers Read Write data 2 register Write data M u x bcond Zero ALU ALU result Address Write data Data memory Read data M u x Instruction [5 ] 6 Sign 32 extend ALU control ALU opera.on Instruction [5 ] [Based on original figure from P&H CO&D, COPYRIGHT 24 Elsevier. ALL RIGHTS RESERVED.] 4
41 What About Control Logic?! How does that affect the critical path?! Food for thought for you: " Can control logic be on the critical path? " A note on CDC 56: control store access too long 4
42 What is the Slowest Instruction to Process?! Memory is not magic! What if memory sometimes takes ms to access?! Does it make sense to have a simple register to register add or jump to take {ms+all else to do a memory operation}?! And, what if you need to access memory more than once to process an instruction? " Which instructions need this? " Do you provide multiple ports to memory? 42
43 Single Cycle uarch: Complexity! Contrived " All instructions run as slow as the slowest instruction! Inefficient " All instructions run as slow as the slowest instruction " Must provide worst-case combinational resources in parallel as required by any instruction " Need to replicate a resource if it is needed more than once by an instruction during different parts of the instruction processing cycle! Not necessarily the simplest way to implement an ISA " Single-cycle implementation of REP MOVS (x86) or INDEX (VAX)?! Not easy to optimize/improve performance " Optimizing the common case does not work (e.g. common instructions) " Need to optimize the worst case all the time 43
44 (Micro)architecture Design Principles! Critical path design " Find and decrease the maximum combinational logic delay " Break a path into multiple cycles if it takes too long! Bread and butter (common case) design " Spend time and resources on where it matters most! i.e., improve what the machine is really designed to do " Common case vs. uncommon case! Balanced design " Balance instruction/data flow through hardware components " Design to eliminate bottlenecks: balance the hardware for the work 44
45 Single-Cycle Design vs. Design Principles! Critical path design! Bread and butter (common case) design! Balanced design How does a single-cycle microarchitecture fare in light of these principles? 45
46 Aside: System Design Principles! When designing computer systems/architectures, it is important to follow good principles! Remember: principled design from our first lecture " Frank Lloyd Wright: architecture [ ] based upon principle, and not upon precedent 46
47 Aside: From Lecture! architecture [ ] based upon principle, and not upon precedent 47
48 Aside: System Design Principles! We will continue to cover key principles in this course! Here are some references where you can learn more! Yale Patt, Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution, Proc. of IEEE, 2. (Levels of transformation, design point, etc)! Mike Flynn, Very High-Speed Computing Systems, Proc. of IEEE, 966. (Flynn s Bottleneck # Balanced design)! Gene M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS Conference, April 967. (Amdahl s Law # Common-case design)! Butler W. Lampson, Hints for Computer System Design, ACM Operating Systems Review, 983. " 48
49 A Key System Design Principle!! Keep it simple Everything should be made as simple as possible, but no simpler. "!! Albert Einstein And, keep it low cost: An engineer is a person who can do for a dime what any fool can do for a dollar. For more, see: " " Butler W. Lampson, Hints for Computer System Design, ACM Operating Systems Review,
50 Multi-Cycle Microarchitectures 5
51 Multi-Cycle Microarchitectures! Goal: Let each instruction take (close to) only as much time it really needs! Idea " Determine clock cycle time independently of instruction processing time " Each instruction takes as many clock cycles as it needs to take! Multiple state transitions per instruction! The states followed by each instruction is different 5
52 Remember: The Process instruction Step! ISA specifies abstractly what AS should be, given an instruction and AS " It defines an abstract finite state machine where! State = programmer-visible state! Next-state logic = instruction execution specification " From ISA point of view, there are no intermediate states between AS and AS during instruction execution! One state transition per instruction! Microarchitecture implements how AS is transformed to AS " There are many choices in implementation " We can have programmer-invisible state to optimize the speed of instruction execution: multiple state transitions per instruction! Choice : AS # AS (transform AS to AS in a single clock cycle)! Choice 2: AS # AS+MS # AS+MS2 # AS+MS3 # AS (take multiple clock cycles to transform AS to AS ) 52
53 Multi-Cycle Microarchitecture AS = Architectural (programmer visible) state at the beginning of an instruction Step : Process part of instruction in one clock cycle Step 2: Process part of instruction in the next clock cycle AS = Architectural (programmer visible) state at the end of a clock cycle 53
54 Benefits of Multi-Cycle Design! Critical path design " Can keep reducing the critical path independently of the worstcase processing time of any instruction! Bread and butter (common case) design " Can optimize the number of states it takes to execute important instructions that make up much of the execution time! Balanced design " No need to provide more capability or resources than really needed! An instruction that needs resource X multiple times does not require multiple X s to be implemented! Leads to more efficient hardware: Can reuse hardware components needed multiple times for an instruction 54
55 Downsides of Multi-Cycle Design! Need to store the intermediate results at the end of each clock cycle " Hardware overhead for registers " Register setup/hold overhead paid multiple times for an instruction 55
56 Remember: Performance Analysis! Execution time of an instruction " {CPI} x {clock cycle time}! Execution time of a program " Sum over all instructions [{CPI} x {clock cycle time}] " {# of instructions} x {Average CPI} x {clock cycle time}! Single cycle microarchitecture performance " CPI = " Clock cycle time = long! Multi-cycle microarchitecture performance " CPI = different for each instruction! Average CPI # hopefully small " Clock cycle time = short Not easy to optimize design We have two degrees of freedom to optimize independently 56
57 A Multi-Cycle Microarchitecture A Closer Look 57
58 How Do We Implement This?! Maurice Wilkes, The Best Way to Design an Automatic Calculating Machine, Manchester Univ. Computer Inaugural Conf., 95.! An elegant implementation: " The concept of microcoded/microprogrammed machines 58
59 Multi-Cycle uarch! Key Idea for Realization " One can implement the process instruction step as a finite state machine that sequences between states and eventually returns back to the fetch instruction state " A state is defined by the control signals asserted in it " Control signals for the next state determined in current state 59
60 The Instruction Processing Cycle " Fetch " Decode " Evaluate Address " Fetch Operands " Execute " Store Result 6
61 A Basic Multi-Cycle Microarchitecture! Instruction processing cycle divided into states! A stage in the instruction processing cycle can take multiple states! A multi-cycle microarchitecture sequences from state to state to process an instruction! The behavior of the machine in a state is completely determined by control signals in that state! The behavior of the entire processor is specified fully by a finite state machine! In a state (clock cycle), control signals control two things:! How the datapath should process the data! How to generate the control signals for the (next) clock cycle 6
62 One Example Multi-Cycle Microarchitecture 62
63 Carnegie Mellon Remember: Single-Cycle MIPS Processor Jump 3:26 5: MemtoReg Control MemWrite Unit Branch ALUControl 2: Op ALUSrc Funct RegDst RegWrite PCSrc PC' PC A RD Instruction Memory Instr 25:2 2:6 A A2 A3 WD3 WE3 Register File RD RD2 SrcA SrcB ALU Zero ALUResult WriteData A WE RD Data Memory WD ReadData Result PCJump 4 + PCPlus4 2:6 5: 5: Sign Extend WriteReg 4: SignImm <<2 + PCBranch 27: 3:28 25: <<2 63
64 Carnegie Mellon MulB-cycle MIPS Processor! Single-cycle microarchitecture: + simple - cycle.me limited by longest instruc.on (lw) - three adders/alus and two memories! MulB-cycle microarchitecture: + higher clock speed + simpler instruc.ons run faster + reuse expensive hardware on mul.ple cycles - sequencing overhead paid many.mes - hardware overhead for storing intermediate results! Same design steps: datapath & control 64
65 Carnegie Mellon What Do We Want To OpBmize! Single Cycle Architecture uses two memories $ One memory stores instruc.ons, the other data $ We want to use a single memory (Smaller size) 65
66 Carnegie Mellon What Do We Want To OpBmize! Single Cycle Architecture uses two memories $ One memory stores instruc.ons, the other data $ We want to use a single memory (Smaller size)! Single Cycle Architecture needs three adders $ ALU, PC, Branch address calcula.on $ We want to use the ALU for all opera.ons (smaller size) 66
67 Carnegie Mellon What Do We Want To OpBmize! Single Cycle Architecture uses two memories $ One memory stores instruc.ons, the other data $ We want to use a single memory (Smaller size)! Single Cycle Architecture needs three adders $ ALU, PC, Branch address calcula.on $ We want to use the ALU for all opera.ons (smaller size)! In Single Cycle Architecture all instrucbons take one cycle $ The most complex opera.on slows down everything! $ Divide all instruc.ons into mul.ple steps $ Simpler instruc.ons can take fewer cycles (average case may be faster) 67
68 Carnegie Mellon Consider the lw instrucbon! For an instrucbon such as: lw $t, x2($t)! We need to: $ Read the instruc.on from memory $ Then read $t from register array $ Add the immediate value (x2) to calculate the memory address $ Read the content of this address $ Write to the register $t this content 68
69 Carnegie Mellon MulB-cycle Datapath: instrucbon fetch! First consider execubng lw $ STEP : Fetch instruc.on IRWrite PC' b PC A WE RD EN Instr A A2 WE3 RD RD2 Instr / Data Memory WD A3 WD3 Register File read from the memory location [rs]+imm to location [rt] I-Type op rs rt imm 6 bits 5 bits 5 bits 6 bits 69
70 Carnegie Mellon MulB-cycle Datapath: lw register read IRWrite PC' b PC A WE RD EN Instr 25:2 A A2 WE3 RD RD2 A Instr / Data Memory WD A3 WD3 Register File I-Type op rs rt imm 6 bits 5 bits 5 bits 6 bits 7
71 Carnegie Mellon MulB-cycle Datapath: lw immediate IRWrite PC' b PC A WE RD EN Instr 25:2 A A2 WE3 RD RD2 A Instr / Data Memory WD A3 WD3 Register File SignImm 5: Sign Extend I-Type op rs rt imm 6 bits 5 bits 5 bits 6 bits 7
72 Carnegie Mellon MulB-cycle Datapath: lw address IRWrite ALUControl 2: PC' b PC A RD Instr / Data Memory WD WE EN Instr 25:2 A A2 A3 WD3 WE3 Register File RD RD2 A SrcA SrcB ALU ALUResult ALUOut SignImm 5: Sign Extend I-Type op rs rt imm 6 bits 5 bits 5 bits 6 bits 72
73 Carnegie Mellon MulB-cycle Datapath: lw memory read IorD IRWrite ALUControl 2: PC' b PC Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 A A2 A3 WD3 WE3 Register File RD RD2 A SrcA SrcB ALU ALUResult ALUOut SignImm 5: Sign Extend I-Type op rs rt imm 6 bits 5 bits 5 bits 6 bits 73
74 Carnegie Mellon MulB-cycle Datapath: lw write register IorD IRWrite RegWrite ALUControl 2: PC' b PC Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 A A2 A3 WD3 WE3 Register File RD RD2 A SrcA SrcB ALU ALUResult ALUOut SignImm 5: Sign Extend I-Type op rs rt imm 6 bits 5 bits 5 bits 6 bits 74
75 Carnegie Mellon MulB-cycle Datapath: increment PC PCWrite IorD IRWrite RegWrite ALUSrcA ALUSrcB : ALUControl 2: PC' b EN PC Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 A A2 A3 WD3 WE3 Register File RD RD2 A 4 SrcA SrcB ALU ALUResult ALUOut 5: Sign Extend SignImm 75
76 Carnegie Mellon MulB-cycle Datapath: sw! Write data in rt to memory PCWrite IorD MemWrite IRWrite RegWrite ALUSrcA ALUSrcB : ALUControl 2: PC' PC EN b Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 2:6 A A2 A3 WD3 WE3 Register File RD RD2 A B 4 SrcA SrcB ALU ALUResult ALUOut 5: Sign Extend SignImm 76
77 Carnegie Mellon MulB-cycle Datapath: R-type InstrucBons! Read from rs and rt $ Write ALUResult to register file $ Write to rd (instead of rt) PCWrite IorD MemWrite IRWrite RegDst MemtoReg RegWrite ALUSrcA ALUSrcB : ALUControl 2: PC' PC EN b Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 2:6 5: A A2 A3 WD3 WE3 Register File RD RD2 A B 4 SrcA SrcB ALU ALUResult ALUOut 5: Sign Extend SignImm 77
78 Carnegie Mellon MulB-cycle Datapath: beq! Determine whether values in rs and rt are equal $ Calculate branch target address: BTA = (sign-extended immediate << 2) + (PC+4) PCEn IorD MemWrite IRWrite RegDst MemtoReg RegWrite ALUSrcA ALUSrcB : ALUControl 2: BranchPCWrite PCSrc PC' PC EN b Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 2:6 5: A A2 A3 WD3 WE3 Register File RD RD2 A B SrcA Zero ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 78
79 Carnegie Mellon Complete MulB-cycle Processor IorD MemWrite IRWrite 3:26 5: Control Unit Op Funct PCWrite Branch PCSrc ALUControl 2: ALUSrcB : ALUSrcA RegWrite PCEn PC' PC EN Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 2:6 5: RegDst MemtoReg A A2 A3 WD3 WE3 Register File RD RD2 A B SrcA Zero ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 79
80 Carnegie Mellon Control Unit Control Unit Opcode 5: Main Controller (FSM) MemtoReg RegDst IorD PCSrc ALUSrcB : ALUSrcA IRWrite MemWrite PCWrite Branch RegWrite Multiplexer Selects Register Enables ALUOp : Funct 5: ALU Decoder ALUControl 2: 8
81 Carnegie Mellon Main Controller FSM: Fetch Reset S: Fetch IorD MemWrite IRWrite 3:26 5: Control Unit Op Funct PCWrite Branch PCSrc ALUControl 2: ALUSrcB : ALUSrcA RegWrite PCEn PC' WE PC Instr Adr RD EN A EN Instr / Data Memory WD Data 25:2 2:6 2:6 5: RegDst X MemtoReg X A A2 A3 WD3 WE3 Register File RD RD2 A B SrcA Zero ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 8
82 Carnegie Mellon Main Controller FSM: Fetch S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite IorD MemWrite IRWrite 3:26 5: Control Unit Op Funct PCWrite Branch PCSrc ALUControl 2: ALUSrcB : ALUSrcA RegWrite PCEn PC' WE PC Instr Adr RD EN A EN Instr / Data Memory WD Data 25:2 2:6 2:6 5: RegDst X MemtoReg X A A2 A3 WD3 WE3 Register File RD RD2 A B SrcA Zero ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 82
83 Carnegie Mellon Main Controller FSM: Decode S: Fetch S: Decode IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite IorD MemWrite IRWrite 3:26 5: Control Unit Op Funct PCWrite Branch PCSrc ALUControl 2: ALUSrcB : ALUSrcA RegWrite PCEn PC' X WE PC Instr Adr RD EN A EN Instr / Data Memory WD Data 25:2 2:6 2:6 5: RegDst X MemtoReg X A A2 A3 WD3 WE3 Register File RD RD2 A B X SrcA XXX Zero X XX ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 83
84 Carnegie Mellon Main Controller FSM: Address CalculaBon S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite S: Decode S2: MemAdr Op = LW or Op = SW IorD MemWrite IRWrite 3:26 5: Control Unit Op Funct PCWrite Branch PCSrc ALUControl 2: ALUSrcB : ALUSrcA RegWrite PCEn PC' X WE PC Instr Adr RD EN A EN Instr / Data Memory WD Data 25:2 2:6 2:6 5: RegDst X MemtoReg X A A2 A3 WD3 WE3 Register File RD RD2 A B SrcA Zero X ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 84
85 Carnegie Mellon Main Controller FSM: Address CalculaBon S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite S: Decode S2: MemAdr ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW IorD MemWrite IRWrite 3:26 5: Control Unit Op Funct PCWrite Branch PCSrc ALUControl 2: ALUSrcB : ALUSrcA RegWrite PCEn PC' X WE PC Instr Adr RD EN A EN Instr / Data Memory WD Data 25:2 2:6 2:6 5: RegDst X MemtoReg X A A2 A3 WD3 WE3 Register File RD RD2 A B SrcA Zero X ALUResult ALUOut 4 SrcB <<2 ALU 5: Sign Extend SignImm 85
86 Carnegie Mellon Main Controller FSM: lw S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite S: Decode S2: MemAdr Op = LW or Op = SW ALUSrcA = ALUSrcB = ALUOp = Op = LW S3: MemRead IorD = S4: Mem Writeback RegDst = MemtoReg = RegWrite 86
87 Carnegie Mellon Main Controller FSM: sw S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite S: Decode S2: MemAdr Op = LW or Op = SW ALUSrcA = ALUSrcB = ALUOp = Op = LW S3: MemRead Op = SW S5: MemWrite IorD = IorD = MemWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 87
88 Carnegie Mellon Main Controller FSM: R-Type S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite S: Decode S2: MemAdr Op = LW or Op = SW Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = ALUSrcA = ALUSrcB = ALUOp = Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 88
89 Carnegie Mellon Main Controller FSM: beq S2: MemAdr S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW S: Decode ALUSrcA = ALUSrcB = ALUOp = Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = Op = BEQ S8: Branch ALUSrcA = ALUSrcB = ALUOp = PCSrc = Branch Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 89
90 Carnegie Mellon Complete MulB-cycle Controller FSM S2: MemAdr S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW S: Decode ALUSrcA = ALUSrcB = ALUOp = Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = Op = BEQ S8: Branch ALUSrcA = ALUSrcB = ALUOp = PCSrc = Branch Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 9
91 Carnegie Mellon Main Controller FSM: addi S2: MemAdr S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW S: Decode ALUSrcA = ALUSrcB = ALUOp = Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = Op = BEQ Op = ADDI S8: Branch ALUSrcA = ALUSrcB = ALUOp = PCSrc = Branch S9: ADDI Execute Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback S: ADDI Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 9
92 Carnegie Mellon Main Controller FSM: addi S2: MemAdr S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW S: Decode ALUSrcA = ALUSrcB = ALUOp = Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = Op = BEQ Op = ADDI S8: Branch ALUSrcA = ALUSrcB = ALUOp = PCSrc = Branch S9: ADDI Execute ALUSrcA = ALUSrcB = ALUOp = Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback S: ADDI Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 92
93 Carnegie Mellon Extended FuncBonality: j PCEn IorD MemWrite IRWrite RegDst MemtoReg RegWrite ALUSrcA ALUSrcB : ALUControl 2: BranchPCWrite PCSrc : PC' PC EN Adr A RD Instr / Data Memory WD WE Instr EN Data 25:2 2:6 2:6 5: A A2 A3 WD3 WE3 Register File RD RD2 A B 3:28 4 <<2 SrcA SrcB ALU Zero ALUResult ALUOut PCJump <<2 27: SignImm 5: Sign Extend 25: (jump) 93
94 Carnegie Mellon Control FSM: j S2: MemAdr S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW S: Decode ALUSrcA = ALUSrcB = ALUOp = Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = Op = BEQ Op = J Op = ADDI S8: Branch ALUSrcA = ALUSrcB = ALUOp = PCSrc = Branch S: Jump S9: ADDI Execute ALUSrcA = ALUSrcB = ALUOp = Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback S: ADDI Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 94
95 Carnegie Mellon Control FSM: j S2: MemAdr S: Fetch IorD = Reset AluSrcA = ALUSrcB = ALUOp = PCSrc = IRWrite PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW or Op = SW S: Decode ALUSrcA = ALUSrcB = ALUOp = Op = R-type S6: Execute ALUSrcA = ALUSrcB = ALUOp = Op = BEQ Op = J Op = ADDI S8: Branch ALUSrcA = ALUSrcB = ALUOp = PCSrc = Branch S: Jump S9: ADDI Execute PCSrc = PCWrite ALUSrcA = ALUSrcB = ALUOp = Op = LW S3: MemRead Op = SW S5: MemWrite S7: ALU Writeback S: ADDI Writeback IorD = IorD = MemWrite RegDst = MemtoReg = RegWrite RegDst = MemtoReg = RegWrite S4: Mem Writeback RegDst = MemtoReg = RegWrite 95
96 Design of Digital Circuits Lecture 3: Multi-Cycle Microarch. Prof. Onur Mutlu ETH Zurich Spring 27 6 April 27
Design of Digital Circuits 2017 Srdjan Capkun Onur Mutlu (Guest starring: Frank K. Gürkaynak and Aanjhan Ranganathan)
Microarchitecture Design of Digital Circuits 27 Srdjan Capkun Onur Mutlu (Guest starring: Frank K. Gürkaynak and Aanjhan Ranganathan) http://www.syssec.ethz.ch/education/digitaltechnik_7 Adapted from Digital
More informationDigital Design & Computer Architecture (E85) D. Money Harris Fall 2007
Digital Design & Computer Architecture (E85) D. Money Harris Fall 2007 Final Exam This is a closed-book take-home exam. You are permitted a calculator and two 8.5x sheets of paper with notes. The exam
More informationCPE 335. Basic MIPS Architecture Part II
CPE 335 Computer Organization Basic MIPS Architecture Part II Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Architecture
More informationCO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19
CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be
More informationComputer Architecture. Lecture 5: Multi-Cycle and Microprogrammed Microarchitectures
Computer Architecture Lecture 5: Multi-Cycle and Microprogrammed Microarchitectures Dr. Ahmed Sallam Based on original slides by Prof. Onur Mutlu Agenda for Today & Next Few Lectures Single-cycle Microarchitectures
More informationCSE 2021 COMPUTER ORGANIZATION
CSE 22 COMPUTER ORGANIZATION HUGH CHESSER CHESSER HUGH CSEB 2U 2U CSEB Agenda Topics:. Sample Exam/Quiz Q - Review 2. Multiple cycle implementation Patterson: Section 4.5 Reminder: Quiz #2 Next Wednesday
More informationRISC Processor Design
RISC Processor Design Single Cycle Implementation - MIPS Virendra Singh Indian Institute of Science Bangalore virendra@computer.org Lecture 13 SE-273: Processor Design Feb 07, 2011 SE-273@SERC 1 Courtesy:
More informationLecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August
Lecture 8: Control COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Datapath and Control Datapath The collection of state elements, computation elements,
More informationEECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction
EECS150 - Digital Design Lecture 10- CPU Microarchitecture Feb 18, 2010 John Wawrzynek Spring 2010 EECS150 - Lec10-cpu Page 1 Processor Microarchitecture Introduction Microarchitecture: how to implement
More informationInf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle
Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle Boris Grot School of Informatics University of Edinburgh Previous lecture: single-cycle processor Inf2C Computer Systems - 2017-2018. Boris
More informationDigital Design and Computer Architecture
Digital Design and Computer Architecture Lab 0: Multicycle Processor (Part ) Introduction In this lab and the next, you will design and build your own multicycle MIPS processor. You will be much more on
More informationLecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón
ICS 152 Computer Systems Architecture Prof. Juan Luis Aragón Lecture 5 and 6 Multicycle Implementation Introduction to Microprogramming Readings: Sections 5.4 and 5.5 1 Review of Last Lecture We have seen
More informationMajor CPU Design Steps
Datapath Major CPU Design Steps. Analyze instruction set operations using independent RTN ISA => RTN => datapath requirements. This provides the the required datapath components and how they are connected
More informationCMSC Computer Architecture Lecture 4: Single-Cycle uarch and Pipelining. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 4: Single-Cycle uarch and Pipelining Prof. Yanjing Li University of Chicago Administrative Stuff! Lab1 due at 11:59pm today! Lab2 out " Pipeline ARM simulator "
More informationProcessor: Multi- Cycle Datapath & Control
Processor: Multi- Cycle Datapath & Control (Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3 rd Ed., Morgan Kaufmann, 27) COURSE
More informationRISC Architecture: Multi-Cycle Implementation
RISC Architecture: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay
More informationECE369. Chapter 5 ECE369
Chapter 5 1 State Elements Unclocked vs. Clocked Clocks used in synchronous logic Clocks are needed in sequential logic to decide when an element that contains state should be updated. State element 1
More informationComputer Architecture
Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics (Lectres
More informationRISC Architecture: Multi-Cycle Implementation
RISC Architecture: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control
ELEC 52/62 Computer Architecture and Design Spring 217 Lecture 4: Datapath and Control Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849
More informationENGN1640: Design of Computing Systems Topic 04: Single-Cycle Processor Design
ENGN64: Design of Computing Systems Topic 4: Single-Cycle Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University
More informationDigital Design and Computer Architecture Harris and Harris
Digital Design and Computer Architecture Harris and Harris Lab 0: Multicycle Processor (Part ) Introduction In this lab and the next, you will design and build your own multicycle MIPS processor. You will
More informationCC 311- Computer Architecture. The Processor - Control
CC 311- Computer Architecture The Processor - Control Control Unit Functions: Instruction code Control Unit Control Signals Select operations to be performed (ALU, read/write, etc.) Control data flow (multiplexor
More informationCSE 2021 COMPUTER ORGANIZATION
CSE 2021 COMPUTER ORGANIZATION HUGH LAS CHESSER 1012U HUGH CHESSER CSEB 1012U W10-M Agenda Topics: 1. Multiple cycle implementation review 2. State Machine 3. Control Unit implementation for Multi-cycle
More informationEECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer
EECS150 - Digital Design Lecture 9- CPU Microarchitecture Feb 15, 2011 John Wawrzynek Spring 2011 EECS150 - Lec09-cpu Page 1 Watson: Jeopardy-playing Computer Watson is made up of a cluster of ninety IBM
More informationComputer Architecture
Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Spring 25 Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics
More informationLECTURE 6. Multi-Cycle Datapath and Control
LECTURE 6 Multi-Cycle Datapath and Control SINGLE-CYCLE IMPLEMENTATION As we ve seen, single-cycle implementation, although easy to implement, could potentially be very inefficient. In single-cycle, we
More informationSystems Architecture I
Systems Architecture I Topics A Simple Implementation of MIPS * A Multicycle Implementation of MIPS ** *This lecture was derived from material in the text (sec. 5.1-5.3). **This lecture was derived from
More informationCOMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 4 The Processor: A Based on P&H Introduction We will examine two MIPS implementations A simplified version A more realistic pipelined
More informationENE 334 Microprocessors
ENE 334 Microprocessors Lecture 6: Datapath and Control : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 3 th & 4 th Edition, Patterson & Hennessy, 2005/2008, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
More informationChapter 5 Solutions: For More Practice
Chapter 5 Solutions: For More Practice 1 Chapter 5 Solutions: For More Practice 5.4 Fetching, reading registers, and writing the destination register takes a total of 300ps for both floating point add/subtract
More informationLecture 9: Microcontrolled Multi-Cycle Implementations. Who Am I?
18-447 Lecture 9: Microcontrolled Multi-Cycle Implementations S 10 L9-1 James C. Hoe José F. Martínez Electrical & Computer Engineering Carnegie Mellon University February 1, 2010 Who Am I? S 10 L9-2 Associate
More informationﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control
معماري كامپيوتر پيشرفته معماري MIPS data path and control abbasi@basu.ac.ir Topics Building a datapath support a subset of the MIPS-I instruction-set A single cycle processor datapath all instruction actions
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationCHW 362 : Computer Architecture & Organization
CHW 362 : Computer Architecture & Organization Instructors: Dr Ahmed Shalaby Dr Mona Ali http://bu.edu.eg/staff/ahmedshalaby4# http://www.bu.edu.eg/staff/mona.abdelbaset Review: Instruction Formats R-Type
More informationMultiple Cycle Data Path
Multiple Cycle Data Path CS 365 Lecture 7 Prof. Yih Huang CS365 1 Multicycle Approach Break up the instructions into steps, each step takes a cycle balance the amount of work to be done restrict each cycle
More informationComputer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Data Paths and Microprogramming
Computer Science 324 Computer Architecture Mount Holyoke College Fall 2007 Topic Notes: Data Paths and Microprogramming We have spent time looking at the MIPS instruction set architecture and building
More informationLecture 4: ISA Tradeoffs (Continued) and Single-Cycle Microarchitectures
Lecture 4: ISA Tradeoffs (Continued) and Single-Cycle Microarchitectures ISA-level Tradeoffs: Instruction Length Fixed length: Length of all instructions the same + Easier to decode single instruction
More informationCSEN 601: Computer System Architecture Summer 2014
CSEN 601: Computer System Architecture Summer 2014 Practice Assignment 5 Solutions Exercise 5-1: (Midterm Spring 2013) a. What are the values of the control signals (except ALUOp) for each of the following
More informationECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.
This exam is open book and open notes. You have 2 hours. Problems 1-4 refer to a proposed MIPS instruction lwu (load word - update) which implements update addressing an addressing mode that is used in
More informationProcessor (I) - datapath & control. Hwansoo Han
Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two
More informationChapter 4. The Processor. Computer Architecture and IC Design Lab
Chapter 4 The Processor Introduction CPU performance factors CPI Clock Cycle Time Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS
More informationCOMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath
COMP33 - Computer Architecture Lecture 8 Designing a Single Cycle Datapath The Big Picture The Five Classic Components of a Computer Processor Input Control Memory Datapath Output The Big Picture: The
More informationEECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141
EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 13 Project Introduction You will design and optimize a RISC-V processor Phase 1: Design
More informationMulti-cycle Approach. Single cycle CPU. Multi-cycle CPU. Requires state elements to hold intermediate values. one clock cycle or instruction
Multi-cycle Approach Single cycle CPU State element Combinational logic State element clock one clock cycle or instruction Multi-cycle CPU Requires state elements to hold intermediate values State Element
More informationComputer Architecture 计算机体系结构. Lecture 2. Instruction Set Architecture 第二讲 指令集架构. Chao Li, PhD. 李超博士
Computer Architecture 计算机体系结构 Lecture 2. Instruction Set Architecture 第二讲 指令集架构 Chao Li, PhD. 李超博士 SJTU-SE346, Spring 27 Review ENIAC (946) used decimal representation; vacuum tubes per digit; could store
More informationTopic #6. Processor Design
Topic #6 Processor Design Major Goals! To present the single-cycle implementation and to develop the student's understanding of combinational and clocked sequential circuits and the relationship between
More informationRISC Design: Multi-Cycle Implementation
RISC Design: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationCOMP2611: Computer Organization. The Pipelined Processor
COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among
More informationControl Unit for Multiple Cycle Implementation
Control Unit for Multiple Cycle Implementation Control is more complex than in single cycle since: Need to define control signals for each step Need to know which step we are on Two methods for designing
More informationLECTURE 5. Single-Cycle Datapath and Control
LECTURE 5 Single-Cycle Datapath and Control PROCESSORS In lecture 1, we reminded ourselves that the datapath and control are the two components that come together to be collectively known as the processor.
More information15-740/ Computer Architecture Lecture 7: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011
15-740/18-740 Computer Architecture Lecture 7: Pipelining Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011 Review of Last Lecture More ISA Tradeoffs Programmer vs. microarchitect Transactional
More informationCPE 335 Computer Organization. Basic MIPS Architecture Part I
CPE 335 Computer Organization Basic MIPS Architecture Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s8/index.html CPE232 Basic MIPS Architecture
More informationComputer Architecture
Computer Architecture Lecture 2: Fundamental Concepts and ISA Dr. Ahmed Sallam Based on original slides by Prof. Onur Mutlu What Do I Expect From You? Chance favors the prepared mind. (Louis Pasteur) كل
More informationTHE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination
THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination May 23, 2014 Name: Email: Student ID: Lab Section Number: Instructions: 1. This
More informationMapping Control to Hardware
C A P P E N D I X A custom format such as this is slave to the architecture of the hardware and the instruction set it serves. The format must strike a proper compromise between ROM size, ROM-output decoding,
More informationChapter 5: The Processor: Datapath and Control
Chapter 5: The Processor: Datapath and Control Overview Logic Design Conventions Building a Datapath and Control Unit Different Implementations of MIPS instruction set A simple implementation of a processor
More informationComputer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures
18-447 Computer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 1/28/2015 Agenda for Today & Next Few Lectures Single-cycle
More informationComputer Science 141 Computing Hardware
Computer Science 4 Computing Hardware Fall 6 Harvard University Instructor: Prof. David Brooks dbrooks@eecs.harvard.edu Upcoming topics Mon, Nov th MIPS Basic Architecture (Part ) Wed, Nov th Basic Computer
More informationCENG 3420 Lecture 06: Datapath
CENG 342 Lecture 6: Datapath Bei Yu byu@cse.cuhk.edu.hk CENG342 L6. Spring 27 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified to contain only: memory-reference
More informationALUOut. Registers A. I + D Memory IR. combinatorial block. combinatorial block. combinatorial block MDR
Microprogramming Exceptions and interrupts 9 CMPE Fall 26 A. Di Blas Fall 26 CMPE CPU Multicycle From single-cycle to Multicycle CPU with sequential control: Finite State Machine Textbook Edition: 5.4,
More informationInitial Representation Finite State Diagram. Logic Representation Logic Equations
Control Implementation Alternatives Control may be designed using one of several initial representations. The choice of sequence control, and how logic is represented, can then be determined independently;
More informationCS/COE0447: Computer Organization
CS/COE0447: Computer Organization and Assembly Language Datapath and Control Sangyeun Cho Dept. of Computer Science A simple MIPS We will design a simple MIPS processor that supports a small instruction
More informationCS/COE0447: Computer Organization
A simple MIPS CS/COE447: Computer Organization and Assembly Language Datapath and Control Sangyeun Cho Dept. of Computer Science We will design a simple MIPS processor that supports a small instruction
More informationPoints available Your marks Total 100
CSSE 3 Computer Architecture I Rose-Hulman Institute of Technology Computer Science and Software Engineering Department Exam Name: Section: 3 This exam is closed book. You are allowed to use the reference
More informationCENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu
CENG 342 Computer Organization and Design Lecture 6: MIPS Processor - I Bei Yu CEG342 L6. Spring 26 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified
More informationDesign of the MIPS Processor
Design of the MIPS Processor We will study the design of a simple version of MIPS that can support the following instructions: I-type instructions LW, SW R-type instructions, like ADD, SUB Conditional
More informationAlternative to single cycle. Drawbacks of single cycle implementation. Multiple cycle implementation. Instruction fetch
Drawbacks of single cycle implementation Alternative to single cycle All instructions take the same time although some instructions are longer than others; e.g. load is longer than add since it has to
More informationMulticycle Approach. Designing MIPS Processor
CSE 675.2: Introduction to Computer Architecture Multicycle Approach 8/8/25 Designing MIPS Processor (Multi-Cycle) Presentation H Slides by Gojko Babić and Elsevier Publishing We will be reusing functional
More informationThe Processor: Datapath & Control
Chapter Five 1 The Processor: Datapath & Control We're ready to look at an implementation of the MIPS Simplified to contain only: memory-reference instructions: lw, sw arithmetic-logical instructions:
More information15-740/ Computer Architecture Lecture 4: Pipelining. Prof. Onur Mutlu Carnegie Mellon University
15-740/18-740 Computer Architecture Lecture 4: Pipelining Prof. Onur Mutlu Carnegie Mellon University Last Time Addressing modes Other ISA-level tradeoffs Programmer vs. microarchitect Virtual memory Unaligned
More informationSystems Architecture
Systems Architecture Lecture 15: A Simple Implementation of MIPS Jeremy R. Johnson Anatole D. Ruslanov William M. Mongan Some or all figures from Computer Organization and Design: The Hardware/Software
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationThe MIPS Processor Datapath
The MIPS Processor Datapath Module Outline MIPS datapath implementation Register File, Instruction memory, Data memory Instruction interpretation and execution. Combinational control Assignment: Datapath
More informationCSEE W3827 Fundamentals of Computer Systems Homework Assignment 3 Solutions
CSEE W3827 Fundamentals of Computer Systems Homework Assignment 3 Solutions 2 3 4 5 Prof. Stephen A. Edwards Columbia University Due June 26, 207 at :00 PM ame: Solutions Uni: Show your work for each problem;
More informationThe Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
The Processor (1) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)
More informationLets Build a Processor
Lets Build a Processor Almost ready to move into chapter 5 and start building a processor First, let s review Boolean Logic and build the ALU we ll need (Material from Appendix B) operation a 32 ALU result
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationProcessor (multi-cycle)
CS359: Computer Architecture Processor (multi-cycle) Yanyan Shen Department of Computer Science and Engineering Five Instruction Steps ) Instruction Fetch ) Instruction Decode and Register Fetch 3) R-type
More informationECE 3056: Architecture, Concurrency and Energy of Computation. Single and Multi-Cycle Datapaths: Practice Problems
ECE 3056: Architecture, Concurrency and Energy of Computation Single and Multi-Cycle Datapaths: Practice Problems 1. Consider the single cycle SPIM datapath. a. Specify the values of the control signals
More informationLecture 7 Pipelining. Peng Liu.
Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt
More informationMulticycle conclusion
Multicycle conclusion The last few lectures covered a lot of material! We introduced a multicycle datapath, where different instructions take different numbers of cycles to execute. A multicycle unit is
More informationDesign of the MIPS Processor (contd)
Design of the MIPS Processor (contd) First, revisit the datapath for add, sub, lw, sw. We will augment it to accommodate the beq and j instructions. Execution of branch instructions beq $at, $zero, L add
More informationLab 8: Multicycle Processor (Part 1) 0.0
Lab 8: ulticycle Processor (Part ). Introduction In this lab and the next, you will design and build your own multicycle IPS processor! Your processor should match the design from the text reprinted below
More informationDesigning a Multicycle Processor
Designing a Multicycle Processor Arquitectura de Computadoras Arturo Díaz D PérezP Centro de Investigación n y de Estudios Avanzados del IPN adiaz@cinvestav.mx Arquitectura de Computadoras Multicycle-
More informationLecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
More informationProcessor Design Pipelined Processor (II) Hung-Wei Tseng
Processor Design Pipelined Processor (II) Hung-Wei Tseng Recap: Pipelining Break up the logic with pipeline registers into pipeline stages Each pipeline registers is clocked Each pipeline stage takes one
More informationEE457. Note: Parts of the solutions are extracted from the solutions manual accompanying the text book.
EE457 Instructor: G. Puvvada ======================================================================= Homework 5b, Solution ======================================================================= Note:
More informationChapter 4 The Processor (Part 2)
Department of Electr rical Eng ineering, Chapter 4 The Processor (Part 2) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Feng-Chia Unive ersity Outline A Multicycle Implementation Mapping Control
More informationCPU Organization (Design)
ISA Requirements CPU Organization (Design) Datapath Design: Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions (e.g., Registers, ALU, Shifters, Logic
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationLecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)
Lecture Topics Today: Single-Cycle Processors (P&H 4.1-4.4) Next: continued 1 Announcements Milestone #3 (due 2/9) Milestone #4 (due 2/23) Exam #1 (Wednesday, 2/15) 2 1 Exam #1 Wednesday, 2/15 (3:00-4:20
More informationBeyond Pipelining. CP-226: Computer Architecture. Lecture 23 (19 April 2013) CADSL
Beyond Pipelining Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationDesign of Digital Circuits Lecture 16: Dependence Handling. Prof. Onur Mutlu ETH Zurich Spring April 2017
Design of Digital Circuits Lecture 16: Dependence Handling Prof. Onur Mutlu ETH Zurich Spring 2017 27 April 2017 Agenda for Today & Next Few Lectures! Single-cycle Microarchitectures! Multi-cycle and Microprogrammed
More informationFull Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI
CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Full Datapath Branch Target Instruction Fetch Immediate 4 Today s Contents We have looked
More informationMIPS-Lite Single-Cycle Control
MIPS-Lite Single-Cycle Control COE68: Computer Organization and Architecture Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University Overview Single cycle
More information4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?
Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. 2. Define throughput. 3. Describe why using the clock rate of a processor is a bad way to measure performance. Provide
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations
More informationCSE Computer Architecture I Fall 2009 Lecture 13 In Class Notes and Problems October 6, 2009
CSE 30321 Computer Architecture I Fall 2009 Lecture 13 In Class Notes and Problems October 6, 2009 Question 1: First, we briefly review the notion of a clock cycle (CC). Generally speaking a CC is the
More information