ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017
1 ECE 550D Fundamentals of Computer Systems and Engineering, Fall 2017. Datapaths. Prof. John Board, Duke University. Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke) and Amir Roth (Penn).
2 What did we do last time? Last time: MIPS Assembly. Practice translating C to assembly together. Using functions. Calling conventions. jal (call), jr (return).
3 Now: confluence of MIPS + digital logic. Start of semester: Digital Logic (building blocks of digital design). Most recently: MIPS assembly, ISA (lowest level software). Now: where they meet. Datapaths: hardware implementation of processors. By the way: homework 4 = build a datapath (with some components from the TAs).
4 Necessary ingredient: the ALU. ALU: Arithmetic/Logic Unit. Performs any supported math or logic operation on two inputs. Which operation is chosen by a third input. [Figure: ALU with inputs A, B, and op, and output out]
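5 Add/Subtract With Overflow Detection. [Figure: chain of n full adders taking inputs a_{n-1} b_{n-1}, a_{n-2} b_{n-2}, ..., a_1 b_1, a_0 b_0 and producing sum bits S_{n-1}, S_{n-2}, ..., S_1, S_0, with an Add/Sub control input and an Overflow output]
6 ALU Slice. [Figure: one-bit ALU slice with inputs Cin, a, b, Add/sub, and 2-bit F; outputs Cout and Q]
Add/sub  F  Q
0        0  a + b
1        0  a - b
-        1  NOT b
-        2  a OR b
-        3  a AND b
7 The ALU. [Figure: n ALU slices chained, taking inputs a_{n-1} b_{n-1}, a_{n-2} b_{n-2}, ..., a_1 b_1, a_0 b_0 and producing Q_{n-1}, Q_{n-2}, ..., Q_1, Q_0; an ALU control block driven by op; outputs: out, Overflow, Is non-zero?]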
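The slice table on slide 6 and the ALU outputs on slide 7 can be sketched in software. This is a minimal behavioral model, not the course's reference design: the function name `alu`, its argument names, and the default width are hypothetical, and the overflow rule used here (carry into the top bit differs from carry out of it) is one standard choice that the slides do not spell out.

```python
def alu(a, b, add_sub, f, n=32):
    """Behavioral model of an n-bit ALU built from the slice on slide 6.

    f selects the result: 0 = adder output, 1 = NOT b, 2 = a OR b, 3 = a AND b.
    add_sub matters only when f == 0: 0 means a + b, 1 means a - b.
    Returns (out, overflow, nonzero), all as unsigned/bit values.
    """
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    overflow = 0
    if f == 0:
        # Subtract as a + (NOT b) + 1: invert b and use add_sub as carry-in.
        b_in = (b ^ mask) if add_sub else b
        total = a + b_in + add_sub
        out = total & mask
        carry_out = total >> n
        # Carry into the top bit: add only the bits below the top bit.
        low = mask >> 1
        carry_in_msb = ((a & low) + (b_in & low) + add_sub) >> (n - 1)
        # Assumed overflow rule: carry into MSB XOR carry out of MSB.
        overflow = carry_out ^ carry_in_msb
    elif f == 1:
        out = b ^ mask
    elif f == 2:
        out = a | b
    elif f == 3:
        out = a & b
    else:
        raise ValueError("f must be 0, 1, 2, or 3")
    return out, overflow, int(out != 0)
```

For example, with `n=4`, `alu(7, 1, 0, 0, n=4)` adds 7 + 1, which does not fit in a signed 4-bit value, so the overflow flag comes back set.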
8 Datapath for MIPS ISA. Consider only the following instructions: add $1,$2,$3; addi $1,2,$3; lw $1,4($3); sw $1,4($3); beq $1,$2,PC_relative_target; j absolute_target. Why only these? Most other instructions are the same from the datapath viewpoint. The ones that aren't are left for you to figure out.
9 Remember The von Neumann Model? Instruction Fetch -> Instruction Decode -> Operand Fetch -> Execute -> Result Store -> Next Instruction. Instruction Fetch: read instruction bits from memory. Decode: figure out what those bits mean. Operand Fetch: read registers (+ mem to get sources). Execute: do the actual operation (e.g., add the #s). Result Store: write result to register or memory. Next Instruction: figure out mem addr of next insn, repeat.
10 Start With Fetch. Same for all instructions (don't know insn yet). PC and instruction memory. A +4 incrementer computes the default next instruction PC. Details of instruction memory: later. For now: just assume a bunch of DFFs. [Figure: PC feeding instruction memory, with a +4 incrementer]
11 First Instruction: add. Decoding: very easy in MIPS. R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6). Add register file and ALU. [Figure: fetch datapath plus register file (ports s1, s2, d) and ALU]
12 First Instruction: add. Decoding: very easy in MIPS. AND, OR, other R-type identical, just change func code! Same datapath. R-type: Op(6) Rs(5) Rt(5) Rd(5) Sh(5) Func(6). Add register file and ALU.
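To see why decoding is "very easy", here is a small sketch that slices a 32-bit word into the R-type fields listed on the slide. The function name `decode_r_type` is hypothetical, and the sketch assumes the fields are laid out left to right from the most significant bit, as in the standard MIPS encoding.

```python
def decode_r_type(insn):
    """Split a 32-bit MIPS word into R-type fields (Op in the top 6 bits)."""
    return {
        "op": (insn >> 26) & 0x3F,    # Op(6)
        "rs": (insn >> 21) & 0x1F,    # Rs(5): first source register
        "rt": (insn >> 16) & 0x1F,    # Rt(5): second source register
        "rd": (insn >> 11) & 0x1F,    # Rd(5): destination register
        "sh": (insn >> 6) & 0x1F,     # Sh(5): shift amount
        "func": insn & 0x3F,          # Func(6): selects the ALU operation
    }
```

As a usage example, `0x00430820` is the standard MIPS encoding of `add $1,$2,$3`, so decoding it should yield rs = 2, rt = 3, rd = 1, and func = 0x20. Every field is a fixed bit range, so in hardware this step is only wiring.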
13 Second Instruction: addi. I-type: Op(6) Rs(5) Rt(5) Immed(16). Destination register can now be either Rd or Rt. Add sign extension unit (SX) and mux into second ALU input.
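The sign extension unit and the new mux can be modeled in a few lines. This is only a sketch: the function names are hypothetical, and the selector is named `alu_in_b` after the ALUinB control signal that appears later in the deck, assuming 1 selects the immediate.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def alu_input_b(reg_value, imm16, alu_in_b):
    """Mux in front of the second ALU input.

    Assumption: alu_in_b = 1 picks the sign-extended immediate (addi, lw, sw),
    alu_in_b = 0 picks the second register read value (add).
    """
    return sign_extend16(imm16) if alu_in_b else reg_value
```

With this model, an immediate field of `0xFFFC` reaches the ALU as -4, while `0x0004` reaches it as 4.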
14 Third Instruction: lw. I-type: Op(6) Rs(5) Rt(5) Immed(16). Add data memory, address is ALU output. Add register write data mux to select memory output or ALU output.
15 Fourth Instruction: sw. I-type: Op(6) Rs(5) Rt(5) Immed(16). Add path from second input register to data memory data input.
16 Fifth Instruction: beq. I-type: Op(6) Rs(5) Rt(5) Immed(16). Add left shift unit (<< 2) and adder to compute PC-relative branch target. Add PC input mux to select PC+4 or branch target. Note: shift by fixed amount is very simple. [Figure: datapath with the ALU zero output (z) feeding the PC input mux selection]
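The branch hardware on this slide can be sketched as a next-PC function. The names are hypothetical, and two details are assumptions taken from the standard MIPS definition rather than from the slide text: the offset is added to PC+4 (not PC), and the branch is taken when the ALU zero output is set.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, imm16, zero, br=1):
    """PC input mux: choose PC+4 or the PC-relative branch target."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Left shift by 2 turns a word offset into a byte offset.
    target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    return target if (br and zero) else pc_plus_4
```

For instance, a beq at address 0x100 with immediate 3 goes to 0x110 when the operands are equal and falls through to 0x104 otherwise.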
17 Sixth Instruction: j. J-type: Op(6) Immed(26). Add shifter to compute left shift of 26-bit immediate. Add additional PC input mux for jump target.
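A sketch of the jump target computation follows. The function name is hypothetical. The slide only mentions the left shift of the 26-bit immediate; filling the top four bits from PC+4 is how standard MIPS completes the 32-bit address and is an assumption here.

```python
def jump_target(pc, imm26):
    """Compute the target of a MIPS j instruction located at address pc."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # 26-bit immediate shifted left by 2 gives the low 28 bits of the target.
    low_28 = (imm26 & 0x03FFFFFF) << 2
    # Assumption (standard MIPS): the top 4 bits come from PC+4.
    return (pc_plus_4 & 0xF0000000) | low_28
```

For example, a jump at 0x00400000 with immediate 0x0100004 lands at 0x00400010.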
18 More Instructions. Figure out datapath modifications for jal (J-type) and jr (R-type).
19 Jal. For jal, need to get PC+4 to RF write mux (and constant 31 to destination register ID, probably another mux).
20 JR. For jr, need to get RF read value to next PC mux (and constant 31 to source register ID, again probably another mux).
21 Good practice: try other insns. Pick other MIPS instructions, contemplate how to add them.
22 Continuous Read Datapath Timing. Timeline: Read IMEM, Read Registers, Read DMEM, Write DMEM, Write Registers, Write PC. Works because writes (PC, RegFile, DMEM) are independent, and because no read logically follows any write (until next complete instruction). ONE LONG CLOCK CYCLE!
23 What Is Control? 8 signals control flow of data through this datapath (well, ALUop is more than one bit...): BR, JP, Rwd, Rwe, ALUop, DMwe, Rdst, ALUinB. These are MUX selectors, or register/memory write enable signals. A real datapath might have hundreds of control signals.
24 Example: Control for add. BR=0 JP=0 Rwd=0 Rwe=1 ALUop=0 DMwe=0 Rdst=1 ALUinB=0. Control for an instruction: values of all control signals to correctly execute it.
25 Example: Control for sw. BR=0 JP=0 Rwd=X Rwe=0 ALUop=0 DMwe=1 Rdst=X ALUinB=1. Difference between sw and add is 5 signals; 3 if you don't count the X (don't care) signals.
26 Example: Control for beq. BR=1 JP=0 Rwd=X Rwe=0 ALUop=1 DMwe=0 Rdst=X ALUinB=0. Difference between sw and beq is only 4 signals.
27 Let's figure out LW. How would these control signals be set for LW? (BR, JP, Rwd, Rwe, ALUop, DMwe, Rdst, ALUinB)
28 Example: Control for LW. BR=0 JP=0 Rwd=1 Rwe=1 ALUop=0 DMwe=0 Rdst=0 ALUinB=1.
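The four control examples on slides 24 to 28 can be written down as data and compared mechanically. This is a sketch with hypothetical names (`CONTROL`, `differing_signals`); the signal values are the ones given on the slides, reading the garbled labels as Rwd and Rdst, and X is modeled as `None`.

```python
X = None  # don't care

# Signal values as given on the add, sw, beq, and LW control slides.
CONTROL = {
    "add": dict(BR=0, JP=0, Rwd=0, Rwe=1, ALUop=0, DMwe=0, Rdst=1, ALUinB=0),
    "sw":  dict(BR=0, JP=0, Rwd=X, Rwe=0, ALUop=0, DMwe=1, Rdst=X, ALUinB=1),
    "beq": dict(BR=1, JP=0, Rwd=X, Rwe=0, ALUop=1, DMwe=0, Rdst=X, ALUinB=0),
    "lw":  dict(BR=0, JP=0, Rwd=1, Rwe=1, ALUop=0, DMwe=0, Rdst=0, ALUinB=1),
}


def differing_signals(insn_a, insn_b, count_dont_care=True):
    """List the control signals whose values differ between two instructions."""
    diff = []
    for name, va in CONTROL[insn_a].items():
        vb = CONTROL[insn_b][name]
        if va is X or vb is X:
            # An X against a concrete value counts only if the caller asks for it.
            if count_dont_care and va is not vb:
                diff.append(name)
        elif va != vb:
            diff.append(name)
    return diff
```

Running `differing_signals("add", "sw")` gives five names, or three with `count_dont_care=False`, and `differing_signals("sw", "beq")` gives four, which agrees with the counts stated on the sw and beq slides.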
29 How Is Control Implemented? [Figure: same datapath, with BR, JP, Rwd, Rwe, ALUop, DMwe, Rdst, ALUinB driven by an unspecified "Control?" block]
30 Implementing Control. Each insn has a unique set of control signals. Most are a function of the opcode. Some may be encoded in the instruction itself, e.g., the ALUop signal is some portion of the MIPS Func field. + Simplifies controller implementation. - Requires careful ISA design.
31 Control Implementation: ROM. ROM (read only memory): think rows of bits. Bits in data words are control signals. Lines indexed by opcode. Example: ROM control for 6-insn MIPS datapath. X is don't care (electrically must be 0 or 1, but doesn't matter). [Table: columns BR, JP, ALUinB, ALUop, DMwe, Rwe, Rdst, Rwd; one row per opcode: add, addi, lw, sw, beq, j; the sw, beq, and j rows end in X X (Rdst, Rwd); the other cell values did not survive transcription]
32 Control Implementation: Random Logic. Real machines have 100+ insns, 300+ control signals. 30,000+ control bits (~4KB). - Not huge, but hard to make faster than datapath (important!). Alternative: random logic (random = "non-repeating"). More or less what I did for protocomputer. Exploits the observation: many signals have few 1s or few 0s. Example: random logic control for 6-insn MIPS datapath. [Figure: opcode decoded into add, addi, lw, sw, beq, j lines, combined by gates into BR, JP, DMwe, Rwe, Rwd, Rdst, ALUop, ALUinB] Yes, random logic is a very dumb and misleading name for this concept. Sorry.
33 Datapath and Control Timing. Timeline: Read IMEM, Read Registers (Read Control ROM), Read DMEM, Write DMEM, Write Registers, Write PC. Will usually need an IR (instruction register) buffering the current instruction, as in protocomputer, but here can get by with the IMEM output. [Figure: datapath with Control ROM/random logic]
34 Single-Cycle Datapath Performance. This machine will work, and it will be simple, but it will be slow. - Goes against make common case fast (MCCF) principle. + Low Cycles Per Instruction (CPI): 1. - Long clock period: to accommodate slowest insn.
35 Interlude: Performance. Previous slide alludes to something new: performance. Don't just want it to work, but want it to go fast! Three components to performance: Number of instructions x Cycles per instruction (CPI) x Clock period (1 / clock frequency). (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle) = Seconds / Program.
36 Interlude: Performance. Three components to performance: Number of instructions (<- compiler's job) x Cycles per instruction (CPI) x Clock period (1 / clock frequency). (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle) = Seconds / Program. Instructions/program: determined by compiler + ISA. Generally assume a fixed program when doing micro-architecture.
37 Micro-architectural factors. Micro-architecture: the details of how the ISA is implemented. Affects CPI and clock frequency. Often will look at a fixed program, and consider MIPS: Million Instructions Per Second. MIPS = IPC * Frequency (in MHz). IPC = Instructions Per Cycle (1 / CPI). Gives a "bigger is better" number. (Instructions / Cycle) x (Cycles / Second) = Instructions / Second, i.e., IPC x Frequency = Throughput. The use of MIPS to mean Millions of Instructions Per Second has nothing to do with the CPU architecture also called MIPS, which actually stands for Microprocessor without Interlocked Pipeline Stages. The fact that a major CPU architecture shares a name with an important metric for performance is incredibly confusing and dumb, and I apologize. I blame the cocaine-fueled CPU architects of the 1980s.
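The two formulas from this interlude are easy to check numerically. The sketch below uses hypothetical function names, and the example numbers in the usage note are made up for illustration.

```python
def seconds_per_program(instructions, cpi, clock_period_s):
    """Instructions/Program x Cycles/Instruction x Seconds/Cycle."""
    return instructions * cpi * clock_period_s


def mips_rating(ipc, freq_mhz):
    """Millions of instructions per second: IPC x frequency in MHz."""
    return ipc * freq_mhz
```

As an illustration with invented numbers, a program of 1,000,000 instructions at CPI 4 and a 10 ns clock takes about 0.04 s, and a machine with IPC 0.25 at 100 MHz rates 25 MIPS. Both describe the same design point, since IPC = 1 / CPI and 100 MHz corresponds to a 10 ns period.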
38 Best IPC. For now, best we can do: IPC = 1 (CPI = 1), i.e., do 1 instruction every cycle. Later: real processors can do multiple instructions at once! Potentially: IPC > 1! (CPI < 1!) Best possible IPC depends on design.
39 Performance vs. 1990s: performance at all cost (actually more like clock frequency at all cost). Now: care about other things. Energy (electric bill, battery life). Power (cooling, also affects energy). Area (chip cost). Reliability (tolerance of transient faults, e.g., charged particle strikes). Important metric these days: Performance / Watt, i.e., throughput divided by power consumption. Why? What device in particular?
40 Performance Modeling and Analysis. Speaking of performance: making a processor takes time (years) and money (millions). Want to know it will perform well before you finish. If it's wrong, doing it all over is painful. Performance can be simulated in software: estimate what IPC will be, guide design. Patterson and Hennessy's other, more advanced textbook: "Computer Architecture: A Quantitative Approach".
41 Single-Cycle Datapath Performance. - Goes against make common case fast (MCCF) principle. + Low Cycles Per Instruction (CPI): 1. - Long clock period: to accommodate slowest insn.
42 Alternative: Multi-Cycle Datapath. Multi-cycle datapath: attacks high clock period. Cut datapath into multiple stages (5 here), isolate using FFs. FSM control "walks" insns thru stages (by staging control signals). Not every instruction needs every stage. + Instructions can bypass stages and exit early. [Figure: datapath cut into stages by registers IR, A, B, O, D]
43 Multi-cycle Datapath FSM. [Diagram so far: Next -> Decode] First state: get a new instruction. Output signals to fetch (e.g., read enable IMEM). Next state: always Decode.
44 Multi-cycle Datapath FSM. [Diagram so far: Next, Decode, Execute] Second state: Decode. Output signals to decode instruction (REn RegFile). Go to Next if NOP. Otherwise Execute.
45 Multi-cycle Datapath FSM. [Diagram adds: Execute -> Next on Branch] Execute state: execute (varies by insn type). Next state: also depends on insn type. Branches: Next.
46 Multi-cycle Datapath FSM. [Diagram adds: Execute -> Writeback on ALU] Execute state: execute (varies by insn type). Next state: also depends on insn type. ALU op: write register; we call this Writeback (to register file).
47 Multi-cycle Datapath FSM. [Diagram adds: Execute -> Read DMEM on Load] Execute state: execute (varies by insn type). Next state: also depends on insn type. Load: read memory.
48 Multi-cycle Datapath FSM. [Diagram adds: Execute -> Write DMEM on Store] Execute state: execute (varies by insn type). Next state: also depends on insn type. Store: write memory.
49 Multi-cycle Datapath FSM. Read DMEM state: control signals enable DMEM read. Next state is Writeback (what we read from memory needs to go to a register).
50 Multi-cycle Datapath FSM. Writeback state: control signals enable regfile write. Next state: Next.
51 Multi-cycle Datapath FSM. Write DMEM state: control signals enable memory write. Next state: Next.
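The state diagram built up on slides 43 to 51 can be captured as a next-state function. This is a sketch only: the Python names are hypothetical, the instruction is reduced to a coarse kind string, and it assumes each state takes exactly one clock cycle. The state called "Next" on the slides is the one that gets a new instruction.

```python
NEXT, DECODE, EXECUTE = "Next", "Decode", "Execute"
READ_DMEM, WRITE_DMEM, WRITEBACK = "Read DMEM", "Write DMEM", "Writeback"


def next_state(state, kind):
    """Next-state logic; kind is 'nop', 'branch', 'alu', 'load', or 'store'."""
    if state == NEXT:
        return DECODE                      # always decode after fetching
    if state == DECODE:
        return NEXT if kind == "nop" else EXECUTE
    if state == EXECUTE:
        return {"branch": NEXT, "alu": WRITEBACK,
                "load": READ_DMEM, "store": WRITE_DMEM}[kind]
    if state == READ_DMEM:
        return WRITEBACK                   # loaded value must reach a register
    if state in (WRITEBACK, WRITE_DMEM):
        return NEXT
    raise ValueError(f"unknown state: {state}")


def states_visited(kind):
    """States one instruction passes through before the FSM returns to Next."""
    path = [NEXT]
    state = next_state(NEXT, kind)
    while state != NEXT:
        path.append(state)
        state = next_state(state, kind)
    return path
```

Under the one-cycle-per-state assumption, `len(states_visited(kind))` gives 3 for branches, 4 for ALU ops and stores, and 5 for loads, which are the per-class cycle counts used in the CPI slides later in the deck.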
52 Multi-Cycle Datapath Example: Add. Cycle 1: Read IMEM.
53 Multi-Cycle Datapath Example: Add. Cycle 1: Read IMEM. Cycle 2: Decode + Read RF.
54 Multi-Cycle Datapath Example: Add. Cycle 1: Read IMEM. Cycle 2: Decode + Read RF. Cycle 3: ALU.
55 Multi-Cycle Datapath Example: Add. Cycle 1: Read IMEM. Cycle 2: Decode + Read RF. Cycle 3: ALU. Cycle 4: Writeback + Increment PC.
56 Multi-Cycle Datapath Performance. Opposite performance split of single-cycle datapath. + Short clock period. - High CPI.
57 Multi-cycle Datapath CPI. CPI depends on instructions. Branches / Jumps: 3 cycles. ALU: 4 cycles. Stores: 4 cycles. Loads: 5 cycles. Overall CPI is a weighted average. Example: 20% loads, 15% stores, 20% branches, 45% ALU.
60 Multi-cycle Datapath CPI. CPI depends on instructions. Branches / Jumps: 3 cycles. ALU: 4 cycles. Stores: 4 cycles. Loads: 5 cycles. Overall CPI is a weighted average. Example: 20% loads, 15% stores, 20% branches, 45% ALU. CPI = 0.20 * 5 + 0.15 * 4 + 0.20 * 3 + 0.45 * 4 = 4
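The weighted average can be reproduced in a few lines. The names `CYCLES` and `average_cpi` are hypothetical; the cycle counts come from the slide, and the mix is read as 20% loads, 15% stores, 20% branches, and 45% ALU, which is the reading that makes the percentages sum to 100.

```python
# Cycles per instruction class, from the slide.
CYCLES = {"branch": 3, "alu": 4, "store": 4, "load": 5}


def average_cpi(mix):
    """Weighted-average CPI; mix maps instruction class to its fraction."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(fraction * CYCLES[kind] for kind, fraction in mix.items())


example_mix = {"load": 0.20, "store": 0.15, "branch": 0.20, "alu": 0.45}
print(round(average_cpi(example_mix), 6))
```

The terms are 1.0 + 0.6 + 0.6 + 1.8, so the example mix averages out to a CPI of 4.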
61 Multi-cycle Datapath Performance. Single-cycle: clock period = 50 ns, CPI = 1, performance = 50 ns/insn. Multi-cycle: clock period = 10 ns, CPI = (0.2*3 + 0.2*5 + 0.6*4) = 4, performance = 40 ns/insn. But wait...
62 Multi-Cycle Datapath Performance. Did not just cut up existing logic into 5 pieces. Also added logic (flip flops). So clock period is not 1/5 of single cycle, but slightly longer.
63 Multi-cycle Datapath Performance. Single-cycle: clock period = 50 ns, CPI = 1, performance = 50 ns/insn. Multi-cycle: clock period = 12 ns, CPI = (0.2*3 + 0.2*5 + 0.6*4) = 4, performance = 48 ns/insn. Better, but not as exciting. Can we do better still? Have our cake (low CPI) and eat it too (high clock frequency)?
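The comparison on these slides is just clock period times CPI. A minimal sketch, with a hypothetical function name, and reading the multi-cycle clock periods as 10 ns (ideal split) and 12 ns (with flip-flop overhead), which are the values consistent with the stated 40 and 48 ns/insn at CPI 4:

```python
def ns_per_insn(clock_period_ns, cpi):
    """Average time per instruction = clock period x cycles per instruction."""
    return clock_period_ns * cpi


single_cycle = ns_per_insn(50, 1)       # 50 ns/insn
multi_ideal = ns_per_insn(10, 4)        # 40 ns/insn if the cut were free
multi_real = ns_per_insn(12, 4)         # 48 ns/insn with added flip flops
print(single_cycle, multi_ideal, multi_real)
```

Under these numbers the realistic multi-cycle design is only about 4% faster than single-cycle (50 / 48), which is why the slide calls it better but not exciting.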
64 Clock Period and CPI. Single-cycle datapath: + Low CPI: 1. - Long clock period: to accommodate slowest insn. [Timeline: insn0.fetch, dec, exec | insn1.fetch, dec, exec] Multi-cycle datapath: + Short clock period. - High CPI. [Timeline: insn0.fetch | insn0.dec | insn0.exec | insn1.fetch | insn1.dec | insn1.exec] Can we have both low CPI and short clock period? - No good way to make a single insn go faster. + Insn latency doesn't matter anyway; insn throughput matters. Key: exploit inter-insn parallelism.
65 Pipelining. Pipelining: important performance technique. Improves insn throughput rather than insn latency. Exploits parallelism at insn-stage level to do so. Begin with multi-cycle design:
insn0.fetch insn0.dec insn0.exec insn1.fetch insn1.dec insn1.exec
When insn advances from stage 1 to 2, next insn enters stage 1:
insn0.fetch insn0.dec   insn0.exec
            insn1.fetch insn1.dec  insn1.exec
Individual insns take the same number of stages. + But insns enter and leave at a much faster rate. Physically breaks atomic VN loop... but must maintain illusion. Revisit at end of semester (hopefully).
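The overlap described on this slide can be generated programmatically. This sketch assumes an ideal pipeline (every stage takes one cycle, no stalls or hazards), which goes beyond what the slide states; all function names are hypothetical.

```python
def multicycle_cycles(n_insns, n_stages):
    """Multi-cycle: each insn finishes all its stages before the next starts."""
    return n_insns * n_stages


def pipelined_cycles(n_insns, n_stages):
    """Ideal pipeline: a new insn enters stage 1 every cycle."""
    return n_stages + n_insns - 1 if n_insns else 0


def pipeline_diagram(n_insns, stage_names):
    """One row per cycle, listing which insn occupies which stage."""
    n_stages = len(stage_names)
    rows = []
    for cycle in range(pipelined_cycles(n_insns, n_stages)):
        row = []
        for i in range(n_insns):
            stage = cycle - i          # insn i enters stage 0 in cycle i
            if 0 <= stage < n_stages:
                row.append(f"insn{i}.{stage_names[stage]}")
        rows.append(row)
    return rows


for row in pipeline_diagram(2, ["fetch", "dec", "exec"]):
    print(" ".join(row))
```

For two instructions and three stages this prints four rows, with `insn0.dec` and `insn1.fetch` sharing the second cycle, versus six cycles for the multi-cycle design. Each instruction still passes through three stages, so latency is unchanged and only throughput improves.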
66 Summary. Datapaths: single cycle (what do we need?). Control: how control is implemented. Multi-cycle: faster clock (yay!), worse CPI (boooo). Performance: IPC, Performance / Watt, CPU performance equation. Pipelining: teaser for later!
More informationSystems Architecture
Systems Architecture Lecture 15: A Simple Implementation of MIPS Jeremy R. Johnson Anatole D. Ruslanov William M. Mongan Some or all figures from Computer Organization and Design: The Hardware/Software
More informationCISC 662 Graduate Computer Architecture. Lecture 4 - ISA
CISC 662 Graduate Computer Architecture Lecture 4 - ISA Michela Taufer http://www.cis.udel.edu/~taufer/courses Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer Architecture,
More informationOutline. EEL-4713 Computer Architecture Designing a Single Cycle Datapath
Outline EEL-473 Computer Architecture Designing a Single Cycle path Introduction The steps of designing a processor path and timing for register-register operations path for logical operations with immediates
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationCh 5: Designing a Single Cycle Datapath
Ch 5: esigning a Single Cycle path Computer Systems Architecture CS 365 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Memory path Input Output Today s Topic:
More informationIntroduction. Chapter 4. Instruction Execution. CPU Overview. University of the District of Columbia 30 September, Chapter 4 The Processor 1
Chapter 4 The Processor Introduction CPU performance factors Instruction count etermined by IS and compiler CPI and Cycle time etermined by CPU hardware We will examine two MIPS implementations simplified
More informationCPU Pipelining Issues
CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! Finishing up Chapter 6 L20 Pipeline Issues 1 Structural Data Hazard Consider LOADS: Can we fix all
More informationCPU Design Steps. EECC550 - Shaaban
CPU Design Steps 1. Analyze instruction set operations using independent RTN => datapath requirements. 2. Select set of datapath components & establish clock methodology. 3. Assemble datapath meeting the
More informationECE232: Hardware Organization and Design
ECE232: Harware Organization an Design ectre 11: Introction to IPs path apte from Compter Organization an Design, Patterson & Hennessy, CB IPS-lite processor Compter Want to bil a processor for a sbset
More informationPipelining. Maurizio Palesi
* Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationCPU Performance Pipelined CPU
CPU Performance Pipelined CPU Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University See P&H Chapters 1.4 and 4.5 In a major matter, no details are small French Proverb 2 Big Picture:
More informationCS3350B Computer Architecture Quiz 3 March 15, 2018
CS3350B Computer Architecture Quiz 3 March 15, 2018 Student ID number: Student Last Name: Question 1.1 1.2 1.3 2.1 2.2 2.3 Total Marks The quiz consists of two exercises. The expected duration is 30 minutes.
More informationCSSE232 Computer Architecture I. Datapath
CSSE232 Computer Architecture I Datapath Class Status Reading Sec;ons 4.1-3 Project Project group milestone assigned Indicate who you want to work with Indicate who you don t want to work with Due next
More informationComputer Science 61C Spring Friedland and Weaver. The MIPS Datapath
The MIPS Datapath 1 The Critical Path and Circuit Timing The critical path is the slowest path through the circuit For a synchronous circuit, the clock cycle must be longer than the critical path otherwise
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor
More informationVery Simple MIPS Implementation
06 1 MIPS Pipelined Implementation 06 1 line: (In this set.) Unpipelined Implementation. (Diagram only.) Pipelined MIPS Implementations: Hardware, notation, hazards. Dependency Definitions. Hazards: Definitions,
More informationECE550 PRACTICE Final
ECE550 PRACTICE Final This is a full length practice midterm exam. If you want to take it at exam pace, give yourself 175 minutes to take the entire test. Just like the real exam, each question has a point
More informationECE410 Design Project Spring 2013 Design and Characterization of a CMOS 8-bit pipelined Microprocessor Data Path
ECE410 Design Project Spring 2013 Design and Characterization of a CMOS 8-bit pipelined Microprocessor Data Path Project Summary This project involves the schematic and layout design of an 8-bit microprocessor
More informationCS/COE1541: Introduction to Computer Architecture
CS/COE1541: Introduction to Computer Architecture Dept. of Computer Science University of Pittsburgh http://www.cs.pitt.edu/~melhem/courses/1541p/index.html 1 Computer Architecture? Application pull Operating
More informationCS Computer Architecture Spring Week 10: Chapter
CS 35101 Computer Architecture Spring 2008 Week 10: Chapter 5.1-5.3 Materials adapated from Mary Jane Irwin (www.cse.psu.edu/~mji) and Kevin Schaffer [adapted from D. Patterson slides] CS 35101 Ch 5.1
More informationChapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations
Chapter 4 The Processor Part I Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations
More informationRISC Processor Design
RISC Processor Design Single Cycle Implementation - MIPS Virendra Singh Indian Institute of Science Bangalore virendra@computer.org Lecture 13 SE-273: Processor Design Feb 07, 2011 SE-273@SERC 1 Courtesy:
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationPart II Instruction-Set Architecture. Jan Computer Architecture, Instruction-Set Architecture Slide 1
Part II Instruction-Set Architecture Jan. 211 Computer Architecture, Instruction-Set Architecture Slide 1 Short review of the previous lecture Performance = 1/(Execution time) = Clock rate / (Average CPI
More informationPipeline design. Mehran Rezaei
Pipeline design Mehran Rezaei How Can We Improve the Performance? Exec Time = IC * CPI * CCT Optimization IC CPI CCT Source Level * Compiler * * ISA * * Organization * * Technology * With Pipelining We
More informationEC 513 Computer Architecture
EC 513 Computer Architecture Single-cycle ISA Implementation Prof. Michel A. Kinsy Computer System View Processor Applications Compiler Firmware ISA Memory organization Digital Design Circuit Design Operating
More informationHakim Weatherspoon CS 3410 Computer Science Cornell University
Hakim Weatherspoon CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, McKee, and Sirer. memory inst register
More informationCOMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 4 The Processor: A Based on P&H Introduction We will examine two MIPS implementations A simplified version A more realistic pipelined
More informationInf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle
Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle Boris Grot School of Informatics University of Edinburgh Previous lecture: single-cycle processor Inf2C Computer Systems - 2017-2018. Boris
More informationChapter 4. The Processor. Computer Architecture and IC Design Lab
Chapter 4 The Processor Introduction CPU performance factors CPI Clock Cycle Time Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS
More information15-740/ Computer Architecture Lecture 7: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011
15-740/18-740 Computer Architecture Lecture 7: Pipelining Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011 Review of Last Lecture More ISA Tradeoffs Programmer vs. microarchitect Transactional
More informationImplementing a MIPS Processor. Readings:
Implementing a MIPS Processor Readings: 4.1-4.11 1 Goals for this Class Unrstand how CPUs run programs How do we express the computation the CPU? How does the CPU execute it? How does the CPU support other
More informationECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 6 Pipelining Part 1
ECE 552 / CPS 550 Advanced Computer Architecture I Lecture 6 Pipelining Part 1 Benjamin Lee Electrical and Computer Engineering Duke University www.duke.edu/~bcl15 www.duke.edu/~bcl15/class/class_ece252fall12.html
More informationInstruction Level Parallelism. Appendix C and Chapter 3, HP5e
Instruction Level Parallelism Appendix C and Chapter 3, HP5e Outline Pipelining, Hazards Branch prediction Static and Dynamic Scheduling Speculation Compiler techniques, VLIW Limits of ILP. Implementation
More informationChapter 4. The Processor Designing the datapath
Chapter 4 The Processor Designing the datapath Introduction CPU performance determined by Instruction Count Clock Cycles per Instruction (CPI) and Cycle time Determined by Instruction Set Architecure (ISA)
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #19 Designing a Single-Cycle CPU 27-7-26 Scott Beamer Instructor AI Focuses on Poker CS61C L19 CPU Design : Designing a Single-Cycle CPU
More information