Introduction to Pipelined Datapath
|
|
- Gary Cross
- 6 years ago
- Views:
Transcription
1 14:332:331 Computer Architecture and Assembly Language Week 12 Introduction to Pipelined Datapath [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 W12.1
2 Review: Multicycle Data and Control Path PCWriteCond PCWrite IorD MemRead MemWrite MemtoReg IRWrite Control FSM SrcA RegWrite RegDst PCSource Op SrcB PC 0 1 Address Write Data Memory Read Data (Instr. or Data) IR MDR Instr[31-26] Instr[15-0] Instr[5-0] Read Addr 1 Register Read Read Addr 2 Data 1 File Write Addr Read Data 2 Write Data Sign Extend 32 Instr[25-0] Shift left 2 A B PC[31-28] Shift left 2 zero control 28 out W12.2
3 Review: RTL Summary Step R-type Mem Ref Branch Jump Instr fetch Decode IR = Memory[PC]; PC = PC + 4; A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; Out = PC +(sign-extend(ir[15-0])<< 2); Execute Out = A op B; Memory access Writeback Reg[IR[15-11]] = Out; Out = A + sign-extend (IR[15-0]); MDR = Memory[Out]; or Memory[Out] = B; Reg[IR[20-16]] = MDR; if (A==B) PC = Out; PC = PC[31-28] (IR[25-0] << 2); 331 W12.3
4 Review: Multicycle Datapath FSM Unless otherwise assigned Start PCWrite,IRWrite, MemWrite,RegWrite=0 others=x 2 SrcA=1 SrcB=10 Op=00 PCWriteCond=0 3 (Op = lw) MemRead IorD=1 PCWriteCond=0 (Op = sw) Memory Access (Op = lw or sw) Execute 0 IorD=0 Instr Fetch 1 5 MemWrite IorD=1 PCWriteCond=0 MemRead;IRWrite SrcA=0 srcb=01 PCSource,Op=00 PCWrite 6 SrcA=1 SrcB=00 Op=10 PCWriteCond=0 7 RegDst=1 RegWrite MemtoReg=0 PCWriteCond=0 (Op = R-type) 8 9 SrcA=1 SrcB=00 Op=01 PCSource=01 PCWriteCond (Op = beq) (Op = j) Decode SrcA=0 SrcB=11 Op=00 PCWriteCond=0 PCSource=10 PCWrite 4 RegDst=0 RegWrite MemtoReg=1 PCWriteCond=0 Write Back 331 W12.4
5 Review: FSM Implementation Combinational control logic Inputs Outputs PCWrite PCWriteCond IorD MemRead MemWrite IRWrite MemtoReg PCSource Op SourceB SourceA RegWrite RegDst Op5 Op4 Op3 Op2 Op1 Op0 Inst[31-26] System Clock State Reg Next State 331 W12.5
6 Clk Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate the slowest instruction Single Cycle Implementation: Cycle 1 Cycle 2 lw sw Waste Is wasteful of area since some functional units must (e.g., adders) be duplicated since they can not be shared during a clock cycle but Is simple and easy to understand 331 W12.6
7 Multicycle Advantages & Disadvantages Uses the clock cycle efficiently the clock cycle is timed to accommodate the slowest instruction step balance the amount of work to be done in each step restrict each step to use only one major functional unit Multicycle implementations allow but functional units to be used more than once per instruction as long as they are used on different clock cycles faster clock rates different instructions to take a different number of clock cycles Requires additional internal state registers, muxes, and more complicated (FSM) control 331 W12.7
8 The Five Stages of Load Instruction Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 lw IFetch Dec Exec Mem WB IFetch: Instruction Fetch and Update PC Dec: Registers Fetch and Instruction Decode Exec: Execute R-type; calculate memory address Mem: Read/write the data from/to the Data Memory WB: Write the data back to the register file 331 W12.8
9 Single Cycle vs. Multiple Cycle Timing Single Cycle Implementation: Clk Cycle 1 Cycle 2 lw sw Waste Multiple Cycle Implementation: multicycle clock slower than 1/5 th of single cycle clock due to stage flipflop overhead Clk Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9Cycle 10 lw IFetch Dec Exec Mem WB sw IFetch Dec Exec Mem R-type IFetch 331 W12.9
10 Pipelined MIPS Processor Start the next instruction while still working on the current one improves throughput - total amount of work done in a given time Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 lw IFetch Dec Exec Mem WB sw IFetch Dec Exec Mem WB R-type IFetch Dec Exec Mem WB instruction latency (execution time, delay time, response time) is not reduced - time from the start of an instruction to its completion 331 W12.10
11 Single Cycle, Multiple Cycle, vs. Pipeline Single Cycle Implementation: Clk Cycle 1 Cycle 2 Load Store Waste Multiple Cycle Implementation: Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9Cycle 10 Clk lw IFetch Dec Exec Mem WB sw IFetch Dec Exec Mem R-type IFetch Pipeline Implementation: lw IFetch Dec Exec Mem WB wasted cycle sw IFetch Dec Exec Mem WB R-type IFetch Dec Exec Mem WB 331 W12.11
12 Pipelining the MIPS ISA What makes it easy all instructions are the same length (32 bits) few instruction formats (three) with symmetry across formats memory operations can occur only in loads and stores operands must be aligned in memory so a single data transfer requires only one memory access What makes it hard structural hazards: what if we had only one memory control hazards: what about branches data hazards: what if an instruction s input operands depend on the output of a previous instruction 331 W12.12
13 MIPS Pipeline Datapath Modifications What do we need to add/modify in our MIPS datapath? State registers between pipeline stages to isolate them IFetch Dec Exec Mem WB 1 0 Add PC 4 Instruction Memory Read Address IFetch/Dec Read Addr 1 Register Read Read Addr 2Data 1 File Write Addr Read Data 2 Write Data Dec/Exec Shift left Add Exec/Mem Data Address Memory Write Data Read Data Mem/WB 1 0 System Clock Sign 16 Extend W12.13
14 MIPS Pipeline Control Path Modifications All control signals are determined during Decode and held in the state registers between pipeline stages 1 IFetch Dec Exec Mem WB 0 Control Add PC 4 Instruction Memory Read Address IFetch/Dec Read Addr 1 Register Read Read Addr 2Data 1 File Write Addr Read Data 2 Write Data Dec/Exec Shift left Add Exec/Mem Data Address Memory Write Data Read Data Mem/WB 1 0 System Clock Sign 16 Extend W12.14
15 Graphically Representing MIPS Pipeline Can help with answering questions like: how many cycles does it take to execute this code? what is the doing during cycle 4? is there a hazard, why does it occur, and how can it be fixed? 331 W12.15
16 Why Pipeline? For Throughput! Time (clock cycles) I n s t r. Inst 0 Inst 1 Once the pipeline is full, one instruction is completed every cycle O r d e r Inst 2 Inst 3 Inst 4 Time to fill the pipeline 331 W12.16
17 Can pipelining get us into trouble? Yes: Pipeline Hazards structural hazards: attempt to use the same resource by two different instructions at the same time data hazards: attempt to use item before it is ready - instruction depends on result of prior instruction still in the pipeline control hazards: attempt to make a decision before condition is evaulated - branch instructions Can always resolve hazards by waiting pipeline control must detect the hazard take action (or delay action) to resolve hazards 331 W12.17
18 A Unified Memory Would Be a Structural Hazard Time (clock cycles) I n s t r. lw Inst 1 Mem Reg Mem Reg Mem Reg Mem Reg Reading data from memory O r d e r Inst 2 Inst 3 Mem Reg Mem Reg Mem Reg Mem Reg Inst 4 Reading instruction from memory Mem Reg Mem Reg 331 W12.18
19 How About Register File Access? Time (clock cycles) I n s t r. O r d e r add Inst 1 Inst 2 add Inst 4 Can fix register file access hazard by doing reads in the second half of the cycle and writes in the first half. 331 W12.19
20 Register Usage Can Cause Data Hazards Dependencies backward in time cause hazards I n s t r. O r d e r add r1,r2,r3 sub r4,r1,r5 and r6,r1,r7 or r8, r1, r9 xor r4,r1,r5 331 W12.20
21 One Way to Fix a Data Hazard I n s t r. add r1,r2,r3 stall Can fix data hazard by waiting stall but affects throughput O r d e r stall sub r4,r1,r5 and r6,r1,r7 331 W12.21
22 Another Way to Fix a Data Hazard I n s t r. add r1,r2,r3 sub r4,r1,r5 Can fix data hazard by forwarding results as soon as they are available to where they are needed O r d e r and r6,r1,r7 or r8, r1, r9 xor r4,r1,r5 331 W12.22
23 Loads Can Cause Data Hazards Dependencies backward in time cause hazards I n s t r. O r d e r lw r1,100(r2) sub r4,r1,r5 and r6,r1,r7 or r8, r1, r9 xor r4,r1,r5 331 W12.23
24 Stores Can Cause Data Hazards Dependencies backward in time cause hazards I n s t r. O r d e r add r1,r2,r3 sw r1,100(r5) and r6,r1,r7 or r8, r1, r9 xor r4,r1,r5 331 W12.24
25 Forwarding with Load-use Data Hazards 331 W12.25
26 Branch Instructions Cause Control Hazards Dependencies backward in time cause hazards I n s t r. O r d e r add beq lw Inst 3 Inst W12.26
27 One Way to Fix a Control Hazard I n s t r. add beq Can fix branch hazard by waiting stall but affects throughput O r d e r stall stall lw Inst W12.27
28 Other Pipeline Structures Are Possible What about (slow) multiply operation? let it take two cycles MUL What if the data memory access is twice as slow as the instruction memory? make the clock twice as slow or let data memory access take two cycles (and keep the same clock rate) IM Reg DM1 DM2 Reg 331 W12.28
29 Sample Pipeline Alternatives ARM7 IM Reg EX PC update IM access decode reg access op DM access shift/rotate commit result (write back) StrongARM-1 XScale PC update BTB access start IM access IM1 IM2 Reg DM1 Reg SHFT DM2 IM access decode reg 1 access op shift/rotate reg 2 access DM write reg write start DM access exception 331 W12.29
30 Summary All modern day processors use pipelining Pipelining doesn t help latency of single task, it helps throughput of entire workload Multiple tasks operating simultaneously using different resources Potential speedup = Number of pipe stages Pipeline rate limited by slowest pipeline stage Unbalanced lengths of pipe stages reduces speedup Time to fill pipeline and time to drain it reduces speedup Must detect and resolve hazards Stalling negatively affects throughput 331 W12.30
31 Performance Purchasing perspective given a collection of machines, which has the - best performance? - least cost? - best performance / cost? Design perspective faced with design options, which has the - best performance improvement? - least cost? Both require - best performance / cost? basis for comparison metric for evaluation Our goal is to understand cost & performance implications of architectural choices 331 W12.31
32 Two notions of performance Plane DC to Paris Speed Passengers Throughput (pmph) Boeing hours 610 mph ,700 BAD/Sud Concodre 3 hours 1350 mph ,200 Which has higher performance? Time to do the task (Execution Time) execution time, response time, latency Tasks per day, hour, week, sec, ns... (Performance) throughput, bandwidth Response time and throughput often are in opposition 331 W12.32
33 Definitions Performance is in units of things-per-second bigger is better If we are primarily concerned with response time performance(x) = 1 execution_time(x) " X is n times faster than Y" means Performance(X) n = Performance(Y) 331 W12.33
34 Example Time of Concorde vs. Boeing 747? Concord is 1350 mph / 610 mph = 2.2 times faster = 6.5 hours / 3 hours Throughput of Concorde vs. Boeing 747? Concord is 178,200 pmph / 286,700 pmph = 0.62 times faster Boeing is 286,700 pmph / 178,200 pmph = 1.6 times faster Boeing is 1.6 times ( 60% )faster in terms of throughput Concord is 2.2 times ( 120% ) faster in terms of flying time We will focus primarily on execution time for a single job 331 W12.34
35 Basis of Evaluation Pros representative portable widely used improvements useful in reality Actual Target Workload Full Application Benchmarks Cons very specific non-portable difficult to run, or measure hard to identify cause less representative easy to run, early in design cycle identify peak capability and potential bottlenecks 331 W12.35 Small Kernel Benchmarks Microbenchmarks easy to fool peak may be a long way from application performance
36 SPEC95 Eighteen application benchmarks (with inputs) reflecting a technical computing workload Eight integer go, m88ksim, gcc, compress, li, ijpeg, perl, vortex Ten floating-point intensive tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fppp, wave5 Must run with standard compiler flags eliminate special undocumented incantations that may not even generate working code for real programs 331 W12.36
37 Metrics of performance Application Answers per month Useful Operations per second Programming Language Compiler ISA (millions) of Instructions per second MIPS (millions) of (F.P.) operations per second MFLOP/s Datapath Control Function Units Transistors Wires Pins Megabytes per second Cycles per second (clock rate) Each metric has a place and a purpose, and each can be misused 331 W12.37
38 Aspects of CPU Performance CPU CPU time time = Seconds = Instructions x Cycles Cycles x Seconds Program Program Instruction Cycle Cycle Program instr. count CPI clock rate Compiler Instr. Set Arch. Organization Technology 331 W12.38
39 CPI Average cycles per instruction CPI = (CPU Time * Clock Rate) / Instruction Count = Clock Cycles / Instruction Count CPU time = ClockCycleTime * CPI * I i i i = 1 n n CPI = CPI * F where F = I i i i i i = 1 Instruction Count "instruction frequency" Invest Resources where time is Spent! 331 W12.39
40 Example (RISC processor) Base Machine (Reg / Reg) Op Freq Cycles CPI(i) % Time 50% % Load 20% % Store 10% % Branch 20% % Typical Mix 2.2 How much faster would the machine be is a better data cache reduced the average load time to 2 cycles? How does this compare with using branch prediction to shave a cycle off the branch time? What if two instructions could be executed at once? 331 W12.40
41 Amdahl's Law Speedup due to enhancement E: ExTime w/o E Performance w/ E Speedup(E) = = ExTime w/ E Performance w/o E Suppose that enhancement E accelerates a fraction F of the task by a factor S and the remainder of the task is unaffected then, ExTime(with E) = ((1-F) + F/S) X ExTime(without E) Speedup(with E) = 1 (1-F) + F/S 331 W12.41
42 Summary: Evaluating Instruction Sets? Design-time metrics: Can it be implemented, in how long, at what cost? Can it be programmed? Ease of compilation? Static Metrics: How many bytes does the program occupy in memory? Dynamic Metrics: How many instructions are executed? How many bytes does the processor fetch to execute the program? CPI How many clocks are required per instruction? How "lean" a clock is practical? Best Metric: Time to execute the program! Inst. Count Cycle Time NOTE: this depends on instructions set, processor organization, and compilation techniques. 331 W12.42
Overview of Today s Lecture: Cost & Price, Performance { 1+ Administrative Matters Finish Lecture1 Cost and Price Add/Drop - See me after class
Overview of Today s Lecture: Cost & Price, Performance EE176-SJSU Computer Architecture and Organization Lecture 2 Administrative Matters Finish Lecture1 Cost and Price Add/Drop - See me after class EE176
More informationCO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19
CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be
More informationComputer Science 141 Computing Hardware
Computer Science 4 Computing Hardware Fall 6 Harvard University Instructor: Prof. David Brooks dbrooks@eecs.harvard.edu Upcoming topics Mon, Nov th MIPS Basic Architecture (Part ) Wed, Nov th Basic Computer
More informationMIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14
MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK
More informationPerformance, Cost and Amdahl s s Law. Arquitectura de Computadoras
Performance, Cost and Amdahl s s Law Arquitectura de Computadoras Arturo Díaz D PérezP Centro de Investigación n y de Estudios Avanzados del IPN adiaz@cinvestav.mx Arquitectura de Computadoras Performance-
More informationProcessor (multi-cycle)
CS359: Computer Architecture Processor (multi-cycle) Yanyan Shen Department of Computer Science and Engineering Five Instruction Steps ) Instruction Fetch ) Instruction Decode and Register Fetch 3) R-type
More informationECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.
ECE C61 Computer Architecture Lecture 2 performance Prof Alok N Choudhary choudhar@ecenorthwesternedu 2-1 Today s s Lecture Performance Concepts Response Time Throughput Performance Evaluation Benchmarks
More informationCPE 335. Basic MIPS Architecture Part II
CPE 335 Computer Organization Basic MIPS Architecture Part II Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Architecture
More informationChapter 4 The Processor 1. Chapter 4A. The Processor
Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware
More informationComputer Science 141 Computing Hardware
Compute Science 141 Computing Hadwae Fall 2006 Havad Univesity Instucto: Pof. David Books dbooks@eecs.havad.edu [MIPS Pipeline Slides adapted fom Dave Patteson s UCB CS152 slides and May Jane Iwin s CSE331/431
More informationComputer Performance Evaluation: Cycles Per Instruction (CPI)
Computer Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate: where: Clock rate = 1 / clock cycle A computer machine
More informationRISC Design: Multi-Cycle Implementation
RISC Design: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationLecture 7 Pipelining. Peng Liu.
Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt
More informationLecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1
Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number
More informationComputer Architecture. Lecture 6.1: Fundamentals of
CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and
More informationLecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533
Lecture 2: Computer Performance Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533 Performance and Cost Purchasing perspective given a collection of machines, which has the - best performance?
More informationMultiple Cycle Data Path
Multiple Cycle Data Path CS 365 Lecture 7 Prof. Yih Huang CS365 1 Multicycle Approach Break up the instructions into steps, each step takes a cycle balance the amount of work to be done restrict each cycle
More information14:332:331 Pipelined Datapath
14:332:331 Pipelined Datapath I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 3 Inst 4 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be timed to accommodate
More informationCPE 335 Computer Organization. Basic MIPS Pipelining Part I
CPE 335 Computer Organization Basic MIPS Pipelining Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Pipelining
More informationProcessor: Multi- Cycle Datapath & Control
Processor: Multi- Cycle Datapath & Control (Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3 rd Ed., Morgan Kaufmann, 27) COURSE
More informationMulti-cycle Approach. Single cycle CPU. Multi-cycle CPU. Requires state elements to hold intermediate values. one clock cycle or instruction
Multi-cycle Approach Single cycle CPU State element Combinational logic State element clock one clock cycle or instruction Multi-cycle CPU Requires state elements to hold intermediate values State Element
More informationECE 313 Computer Organization FINAL EXAM December 14, This exam is open book and open notes. You have 2 hours.
This exam is open book and open notes. You have 2 hours. Problems 1-4 refer to a proposed MIPS instruction lwu (load word - update) which implements update addressing an addressing mode that is used in
More informationCpE 442 Introduction to Computer Architecture. The Role of Performance
CpE 442 Introduction to Computer Architecture The Role of Performance Instructor: H. H. Ammar CpE442 Lec2.1 Overview of Today s Lecture: The Role of Performance Review from Last Lecture Definition and
More informationInf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle
Inf2C - Computer Systems Lecture 12 Processor Design Multi-Cycle Boris Grot School of Informatics University of Edinburgh Previous lecture: single-cycle processor Inf2C Computer Systems - 2017-2018. Boris
More informationECS 154B Computer Architecture II Spring 2009
ECS 154B Computer Architecture II Spring 2009 Pipelining Datapath and Control 6.2-6.3 Partially adapted from slides by Mary Jane Irwin, Penn State And Kurtis Kredo, UCD Pipelined CPU Break execution into
More informationRISC Processor Design
RISC Processor Design Single Cycle Implementation - MIPS Virendra Singh Indian Institute of Science Bangalore virendra@computer.org Lecture 13 SE-273: Processor Design Feb 07, 2011 SE-273@SERC 1 Courtesy:
More informationCSE 2021 COMPUTER ORGANIZATION
CSE 22 COMPUTER ORGANIZATION HUGH CHESSER CHESSER HUGH CSEB 2U 2U CSEB Agenda Topics:. Sample Exam/Quiz Q - Review 2. Multiple cycle implementation Patterson: Section 4.5 Reminder: Quiz #2 Next Wednesday
More informationComputer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining
Computer and Information Sciences College / Computer Science Department Enhancing Performance with Pipelining Single-Cycle Design Problems Assuming fixed-period clock every instruction datapath uses one
More informationCENG 3420 Lecture 06: Pipeline
CENG 3420 Lecture 06: Pipeline Bei Yu byu@cse.cuhk.edu.hk CENG3420 L06.1 Spring 2019 Outline q Pipeline Motivations q Pipeline Hazards q Exceptions q Background: Flip-Flop Control Signals CENG3420 L06.2
More informationCOMP2611: Computer Organization. The Pipelined Processor
COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among
More informationRISC Architecture: Multi-Cycle Implementation
RISC Architecture: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay
More informationCSE 2021 COMPUTER ORGANIZATION
CSE 2021 COMPUTER ORGANIZATION HUGH LAS CHESSER 1012U HUGH CHESSER CSEB 1012U W10-M Agenda Topics: 1. Multiple cycle implementation review 2. State Machine 3. Control Unit implementation for Multi-cycle
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 35: Final Exam Review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Material from Earlier in the Semester Throughput and latency
More informationMulticycle conclusion
Multicycle conclusion The last few lectures covered a lot of material! We introduced a multicycle datapath, where different instructions take different numbers of cycles to execute. A multicycle unit is
More informationThe Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture
The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count
More informationECE369. Chapter 5 ECE369
Chapter 5 1 State Elements Unclocked vs. Clocked Clocks used in synchronous logic Clocks are needed in sequential logic to decide when an element that contains state should be updated. State element 1
More informationRISC Architecture: Multi-Cycle Implementation
RISC Architecture: Multi-Cycle Implementation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay
More informationLecture 5 and 6. ICS 152 Computer Systems Architecture. Prof. Juan Luis Aragón
ICS 152 Computer Systems Architecture Prof. Juan Luis Aragón Lecture 5 and 6 Multicycle Implementation Introduction to Microprogramming Readings: Sections 5.4 and 5.5 1 Review of Last Lecture We have seen
More informationPipelined Processor Design
Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Computer Design and Test Lab. Indian Institute of Science (IISc) Bangalore virendra@computer.org Advance Computer Architecture http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm
More informationECE 154A Introduction to. Fall 2012
ECE 154A Introduction to Computer Architecture Fall 2012 Dmitri Strukov Lecture 10 Floating point review Pipelined design IEEE Floating Point Format single: 8 bits double: 11 bits single: 23 bits double:
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More information4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?
Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. 2. Define throughput. 3. Describe why using the clock rate of a processor is a bad way to measure performance. Provide
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationLECTURE 6. Multi-Cycle Datapath and Control
LECTURE 6 Multi-Cycle Datapath and Control SINGLE-CYCLE IMPLEMENTATION As we ve seen, single-cycle implementation, although easy to implement, could potentially be very inefficient. In single-cycle, we
More informationBeyond Pipelining. CP-226: Computer Architecture. Lecture 23 (19 April 2013) CADSL
Beyond Pipelining Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationCOSC 6385 Computer Architecture - Pipelining
COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must
More informationPoints available Your marks Total 100
CSSE 3 Computer Architecture I Rose-Hulman Institute of Technology Computer Science and Software Engineering Department Exam Name: Section: 3 This exam is closed book. You are allowed to use the reference
More informationCPE 335 Computer Organization. Basic MIPS Architecture Part I
CPE 335 Computer Organization Basic MIPS Architecture Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s8/index.html CPE232 Basic MIPS Architecture
More informationLecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University
Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationFull Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI
CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Full Datapath Branch Target Instruction Fetch Immediate 4 Today s Contents We have looked
More informationCS61C Performance. Lecture 23. April 21, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson)
cs 61C L23 performance.1 CS61C Performance Lecture 23 April 21, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs61c/schedule.html Outline Review HP-PA, Intel 80x86 instruction
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationDesign of the MIPS Processor (contd)
Design of the MIPS Processor (contd) First, revisit the datapath for add, sub, lw, sw. We will augment it to accommodate the beq and j instructions. Execution of branch instructions beq $at, $zero, L add
More informationMajor CPU Design Steps
Datapath Major CPU Design Steps. Analyze instruction set operations using independent RTN ISA => RTN => datapath requirements. This provides the the required datapath components and how they are connected
More informationLecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)
Lecture Topics Today: Single-Cycle Processors (P&H 4.1-4.4) Next: continued 1 Announcements Milestone #3 (due 2/9) Milestone #4 (due 2/23) Exam #1 (Wednesday, 2/15) 2 1 Exam #1 Wednesday, 2/15 (3:00-4:20
More informationSystems Architecture I
Systems Architecture I Topics A Simple Implementation of MIPS * A Multicycle Implementation of MIPS ** *This lecture was derived from material in the text (sec. 5.1-5.3). **This lecture was derived from
More informationPipelined Processor Design
Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Indian Institute of Science Bangalore virendra@computer.org Lecture 20 SE-273: Processor Design Courtesy: Prof. Vishwani Agrawal
More informationENE 334 Microprocessors
ENE 334 Microprocessors Lecture 6: Datapath and Control : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 3 th & 4 th Edition, Patterson & Hennessy, 2005/2008, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
More informationLecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August
Lecture 8: Control COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Datapath and Control Datapath The collection of state elements, computation elements,
More informationTopic #6. Processor Design
Topic #6 Processor Design Major Goals! To present the single-cycle implementation and to develop the student's understanding of combinational and clocked sequential circuits and the relationship between
More informationLecture 6 Datapath and Controller
Lecture 6 Datapath and Controller Peng Liu liupeng@zju.edu.cn Windows Editor and Word Processing UltraEdit, EditPlus Gvim Linux or Mac IOS Emacs vi or vim Word Processing(Windows, Linux, and Mac IOS) LaTex
More informationThe Von Neumann Computer Model
The Von Neumann Computer Model Partitioning of the computing engine into components: Central Processing Unit (CPU): Control Unit (instruction decode, sequencing of operations), Datapath (registers, arithmetic
More informationCENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu
CENG 342 Computer Organization and Design Lecture 6: MIPS Processor - I Bei Yu CEG342 L6. Spring 26 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified
More informationCSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content
3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design
More informationTHE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination
THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY Computer Organization (COMP 2611) Spring Semester, 2014 Final Examination May 23, 2014 Name: Email: Student ID: Lab Section Number: Instructions: 1. This
More informationCENG 3420 Lecture 06: Datapath
CENG 342 Lecture 6: Datapath Bei Yu byu@cse.cuhk.edu.hk CENG342 L6. Spring 27 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified to contain only: memory-reference
More informationSingle vs. Multi-cycle Implementation
Single vs. Multi-cycle Implementation Multicycle: Instructions take several faster cycles For this simple version, the multi-cycle implementation could be as much as 1.27 times faster (for a typical instruction
More informationALUOut. Registers A. I + D Memory IR. combinatorial block. combinatorial block. combinatorial block MDR
Microprogramming Exceptions and interrupts 9 CMPE Fall 26 A. Di Blas Fall 26 CMPE CPU Multicycle From single-cycle to Multicycle CPU with sequential control: Finite State Machine Textbook Edition: 5.4,
More informationSome material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier
Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science 6 PM 7 8 9 10 11 Midnight Time 30 40 20 30 40 20
More informationDesign of the MIPS Processor
Design of the MIPS Processor We will study the design of a simple version of MIPS that can support the following instructions: I-type instructions LW, SW R-type instructions, like ADD, SUB Conditional
More informationWhat is Pipelining? Time per instruction on unpipelined machine Number of pipe stages
What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism
More informationCS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST
CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C
More informationPipelining. Maurizio Palesi
* Pipelining * Adapted from David A. Patterson s CS252 lecture slides, http://www.cs.berkeley/~pattrsn/252s98/index.html Copyright 1998 UCB 1 References John L. Hennessy and David A. Patterson, Computer
More informationProcessor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)
More informationEECE 417 Computer Systems Architecture
EECE 417 Computer Systems Architecture Department of Electrical and Computer Engineering Howard University Charles Kim Spring 2007 1 Computer Organization and Design (3 rd Ed) -The Hardware/Software Interface
More informationChapter 5: The Processor: Datapath and Control
Chapter 5: The Processor: Datapath and Control Overview Logic Design Conventions Building a Datapath and Control Unit Different Implementations of MIPS instruction set A simple implementation of a processor
More informationProcessor Architecture
Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)
More informationLecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1
Lecture 3 Pipelining Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1 A "Typical" RISC ISA 32-bit fixed format instruction (3 formats) 32 32-bit GPR (R0 contains zero, DP take pair)
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationLecture 3: Evaluating Computer Architectures. How to design something:
Lecture 3: Evaluating Computer Architectures Announcements - (none) Last Time constraints imposed by technology Computer elements Circuits and timing Today Performance analysis Amdahl s Law Performance
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control
ELEC 52/62 Computer Architecture and Design Spring 217 Lecture 4: Datapath and Control Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849
More informationPipelining. Ideal speedup is number of stages in the pipeline. Do we achieve this? 2. Improve performance by increasing instruction throughput ...
CHAPTER 6 1 Pipelining Instruction class Instruction memory ister read ALU Data memory ister write Total (in ps) Load word 200 100 200 200 100 800 Store word 200 100 200 200 700 R-format 200 100 200 100
More informationComputer Architecture
Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in
More informationQuantifying Performance EEC 170 Fall 2005 Chapter 4
Quantifying Performance EEC 70 Fall 2005 Chapter 4 Performance Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation
More informationMEASURING COMPUTER TIME. A computer faster than another? Necessity of evaluation computer performance
Necessity of evaluation computer performance MEASURING COMPUTER PERFORMANCE For comparing different computer performances User: Interested in reducing the execution time (response time) of a task. Computer
More informationAdvanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017
Advanced Parallel Architecture Lessons 5 and 6 Annalisa Massini - Pipelining Hennessy, Patterson Computer architecture A quantitive approach Appendix C Sections C.1, C.2 Pipelining Pipelining is an implementation
More informationMeasure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding effects of underlying architecture
Chapter 2 Note: The slides being presented represent a mix. Some are created by Mark Franklin, Washington University in St. Louis, Dept. of CSE. Many are taken from the Patterson & Hennessy book, Computer
More informationMulticycle Approach. Designing MIPS Processor
CSE 675.2: Introduction to Computer Architecture Multicycle Approach 8/8/25 Designing MIPS Processor (Multi-Cycle) Presentation H Slides by Gojko Babić and Elsevier Publishing We will be reusing functional
More informationCSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements
CSE 4 Computer Architecture Spring 25 Lectures Exceptions and Introduction to Pipelining May 4, 25 Announcements Reading Assignment Sections 5.6, 5.9 The Processor Datapath and Control Section 6., Enhancing
More informationﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control
معماري كامپيوتر پيشرفته معماري MIPS data path and control abbasi@basu.ac.ir Topics Building a datapath support a subset of the MIPS-I instruction-set A single cycle processor datapath all instruction actions
More informationComputer Architecture s Changing Definition
Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction
More informationLets Build a Processor
Lets Build a Processor Almost ready to move into chapter 5 and start building a processor First, let s review Boolean Logic and build the ALU we ll need (Material from Appendix B) operation a 32 ALU result
More informationModern Computer Architecture
Modern Computer Architecture Lecture2 Pipelining: Basic and Intermediate Concepts Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each
More informationCS 4200/5200 Computer Architecture I
CS 4200/5200 Computer Architecture I MIPS Instruction Set Architecture Dr. Xiaobo Zhou Department of Computer Science CS420/520 Lec3.1 UC. Colorado Springs Adapted from UCB97 & UCB03 Review: Organizational
More informationCSE 141 Computer Architecture Summer Session I, Lecture 3 Performance and Single Cycle CPU Part 1. Pramod V. Argade
CSE 141 Computer Architecture Summer Session I, 2005 Lecture 3 Performance and Single Cycle CPU Part 1 Pramod V. Argade CSE141: Introduction to Computer Architecture Instructor: Pramod V. Argade (p2argade@cs.ucsd.edu)
More informationECE473 Computer Architecture and Organization. Pipeline: Control Hazard
Computer Architecture and Organization Pipeline: Control Hazard Lecturer: Prof. Yifeng Zhu Fall, 2015 Portions of these slides are derived from: Dave Patterson UCB Lec 15.1 Pipelining Outline Introduction
More informationECE 313 Computer Organization FINAL EXAM December 11, Multicycle Processor Design 30 Points
This exam is open book and open notes. Credit for problems requiring calculation will be given only if you show your work. 1. Multicycle Processor Design 0 Points In our discussion of exceptions in the
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More information