COSC3330 Computer Architecture Lecture 7. Datapath and Performance
1 COSC3330 Computer Architecture Lecture 7. Datapath and Performance Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston
2 Datapath Performance
3 Datapath Control Signals
[Figure: single-cycle datapath with Clock, RegFile (Xra, Yra, Zwa, Zdi, we, Xdo, Ydo), sign-extended immediate with imm enable mux, ALU (inputs A and B; controls ALS, ā/s, LF, ST, SD), and Memory (Address, Data, r/w, ld enable, st enable, msel)]
ALS (2 bits): 00: AU 01: LU 10: SU 11: Disable ALU
Logical Flag (LF, 4 bits): 0001: AND 0011: A 0101: B 0110: XOR 0111: OR
Shift Type (ST, 2 bits): 00: No Shift 01: Logical 10: Arithmetic 11: Rotate
Shift Direction (SD): 0: Left 1: Right
Memory operations: load $Z, ($X) and store $Y, ($X)
4 A Simple Processor (Single-Cycle Implementation)
[Figure: Instruction Register, Program Counter, Next PC gen, Microcode Memory, Single Cycle Datapath, Memory]
Microcode Memory outputs (width in bits where shown): X (5), Y (5), Z (5), imm (16), imm_en, we, ALS (2), ā/s, LF (4), ST (2), SD, ld_en, st_en, ṝ/w, msel
5 A Simple Processor
Example: add $4, $3, $2 (instruction fields: opcode, rs, rt, rd, shamt, funct)
Microcode Memory output: X=00011 Y=00010 Z=00100 imm=0 imm_en=0 we=1 ALS=00 ā/s=0 LF=0000 ST=00 SD=0 ld_en=0 st_en=0 ṝ/w=0 msel=0
[Figure: Microcode Memory, Single Cycle Datapath, Memory]
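The control word on this slide can be sketched as a packed bit field. This is a minimal illustration, not the course's implementation: the field names and the widths 5, 5, 5, 16, 2, 4, 2 come from the slides, while the 1-bit width of the remaining fields, the packing order, and the helper names `pack` and `unpack` are assumptions.

```python
# Field names follow the slide; widths not shown on the slide are ASSUMED to be 1 bit.
# The left-to-right packing order is also an assumption made for illustration.
FIELDS = [
    ("X", 5), ("Y", 5), ("Z", 5), ("imm", 16), ("imm_en", 1), ("we", 1),
    ("ALS", 2), ("a_s", 1), ("LF", 4), ("ST", 2), ("SD", 1),
    ("ld_en", 1), ("st_en", 1), ("r_w", 1), ("msel", 1),
]

def pack(word):
    """Pack a dict of field values into one integer, first field in the top bits."""
    value = 0
    for name, width in FIELDS:
        field = word.get(name, 0)
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        value = (value << width) | field
    return value

def unpack(value):
    """Inverse of pack: split an integer back into named fields."""
    word = {}
    for name, width in reversed(FIELDS):
        word[name] = value & ((1 << width) - 1)
        value >>= width
    return word

# Control word for the add example, using the values listed on the slide.
ADD = {
    "X": 0b00011, "Y": 0b00010, "Z": 0b00100, "imm": 0, "imm_en": 0, "we": 1,
    "ALS": 0b00, "a_s": 0, "LF": 0b0000, "ST": 0b00, "SD": 0,
    "ld_en": 0, "st_en": 0, "r_w": 0, "msel": 0,
}

WORD_BITS = sum(width for _, width in FIELDS)
print(f"{pack(ADD):0{WORD_BITS}b}")
```

Under these assumptions the word is 47 bits wide, and `unpack(pack(ADD))` returns the original field values.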
6 Microcode Control (1) Clear memory location 100, 104 (r0 hardwired to 0)
Instruction sequence: li r1,100; sw r0, (r1); addi r1,r1,4; sw r0, (r1)
Datapath control signals to fill in for each instruction: X (5), Y (5), Z (5), we, Imm_en, Imm_val, ALS, ā/s, LF, ST, SD, ld_en, st_en, ṝ/w, msel
ALS: 00: AU 01: LU 10: SU 11: Disable ALU
Logical Flag (LF): 0001: AND 0011: A 0101: B 0110: XOR 0111: OR
Shift Type (ST): 00: No Shift 01: Logical 10: Arithmetic 11: Rotate
Shift Direction (SD): 0: Left 1: Right
7 Microcode Control (1) Clear memory location 100, 104 (r0 hardwired to 0)
Instruction sequence: li r1,100; sw r0, (r1); addi r1,r1,4; sw r0, (r1)
Columns: X (5), Y (5), Z (5), we, Imm_en, Imm_val, ALS, ā/s, LF, ST, SD, ld_en, st_en, ṝ/w, msel
Values: x x x x 0101 x x 0 0 x x 0 0 x 11 x x x x x x x x x 0 0 X x 0 0 x 11 x x x x
ALS: 00: AU 01: LU 10: SU 11: Disable ALU
Logical Flag (LF): 0001: AND 0011: A 0101: B 0110: XOR 0111: OR
Shift Type (ST): 00: No Shift 01: Logical 10: Arithmetic 11: Rotate
Shift Direction (SD): 0: Left 1: Right
8 Microcode Control (2) Copy 4-byte data from 0xF000 to 0xA100; clear data at 0xF000
Instruction sequence: li r5, 0xF000; lw r6, (r5); li r7, 0xA100; sw r6, (r7); sw r0, (r5)
Datapath control signals to fill in for each instruction: X (5), Y (5), Z (5), we, Imm_en, Imm_val, ALS, ā/s, LF, ST, SD, ld_en, st_en, ṝ/w, msel
ALS: 00: AU 01: LU 10: SU 11: Disable ALU
Logical Flag (LF): 0001: AND 0011: A 0101: B 0110: XOR 0111: OR
Shift Type (ST): 00: No Shift 01: Logical 10: Arithmetic 11: Rotate
Shift Direction (SD): 0: Left 1: Right
9 Microcode Control (2) Copy 4-byte data from 0xF000 to 0xA100; clear data at 0xF000
Instruction sequence: li r5, 0xF000; lw r6, (r5); li r7, 0xA100; sw r6, (r7); sw r0, (r5)
Columns: X (5), Y (5), Z (5), we, Imm_en, Imm_val, ALS, ā/s, LF, ST, SD, ld_en, st_en, ṝ/w, msel
Values: X X xF X 0101 X X 0 0 X X X 11 X X X X X X xA X 0101 X X 0 0 X X 0 0 X 11 X X X X X 0 0 X 11 X X X X
ALS: 00: AU 01: LU 10: SU 11: Disable ALU
Logical Flag (LF): 0001: AND 0011: A 0101: B 0110: XOR 0111: OR
Shift Type (ST): 00: No Shift 01: Logical 10: Arithmetic 11: Rotate
Shift Direction (SD): 0: Left 1: Right
10 Datapath Control Signals
[Figure: single-cycle datapath with Clock, RegFile (Xra, Yra, Zwa, Zdi, we, Xdo, Ydo), sign-extended immediate with imm enable mux, ALU (inputs A and B; controls ALS, ā/s, LF, ST, SD), and Memory (Address, Data, r/w, ld enable, st enable, msel)]
ALS (2 bits): 00: AU 01: LU 10: SU 11: Disable ALU
Logical Flag (LF, 4 bits): 0001: AND 0011: A 0101: B 0110: XOR 0111: OR
Shift Type (ST, 2 bits): 00: No Shift 01: Logical 10: Arithmetic 11: Rotate
Shift Direction (SD): 0: Left 1: Right
Memory operations with an immediate offset: load $Z, 16 ($X) and store $Y, 4 ($X)
11 A Simple Processor (Single-Cycle Implementation)
[Figure: Instruction Register, Program Counter, Next PC gen, Microcode Memory, Single Cycle Datapath, Memory]
Microcode Memory outputs (width in bits where shown): X (5), Y (5), Z (5), imm (16), imm_en, we, ALS (2), ā/s, LF (4), ST (2), SD, ld_en, st_en, ṝ/w, msel
12 Instruction Fetching (PC Update)
[Figure: Next PC generation, Program Counter, Instruction Register, Memory (addr, data), RegFile, Datapath, Microcode ROM]
13 Absolute vs. Relative Instruction Addressing
Absolute: the next PC address is given as an absolute value. PC address = <given address>. Jump class examples: J LABEL, JR $r
Relative: an offset relative to the current PC address is given instead of an absolute address. PC address = <current PC address> + <offset>. Branch class examples: bne $src, $dest, LABEL and beq $src, $dest, LABEL
14 Sequential Instruction Fetch
[Figure: an adder computes Program Counter + 4; other blocks: Instruction Register, Memory (addr, data), RegFile, Datapath, Microcode ROM]
15 Branch Support
[Figure: a mux controlled by beq/bne (if true) selects between the extended Offset (from ROM) and the constant 4; the selected value is added to the Program Counter; other blocks: Instruction Register, Memory (addr, data), RegFile, Datapath, Microcode ROM]
16 Branch and Jump Support
[Figure: the branch logic of slide 15 plus a jr/j mux that selects between rs (jr) and the extended Target addr (from ROM) (j); a further mux chooses between the jump target and the branch/sequential path into the Program Counter; other blocks: Instruction Register, Memory (addr, data), RegFile, Datapath, Microcode ROM]
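The next-PC choices described on the last few slides (sequential, branch, j, jr) can be sketched in a few lines. This is only an illustration: the function and parameter names are hypothetical, and the branch case follows the slide's simplified model (PC + offset when the branch is taken) rather than any particular real ISA encoding.

```python
MASK32 = 0xFFFFFFFF  # assume a 32-bit program counter

def sign_extend(value, bits=16):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def next_pc(pc, kind="seq", offset=0, target=0, rs_value=0, taken=False):
    """Return the next PC.

    kind is one of "seq", "branch" (beq/bne), "j", or "jr".
    Slide model (assumption): a taken branch adds the sign-extended
    offset to the current PC; otherwise the PC advances by 4.
    """
    if kind == "j":          # absolute target from the instruction (via ROM)
        return target & MASK32
    if kind == "jr":         # absolute target read from register rs
        return rs_value & MASK32
    if kind == "branch" and taken:
        return (pc + sign_extend(offset)) & MASK32
    return (pc + 4) & MASK32

print(hex(next_pc(0x100)))                                       # sequential fetch
print(hex(next_pc(0x100, "branch", offset=0xFFF8, taken=True)))  # backward branch
print(hex(next_pc(0x100, "j", target=0x400)))                    # jump
```

The two jump cases ignore the current PC entirely, which is what makes them absolute; the branch and sequential cases are both "PC plus something", which is why a single adder with a mux on one input can serve both.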
17 Performance Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation Why is some hardware better than others for different programs? What factors of system performance are hardware related? (e.g., Do we need a new machine, or a new operating system?)
18 Which of These Airplanes Has the Best Performance? Airplane Passengers Range (mi) Speed (mph) Boeing Boeing BAC/Sud Concorde Douglas DC How much faster is the Concorde compared to the 747? How much bigger is the 747 than the Douglas DC-8?
19 Computer Performance: TIME, TIME, TIME Response Time (latency) How long does it take for my job to run? How long does it take to execute a job? How long must I wait for the database query? Throughput How many jobs can the machine run at once? What is the average execution rate? How much work is getting done?
20 Clock Cycles Instead of reporting program execution time in seconds, we often use cycles: seconds/program = (cycles/program) x (seconds/cycle). Clock ticks indicate when to start activities (one abstraction): cycle time = time between ticks = seconds per cycle; clock rate = cycles per second (1 Hz = 1 cycle/sec)
21 How to Improve Performance seconds/program = (cycles/program) x (seconds/cycle). So, to improve performance (everything else being equal) you can either decrease the # of required cycles for a program, or decrease the clock cycle time or, said another way, increase the clock rate.
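A small sketch of the relation above, seconds/program = cycles/program x seconds/cycle. The cycle counts and clock rates below are made-up numbers chosen only to show the two levers; they do not come from the lecture.

```python
def execution_time_s(cycles, clock_rate_hz):
    """seconds/program = cycles/program * seconds/cycle = cycles / clock rate."""
    return cycles / clock_rate_hz

# Hypothetical baseline: 10 billion cycles on a 2 GHz clock.
baseline = execution_time_s(10e9, 2.0e9)

# Lever 1: fewer cycles for the same program, same clock.
fewer_cycles = execution_time_s(8e9, 2.0e9)

# Lever 2: same cycle count, faster clock (shorter cycle time).
faster_clock = execution_time_s(10e9, 2.5e9)

for label, t in [("baseline", baseline),
                 ("fewer cycles", fewer_cycles),
                 ("faster clock", faster_clock)]:
    print(f"{label:13s} {t:.2f} s")
```

Both changes shorten the run by the same amount in this made-up case, which is the "everything else being equal" caveat in action: either lever only helps if the other quantity does not get worse.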
22 How Many Cycles Are Required for a Program? [Figure: timeline of the 1st, 2nd, 3rd, 4th, 5th, 6th, ... instructions over time] Could we assume that # of cycles = # of instructions? This assumption is incorrect: different instructions take different amounts of time on different machines.
23 Different Numbers of Cycles for Different Instructions [Figure: timeline of instructions with different durations] Multiplication takes more time than addition. Floating point operations take longer than integer ones. Accessing memory takes (in general) more time than accessing registers. Important point: changing the cycle time often changes the number of cycles required for various instructions (more later)
24 Performance Performance is determined by execution time. Do any of the other variables equal performance? # of cycles to execute program? # of instructions in program? # of cycles per second (frequency)? Average # of cycles per instruction (CPI)? Average # of instructions per second? Common pitfall: thinking one of the variables is indicative of performance when it really isn't.
25 CPI Example (CPI = average # of cycles per instruction) Suppose we have two implementations of the same instruction set architecture (ISA). For some program:
Machine A has a clock cycle time of 10 ns and a CPI of 2.0: time = 10 ns * 2.0 * # of instructions
Machine B has a clock cycle time of 20 ns and a CPI of 1.2: time = 20 ns * 1.2 * # of instructions
Which machine is faster for this program?
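The comparison can be worked through numerically. Because both machines run the same program on the same ISA, the instruction count cancels and only cycle time times CPI matters. The helper name below is hypothetical; the numbers are the ones on the slide.

```python
def time_per_instruction_ns(cycle_time_ns, cpi):
    """Average time per instruction = cycle time * cycles per instruction."""
    return cycle_time_ns * cpi

machine_a = time_per_instruction_ns(10, 2.0)   # 10 ns * 2.0
machine_b = time_per_instruction_ns(20, 1.2)   # 20 ns * 1.2

faster = "A" if machine_a < machine_b else "B"
ratio = max(machine_a, machine_b) / min(machine_a, machine_b)

print(f"A: {machine_a:.1f} ns/instr, B: {machine_b:.1f} ns/instr")
print(f"Machine {faster} is faster by a factor of {ratio:.2f}")
```

Machine A needs about 20 ns per instruction against about 24 ns for Machine B, so A is faster by a factor of about 1.2 even though its CPI is higher.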
26 # of Instructions Example A compiler designer is trying to decide between two code sequences for a particular machine.
Cycles per instruction: Class A: one cycle; Class B: two cycles; Class C: three cycles
                        First sequence           Second sequence
Class A instructions    2                        4
Class B instructions    1                        1
Class C instructions    2                        1
Total instructions      5                        6
Total cycles            2 + 1*2 + 2*3 = 10       4 + 1*2 + 1*3 = 9
Which sequence will be faster? What is the CPI for each sequence?
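The two questions on this slide can be checked with a short calculation. The class costs and instruction counts are taken from the slide; the function name and the dictionary layout are illustrative choices.

```python
# Cycles per instruction for each class, as given on the slide.
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}

def summarize(counts):
    """Return (instruction count, total cycles, CPI) for one code sequence."""
    instructions = sum(counts.values())
    cycles = sum(n * CYCLES_PER_CLASS[cls] for cls, n in counts.items())
    return instructions, cycles, cycles / instructions

first = summarize({"A": 2, "B": 1, "C": 2})
second = summarize({"A": 4, "B": 1, "C": 1})

for name, (instructions, cycles, cpi) in [("first", first), ("second", second)]:
    print(f"{name}: {instructions} instructions, {cycles} cycles, CPI = {cpi}")
```

The second sequence executes more instructions (6 vs. 5) but takes fewer cycles (9 vs. 10), so it is faster on the same clock; its CPI is 1.5 against 2.0 for the first sequence.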
27 Amdahl's Law Execution Time After Improvement = Execution Time Unaffected + (Execution Time Affected / Amount of Improvement) [Figure: bars comparing time before improvement and time after improvement]
28 Amdahl's Law (Gene Amdahl) Speed-up = Perf_new / Perf_old = Exec_time_old / Exec_time_new = 1 / ((1 - f) + f / P). Performance improvement from using a faster mode is limited by the fraction of time the faster mode can be applied. [Figure: T_old is split into (1 - f) and f; T_new is split into (1 - f) and f / P]
29 Amdahl's Law Analogy: Spring Break. Driving from Houston to South Padre Island: 60 miles/hr from Houston to Kingsville, 120 miles/hr from Kingsville to South Padre Island. How much time can you save compared with driving all the way at 60 miles/hr from Houston to Padre Island? About 5 hr 10 min vs. 6 hr 10 min. The key is to speed up the biggest portion, i.e. speed up frequently executed blocks.
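A minimal sketch of the speed-up formula on this slide, with f the fraction of time that can use the faster mode and P the speed-up of that mode. The function names and the sample values of f and P in the table are illustrative choices, not values from the lecture.

```python
def amdahl_speedup(f, p):
    """Speed-up = 1 / ((1 - f) + f / p)."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if p <= 0:
        raise ValueError("p must be positive")
    return 1.0 / ((1.0 - f) + f / p)

def amdahl_limit(f):
    """Upper bound on the speed-up as p grows without bound: 1 / (1 - f)."""
    return float("inf") if f == 1.0 else 1.0 / (1.0 - f)

# Print a small table: rows are fractions f, columns are values of P.
p_values = (2, 4, 8, 16, 64)
print("f     " + "".join(f"P={p:<6d}" for p in p_values) + "limit")
for f in (0.10, 0.25, 0.50, 0.90):
    row = "".join(f"{amdahl_speedup(f, p):<8.2f}" for p in p_values)
    print(f"{f:<6.2f}{row}{amdahl_limit(f):.2f}")
```

The last column shows why the unaffected fraction dominates: with f = 0.5 the speed-up can never reach 2, no matter how large P becomes.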
30 Example Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster? [Figure: before, 20 s (other) + 80 s (multiply); after, 20 s (other) + 5 s (multiply)] Principle: make the common case fast
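The question can be answered by solving Amdahl's time equation for the improvement factor. The sketch below uses the slide's numbers (100 s total, 80 s in multiply, target 4x); the function name is hypothetical, and the extra 5x case is an added illustration of when no improvement factor is enough.

```python
def required_improvement(total_s, affected_s, target_speedup):
    """Solve  total / target = unaffected + affected / n  for n.

    Returns None when the target is unreachable, i.e. when the unaffected
    time alone already uses up the whole time budget.
    """
    unaffected_s = total_s - affected_s
    budget_s = total_s / target_speedup - unaffected_s  # time left for the affected part
    if budget_s <= 0:
        return None
    return affected_s / budget_s

print(required_improvement(100, 80, 4))  # multiply must shrink from 80 s to 5 s
print(required_improvement(100, 80, 5))  # extra case: 20 s of other work leaves no budget
```

For the 4x target the multiply time must drop from 80 s to 5 s, a factor of 16. A 5x target would require the whole program to finish in 20 s, which the 20 s of non-multiply work already consumes.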
31 Example A finance company developed in-house option-pricing software based on the Monte Carlo method (compute intensive). It takes 1 second to finish a Monte Carlo run. The company is considering a GPU to enhance performance. Part of the calculation (0.6 s out of the 1 s) can be offloaded to the GPU (6x speedup). [Figure: 0.4 s on CPU, 0.6 s offloadable] What will the total speedup be?
32 Amdahl's Law Accelerate HD encoding using a GPU. Assume 50% of the time is spent on ME (motion estimation). ME is offloaded to the GPU, F = 50%; the rest runs on the CPU. What is the overall performance speedup?
33 Remember Performance is specific to a particular program. Total execution time is a consistent summary of performance. For a given architecture, performance increases come from: increases in clock rate (without adverse CPI effects); improvements in processor organization that lower CPI; compiler enhancements that lower CPI and/or instruction count. Pitfall: expecting improvement in one aspect of a machine's performance to affect the total performance
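Both GPU examples can be worked with the time form of Amdahl's Law (time after = unaffected + affected / improvement). The option-pricing numbers come from the slide. The HD-encoding slide gives no GPU speed-up factor, so the factors tried below are hypothetical and only illustrate the bound.

```python
def time_after_improvement(unaffected_s, affected_s, improvement):
    """Execution time after improvement = unaffected + affected / improvement."""
    return unaffected_s + affected_s / improvement

# Option pricing: 1 s total, 0.6 s offloaded to a GPU that is 6x faster.
pricing_old = 1.0
pricing_new = time_after_improvement(0.4, 0.6, 6)
pricing_speedup = pricing_old / pricing_new
print(f"option pricing: {pricing_new:.2f} s, speedup {pricing_speedup:.2f}x")

# HD encoding: 50% of a normalized run time of 1.0 is ME on the GPU.
# The GPU factors are HYPOTHETICAL; the slide does not state one.
hd_speedups = {}
for gpu_factor in (2, 10, 100):
    hd_speedups[gpu_factor] = 1.0 / time_after_improvement(0.5, 0.5, gpu_factor)
    print(f"HD encoding, GPU {gpu_factor:>3}x on ME: overall {hd_speedups[gpu_factor]:.2f}x")
```

The pricing run drops from 1 s to about 0.5 s, a total speedup of about 2x even though the GPU portion is 6x faster. For HD encoding with F = 50%, the overall speedup approaches but never reaches 2x, whatever the GPU factor.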
34 Parallelism vs. Speedup [Figure: Amdahl's Law speed-up as a function of parallelism; y-axis: speed-up; x-axis: code portion in faster mode (f); one curve per P from P=1 to P=64; annotated values include 1.11x and 1.33x]
35 Gustafson's Law (John Gustafson) Amdahl's Law killed massive parallel processing (MPP); Gustafson came to the rescue. [Figure: T_new = Seq + Parallel; T_old = Seq + P * Parallel] Assume: Seq + Parallel = 1 (T_new). Speedup = Seq + p * (1 - Seq), where p = parallel factor. If Seq diminishes with increased problem size, Speedup -> p
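A minimal sketch of the scaled speed-up formula on this slide, Speedup = Seq + p * (1 - Seq), with the parallel run time normalized to 1. The function name and the sample sequential fractions are illustrative choices.

```python
def gustafson_speedup(seq, p):
    """Scaled speed-up = seq + p * (1 - seq), with seq + parallel = 1."""
    if not 0.0 <= seq <= 1.0:
        raise ValueError("seq must be between 0 and 1")
    return seq + p * (1.0 - seq)

# As the sequential share shrinks (for example, because the problem
# size grows), the speed-up approaches the parallel factor p.
p = 64
for seq in (0.5, 0.1, 0.01):
    print(f"seq = {seq:<5} speedup = {gustafson_speedup(seq, p):.2f} (p = {p})")
```

With p = 64 the speed-up climbs from 32.5 at seq = 0.5 toward 64 as seq shrinks, which is the "Speedup -> p" statement on the slide.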
37 Gustafson's Law [Figure: average speed over successive 100-mile legs]
50 mph for the first 100 miles, then 100 mph for the 2nd 100 miles: average 66.67 mph
50 mph for the first 100 miles, then 100 mph for the 2nd and 3rd 100 miles: average 75 mph
50 mph for the first 100 miles, then 100 mph for the 2nd, 3rd, and 4th 100 miles: average 80 mph
Suppose a car has already been travelling for some time at a speed of less than 50 km/h. Given enough time and distance to travel, the car's average speed can reach 100 km/h as long as it drives faster than 100 km/h for some time. The average speed can also reach 120 km/h and even 150 km/h as long as it drives fast enough in the following part.
38 Amdahl versus Gustafson Who is right?
39 Amdahl versus Gustafson Amdahl's presumption is a fixed data size. Both laws are in fact different perspectives on the same truth: one sees data size as fixed and the other sees the relation as a function of data size.
40 Additional Example of Performance Evaluation
Operation            Frequency   Clock cycle count
ALU ops (reg-reg)    43%         1
Loads                21%         2
Stores               12%         2
Branches             24%         2
Assume 25% of the ALU ops directly use a loaded operand that is not used again. We propose adding ALU instructions that have one source operand in memory. These new reg-mem instructions take 2 clock cycles. Also assume that the extended instruction set increases the branch cycle count by 1, with no impact on cycle time. Would this change improve performance?
41 Additional Example of Performance Evaluation
Operation            Frequency   Clock cycle count
ALU ops (reg-reg)    43%         1
Loads                21%         2
Stores               12%         2
Branches             24%         2
Assume 25% of the ALU ops directly use a loaded operand that is not used again. We propose adding ALU instructions that have one source operand in memory. These new reg-mem instructions take 2 clock cycles. Also assume that the extended instruction set increases the branch cycle count by 1, with no impact on cycle time. Would this change improve performance?
Cycles_old = 0.43*1 + 0.21*2 + 0.12*2 + 0.24*2 = 1.57
Cycles_new = (0.43 - 0.25*0.43)*1 + (0.21 - 0.25*0.43)*2 + (0.25*0.43)*2 + 0.12*2 + 0.24*3 = 1.7025
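The comparison can be reproduced with a short calculation. The sketch assumes that each affected ALU op and the load feeding it are replaced by a single reg-mem instruction, so both the ALU and load frequencies drop by 0.25 * 0.43; that reading, and all names below, are assumptions for illustration. Frequencies are kept relative to the original instruction count.

```python
# Instruction mix from the slide: class -> (frequency, cycles).
OLD_MIX = {
    "alu": (0.43, 1),
    "load": (0.21, 2),
    "store": (0.12, 2),
    "branch": (0.24, 2),
}

def total_cycles(mix):
    """Cycles per original instruction: sum of frequency * cycle count."""
    return sum(freq * cycles for freq, cycles in mix.values())

def with_reg_mem(mix, fused_fraction=0.25, reg_mem_cycles=2, branch_penalty=1):
    """Build the new mix: fuse some ALU ops with their loads, slow down branches."""
    alu_freq, alu_cycles = mix["alu"]
    load_freq, load_cycles = mix["load"]
    branch_freq, branch_cycles = mix["branch"]
    fused = fused_fraction * alu_freq  # ASSUMED: one ALU op + one load -> one reg-mem op
    new_mix = dict(mix)
    new_mix["alu"] = (alu_freq - fused, alu_cycles)
    new_mix["load"] = (load_freq - fused, load_cycles)
    new_mix["reg_mem"] = (fused, reg_mem_cycles)
    new_mix["branch"] = (branch_freq, branch_cycles + branch_penalty)
    return new_mix

NEW_MIX = with_reg_mem(OLD_MIX)
cycles_old = total_cycles(OLD_MIX)
cycles_new = total_cycles(NEW_MIX)
instr_new = sum(freq for freq, _ in NEW_MIX.values())  # relative instruction count

print(f"cycles old = {cycles_old:.4f}, cycles new = {cycles_new:.4f}")
print(f"new instruction count = {instr_new:.4f} of the original")
print("improves performance" if cycles_new < cycles_old else "does not improve performance")
```

Under these assumptions the program executes about 11% fewer instructions, but the total cycle count rises from about 1.57 to about 1.70 per original instruction. With the cycle time unchanged, the proposed change makes the program slower.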
42 Additional Example of Performance Evaluation FP instructions = 25%. Average CPI of FP instructions = 4.0. Average CPI of other instructions = 1.33. FPSQRT = 2% of all instructions, CPI of FPSQRT = 20. Design Option 1: decrease the CPI of FPSQRT to 2. Design Option 2: decrease the average CPI of all FP instructions to 2.5.
43 Additional Example of Performance Evaluation FP instructions = 25%. Average CPI of FP instructions = 4.0. Average CPI of other instructions = 1.33. FPSQRT = 2% of all instructions, CPI of FPSQRT = 20. Design Option 1: decrease the CPI of FPSQRT to 2. Design Option 2: decrease the average CPI of all FP instructions to 2.5.
Original CPI = 0.25*4.0 + 1.33*(1-0.25) = 2.0
Option 1 CPI = 2.0 - 2%*(20-2) = 1.64
Option 2 CPI = 0.25*2.5 + 1.33*(1-0.25) = 1.625
Speedup of Option 1 = 2/1.64 = 1.22
Speedup of Option 2 = 2/1.625 = 1.23
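The CPI arithmetic can be checked with a few lines. The sketch uses 1.33 exactly as written on the slide, so the intermediate values differ slightly from the slide's rounded figures (for example 1.9975 instead of 2.0); the function name is an illustrative choice.

```python
def average_cpi(fp_fraction, fp_cpi, other_cpi):
    """Weighted average CPI over FP and non-FP instructions."""
    return fp_fraction * fp_cpi + (1.0 - fp_fraction) * other_cpi

original = average_cpi(0.25, 4.0, 1.33)

# Option 1: FPSQRT (2% of all instructions) drops from CPI 20 to CPI 2.
option1 = original - 0.02 * (20 - 2)

# Option 2: every FP instruction drops to an average CPI of 2.5.
option2 = average_cpi(0.25, 2.5, 1.33)

print(f"original CPI = {original:.4f}")
print(f"option 1 CPI = {option1:.4f}, speedup = {original / option1:.2f}")
print(f"option 2 CPI = {option2:.4f}, speedup = {original / option2:.2f}")
```

Both options land close together, about 1.22x and 1.23x, with Option 2 slightly ahead when the clock rate is the same for all designs.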
44 Additional Example of Performance Evaluation Clock freq = 1.4 GHz. FP instructions = 25%. Average CPI of FP instructions = 4.0. Average CPI of other instructions = 1.33. FPSQRT = 2%, CPI of FPSQRT = 20. Design Option 1: decrease the CPI of FPSQRT to 2, clock freq = 1.2 GHz. Design Option 2: decrease the average CPI of all FP instructions to 2.5, clock freq = 1.1 GHz.
45 Additional Example of Performance Evaluation Clock freq = 1.4 GHz. FP instructions = 25%. Average CPI of FP instructions = 4.0. Average CPI of other instructions = 1.33. FPSQRT = 2%, CPI of FPSQRT = 20. Design Option 1: decrease the CPI of FPSQRT to 2, clock freq = 1.2 GHz. Design Option 2: decrease the average CPI of all FP instructions to 2.5, clock freq = 1.1 GHz.
Original CPI = 2.0, IPC = 1/2, Inst/Sec = 1/2 * 1.4G = 0.7G inst/s
Option 1 CPI = 1.64, IPC = 1/1.64, Inst/Sec = 1/1.64 * 1.2G = 0.73G inst/s
Option 2 CPI = 1.625, IPC = 1/1.625, Inst/Sec = 1/1.625 * 1.1G = 0.68G inst/s
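Once each design has its own clock frequency, the designs are compared by instructions per second = clock rate / CPI. The sketch below uses the CPI values and frequencies from the slide; the dictionary layout and names are illustrative.

```python
def instructions_per_second(clock_hz, cpi):
    """Throughput = IPC * clock rate = clock rate / CPI."""
    return clock_hz / cpi

# (clock frequency in Hz, CPI) for each design, as given on the slide.
designs = {
    "original": (1.4e9, 2.0),
    "option 1": (1.2e9, 1.64),
    "option 2": (1.1e9, 1.625),
}

rates = {name: instructions_per_second(hz, cpi) for name, (hz, cpi) in designs.items()}
best = max(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate / 1e9:.2f} G inst/s")
print(f"highest throughput: {best}")
```

Option 2 had the lowest CPI, but its slower clock puts it below the original design; Option 1 comes out ahead at about 0.73 G inst/s. This is the earlier pitfall in concrete form: CPI alone is not performance.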
More informationECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.
ECE C61 Computer Architecture Lecture 2 performance Prof Alok N Choudhary choudhar@ecenorthwesternedu 2-1 Today s s Lecture Performance Concepts Response Time Throughput Performance Evaluation Benchmarks
More informationCpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath
CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath CPE 442 single-cycle datapath.1 Outline of Today s Lecture Recap and Introduction Where are we with respect to the BIG picture?
More informationECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 14: One Cycle MIPs Datapath Adapted from Computer Organization and Design, Patterson & Hennessy, UCB R-Format Instructions Read two register operands Perform
More informationPerformance. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]
Performance CS 3410 Computer System Organization & Programming [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon] Performance Complex question How fast is the processor? How fast your application runs?
More informationLecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)
Lecture Topics Today: Single-Cycle Processors (P&H 4.1-4.4) Next: continued 1 Announcements Milestone #3 (due 2/9) Milestone #4 (due 2/23) Exam #1 (Wednesday, 2/15) 2 1 Exam #1 Wednesday, 2/15 (3:00-4:20
More information--------------------------------------------------------------------------------------------------------------------- 1. Objectives: Using the Logisim simulator Designing and testing a Pipelined 16-bit
More informationProcessor. Han Wang CS3410, Spring 2012 Computer Science Cornell University. See P&H Chapter , 4.1 4
Processor Han Wang CS3410, Spring 2012 Computer Science Cornell University See P&H Chapter 2.16 20, 4.1 4 Announcements Project 1 Available Design Document due in one week. Final Design due in three weeks.
More informationInstructions: Language of the Computer
CS359: Computer Architecture Instructions: Language of the Computer Yanyan Shen Department of Computer Science and Engineering 1 The Language a Computer Understands Word a computer understands: instruction
More informationCPU Performance Pipelined CPU
CPU Performance Pipelined CPU Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University See P&H Chapters 1.4 and 4.5 In a major matter, no details are small French Proverb 2 Big Picture:
More informationCS 2506 Computer Organization II
Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted one-page formula sheet. No calculators or other computing devices may
More informationLecture 7 Pipelining. Peng Liu.
Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt
More informationinst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 18 CPU Design: The Single-Cycle I ! Nasty new windows vulnerability!
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 18 CPU Design: The Single-Cycle I CS61C L18 CPU Design: The Single-Cycle I (1)! 2010-07-21!!!Instructor Paul Pearce! Nasty new windows vulnerability!
More informationCS3350B Computer Architecture Winter 2015
CS3350B Computer Architecture Winter 2015 Lecture 5.5: Single-Cycle CPU Datapath Design Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design, Patterson
More informationA Processor. Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. See: P&H Chapter , 4.1-3
A Processor Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 2.16-20, 4.1-3 Let s build a MIPS CPU but using Harvard architecture Basic Computer System Registers ALU
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationCS 61C: Great Ideas in Computer Architecture Pipelining and Hazards
CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/sp16 1 Pipelined Execution Representation Time
More informationECE 486/586. Computer Architecture. Lecture # 7
ECE 486/586 Computer Architecture Lecture # 7 Spring 2015 Portland State University Lecture Topics Instruction Set Principles Instruction Encoding Role of Compilers The MIPS Architecture Reference: Appendix
More informationCPU Organization (Design)
ISA Requirements CPU Organization (Design) Datapath Design: Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions (e.g., Registers, ALU, Shifters, Logic
More informationCSE140: Components and Design Techniques for Digital Systems
CSE4: Components and Design Techniques for Digital Systems Tajana Simunic Rosing Announcements and Outline Check webct grades, make sure everything is there and is correct Pick up graded d homework at
More informationCPE 335 Computer Organization. Basic MIPS Architecture Part I
CPE 335 Computer Organization Basic MIPS Architecture Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s8/index.html CPE232 Basic MIPS Architecture
More informationENE 334 Microprocessors
ENE 334 Microprocessors Lecture 6: Datapath and Control : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 3 th & 4 th Edition, Patterson & Hennessy, 2005/2008, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/
More informationData paths for MIPS instructions
You are familiar with how MIPS programs step from one instruction to the next, and how branches can occur conditionally or unconditionally. We next examine the machine level representation of how MIPS
More informationComputer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture
Computer Science 324 Computer Architecture Mount Holyoke College Fall 2009 Topic Notes: MIPS Instruction Set Architecture vonneumann Architecture Modern computers use the vonneumann architecture. Idea:
More informationMIPS ISA. 1. Data and Address Size 8-, 16-, 32-, 64-bit 2. Which instructions does the processor support
Components of an ISA EE 357 Unit 11 MIPS ISA 1. Data and Address Size 8-, 16-, 32-, 64-bit 2. Which instructions does the processor support SUBtract instruc. vs. NEGate + ADD instrucs. 3. Registers accessible
More informationTopic Notes: MIPS Instruction Set Architecture
Computer Science 220 Assembly Language & Comp. Architecture Siena College Fall 2011 Topic Notes: MIPS Instruction Set Architecture vonneumann Architecture Modern computers use the vonneumann architecture.
More informationPipelined CPUs. Study Chapter 4 of Text. Where are the registers?
Pipelined CPUs Where are the registers? Study Chapter 4 of Text Second Quiz on Friday. Covers lectures 8-14. Open book, open note, no computers or calculators. L17 Pipelined CPU I 1 Review of CPU Performance
More informationReview: latency vs. throughput
Lecture : Performance measurement and Instruction Set Architectures Last Time Introduction to performance Computer benchmarks Amdahl s law Today Take QUIZ 1 today over Chapter 1 Turn in your homework on
More informationL19 Pipelined CPU I 1. Where are the registers? Study Chapter 6 of Text. Pipelined CPUs. Comp 411 Fall /07/07
Pipelined CPUs Where are the registers? Study Chapter 6 of Text L19 Pipelined CPU I 1 Review of CPU Performance MIPS = Millions of Instructions/Second MIPS = Freq CPI Freq = Clock Frequency, MHz CPI =
More informationEE 457 Midterm Summer 14 Redekopp Name: Closed Book / 105 minutes No CALCULATORS Score: / 100
EE 47 Midterm Summer 4 Redekopp Name: Closed Book / minutes No CALCULATORS Score: /. (7 pts.) Short Answer [Fill in the blanks or select the correct answer] a. If a control signal must be valid during
More informationCOMP303 Computer Architecture Lecture 9. Single Cycle Control
COMP33 Computer Architecture Lecture 9 Single Cycle Control A Single Cycle Datapath We have everything except control signals (underlined) RegDst busw Today s lecture will look at how to generate the control
More informationCh 5: Designing a Single Cycle Datapath
Ch 5: esigning a Single Cycle path Computer Systems Architecture CS 365 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Memory path Input Output Today s Topic:
More informationInstruction Set Architecture (ISA)
Instruction Set Architecture (ISA)... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data
More information