Computer Performance. Relative Performance. Ways to measure Performance. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So
- Amanda George
- 5 years ago
Transcription
Computer Architecture ELEC3441
Computer Performance
2nd Semester, Dr. Hayden Kwok-Hay So
Department of Electrical and Electronic Engineering
2nd sem. '8-9 ENGG344 - HS

- How do you measure the performance of a computer?
- How do you make a computer fast?

Ways to Measure Performance
- Execution time (response time): the time to finish a task
- Throughput: the number of tasks finished per unit time
- We will focus on execution time in this course

Relative Performance
- Define the performance of a computer as: $\text{Performance} = \dfrac{1}{\text{ExecutionTime}}$
- Computer B is $n$ times faster than Computer A if: $n = \dfrac{\text{Performance}_B}{\text{Performance}_A} = \dfrac{\text{ExecutionTime}_A}{\text{ExecutionTime}_B}$
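The relative-performance definition above can be sketched in a few lines of Python. This is only an illustration: the function names and the timings (10 s and 8 s) are assumptions chosen for the example, not values from the slides.

```python
def performance(execution_time_s: float) -> float:
    """Performance is defined as the reciprocal of execution time."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return 1.0 / execution_time_s


def relative_performance(time_a_s: float, time_b_s: float) -> float:
    """Return n such that Computer B is n times faster than Computer A."""
    return performance(time_b_s) / performance(time_a_s)


# Hypothetical timings: A needs 10 s and B needs 8 s for the same task.
n = relative_performance(10.0, 8.0)
print(f"Computer B is {n:.2f} times faster than Computer A")
```

A result below 1 means that Computer B is the slower machine.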
Quick Check
- Computer A finishes a task in 5 s; Computer B finishes the same task in 4 s. Which one is faster, and by how much?
- $\dfrac{\text{Performance}_B}{\text{Performance}_A} = \dfrac{\text{ExecutionTime}_A}{\text{ExecutionTime}_B} = \dfrac{5}{4} = 1.25$
- Computer B is 1.25 times faster than Computer A

Ways to Measure Execution Time
- Wall clock time (elapsed time)
  - The total time a user experiences that a computer takes to finish a task
  - Includes OS overhead, I/O, idle time, and time shared with other users
- CPU time
  - The time spent on a user task in the CPU
  - User CPU time + OS CPU time
  - Does not include I/O, time spent by other users, etc.
- Focus on CPU time in this course

    $ time shasum afile
    32ecc0e9eec9d5dc775752efeac280cecebdc  afile
    real  0m20.77s
    user  0m2.835s
    sys   0m.786s

The Iron Law
- How can we determine the CPU time needed to execute a program?
- $\text{CPUTime} = \dfrac{\text{\# of instructions}}{\text{program}} \times \dfrac{\text{\# of cycles}}{\text{instruction}} \times \dfrac{\text{time}}{\text{cycle}}$

CPU Time Step 1
- $\text{CPUTime} = \text{CycleCount} \times \text{CycleTime} = \dfrac{\text{CycleCount}}{\text{ClockFrequency}}$
- Most modern CPUs are synchronous digital systems
- The time needed to finish executing a task is the number of cycles needed for that task multiplied by the cycle time
- Review: digital system design
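As a rough sketch of how the two CPU-time formulas above agree, the following Python computes the same CPU time from a cycle count and from the three Iron Law factors. The workload figures (3 billion instructions, CPI of 2, 3 GHz clock) are hypothetical values picked for illustration.

```python
def cpu_time_from_cycles(cycle_count: float, clock_frequency_hz: float) -> float:
    """CPUTime = CycleCount / ClockFrequency (same as CycleCount * CycleTime)."""
    return cycle_count / clock_frequency_hz


def cpu_time_iron_law(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPUTime = (instructions/program) * (cycles/instruction) * (time/cycle)."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: 3 billion instructions, average CPI of 2, 3 GHz clock.
clock_hz = 3e9
instructions = 3e9
cpi = 2.0

via_cycles = cpu_time_from_cycles(instructions * cpi, clock_hz)
via_iron_law = cpu_time_iron_law(instructions, cpi, 1.0 / clock_hz)
print(f"{via_cycles:.3f} s via cycle count, {via_iron_law:.3f} s via the Iron Law")
```

Both routes give the same answer because the cycle count is simply the instruction count multiplied by the average CPI.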
Synchronous Sequential Circuits
- A synchronous sequential circuit contains exactly 1 clock signal
- All state elements are connected to the same clock signal → the state of the entire circuit is updated at the same time
- Common form of synchronous sequential circuits:
  [Figure: input → blocks of combinational logic separated by registers → output]

Clock Signal
- A clock signal is a particularly important signal in a synchronous sequential circuit
  - It controls the action of all DFFs
- A clock signal toggles between 0 and 1 periodically
- The frequency of the toggling determines the maximum speed of the circuit
  - E.g., in the accumulator example earlier, the output S cannot change faster than the clock frequency

      X:  x0, x1, x2
      S:  0, x0, x0 + x1, x0 + x1 + x2

- $\text{clock frequency} = \dfrac{1}{\text{clock period}}$
- E.g., an Intel CPU runs at 3 GHz, mobile phone processors run in the GHz range, and the lab FPGA board runs at 50 MHz

Timing in Synchronous Circuits
[Figure: glitch example with signals a, b, c, d and output y]
- In a synchronous sequential circuit, signal changes occur only at a clock edge
- All signals are therefore synchronized to change values right after a clock edge
- In the above example, the correct value of y must be available BEFORE the next clock edge
  - Avoid glitches

Timing in Synchronous Circuits
[Figure: from the glitch example, combinational logic with inputs a, b and signals x, y; the output must be stable before the clock edge]
- In general, the propagation delay through the combinational logic between any two registers must be shorter than the clock period
- The longest such path is called the critical path of the circuit
- The critical path determines the maximum clock speed
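The critical-path rule above can be illustrated with a small Python sketch. The path names and delays are hypothetical, and the model is deliberately simplified: it ignores register setup time and clock skew, which a real timing analysis would include.

```python
def critical_path(path_delays_s: dict[str, float]) -> tuple[str, float]:
    """Return the slowest register-to-register path and the clock frequency bound.

    The clock period must exceed the longest propagation delay, so
    1 / (longest delay) is an upper bound on the clock frequency.
    """
    name = max(path_delays_s, key=path_delays_s.get)
    return name, 1.0 / path_delays_s[name]


# Hypothetical combinational delays between pairs of registers, in seconds.
paths = {"a -> x": 4e-9, "x -> y": 7e-9, "b -> y": 10e-9}

name, f_bound_hz = critical_path(paths)
print(f"critical path: {name}, clock must stay below about {f_bound_hz / 1e6:.0f} MHz")
```

Shortening any path other than the critical one leaves the frequency bound unchanged.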
CPU Time Step 1: Summary
- $\text{CPUTime} = \text{CycleCount} \times \text{CycleTime} = \dfrac{\text{CycleCount}}{\text{ClockFrequency}}$
- To improve performance:
  1. Increase clock frequency
  2. Reduce cycle count
- Increasing the clock frequency → shorter critical path → less work accomplished in 1 cycle → more cycles needed
  - Engineers need to trade off between the two

CPU Time Step 2: Cycles Per Instruction (CPI)
- How many cycles does it take to finish a program?
- $\text{CycleCount} = \text{InstructionCount} \times \text{CyclesPerInstruction}$
- Program A has 2000 instructions; each instruction takes 2 cycles to finish. How many cycles does it take to complete Program A?
- Program B has 3000 instructions; 2000 of them take 2 cycles and 1000 of them take 1 cycle. How many cycles does the program take to finish?

Average CPI
- In general, different machine instructions may take different amounts of time to complete
- Assuming $n$ classes of instructions, the total number of clock cycles is: $\text{CycleCount} = \sum_{i=1}^{n} \text{CPI}_i \times \text{InstructionCount}_i$
- Weighted average CPI: $\text{CPI} = \dfrac{\text{CycleCount}}{\text{InstructionCount}} = \sum_{i=1}^{n} \text{CPI}_i \times \dfrac{\text{InstructionCount}_i}{\text{InstructionCount}}$

CPI Example (1)
[Table: cycles per instruction for classes C1, C2, C3 (C2 = 4, C3 = 8) and the instruction counts produced by Compiler J; the remaining values are not preserved in this transcription]
- The ISA of computer A includes 3 classes of instructions that take different numbers of cycles to complete. A program P is compiled using compiler J, resulting in the utilization above.
- What is the average CPI of the compiled program?
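A minimal sketch of the weighted-average CPI calculation follows. The instruction mix is hypothetical, and the 1-cycle cost for class C1 is an assumption (only the 4-cycle and 8-cycle values survive in the transcription), so the printed result does not reproduce the answer on the slide.

```python
def total_cycles(cpi_by_class: dict[str, int], count_by_class: dict[str, int]) -> int:
    """CycleCount = sum over classes of CPI_i * InstructionCount_i."""
    return sum(cpi_by_class[c] * count_by_class[c] for c in count_by_class)


def average_cpi(cpi_by_class: dict[str, int], count_by_class: dict[str, int]) -> float:
    """Weighted average CPI = CycleCount / InstructionCount."""
    instruction_count = sum(count_by_class.values())
    return total_cycles(cpi_by_class, count_by_class) / instruction_count


# Cycles per class: C2 and C3 follow the slide; C1 = 1 is an assumption.
cpi_by_class = {"C1": 1, "C2": 4, "C3": 8}

# Hypothetical instruction mix (not the slide's data).
mix = {"C1": 500, "C2": 100, "C3": 50}

print("total cycles:", total_cycles(cpi_by_class, mix))
print("average CPI:", average_cpi(cpi_by_class, mix))
```

Comparing two compilers with this sketch means comparing `total_cycles`, not `average_cpi`, because a lower CPI can still come with more instructions.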
CPI Example (2)
[Table: cycles per instruction for classes C1, C2, C3 and the instruction counts produced by Compiler J and Compiler K; values are not preserved in this transcription]
- A newer compiler K was developed to compile the same program P, resulting in the utilization above.
- What is the average CPI of the compiled program using compiler K? Ans: 2.3
- Which compiler was better?

CPI Example (3)
[Table: as above, extended with total # of instructions, # of cycles, and CPI for Compiler J and Compiler K; values are not preserved in this transcription]
- Observation:
  - Compiler J results in higher CPI
  - Compiler K uses more instructions
- But most importantly:
  - Compiler J uses fewer cycles → shorter run time → better

Number of Instructions
- How many instructions are there in the following code? If CPI = 1, how many cycles does it take to complete?

      a = 0
      b = a + 1
      c = a + b
      b = c + b

- # of instructions: 4
- # of cycles: 4

Number of Instructions
- How many instructions are there in the following code? If CPI = 1, how many cycles does it take to complete?

      i = 0
      loop:
          a = a + 1
          i = i + 1
          if i < 10 goto loop

- # of STATIC instructions: 4
- # of DYNAMIC instructions: 1 + 3 * 10 = 31
- # of cycles: 31
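The difference between static and dynamic instruction counts in the loop above can be checked by simulating it. The sketch below mirrors the four pseudo-instructions one for one; the function name and the counter variable are illustrative assumptions.

```python
def count_loop_instructions(limit: int) -> tuple[int, int]:
    """Simulate the slide's loop and return (static, dynamic) instruction counts."""
    static_count = 4   # i = 0; a = a + 1; i = i + 1; conditional branch
    dynamic_count = 0

    a = 0
    i = 0
    dynamic_count += 1          # i = 0 executes once

    while True:                 # label "loop:"
        a = a + 1
        dynamic_count += 1      # a = a + 1
        i = i + 1
        dynamic_count += 1      # i = i + 1
        dynamic_count += 1      # if i < limit goto loop
        if not (i < limit):
            break

    return static_count, dynamic_count


static_count, dynamic_count = count_loop_instructions(10)
print(f"static: {static_count}, dynamic: {dynamic_count}")
```

With CPI = 1 the cycle count equals the dynamic count, which is why the loop bound, not the program length, determines the run time here.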
Number of Instructions
- How many instructions are there in the following code, which computes r = a × b?

      r = 0
      for (i = b; i > 0; i = i - 1)
          r = r + a

  - # of DYNAMIC instructions: 3b
  - # of cycles: 3b
- With a multiply instruction:

      r = a * b

  - # of instructions: 1
  - # of cycles: (?)
- The dynamic number of instructions can be data dependent.

Instruction Count & CPI
- The number of instructions in a program depends on:
  - Nature of the application
  - Compiler techniques
  - Types of instructions available in the ISA
- Average cycles per instruction depends on:
  - CPU microarchitecture
  - ISA (CISC vs RISC)
  - The current running state of the CPU
- Different instructions may have different CPI
  - Average CPI is affected by the instruction mix

Combining All: The Iron Law
- $\text{CPUTime} = \dfrac{\text{\# of instructions}}{\text{program}} \times \dfrac{\text{\# of cycles}}{\text{instruction}} \times \dfrac{\text{time}}{\text{cycle}}$

CISC vs RISC
- CISC: Complex Instruction Set Computer
- RISC: Reduced Instruction Set Computer
- CISC and RISC are two different computer design strategies
[Figure: the design stack (algorithm, language, compiler, ISA, microarchitecture, hardware design) compared for CISC and RISC. CISC examples: VAX, x86. RISC examples: PA-RISC, Alpha, SPARC, MIPS, ARM, RISC-V]
CISC
- ISA includes complex instructions
  - E.g., VAX has a POLY instruction that evaluates a polynomial in hardware
- Includes complex addressing modes
  - Mem-reg; mem-mem; indirect; relative; double-indirect; ...
- Hardware implements complex instructions using multiple clock cycles
  - Microcode
- One promise of a CISC ISA is that it allows shorter compiled code and makes the compiler easier to write
  - Still relevant today in embedded systems
- Drawbacks:
  - Less attractive as compiler techniques improve
  - Complex hardware → slow

RISC
- ISA specifies simple instructions
  - Mostly register-register transfer
  - Simple addressing modes
- Simpler hardware design
  - Allows hardware optimization
  - Faster hardware overall
  - Allows easy pipelining
- Simple ISA allows compiler optimization
- Generated code is generally longer
- Most (if not all) ISAs after the 80s are RISC

RISC vs CISC: Iron Law
- $\text{CPUTime} = \dfrac{\text{\# of instructions}}{\text{program}} \times \dfrac{\text{\# of cycles}}{\text{instruction}} \times \dfrac{\text{time}}{\text{cycle}}$

      Microarchitecture                  CPI    Cycle Time
      CISC                               >1     short
      RISC, single cycle, unpipelined    1      long
      RISC, pipelined                    1      short

Amdahl's Law Review
- Describes the overall speedup of a system due to a speed improvement that applies to a portion of the system
- Let $P$ be the portion of the system that can be sped up by a factor of $S$, with $0 \le P \le 1$
- Amdahl's Law states that the overall speedup is: $\text{Speedup} = \dfrac{1}{(1 - P) + \dfrac{P}{S}}$
- E.g.: $P = 50\%$, $S = 100$ → speedup = 1.98x
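Amdahl's Law as stated above is easy to evaluate numerically. The sketch below reproduces the slide's example (P = 50%, S = 100) and also computes the bound 1 / (1 - P), which follows from letting S grow without limit; the function names are illustrative assumptions.

```python
import math


def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is sped up by a factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def amdahl_limit(p: float) -> float:
    """Upper bound on the overall speedup as s grows without bound: 1 / (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return math.inf if p == 1.0 else 1.0 / (1.0 - p)


# The slide's example: half the system sped up 100 times.
print(f"speedup: {amdahl_speedup(0.5, 100):.2f}x")
print(f"bound for p = 0.5: {amdahl_limit(0.5):.2f}x")
```

Even a 100-fold improvement to half of the system cannot quite double its overall speed, because the untouched half still takes its original time.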
Amdahl's Law Example
[Table: cycles per instruction, # of instructions, and # of cycles for classes C1, C2, C3; values are not preserved in this transcription]
- Q1: A new implementation of C3 reduces its execution length by half, to 4 cycles. How much improvement in performance can be achieved?
- $P = 0.4$ (the fraction of all cycles spent in C3 instructions), $S = 2$
- $\text{speedup} = \dfrac{1}{(1 - 0.4) + 0.4 / 2} = 1.25$

Amdahl's Law Example
[Table: same instruction classes as above; values are not preserved in this transcription]
- Q2: Which instruction class, when its cycle count is reduced by half, will result in the most performance improvement?
  - Largest CPI?
  - Most used?
  - Most cycles used?

Amdahl's Law Implications
- In most applications, only a portion of the computation can be sped up
  - Improved hardware designs
  - Parallelization
- Can we get to a speedup of 10 with $P = 0.9$?
  - Only in the limit: as $S \to \infty$ the speedup approaches $1 / (1 - 0.9) = 10$, so no finite $S$ reaches it
- Amdahl's Law → the maximum speedup is limited by $P$
  - If only a small portion of the program can be sped up, then it doesn't matter how large $S$ is

Benchmark Programs
- A benchmark suite is a set of programs used to compare processor performance
- Needs to be representative of the typical workload
- Kernel vs whole application
  - Recall Amdahl's Law
- Avoid over-optimization for a specific benchmark
- SPEC benchmark
  - Several benchmark suites commonly used in computer architecture research
  - E.g., SPEC CPU2006
SPEC CPU Benchmark
- Programs used to measure performance
  - Supposedly typical of actual workload
- Standard Performance Evaluation Corp (SPEC)
  - Develops benchmarks for CPU, I/O, Web, ...
- SPEC CPU2006
  - Elapsed time to execute a selection of programs
    - Negligible I/O, so focuses on CPU performance
  - Normalize relative to a reference machine
  - Summarize as the geometric mean of performance ratios
    - CINT2006 (integer) and CFP2006 (floating-point)
  - $\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$

CINT2006 for Intel Core i7 920
[Table: CINT2006 results for the Intel Core i7 920; contents are not preserved in this transcription]

Matrix-Matrix Multiplication
- $r_{i,j} = \sum_{k=0}^{N-1} a_{i,k}\, b_{k,j}$

      r[i][j] = 0
      for (k=0; k<n; k++)
          r[i][j] += a[i][k] * b[k][j]

Matrix-Matrix Multiplication

      for (i=0; i<n; i++)
          for (j=0; j<n; j++) {
              r[i][j] = 0
              for (k=0; k<n; k++)
                  r[i][j] += a[i][k] * b[k][j]
          }

- Total number of instructions: $N^3$ [×, +, assignment]
- If all instructions have CPI = 1, then the time to complete is ~$N^3$ cycles
- What are the factors that will make this run faster or slower?
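The $N^3$ estimate for matrix multiplication can be checked empirically. The sketch below is a plain Python rendering of the triple loop above with a counter on the innermost statement; the function name and the 2 × 2 inputs are assumptions made for illustration.

```python
def matmul_with_count(a: list[list[int]], b: list[list[int]]) -> tuple[list[list[int]], int]:
    """Multiply two n x n matrices and count executions of the innermost statement."""
    n = len(a)
    r = [[0] * n for _ in range(n)]
    inner_iterations = 0

    for i in range(n):
        for j in range(n):
            for k in range(n):
                # One multiply, one add, and one assignment per iteration.
                r[i][j] += a[i][k] * b[k][j]
                inner_iterations += 1

    return r, inner_iterations


# Small hypothetical inputs so the result is easy to verify by hand.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

product, count = matmul_with_count(a, b)
print(product, count)
```

The counter grows as n cubed, so doubling the matrix dimension multiplies the work by eight before cache and memory effects are even considered.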
And in conclusion
- The study of computer architecture allows us to construct better computer systems
  - Performance, power
- Computer architecture is a study that crosses software and hardware
- We will use RISC-V as the main ISA for class work, but the design principles are applicable to other computer designs
- The Iron Law determines the performance of a CPU
- ISA, microarchitecture, compilers, and hardware technology all play a role in determining CPU performance

Acknowledgements
- These slides contain material developed and copyrighted by:
  - Arvind (MIT)
  - Krste Asanovic (MIT/UCB)
  - Joel Emer (Intel/MIT)
  - James Hoe (CMU)
  - John Kubiatowicz (UCB)
  - David Patterson (UCB)
- MIT material derived from course
- UCB material derived from course CS152, CS252
More informationCpE 442 Introduction to Computer Architecture. The Role of Performance
CpE 442 Introduction to Computer Architecture The Role of Performance Instructor: H. H. Ammar CpE442 Lec2.1 Overview of Today s Lecture: The Role of Performance Review from Last Lecture Definition and
More informationCS252 Spring 2017 Graduate Computer Architecture. Lecture 17: Virtual Memory and Caches
CS252 Spring 2017 Graduate Computer Architecture Lecture 17: Virtual Memory and Caches Lisa Wu, Krste Asanovic http://inst.eecs.berkeley.edu/~cs252/sp17 WU UCB CS252 SP17 Last Time in Lecture 16 Memory
More informationCO Computer Architecture and Programming Languages CAPL. Lecture 15
CO20-320241 Computer Architecture and Programming Languages CAPL Lecture 15 Dr. Kinga Lipskoch Fall 2017 How to Compute a Binary Float Decimal fraction: 8.703125 Integral part: 8 1000 Fraction part: 0.703125
More informationCS252 Spring 2017 Graduate Computer Architecture. Lecture 14: Multithreading Part 2 Synchronization 1
CS252 Spring 2017 Graduate Computer Architecture Lecture 14: Multithreading Part 2 Synchronization 1 Lisa Wu, Krste Asanovic http://inst.eecs.berkeley.edu/~cs252/sp17 WU UCB CS252 SP17 Last Time in Lecture
More informationLecture 4: RISC Computers
Lecture 4: RISC Computers Introduction Program execution features RISC characteristics RISC vs. CICS Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) represents an important
More informationLecture: Benchmarks, Pipelining Intro. Topics: Performance equations wrap-up, Intro to pipelining
Lecture: Benchmarks, Pipelining Intro Topics: Performance equations wrap-up, Intro to pipelining 1 Measuring Performance Two primary metrics: wall clock time (response time for a program) and throughput
More informationVector and Parallel Processors. Amdahl's Law
Vector and Parallel Processors. Vector processors are processors which have special hardware for performing operations on vectors: generally, this takes the form of a deep pipeline specialized for this
More informationEITF20: Computer Architecture Part2.1.1: Instruction Set Architecture
EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer
More informationLecture Topics. Principle #1: Exploit Parallelism ECE 486/586. Computer Architecture. Lecture # 5. Key Principles of Computer Architecture
Lecture Topics ECE 486/586 Computer Architecture Lecture # 5 Spring 2015 Portland State University Quantitative Principles of Computer Design Fallacies and Pitfalls Instruction Set Principles Introduction
More informationChapter 1. The Computer Revolution
Chapter 1 Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications feasible Computers
More informationDefining Performance. Performance. Which airplane has the best performance? Boeing 777. Boeing 777. Boeing 747. Boeing 747
Defining Which airplane has the best performance? 1 Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747 BAC/Sud Concorde Douglas DC- 8-50 0 100 200 300 400 500 Passenger Capacity
More informationPerformance. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]
Performance CS 3410 Computer System Organization & Programming [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon] Performance Complex question How fast is the processor? How fast your application runs?
More informationPage 1. Program Performance Metrics. Program Performance Metrics. Amdahl s Law. 1 seq seq 1
Program Performance Metrics The parallel run time (Tpar) is the time from the moment when computation starts to the moment when the last processor finished his execution The speedup (S) is defined as the
More informationECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 13 Memory Part 2
ECE 552 / CPS 550 Advanced Computer Architecture I Lecture 13 Memory Part 2 Benjamin Lee Electrical and Computer Engineering Duke University www.duke.edu/~bcl15 www.duke.edu/~bcl15/class/class_ece252fall12.html
More informationComputer Organization & Assembly Language Programming (CSE 2312)
Computer Organization & Assembly Language Programming (CSE 2312) Lecture 3 Taylor Johnson Summary from Last Time Binary to decimal, decimal to binary, ASCII Structured computers Multilevel computers and
More informationQuiz for Chapter 1 Computer Abstractions and Technology
Date: Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,
More informationCS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation
CS 152 Computer Architecture and Engineering Lecture 8 - Translation Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!
More informationInstruction Set Architectures. Part 1
Instruction Set Architectures Part 1 Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture Digital Design Circuit Design 1/9/02 Some ancient history Earliest (1940
More informationComputer Performance Evaluation and Benchmarking. EE 382M Dr. Lizy Kurian John
Computer Performance Evaluation and Benchmarking EE 382M Dr. Lizy Kurian John Evolution of Single-Chip Transistor Count 10K- 100K Clock Frequency 0.2-2MHz Microprocessors 1970 s 1980 s 1990 s 2010s 100K-1M
More informationC 1. Last time. CSE 490/590 Computer Architecture. Virtual Machines I. Types of Virtual Machine (VM) Outline. User Virtual Machine = ISA + Environment
CSE 490/590 Computer Architecture Last time Directory-based coherence protocol 4 cache states: C-invalid, C-shared, C-modified, and C-transient 4 memory states: R(dir), W(id), TR(dir), TW(id) Virtual Machines
More informationDr. George Michelogiannakis. EECS, University of California at Berkeley CRD, Lawrence Berkeley National Laboratory
CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches Dr. George Michelogiannakis EECS, University of California at Berkeley CRD, Lawrence Berkeley National Laboratory http://inst.eecs.berkeley.edu/~cs152!
More informationCSC 631: High-Performance Computer Architecture
CSC 631: High-Performance Computer Architecture Spring 2017 Lecture 4: Pipelining Last Time in Lecture 3 icrocoding, an effective technique to manage control unit complexity, invented in era when logic
More informationInstruction Set Architecture. "Speaking with the computer"
Instruction Set Architecture "Speaking with the computer" The Instruction Set Architecture Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture Digital Design
More informationCS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007
CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007 Name: Solutions (please print) 1-3. 11 points 4. 7 points 5. 7 points 6. 20 points 7. 30 points 8. 25 points Total (105 pts):
More informationECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 15 Very Long Instruction Word Machines
ECE 552 / CPS 550 Advanced Computer Architecture I Lecture 15 Very Long Instruction Word Machines Benjamin Lee Electrical and Computer Engineering Duke University www.duke.edu/~bcl15 www.duke.edu/~bcl15/class/class_ece252fall11.html
More informationComputer Architecture ELEC2401 & ELEC3441
Computer Architecture ELEC2401 & ELEC3441 Lecture 14 Explicit Parallel Processors Dr. Hayden Kwok-Hay So Department of Electrical and Electronic Engineering Superscalar Control Logic Scaling Issue Group
More information55:132/22C:160, HPCA Spring 2011
55:132/22C:160, HPCA Spring 2011 Second Lecture Slide Set Instruction Set Architecture Instruction Set Architecture ISA, the boundary between software and hardware Specifies the logical machine that is
More information