1.6 Computer Performance
- Gerald Job Harris
- 5 years ago
Transcription
Performance
- How do we measure performance? Define metrics.
- Benchmarking: choose programs to evaluate performance.
- Performance summary.
- Fallacies and pitfalls: how to avoid getting fooled by performance calculations and demos!

Defining Performance
Which airplane has the best performance?
Airplanes compared by passengers, range (mi), and speed (mph): two Boeing models (including the 747), the BAC/Sud Concorde, and the Douglas DC-8.
- How much faster is the Concorde compared to the 747?
- How much bigger is the 747 than the Douglas DC-8?
- The Concorde is the fastest; the DC-8 has the longest range.
- Performance means different things to different people!
Performance of computers
- Which computer is the fastest? For which application?
- Performance cannot be separated from a particular application or application class:
  - Scientific: FP performance
  - Commercial: memory and I/O
  - Program development: integer performance

Performance: what's important to whom?
- Individual computer user: minimize elapsed time for a program = end_time - start_time. Also called response time or execution time.
- Computer center manager: maximize completion rate = #jobs completed / second. Called throughput.
Definitions
- Performance is in units of things per second: bigger is better.
- If we are primarily concerned with response time:
  $\text{performance}(x) = \dfrac{1}{\text{execution\_time}(x)}$
- "X is n times faster than Y" means
  $n = \dfrac{\text{performance}(x)}{\text{performance}(y)} = \dfrac{\text{execution\_time}(y)}{\text{execution\_time}(x)}$
- Speed-up = time_old / time_new

Relative Performance
If a Pentium III runs a program in 8 seconds and a PowerPC runs the same program in 10 seconds, how many times faster is the Pentium III?
- n = 10 / 8 = 1.25 times faster (or 25% faster)
- Why might someone choose to buy the PowerPC in this case?
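The definition of "n times faster" can be sketched in a few lines of Python. This is only an illustration: the function name `times_faster` is an assumption made here, and the inputs are the Pentium III and PowerPC times from the example.

```python
def times_faster(time_x: float, time_y: float) -> float:
    """Return n such that machine X is n times faster than machine Y.

    Because performance = 1 / execution_time, the performance ratio
    performance(x) / performance(y) equals time_y / time_x.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


if __name__ == "__main__":
    # Pentium III: 8 s, PowerPC: 10 s (values from the slide)
    n = times_faster(time_x=8.0, time_y=10.0)
    print(f"The Pentium III is {n:.2f} times faster")
```

Note that the slower machine's time goes in the numerator, which is the usual place to slip when working from performance rather than time.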
Measuring Performance
- Elapsed time (response time, real time)
  - counts everything (disk and memory accesses, I/O, OS, etc.)
  - a useful number, but often not good for comparison purposes
- Alternative: time only the periods when the processor (CPU) is working on your program, since multiple processes run at the same time
- CPU time
  - doesn't count I/O or time spent running other programs
  - can be broken up into system time (in the OS) and user time (in the user program)
- Our focus: user CPU time, the time spent executing the lines of code that are "in" our program

Different aspects
cascade% time myprogram
90.7u 12.9s 2:39 65%
- User CPU time: 90.7 seconds
- System CPU time: 12.9 seconds
- Elapsed time: 2 minutes and 39 seconds
- Percentage of elapsed time that is CPU time: 65%
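The 65% in the `time` output is the sum of user and system CPU time divided by elapsed time. A small Python check of that arithmetic follows; the helper name `cpu_fraction` is an assumption made for illustration.

```python
def cpu_fraction(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed (wall-clock) time spent as CPU time for the program."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return (user_s + system_s) / elapsed_s


if __name__ == "__main__":
    elapsed = 2 * 60 + 39  # 2:39 expressed in seconds
    fraction = cpu_fraction(user_s=90.7, system_s=12.9, elapsed_s=elapsed)
    print(f"CPU time is {fraction:.0%} of elapsed time")
```

The remaining share of the elapsed time went to I/O waits and to other programs sharing the machine.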
How to Measure Time?
- User time: measured in seconds.
- CPU time: computers are constructed using a clock that runs at a constant rate and determines when events take place in the hardware.
- These discrete time intervals are called clock cycles (or, informally, clocks or cycles).
- The length of a clock period is the clock cycle time; the clock rate is the inverse of the clock period.
- Cycle rate vs. period (cycle time):
  - A 500 MHz P-III runs 500M cycles/sec; 1 clock cycle = 2 ns
  - A 2 GHz P-IV runs 2G cycles/sec; 1 clock cycle = 0.5 ns

CPU performance and its factors
The time to execute a given program can be computed as
  CPU time = CPU clock cycles for a program x clock cycle time
           = CPU clock cycles for a program / clock rate
The number of CPU clock cycles can be determined by
  CPU clock cycles = (instructions/program) x (clock cycles/instruction) = Instruction count x CPI = IC x CPI
which gives
  CPU time = IC x CPI x CC = IC x CPI / CR
The units for this are
  $\dfrac{\text{seconds}}{\text{program}} = \dfrac{\text{instructions}}{\text{program}} \times \dfrac{\text{clock cycles}}{\text{instruction}} \times \dfrac{\text{seconds}}{\text{clock cycle}}$
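The relationship between clock rate and cycle time, and the step from cycle count to CPU time, can be sketched as follows. The function names are assumptions chosen for this illustration; the clock rates are the P-III and P-IV figures from the slide.

```python
def clock_cycle_time_ns(clock_rate_hz: float) -> float:
    """Clock cycle time in nanoseconds: the inverse of the clock rate."""
    return 1e9 / clock_rate_hz


def cpu_clock_cycles(instruction_count: float, cpi: float) -> float:
    """CPU clock cycles = IC x CPI."""
    return instruction_count * cpi


def cpu_time_from_cycles(cycles: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = clock cycles / clock rate."""
    return cycles / clock_rate_hz


if __name__ == "__main__":
    print(clock_cycle_time_ns(500e6))  # 500 MHz P-III
    print(clock_cycle_time_ns(2e9))    # 2 GHz P-IV
```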
Example
CPU time = CPU clock cycles for a program / clock rate
Our program runs in 10 seconds on computer A, which has a 400 MHz clock. We are trying to help a computer designer build a machine B that will run the same program in 6 seconds. The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for this program. What clock rate should we tell the designer to target?
  CPU time(A) = clock cycles(A) / clock rate(A)
  CPU time(B) = clock cycles(B) / clock rate(B)
Known: CPU time(A) = 10 s, CPU time(B) = 6 s, clock cycles(B) = 1.2 x clock cycles(A), clock rate(A) = 400 MHz.
Find: clock rate(B) = ?

How to Calculate the 3 Components?
- Clock cycle time: in the specification of the computer (clock rate in advertisements)
- Instruction count:
  - Count instructions in the loop of a small program
  - Use a simulator to count instructions
  - Hardware counter in a special register (Pentium II)
- CPI:
  - Calculate: CPU clock cycles / instruction count
  - Hardware counter in a special register (Pentium II)
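One way to work the machine A / machine B example is to recover A's cycle count from its time and clock rate, scale it by 1.2, and divide by B's target time. The sketch below does exactly that; the function name and parameter names are assumptions made for illustration.

```python
def target_clock_rate_hz(time_a_s: float, rate_a_hz: float,
                         cycle_factor: float, time_b_s: float) -> float:
    """Clock rate machine B needs to finish in time_b_s.

    cycles(A) = time(A) x rate(A)
    cycles(B) = cycle_factor x cycles(A)
    rate(B)   = cycles(B) / time(B)
    """
    cycles_a = time_a_s * rate_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / time_b_s


if __name__ == "__main__":
    rate_b = target_clock_rate_hz(time_a_s=10.0, rate_a_hz=400e6,
                                  cycle_factor=1.2, time_b_s=6.0)
    print(f"Target clock rate for B: {rate_b / 1e6:.0f} MHz")
```

With the slide's numbers this comes out at 800 MHz, twice the clock rate of A, even though B is only 10/6 times faster, because B needs 20% more cycles.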
Examples
- If a computer has a clock rate CR = 50 MHz, how long does it take to execute a program with 1,000 instructions if the CPI for the program is 3.5?
  CPU time = IC x CPI / CR = 1000 x 3.5 / (50 x 10^6) s = 70 µs
- If a computer's clock rate increases from 200 MHz to 250 MHz and the other factors remain the same, how many times faster will the computer be?
  $\dfrac{\text{CPU time}_{old}}{\text{CPU time}_{new}} = \dfrac{\text{clock rate}_{new}}{\text{clock rate}_{old}} = \dfrac{250\ \text{MHz}}{200\ \text{MHz}} = 1.25$

Example 2
Two implementations of the same ISA:
  Machine   Cycle time   CPI
  A         1 ns         2.0
  B         2 ns         1.2
Which is faster? Assuming I instructions:
  $\dfrac{\text{Performance}(A)}{\text{Performance}(B)} = \dfrac{\text{ExecutionTime}(B)}{\text{ExecutionTime}(A)} = \dfrac{I \times 1.2 \times 2\ \text{ns}}{I \times 2.0 \times 1\ \text{ns}} = \dfrac{2.4}{2.0} = 1.2$
Machine A is 1.2 times faster than machine B.
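Both examples apply the same equation, CPU time = IC x CPI / CR. A short Python sketch follows, with illustrative function names that are assumptions of this example rather than part of the slides.

```python
def cpu_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = IC x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def time_per_instruction_ns(cpi: float, cycle_time_ns: float) -> float:
    """Average time per instruction = CPI x clock cycle time."""
    return cpi * cycle_time_ns


if __name__ == "__main__":
    # 1,000 instructions, CPI 3.5, 50 MHz clock
    print(f"{cpu_time_s(1000, 3.5, 50e6) * 1e6:.0f} microseconds")

    # Example 2: the instruction count I cancels, so compare per-instruction times
    a = time_per_instruction_ns(cpi=2.0, cycle_time_ns=1.0)
    b = time_per_instruction_ns(cpi=1.2, cycle_time_ns=2.0)
    print(f"A is {b / a:.1f} times faster than B")
```

Machine A wins despite its higher CPI because its cycle time is half that of B; neither CPI nor clock rate alone decides the comparison.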
Computing CPI
The CPI is the average number of cycles per instruction. If, for each instruction type, we know its frequency and the number of cycles needed to execute it, then
  $\text{CPI} = \sum_{j=1}^{n} \text{CPI}_j \times F_j$
Example (instruction mix, and where the time is spent):
  Op       Freq_i   CPI_i   Prod   (% Time)
  ALU      50%      1       0.5    (23%)
  Load     20%      5       1.0    (45%)
  Store    10%      3       0.3    (14%)
  Branch   20%      2       0.4    (18%)
  CPI = 2.2
Do not confuse CPI with CPU clock cycles!
  $\text{CPU clock cycles} = \sum_{j=1}^{n} \text{CPI}_j \times \text{IC}_j$

A compiler designer is trying to decide between two code sequences for a particular machine. The hardware designer has supplied the following facts: for instruction classes A, B, and C, the CPIs are 1, 2, and 3, respectively. For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts:
  Code sequence   A   B   C
  1               2   1   2
  2               4   1   1
Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence?
- Number of instructions: Seq 1: 2 + 1 + 2 = 5; Seq 2: 4 + 1 + 1 = 6
- CPU clock cycles: Seq 1: 2x1 + 1x2 + 2x3 = 10; Seq 2: 4x1 + 1x2 + 1x3 = 9, so sequence 2 is faster even though it executes more instructions
- CPI = clock cycles / IC: Seq 1: 10/5 = 2.0; Seq 2: 9/6 = 1.5
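The two sums above can be checked with a short Python sketch. The Load row (CPI 5) and the per-class instruction counts are the values implied by the slide's 2.2 total and its cycle expressions; the function names are assumptions made for illustration.

```python
def average_cpi(mix):
    """Weighted CPI: sum of frequency x CPI over (frequency, cpi) pairs."""
    return sum(freq * cpi for freq, cpi in mix)


def sequence_stats(counts, cpis):
    """Return (instruction count, clock cycles, CPI) for one code sequence."""
    ic = sum(counts)
    cycles = sum(n * cpi for n, cpi in zip(counts, cpis))
    return ic, cycles, cycles / ic


if __name__ == "__main__":
    # (frequency, CPI) for ALU, Load, Store, Branch
    mix = [(0.5, 1), (0.2, 5), (0.1, 3), (0.2, 2)]
    print(f"CPI = {average_cpi(mix):.1f}")

    class_cpis = (1, 2, 3)  # instruction classes A, B, C
    for name, counts in (("Seq 1", (2, 1, 2)), ("Seq 2", (4, 1, 1))):
        ic, cycles, cpi = sequence_stats(counts, class_cpis)
        print(f"{name}: IC={ic}, cycles={cycles}, CPI={cpi:.1f}")
```

The sequence with more instructions needs fewer cycles, which is why instruction count alone is a poor predictor of speed.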
How to improve performance? The Iron Law
  $\text{Time} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Cycles}}{\text{Instruction}} \times \dfrac{\text{Time}}{\text{Cycle}}$
  The three terms are IC (compiler designer), CPI (processor designer), and cycle time (chip designer).
- Instructions/Program
  - Instructions executed, not static code size
  - Determined by algorithm, compiler, ISA
- Cycles/Instruction
  - Determined by ISA and CPU organization
  - Overlap among instructions reduces this term
- Time/Cycle
  - Determined by technology, organization, clever circuit design

Factors affecting CPU Performance
Which factors are affected by each of the following?
  CPU time = Seconds/Program = Instructions/Program x Cycles/Instruction x Seconds/Cycle
                     IC    CPI   Clock rate
  Program            x
  Compiler           x     x
  Instr. Set Arch.   x     x
  Organization             x     x
  Technology                     x
1.9 Evaluating Performance: Benchmarking
Benchmarks
- Programs chosen to measure performance
- Predict performance of the actual workload; saves effort and money
- Representative? Honest?
- SPEC: System Performance Evaluation Cooperative; a dozen benchmarks
Which programs?
- Best case: you run the same set of programs every day
- Or, typical of the expected class of applications, e.g., compilers/editors, scientific applications, graphics, etc.
- Small benchmarks: nice for architects and designers; easy to standardize; can be abused

Benchmarks: SPEC CPU2006
- System Performance Evaluation Cooperative
- Formed in the 80s to combat benchmarketing
- SPEC89, SPEC92, SPEC95, SPEC2000, SPEC2006
- For CPU time performance only
- 12 integer and 17 floating-point programs
- Sun Ultra-5 300 MHz reference machine has a score of 100
- Report the geometric mean of ratios to the reference machine
Benchmarks: SPEC CINT2000
  Benchmark      Description
  164.gzip       Compression
  175.vpr        FPGA place and route
  176.gcc        C compiler
  181.mcf        Combinatorial optimization
  186.crafty     Chess
  197.parser     Word processing, grammatical analysis
  252.eon        Visualization (ray tracing)
  253.perlbmk    PERL script execution
  254.gap        Group theory interpreter
  255.vortex     Object-oriented database
  256.bzip2      Compression
  300.twolf      Place and route simulator

Benchmarks: SPEC CFP2000
  Benchmark      Description
  168.wupwise    Physics/Quantum Chromodynamics
  171.swim       Shallow water modeling
  172.mgrid      Multi-grid solver: 3D potential field
  173.applu      Parabolic/elliptic PDE
  177.mesa       3-D graphics library
  178.galgel     Computational Fluid Dynamics
  179.art        Image Recognition/Neural Networks
  183.equake     Seismic Wave Propagation Simulation
  187.facerec    Image processing: face recognition
  188.ammp       Computational chemistry
  189.lucas      Number theory/primality testing
  191.fma3d      Finite-element Crash Simulation
  200.sixtrack   High energy nuclear physics accelerator design
  301.apsi       Meteorology: Pollutant distribution
Figure: SPEC CINT2000 and CFP2000 ratings for the Pentium III and Pentium 4

CINT2006 for Opteron X
Table columns: Name, Description, IC x 10^9, CPI, Tc (ns), Exec time, Ref time, SPECratio
  Name         Description
  perl         Interpreted string processing
  bzip2        Block-sorting compression
  gcc          GNU C Compiler
  mcf          Combinatorial optimization
  go           Go game (AI)
  hmmer        Search gene sequence
  sjeng        Chess game (AI)
  libquantum   Quantum computer simulation
  h264avc      Video compression
  omnetpp      Discrete event simulation
  astar        Games/path finding
  xalancbmk    XML parsing
Geometric mean of the SPECratios: 11.7
High cache miss rates
SPEC INT 2006 Example
FIGURE 1.18 SPECINTC2006 benchmarks running on a 2.66 GHz Intel Core i processor. As the equation on page 35 explains, execution time is the product of the three factors in this table: instruction count in billions, clocks per instruction (CPI), and clock cycle time in nanoseconds. SPECratio is simply the reference time, which is supplied by SPEC, divided by the measured execution time. The single number quoted as SPECINTC2006 is the geometric mean of the SPECratios. Copyright 2014 Elsevier Inc. All rights reserved.

SPECpower of Server
FIGURE 1.19 SPECpower_ssj2008 running on a dual socket 2.66 GHz Intel Xeon X5650 with 16 GB of DRAM and one 100 GB SSD disk.
- Power consumption of the server at different workload levels
- Performance: ssj_ops/sec; Power: Watts (Joules/sec)
  $\text{Overall ssj\_ops per Watt} = \dfrac{\sum_{i=0}^{10} \text{ssj\_ops}_i}{\sum_{i=0}^{10} \text{power}_i}$
Benchmark Pitfalls
- Benchmark not representative: if your workload is I/O bound, SPEC is useless
- Benchmark is too old: benchmarks age poorly; benchmarketing pressure causes vendors to optimize compiler/hardware/software to benchmarks; benchmarks need to be periodically refreshed

Summarizing Performance
              Machine A (s)   Machine B (s)
  Program 1   1               10
  Program 2   1000            100
  Total       1001            110
1. Simplest: total execution time
  $\dfrac{\text{Performance}_B}{\text{Performance}_A} = \dfrac{\text{Execution time}_A}{\text{Execution time}_B} = \dfrac{1001}{110} = 9.1$
B is 9.1x faster than A if program 1 and program 2 run an equal number of times.
Summarizing Performance
2. Arithmetic mean (AM)
  $\text{AM} = \dfrac{1}{n} \sum_{i=1}^{n} \text{time}(i)$
  AM(A) = 1001/2 = 500.5; AM(B) = 110/2 = 55; 500.5/55 = 9.1x
  Valid for n programs, each run an equal number of times.
3. Weighted arithmetic mean: each program is weighted by its frequency in the workload
  $\text{WAM} = \sum_{i=1}^{n} \text{weight}(i) \times \text{time}(i)$

Geometric Mean
4. Geometric mean of ratios: independent of the reference machine
  $\text{GM} = \sqrt[n]{\prod_{i=1}^{n} \text{ratio}(i)}$
              Machine A (s)   Machine B (s)
  Program 1   1               10
  Program 2   1000            100
If we take ratios with respect to machine A:
              Machine A (ratio)   Machine B (ratio)
  Program 1   1                   10
  Program 2   1                   0.1
  The average for machine A is 1; the average for machine B is 5.05.
If we take ratios with respect to machine B:
              Machine A (ratio)   Machine B (ratio)
  Program 1   0.1                 1
  Program 2   10                  1
  The average for machine A is 5.05; the average for machine B is 1.
These can't both be true! Don't use the arithmetic mean on ratios!
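The contradiction can be reproduced numerically. The sketch below assumes per-program times of 1 s and 1000 s on machine A and 10 s and 100 s on machine B, which are the values consistent with the totals of 1001 and 110 and with the 5.05 averages; the helper names are illustrative assumptions.

```python
import math


def arithmetic_mean(values):
    """Plain average of a list of numbers."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product of n positive numbers."""
    return math.prod(values) ** (1.0 / len(values))


def ratios(times, reference):
    """Per-program execution time normalized to a reference machine."""
    return [t / r for t, r in zip(times, reference)]


if __name__ == "__main__":
    times_a = [1.0, 1000.0]   # assumed per-program times on machine A
    times_b = [10.0, 100.0]   # assumed per-program times on machine B

    b_vs_a = ratios(times_b, times_a)  # B normalized to A
    a_vs_b = ratios(times_a, times_b)  # A normalized to B

    # The arithmetic mean of ratios depends on the reference machine.
    print(arithmetic_mean(b_vs_a), arithmetic_mean(a_vs_b))
    # The geometric mean gives a consistent answer either way.
    print(geometric_mean(b_vs_a), geometric_mean(a_vs_b))
```

The geometric mean is consistent because GM(x/y) = 1 / GM(y/x), but it says the two machines are equal here, whereas total execution time favors B. Consistency across reference machines is not the same as predicting run time.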
Benchmarking & performance summary
- Benchmarks: programs chosen to measure performance
- SPEC: System Performance Evaluation Cooperative; consists of many small programs
- Summarizing performance:
  1. Total execution time
  2. Arithmetic mean (AM)
  3. Weighted arithmetic mean
  4. Geometric mean of ratios

Fallacies and Pitfalls
Pitfall #1: expecting the improvement of one aspect of a computer to increase performance by an amount proportional to the size of the improvement.
Example: a program runs in 100 seconds on a computer, with multiply operations responsible for 80 seconds of the time. How much do I have to improve the speed of multiplication if I want my program to run 5 times faster?
Amdahl's Law
  $\text{Execution time after improvement} = \dfrac{\text{execution time affected by improvement}}{\text{amount of improvement}} + \text{execution time unaffected}$
From the data of the last example:
  Execution time after improvement = 80/n + (100 - 80)
Since we want the performance to be 5 times faster, the new execution time should be 20 seconds, giving
  20 = 80/n + (100 - 80)  =>  n = ?

Amdahl's Law Example
Your boss asks you to improve performance. Two options:
- Improve the ALU, used 95% of the time, by 10%
- Improve the square-root unit, used 5% of the time, by a factor of 10
  Option             Execution time affected   Amount of improvement   Execution time after improvement
  ALU                95%                       10% (factor of 1.1)     ?
  Square-root unit   5%                        factor of 10            ?
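Both questions on these slides can be explored with a direct transcription of the formula. This is a sketch under stated assumptions: the function names are invented here, times are expressed as percentages of the original run, and "improve by 10%" is read as a speedup factor of 1.1.

```python
def time_after_improvement(affected: float, unaffected: float,
                           improvement: float) -> float:
    """Amdahl's Law: affected time shrinks by `improvement`, the rest stays."""
    return affected / improvement + unaffected


def required_improvement(affected: float, unaffected: float, target_time: float):
    """Improvement factor needed to hit target_time, or None if unreachable."""
    remaining = target_time - unaffected
    if remaining <= 0:
        return None  # the unaffected part alone already uses up the target
    return affected / remaining


if __name__ == "__main__":
    # Multiply example: 80 s affected, 20 s unaffected, target 20 s
    print(required_improvement(80.0, 20.0, 20.0))

    # Boss example, in percent of the original execution time
    print(time_after_improvement(95.0, 5.0, 1.1))    # faster ALU
    print(time_after_improvement(5.0, 95.0, 10.0))   # faster square-root unit
```

The multiply example has no solution: 80/n would have to be zero, so no finite speedup of multiplication makes the program 5 times faster. In the boss example the modest ALU improvement wins, because it applies to 95% of the run.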
From Amdahl's Law, how do we improve performance?
  $\text{Execution time after improvement} = \dfrac{\text{execution time affected by improvement}}{\text{amount of improvement}} + \text{execution time unaffected}$
- Make the common case fast
- Favor the frequent case over the infrequent case
Example: the new CPU is 10 times faster. The CPU is busy with computation 40% of the time and waiting for I/O 60% of the time.
  execution time affected = 0.4, execution time unaffected = 0.6, amount of improvement = 10, new execution time = ?

Fallacies and Pitfalls
Pitfall #2: using a subset of the performance equation (IC x CPI x CCT) as a performance metric
- MIPS: millions of instructions per second
- MFLOPS: millions of floating-point operations per second
  MIPS = instruction count / (execution time x 10^6) = clock rate / (CPI x 10^6)
- A higher MIPS rating is taken to mean a faster machine
- But MIPS has serious shortcomings
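The CPU and I/O example can be worked with the fractional form of Amdahl's Law, where the original execution time is normalized to 1. The function names below are assumptions made for this sketch.

```python
def new_time_fraction(fraction_enhanced: float, enhancement_factor: float) -> float:
    """New execution time as a fraction of the original (original = 1.0)."""
    return (1.0 - fraction_enhanced) + fraction_enhanced / enhancement_factor


def amdahl_speedup(fraction_enhanced: float, enhancement_factor: float) -> float:
    """Overall speedup = old time / new time."""
    return 1.0 / new_time_fraction(fraction_enhanced, enhancement_factor)


if __name__ == "__main__":
    # CPU work is 40% of the time and becomes 10x faster; I/O (60%) is unchanged
    print(f"new time = {new_time_fraction(0.4, 10.0):.2f} of the original")
    print(f"speedup  = {amdahl_speedup(0.4, 10.0):.4f}")
```

A CPU that is 10 times faster yields only about a 1.56x overall speedup here, because the 60% spent waiting for I/O is untouched.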
Problems with MIPS
- Ignores the program (IC): cannot compare two computers with different instruction sets, because different instruction sets give different instruction counts
- Varies for different programs on the same machine: different instruction mixes give different MIPS ratings
- Can vary inversely with performance
- When is MIPS OK? Same compiler, same ISA

Summary of Performance Evaluation
- Different performance metrics: response time, throughput, CPU time, MIPS, and MFLOPS
- Performance = 1/time, where time = IC x CPI x CC (Iron Law)
- Good benchmarks, such as the SPEC benchmarks, can provide an accurate method for evaluating and comparing computer performance.
- Summarize performance: total execution time, AM, WAM, GM
- Amdahl's law provides an efficient method for determining the speedup due to an enhancement.
- Make the common case fast!
- Why is the MIPS measurement not used?
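A hypothetical case shows how MIPS can vary inversely with performance. The instruction counts and the 4 GHz clock rate below are invented for illustration; only the class CPIs of 1, 2, and 3 come from the earlier code-sequence example.

```python
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}  # class CPIs from the code-sequence example


def exec_time_and_mips(counts, clock_rate_hz):
    """Return (execution time in seconds, MIPS rating) for an instruction mix."""
    instruction_count = sum(counts.values())
    cycles = sum(n * CPI_BY_CLASS[cls] for cls, n in counts.items())
    time_s = cycles / clock_rate_hz
    mips = instruction_count / (time_s * 1e6)
    return time_s, mips


if __name__ == "__main__":
    clock_rate = 4e9  # hypothetical 4 GHz machine
    # Hypothetical output of two compilers for the same program
    compiler_1 = {"A": 5e9, "B": 1e9, "C": 1e9}
    compiler_2 = {"A": 10e9, "B": 1e9, "C": 1e9}

    for name, counts in (("compiler 1", compiler_1), ("compiler 2", compiler_2)):
        time_s, mips = exec_time_and_mips(counts, clock_rate)
        print(f"{name}: {time_s:.2f} s, {mips:.0f} MIPS")
```

Under these assumptions the second compiler's code earns the higher MIPS rating yet takes longer to run, since it executes many more cheap instructions. Execution time, not MIPS, is the measure to trust.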
More informationCS 61C: Great Ideas in Computer Architecture Performance and Floating Point Arithmetic
CS 61C: Great Ideas in Computer Architecture Performance and Floating Point Arithmetic Instructors: Bernhard Boser & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/25/16 Fall 2016 -- Lecture #17
More informationIntroduction To Computer Architecture
Introduction To Computer Architecture. Virendra Singh Computer Design and Test Lab. Supercomputer Education and Research Centre Indian Institute of Science Bangalore http://www.serc.iisc.ernet.in/~viren
More informationCache Optimization by Fully-Replacement Policy
American Journal of Embedded Systems and Applications 2016; 4(1): 7-14 http://www.sciencepublishinggroup.com/j/ajesa doi: 10.11648/j.ajesa.20160401.12 ISSN: 2376-6069 (Print); ISSN: 2376-6085 (Online)
More informationOutline. What is Performance? Restating Performance Equation Time = Seconds. CPU Performance Factors
CS 61C: Great Ideas in Computer Architecture Performance and Floating-Point Arithmetic Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/fa17 Outline Defining Performance
More informationCS 61C: Great Ideas in Computer Architecture Performance and Floating-Point Arithmetic
CS 61C: Great Ideas in Computer Architecture Performance and Floating-Point Arithmetic Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/fa17 10/24/17 Fall 2017-- Lecture
More informationCS 61C: Great Ideas in Computer Architecture Performance and Floating-Point Arithmetic
CS 61C: Great Ideas in Computer Architecture Performance and Floating-Point Arithmetic Instructors: Nick Weaver & John Wawrzynek http://inst.eecs.berkeley.edu/~cs61c/sp18 3/16/18 Spring 2018 Lecture #17
More informationChapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,
Chapter 1 Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Course Goals Introduce you to design principles, analysis techniques and design options in computer architecture
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures CS61C L41 Performance I (1) Lecture 41 Performance I 2004-12-06 Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia Sour Roses! Cal s best season
More informationEvaluating Computers: Bigger, better, faster, more?
Evaluating Computers: Bigger, better, faster, more? 1 Key Points What does it mean for a computer to be good? What is latency? bandwidth? What is the performance equation? 2 What do you want in a computer?
More informationCMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture Cost, Performance & Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from David Culler, UC Berkeley
More informationCS61C : Machine Structures
CS 61C L27 RAID and Performance (1) inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #27: RAID & Performance Outline Disks Part 2 RAID Performance 2006-08-15 Andy Carle CS 61C L27
More informationChapter 1. Computer Abstractions and Technology. Lesson 2: Understanding Performance
Chapter 1 Computer Abstractions and Technology Lesson 2: Understanding Performance Indeed, the cost-performance ratio of the product will depend most heavily on the implementer, just as ease of use depends
More informationEKT 303 WEEK Pearson Education, Inc., Hoboken, NJ. All rights reserved.
+ EKT 303 WEEK 2 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved. Chapter 2 + Performance Issues + Designing for Performance The cost of computer systems continues to drop dramatically,
More informationLecture - 4. Measurement. Dr. Soner Onder CS 4431 Michigan Technological University 9/29/2009 1
Lecture - 4 Measurement Dr. Soner Onder CS 4431 Michigan Technological University 9/29/2009 1 Acknowledgements David Patterson Dr. Roger Kieckhafer 9/29/2009 2 Computer Architecture is Design and Analysis
More informationMEASURING COMPUTER TIME. A computer faster than another? Necessity of evaluation computer performance
Necessity of evaluation computer performance MEASURING COMPUTER PERFORMANCE For comparing different computer performances User: Interested in reducing the execution time (response time) of a task. Computer
More informationDesigning for Performance. Patrick Happ Raul Feitosa
Designing for Performance Patrick Happ Raul Feitosa Objective In this section we examine the most common approach to assessing processor and computer system performance W. Stallings Designing for Performance
More informationPIPELINING AND PROCESSOR PERFORMANCE
PIPELINING AND PROCESSOR PERFORMANCE Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 1, John L. Hennessy and David A. Patterson, Morgan Kaufmann,
More informationCourse web site: teaching/courses/car. Piazza discussion forum:
Announcements Course web site: http://www.inf.ed.ac.uk/ teaching/courses/car Lecture slides Tutorial problems Courseworks Piazza discussion forum: http://piazza.com/ed.ac.uk/spring2018/car Tutorials start
More informationCS 152 Computer Architecture and Engineering
CS 152 Computer Architecture and Engineering Lecture 7 Performance 2005-2-8 John Lazzaro (www.cs.berkeley.edu/~lazzaro) TAs: Ted Hong and David Marquardt www-inst.eecs.berkeley.edu/~cs152/ Last Time: Tips
More informationInstructor Information
CS 203A Advanced Computer Architecture Lecture 1 1 Instructor Information Rajiv Gupta Office: Engg.II Room 408 E-mail: gupta@cs.ucr.edu Tel: (951) 827-2558 Office Times: T, Th 1-2 pm 2 1 Course Syllabus
More informationThe Von Neumann Computer Model
The Von Neumann Computer Model Partitioning of the computing engine into components: Central Processing Unit (CPU): Control Unit (instruction decode, sequencing of operations), Datapath (registers, arithmetic
More informationSEN361 Computer Organization. Prof. Dr. Hasan Hüseyin BALIK (2 nd Week)
+ SEN361 Computer Organization Prof. Dr. Hasan Hüseyin BALIK (2 nd Week) + Outline 1. Overview 1.1 Basic Concepts and Computer Evolution 1.2 Performance Issues + 1.2 Performance Issues + Designing for
More informationIntroduction to Pipelined Datapath
14:332:331 Computer Architecture and Assembly Language Week 12 Introduction to Pipelined Datapath [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 W12.1 Review:
More informationQuiz for Chapter 1 Computer Abstractions and Technology
Date: Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,
More informationComputer Engineering Fall Semester, 2011
Computer Engineering 9859 Fall Semester, 2011 1 What will we do in this course? We will look at the design of an instruction set for a simple processor. The processor is based on a real processor, the
More informationLast time. Lecture #29 Performance & Parallel Intro
CS61C L29 Performance & Parallel (1) inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #29 Performance & Parallel Intro 2007-8-14 Scott Beamer, Instructor Paper Battery Developed by Researchers
More informationWorkloads, Scalability and QoS Considerations in CMP Platforms
Workloads, Scalability and QoS Considerations in CMP Platforms Presenter Don Newell Sr. Principal Engineer Intel Corporation 2007 Intel Corporation Agenda Trends and research context Evolving Workload
More informationComputing System Fundamentals/Trends + Review of Performance Evaluation and ISA Design
Computing System Fundamentals/Trends + Review of Performance Evaluation and ISA Design Computing Element Choices: Computing Element Programmability Spatial vs. Temporal Computing Main Processor Types/Applications
More informationAries: Transparent Execution of PA-RISC/HP-UX Applications on IPF/HP-UX
Aries: Transparent Execution of PA-RISC/HP-UX Applications on IPF/HP-UX Keerthi Bhushan Rajesh K Chaurasia Hewlett-Packard India Software Operations 29, Cunningham Road Bangalore 560 052 India +91-80-2251554
More informationPerformance of computer systems
Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures CS61C L27 Performance II & Summary (1) Lecture #27 Performance II & Summary 2005-12-07 There is one handout today at the front and back of the room!
More informationADVANCED ELECTRONIC SOLUTIONS AVIATION SERVICES COMMUNICATIONS AND CONNECTIVITY MISSION SYSTEMS
The most important thing we build is trust ADVANCED ELECTRONIC SOLUTIONS AVIATION SERVICES COMMUNICATIONS AND CONNECTIVITY MISSION SYSTEMS UT840 LEON Quad Core First Silicon Results Cobham Semiconductor
More informationMestrado em Informática
Sistemas de Computação e Desempenho Arquitecturas Paralelas Mestrado em Informática 2010/11 A.J.Proença Tema Arquitecturas Paralelas (1) Estrutura do tema AP 1. A evolução das arquitecturas pelo paralelismo
More informationECE 486/586. Computer Architecture. Lecture # 3
ECE 486/586 Computer Architecture Lecture # 3 Spring 2014 Portland State University Lecture Topics Measuring, Reporting and Summarizing Performance Execution Time and Throughput Benchmarks Comparing and
More information15-740/ Computer Architecture Lecture 10: Runahead and MLP. Prof. Onur Mutlu Carnegie Mellon University
15-740/18-740 Computer Architecture Lecture 10: Runahead and MLP Prof. Onur Mutlu Carnegie Mellon University Last Time Issues in Out-of-order execution Buffer decoupling Register alias tables Physical
More information