Performance, Cost and Amdahl s s Law. Arquitectura de Computadoras
|
|
- Charlene Doyle
- 6 years ago
- Views:
Transcription
1 Performance, Cost and Amdahl s s Law Arquitectura de Computadoras Arturo Díaz D PérezP Centro de Investigación n y de Estudios Avanzados del IPN adiaz@cinvestav.mx Arquitectura de Computadoras Performance- 1
2 Performance Purchasing perspective given a collection of machines, which has the» best performance?» least cost?» best performance / cost? Design perspective faced with design options, which has the» best performance improvement?» least cost?» best performance / cost? Both require basis for comparison metric for evaluation Our goal is to understand cost & performance implications of architectural choices Arquitectura de Computadoras Performance- 2
3 Two notions of performance Plane DC to Paris Speed Passengers Throughput (pmph) Boeing hours 610 mph ,700 Concorde 3 hours 1350 mph ,200 Which has higher performance? Time to do the task (Execution Time) execution time, response time, latency Tasks per day, hour, week, sec, ns... (Performance) throughput, bandwidth Response time and throughput often are in opposition Arquitectura de Computadoras Performance- 3
4 What is Performance? KEY: A measure of Speed (Rate) Car: miles driven per hour Car wash: cars washed per day Auto plant: cars built per year Two metrics: Latency (response or execution time)» time to start to finish of a task Throughput (bandwidth)» rate of task completion = rate of task initiation = 1 / (time between task completions) Deterministic vs. average Arquitectura de Computadoras Performance- 4
5 Definitions Performance is in units of things-per-second bigger is better If we are primarily concerned with response time performance(x) = 1 execution_time(x) " X is n times faster than Y" means Performance(X) n = Performance(Y) Arquitectura de Computadoras Performance- 5
6 Example Time of Concorde vs. Boeing 747? Concord is 1350 mph / 610 mph = 2.2 times faster = 6.5 hours / 3 hours Throughput of Concorde vs. Boeing 747? Concord is 178,200 pmph / 286,700 pmph Boeing is 286,700 pmph / 178,200 pmph = 0.62 times faster = 1.60 times faster Boeing is 1.6 times ( 60% ) faster in terms of throughput Concord is 2.2 times ( 120% ) faster in terms of flying time We will focus primarily on execution time for a single job Lots of instructions in a program => Instruction throughput important! Arquitectura de Computadoras Performance- 6
7 Relative Performance Definition: X is n % faster than Y if execution rate execution rate Example: X = 1 minute, Y = 2 minutes 2 minute 1 minute = Thus, X is 100 % faster than Y Example: Car wash that starts one car per minute and holds four cars. Latency = four minutes per car Throughput = one car per minute Throughput > 1/Latency due to overlap Key idea: pipelining X Y execution time Y n = = 1+ execution time 100 Arquitectura de Computadoras Performance- 7 X
8 Basis of Evaluation Pros representative portable widely used improvements useful in reality Actual Target Workload Full Application Benchmarks Cons very specific non-portable difficult to run, or measure hard to identify cause less representative easy to run, early in design cycle Small Kernel Benchmarks easy to fool identify peak capability and potential bottlenecks Microbenchmarks peak may be a long way from application performance Arquitectura de Computadoras Performance- 8
9 Metrics of performance Application Answers per month Useful Operations per second Programming Language Compiler ISA (millions) of Instructions per second MIPS (millions) of (F.P.) operations per second MFLOP/s Datapath Control Function Units Transistors Wires Pins Megabytes per second Cycles per second (clock rate) Each metric has a place and a purpose, and each can be misused Arquitectura de Computadoras Performance- 9
10 Aspects of CPU Performance CPU CPU time time = Seconds = Instructions x Cycles Cycles x Seconds Program Program Instruction Cycle Cycle Program instr count CPI clock rate Compiler Instr. Set Organization Technology Arquitectura de Computadoras Performance- 10
11 Aspects of CPU Performance CPU CPU time time = Seconds = Instructions x Cycles Cycles x Seconds Program Program Instruction Cycle Cycle Program instr count CPI clock rate X Compiler X X Instr. Set X X X Organization X X Technology X Arquitectura de Computadoras Performance- 11
12 CPI: Average cycles per per instruction CPI = Instruction Count / (CPU Time * Clock Rate) = Instruction Count / Cycles CPU Time = Cycle Time * n CPI i= 1 i * I i CPU Time = n CPI i= 1 i * F i where F i Ii = Instruction Count Invest resources where time is spent! Arquitectura de Computadoras Performance- 12
13 Controversial Example CPU time Instruction Cycles = Program Instruction Seconds Cycle Some have argued: CISC CPU Time = P x 8 x T = 8PT RISC CPU Time = 2P x 2 x T = 4PT RISC CPU Time = (CISC CPU Time)/2 DISCLAIMER: The truth is much, much more complex Arquitectura de Computadoras Performance- 13
14 Amdahl's Law Speedup due to enhancement E: ExTime w/o E Performance w/ E Speedup(E) = = ExTime w/ E Performance w/o E Suppose that enhancement E accelerates a fraction F of the task by a factor S and the remainder of the task is unaffected then, ExTime(with E) = ((1-F) + F/S) X ExTime(without E) Speedup(with E) = 1 (1-F) + F/S Arquitectura de Computadoras Performance- 14
15 Amdahl s Law Let Speedup new rate = = old rate old latency new latency Consider an enhancement x that speedups fraction f x of a task by S x Speedup overall = old latency new latency = [( 1- fx ) + fx ] old latency ( 1 f ) old latency + (f / S ) old latency x x x Amdahl s Law gives: Speedup overall = 1 ( 1 f ) + (f / S ) x x x Arquitectura de Computadoras Performance- 15
16 Amdahl s Law, cont. Example: f x = 95 % and S x = 1.10 Speedup overall = 1 ( ) + ( 095. / 110. ) = Example: f x = 5% and S x = 10 Speedup overall = 1 ( ) + ( 005. / 10) = Example: f x = 5% and S x Speedup overall = ( 005. = (. ) / ) Arquitectura de Computadoras Performance- 16
17 Amdahl s Law Corollary Since S x implies Speedup overall For real speedups: 1 Speedup overall < ( 1 ) 1 ( 1 ) + (f / ) f x f x x Example: f x 1 ( 1 f x ) 1 % % % % % % 2.00 Arquitectura de Computadoras Performance- 17
18 Standard Example: : Load/Store Machine Operation Frequency Cycle Count ALU Ops 43 % 1 Loads 21 % 1 Stores 12 % 2 Branches 24 % 2 Suppose we could make stores execute in 1 cycle, by slowing down the cycle time by 15 % Should we make this optimization? Old CPI = ( )x2 = 1.36 New CPI = x2 = 1.24 New CPU time Old CPU time Conclusion: Don t make the change = P New CPI 115. T P Old CPI T = 105. Arquitectura de Computadoras Performance- 18
19 Example (RISC processor) Base Machine (Reg / Reg) Op Freq Cycles CPI(i) % Time ALU 50% % Load 20% % Store 10% % Branch 20% % Typical Mix 2.2 How much faster would the machine be if a better data cache reduce the average load time to 2 cycles? How does this compare with using branch prediction to shave a cycle off the branch time? What if two ALU instructions could be executed at once? Arquitectura de Computadoras Performance- 19
20 Evaluating Instruction Sets? Design-time metrics: Can it be implemented, in how long, at what cost? Can it be programmed? Ease of compilation? Static Metrics: How many bytes does the program occupy in memory? Dynamic Metrics: How many instructions are executed? How many bytes does the processor fetch to execute the program? CPI How many clocks are required per instruction? How "lean" a clock is practical? Best Metric: Time to execute the program! Inst. Count Cycle Time NOTE: this depends on instructions set, processor organization, and compilation techniques. Arquitectura de Computadoras Performance- 20
21 Corollary: Make The Common Case Fast All instructions require an instruction fetch, only a fraction require a data fetch/store. Optimize instructions access over data access Programs exhibit locality spatial locality temporal locality Arquitectura de Computadoras Performance- 21
22 Corollary: Make The Common Case Fast Access to small memories is faster provide a storage hierarchy such that the most frequent accesses are the smallest (closest) memories Regs. Cache Memory Disk/Tape Arquitectura de Computadoras Performance- 22
23 Marketing Metrics Clock Frequency 3 Ghz better than 2 Ghz? Only relevant for comparing processors from the same family The same architecture The same ISA Machine with different instruction sets? Intel Pentium vs PowerPC Program with different instruction mixes? Dynamic frequency of instructions Uncorrelated to performance Arquitectura de Computadoras Performance- 23
24 Marketing Metrics MIPS= instruction Count /Time * 10 6 = Clock Rate / CPI * 10 6 machine with different instruction sets? program with different instruction mixes? dynamic frequency of instruction uncorrelated to performance MFLOPS = FP Operations / Time * 10 6 machine dependent often not where time is spent Normalized: add, sub, compare, mult 1 divide, sqrt 4 exp, sin,... 8 Arquitectura de Computadoras Performance- 24
25 Normalized MFLOPS Not all machines implement the same FP operations Cray-1 does not implement Divide Motorola does SQRT, SIN, and COS Not all FP operations are the same ADD is much faster than Divide Normalized MFLOPS Assign a canonical number of FP operations to a program Normalized MFLOPS = Canonical FP operations time 10 6 Arquitectura de Computadoras Performance- 25
26 Metrics of performance Application Answers per month Useful Operations per second Programming Language Compiler ISA (millions) of Instructions per second MIPS (millions) of (F.P.) operations per second MFLOP/s Datapath Control Function Units Transistors Wires Pins Megabytes per second Cycles per second (clock rate) Each metric has a place and a purpose, and each can be misused Arquitectura de Computadoras Performance- 26
27 Benchmarks Real Programs Representative of real workload The only accurate way to characterize performance e.g., gcc, spice,... Kernels Representative program fragments Time critical excerpts of real programs. e.g., Livermore loops Toy Benchmarks lines e. g. Sieve, Puzzle, Towers Synthetic Benchmarks attempt to match average frequencies of real workloads e.g. Whetstone, dhrystone Arquitectura de Computadoras Performance- 27
28 Benchmarking Reproducible results must control outside factors Important factors Program input Version of program Version of compiler Optimization level Version of operating system Amount of memory Number and type of disks Version of CPU Cache configuration Arquitectura de Computadoras Performance- 28
29 Benchmarking: SPEC Limitations of de facto Benchmarks Dhrystone Synthetic integer benchmark Heavy string emphasis Optimization compilers cause MAJOR problems Whetsone Synthetic floating-point benchmark Designed to thwart optimization Linpack Floating-point kernel DAXPY() = A(I) = B(I) + C * D(I) Arquitectura de Computadoras Performance- 29
30 SPEC95 Standard Performance Evaluation Corporation Eighteen application benchmarks (with inputs) reflecting a technical computing workload Eight integer go, m88ksim, gcc, compress, li, ijpeg, perl, vortex Ten floating-point intensive tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fppp, wave5 Must run with standard compiler flags eliminate special undocumented incantations that may not even generate working code for real programs Arquitectura de Computadoras Performance- 30
31 Benchmarking: SPEC200 Integer Floating Point Gzip Vpr Gcc Mcf Crafty Parser Eon Perlbmk Gap Vortex Bzip2 Compression FPGA circuit placement and routing C programming language compiler Combinatorial optimization Game playing: chess Word processing Computer visualization Perl programming language Group theory Object oriented database Compression Wupwise Swim Mgrid Applu Mesa Galgel Art Equaqke Facerec Ammp Lucas Fma3d Sixtrack Apsi Physics: quantum chromadinamics Shallow water modelling Multigrid solver: 3D potential field Partial differential equations 3D Graphics library Computational fluid dynamics Image recognition neural networks Seismic wave propagation simulation Image processing: face recognition Computational chemistry Number theory/primality testing Finite-element crash simulation Nuclear physics accelerator design Meteorology: pollutant distribution Arquitectura de Computadoras Performance- 31
32 Summarizing Results: : A Counter- Example A car goes 30 MPH for the first then miles and 90 MPH for the second ten miles. What the car s average speed over the twenty miles? Wrong answer: Avg Speed = 30 MPH + 90 MPH 2 = 60 MPH Correct answer: Avg Speed = = total distance total time 10 miles + 10 miles ( 10 miles / 30 MPH) + ( 10 miles / 90 MPH) = 20 miles ( 1/ 3) hour + ( 1/ 9) hour = 45 MPH Arquitectura de Computadoras Performance- 32
33 Summarizing Results: Averages Use the ARITHMETIC mean for times (cycles per instruction): Use the HARMONIC mean for rates (MIPS, MFLOPS): 1 n time i i= 1 Use the GEOMETRIC mean for ratios (normalized numbers): n 1 n 1 n i= 1rate i 1 n 1 n i= 1rate i 1 1/ n Better yet: don t average normalized numbers Arquitectura de Computadoras Performance- 33
34 Summarizing Results: A Measure of Time Property 1: A single-number performance measure for a set of benchmarks expressed in units of time should be directly proportional to the total (weighted) time consumed by the benchmarks. Property 2: A single-number performance measure for a set of benchmarks expressed as a rate should be indirectly proportional to the total (weighted) time consumed by the benchmarks. Arquitectura de Computadoras Performance- 34
35 Summarizing Results: Which Which Mean? T i = Execution time for Benchmark i F i = FP Operations for Benchmark i R i = F i / T i = Rate of Benchmark i Average Time: Average Rate: 1 n A mean = n i= 1 n A mean = n i= T i 1 R i 1 Violates Property 2: Not proportional to inverse of time. Use Harmonic mean: n 1 n 1 Fi H mean = 1 1 n i R = 1 n = 1 i= 1T i i Arquitectura de Computadoras Performance- 35
36 Homework 2 Choose a program to evaluate performance of a PC It can be for Linux or Windows Choose performance metrics for: Speed of CPU Speed of Main memory Speed of graphics applications Speed of hard disk Run performance program in two different computers Your assigned PC at the lab Your home computer Compare results for two computers and stand if one is faster than the other according each metrics Three pages long report Describe the performance program (one page) Describe performance tests and metrics (one page) Describe characteristics of both computers, compare results and make conclusions (one page) Arquitectura de Computadoras Performance- 36
37 Homework 2 Computer A Computer B Comparison CPU speed Memory speed Graphics speed HD speed Due date: September 19th, Arquitectura de Computadoras Performance- 37
Overview of Today s Lecture: Cost & Price, Performance { 1+ Administrative Matters Finish Lecture1 Cost and Price Add/Drop - See me after class
Overview of Today s Lecture: Cost & Price, Performance EE176-SJSU Computer Architecture and Organization Lecture 2 Administrative Matters Finish Lecture1 Cost and Price Add/Drop - See me after class EE176
More informationECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.
ECE C61 Computer Architecture Lecture 2 performance Prof Alok N Choudhary choudhar@ecenorthwesternedu 2-1 Today s s Lecture Performance Concepts Response Time Throughput Performance Evaluation Benchmarks
More informationWhich is the best? Measuring & Improving Performance (if planes were computers...) An architecture example
1 Which is the best? 2 Lecture 05 Performance Metrics and Benchmarking 3 Measuring & Improving Performance (if planes were computers...) Plane People Range (miles) Speed (mph) Avg. Cost (millions) Passenger*Miles
More informationComputer Performance Evaluation: Cycles Per Instruction (CPI)
Computer Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate: where: Clock rate = 1 / clock cycle A computer machine
More informationIntroduction to Pipelined Datapath
14:332:331 Computer Architecture and Assembly Language Week 12 Introduction to Pipelined Datapath [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 W12.1 Review:
More informationQuantifying Performance EEC 170 Fall 2005 Chapter 4
Quantifying Performance EEC 70 Fall 2005 Chapter 4 Performance Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation
More informationCpE 442 Introduction to Computer Architecture. The Role of Performance
CpE 442 Introduction to Computer Architecture The Role of Performance Instructor: H. H. Ammar CpE442 Lec2.1 Overview of Today s Lecture: The Role of Performance Review from Last Lecture Definition and
More informationComputer System. Performance
Computer System Performance Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationLecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533
Lecture 2: Computer Performance Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533 Performance and Cost Purchasing perspective given a collection of machines, which has the - best performance?
More information1.6 Computer Performance
1.6 Computer Performance Performance How do we measure performance? Define Metrics Benchmarking Choose programs to evaluate performance Performance summary Fallacies and Pitfalls How to avoid getting fooled
More informationCPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate:
CPI CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate: Clock cycle where: Clock rate = 1 / clock cycle f =
More informationInstructor Information
CS 203A Advanced Computer Architecture Lecture 1 1 Instructor Information Rajiv Gupta Office: Engg.II Room 408 E-mail: gupta@cs.ucr.edu Tel: (951) 827-2558 Office Times: T, Th 1-2 pm 2 1 Course Syllabus
More informationThe Von Neumann Computer Model
The Von Neumann Computer Model Partitioning of the computing engine into components: Central Processing Unit (CPU): Control Unit (instruction decode, sequencing of operations), Datapath (registers, arithmetic
More informationMeasure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding effects of underlying architecture
Chapter 2 Note: The slides being presented represent a mix. Some are created by Mark Franklin, Washington University in St. Louis, Dept. of CSE. Many are taken from the Patterson & Hennessy book, Computer
More informationMicroarchitecture Overview. Performance
Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 15, 2007 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make
More informationMicroarchitecture Overview. Performance
Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 18, 2005 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make
More informationUCB CS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 36 Performance 2010-04-23 Lecturer SOE Dan Garcia How fast is your computer? Every 6 months (Nov/June), the fastest supercomputers in
More informationLecture 3: Evaluating Computer Architectures. How to design something:
Lecture 3: Evaluating Computer Architectures Announcements - (none) Last Time constraints imposed by technology Computer elements Circuits and timing Today Performance analysis Amdahl s Law Performance
More informationLecture - 4. Measurement. Dr. Soner Onder CS 4431 Michigan Technological University 9/29/2009 1
Lecture - 4 Measurement Dr. Soner Onder CS 4431 Michigan Technological University 9/29/2009 1 Acknowledgements David Patterson Dr. Roger Kieckhafer 9/29/2009 2 Computer Architecture is Design and Analysis
More informationCourse web site: teaching/courses/car. Piazza discussion forum:
Announcements Course web site: http://www.inf.ed.ac.uk/ teaching/courses/car Lecture slides Tutorial problems Courseworks Piazza discussion forum: http://piazza.com/ed.ac.uk/spring2018/car Tutorials start
More informationThe bottom line: Performance. Measuring and Discussing Computer System Performance. Our definition of Performance. How to measure Execution Time?
The bottom line: Performance Car to Bay Area Speed Passengers Throughput (pmph) Ferrari 3.1 hours 160 mph 2 320 Measuring and Discussing Computer System Performance Greyhound 7.7 hours 65 mph 60 3900 or
More informationComputer Science 246. Computer Architecture
Computer Architecture Spring 2010 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture Outline Performance Metrics Averaging Amdahl s Law Benchmarks The CPU Performance Equation Optimal
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 41 Performance II CS61C L41 Performance II (1) Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia UWB Ultra Wide Band! The FCC moved
More informationMEASURING COMPUTER TIME. A computer faster than another? Necessity of evaluation computer performance
Necessity of evaluation computer performance MEASURING COMPUTER PERFORMANCE For comparing different computer performances User: Interested in reducing the execution time (response time) of a task. Computer
More informationCS61C Performance. Lecture 23. April 21, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson)
cs 61C L23 performance.1 CS61C Performance Lecture 23 April 21, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs61c/schedule.html Outline Review HP-PA, Intel 80x86 instruction
More informationChapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,
Chapter 1 Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Course Goals Introduce you to design principles, analysis techniques and design options in computer architecture
More informationPerformance evaluation. Performance evaluation. CS/COE0447: Computer Organization. It s an everyday process
Performance evaluation It s an everyday process CS/COE0447: Computer Organization and Assembly Language Chapter 4 Sangyeun Cho Dept. of Computer Science When you buy food Same quantity, then you look at
More informationDefining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain.
Defining Performance Performance 1 Which airplane has the best performance? Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747 BAC/Sud Concorde Douglas DC- 8-50 0 100 200 300
More informationUCB CS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 38 Performance 2008-04-30 Lecturer SOE Dan Garcia How fast is your computer? Every 6 months (Nov/June), the fastest supercomputers in
More informationPerformance COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals
Performance COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals What is Performance? How do we measure the performance of
More informationEvaluating Computers: Bigger, better, faster, more?
Evaluating Computers: Bigger, better, faster, more? 1 Key Points What does it mean for a computer to be good? What is latency? bandwidth? What is the performance equation? 2 What do you want in a computer?
More informationCS61C - Machine Structures. Week 6 - Performance. Oct 3, 2003 John Wawrzynek.
CS61C - Machine Structures Week 6 - Performance Oct 3, 2003 John Wawrzynek http://www-inst.eecs.berkeley.edu/~cs61c/ 1 Why do we worry about performance? As a consumer: An application might need a certain
More informationThe Role of Performance
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture The Role of Performance What is performance? A set of metrics that allow us to compare two different hardware
More informationECE 486/586. Computer Architecture. Lecture # 3
ECE 486/586 Computer Architecture Lecture # 3 Spring 2014 Portland State University Lecture Topics Measuring, Reporting and Summarizing Performance Execution Time and Throughput Benchmarks Comparing and
More informationCO Computer Architecture and Programming Languages CAPL. Lecture 15
CO20-320241 Computer Architecture and Programming Languages CAPL Lecture 15 Dr. Kinga Lipskoch Fall 2017 How to Compute a Binary Float Decimal fraction: 8.703125 Integral part: 8 1000 Fraction part: 0.703125
More informationAPPENDIX Summary of Benchmarks
158 APPENDIX Summary of Benchmarks The experimental results presented throughout this thesis use programs from four benchmark suites: Cyclone benchmarks (available from [Cyc]): programs used to evaluate
More informationComputer Architecture s Changing Definition
Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction
More informationPerformance. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]
Performance CS 3410 Computer System Organization & Programming [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon] Performance Complex question How fast is the processor? How fast your application runs?
More informationComputer Architecture
Computer Architecture Architecture The art and science of designing and constructing buildings A style and method of design and construction Design, the way components fit together Computer Architecture
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures CS61C L41 Performance I (1) Lecture 41 Performance I 2004-12-06 Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia Sour Roses! Cal s best season
More informationComputer Performance. Reread Chapter Quiz on Friday. Study Session Wed Night FB 009, 5pm-6:30pm
Computer Performance He said, to speed things up we need to squeeze the clock Reread Chapter 1.4-1.9 Quiz on Friday. Study Session Wed Night FB 009, 5pm-6:30pm L15 Computer Performance 1 Why Study Performance?
More informationEECS2021. EECS2021 Computer Organization. EECS2021 Computer Organization. Morgan Kaufmann Publishers September 14, 2016
EECS2021 Computer Organization Fall 2015 The slides are based on the publisher slides and contribution from Profs Amir Asif and Peter Lian The slides will be modified, annotated, explained on the board,
More informationDefining Performance. Performance. Which airplane has the best performance? Boeing 777. Boeing 777. Boeing 747. Boeing 747
Defining Which airplane has the best performance? 1 Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747 BAC/Sud Concorde Douglas DC- 8-50 0 100 200 300 400 500 Passenger Capacity
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 1. Computer Abstractions and Technology
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology The Computer Revolution Progress in computer technology Underpinned by Moore
More informationCPE300: Digital System Architecture and Design
CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Number Representation 09212011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Recap Logic Circuits for Register Transfer
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Classes of Computers Personal computers General purpose, variety of software
More informationCSE 141 Computer Architecture Summer Session Lecture 2 Performance, ALU. Pramod V. Argade
CSE 141 Computer Architecture Summer Session 1 24 Lecture 2 Performance, ALU Pramod V. Argade CSE141: Introduction to Computer Architecture Instructor: TA: Pramod V. Argade (p2argade@cs.ucsd.edu) Office
More informationPerformance of computer systems
Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type
More informationReview: latency vs. throughput
Lecture : Performance measurement and Instruction Set Architectures Last Time Introduction to performance Computer benchmarks Amdahl s law Today Take QUIZ 1 today over Chapter 1 Turn in your homework on
More informationThe Von Neumann Computer Model
The Von Neumann Computer Model Partitioning of the computing engine into components: Central Processing Unit (CPU): Control Unit (instruction decode, sequencing of operations), Datapath (registers, arithmetic
More informationImpact of Cache Coherence Protocols on the Processing of Network Traffic
Impact of Cache Coherence Protocols on the Processing of Network Traffic Amit Kumar and Ram Huggahalli Communication Technology Lab Corporate Technology Group Intel Corporation 12/3/2007 Outline Background
More informationWhat is Good Performance. Benchmark at Home and Office. Benchmark at Home and Office. Program with 2 threads Home program.
Performance COMP375 Computer Architecture and dorganization What is Good Performance Which is the best performing jet? Airplane Passengers Range (mi) Speed (mph) Boeing 737-100 101 630 598 Boeing 747 470
More informationDesigning for Performance. Patrick Happ Raul Feitosa
Designing for Performance Patrick Happ Raul Feitosa Objective In this section we examine the most common approach to assessing processor and computer system performance W. Stallings Designing for Performance
More informationLecture 4: Instruction Set Architectures. Review: latency vs. throughput
Lecture 4: Instruction Set Architectures Last Time Performance analysis Amdahl s Law Performance equation Computer benchmarks Today Review of Amdahl s Law and Performance Equations Introduction to ISAs
More informationComputing System Fundamentals/Trends + Review of Performance Evaluation and ISA Design
Computing System Fundamentals/Trends + Review of Performance Evaluation and ISA Design Computing Element Choices: Computing Element Programmability Spatial vs. Temporal Computing Main Processor Types/Applications
More informationComputing System Fundamentals/Trends + Review of Performance Evaluation and ISA Design
Computing System Fundamentals/Trends + Review of Performance Evaluation and ISA Design Computing Element Choices: Computing Element Programmability Spatial vs. Temporal Computing Main Processor Types/Applications
More informationUNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568/668
UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Computer Architecture ECE 568/668 Part 1 Introduction Israel Koren ECE568/Koren Part.1.1 Coping with ECE 568/668 Students with varied
More informationCMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science
More informationChapter-5 Memory Hierarchy Design
Chapter-5 Memory Hierarchy Design Unlimited amount of fast memory - Economical solution is memory hierarchy - Locality - Cost performance Principle of locality - most programs do not access all code or
More informationComputer Science 146. Computer Architecture
Computer Architecture Spring 2004 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture 9: Limits of ILP, Case Studies Lecture Outline Speculative Execution Implementing Precise Interrupts
More informationADVANCED ELECTRONIC SOLUTIONS AVIATION SERVICES COMMUNICATIONS AND CONNECTIVITY MISSION SYSTEMS
The most important thing we build is trust ADVANCED ELECTRONIC SOLUTIONS AVIATION SERVICES COMMUNICATIONS AND CONNECTIVITY MISSION SYSTEMS UT840 LEON Quad Core First Silicon Results Cobham Semiconductor
More informationEE282 Computer Architecture. Lecture 1: What is Computer Architecture?
EE282 Computer Architecture Lecture : What is Computer Architecture? September 27, 200 Marc Tremblay Computer Systems Laboratory Stanford University marctrem@csl.stanford.edu Goals Understand how computer
More informationChapter 1. Computer Abstractions and Technology. Adapted by Paulo Lopes, IST
Chapter 1 Computer Abstractions and Technology Adapted by Paulo Lopes, IST The Computer Revolution Progress in computer technology Sustained by Moore s Law Makes novel and old applications feasible Computers
More informationPage 1. Program Performance Metrics. Program Performance Metrics. Amdahl s Law. 1 seq seq 1
Program Performance Metrics The parallel run time (Tpar) is the time from the moment when computation starts to the moment when the last processor finished his execution The speedup (S) is defined as the
More informationReview: Salient features of MIPS I. CS152 Computer Architecture and Engineering Lecture 3
Review Salient features of MIPS I CS152 Computer rchitecture and Engineering Lecture 3 Performance, Technology & Delay Modeling Smattering of Prerequisite Topics January 28, 2003 John Kubiatowicz (http//www.cs.berkeley.edu/~kubitron)
More informationCS430 Computer Architecture
CS430 Computer Architecture Spring 2015 Spring 2015 CS430 - Computer Architecture 1 Chapter 14 Processor Structure and Function Instruction Cycle from Chapter 3 Spring 2015 CS430 - Computer Architecture
More informationTDT4255 Computer Design. Lecture 1. Magnus Jahre
1 TDT4255 Computer Design Lecture 1 Magnus Jahre 2 Outline Practical course information Chapter 1: Computer Abstractions and Technology 3 Practical Course Information 4 TDT4255 Computer Design TDT4255
More informationCOSC3330 Computer Architecture Lecture 7. Datapath and Performance
COSC3330 Computer Architecture Lecture 7. Datapath and Performance Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston Datapath Performance 2 Datapath Control Signals
More informationThe Computer Revolution. Classes of Computers. Chapter 1
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition 1 Chapter 1 Computer Abstractions and Technology 1 The Computer Revolution Progress in computer technology Underpinned by Moore
More informationThis Unit. CIS 501 Computer Architecture. As You Get Settled. Readings. Metrics Latency and throughput. Reporting performance
This Unit CIS 501 Computer Architecture Metrics Latency and throughput Reporting performance Benchmarking and averaging Unit 2: Performance Performance analysis & pitfalls Slides developed by Milo Martin
More informationChapter 14 Performance and Processor Design
Chapter 14 Performance and Processor Design Outline 14.1 Introduction 14.2 Important Trends Affecting Performance Issues 14.3 Why Performance Monitoring and Evaluation are Needed 14.4 Performance Measures
More informationCSE 141 Computer Architecture Summer Session I, Lecture 3 Performance and Single Cycle CPU Part 1. Pramod V. Argade
CSE 141 Computer Architecture Summer Session I, 2005 Lecture 3 Performance and Single Cycle CPU Part 1 Pramod V. Argade CSE141: Introduction to Computer Architecture Instructor: Pramod V. Argade (p2argade@cs.ucsd.edu)
More informationEngineering 9859 CoE Fundamentals Computer Architecture
Engineering 9859 CoE Fundamentals Computer Architecture Introduction Dennis Peters 1 Fall 2007 1 Based on notes from Dr. R. Venkatesan Course Details Classes Monday, Wednesday, Friday 9 10 EN-4033 Course
More informationPerformance. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Performance Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Defining Performance (1) Which airplane has the best performance? Boeing 777 Boeing
More informationAdministrivia. CMSC 411 Computer Systems Architecture Lecture 14 Instruction Level Parallelism (cont.) Control Dependencies
Administrivia CMSC 411 Computer Systems Architecture Lecture 14 Instruction Level Parallelism (cont.) HW #3, on memory hierarchy, due Tuesday Continue reading Chapter 3 of H&P Alan Sussman als@cs.umd.edu
More informationReporting Performance Results
Reporting Performance Results The guiding principle of reporting performance measurements should be reproducibility - another experimenter would need to duplicate the results. However: A system s software
More informationGRE Architecture Session
GRE Architecture Session Session 2: Saturday 23, 1995 Young H. Cho e-mail: youngc@cs.berkeley.edu www: http://http.cs.berkeley/~youngc Y. H. Cho Page 1 Review n Homework n Basic Gate Arithmetics n Bubble
More informationCS/ECE 752: Advanced Computer Architecture 1. Lecture 1: What is Computer Architecture?
CS/ECE 752: Advanced Computer Architecture 1 Lecture 1: What is Computer Architecture? Karu Sankaralingam University of Wisconsin-Madison karu@cs.wisc.edu Slides courtesy of Stephen W. Keckler, UT-Austin
More informationECE 252 / CPS 220 Advanced Computer Architecture I. Administrivia. Instructors and Course Website. Where to Get Answers
ECE 252 / CPS 220 Advanced Computer Architecture I Fall 2003 Duke University Instructor: Prof. Daniel Sorin (sorin@ee.duke.edu) Administrivia addresses, email, website, etc. list of topics expected background
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations
More informationResponse Time and Throughput
Response Time and Throughput Response time How long it takes to do a task Throughput Total work done per unit time e.g., tasks/transactions/ per hour How are response time and throughput affected by Replacing
More informationCSE 141 Summer 2016 Homework 2
CSE 141 Summer 2016 Homework 2 PID: Name: 1. A matrix multiplication program can spend 10% of its execution time in reading inputs from a disk, 10% of its execution time in parsing and creating arrays
More informationAn Analysis of the Amount of Global Level Redundant Computation in the SPEC 95 and SPEC 2000 Benchmarks
An Analysis of the Amount of Global Level Redundant Computation in the SPEC 95 and SPEC 2000 s Joshua J. Yi and David J. Lilja Department of Electrical and Computer Engineering Minnesota Supercomputing
More informationComputer Architecture. Minas E. Spetsakis Dept. Of Computer Science and Engineering (Class notes based on Hennessy & Patterson)
Computer Architecture Minas E. Spetsakis Dept. Of Computer Science and Engineering (Class notes based on Hennessy & Patterson) What is Architecture? Instruction Set Design. Old definition from way back
More informationComputer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture
Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications
More informationQuiz for Chapter 1 Computer Abstractions and Technology
Date: Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,
More informationPerformance, Power, Die Yield. CS301 Prof Szajda
Performance, Power, Die Yield CS301 Prof Szajda Administrative HW #1 assigned w Due Wednesday, 9/3 at 5:00 pm Performance Metrics (How do we compare two machines?) What to Measure? Which airplane has the
More informationIC220 Slide Set #5B: Performance (Chapter 1: 1.6, )
Performance IC220 Slide Set #5B: Performance (Chapter 1: 1.6, 1.9-1.11) Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational
More informationComputer Architecture. What is it?
Computer Architecture Venkatesh Akella EEC 270 Winter 2005 What is it? EEC270 Computer Architecture Basically a story of unprecedented improvement $1K buys you a machine that was 1-5 million dollars a
More information5DV118 Computer Organization and Architecture Umeå University Department of Computing Science Stephen J. Hegner. Topic 1: Introduction
5DV118 Computer Organization and Architecture Umeå University Department of Computing Science Stephen J. Hegner Topic 1: Introduction These slides are mostly taken verbatim, or with minor changes, from
More informationPERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah
PERFORMANCE METRICS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Sept. 5 th : Homework 1 release (due on Sept.
More informationComputer Systems Performance Analysis and Benchmarking (37-235)
Computer Systems Performance Analysis and Benchmarking (37-235) Analytic Modelling Simulation Measurements / Benchmarking Lecture/Assignments/Projects: Dipl. Inf. Ing. Christian Kurmann Textbook: Raj Jain,
More informationChapter 1. and Technology
Chapter 1 Computer Abstractions Computer Abstractions and Technology The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications feasible Computers in automobiles
More information1.3 Data processing; data storage; data movement; and control.
CHAPTER 1 OVERVIEW ANSWERS TO QUESTIONS 1.1 Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical
More informationPIPELINING AND PROCESSOR PERFORMANCE
PIPELINING AND PROCESSOR PERFORMANCE Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 1, John L. Hennessy and David A. Patterson, Morgan Kaufmann,
More informationDesign of Experiments - Terminology
Design of Experiments - Terminology Response variable Measured output value E.g. total execution time Factors Input variables that can be changed E.g. cache size, clock rate, bytes transmitted Levels Specific
More informationCache Optimization by Fully-Replacement Policy
American Journal of Embedded Systems and Applications 2016; 4(1): 7-14 http://www.sciencepublishinggroup.com/j/ajesa doi: 10.11648/j.ajesa.20160401.12 ISSN: 2376-6069 (Print); ISSN: 2376-6085 (Online)
More informationComputer Architecture. Introduction and Performance Measures
046267 Computer Architecture Introduction and Performance Measures Dr. Yitzhak (Tsahi) Birk Electrical Engr. Dept. Technion 1 מטרות הקורס (ל קח והשג נדרשים) הכרת יחידות המ חשב ושיטות לתכנונן רכישת כלים
More informationAries: Transparent Execution of PA-RISC/HP-UX Applications on IPF/HP-UX
Aries: Transparent Execution of PA-RISC/HP-UX Applications on IPF/HP-UX Keerthi Bhushan Rajesh K Chaurasia Hewlett-Packard India Software Operations 29, Cunningham Road Bangalore 560 052 India +91-80-2251554
More informationMultithreading Processors and Static Optimization Review. Adapted from Bhuyan, Patterson, Eggers, probably others
Multithreading Processors and Static Optimization Review Adapted from Bhuyan, Patterson, Eggers, probably others Schedule of things to do By Wednesday the 9 th at 9pm Please send a milestone report (as
More information