What is Good Performance. Benchmark at Home and Office. Benchmark at Home and Office. Program with 2 threads Home program.

Similar documents
Defining Performance. Performance. Which airplane has the best performance? Boeing 777. Boeing 777. Boeing 747. Boeing 747

Defining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain.

Computer Performance. Reread Chapter Quiz on Friday. Study Session Wed Night FB 009, 5pm-6:30pm

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 1. Computer Abstractions and Technology

ECE 486/586. Computer Architecture. Lecture # 3

Designing for Performance. Patrick Happ Raul Feitosa

Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding effects of underlying architecture

Performance of computer systems

Performance, Power, Die Yield. CS301 Prof Szajda

The Computer Revolution. Classes of Computers. Chapter 1

Performance evaluation. Performance evaluation. CS/COE0447: Computer Organization. It s an everyday process

The bottom line: Performance. Measuring and Discussing Computer System Performance. Our definition of Performance. How to measure Execution Time?

Performance COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals

Course web site: teaching/courses/car. Piazza discussion forum:

Response Time and Throughput

CPE300: Digital System Architecture and Design

Computer Performance. Relative Performance. Ways to measure Performance. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So

Chapter 1. The Computer Revolution

Lecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533

Chapter 1. Computer Abstractions and Technology. Lesson 2: Understanding Performance

Chapter 1. Computer Abstractions and Technology. Adapted by Paulo Lopes, IST

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology

Performance. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Quiz for Chapter 1 Computer Abstractions and Technology 3.10

The Role of Performance

ECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.

EECS2021E EECS2021E. The Computer Revolution. Morgan Kaufmann Publishers September 12, Chapter 1 Computer Abstractions and Technology 1

1.6 Computer Performance

TDT4255 Computer Design. Lecture 1. Magnus Jahre

EECS2021. EECS2021 Computer Organization. EECS2021 Computer Organization. Morgan Kaufmann Publishers September 14, 2016

Computer Organization & Assembly Language Programming (CSE 2312)

UCB CS61C : Machine Structures

Computer System. Performance

CMSC 611: Advanced Computer Architecture

Quiz for Chapter 1 Computer Abstractions and Technology

Computer Architecture Homework Set # 1 COVER SHEET Please turn in with your own solution

Chapter 1. and Technology

CS430 Computer Architecture

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Parallelism. Execution Cycle. Dual Bus Simple CPU. Pipelining COMP375 1

Performance. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]

ECE369: Fundamentals of Computer Architecture

CS61C - Machine Structures. Week 6 - Performance. Oct 3, 2003 John Wawrzynek.

From CISC to RISC. CISC Creates the Anti CISC Revolution. RISC "Philosophy" CISC Limitations

Outline Marquette University

1.3 Data processing; data storage; data movement; and control.

Assessing and Understanding Performance

Evaluating Computers: Bigger, better, faster, more?

CSE2021 Computer Organization. Computer Abstractions and Technology

Instructor Information

COMPUTER ORGANIZATION AND DESIGN

Performance, Cost and Amdahl s s Law. Arquitectura de Computadoras

An Introduction to Parallel Architectures

TDT 4260 lecture 2 spring semester 2015

COSC3330 Computer Architecture Lecture 7. Datapath and Performance

Which is the best? Measuring & Improving Performance (if planes were computers...) An architecture example

GRE Architecture Session

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

CO Computer Architecture and Programming Languages CAPL. Lecture 15

This Unit. CIS 501 Computer Architecture. As You Get Settled. Readings. Metrics Latency and throughput. Reporting performance

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

Vector and Parallel Processors. Amdahl's Law

T T T T T T N T T T T T T T T N T T T T T T T T T N T T T T T T T T T T T N.

Typical DSP application

Reduced Instruction Set Computer

Exercise 1 Advanced Computer Architecture. Exercise 1

Intel released new technology call P6P

IC220 Slide Set #5B: Performance (Chapter 1: 1.6, )

Computer Organization and Design THE HARDWARE/SOFTWARE INTERFACE

PERFORMANCE MEASUREMENT

CSE 141 Summer 2016 Homework 2

Advanced d Processor Architecture. Computer Systems Laboratory Sungkyunkwan University

MEASURING COMPUTER TIME. A computer faster than another? Necessity of evaluation computer performance

Performance Analysis

Chapter 1. Introduction To Computer Systems

Engineering 9859 CoE Fundamentals Computer Architecture

Overview of Today s Lecture: Cost & Price, Performance { 1+ Administrative Matters Finish Lecture1 Cost and Price Add/Drop - See me after class

Computer Architecture. What is it?

Computer Architecture. Chapter 1 Part 2 Performance Measures

Computer Architecture

ECE/CS 552: Introduction to Computer Architecture ASSIGNMENT #1 Due Date: At the beginning of lecture, September 22 nd, 2010

CpE 442 Introduction to Computer Architecture. The Role of Performance

Lecture 7: Parallel Processing

Microarchitecture Overview. Performance

Chapter 17 Assessing Computer Performance

Keywords and Review Questions

CSCI 402: Computer Architectures. Computer Abstractions and Technology (4) Fengguang Song Department of Computer & Information Science IUPUI.

Performance Metrics. 1 cycle. 1 cycle. Computer B performs more instructions per second, thus it is the fastest for this program.

Chapter 1. Computer Abstractions and Technology

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

Rechnerstrukturen

Microarchitecture Overview. Performance

Final Lecture. A few minutes to wrap up and add some perspective

Algorithms and Architecture. William D. Gropp Mathematics and Computer Science

Lecture 2: Performance

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 1. Computer Abstractions and Technology. Jiang Jiang

Transcription:

Performance COMP375 Computer Architecture and dorganization What is Good Performance Which is the best performing jet? Airplane Passengers Range (mi) Speed (mph) Boeing 737-100 101 630 598 Boeing 747 470 4150 610 BAC/Sud Concorde 132 4000 1350 Douglas DC-8-50 146 8720 544 The Concorde is the fastest. The Boeing 747 is the largest. The DC-8 has the longest range. 2004 Morgan Kaufmann Publishers Benchmark at Home and Office Simple program Home 1.00 Office 341 3.41 Benchmark at Home and Office Simple program Program with 2 threads Home 1.00 0.99 Office 341 3.41 586 5.86 Home CPU is an Intel 2.0 GHz Pentium 4 Office CPU is an Intel 3.16 GHz Core 2 Duo Pentium 4 Home CPU is an Intel 2.0 GHz Pentium 4 Office CPU is an Intel 3.16 GHz Core 2 Duo Pentium 4 COMP375 1

Defining Performance Response Time The wall clock time it takes for a computer to complete a task. This is the performance measure of most interest to desktop users. Throughput The amount of work that can be accomplished, such as web pages served per second. Servers are often measured in this manner. Many Applications Some programs do a lot of I/O. They run best on a computer with lots of RAM and fast disks. Some programs are CPU bound. The run best on a computer with a fast CPU and lots of cache. Why does having lots of RAM improve disk I/O access? 1. Better RAM to cache ratio 2. More space for disk caching 3. Faster bus access 4. It doesn t Performance Aspects Integer arithmetic Floating Point arithmetic Network throughput Disk access time Disk throughput Audio and video processing Power usage COMP375 2

Clock Rate Isn t Everything Different architectures require a different number of Cycles per Instruction ti (CPI). CISC processors typically use more CPI, but may accomplish more per instruction. RISC processors usually have lower CPI, but it may take more instructions to accomplish a task. Execution Time N = number of instructions executed to run a program Hz = Clock rate Execution time = N * CPI / Hz This gives the execution time, not counting any I/O, system time or time spent on other applications. What might cause the number of cycles per second to vary? 1. Pipeline stalls 2. Caching 3. Virtual memory 4. Interrupts 5. All of the above 6. None of the above MIPS Millions of Instructions Per Second Meaningless Indicator of Performance MIPS = clock rate / (CPI * 1,000,000) The processor is guaranteed to not run faster than 25 MIPS. honest salesman COMP375 3

FLOPS Fastest Computers Floating Point Operations Per Second A common measure of high performance computer speed. Usually used with a metric such as TeraFLOPS or PetaFLOPS. Peak Cores GFlops Roadrunner 129,600 1,456,704 Jaguar - Cray XT5 150,152 1,381,400 Pleiades - SGI 51,200 608,829829 BlueGene/L 212,992 596,378 Benchmarks A benchmark is a program run on different systems whose execution time is used as a measure of system performance. A benchmark tells you how long it will take to run THAT program with THAT data. Benchmark results may, or may not, correspond to the time it takes to run your application. from http://www.cpubenchmark.net/high_end_cpus.html COMP375 4

SPEC Standardized benchmarks from the System Performance Evaluation Corporation Several different versions SPEC CINT2000 SPEC CFP2000 SPECweb2005 WinSPEC Homogeneous Systems Improvements for a given architecture: Increases in clock rate Improvements in CPU organization that t lowers the CPI More cache and wider bus for faster memory access Compiler enhancements to reduce the number of instructions executed to perform an application. Adding custom instructions. Amdahl s Law Improvements to part of a program can only improve that part of the program. Assume P is the fraction of the program that can run on N processors Speedup Speedup is the ratio of the speed of the parallel l program compared to the speed of the sequential version. COMP375 5

How to Improve Performance Buy more memory Buy faster disks If you are writing the application use compiler optimization read and write files in large blocks use memory instead of files Measure System Performance All operating systems provide tools to help measure system performance. Commonly measured values CPU utilization Number of I/Os to the disk / second Number of page faults / second SPARC Instruction Set The Sun SPARC processor has 32 registers. Register R0 is always zero and cannot be changed. Instructions are of the form Add R1, R2, R3 Where R2 is added to R3 and the result stored in R1. How can you implement mov R dest, R source? 1. Add R dest, R source, R0 2. Add R0, R dest, R source 3. Sub R dest, R0, R source 4. Add R0, R0, R0 5. Cannot be done COMP375 6

3 Level Memory Speed Assume your have L1 and L2 cache and RAM with hit rates H 1, H 2 and H 3 (Note: H1 + H2 + H3 = 1) The average memory access time is: avg = H 1 *L1 + H 2 *L2 + H 3 *RAM COMP375 7