Course web site: teaching/courses/car. Piazza discussion forum:

Similar documents
The bottom line: Performance. Measuring and Discussing Computer System Performance. Our definition of Performance. How to measure Execution Time?

Instructor Information

ECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.

Performance COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals

ECE 486/586. Computer Architecture. Lecture # 3

CpE 442 Introduction to Computer Architecture. The Role of Performance

Defining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain.

Chapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,

MEASURING COMPUTER TIME. A computer faster than another? Necessity of evaluation computer performance

Computer Architecture

Performance evaluation. Performance evaluation. CS/COE0447: Computer Organization. It s an everyday process

PERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

Performance of computer systems

Engineering 9859 CoE Fundamentals Computer Architecture

Final Lecture. A few minutes to wrap up and add some perspective

Quiz for Chapter 1 Computer Abstractions and Technology 3.10

Computer Performance. Reread Chapter Quiz on Friday. Study Session Wed Night FB 009, 5pm-6:30pm

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture

Copyright 2012, Elsevier Inc. All rights reserved.

Quiz for Chapter 1 Computer Abstractions and Technology

Defining Performance. Performance. Which airplane has the best performance? Boeing 777. Boeing 777. Boeing 747. Boeing 747

EECS4201 Computer Architecture

IC220 Slide Set #5B: Performance (Chapter 1: 1.6, )

GRE Architecture Session

Lecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533

Lecture - 4. Measurement. Dr. Soner Onder CS 4431 Michigan Technological University 9/29/2009 1

Spring 2014 Midterm Exam Review

Overview of Today s Lecture: Cost & Price, Performance { 1+ Administrative Matters Finish Lecture1 Cost and Price Add/Drop - See me after class

CSE 141 Summer 2016 Homework 2

15-740/ Computer Architecture Lecture 4: Pipelining. Prof. Onur Mutlu Carnegie Mellon University

CO Computer Architecture and Programming Languages CAPL. Lecture 15

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology

Fundamentals of Quantitative Design and Analysis

ECE 587 Advanced Computer Architecture I

Review: latency vs. throughput

Designing for Performance. Patrick Happ Raul Feitosa

Computer Performance Evaluation: Cycles Per Instruction (CPI)

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

What is Good Performance. Benchmark at Home and Office. Benchmark at Home and Office. Program with 2 threads Home program.

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

University of Toronto Faculty of Applied Science and Engineering

Response Time and Throughput

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing

Microarchitecture Overview. Performance

CPE300: Digital System Architecture and Design

15-740/ Computer Architecture Lecture 7: Pipelining. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 9/26/2011

Performance, Cost and Amdahl s s Law. Arquitectura de Computadoras

Lecture Topics. Principle #1: Exploit Parallelism ECE 486/586. Computer Architecture. Lecture # 5. Key Principles of Computer Architecture

Architectural Issues for the 1990s. David A. Patterson. Computer Science Division EECS Department University of California Berkeley, CA 94720

Pipelining. CS701 High Performance Computing

CMSC 611: Advanced Computer Architecture

Lecture 4: Instruction Set Architectures. Review: latency vs. throughput

High Performance Computing

Which is the best? Measuring & Improving Performance (if planes were computers...) An architecture example

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

The Von Neumann Computer Model

CS654 Advanced Computer Architecture. Lec 2 - Introduction

Performance Analysis

Performance. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]

Lecture 1: Introduction

Alexandria University

Practice Assignment 1

CS341l Fall 2009 Test #2

LECTURE 1. Introduction

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 1. Computer Abstractions and Technology

Keywords and Review Questions

The Role of Performance

Lecture Topics ECE 341. Lecture # 8. Functional Units. Information Processed by a Computer

Microarchitecture Overview. Performance

Computer Architecture

Transistors and Wires

CMSC 411 Practice Exam 1 w/answers. 1. CPU performance Suppose we have the following instruction mix and clock cycles per instruction.

CS430 Computer Architecture

Performance, Power, Die Yield. CS301 Prof Szajda

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 1. Computer Abstractions and Technology

Computer Performance. Relative Performance. Ways to measure Performance. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So

Computer Architecture s Changing Definition

Lec 25: Parallel Processors. Announcements

Security-Aware Processor Architecture Design. CS 6501 Fall 2018 Ashish Venkat

CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007

Lecture 3: Evaluating Computer Architectures. How to design something:

Lecture 21: Parallelism ILP to Multicores. Parallel Processing 101

PERFORMANCE MEASUREMENT

Computer Architecture. Introduction. Lynn Choi Korea University

CSCI 402: Computer Architectures. Computer Abstractions and Technology (4) Fengguang Song Department of Computer & Information Science IUPUI.

Computer Architecture. Minas E. Spetsakis Dept. Of Computer Science and Engineering (Class notes based on Hennessy & Patterson)

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

This Unit. CIS 501 Computer Architecture. As You Get Settled. Readings. Metrics Latency and throughput. Reporting performance

Cycles Per Instruction For This Microprocessor

The Processor: Instruction-Level Parallelism

ELE 375 Final Exam Fall, 2000 Prof. Martonosi

Computer Architecture. Chapter 1 Part 2 Performance Measures

Chapter 9. Pipelining Design Techniques

CS/ECE 752: Advanced Computer Architecture 1. Lecture 1: What is Computer Architecture?

CS152 Computer Architecture and Engineering. Lecture 9 Performance Dave Patterson. John Lazzaro. www-inst.eecs.berkeley.

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

Outline Marquette University

EE282H: Computer Architecture and Organization. EE282H: Computer Architecture and Organization -- Course Overview

Lecture 2: Performance

Performance. February 12, Howard Huang 1

Transcription:

Announcements Course web site: http://www.inf.ed.ac.uk/ teaching/courses/car Lecture slides Tutorial problems Courseworks Piazza discussion forum: http://piazza.com/ed.ac.uk/spring2018/car Tutorials start in week 3 1

Summary: Computer Architecture What is Computer Architecture? Technology Logic design Operating System (OS) Computer Architecture: ISA Microarchitecture Hardware implementation Programming Languages Compilers Applications Users Metrics of interest for computer architects Performance Cost Reliability Power Area 2

Performance of Computer Architectures Metrics: Execution time, response time: overall time for a given computation (e.g. full program execution, one transaction) Latency: time to complete a given task (e.g. memory latency, I/O latency, instruction latency) Bandwidth, throughput: rate of completion of tasks (e.g. memory bandwidth, transactions per second, MIPS, FLOPS) MIPS, FLOPS: must use with caution Benchmarking: toy benchmarks, synthetic benchmarks, kernels, real programs, Input sets Standard benchmarking suites: SPEC INT/FP, EEMBC Coremark, CloudSuite 3

CPU Performance Terminology A is n times faster than B means Execution time of B Execution time of A = n A is m% faster than B means Execution time of B Execution time of A Execution time of A x100 = m 4

Improving Performance: Principles of CA Design Take advantage of parallelism System level: multiple processors, multiple disks Processor level: operate on multiple instructions at once (e.g., pipelining, superscalar issue) Circuit level: operate on multiple bits at once (e.g., carry-lookahead ALU) Principle of locality Spatial and Temporal Locality 90% of program executing in 10% of code E.g. Caches Focus on the common case Amdahl s law, CPU Performance equation E.g. RISC design principle 5

Amdahl s Law Speedup due to an enhancement E: Time before Time after Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. What is the Execution time after and Speedup(S)? 6

Amdahl s Law 7

Amdahl s Law - Example Floating point instructions contribute 10% to the program execution time and are improved to run 2x faster. Q: What is the execution time and speedup after improvement? Answer: 8

Factors that affect CPU performance Instruction count () Compiler, ISA Cycles per instruction (CPI) ISA, microarchitecture What determines these factors? Clock time (1/f) Microarchitecture, technology 9

The CPU Performance Equation CPU time = x CPI x Clock time where: where: CPU time = execution time = number of instructions executed (instruction count) CPI = number of average clock cycles per instruction Clock time = duration of processor clock n ( CPU time = Σ ) i x CPI i x Clock time i=1 i = for instruction (instruction group) i CPI i = CPI for instruction (instruction group) i CPI = n ( Σ i x CPI i ) i=1 Σ ( ) = CPI i x i n i=1 10

Examples Example: Branch instructions take 2 cycles, all other instructions take 1 cycle CPU A uses extra compare instruction per branch Clock frequency of CPU A is 1.25 times faster than CPU B On CPU A, 20% of instructions are branches (thus other 20% are compare instructions) Find the relative performance of CPUs A and B CPI A = CPI branch x branch + CPI others x others = 2 x 0.2 + 1 x 0.8 = 1.2 CPI B = CPI branch x branch + CPI others x others = 2 x 0.25 + 1 x 0.75 = 1.25 CPU time B = B x CPI B x Clock time B = 0.8 x A x 1.25 x (1.25 x Clock time A ) = 1.25 x A x Clock time A CPU time A = A x CPI A x Clock time A = A x 1.2 x Clock time A CPU A is faster! 11

Improving CPU Performance : Compiler optimizations (constant folding, constant propagation) ISA (More complex instructions) CPI: Microarchitecture (Pipelining, Out-of-order execution, branch prediction) Compiler (Instruction scheduling) ISA (Simpler instructions) Clock period: Technology (Smaller transistors) ISA (Simple instructions that can be easily decoded) Microarchitecture (Simple architecture) 12