CS654 Advanced Computer Architecture. Lec 2 - Introduction

CS654 Advanced Computer Architecture
Lec 2 - Introduction
Peter Kemper
Adapted from the slides of EECS 252 by Prof. David Patterson, Electrical Engineering and Computer Sciences, University of California, Berkeley

Outline
- Computer Science at a Crossroads
- Computer Architecture v. Instruction Set Arch.
- What Computer Architecture brings to the table
- Technology Trends

1/23/09 CS654 W&M

What Computer Architecture Brings to the Table
- Other fields often borrow ideas from architecture
- Quantitative Principles of Design
  1. Take Advantage of Parallelism
  2. Principle of Locality
  3. Focus on the Common Case
  4. Amdahl's Law
  5. The Processor Performance Equation
- Careful, quantitative comparisons
  - Define, quantify, and summarize relative performance
  - Define and quantify relative cost
  - Define and quantify dependability
  - Define and quantify power
- Culture of anticipating and exploiting advances in technology
- Culture of well-defined interfaces that are carefully implemented and thoroughly checked

1) Taking Advantage of Parallelism
- Increasing the throughput of a server computer via multiple processors or multiple disks
- Detailed HW design
  - Carry-lookahead adders use parallelism to speed up computing sums from linear to logarithmic in the number of bits per operand
  - Multiple memory banks are searched in parallel in set-associative caches
- Pipelining: overlap instruction execution to reduce the total time to complete an instruction sequence
  - Not every instruction depends on its immediate predecessor, so executing instructions completely or partially in parallel is possible
  - Classic 5-stage pipeline: 1) Instruction Fetch, 2) Register Read, 3) Execute, 4) Data Memory Access (Dmem), 5) Register Write

Pipelined Instruction Execution
[Figure: pipeline timing diagram. Horizontal axis: time in clock cycles (Cycle 1 to Cycle 7); vertical axis: instruction order.]
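The overlap that the timing diagram illustrates can be sketched in a few lines of Python. This is a minimal illustration, assuming an ideal pipeline with no hazards or stalls in which each instruction enters one cycle after its predecessor. The stage labels in `STAGES` and the function names are hypothetical choices for this sketch, not taken from the slides.

```python
# Illustrative labels for the classic 5-stage pipeline (assumed names).
STAGES = ["IF", "RR", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; row[c] is the stage occupied in cycle c + 1.

    Assumes an ideal pipeline: no stalls, one new instruction per cycle.
    """
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i occupies stage s in cycle i + s + 1
        rows.append(row)
    return rows


def ideal_cycles(num_instructions, num_stages, pipelined=True):
    """Cycle count for a sequence, with or without overlapped execution."""
    if pipelined:
        return num_instructions + num_stages - 1
    return num_instructions * num_stages


if __name__ == "__main__":
    for i, row in enumerate(pipeline_diagram(3)):
        print(f"instr {i + 1}: " + " ".join(f"{stage:>3}" for stage in row))
    print("pipelined cycles:  ", ideal_cycles(3, 5))
    print("unpipelined cycles:", ideal_cycles(3, 5, pipelined=False))
```

Under these assumptions, three instructions finish in 7 cycles instead of 15. Each instruction still takes five cycles, so pipelining improves throughput rather than the latency of a single instruction.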

Limits to Pipelining
- Hazards prevent the next instruction from executing during its designated clock cycle
  - Structural hazards: attempt to use the same hardware to do two different things at once
  - Data hazards: an instruction depends on the result of a prior instruction still in the pipeline
  - Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)
[Figure: pipeline timing diagram. Horizontal axis: time (clock cycles); vertical axis: instruction order.]
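A data hazard of the read-after-write kind can be spotted by comparing each instruction's destination register with the source registers of the instructions that follow it closely. The sketch below is only an illustration: the `Instr` record, the register names, and the `window` parameter (how many following instructions are still "in the pipeline") are assumptions made for this example. Whether a detected dependence actually stalls a real pipeline depends on details such as forwarding, which are not modeled here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: text, destination register, source registers."""
    text: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def raw_hazards(program: List[Instr], window: int = 2) -> List[Tuple[int, int, str]]:
    """Return (producer, consumer, register) for each read-after-write dependence
    where the consumer follows the producer by at most `window` instructions."""
    hazards = []
    for i, producer in enumerate(program):
        if producer.dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            consumer = program[j]
            if producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
            if consumer.dest == producer.dest:
                break  # register is overwritten; later reads see the newer value
    return hazards


if __name__ == "__main__":
    program = [
        Instr("add r1, r2, r3", "r1", ("r2", "r3")),
        Instr("sub r4, r1, r5", "r4", ("r1", "r5")),
        Instr("and r6, r1, r7", "r6", ("r1", "r7")),
        Instr("or  r8, r9, r10", "r8", ("r9", "r10")),
    ]
    for i, j, reg in raw_hazards(program):
        print(f"'{program[j].text}' reads {reg} written by '{program[i].text}'")
```

Here the `sub` and `and` instructions both read `r1` while the `add` that produces it may still be in flight, so both are reported. The `or` instruction is independent and can proceed.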

2) The Principle of Locality
- The Principle of Locality: programs access a relatively small portion of the address space at any instant of time.
- Two different types of locality:
  - Temporal Locality (Locality in Time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
  - Spatial Locality (Locality in Space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
- For the last 30 years, hardware has relied on locality for memory performance
[Figure: processor (P), cache ($), memory (MEM).]
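The effect of locality on a cache can be made concrete with a small simulation. The sketch below models a direct-mapped cache and compares three access patterns. The cache geometry (64 blocks of 16 bytes), the address ranges, and the function name `hit_rate` are arbitrary assumptions chosen for illustration, not parameters from the lecture.

```python
import random


def hit_rate(addresses, num_blocks=64, block_size=16):
    """Simulate a direct-mapped cache; return the fraction of accesses that hit."""
    stored = [None] * num_blocks          # block number currently held in each line
    hits = 0
    for addr in addresses:
        block = addr // block_size        # which memory block the byte belongs to
        index = block % num_blocks        # the single line this block may occupy
        if stored[index] == block:
            hits += 1
        else:
            stored[index] = block         # miss: fetch the whole block
    return hits / len(addresses)


if __name__ == "__main__":
    # Spatial locality: walk an array byte by byte.
    sequential = list(range(4096))
    # Temporal locality: loop over the same 512 bytes eight times.
    looping = list(range(512)) * 8
    # Little locality: addresses drawn uniformly from 1 MiB (seeded for repeatability).
    rng = random.Random(0)
    scattered = [rng.randrange(1 << 20) for _ in range(4096)]

    print("sequential:", hit_rate(sequential))
    print("looping:   ", hit_rate(looping))
    print("scattered: ", hit_rate(scattered))
```

In the sequential walk, every miss brings in 16 neighbouring bytes, so 15 of every 16 accesses hit. In the loop, the working set fits in the cache and only the first pass misses. The scattered pattern gets almost no benefit from the cache.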

Levels of the Memory Hierarchy

Level             Capacity           Access time                Cost
CPU Registers     100s Bytes         300-500 ps (0.3-0.5 ns)    (not given)
L1 and L2 Cache   10s-100s KBytes    ~1 ns to ~10 ns            $1000s/GByte
Main Memory       GBytes             80 ns to 200 ns            ~$100/GByte
Disk              10s TBytes         10 ms (10,000,000 ns)      ~$1/GByte
Tape              infinite           sec-min                    ~$1/GByte

Staging and transfer units between adjacent levels:

Between                  Unit transferred   Managed by       Transfer size
Registers and L1 Cache   Instr. operands    prog./compiler   1-8 bytes
L1 Cache and L2 Cache    Blocks             cache cntl       32-64 bytes
L2 Cache and Memory      Blocks             cache cntl       64-128 bytes
Memory and Disk          Pages              OS               4K-8K bytes
Disk and Tape            Files              user/operator    MBytes

Upper levels are faster; lower levels are larger.

3) Focus on the Common Case
- Common sense guides computer design
  - Since it's engineering, common sense is valuable
- In making a design trade-off, favor the frequent case over the infrequent case
  - E.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first
  - E.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first
- The frequent case is often simpler and can be done faster than the infrequent case
  - E.g., overflow is rare when adding two numbers, so improve performance by optimizing the more common case of no overflow
  - This may slow down overflow handling, but overall performance is improved by optimizing for the normal case
- What is the frequent case, and how much is performance improved by making that case faster? => Amdahl's Law

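4) Amdahl's Law

\[
\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ \left(1 - \text{Fraction}_{\text{enhanced}}\right) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]
\]

\[
\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{\left(1 - \text{Fraction}_{\text{enhanced}}\right) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}
\]

Best you could ever hope to do:

\[
\text{Speedup}_{\text{maximum}} = \frac{1}{1 - \text{Fraction}_{\text{enhanced}}}
\]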

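Amdahl's Law Example
- New CPU is 10X faster
- The server is I/O bound, so 60% of the time is spent waiting for I/O
- Only the remaining 40% of the time (the CPU portion) is sped up, so the enhanced fraction is 0.4 and its speedup is 10:

\[
\text{Speedup}_{\text{overall}} = \frac{1}{\left(1 - \text{Fraction}_{\text{enhanced}}\right) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \frac{1}{0.64} \approx 1.56
\]

- Apparently, it's human nature to be attracted by "10X faster" rather than keeping in perspective that the system is just 1.6X faster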
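The arithmetic of this example is easy to check in code. The following sketch implements Amdahl's Law directly; the function and parameter names are hypothetical choices for illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the original execution time
    is accelerated by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def amdahl_limit(fraction_enhanced):
    """Upper bound on overall speedup as the enhancement becomes infinitely fast."""
    if not 0.0 <= fraction_enhanced < 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1)")
    return 1.0 / (1.0 - fraction_enhanced)


if __name__ == "__main__":
    # The slide's example: 40% of the time is CPU work, and the CPU gets 10X faster.
    print(f"10X CPU:   {amdahl_speedup(0.4, 10):.2f}")
    # Even a far faster CPU cannot beat the bound set by the 60% I/O wait.
    print(f"1000X CPU: {amdahl_speedup(0.4, 1000):.2f}")
    print(f"limit:     {amdahl_limit(0.4):.2f}")
```

With 60% of the time spent on I/O, no CPU improvement can push the overall speedup past 1/0.6, about 1.67. This is why the 10X figure is misleading on its own.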

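5) Processor Performance Equation

\[
\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \underbrace{\frac{\text{Instructions}}{\text{Program}}}_{\text{inst count}} \times \underbrace{\frac{\text{Cycles}}{\text{Instruction}}}_{\text{CPI}} \times \underbrace{\frac{\text{Seconds}}{\text{Cycle}}}_{\text{cycle time}}
\]

What affects each term:

                Inst Count    CPI    Clock Rate
Program         X
Compiler        X             (X)
Inst. Set       X             X
Organization                  X      X
Technology                           X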
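The equation can be exercised with a short sketch. The numbers below are made up for illustration and do not come from the lecture. The function name and its use of clock rate in hertz (the reciprocal of cycle time) are likewise assumptions of this example.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instructions x cycles/instruction x seconds/cycle.

    Seconds per cycle is the reciprocal of the clock rate.
    """
    if clock_rate_hz <= 0:
        raise ValueError("clock_rate_hz must be positive")
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical baseline: 1 billion instructions, CPI of 2.0, 2 GHz clock.
    base = cpu_time(1e9, 2.0, 2e9)
    # Hypothetical organization change that lowers CPI to 1.5 at the same clock.
    better_cpi = cpu_time(1e9, 1.5, 2e9)
    # Hypothetical compiler change that removes 10% of the instructions.
    fewer_instr = cpu_time(0.9e9, 2.0, 2e9)

    print(f"baseline:           {base:.3f} s")
    print(f"lower CPI:          {better_cpi:.3f} s")
    print(f"fewer instructions: {fewer_instr:.3f} s")
```

Because the three factors multiply, an improvement in one term only helps if it does not worsen another. For example, an instruction set change that lowers instruction count may raise CPI.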

At This Point
- Computer Architecture >> instruction sets
- Computer Architecture skill sets are different
  - 5 quantitative principles of design
  - Quantitative approach to design
  - Solid interfaces that really work
  - Technology tracking and anticipation
- Computer Science is at the crossroads from sequential to parallel computing
  - Salvation requires innovation in many fields, including computer architecture
- However, for CS654, we have to go through the state of the art first
  - Material: read Chapter 1, then Appendix A in Hennessy/Patterson