CS430 Computer Architecture

Similar documents
Designing for Performance. Patrick Happ Raul Feitosa

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

CPU Structure and Function

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng.

1.3 Data processing; data storage; data movement; and control.

Computer Architecture. Chapter 1 Part 2 Performance Measures

CMSC 411 Practice Exam 1 w/answers. 1. CPU performance Suppose we have the following instruction mix and clock cycles per instruction.

Performance. CS 3410 Computer System Organization & Programming. [K. Bala, A. Bracy, E. Sirer, and H. Weatherspoon]

Computer Architecture and Organization. Instruction Sets: Addressing Modes and Formats

Lecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533

Instructor Information

Vector and Parallel Processors. Amdahl's Law

Course web site: teaching/courses/car. Piazza discussion forum:

CPU Structure and Function

Chapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,

ECE 486/586. Computer Architecture. Lecture # 7

CS 152 Computer Architecture and Engineering

CS 152 Computer Architecture and Engineering

CS 352H Computer Systems Architecture Exam #1 - Prof. Keckler October 11, 2007

Chapter 4. The Processor

William Stallings Computer Organization and Architecture

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

CO Computer Architecture and Programming Languages CAPL. Lecture 15

CS 351 Final Review Quiz

CPE300: Digital System Architecture and Design

T T T T T T N T T T T T T T T N T T T T T T T T T N T T T T T T T T T T T N.

What is Good Performance. Benchmark at Home and Office. Benchmark at Home and Office. Program with 2 threads Home program.

ECE 341. Lecture # 15

Lecture 4: Instruction Set Architectures. Review: latency vs. throughput

Quiz for Chapter 1 Computer Abstractions and Technology 3.10

Chapter 14 - Processor Structure and Function

Do not open this exam until instructed to do so. CS/ECE 354 Final Exam May 19, CS Login: QUESTION MAXIMUM SCORE TOTAL 115

CSCE 5610: Computer Architecture

Defining Performance. Performance. Which airplane has the best performance? Boeing 777. Boeing 777. Boeing 747. Boeing 747

Main Points of the Computer Organization and System Software Module

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture

Computer Architecture CS372 Exam 3

15-740/ Computer Architecture Lecture 4: Pipelining. Prof. Onur Mutlu Carnegie Mellon University

Chapter 12. CPU Structure and Function. Yonsei University

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.

GRE Architecture Session

Addressing Modes. Immediate Direct Indirect Register Register Indirect Displacement (Indexed) Stack

CS654 Advanced Computer Architecture. Lec 2 - Introduction

CS Mid-Term Examination - Fall Solutions. Section A.

ECE 486/586. Computer Architecture. Lecture # 12

COSC3330 Computer Architecture Lecture 7. Datapath and Performance

EE 457 Unit 6a. Basic Pipelining Techniques

SUPERSCALAR AND VLIW PROCESSORS

CS 351 Final Exam Solutions

Multiple Issue ILP Processors. Summary of discussions

Instr. execution impl. view

Final Exam Fall 2007

Lecture Topics. Principle #1: Exploit Parallelism ECE 486/586. Computer Architecture. Lecture # 5. Key Principles of Computer Architecture

CS 2506 Computer Organization II

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr

Which is the best? Measuring & Improving Performance (if planes were computers...) An architecture example

Quiz for Chapter 1 Computer Abstractions and Technology

Instruction Pipelining Review

Parts A and B both refer to the C-code and 6-instruction processor equivalent assembly shown below:

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING QUESTION BANK

LRU. Pseudo LRU A B C D E F G H A B C D E F G H H H C. Copyright 2012, Elsevier Inc. All rights reserved.

Chapter 4. The Processor

Blog -

The Role of Performance

Module 4c: Pipelining

Defining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain.

Analysis Report. Number of Multiprocessors 3 Multiprocessor Clock Rate Concurrent Kernel Max IPC 6 Threads per Warp 32 Global Memory Bandwidth

ADVANCED COMPUTER ARCHITECTURES: Prof. C. SILVANO Written exam 11 July 2011

Understand the factors involved in instruction set

Lecture - 4. Measurement. Dr. Soner Onder CS 4431 Michigan Technological University 9/29/2009 1

Review: latency vs. throughput

Chapter 03. Authors: John Hennessy & David Patterson. Copyright 2011, Elsevier Inc. All rights Reserved. 1

University of Toronto Faculty of Applied Science and Engineering

Processor Structure and Function

UNIT V: CENTRAL PROCESSING UNIT

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Performance, Power, Die Yield. CS301 Prof Szajda

Cycles Per Instruction For This Microprocessor

Practice Assignment 1

Advanced processor designs

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

Instruction-set Design Issues: what is the ML instruction format(s) ML instruction Opcode Dest. Operand Source Operand 1...

CSE 141 Summer 2016 Homework 2

2. [3 marks] Show your work in the computation for the following questions involving CPI and performance.

RISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.

Computer Organization CS 206 T Lec# 2: Instruction Sets

PIPELINING: HAZARDS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

Final Exam Spring 2017

Processing Unit CS206T

Chapter 5. A Closer Look at Instruction Set Architectures

ECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.

CS 61C: Great Ideas in Computer Architecture Pipelining and Hazards

Processor (I) - datapath & control. Hwansoo Han

Chapter 4. The Processor

CS/COE1541: Introduction to Computer Architecture

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

CS 152, Spring 2011 Section 10

CS 2410 Mid term (fall 2015) Indicate which of the following statements is true and which is false.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Computer Organization and Architecture (CSCI-365) Sample Final Exam

Transcription:

CS430 Computer Architecture Spring 2015 Spring 2015 CS430 - Computer Architecture 1

Chapter 14 Processor Structure and Function Instruction Cycle from Chapter 3 Spring 2015 CS430 - Computer Architecture 2

Updated Instruction Cycle Give an x86 instruction that goes through as many stages as possible Spring 2015 CS430 - Computer Architecture 3

Instruction Pipelining Organization enhancements to the processor over time have greatly improved performance. Examples include: multiple general purpose registers versus a single accumulator multiple levels of cache next is pipelining Spring 2015 CS430 - Computer Architecture 4

Instruction Pipelining Pipelining is a technique that improves instruction throughput by executing as many instructions as possible in parallel. In particular, these instructions are in various "stages" of pipeline execution. Instruction pipelining is very much like the assembly line at a manufacturing plant where products in various stages of production can be worked on simultaneously. Spring 2015 CS430 - Computer Architecture 5

Instruction Pipelining We will break up instruction processing into the following six stages: 1. FI (fetch instruction) - get the next instruction 2. DI (decode instruction) - decode the opcode and operands 3. CO (calculate operands) - calculate EA of the operands 4. FO (fetch operands) - fetch operands from memory (not necessary for register data) 5. EI (execute instruction) - execute instruction storing result if necessary 6. WO (write operand) - write the result in MEM Spring 2015 CS430 - Computer Architecture 6

Instruction Pipelining In a perfect world, this would lead us to an architecture that has a CPU with a six stage pipeline where each stage takes an equal duration of time to execute. 1. Do the above pipeline stages take the same duration of time to complete their specific operation? If not, give two stages that can have different durations of time to complete their task. 2. Assuming no pipelining and each stage in the pipeline takes 1 unit of time to complete, how long would it take 5 instructions to complete? Spring 2015 CS430 - Computer Architecture 7

Instruction Pipelining Assuming a six stage pipeline and each stage in the pipeline takes 1 unit of time to complete, how long will it take 5 instructions to complete (assuming no delays)? Draw the timing diagram Spring 2015 CS430 - Computer Architecture 8

Performance Assessment We begin taking a look at some traditional measures of processor speed Then we look at assessing processor and computer system performance Spring 2015 CS430 - Computer Architecture 9

Clock Speed Clock Rate (or Clock Speed) the rate of pulses Clock Cycle (or Clock Tick) a single clock pulse Clock Cycle Time the time between ticks Spring 2015 CS430 - Computer Architecture 10

Instruction Execution Rate A processor is driven by a clock with a constant frequency, f, or equivalently, a constant cycle time τ, where τ = 1 f I c - instruction count is the number of machine instructions executed for a given program CPI cycles per instruction is the average cycles taken to execute an instruction Spring 2015 CS430 - Computer Architecture 11

Instruction Execution Rate Since each instruction type does not necessarily take the same number of cycles, CPI, is calculated as follows: CPI = n i=1 CPI i I i I c where CPI i is the number of cycles required for instruction type i I i is the number of executed instructions of type i I c is the total number of instructions executed The processor time, T, needed to execute a given program is T = I c CPI τ Spring 2015 CS430 - Computer Architecture 12

MIPS Rate The MIPS rate is the measurement expressing millions of instructions per second MIPS Rate = f CPI 10 6 Problem: A program results in executing 2 million instructions on a 1GHz processor. Given the following, compute the MIPS rate Instruction Type CPI Instruction Mix (%) Arithmetic & Logic 1 60 Load/Store w/cache hit 2 18 Branch 4 12 MEM reference w/cache miss 8 10 Spring 2015 CS430 - Computer Architecture 13

Speedup "Speedup" is a very simple, useful term that allows us a way to measure performance and execution times. Speedup = Time Old /Time(New) Example: A trip to downtown Portland takes 50 minutes by car and 30 minutes by MAX. What is the speedup? Answer: speedup=50/30=1.6666 Spring 2015 CS430 - Computer Architecture 14

Speedup New is faster than Old if Time(New) < Time(Old) New is N% faster than Old given Speedup = Time Old Time New = 1 + N% So, speedup of MAX over car is 1.6666; therefore, MAX is 66.66 faster than the car Notice also that Speedup = Speed(New)/Speed(Old) Spring 2015 CS430 - Computer Architecture 15

Amdahl s Law Given a job, some fraction of that job may be enhanced. In particular, if we can make some enhancement to a particular machine that improves performance, speedup will tell us how much faster the machine is with the enhancement versus the machine without the enhancement. Spring 2015 CS430 - Computer Architecture 16

Amdahl s Law Let's assume it takes us 1 hour to drive to Portland. From Portland, it takes us 3 hours by train to get to Seattle. What is the speedup if we fly instead and it takes 1 hour to fly to Seattle? Answer: speedup = Time(Old)/Time(New) = 4/2 = 2.0 How much faster is the trip if we fly to Seattle versus taking the train? Answer: Flying is 100% faster than taking the train. Spring 2015 CS430 - Computer Architecture 17

Fraction Enhanced Fraction enhanced is the fraction of the "original" time that can be enhanced. What is our fraction enhanced for our Trip to Seattle? Answer: 3 hours can be enhanced, so fraction enhanced is 3/4=75%. Spring 2015 CS430 - Computer Architecture 18

Fraction Enhanced Further examples to clarify the fraction enhanced include the following: 1. Speeding up the floating-point calculations doesn't speed up the integer calculations 2. If we are traveling to Eugene and speedup the highway driving somehow, that doesn't speedup the packing, walking to the car,... An interesting observation is that if we do the fraction enhanced at infinite speed, this gives us the maximum speedup for that enhancement What is the speedup of the six stage pipeline machine versus the machine with no pipelining? Spring 2015 CS430 - Computer Architecture 19