Cycle Time for Non-pipelined & Pipelined Processors


Cycle Time for Non-pipelined & Pipelined Processors

  Stage:    Fetch   Decode   Execute   Memory   Writeback
  Latency:  250 ps  350 ps   150 ps    300 ps   200 ps

For a non-pipelined processor, the clock cycle is the sum of the latencies of all the pipeline stages:

  Clock cycle = 250 + 350 + 150 + 300 + 200 = 1250 ps

For a pipelined processor, the clock cycle is the latency of the slowest pipeline stage:

  Clock cycle = 350 ps
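The two rules above are easy to check mechanically. A minimal sketch (Python used purely for illustration; the latencies are the ones from the table):

```python
# Stage latencies from the table above, in picoseconds.
stages = {"Fetch": 250, "Decode": 350, "Execute": 150, "Memory": 300, "Writeback": 200}

# Non-pipelined: one clock cycle must cover all five stages in sequence.
non_pipelined_cycle = sum(stages.values())

# Pipelined: the clock is limited by the slowest stage.
pipelined_cycle = max(stages.values())

print(non_pipelined_cycle, pipelined_cycle)  # 1250 350
```

Note that the best-case speedup, 1250/350 ≈ 3.6, is less than the 5x that five stages might suggest, because the stages are unbalanced.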

Control Hazard
- a conditional branch determines control flow
- fetching the next instruction depends on the branch outcome
- in the MIPS pipeline: compare the registers and compute the result early, by adding hardware to the ID/Reg stage
- is this enough?
CS 230 - Spring 2018

Stall on Branch
- stall until the branch outcome is known; resolving the branch early (in the ID stage) makes the wait shorter

Branch Prediction
- longer pipelines can't determine the outcome early, so the stall penalty becomes more significant
- predict the outcome of the branch; stall only if the prediction is wrong
- in MIPS: simple static prediction
  - predict branch not taken, fetch the next instruction with no delay
  - stall only when the branch is taken
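To see why "stall only when taken" helps, the effective CPI can be estimated with a back-of-the-envelope calculation (the branch frequency, taken rate, and 1-cycle penalty below are hypothetical figures, not from the slides):

```python
# With predict-not-taken, a stall is paid only when the branch is actually taken.
base_cpi = 1.0            # ideal pipelined CPI
branch_frequency = 0.20   # hypothetical: fraction of instructions that are branches
taken_rate = 0.60         # hypothetical: fraction of branches that are taken
taken_penalty = 1         # hypothetical: extra cycles when the prediction is wrong

effective_cpi = base_cpi + branch_frequency * taken_rate * taken_penalty
print(effective_cpi)  # 1.12
```

Stalling on every branch instead would cost branch_frequency * penalty = 0.20 extra cycles per instruction; predicting not-taken cuts that to 0.12 under these assumptions.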

Predict Not Taken
[pipeline diagrams: prediction correct vs. prediction incorrect]

More Realistic Branch Prediction
- static branch prediction: based on typical branch behaviour
  - loops: predict backward branches taken
  - if-statements: predict forward branches not taken
  - encoded in high-level software
- dynamic branch prediction: hardware measures actual branch behaviour
  - e.g., record the recent history of each branch
  - assume future behaviour will continue the trend
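One common way to "record recent history" is a 2-bit saturating counter per branch. A minimal sketch of that scheme (an illustration of the idea, not necessarily the exact hardware described in CS 230):

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.state = 2  # start weakly taken (an arbitrary initial choice)

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        # Move toward the observed outcome, saturating at 0 and 3.
        self.state = min(self.state + 1, 3) if taken else max(self.state - 1, 0)

# A loop branch: taken 9 times, then falls through once (typical loop behaviour).
p = TwoBitPredictor()
outcomes = [True] * 9 + [False]
correct = 0
for taken in outcomes:
    correct += (p.predict() == taken)
    p.update(taken)
print(correct, "/", len(outcomes))  # 9 / 10
```

The two-bit counter needs two consecutive mispredictions to flip its prediction, so a loop branch is mispredicted only once per loop exit rather than twice per iteration of the enclosing loop.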

Writing Code to Avoid Hazards


What if we can't forward?
- improve the speed from A by rearranging lines of code WITHOUT changing the final result of the program
- the add instruction is not affected by the slt, so moving the add earlier saves one stall
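The reordering idea can be sketched abstractly. In this toy model (hypothetical instruction sequence; the simplifying assumption is a 1-cycle stall whenever an instruction reads a register written by the immediately preceding instruction):

```python
# Each instruction: (mnemonic, destination register, source registers).
def count_stalls(program):
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev[1] in cur[2]:   # cur reads what prev writes: back-to-back dependence
            stalls += 1
    return stalls

original = [
    ("lw",  "$t0", ("$s0",)),
    ("slt", "$t1", ("$t0", "$s1")),   # needs $t0 immediately: 1 stall
    ("add", "$t2", ("$s2", "$s3")),   # independent of the slt
]
reordered = [
    ("lw",  "$t0", ("$s0",)),
    ("add", "$t2", ("$s2", "$s3")),   # moved up: fills the stall slot
    ("slt", "$t1", ("$t0", "$s1")),   # $t0 is ready by now
]
print(count_stalls(original), count_stalls(reordered))  # 1 0
```

The reordering preserves the program's result because the add reads and writes registers that the lw and slt do not touch.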

Pipeline Summary
- pipelining improves performance by increasing instruction throughput
  - multiple instructions are executed in parallel
  - the latency of each instruction is unchanged
- subject to hazards: structural, data, control
- the instruction set affects pipeline complexity

Memory Characteristics
- cost per unit of storage
- performance
  - access latency: how long does it take to make one access?
  - throughput: how many accesses can be done per unit of time?
- persistency: volatile vs. non-volatile
  - non-volatile memory holds its data through a power cycle: if you turn off your computer and turn it back on, the memory holds the same values it did before

Registers
- very expensive, thus limited in number
- accessed at instruction speed
- volatile

Main Memory
- cheap & large
- noticeable access latency: roughly a factor of 100 slower than registers
- volatile

Random Access Memory (RAM)
- array: index-based direct access via the memory bus
- fetching from memory:
  - the CPU sends an address to the memory controller
  - the controller responds with the data

Static vs. Dynamic RAM
- Static RAM (SRAM)
  - similar to the previously shown memory circuit
  - stable storage, as long as power is applied
  - multiple transistors per bit
  - expensive, but faster; used for caching
- Dynamic RAM (DRAM)
  - bit stored in a capacitor, needs refreshing
  - single transistor per bit
  - cheaper, but slower

Non-volatile Memory
- Magnetic disks
  - very cheap, slow, mechanical
  - used in desktops, servers, and (to a certain extent) laptops
- Flash memory
  - more expensive, fast, non-mechanical
  - limited number of writes
  - used in USB drives and smartphones

Memory Technology
typical performance and cost figures as of 2008:

  Technology   Access Time              $/GB
  SRAM         0.5-2.5 ns               $2000-$5000
  DRAM         50-70 ns                 $20-$75
  Disk         5,000,000-7,000,000 ns   $0.20-$2

- ideally: have large amounts of fast memory
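To make the trade-off concrete: the cost of building 8 GB out of each technology, using the cheap end of each 2008 price range (the 8 GB figure is just an illustration, not from the slides):

```python
# Cheap end of each 2008 $/GB range from the table above.
cost_per_gb = {"SRAM": 2000, "DRAM": 20, "Disk": 0.20}

size_gb = 8
costs = {tech: size_gb * price for tech, price in cost_per_gb.items()}
print(costs)  # SRAM: $16000, DRAM: $160, Disk: under $2
```

This is why no machine is built from a single technology: all-SRAM memory would be unaffordable, while all-disk memory would be unusably slow.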

Memory Hierarchy
[diagram of the memory hierarchy levels]

Memory Hierarchy
- going down the hierarchy, memory gets cheaper, bigger, slower, and further away
- one might consider the lowest level as backing storage and everything else as a cache... different perspectives exist

Memory Stalls
- delay until the response arrives from the memory controller
- what does the CPU do in the meantime? compute something else? details later...
- memory compromise: fast vs. cheap; satisfy as many requests as fast as possible

Locality Principle
- temporal locality: the same data item is likely to be used again soon
  - e.g., a loop
- spatial locality: data items close to a recently used one are likely to be used soon
  - e.g., iterating through an array
- example: books in the library vs. books on your desk
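The payoff of spatial locality can be seen with a toy cache model (all parameters hypothetical: a direct-mapped cache of 4 blocks, 4 words per block). A sequential sweep through an array hits often because each miss brings in a whole block; a large-stride sweep keeps landing on the same cache index and misses every time:

```python
# Toy direct-mapped cache: 4 blocks, 4 words per block (hypothetical sizes).
NUM_BLOCKS, WORDS_PER_BLOCK = 4, 4

def hit_rate(addresses):
    cache = [None] * NUM_BLOCKS              # block number stored per cache slot
    hits = 0
    for addr in addresses:
        block = addr // WORDS_PER_BLOCK      # which memory block the word falls in
        index = block % NUM_BLOCKS           # direct-mapped placement
        if cache[index] == block:
            hits += 1                        # block still resident: hit
        else:
            cache[index] = block             # miss: fetch the whole block
    return hits / len(addresses)

sequential = list(range(64))                 # walk an array word by word
strided = [i * 16 % 64 for i in range(64)]   # jump 16 words each access

print(hit_rate(sequential))  # 0.75: 3 of every 4 accesses hit
print(hit_rate(strided))     # 0.0: every access conflicts on index 0
```

Caches exploit exactly this: spatial locality makes the sequential sweep cheap, and temporal locality (revisiting the same block soon) would raise the hit rate further.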