ECE 2300 Digital Logic & Computer Organization. Caches

Size: px
Start display at page:

Download "ECE 2300 Digital Logic & Computer Organization. Caches"

Transcription

1 ECE 23 Digital Logic & Computer Organization Spring 217 s Lecture 2: 1

2 Announcements HW7 will be posted tonight Lab sessions resume next week Lecture 2: 2

3 Course Content Binary numbers and logic gates Boolean algebra and combinational logic Sequential logic and state machines Binary arithmetic Memories Instruction set architecture organization s and virtual memory Input/output Advanced topics Lecture 2: 3

4 Review: Pipelined Microprocessor PCJ CU sign bit =? MB, F MW, MD LD M U X PCJ P C PCL +2 Inst RAM Decoder Adder LD SA SB DR RF D_in M U X M U X M U X M U X M U X MB F m F ALU Data RAM D_IN MW M U X MD SE IF/ID ID/EX EX/MEM MEM/WB Lecture 2: 4

5 Example: Data Hazards w/ Forwarding Assume HW forwarding and NO delay slot for load Identify all data hazards in the following instruction sequences by circling each source register that is read before the updated value is written back LW, () ADDI,, 1 BEQ,, X SW, 4() X: Lecture 2: 5

6 We Need Fast and Large IF ID EX MEM WB Instruction RAM Reg A L U Data RAM Reg IF/ID ID/EX EX/MEM MEM/WB cycle time: ~3ps-2ns (~3GHz-5MHz) DRAM Slow (1-5 ns for a read or write) Cheap (1 transistor + capacitor per bit cell) SRAM Fast ( s of ps to few ns for a read/write) Expensive (6 transistors per bit cell) Lecture 2: 6

7 Using s in the Pipeline IF ID EX MEM WB Instruction (SRAM) Reg A L U Data (SRAM) Reg IF/ID ID/EX EX/MEM MEM/WB Main (DRAM) Lecture 2: 7

8 Small SRAM memory that permits rapid access to a subset of instructions or data If the data is in the cache (cache hit), we retrieve it without slowing down the pipeline If the data is not in the cache (cache miss), we retrieve it from the main memory (penalty incurred in accessing DRAM) The hit rate is the fraction of memory accesses found in the cache The miss rate is (1 hit rate) Lecture 2: 8

9 Access with Average memory access time with cache: Hit time + Miss rate * Miss penalty Example Main memory access time = 5ns hit time = 2ns Miss rate = 1% Average mem access time w/o cache = 5ns Average mem access time w/ cache = 2 +.1*5 = 7ns Lecture 2: 9

10 Why s Work: Principle of Locality Temporal locality If memory location X is accessed, then it is likely to be accessed again in the near future s exploit temporal locality by keeping a referenced instruction or data in the cache Spatial locality If memory location X is accessed, then locations near X are likely to be accessed in the near future s exploit spatial locality by bringing in a block of instructions or data into the cache on a miss Lecture 2: 1

11 Some Important Terms is partitioned into blocks Each cache block (or cache line) typically contains multiple bytes of data A whole block is read or written during data transfer between cache and main memory Each cache block is associated with a tag and a valid bit Tag: A unique ID to differentiate between different memory blocks may be mapped into the same block Valid bit: indicates whether the data in a cache block is valid (1) or not () Lecture 2: 11

12 Direct Mapped Concepts A given memory block is mapped to one and only one cache block Block Block, 8, 16, , 9, 17, , 1, 18, , 11, 19, , 12, 2, , 13, 21, , 14, 22, 3 7 7, 15, 23, 31 Example: A cache with 8 blocks Block addresses in decimal Here we assume the main memory is 4 times larger than cache Lecture 2: 12

13 Direct Mapped (DM) Concepts Same example: has 8 blocks and main memory has 32 blocks Block addresses are in binary 4 different memory blocks may be mapped to the same cache location 11 Lecture 2: 13

14 DM Organization Lecture 2: 14

15 Address Translation for DM Breakdown of memory address for cache use n-i-b tag bits i index bits b byte offset bits DM cache parameters Size of each cache block is 2 b bytes cache block and cache line are synonymous Number of blocks is 2 i Total cache size is 2 b 2 i = 2 b+i bytes Lecture 2: 15

16 Use the index bits to retrieve the tag, data, and valid bit Reading DM Compare the tag from the address with the retrieved tag If valid & a match in tag (hit), select the desired data using the byte offset Otherwise (miss) Bring the memory block into the cache (also set valid) Store the tag from the address with the block Select the desired data using the byte offset Lecture 2: 16

17 Use the index bits to retrieve the tag and valid bit Writing DM Compare the tag from the address with the retrieved tag Data If valid & a match in tag (hit), write the data into the cache location Otherwise (miss), one option Bring the memory block into the cache (also set valid) Store the tag from the address with the block Write the data into the cache location Lecture 2: 17

18 Direct Mapped Example Size of each block is 4 bytes holds 4 blocks holds 16 blocks address has 6 bits 2 tag bits 2 index bits 2 byte offset bits Lecture 2: 18

19 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss block address (binary) Data (decimal) Lecture 2: 19

20 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 2

21 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 21

22 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 22

23 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 23

24 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R 14 miss Lecture 2: 24

25 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R 14 miss Lecture 2: 25

26 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 26

27 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 27

28 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 28

29 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R hit Lecture 2: 29

30 Direct Mapped Example <= M[] <= M[] <= M[11] <= M[] R hit Lecture 2: 3

31 Doubling the Block Size Size of each block is 8 bytes holds 2 blocks holds 8 blocks address has 6 bits 2 tag bits 3 byte offset bits 1 V tag data 1 index bit Lecture 2: 31

32 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R miss 1 V tag data Lecture 2: 32

33 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 33

34 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R hit Lecture 2: 34

35 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R hit Lecture 2: 35

36 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 36

37 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R 14 miss Lecture 2: 37

38 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R 14 miss Lecture 2: 38

39 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 39

40 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 4

41 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R miss Lecture 2: 41

42 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R hit Lecture 2: 42

43 Doubling the Block Size <= M[] <= M[] <= M[11] <= M[] R hit Lecture 2: 43

44 Block Size Considerations Larger blocks may reduce miss rate due to spatial locality But in a fixed-sized cache Larger blocks => fewer of them => increased miss rate due to conflicts Larger blocks => data fetched along with the requested data may not be used Larger blocks increase the miss penalty Takes longer to transfer a larger block from memory Lecture 2: 44

45 Next Time More s Lecture 2: 45

ECE 2300 Digital Logic & Computer Organization. More Caches Measuring Performance

ECE 2300 Digital Logic & Computer Organization. More Caches Measuring Performance ECE 23 Digital Logic & Computer Organization Spring 28 More s Measuring Performance Announcements HW7 due tomorrow :59pm Prelab 5(c) due Saturday 3pm Lab 6 (last one) released HW8 (last one) to be released

More information

Computer Architecture CS372 Exam 3

Computer Architecture CS372 Exam 3 Name: Computer Architecture CS372 Exam 3 This exam has 7 pages. Please make sure you have all of them. Write your name on this page and initials on every other page now. You may only use the green card

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 27: Midterm2 review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Midterm 2 Review Midterm will cover Section 1.6: Processor

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 02: Introduction II Shuai Wang Department of Computer Science and Technology Nanjing University Pipeline Hazards Major hurdle to pipelining: hazards prevent the

More information

CPU Pipelining Issues

CPU Pipelining Issues CPU Pipelining Issues What have you been beating your head against? This pipe stuff makes my head hurt! L17 Pipeline Issues & Memory 1 Pipelining Improve performance by increasing instruction throughput

More information

ECE 2300 Digital Logic & Computer Organization. More Single Cycle Microprocessor

ECE 2300 Digital Logic & Computer Organization. More Single Cycle Microprocessor ECE 23 Digital Logic & Computer Organization Spring 28 More Single Cycle Microprocessor Lecture 6: HW6 due tomorrow Announcements Prelim 2: Tues April 7, 7:3pm, Phillips Hall Coverage: Lectures 8~6 Inform

More information

Perfect Student CS 343 Final Exam May 19, 2011 Student ID: 9999 Exam ID: 9636 Instructions Use pencil, if you have one. For multiple choice

Perfect Student CS 343 Final Exam May 19, 2011 Student ID: 9999 Exam ID: 9636 Instructions Use pencil, if you have one. For multiple choice Instructions Page 1 of 7 Use pencil, if you have one. For multiple choice questions, circle the letter of the one best choice unless the question specifically says to select all correct choices. There

More information

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count

More information

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 Pipelining, Instruction Level Parallelism and Memory in Processors Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 NOTE: The material for this lecture was taken from several

More information

CMSC411 Fall 2013 Midterm 1

CMSC411 Fall 2013 Midterm 1 CMSC411 Fall 2013 Midterm 1 Name: Instructions You have 75 minutes to take this exam. There are 100 points in this exam, so spend about 45 seconds per point. You do not need to provide a number if you

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 251, Winter 2018, Assignment 5.0.4 3% of course mark Due Wednesday, March 21st, 4:30PM Lates accepted until 10:00am March 22nd with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

Memory hierarchy Outline

Memory hierarchy Outline Memory hierarchy Outline Performance impact Principles of memory hierarchy Memory technology and basics 2 Page 1 Performance impact Memory references of a program typically determine the ultimate performance

More information

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Caches (Writing) P & H Chapter 5.2 3, 5.5 Hakim Weatherspoon CS 34, Spring 23 Computer Science Cornell University Welcome back from Spring Break! Welcome back from Spring Break! Big Picture: Memory Code

More information

Question 1: (20 points) For this question, refer to the following pipeline architecture.

Question 1: (20 points) For this question, refer to the following pipeline architecture. This is the Mid Term exam given in Fall 2018. Note that Question 2(a) was a homework problem this term (was not a homework problem in Fall 2018). Also, Questions 6, 7 and half of 5 are from Chapter 5,

More information

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Caches (Writing) P & H Chapter 5.2 3, 5.5 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Memory Code Stored in Memory (also, data and stack) memory PC +4 new pc

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

ECE 2300 Digital Logic & Computer Organization

ECE 2300 Digital Logic & Computer Organization ECE 2300 Digital Logic & Computer Organization Spring 201 Memories Lecture 14: 1 Announcements HW6 will be posted tonight Lab 4b next week: Debug your design before the in-lab exercise Lecture 14: 2 Review:

More information

LECTURE 11. Memory Hierarchy

LECTURE 11. Memory Hierarchy LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed

More information

ECE154A Introduction to Computer Architecture. Homework 4 solution

ECE154A Introduction to Computer Architecture. Homework 4 solution ECE154A Introduction to Computer Architecture Homework 4 solution 4.16.1 According to Figure 4.65 on the textbook, each register located between two pipeline stages keeps data shown below. Register IF/ID

More information

Comprehensive Exams COMPUTER ARCHITECTURE. Spring April 3, 2006

Comprehensive Exams COMPUTER ARCHITECTURE. Spring April 3, 2006 Comprehensive Exams COMPUTER ARCHITECTURE Spring 2006 April 3, 2006 ID Number 1 /15 2 /20 3 /20 4 /20 Total /75 Problem 1. ( 15 points) Logic Design: A three-input switching function is expressed as f(a,

More information

Chapter 5. Memory Technology

Chapter 5. Memory Technology Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

ECE 331 Hardware Organization and Design. UMass ECE Discussion 10 4/5/2018

ECE 331 Hardware Organization and Design. UMass ECE Discussion 10 4/5/2018 ECE 331 Hardware Organization and Design UMass ECE Discussion 10 4/5/2018 Today s Discussion Topics Direct and Set Associative Cache Midterm Review Hazards Code reordering and forwarding Direct Mapped

More information

ECE 2300 Digital Logic & Computer Organization. Single Cycle Microprocessor

ECE 2300 Digital Logic & Computer Organization. Single Cycle Microprocessor ECE 23 Digital Logic & Computer Organization Spring 28 Single Cycle Microprocessor Lecture 5: Announcements HW 6 released (with a practice question) Lab 5 prelab (A) due tomorrow Lecture 5: 2 SRAM vs.

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

Lecture 29 Review" CPU time: the best metric" Be sure you understand CC, clock period" Common (and good) performance metrics"

Lecture 29 Review CPU time: the best metric Be sure you understand CC, clock period Common (and good) performance metrics Be sure you understand CC, clock period Lecture 29 Review Suggested reading: Everything Q1: D[8] = D[8] + RF[1] + RF[4] I[15]: Add R2, R1, R4 RF[1] = 4 I[16]: MOV R3, 8 RF[4] = 5 I[17]: Add R2, R2, R3

More information

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard

Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Data Hazards Compiler Scheduling Pipeline scheduling or instruction scheduling: Compiler generates code to eliminate hazard Consider: a = b + c; d = e - f; Assume loads have a latency of one clock cycle:

More information

Lecture 9 Pipeline and Cache

Lecture 9 Pipeline and Cache Lecture 9 Pipeline and Cache Peng Liu liupeng@zju.edu.cn 1 What makes it easy Pipelining Review all instructions are the same length just a few instruction formats memory operands appear only in loads

More information

EECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141

EECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141 EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role

More information

DLX Unpipelined Implementation

DLX Unpipelined Implementation LECTURE - 06 DLX Unpipelined Implementation Five cycles: IF, ID, EX, MEM, WB Branch and store instructions: 4 cycles only What is the CPI? F branch 0.12, F store 0.05 CPI0.1740.83550.174.83 Further reduction

More information

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts Prof. Sherief Reda School of Engineering Brown University S. Reda EN2910A FALL'15 1 Classical concepts (prerequisite) 1. Instruction

More information

1. Truthiness /8. 2. Branch prediction /5. 3. Choices, choices /6. 5. Pipeline diagrams / Multi-cycle datapath performance /11

1. Truthiness /8. 2. Branch prediction /5. 3. Choices, choices /6. 5. Pipeline diagrams / Multi-cycle datapath performance /11 The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 ANSWER KEY November 23 rd, 2010 Name: University of Michigan uniqname: (NOT your student ID

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering ECE260: Fundamentals of Computer Engineering Pipelined Datapath and Control James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania ECE260: Fundamentals of Computer Engineering

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

CS/CoE 1541 Mid Term Exam (Fall 2018).

CS/CoE 1541 Mid Term Exam (Fall 2018). CS/CoE 1541 Mid Term Exam (Fall 2018). Name: Question 1: (6+3+3+4+4=20 points) For this question, refer to the following pipeline architecture. a) Consider the execution of the following code (5 instructions)

More information

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Pipeline Hazards. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Pipeline Hazards Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Hazards What are hazards? Situations that prevent starting the next instruction

More information

The University of Adelaide, School of Computer Science 13 September 2018

The University of Adelaide, School of Computer Science 13 September 2018 Computer Architecture A Quantitative Approach, Sixth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per

More information

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] CSF Cache Introduction [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user with as much

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.

Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions. Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions Stage Instruction Fetch Instruction Decode Execution / Effective addr Memory access Write-back Abbreviation

More information

The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011

The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011 1. Performance Principles [5 pts] The University of Michigan - Department of EECS EECS 370 Introduction to Computer Architecture Midterm Exam 2 solutions April 5, 2011 For each of the following comparisons,

More information

CS/CoE 1541 Exam 1 (Spring 2019).

CS/CoE 1541 Exam 1 (Spring 2019). CS/CoE 1541 Exam 1 (Spring 2019). Name: Question 1 (8+2+2+3=15 points): In this problem, consider the execution of the following code segment on a 5-stage pipeline with forwarding/stalling hardware and

More information

Announcement. Computer Architecture (CSC-3501) Lecture 20 (08 April 2008) Chapter 6 Objectives. 6.1 Introduction. 6.

Announcement. Computer Architecture (CSC-3501) Lecture 20 (08 April 2008) Chapter 6 Objectives. 6.1 Introduction. 6. Announcement Computer Architecture (CSC-350) Lecture 0 (08 April 008) Seung-Jong Park (Jay) http://www.csc.lsu.edu/~sjpark Chapter 6 Objectives 6. Introduction Master the concepts of hierarchical memory

More information

Designing a Pipelined CPU

Designing a Pipelined CPU Designing a Pipelined CPU Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter, adapted by Janet Davis, are licensed under a Creative Commons Attribution- NonCommercial-ShareAlike

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 35: Final Exam Review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Material from Earlier in the Semester Throughput and latency

More information

Full Datapath. Chapter 4 The Processor 2

Full Datapath. Chapter 4 The Processor 2 Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory

More information

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4

Midnight Laundry. IC220 Set #19: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Return to Chapter 4 IC220 Set #9: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Return to Chapter 4 Midnight Laundry Task order A B C D 6 PM 7 8 9 0 2 2 AM 2 Smarty Laundry Task order A B C D 6 PM

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

CS 2506 Computer Organization II Test 2

CS 2506 Computer Organization II Test 2 Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted one-page formula sheet. No calculators or other computing devices may

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

Memory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy

Memory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy ENG338 Computer Organization and Architecture Part II Winter 217 S. Areibi School of Engineering University of Guelph Hierarchy Topics Hierarchy Locality Motivation Principles Elements of Design: Addresses

More information

ECE 2300 Digital Logic & Computer Organization. More Caches

ECE 2300 Digital Logic & Computer Organization. More Caches ECE 23 Digital Logic & Computer Organization Spring 218 More Caches 1 Announcements Prelim 2 stats High: 79.5 (out of 8), Mean: 65.9, Median: 68 Prelab 5(C) deadline extended to Saturday 3pm No further

More information

Computer Systems Architecture Spring 2016

Computer Systems Architecture Spring 2016 Computer Systems Architecture Spring 2016 Lecture 01: Introduction Shuai Wang Department of Computer Science and Technology Nanjing University [Adapted from Computer Architecture: A Quantitative Approach,

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten

More information

COMP2611: Computer Organization. The Pipelined Processor

COMP2611: Computer Organization. The Pipelined Processor COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among

More information

The Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

The Processor (3) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University The Processor (3) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

CSE 2021: Computer Organization

CSE 2021: Computer Organization CSE 2021: Computer Organization Lecture-12a Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB

More information

lecture 18 cache 2 TLB miss TLB - TLB (hit and miss) - instruction or data cache - cache (hit and miss)

lecture 18 cache 2 TLB miss TLB - TLB (hit and miss) - instruction or data cache - cache (hit and miss) lecture 18 2 virtual physical virtual physical - TLB ( and ) - instruction or data - ( and ) Wed. March 16, 2016 Last lecture I discussed the TLB and how virtual es are translated to physical es. I only

More information

Processor (II) - pipelining. Hwansoo Han

Processor (II) - pipelining. Hwansoo Han Processor (II) - pipelining Hwansoo Han Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 =2.3 Non-stop: 2n/0.5n + 1.5 4 = number

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27)

Lecture Topics. Announcements. Today: Data and Control Hazards (P&H ) Next: continued. Exam #1 returned. Milestone #5 (due 2/27) Lecture Topics Today: Data and Control Hazards (P&H 4.7-4.8) Next: continued 1 Announcements Exam #1 returned Milestone #5 (due 2/27) Milestone #6 (due 3/13) 2 1 Review: Pipelined Implementations Pipelining

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs.

CENG 3531 Computer Architecture Spring a. T / F A processor can have different CPIs for different programs. Exam 2 April 12, 2012 You have 80 minutes to complete the exam. Please write your answers clearly and legibly on this exam paper. GRADE: Name. Class ID. 1. (22 pts) Circle the selected answer for T/F and

More information

CSE 2021: Computer Organization

CSE 2021: Computer Organization CSE 2021: Computer Organization Lecture-12 Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB

More information

Basic Memory Hierarchy Principles. Appendix C (Not all will be covered by the lecture; studying the textbook is recommended!)

Basic Memory Hierarchy Principles. Appendix C (Not all will be covered by the lecture; studying the textbook is recommended!) Basic Memory Hierarchy Principles Appendix C (Not all will be covered by the lecture; studying the textbook is recommended!) Cache memory idea Use a small faster memory, a cache memory, to store recently

More information

Memory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky

Memory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky Memory Hierarchy, Fully Associative Caches Instructor: Nick Riasanovsky Review Hazards reduce effectiveness of pipelining Cause stalls/bubbles Structural Hazards Conflict in use of datapath component Data

More information

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten

More information

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory Memory systems Memory technology Memory hierarchy Virtual memory Memory technology DRAM Dynamic Random Access Memory bits are represented by an electric charge in a small capacitor charge leaks away, need

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

COMP 3221: Microprocessors and Embedded Systems

COMP 3221: Microprocessors and Embedded Systems COMP 3: Microprocessors and Embedded Systems Lectures 7: Cache Memory - III http://www.cse.unsw.edu.au/~cs3 Lecturer: Hui Wu Session, 5 Outline Fully Associative Cache N-Way Associative Cache Block Replacement

More information

Virtual memory. Hung-Wei Tseng

Virtual memory. Hung-Wei Tseng Virtual memory Hung-Wei Tseng Why virtual memory How VM works VM and cache Outline 4 Virtual memory 5 Scenario I An application is design on machine A with memory size X. Can we safely execute the same

More information

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts) Part I: Assembly and Machine Languages (22 pts) 1. Assume that assembly code for the following variable definitions has already been generated (and initialization of A and length). int powerof2; /* powerof2

More information

14:332:331. Week 13 Basics of Cache

14:332:331. Week 13 Basics of Cache 14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Week131 Spring 2006

More information

10/11/17. New-School Machine Structures. Review: Single Cycle Instruction Timing. Review: Single-Cycle RISC-V RV32I Datapath. Components of a Computer

10/11/17. New-School Machine Structures. Review: Single Cycle Instruction Timing. Review: Single-Cycle RISC-V RV32I Datapath. Components of a Computer // CS C: Great Ideas in Computer Architecture (Machine Structures) s Part Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~csc/ // Fall - Lecture # Parallel Requests Assigned to computer

More information

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri

Department of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!

More information

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards

CISC 662 Graduate Computer Architecture Lecture 6 - Hazards CISC 662 Graduate Computer Architecture Lecture 6 - Hazards Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches CS 61C: Great Ideas in Computer Architecture Direct Mapped Caches Instructor: Justin Hsia 7/05/2012 Summer 2012 Lecture #11 1 Review of Last Lecture Floating point (single and double precision) approximates

More information

LECTURE 10: Improving Memory Access: Direct and Spatial caches

LECTURE 10: Improving Memory Access: Direct and Spatial caches EECS 318 CAD Computer Aided Design LECTURE 10: Improving Memory Access: Direct and Spatial caches Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses

More information

ECE 2300 Digital Logic & Computer Organization. More Caches

ECE 2300 Digital Logic & Computer Organization. More Caches ECE 23 Digital Logic & Computer Organization Spring 217 More Caches 1 Prelim 2 stats High: 9 (out of 9) Mean: 7.2, Median: 73 Announcements Prelab 5(C) due tomorrow 2 Example: Direct Mapped (DM) Cache

More information

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds. Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum

More information

Pipeline Control Hazards and Instruction Variations

Pipeline Control Hazards and Instruction Variations Pipeline Control Hazards and Instruction Variations Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University See P&H Appendix 4.8 Goals for Today Recap: Data Hazards Control Hazards

More information

Lecture 16: Cache in Context (Uniprocessor) James C. Hoe Department of ECE Carnegie Mellon University

Lecture 16: Cache in Context (Uniprocessor) James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 16: Cache in Context (Uniprocessor) James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L16 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Your goal today Housekeeping understand

More information

Final Exam Fall 2007

Final Exam Fall 2007 ICS 233 - Computer Architecture & Assembly Language Final Exam Fall 2007 Wednesday, January 23, 2007 7:30 am 10:00 am Computer Engineering Department College of Computer Sciences & Engineering King Fahd

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 21: Memory Hierarchy Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Ideally, computer memory would be large and fast

More information

(Basic) Processor Pipeline

(Basic) Processor Pipeline (Basic) Processor Pipeline Nima Honarmand Generic Instruction Life Cycle Logical steps in processing an instruction: Instruction Fetch (IF_STEP) Instruction Decode (ID_STEP) Operand Fetch (OF_STEP) Might

More information

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,

More information

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology Memory Hierarchies Instructor: Dmitri A. Gusev Fall 2007 CS 502: Computers and Communications Technology Lecture 10, October 8, 2007 Memories SRAM: value is stored on a pair of inverting gates very fast

More information

RISC Pipeline. Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. See: P&H Chapter 4.6

RISC Pipeline. Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. See: P&H Chapter 4.6 RISC Pipeline Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6 A Processor memory inst register file alu PC +4 +4 new pc offset target imm control extend =? cmp

More information

Q1: Finite State Machine (8 points)

Q1: Finite State Machine (8 points) Q1: Finite State Machine (8 points) Answer the questions below for the finite state machine in this diagram: 1. Complete the truth table shown below. (2 points) Input Output State In State Out S 0 = 00

More information

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/11/17 Fall 2017 - Lecture #14 1 Parallel

More information

CS 61C: Great Ideas in Computer Architecture. Lecture 13: Pipelining. Krste Asanović & Randy Katz

CS 61C: Great Ideas in Computer Architecture. Lecture 13: Pipelining. Krste Asanović & Randy Katz CS 61C: Great Ideas in Computer Architecture Lecture 13: Pipelining Krste Asanović & Randy Katz http://inst.eecs.berkeley.edu/~cs61c/fa17 RISC-V Pipeline Pipeline Control Hazards Structural Data R-type

More information

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University Lecture 12 Memory Design & Caches, part 2 Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements HW3 is due today PA2 is available on-line today Part 1 is due on 2/27

More information

Handout 4 Memory Hierarchy

Handout 4 Memory Hierarchy Handout 4 Memory Hierarchy Outline Memory hierarchy Locality Cache design Virtual address spaces Page table layout TLB design options (MMU Sub-system) Conclusion 2012/11/7 2 Since 1980, CPU has outpaced

More information

Chapter 4 The Processor 1. Chapter 4B. The Processor

Chapter 4 The Processor 1. Chapter 4B. The Processor Chapter 4 The Processor 1 Chapter 4B The Processor Chapter 4 The Processor 2 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can t always

More information