Locality. CS429: Computer Organization and Architecture. Locality Example 2. Locality Example

Similar documents
CS3350B Computer Architecture

+ Random-Access Memory (RAM)

Lecture 15: Caches and Optimization Computer Architecture and Systems Programming ( )

Problem: Processor- Memory Bo<leneck

Read-only memory (ROM): programmed during production Programmable ROM (PROM): can be programmed once SRAM (Static RAM)

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 26, SPRING 2013

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 26, FALL 2012

Memory Hierarchy. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Systems Programming and Computer Architecture ( ) Timothy Roscoe

Giving credit where credit is due

F28HS Hardware-Software Interface: Systems Programming

CISC 360. The Memory Hierarchy Nov 13, 2008

Random-Access Memory (RAM) Lecture 13 The Memory Hierarchy. Conventional DRAM Organization. SRAM vs DRAM Summary. Topics. d x w DRAM: Key features

Chapter 6 Caches. Computer System. Alpha Chip Photo. Topics. Memory Hierarchy Locality of Reference SRAM Caches Direct Mapped Associative

Key features. ! RAM is packaged as a chip.! Basic storage unit is a cell (one bit per cell).! Multiple RAM chips form a memory.

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

CMPSC 311- Introduction to Systems Programming Module: Caching

CS 240 Stage 3 Abstractions for Practical Systems

CMPSC 311- Introduction to Systems Programming Module: Caching

Memory hierarchies: caches and their impact on the running time

CS356: Discussion #9 Memory Hierarchy and Caches. Marco Paolieri Illustrations from CS:APP3e textbook

The Memory Hierarchy Sept 29, 2006

Problem: Processor Memory BoJleneck

CS 201 The Memory Hierarchy. Gerson Robboy Portland State University

NEXT SET OF SLIDES FROM DENNIS FREY S FALL 2011 CMSC313.

Random Access Memory (RAM)

211: Computer Architecture Summer 2016

Memory Management! Goals of this Lecture!

CS241 Computer Organization Spring Principle of Locality

Computer Organization: A Programmer's Perspective

Introduction to OpenMP. Lecture 10: Caches

Memory Management. Goals of this Lecture. Motivation for Memory Hierarchy

Caches II CSE 351 Spring #include <yoda.h> int is_try(int do_flag) { return!(do_flag!do_flag); }

Computer Systems. Memory Hierarchy. Han, Hwansoo

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies

The Memory Hierarchy /18-213/15-513: Introduction to Computer Systems 11 th Lecture, October 3, Today s Instructor: Phil Gibbons

Denison University. Cache Memories. CS-281: Introduction to Computer Systems. Instructor: Thomas C. Bressoud

Cache memories are small, fast SRAM-based memories managed automatically in hardware. Hold frequently accessed blocks of main memory

Memory Management! How the hardware and OS give application pgms:" The illusion of a large contiguous address space" Protection against each other"

The Memory Hierarchy 10/25/16

Computer Systems CSE 410 Autumn Memory Organiza:on and Caches

Random-Access Memory (RAM) CISC 360. The Memory Hierarchy Nov 24, Conventional DRAM Organization. SRAM vs DRAM Summary.

Cray XE6 Performance Workshop

Cache Memories. From Bryant and O Hallaron, Computer Systems. A Programmer s Perspective. Chapter 6.

CS 33. Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

Memory Hierarchy. Announcement. Computer system model. Reference

CS 153 Design of Operating Systems

General Cache Mechanics. Memory Hierarchy: Cache. Cache Hit. Cache Miss CPU. Cache. Memory CPU CPU. Cache. Cache. Memory. Memory

ECE232: Hardware Organization and Design

Recap: Machine Organization

Foundations of Computer Systems

CS429: Computer Organization and Architecture

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14

The Memory Hierarchy / : Introduction to Computer Systems 10 th Lecture, Feb 12, 2015

Cache Memories. Topics. Next time. Generic cache memory organization Direct mapped caches Set associative caches Impact of caches on performance

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg

Memory hierarchy Outline

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Why memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho

Today. The Memory Hierarchy. Random-Access Memory (RAM) Nonvolatile Memories. Traditional Bus Structure Connecting CPU and Memory

211: Computer Architecture Summer 2016

Cache Memories /18-213/15-513: Introduction to Computer Systems 12 th Lecture, October 5, Today s Instructor: Phil Gibbons

CS161 Design and Architecture of Computer Systems. Cache $$$$$

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Chapter Seven. Large & Fast: Exploring Memory Hierarchy

How to Write Fast Numerical Code

Advanced Memory Organizations

CS 31: Intro to Systems Storage and Memory. Kevin Webb Swarthmore College March 17, 2015

How to Write Fast Numerical Code

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang

Welcome to Part 3: Memory Systems and I/O

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang

CS 31: Intro to Systems Caching. Kevin Webb Swarthmore College March 24, 2015

Plot SIZE. How will execution time grow with SIZE? Actual Data. int array[size]; int A = 0;

Caches II. CSE 351 Autumn Instructor: Justin Hsia

Registers. Instruction Memory A L U. Data Memory C O N T R O L M U X A D D A D D. Sh L 2 M U X. Sign Ext M U X ALU CTL INSTRUCTION FETCH

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University

Memory Hierarchy: Caches, Virtual Memory

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches

Caches Spring 2016 May 4: Caches 1

Assignment 1 due Mon (Feb 4pm

Lecture 16. Today: Start looking into memory hierarchy Cache$! Yay!

CSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1

Page 1. Multilevel Memories (Improving performance using a little cash )

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory

Caches. Han Wang CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)

Memory Hierarchy. Slides contents from:

Caches. Hiding Memory Access Times

Memory Hierarchies &

Algorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory I

Caches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016

EE 4683/5683: COMPUTER ARCHITECTURE

Today. The Memory Hierarchy. Byte Oriented Memory Organization. Simple Memory Addressing Modes

CPUs. Caching: The Basic Idea. Cache : MainMemory :: Window : Caches. Memory management. CPU performance. 1. Door 2. Bigger Door 3. The Great Outdoors

Algorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory II

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

Handout 4 Memory Hierarchy

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1

Transcription:

Locality CS429: Computer Organization and Architecture Dr Bill Young Department of Computer Sciences University of Texas at Austin Principle of Locality: Programs tend to reuse data and instructions near those used recently, or that were recently referenced Temporal locality: Recently referenced items are likely to be referenced in the near future Spatial locality: Items with nearby addresses tend to be referenced close together in time sum = 0; for ( i = 0; i < n; i++ ) sum += a[ i ]; Locality Example Last updated: January 13, 2017 at 08:55 CS429 Slideset 19: 1 Data: Reference array elements in succession (stride-1): Spatial Reference sum each iteration: Temporal Locality Example 2 CS429 Slideset 19: 2 Instructions: Reference instructions in sequence: Spatial Cycle through loop repeatedly: Temporal Claim: Being able to look at code and get a qualitative sense of its locality is a key skill for a professional programmer int sumarrayrows1( int a[m][n] ) int i, j, sum = 0; for ( i = 0; i < M; i++ ) sum += a[ i ][ j ]; Question: Does this function have good locality? int sumarrayrows2( int a[m][n] ) int i, j, sum = 0; for ( i = 0; i < M; i++ ) sum += a[ i ][ j ]; Does this compute the same function as sumarrayrows1? Question: Does this function have good locality? How does it compare to the previous version? CS429 Slideset 19: 3 CS429 Slideset 19: 4

Locality Example 3 A stride-1 reference pattern means that successive references are 1 unit apart (Here unit means the size of the data type) Can you permute the loops so that this function scans the 3-d array a with a stride-1 reference pattern (and thus has good spatial locality)? Remember the CPU-Memory Gap CPU speed increases faster than memory speed, meaning that: memory is more and more a limiting factor on performance; increased importance for caching and similar techniques int sumarray3d( int a[n][n][n] ) int i, j, k, sum = 0; for ( i = 0; i < N; i++ ) for ( k = 0; k < N; k++ ) sum += a[k ][ i ][ j ]; CS429 Slideset 19: 5 CS429 Slideset 19: 6 Memory Hierarchies Example Memory Hierarchy Some fundamental and enduring properties of hardware and software: Fast storage technologies typically cost more per byte and have less capacity than slower ones The gap between CPU and main memory speed is widening Well-written programs tend to exhibit good locality Memory systems access blocks of data, not individual bytes These fundamental properties complement each other beautifully They suggest an approach for organizing memory and storage systems known as a memory hierarchy CS429 Slideset 19: 7 CS429 Slideset 19: 8

Caches Examples of Caching in the Hierarchy Cache: A smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device The fundamental idea of a memory hierarchy: For each k, the faster, smaller device at level k serves as a for the larger, slower device at level k+1 Cache type What Where Latency Managed by (cycles) Registers 8-byte word CPU registers 0 compiler TLB address On-chip TLB 0 hardware translations 32-byte block On-chip 1 hardware L2 32-byte block On-chip L2 10 hardware Virtual 4KB page main memory 100 hw + OS Memory Buffer parts of files main memory 100 OS Network buffer parts of files local disk 10M AFS/NFS client Browser web pages local disk 10M web browser Web web pages remote server 1000M web proxy disks server CS429 Slideset 19: 9 CS429 Slideset 19: 10 Why Memory Hierarchies? Caching in a Memory Hierarchy Why do memory hierarchies work? Programs tend to access the data at level k more often than they access the data at level k+1 Thus, the storage at level k+1 can be slower, and thus larger and cheaper per bit Net effect: A large pool of memory that costs as much as the cheap storage near the bottom, but that serves data to programs at the rate of the fast storage near the top We use a combination of small fast memory and big slow memory to give the illusion of big fast memory Level k: 4 9 10 3 10 Level k+1: 4 5 6 7 8 9 10 11 12 13 14 15 Smaller, faster, more expensive device at level k s a subset of the blocks from level k+1 Data is copied between levels in block-sized transfer units Larger, slower, cheaper storage device at level k+1 is partitioned into blocks CS429 Slideset 19: 11 CS429 Slideset 19: 12

General Caching Concepts General Caching Concepts Level k: Level k+1: 12 12 12 Request 12 8 Request 12 4* 9 14 3 5 6 7 8 9 10 11 13 14 15 Program needs object d, stored in some block b Cache hit: program finds b in the level k, eg, block 14 Cache miss: b is not at level k, so must fetch it from level k+1, eg, block 12 If level k is full, then some current block (the victim) must be replaced (evicted) Placement policy: where can the new block go? Eg, b mod 4 Replacement policy: Which block should be evicted? Eg, LRU Types of misses: Cold (compulsary) miss: the is empty Conflict miss: all available positions at level k are occupied Most s limit blocks at level k+1 to a small subset (sometimes only one) of the block positions at level k Eg, Block i at level k+1 must be placed in block (i mod 4) at level k Conflict misses occur when multiple data objects all map to the same level k block Note: there still may be empty slots in the Eg, Referencing blocks 0,8,0,8,0,8, would miss every time Capacity miss: the set of active blocks (working set) is larger than the CS429 Slideset 19: 13 CS429 Slideset 19: 14 Cache Memories L2 and L2 memories are small, fast SRAM-based memories managed automatically in hardware They hold frequently accessed blocks of main memory CPU looks first for data in, then in L2, then in main memory The typical bus structure is shown below bus CPU chip register file bus interface ALU system bus I/O Bridge memory bus main memory Inserting an Cache line 0 line k block 10 block 21 block 30 Register file Cache Main memory abcd pqrs wxyz The tiny, very fast CPU register file has room for a small number of 8-byte words The transfer unit between register file and is an 8-byte block The small, fast has room for k lines (each containing several words) The transfer unit between and main memory is a block of bytes The big, slow main memory has room for many blocks CS429 Slideset 19: 15 CS429 Slideset 19: 16

Multi-Level Caches Summary Options: separate data and instruction s, or a unified Processor Regs d i Unified L2 Memory registers L2 memory disk size 200B 8-64KB 1-4MB SRAM 128MB DRAM 30GB speed 3ns 3ns 6ns 60ns 8ms $/MB $100 $150 $005 line size 8B 32B 32B 8KB Disk This slideset: Locality: Spatial and Temporal Cache principles Multi-level hierarchies Next time: Cache organization Replacement and writes Programming considerations CS429 Slideset 19: 17 CS429 Slideset 19: 18