EE414 Embedded Systems Ch 5. Memory Part 2/2

Byung Kook Kim, School of Electrical Engineering, Korea Advanced Institute of Science and Technology

Overview
5.1 Introduction
5.2 Memory Write Ability and Storage Permanence
5.3 Common Memory Types
5.4 Composing Memory
5.5 Cache
5.6 Advanced RAM

5.4 Composing Memory
- The memory size needed often differs from the size of readily available memories.
- When the available memory is larger, simply ignore the unneeded high-order address bits and higher data lines.
- When the available memory is smaller, compose several smaller memories into one larger memory:
  A. To increase the width of words
  B. To increase the number of words
  C. To increase both the number and width of words

Composing Memory (II)
A. To increase the width of words: connect memories side by side.
- Example: three 2^m x n ROMs connected side by side form one 2^m x 3n ROM. All chips share the same enable signal and address lines, and together their outputs supply Q(3n-1) down to Q0.

Composing Memory (III)
B. To increase the number of words: connect memories top to bottom.
- An added high-order address line selects, through an address decoder, which smaller memory contains the desired word.
- Alternatively, odd and even addresses can be separated by routing A0 to the address decoder.
- Example: two 2^m x n ROMs form one 2^(m+1) x n ROM. Address bits A0..A(m-1) go to both ROMs, bit Am drives a 1-to-2 decoder that enables one of them, and the shared outputs are Q(n-1)..Q0 (see the C sketch below).
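As a rough illustration of case B (not taken from the slides), the following C sketch models two 2^m x n ROMs composed into one 2^(m+1) x n memory: the low-order address bits index within a chip, and the high-order bit plays the role of the 1-to-2 decoder that enables one chip. The sizes and names are made up for the example.

/* Illustrative sketch: composing two 2^M x 8 ROMs into one 2^(M+1) x 8 memory. */
#include <stdint.h>
#include <stdio.h>

#define M          10                           /* each chip stores 2^10 = 1024 words */
#define CHIP_WORDS (1u << M)

static uint8_t rom_low[CHIP_WORDS];             /* enabled when address bit Am == 0 */
static uint8_t rom_high[CHIP_WORDS];            /* enabled when address bit Am == 1 */

/* Read one word from the composed 2^(M+1) x 8 memory. */
static uint8_t composed_read(uint16_t addr)
{
    uint16_t offset = addr & (CHIP_WORDS - 1u); /* low bits A0..A(M-1) address within a chip */
    uint16_t select = (addr >> M) & 1u;         /* high bit Am acts as the 1-to-2 decoder    */
    return select ? rom_high[offset] : rom_low[offset];
}

int main(void)
{
    rom_high[5] = 0xAB;
    printf("0x%02X\n", composed_read(CHIP_WORDS + 5));  /* reads from rom_high[5] */
    return 0;
}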

Composing Memory (IV)
C. To increase both the number and width of words: combine the two techniques.
- An address decoder selects a row of memories (more words), while several memories in each row are connected side by side (wider words); the enable and output lines are wired as in the previous two cases.

5.5 Cache
Memory hierarchy: we want memory that is both inexpensive and fast.
- Hierarchy (fastest to slowest): processor registers, cache, main memory, disk, tape.
- Main memory: large, inexpensive, slow memory that stores the entire program and data.
- Cache: small, expensive, fast memory that stores copies of likely-accessed parts of the larger memory; there can be multiple levels of cache.
- Secondary memory: disk and tape.

Cache
- Usually designed with SRAM: faster but more expensive than DRAM.
- Usually on the same chip as the processor: space is limited, so the cache is much smaller than off-chip main memory, but access is faster (1 cycle vs. several cycles for main memory).
- Cache operation: on a request for main memory access (read or write), first check the cache for a copy.
  Cache hit: the copy is in the cache, giving quick access.
  Cache miss: the copy is not in the cache; read the memory at that address (and possibly its neighbors) into the cache.
- Several cache design choices: cache mapping, replacement policies, and write techniques.

Cache Mapping
- Cache mapping: the method of assigning main memory addresses (M bits) to the far fewer available cache addresses (C bits).
- It determines hit or miss: whether a particular main memory address's contents are in the cache.
- Three basic techniques:
  A. Direct mapping
  B. Fully associative mapping
  C. Set-associative mapping
- Cache line (or cache block): the cache is partitioned into indivisible blocks or lines of adjacent memory addresses, usually 4 or 8 memory addresses per cache line. The C cache address bits split into B block (index) bits plus O offset bits (C = B + O).

A. Direct Mapping
- The main memory address is divided into 3 fields:
  Tag (M - C bits): compared with the tag stored in the cache block indicated by the index.
  Index (B bits): the cache (block) address; the number of bits is determined by the cache size, B = log2(number of cache blocks).
  Offset (O bits): used to find the particular word within the cache block.
- Each cache block contains:
  Tag: to be compared with the memory address tag.
  Valid bit: indicates whether the data in the slot has been loaded from memory.
  Data: the cache block data, selected with the offset.
- The access is a hit (valid output = 1) if the memory tag equals the cache tag at that index and the valid bit is 1. A minimal C sketch of this lookup follows.
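The C fragment below is an illustrative sketch (not the slides' design) of the direct-mapped lookup just described: it splits an address into tag, index, and offset, then checks the tag and valid bit at that index. The 8-byte block and 4-block geometry mirror the worked example on the next slide.

/* Illustrative sketch: direct-mapped cache lookup. */
#include <stdbool.h>
#include <stdint.h>

#define OFFSET_BITS 3                    /* 8 bytes per cache block (O = 3) */
#define INDEX_BITS  2                    /* 4 cache blocks (B = 2)          */
#define NUM_BLOCKS  (1u << INDEX_BITS)
#define BLOCK_SIZE  (1u << OFFSET_BITS)

struct cache_block {
    bool     valid;
    uint32_t tag;
    uint8_t  data[BLOCK_SIZE];
};

static struct cache_block cache[NUM_BLOCKS];

/* Returns true on a cache hit and copies the requested byte into *out. */
bool cache_read(uint32_t addr, uint8_t *out)
{
    uint32_t offset = addr & (BLOCK_SIZE - 1u);
    uint32_t index  = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1u);
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);

    if (cache[index].valid && cache[index].tag == tag) {
        *out = cache[index].data[offset];   /* hit */
        return true;
    }
    return false;                           /* miss: fetch the block from main memory */
}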

Direct Mapping Example
Memory: 64 bytes, cache: 32 bytes, 8 bytes per cache block, so M = 6, C = 5, O = 3; 8 memory blocks and 4 cache blocks. Memory addresses 0x00 to 0x3F split into a 1-bit tag, a 2-bit index, and a 3-bit offset.

Memory blocks (8 addresses x 8 data each):
Tag  Index  Mem addr  Contents
0    00     0x00
0    01     0x08
0    10     0x10      Pgm 1/5
0    11     0x18      Pgm 2/5
1    00     0x20      Pgm 3/5
1    01     0x28      Pgm 4/5
1    10     0x30      Pgm 5/5 *
1    11     0x38

Cache blocks:
Index  Valid  Tag  Contents
00     1      1    Pgm 3/5
01     1      1    Pgm 4/5
10     1      0    Pgm 1/5
11     1      0    Pgm 2/5

Note: Pgm 5/5 (tag 1, index 10, marked *) maps to the same cache index as Pgm 1/5, so loading it would replace Pgm 1/5.

B. Fully Associative Mapping
- The complete main memory address (the tag, excluding the offset; there is no index, so the tag is M - O bits) is stored as the cache tag in each cache block.
- All cache tags are simultaneously compared with the desired memory tag, requiring many comparators.
- A memory block can be mapped to any cache block.
- The per-block valid comparison outputs are OR-ed to produce the overall valid (hit) signal.
(Figure: address = tag + offset; each cache entry holds V, T, D with its own comparator.)

C. Set-Associative Mapping
- A compromise between direct mapping and fully associative mapping.
- The index still maps a memory address to a cache address, but the cache holds two or more entries per index: N ways.
- Each entry contains a tag, a valid bit, and the cache block data.
- The tags at the same index in each way are compared simultaneously (associatively).
- A cache with N entries per index is called N-way set-associative (typically N = 2, 4, or 8). A C sketch of a 2-way lookup follows.
(Figure: 2-way set-associative lookup; address = tag + index + offset, with one comparator per way.)
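Extending the earlier illustrative sketch to a 2-way set-associative cache (again with made-up sizes): each index now holds two ways, and all ways at that index are compared. Hardware does the comparisons in parallel; the loop below is only a software model.

/* Illustrative sketch: 2-way set-associative lookup. */
#include <stdbool.h>
#include <stdint.h>

#define OFFSET_BITS 3
#define INDEX_BITS  2
#define NUM_SETS    (1u << INDEX_BITS)
#define BLOCK_SIZE  (1u << OFFSET_BITS)
#define NUM_WAYS    2

struct way {
    bool     valid;
    uint32_t tag;
    uint8_t  data[BLOCK_SIZE];
};

static struct way cache[NUM_SETS][NUM_WAYS];

/* Returns the hit way (0..NUM_WAYS-1) or -1 on a miss. */
int set_assoc_lookup(uint32_t addr)
{
    uint32_t index = (addr >> OFFSET_BITS) & (NUM_SETS - 1u);
    uint32_t tag   = addr >> (OFFSET_BITS + INDEX_BITS);

    for (int w = 0; w < NUM_WAYS; w++) {          /* hardware compares all ways at once */
        if (cache[index][w].valid && cache[index][w].tag == tag)
            return w;                             /* hit in way w */
    }
    return -1;                                    /* miss */
}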

Comparison of Cache Mapping Techniques

Technique          Pros                                     Cons
Direct-mapped      Easy to implement                        Numerous misses if several words with the same index are accessed frequently
Fully associative  Fast                                     The comparison logic is expensive to implement
Set-associative    Can reduce misses with a small           Few; a compromise between the other two
                   number of comparators

Cache-Replacement Policy
- Cache-replacement policy: the technique for choosing which block to replace when a fully associative cache is full, or when a set-associative cache's set is full.
- Note: a direct-mapped cache has no choice!
- Replacement policies:
  Random: the block to replace is chosen at random.
  FIFO (first-in-first-out): push a block onto a queue when it is brought into the cache; choose the block to replace by popping the queue.
  LRU (least recently used): replace the block not accessed for the longest time. A minimal LRU sketch in C follows.
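A minimal LRU sketch under the same illustrative assumptions, using a timestamp per way for one set of a 2-way cache. Real hardware usually uses cheaper approximations; this only shows the policy's intent.

/* Illustrative sketch: LRU victim selection for one 2-way set. */
#include <stdbool.h>
#include <stdint.h>

#define NUM_WAYS 2

struct lru_way {
    bool     valid;
    uint32_t tag;
    uint64_t last_used;      /* timestamp of the most recent access */
};

static struct lru_way set_ways[NUM_WAYS];
static uint64_t now;         /* incremented on every cache access */

/* Pick the victim way: an invalid way if one exists, otherwise the
 * least recently used valid way. */
int lru_victim(void)
{
    int victim = 0;
    for (int w = 0; w < NUM_WAYS; w++) {
        if (!set_ways[w].valid)
            return w;                                       /* free slot */
        if (set_ways[w].last_used < set_ways[victim].last_used)
            victim = w;                                     /* older access loses */
    }
    return victim;
}

/* On a hit to way w, refresh its timestamp. */
void lru_touch(int w)
{
    set_ways[w].last_used = ++now;
}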

Cache Write Techniques
- When the data cache is written, main memory must eventually be updated. (Note: the instruction cache is read-only.)
- Write-through: write to main memory whenever the cache is written to.
  Pros: easiest to implement.
  Cons: the processor must wait for the slower main memory write; potential for unnecessary writes.
- Write-back: main memory is written only when a dirty block is replaced.
  An extra dirty bit per block is set when the cache block is written to.
  Pros: reduces the number of slow main memory writes.
A sketch contrasting the two policies follows.
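The following hedged C sketch contrasts the two policies on a write hit. mem_write() is a hypothetical stand-in for a slow main-memory access, and the block layout is assumed, not taken from the slides.

/* Illustrative sketch: write-through vs. write-back on a write hit. */
#include <stdbool.h>
#include <stdint.h>

struct wb_block {
    bool     valid;
    bool     dirty;                  /* used only by write-back */
    uint32_t tag;
    uint8_t  data[8];
};

/* Stand-in for a slow main-memory write. */
static void mem_write(uint32_t addr, uint8_t value)
{
    (void)addr;
    (void)value;
}

void write_through(struct wb_block *b, uint32_t addr,
                   uint32_t offset, uint8_t value)
{
    b->data[offset] = value;
    mem_write(addr, value);          /* main memory updated on every write */
}

void write_back(struct wb_block *b, uint32_t offset, uint8_t value)
{
    b->data[offset] = value;
    b->dirty = true;                 /* main memory updated later, only when
                                        this dirty block is replaced */
}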

Cache Impact on System Performance
The most important parameters in terms of performance:
- Total size of the cache: the total number of data bytes the cache can hold (tag, valid, and other housekeeping bits are not included in this total).
- Degree of associativity.
- Data block size.

Cache Performance Trade-offs
Improving the cache hit rate without increasing cache size:
- Increase the line size.
- Change the set-associativity.
(Figure: effect of cache size and associativity; cache miss rate from 0 to 0.16 versus cache size from 1 KB to 128 KB, for 1-way, 2-way, 4-way, and 8-way associativity.)

5.6 Advanced RAM
DRAM
- DRAMs are commonly used as main memory in processor-based embedded systems.
- A single transistor/capacitor pair stores each bit: high capacity, low cost.
- A refresh circuit is required!
- Many variations on DRAM exist.

A. Basic DRAM
- The address bus is multiplexed between row and column components: a DRAM memory address = RA | CA (row address followed by column address).
- The row and column addresses are latched in sequentially by strobing the ras and cas signals (RA Sel, CA Sel), respectively. This reduces I/O pin requirements.
- Refresh circuitry can be external or internal to the DRAM device; it strobes consecutive memory addresses periodically, causing the memory contents to be refreshed.
- Note: refresh circuitry is disabled during a read or write operation.
A sketch of the row/column address split follows.
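As a small illustration (an assumption, not taken from the slides), the sketch below splits a full address into the row and column halves that would be strobed in with ras and cas. The 7+7-bit split matches a 16K-word device such as the 16K x 8 DRAM shown two slides later.

/* Illustrative sketch: multiplexing a DRAM address into row and column halves. */
#include <stdint.h>

#define ROW_BITS 7
#define COL_BITS 7
#define ROW_MASK ((1u << ROW_BITS) - 1u)
#define COL_MASK ((1u << COL_BITS) - 1u)

struct dram_addr {
    uint16_t row;   /* latched while ras is strobed */
    uint16_t col;   /* latched while cas is strobed */
};

struct dram_addr dram_split(uint16_t addr)      /* addr: 0 .. 16K-1 */
{
    struct dram_addr a;
    a.row = (addr >> COL_BITS) & ROW_MASK;      /* high bits -> row address    */
    a.col = addr & COL_MASK;                    /* low bits  -> column address */
    return a;
}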

Basic DRAM (II)
(Figure: basic DRAM architecture. The address lines feed row and column address buffers; the row decoder and column decoder select cells in the bit storage array through the sense amplifiers; data-in and data-out buffers connect to the data bus under rd/wr control; a refresh circuit driven by cas, ras, and the clock periodically refreshes the array.)

DRAM Logic Circuit
(Figure: logic circuit of a 16K x 8 DRAM.)

DRAM Timing Diagram
(Figure: read and write timing diagrams.)

B. Fast Page Mode DRAM (FPM DRAM)
- An improvement on the basic DRAM architecture: each row of the memory bit array is viewed as a page.
- A page contains multiple words; individual words are addressed by the column address only.
- Timing: the row (page) address is sent once, then three words are read consecutively by sending only a column address for each. The extra cycle is eliminated on each read/write of words from the same page.
(Timing diagram: ras, then cas strobes; address = row, col, col, col; data follows each column.)

C. Extended Data Out DRAM (EDO DRAM)
- An improvement on FPM DRAM: an extra latch is added before the output buffer.
- Allows strobing of cas before the previous data read operation has completed.
- Reduces read/write latency by an additional cycle; the speedup comes from overlapping the next column address with the current data transfer.
(Timing diagram: ras, cas; address = row, col, col, col; data overlaps the following column strobe.)

D. Synchronous (S) and Enhanced Synchronous (ES) DRAM
- SDRAM latches data on the active edge of the clock, eliminating the time needed to detect the ras/cas and rd/wr signals.
- A counter is initialized to the column address and then incremented on each active clock edge to access consecutive memory locations.
- ESDRAM improves on SDRAM: added buffers enable overlapping of column addressing, making faster clocking and lower read/write latency possible.
(Timing diagram: clock, ras, cas; address = row, col; one data word per clock after the initial access.)

E. Rambus DRAM (RDRAM)
- More of a bus-interface architecture than a DRAM architecture.
- Data is latched on both the rising and falling edges of the clock.
- The memory is broken into 4 banks, each with its own row decoder, so 4 pages can be open at a time.
- Capable of very high throughput; operation at 600 M cycles per second is possible.

DIMM [Wikipedia]
- A DIMM, or dual in-line memory module, comprises a series of dynamic random-access memory integrated circuits. The modules are mounted on a printed circuit board and designed for use in personal computers, workstations, and servers.
- DIMMs began to replace SIMMs (single in-line memory modules) as the predominant type of memory module as Intel P5-based Pentium processors began to gain market share.
- DDR, DDR2, DDR3, and DDR4 DIMMs all have different pin counts and different notch positions.
- As of August 2014, DDR4 SDRAM was an emerging type of dynamic random-access memory (DRAM) with a high-bandwidth ("double data rate") interface; it has been in use since 2013.

DRAM Integration Problem
- SRAM is easily integrated on the same chip as the processor; DRAM is more difficult.
- The chip-making process for DRAM differs from that for conventional logic:
  The goal of conventional logic (IC) designers is to minimize parasitic capacitance, reducing signal propagation delays and power consumption.
  The goal of DRAM designers is to create capacitor cells that retain stored information.
- Integration processes are beginning to appear.

Memory Management Unit (MMU)
Duties of the MMU:
- Handles DRAM refresh, the bus interface, and arbitration.
- Takes care of memory sharing among multiple processors.
- Translates virtual memory addresses from the processor into physical memory addresses of the DRAM.
Modern CPUs often come with an MMU built in; otherwise a single-purpose processor can be used.
A minimal sketch of the virtual-to-physical translation follows.
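A minimal sketch, assuming a single-level page table and 4 KB pages (neither is specified in the slides), of the virtual-to-physical translation an MMU performs. The structure names and sizes are hypothetical.

/* Illustrative sketch: one-level page-table address translation. */
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12                          /* 4 KB pages                 */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_PAGES 1024                        /* 4 MB virtual address space */

struct pte {
    bool     present;                         /* page mapped in DRAM?       */
    uint32_t frame;                           /* physical frame number      */
};

static struct pte page_table[NUM_PAGES];

/* Returns true and writes the physical address on success; false means
 * a page fault (the page is not present or out of range). */
bool translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr >> PAGE_BITS;     /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);

    if (vpn >= NUM_PAGES || !page_table[vpn].present)
        return false;                         /* page fault */

    *paddr = (page_table[vpn].frame << PAGE_BITS) | offset;
    return true;
}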

Memory Management Unit [Wikipedia]
(Figure: MMU translating virtual addresses from the CPU into physical memory addresses.)

References
[1] Frank Vahid, Embedded System Design: A Unified Hardware/Software Introduction, John Wiley & Sons, 2002.