Chapter 2: Memory Hierarchy Design, part 1 - Introduction. Advanced Computer Architecture, Mehran Rezaei


1 Chapter 2: Memory Hierarchy Design, part 1 - Introduction. Advanced Computer Architecture, Mehran Rezaei

2 Temporal Locality: The principle of temporal locality in program references says that once you access a memory location (e.g., 1000), you are more likely to access that same location again soon than to reference some other random location.

3 Spatial Locality: Spatial locality in a program says that if we reference a memory location (e.g., 1000), we are more likely to reference a location near it (e.g., 1001) than some random location.
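
Both kinds of locality can be observed in an address trace. The following is a minimal sketch, not part of the original slides: the function name `locality_profile`, the `window` of recent accesses, and the `radius` that counts as "near" are all illustrative assumptions.

```python
def locality_profile(trace, window=4, radius=4):
    """Count accesses that show temporal locality, spatial locality, or neither.

    temporal: the same address appeared among the previous `window` accesses.
    spatial:  a different address within `radius` of it appeared in that window.
    neither:  everything else (for example, the very first access).
    """
    counts = {"temporal": 0, "spatial": 0, "neither": 0}
    for i, addr in enumerate(trace):
        recent = trace[max(0, i - window):i]
        if addr in recent:
            counts["temporal"] += 1
        elif any(abs(addr - r) <= radius for r in recent):
            counts["spatial"] += 1
        else:
            counts["neither"] += 1
    return counts


# Hypothetical trace: a loop walks four consecutive locations, then repeats.
trace = [1000, 1001, 1002, 1003, 1000, 1001, 1002, 1003]
profile = locality_profile(trace)
```

In this trace the first pass over locations 1000 to 1003 shows spatial locality (each address is next to the previous one), and the second pass shows temporal locality (each address was touched a few accesses earlier).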

4 Figure courtesy of Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 5th ed., Figure 2.2.

5 Five components of a computer: Processor, consisting of Control ("brain") and Datapath ("brawn"); Memory (where programs and data live when running); Devices, consisting of Input (keyboard, mouse) and Output (display, printer). Memory Hierarchy.

6 What types of memory do we have? Registers; Cache (Static RAM); Main Memory (Dynamic RAM); Disk (Magnetic Disk); Flash Memory.

7 Figure courtesy of Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 5th ed., Figure 2.1.

8 Memory Hierarchy (diagram: input, control, datapath, cache, memory, disk, output).

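9 Memory Hierarchy. The principle of locality suggests: make the common case fast. (Diagram: a processor with control unit and datapath, followed by successive levels of memory. Speed in ns at successive levels: 0.25s, 1s, 100s, $10^6$s. Size in bytes at successive levels: 1000s, 64Ks, 10Gs, 100Ts.)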

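10 The Main Issue: a fully associative cache with 1-byte line size (diagram: each entry holds a valid bit V, a TAG stored in a CAM, and Data; the address is compared against every tag in parallel, and a match signals a hit).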

11 What seems to be the problem? 1. How large is the CAM if we need an 8 Kbyte cache? 2. What does each of those comparators look like, in terms of actual hardware? 3. How many comparators are needed for an 8 Kbyte cache? 4. What should we do? (a) Increase the cache line size. (b) Decrease the level of associativity. (c) Increase the cache line size and decrease the set associativity.
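
The questions on slide 11 can be explored numerically. The sketch below is a rough model rather than the slides' own method: it assumes a 32-bit address (not stated on this slide) and power-of-two sizes, counts one comparator per way, and sizes each comparator at the tag width. The function name `lookup_cost` is hypothetical.

```python
def lookup_cost(cache_bytes, line_bytes, ways=None, addr_bits=32):
    """Return (comparators, tag_bits) for a cache organization.

    ways=None models a fully associative cache, where every line must be
    compared against the address at once. Sizes are assumed to be powers of two.
    """
    lines = cache_bytes // line_bytes
    if ways is None:
        ways = lines                          # fully associative: one big set
    sets = lines // ways
    offset_bits = (line_bytes - 1).bit_length()
    index_bits = (sets - 1).bit_length()
    tag_bits = addr_bits - index_bits - offset_bits
    return ways, tag_bits


KB = 1024
fully_assoc_1b = lookup_cost(8 * KB, 1)           # slide 10 organization
fully_assoc_32b = lookup_cost(8 * KB, 32)         # larger line size
direct_mapped = lookup_cost(8 * KB, 32, ways=1)   # one comparator
two_way = lookup_cost(8 * KB, 32, ways=2)         # two comparators
```

Under these assumptions, an 8 Kbyte fully associative cache with 1-byte lines needs 8192 comparators of 32 bits each; moving to 32-byte lines cuts this to 256 comparators of 27 bits; a direct-mapped cache needs a single 19-bit comparator, and a 2-way set-associative cache needs two 20-bit comparators. The 256 lines and 128 sets in the last two cases match the 8*256 and 7*128 decoders on slides 14 and 15.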

12 Increase cache line size (diagram: a 5*32 decoder selects Byte 0 to Byte 31 within a line; tag comparators signal a hit). How many comparators do we need now? Is the size of each comparator reduced?

13 Other alternatives: Ultimately, only one comparator. How many cache lines do we have? Using a portion of the address lines, can we address the cache lines?

14 Direct Mapped Cache (diagram: an 8*256 decoder selects the cache line, a 5*32 decoder selects Byte 0 to Byte 31 within it, and a single tag comparator signals a hit).

15 2-way set-associative cache (diagram: a 7*128 decoder selects the set, a 5*32 decoder selects Byte 0 to Byte 31 in each of the two ways, and one tag comparator per way signals a hit).

16 The three fields: All fields are read as unsigned integers. Offset: specifies which byte within the block we want. Index: specifies the cache index (which "row" of the cache we should look in). Tag: the remaining bits after offset and index are determined; these are used to distinguish between all the memory addresses that map to the same cache line (cache block).
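
Extracting the three fields is a matter of shifts and masks. A minimal sketch, assuming the offset occupies the lowest bits, the index the next bits, and the tag everything above; the function name `split_address` and the field widths used in the example (10-bit index, 4-bit offset) are illustrative choices.

```python
def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, offset), all unsigned integers."""
    offset = addr & ((1 << offset_bits) - 1)                 # lowest bits
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)  # middle bits
    tag = addr >> (offset_bits + index_bits)                 # remaining high bits
    return tag, index, offset


# Hypothetical 32-bit address with a 10-bit index and a 4-bit offset.
tag, index, offset = split_address(0x12345678, index_bits=10, offset_bits=4)

# A second address with the same index but a different tag: both map to the
# same cache row, and only the tag tells them apart.
other = split_address(0x12345678 + (1 << 14), index_bits=10, offset_bits=4)
```

Here `0x12345678` splits into tag `0x48D1`, index `0x167`, and offset `0x8`. Adding `1 << 14` changes only the tag, which is exactly the case the tag field exists to disambiguate.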

17 Example: Suppose we have a 16 KB cache with 4-word blocks. Determine the size of the tag, index, and offset fields if we're using a 32-bit architecture. Offset: needed to specify the correct byte within a block. A block contains 4 words (each word is 32 bits) = 16 bytes = $2^4$ bytes, so we need 4 bits to specify the correct byte.

18 Example (Cont'd) Index (an index into an "array of blocks"): needed to specify the correct row in the cache. The cache contains 16 KB = $2^{14}$ bytes and a block contains $2^4$ bytes (4 words), so # blocks/cache = (bytes/cache) / (bytes/block) = $2^{14} / 2^4 = 2^{10}$ blocks/cache. We need 10 bits to specify this many rows.

19 Example (Cont'd) Tag: use the remaining bits as the tag. tag length = address length - offset - index = 32 - 4 - 10 = 18 bits, so the tag is the leftmost 18 bits of the memory address. The 32-bit address is laid out as: tag (18 bits) | index (10 bits) | offset (4 bits).
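
The arithmetic of this example can be checked with a few lines of code. This is a sketch under the same assumptions as the example (one block per cache row, power-of-two sizes, 32-bit addresses by default); the function name `field_widths` is hypothetical.

```python
def field_widths(cache_bytes, block_bytes, addr_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for one block per cache row."""
    offset_bits = (block_bytes - 1).bit_length()   # log2 of the block size
    num_blocks = cache_bytes // block_bytes        # rows in the cache
    index_bits = (num_blocks - 1).bit_length()     # log2 of the row count
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


# 16 KB cache, 4-word blocks, 4 bytes per word.
widths = field_widths(16 * 1024, 4 * 4)
```

For the 16 KB cache with 16-byte blocks, `widths` comes out as 18 tag bits, 10 index bits, and 4 offset bits, in agreement with the result worked out by hand.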

20 Types of Miss: Compulsory misses (cold misses); Capacity misses; Conflict misses. How do we recognize these misses?
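
One common way to recognize the three types is to run the same trace through the real cache and through a fully associative cache of the same capacity. The sketch below follows that approach but is only an illustration: it assumes LRU replacement in both caches, a trace of block addresses (not byte addresses), and set selection by block address modulo the number of sets. The function name `classify_misses` is hypothetical.

```python
from collections import OrderedDict


def classify_misses(block_trace, num_blocks, ways):
    """Label each access as 'hit', 'compulsory', 'capacity', or 'conflict'."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]  # the real cache, LRU per set
    full = OrderedDict()   # reference: fully associative LRU, same capacity
    seen = set()           # every block referenced so far
    labels = []
    for block in block_trace:
        # Update the fully associative reference cache.
        full_hit = block in full
        if full_hit:
            full.move_to_end(block)
        else:
            if len(full) == num_blocks:
                full.popitem(last=False)             # evict least recently used
            full[block] = True
        # Update the real cache and label the access.
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)
            labels.append("hit")
        else:
            if len(cache_set) == ways:
                cache_set.popitem(last=False)
            cache_set[block] = True
            if block not in seen:
                labels.append("compulsory")          # first reference ever
            elif full_hit:
                labels.append("conflict")            # only the mapping is to blame
            else:
                labels.append("capacity")            # would miss even if fully associative
        seen.add(block)
    return labels


# Blocks 0 and 4 collide in a 4-block direct-mapped cache.
direct = classify_misses([0, 4, 0, 4], num_blocks=4, ways=1)
# The same trace in a 2-way set-associative cache of the same size.
two_way = classify_misses([0, 4, 0, 4], num_blocks=4, ways=2)
# Three distinct blocks cycling through a 2-block cache.
small = classify_misses([0, 1, 2, 0], num_blocks=2, ways=1)
```

In the direct-mapped run the repeated references to blocks 0 and 4 are conflict misses, and they turn into hits once the cache is 2-way set-associative. In the last run the final reference to block 0 is a capacity miss, because even a fully associative cache of two blocks would have evicted it.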

21 Six basic cache optimization methods: Larger block size to reduce ___; Bigger cache size to reduce ___; Higher associativity to reduce ___; Multilevel cache to reduce ___; Giving priority to read misses rather than write misses to reduce ___; Avoiding address translation to reduce ___.

22 Figure courtesy of Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 5th ed., Figure 2.3.

23 Four main questions of the memory hierarchy (MH): 1. Line placement: where can a line (block) of data be placed in the cache? 2. Block identification: how can a line be found in the cache? 3. Block replacement: how can a line be replaced? 4. What happens on a write?

24 Write: Write hit vs. write miss; Write allocate vs. write no-allocate; Write through vs. write back.

25 Reading assignment: Appendix B, Section B.1 (pages B-1 to B-16); Section 2.1 (pages 71 to 78).
