Slide Set 9: Final Notes on Textbook Chapter 7


slide 2/41: Contents

Slide Set 9 for ENCM 369 Winter 2014 Lecture Section 01
Steve Norman, PhD, PEng
Electrical & Computer Engineering, Schulich School of Engineering, University of Calgary
Winter Term, 2014

slide 3/41, slide 4/41:

In ENCM 369, we skipped Section 7.4, Multicycle Processor, to make sure to have time for other important topics, mostly memory systems and floating-point numbers. We'll also skip...

- Section 7.6: HDL (hardware description language) representation of processor designs. HDLs are a big topic in ENEL 453.
- Section 7.7: Exception handling, as an extension of the multicycle processor of Section 7.4. It's somewhat sad to skip this, because the FSM of textbook Figure 7.64 is a really cool FSM.

slide 5/41, slide 6/41: More about skipped sections in Chapter 7

- Section 7.8: Advanced Microarchitecture. Simple pipelining, as described in Section 7.5, is a really important idea in processor design. But the state of the art in processor design has gone far beyond simple pipelining! See Section 7.8 (and ENCM 501) for an overview of how current processor chips work.
- Section 7.9: x86 Microarchitecture. The history and present status of Intel Corp. is fascinating from both business and technical viewpoints!

slide 7/41: Introduction to Memory Technology

Chapters 6 and 7 of the textbook, and related lectures and labs, suggest that memory is simple. In real-world computers, the memory systems are usually not at all simple! Expect to read Sections 8.1 to 8.4 of the textbook multiple times; don't give up if it seems too confusing the first time through!

slide 8/41: Know the difference between MEMORY and DISK STORAGE!

MEMORY is an ELECTRONIC system with NO MOVING PARTS. It is constructed on silicon chips using millions or billions of tiny transistors, capacitors and wires per chip. It is mainly used for INSTRUCTIONS and DATA of RUNNING PROGRAMS. The next slide shows the main memory circuits of a laptop from 2007...

slide 9/41: (photo of the laptop's main memory circuits)

slide 10/41: Know the difference between MEMORY and DISK STORAGE! (continued)

DISK STORAGE is partly electronic and partly MECHANICAL, with LOTS of MOVING PARTS. Access time to disk is MUCH, MUCH, MUCH LONGER than access time to memory. Cost per bit of disk is MUCH less than cost per bit of memory. Disk storage is mainly used for FILE SYSTEMS. The next slide shows a magnetic disk drive with the top of its case removed...

slide 11/41: (photo of a magnetic disk drive; image is Figure 8.19 from Harris D. M. and Harris S. L., Digital Design and Computer Architecture, 2nd ed., (c) 2013, Elsevier, Inc.)

slide 12/41: Describing Capacity: kilobytes, megabytes, gigabytes, kilobits, megabits, gigabits

B means byte (8 bits); b means bit. But often B and b get mixed up, so watch out for mistakes! When describing memory size, powers of two are always used (e.g., 1 KB = 1024 bytes, 1 MB = 1,048,576 bytes). When describing disk capacity or data transfer rate, powers of ten are more frequently used (e.g., 160 GB = 160,000,000,000 bytes).

slide 13/41: Memory capacity definitions

1 KB = 1 kilobyte = 2^10 bytes = 1,024 bytes.
1 MB = 1 megabyte = 2^20 bytes = 1,048,576 bytes.
1 GB = 1 gigabyte = 2^30 bytes = 1,073,741,824 bytes.

So, how many one-bit memory cells does it take to make a 256 MB memory circuit?

slide 14/41: Does it bother you that 1 Mb might or might not be exactly 1,000,000 bits?

It should! Engineers should try to avoid ambiguity and imprecision in technical communication! IEEE standard 1541 (unfortunately not in wide use) proposes new prefixes and symbols for powers of two: kibi- (Ki), mebi- (Mi), gibi- (Gi), etc. Example: 1 Mib = 1 mebibit = 2^20 bits. These prefixes are nice, but won't be used in ENCM 369; we will stick with the more commonly used prefixes and symbols.

slide 15/41: ROM and RAM: ROM

ROM: read-only memory. ROM is useful in many simple embedded systems, where programs do not change. (The programs are sometimes called firmware instead of software.) An important use of ROM in smartphones, laptops, desktops, and servers is holding instructions that start the process of loading the operating system kernel into memory.

slide 16/41: ROM and RAM: RAM

RAM: random-access memory. Random-access means accesses do not have to be in sequence by address. (But ROM allows random access for reads.) A more accurate (but unpronounceable) name would be RWM: readable-writable memory; being writable is what makes RAM different from ROM.

slide 17/41: Silicon memory technologies

Silicon is a semiconductor, so often the term semiconductor memory is used. Important types of silicon memory are DRAM, SRAM, and flash. See Sections 5.5.2-5.5.4 of the textbook for brief notes on DRAM and SRAM, and Section 5.5.6 for brief notes on flash memory.

slide 18/41: DRAM: dynamic RAM

DRAM is cheap, compared to SRAM. DRAM has one transistor plus one very, very small capacitor per bit, giving very high density. DRAM access time is typically tens of nanoseconds, much longer than a processor clock cycle, which is currently 0.5 ns (500 ps) or less.

slide 19/41: DRAM (continued)

DRAM is used for main memory in smartphones, tablets, laptops, desktops, and servers, in various flavours such as DDR3 SDRAM (3rd-generation double-data-rate synchronous DRAM). DRAM chips are installed on DIMMs (dual inline memory modules) that plug into computer motherboards.

slide 20/41: Two SO-DIMMs from a 2007 MacBook

(photo) SO means small outline. 4 DRAM chips per DIMM are hiding under the white paper labels. These two 256 MB DIMMs were replaced with two 1 GB DIMMs to upgrade RAM from 512 MB to 2 GB.

slide 21/41: SRAM: static RAM

SRAM typically uses 6 transistors per bit, so it has lower density than DRAM. SRAM is more expensive than DRAM. SRAM is much faster than DRAM: access time can be as low as 1 or 2 processor clock cycles. These days, SRAM is usually on the same chip as the processor cores. See, for example, the L2 cache in textbook Figure 7.79, page 464, on the next slide.

slide 22/41: (die photo showing Core 1, Core 2, and a 2 MB Shared Level 2 Cache; image is Figure 7.79 from Harris D. M. and Harris S. L., Digital Design and Computer Architecture, 2nd ed., (c) 2013, Elsevier, Inc.)

Chip dimensions are about 15 mm by 15 mm. Each computer in ICT 0 has two of these chips. The technology is 6-7 years old, but is still awesome!

slide 23/41: Flash memory

Compared to DRAM, flash is cheaper per bit, but much slower. Unlike DRAM or SRAM, flash maintains data when power is turned off. Flash is useful for file systems for computers where hard disks would be too slow or too bulky. (Solid-state drives store files in flash memory.) The name flash comes from the electrical process used to erase a block of memory, and has nothing to do with relative speed.

slide 24/41: Three things containing flash memory

(photo)

slide 25/41, slide 26/41:

(diagram: the processor core sends instruction addresses and data addresses to the memory system, which returns instructions and data for loads and stores)

Suppose we are trying to build a memory system for a MIPS processor with a simple 5-stage pipeline like the one we've just studied. What requirements does the processor design place on the memory system?

slide 27/41: Keeping things relatively simple

When we start looking at caches, we'll just look at supporting instruction fetch, LW and SW. We will not worry about these extra complexities...

- To implement the entire MIPS instruction set, we would also need to support loads and stores of bytes, halfwords (16-bit chunks) and doublewords (64-bit chunks).
- Multiple issue (starting 2 or more instructions in 1 clock cycle within a single core) and supporting multiple cores both require more complex memory interfaces.

slide 28/41: A simple model of memory: A BIG ARRAY OF WORDS

A definition for the word model: a description of a system that, while not perfectly accurate, helps explain or predict the behaviour of the system. Let's make a sketch of the memory model introduced at the beginning of ENCM 369... How was this model modified in Chapter 7 of the textbook?

slide 29/41: Can the simple big-array-of-words model be used for a hardware design?

Could memory be organized as a big array (or two big arrays) of ___-bit groups of one-bit DRAM cells? Could memory be organized as a big array (or two big arrays) of ___-bit groups of one-bit SRAM cells?

slide 30/41: What can be done?

Can we combine a very fast small-capacity memory (SRAM) and a slower large-capacity memory (DRAM) to make a system that performs almost as well as an impossible-to-build very fast large-capacity memory? Answer: YES, such a system can often (but not always) perform like very big, very fast memory. The small, very fast part of the system is called a cache.
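The big-array-of-words model on slide 28 can be sketched in a few lines of C. This is my own illustration, not code from the course; it assumes MIPS-style 32-bit words and word-aligned byte addresses:

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 1024            /* a tiny memory: 1024 32-bit words */

static uint32_t mem[MEM_WORDS];   /* the "big array of words" */

/* Model of LW: read the word at a word-aligned byte address. */
uint32_t load_word(uint32_t byte_addr) {
    assert(byte_addr % 4 == 0 && byte_addr / 4 < MEM_WORDS);
    return mem[byte_addr / 4];
}

/* Model of SW: write the word at a word-aligned byte address. */
void store_word(uint32_t byte_addr, uint32_t value) {
    assert(byte_addr % 4 == 0 && byte_addr / 4 < MEM_WORDS);
    mem[byte_addr / 4] = value;
}
```

In this model every access costs the same: nothing. Slides 29 and 30 ask whether hardware can actually be built to behave this way, and the answer to that question is what leads to caches.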

slide 31/41: Memory organization in real computers

slide 32/41:

See page 1 of the Introduction to Memory Organization lecture document. This is way more complicated than the simple big-array-of-words model! We will not study all parts in detail, but will focus on key components: caches (textbook Section 8.3), and address translation and virtual memory (Section 8.4).

slide 33/41: What is a cache?

A cache is a small, very fast memory system that holds copies of some, not all, of the instructions or data in main memory. Processor cores look in caches, NOT in main memory, to find instructions and data.

slide 34/41: System with one level of cache, no address translation

(diagram: the processor core, with control, registers, ALUs, and floating-point hardware, sends physical addresses to an I-cache for instructions and to a D-cache for data; the caches exchange physical addresses and instructions or data with main memory)

This is from page 2 of Introduction to Memory Organization. Keep this picture in mind as we discuss cache organization.

slide 35/41: Simple examples of cache operation

We'll consider the system with one level of cache and no address translation. What happens in instruction fetch? What happens in the data memory access step of a load instruction? What about data memory access in a store instruction?

slide 36/41: Definitions: Hits and Misses

Cache hit: event in which the processor finds the instruction or data it needs in the cache.

Cache miss: event in which the instruction or data the processor needs is NOT in the cache. A miss causes a long stall while instructions or data are copied from main memory to the cache.

(These definitions apply to instruction fetches and data memory reads. We will have to modify them a little for data memory writes.)
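The hit and miss events defined on slide 36 can be illustrated with a toy direct-mapped cache that records only valid bits and tags. This is a sketch of the general idea, not of any particular cache in the textbook; the 8-block size and the index/tag split are my own choices:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_BLOCKS 8     /* 8 one-word blocks, so 3 index bits */

typedef struct {
    bool     valid[NUM_BLOCKS];
    uint32_t tag[NUM_BLOCKS];
} Cache;

/* Look up a word address. On a miss, install the block, as a real
   cache would after a long stall to copy it from main memory.
   Returns true for a hit, false for a miss. */
bool cache_access(Cache *c, uint32_t word_addr) {
    uint32_t index = word_addr % NUM_BLOCKS;   /* which block frame   */
    uint32_t tag   = word_addr / NUM_BLOCKS;   /* which memory region */
    if (c->valid[index] && c->tag[index] == tag)
        return true;                           /* cache hit  */
    c->valid[index] = true;                    /* cache miss */
    c->tag[index]   = tag;
    return false;
}
```

Accessing the same address twice gives a miss and then a hit (temporal locality at work); two addresses that share an index but differ in tag keep evicting each other.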

slide 37/41: Capacity: Cache vs. Main Memory

A typical recent medium-quality laptop (Apple MacBook Air, March 2014) has 4 GB of main memory DRAM. It can cache up to 3 MB of instructions and data in SRAM within the processor chip. What fraction of main memory contents can be cached? Does that mean that caches are not very useful?

slide 38/41: Locality of Reference

This is a property computer programs have, to varying degrees. (It doesn't make sense to say that a processor or memory system has good or bad locality of reference.) Locality of reference is what makes caches valuable.

slide 39/41: Temporal locality of reference

Temporal means related to time. If a program accesses a memory word, the program is likely to access the same word in the near future. What are some examples of temporal locality?

slide 40/41: Spatial locality of reference

In this context, spatial means related to the address space of a computer, not related to physical space. If a program accesses a memory word, the program is likely to access other words with nearby addresses in the near future. What are some examples of spatial locality?

slide 41/41: Locality of Reference and Cache Hit Rates

Programs with high degrees of locality of reference tend to have high cache hit rates: 95 to 99% of memory accesses are cache hits. So caches can enable many programs to have performance close to that of a hypothetical computer in which all memory accesses take 1-2 clock cycles. What about programs with bad locality of reference?