ECE 485/585 Midterm Exam


Time allowed: 100 minutes
Total Points: 65
Points Scored:
Name:

Problem No. 1 (12 points)

For each of the following statements, indicate whether the statement is TRUE or FALSE:

(a) The following is an example of an instruction used in memory-mapped I/O: IN AX, 4. FALSE
(b) In an architecture that restricts memory operand alignment, a double-word write starting at the following hexadecimal address will result in an un-aligned access: 0x4273fb6a. TRUE
(c) A breakpoint inserted in the code by a debugger will result in a synchronous interrupt. TRUE
(d) If the memory access pattern exhibits high spatial locality, it is better to use high-order address interleaving. FALSE
(e) In a burst EDO DRAM, two different column addresses must be specified for accesses to two consecutive columns. FALSE
(f) Increasing the associativity of a cache has no impact on compulsory misses. TRUE

Problem No. 2 (9 points)

For each of the following questions, encircle ALL the correct answers:

(a) When assigning interrupt priorities among multiple interrupt requests, the following factors need to be considered:
    i. Relative importance of the I/O device that generated the request
    ii. Length of the Interrupt Service Routine
    iii. Ability of the I/O device to buffer data
    iv. All of the above

(b) A DDR3-1333 DRAM has the following timing parameters: t_RCD = 6 cycles, t_RP = 6 cycles, t_RAS = 14 cycles, t_RRD = 3 cycles. What is the minimum time between activating two rows in two different banks?
    i. 9 ns
    ii. 30 ns
    iii. 4.5 ns
    iv. None of the above

(c) When comparing SRAM with DRAM, which of the following statements are correct?
    i. SRAM has lower density compared to DRAM
    ii. SRAM is easier to integrate with logic circuits as compared to DRAM
    iii. SRAM requires multiplexed address lines whereas DRAM does not
    iv. All of the above
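The arithmetic behind Problem 2(b) can be sketched as follows. This is a minimal illustration (the helper name is mine, not from the exam), assuming DDR3-1333 transfers 1333 MT/s on a command clock running at half that rate. The minimum spacing between ACTIVATE commands to two different banks is t_RRD.

```python
# Sketch of the Problem 2(b) arithmetic (helper name is hypothetical).
# DDR3-1333 runs at 1333 MT/s, i.e. a ~666.5 MHz command clock.
def cycles_to_ns(cycles, transfer_rate_mts):
    clock_mhz = transfer_rate_mts / 2   # DDR: two transfers per clock
    cycle_ns = 1000.0 / clock_mhz       # nanoseconds per command clock
    return cycles * cycle_ns

# Minimum gap between ACTIVATEs to two different banks is t_RRD = 3 cycles.
print(cycles_to_ns(3, 1333))            # ~4.5 ns, matching option iii
```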

Problem No. 3 (10 points)

(a) (6 points) In this problem, your objective is to design a finite state machine that recognizes the pattern 10110. The input to the finite state machine is a sequence of binary bits arriving in series. When the FSM sees the pattern 10110 in its most recent input bits, it should output 1; otherwise it should output 0. Draw the state transition diagram for this FSM.

(b) (2 points) State ONE advantage of using DMA as compared to using programmed I/O.

DMA frees the CPU from having to coordinate every single transfer of bytes between an I/O device and memory. This allows the CPU to carry out other tasks while a data transfer is in progress.

(c) (2 points) What is the purpose of Minimum/Maximum mode in an 8086 CPU?

Maximum mode is used to support a co-processor.
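The FSM of part (a) can be checked in software. Below is a sketch of one correct transition table (the state encoding is mine): state s holds the length of the longest prefix of 10110 matched by the most recent input bits, and the output is 1 exactly on the transition that completes the pattern, with overlapping matches allowed.

```python
# Mealy FSM recognizing the serial bit pattern 10110 (overlaps allowed).
# State s = length of the longest prefix of "10110" matched so far.
NEXT = {
    0: {0: 0, 1: 1},
    1: {0: 2, 1: 1},
    2: {0: 0, 1: 3},
    3: {0: 2, 1: 4},
    4: {0: 2, 1: 1},   # input 0 here completes "10110"; restart at prefix "10"
}

def run_fsm(bits):
    state, out = 0, []
    for b in bits:
        out.append(1 if (state == 4 and b == 0) else 0)  # Mealy output
        state = NEXT[state][b]
    return out

print(run_fsm([1, 0, 1, 1, 0, 1, 1, 0]))  # [0, 0, 0, 0, 1, 0, 0, 1]
```

Note the second match at the end reuses the trailing "110" of the first, which is why state 4 falls back to state 2 (prefix "10") rather than state 0 after a match.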

Problem No. 4 (14 points)

(a) (8 points) The following table shows the cache configuration for three different caches (C1, C2 and C3) in terms of cache size, line size and associativity. For each of the caches, fill in the missing entries in the table: (i) Number of sets, (ii) Number of address bits needed for the Index field, (iii) Number of address bits needed for the Tag field. Assume that the processor uses 32-bit addresses:

CACHE   CACHE SIZE   LINE SIZE   ASSOCIATIVITY           NUMBER OF SETS   INDEX BITS   TAG BITS
C1      64 KB        64 B        Direct mapped (1-way)   1024             10           16
C2      256 KB       64 B        8-way set associative   512              9            17
C3      16 KB        32 B        Fully associative       1                0            27

(b) (3 points) What is the minimum burst length supported in DDR2? Why do DDRx memories not support a burst length of 1?

DDR2 supports a minimum burst length of 4. DDRx memories carry out two data transfers per clock cycle. They accomplish that by doing a 2n or greater prefetch (where n is the width of the data bus). Since at least 2n bits of data have already been prefetched, using a burst length of 1 would simply waste data bus bandwidth.

(c) (3 points) Describe a scenario in which a cache write request sent by the processor results in a memory write followed by a memory read.

Consider a cache that uses write-allocate and write-back policies, and a write to address A which results in a cache miss. The cache decides to evict block B to make room for A. Assume that B had its dirty bit set to 1. Evicting B therefore results in a memory write. After B has been evicted, A is fetched from memory, which results in a memory read.
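The table entries in part (a) follow mechanically from the cache geometry. A small sketch of that arithmetic (the helper name is mine): sets = cache size / (line size * associativity), the index field addresses the sets, the offset field addresses bytes within a line, and the tag takes whatever address bits remain.

```python
# Sketch of the Problem 4(a) arithmetic for a 32-bit address space.
from math import log2

def cache_fields(size_bytes, line_bytes, ways, addr_bits=32):
    sets = size_bytes // (line_bytes * ways)   # fully associative -> 1 set
    offset_bits = int(log2(line_bytes))        # byte offset within a line
    index_bits = int(log2(sets))               # selects the set
    tag_bits = addr_bits - index_bits - offset_bits
    return sets, index_bits, tag_bits

print(cache_fields(64 * 1024, 64, 1))                # C1: (1024, 10, 16)
print(cache_fields(256 * 1024, 64, 8))               # C2: (512, 9, 17)
print(cache_fields(16 * 1024, 32, 16 * 1024 // 32))  # C3: (1, 0, 27)
```

For C3 the associativity equals the total number of lines (16 KB / 32 B = 512 ways), which is what "fully associative" means here.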

Problem No. 5 (20 points)

A processor uses a dual-rank DDR4-2400 memory system. The following table shows the relevant memory system parameters. Assume that the memory controller is using an open page policy, such that once a row in a bank has been activated, it is kept open as long as there is no conflicting request to a different row in the same bank.

DRAM Parameter             Value
Number of ranks            2
DRAM channel width         64 bits
DRAM chip output width     4 bits
DRAM chip capacity         16 Gbits
Number of banks            16
Row size                   4 KBytes
Burst length               16
Memory controller policy   Open page
t_RCD                      10 cycles
t_CL                       10 cycles
t_RP                       10 cycles

Answer the following questions:

(a) (3 points) Calculate the total DRAM capacity available in the system.

# of ranks = 2, DRAM chip capacity = 16 Gbits
# of DRAM chips per rank = DRAM channel width / DRAM chip output width = 64 / 4 = 16
Capacity of each DRAM rank = 16 Gbits * 16 = 256 Gbits = 32 GBytes
Total DRAM capacity = Capacity of each rank * # of ranks = 32 GBytes * 2 = 64 GBytes

(b) (6 points) Calculate the number of bits needed to specify each of the following fields in the physical address: (i) Rank, (ii) Bank, (iii) Column, and (iv) Row.

Number of bits needed to specify the desired rank = log2(# of ranks) = log2(2) = 1
Number of bits needed to specify the desired bank = log2(# of banks) = log2(16) = 4
# of DRAM rows per bank = Capacity of each bank / Capacity of each row = (Rank capacity / # of banks per rank) / Row capacity = (32 GBytes / 16) / 4 KBytes = 2 GBytes / 4 KBytes = 2^31 / 2^12 = 2^19
Therefore, number of bits needed to specify the desired row = log2(2^19) = 19
Width of each column = Channel width = 64 bits = 8 bytes = 2^3 bytes
Number of columns per row = Row capacity / Column width = 4 KB / 8 B = 2^12 / 2^3 = 2^9 = 512
Therefore, number of bits needed to specify the column field = log2(512) = 9
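The capacity and address-field arithmetic of parts (a) and (b) can be sketched as below (variable names are mine, not from the exam):

```python
# Sketch of the Problem 5(a)/(b) arithmetic.
from math import log2

GiB = 2**30
chips_per_rank = 64 // 4                  # channel width / chip output width
chip_bytes = 16 * 2**30 // 8              # 16 Gbit chip -> 2 GiB
rank_bytes = chip_bytes * chips_per_rank  # 32 GiB per rank
total_bytes = rank_bytes * 2              # two ranks -> 64 GiB

rank_bits = int(log2(2))                                # 1
bank_bits = int(log2(16))                               # 4
rows_per_bank = (rank_bytes // 16) // (4 * 1024)        # bank cap / row size
row_bits = int(log2(rows_per_bank))                     # 19
col_bits = int(log2((4 * 1024) // 8))                   # row size / 8 B column

print(total_bytes // GiB)                               # 64 (GiB)
print(rank_bits, bank_bits, row_bits, col_bits)         # 1 4 19 9
```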

(c) (6 points) Consider a memory access sequence which requires the processor to read the ENTIRE contents of a single DRAM row R1 in bank B1. Before this access sequence can proceed, the currently open row in bank B1 (a row different from R1) needs to be closed. In the absence of any other memory requests, how long (in nanoseconds) will it take for the memory controller to complete the access sequence to row R1?

Clock speed for DDR4-2400 memory = 2400 / 2 = 1200 MHz
Therefore, 1 DRAM clock cycle = 1 / 1200 MHz = 0.833 ns

The access sequence requires the following steps:
(i) The previously open row is closed (takes t_RP)
(ii) Row R1 is activated (takes t_RCD)
(iii) A CAS is sent to select the first column in the row (takes t_CL)
(iv) The ENTIRE row is transferred to the processor. This requires 4 KB / 8 B = 512 transfers, or 512 / 2 = 256 clock cycles.

Therefore, total time taken to read the ENTIRE contents of the row = 0.833 * (10 + 10 + 10 + 256) = 238.2 nanoseconds

(d) (5 points) Assume that each DRAM row must be refreshed once every 64 milliseconds. For that purpose, refresh commands are periodically sent to each DRAM rank. Each refresh command triggers a parallel refresh operation in every bank within a rank, causing 16 rows to be refreshed in each bank. Assume that each refresh command takes 400 ns (t_RFC). Calculate the fraction of time for which the memory system is unable to service memory requests due to refresh activity.

Number of rows refreshed in a rank by a single refresh command = 16 rows per bank * 16 banks per rank = 256 rows = 2^8 rows
Total number of rows in a rank = Rank capacity / Row capacity = 32 GBytes / 4 KBytes = 2^35 / 2^12 = 2^23
Number of refresh commands needed in a 64 ms period = 2^23 / 2^8 = 2^15
Therefore, t_REFI = 64 ms / 2^15 = 1.95 microseconds
Fraction of time for which the memory system is unavailable due to refresh = t_RFC / t_REFI = 400 ns / 1.95 microseconds = 20.5%