A high-level model of embedded flash energy consumption

Size: px
Start display at page:

Download "A high-level model of embedded flash energy consumption"

Transcription

1 A high-level model of embedded flash energy consumption To appear in CASES, ESWEEK 2014 James Pallister, PhD Student Kerstin Eder, Primary PhD Supervisor Simon Hollis, PhD Supervisor Jeremy Bennett, Embecosm

2 Outline Embedded flash memory Effect on energy Modelling Modelling instruction streams Optimisation

3 Outline Hardware Embedded flash memory Effect on energy Modelling Modelling instruction streams Software Optimisation

4 What is embedded flash memory? Not what we typically think of Similar to flash drives/ssds Flash memory that is on the same die as the processor

5 Deeply embedded SoCs

6 Deeply embedded SoCs Small, cache-less, SoCs

7 Deeply embedded SoCs Small, cache-less, SoCs Simple processors

8 Deeply embedded SoCs Small, cache-less, SoCs Simple processors Typically flash and RAM

9 Deeply embedded SoCs Small, cache-less, SoCs Simple processors Typically flash and RAM Single cycle access to both

10 Deeply embedded SoCs Small, cache-less, SoCs Simple processors Typically flash and RAM Single cycle access to both Code executed directly out of flash

11 Deeply embedded SoCs Small, cache-less, SoCs Simple processors Typically flash and RAM Single cycle access to both Code executed directly out of flash Low energy requirements

12 Deeply embedded SoCs Small, cache-less, SoCs Simple processors Typically flash and RAM Single cycle access to both Code executed directly out of flash Low energy requirements Position of code in flash affects the energy?

13 Flash memory

14 Flash memory Precharge bit-lines

15 Flash memory Address is fed into the decoder

16 Flash memory Decode the address, select the block

17 Flash memory Assert the control gate on the desired word-line

18 Flash memory Connect the flash cell to the bit-line with the select gates

19 Flash memory The precharged bit-lines are pulled up or down, depending on the charge in the flash cell

20 Flash memory Sense amplifiers Detect the change and buffer the output onto the data bus

21 Expected energy effect Pages, blocks, word-lines, bit-lines Changing each of these has an energy cost

22 Expected energy effect Bit-lines Decoder Decoder Word-lines

23 Expected energy effect Bit-lines Word-lines Decoder Decoder n-bit length instructions loop: nop nop subs r0, #1 bne loop

24 Expected energy effect Bit-lines Word-lines Decoder n-bit length instructions loop: nop nop subs r0, #1 Decoder bne loop

25 Expected energy effect Bit-lines Word-lines Decoder n-bit length instructions loop: nop nop subs r0, #1 Decoder bne loop Straddling two blocks.

26 Expected energy effect Bit-lines Word-lines Decoder n-bit length instructions loop: nop nop subs r0, #1 Decoder bne loop Straddling two blocks. Each iteration must repeatedly power up one, then the other

27 Expected energy effect Bit-lines Word-lines Decoder n-bit length instructions loop: nop nop subs r0, #1 Decoder bne loop Straddling two blocks. Each iteration must repeatedly power up one, then the other Higher energy cost when an instruction jumps from one 'region' to another.

28 Expected energy effect Bit-lines Word-lines Decoder Decoder n-bit length instructions Straddling two blocks. loop: nop nop subs r0, #1 bne loop Each iteration must repeatedly power up one, then the other Higher energy cost when an instruction jumps from one 'region' to another.

29 Actual results Bottom line (blue) 8-byte loop Top line (green) 10-byte loop

30 Actual results Bottom line (blue) 8-byte loop Top line (green) 10-byte loop

31 Actual results Bottom line (blue) 8-byte loop Top line (green) 10-byte loop

32 Actual results Bottom line (blue) 8-byte loop Top line (green) 10-byte loop

33 Modelling Each consecutive memory access has an address dependent energy consumption. E.g

34 Modelling Each consecutive memory access has an address dependent energy consumption. E.g

35 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

36 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

37 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

38 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

39 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

40 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

41 Modelling Each consecutive memory access has an address dependent energy consumption. E.g

42 Modelling Each consecutive memory access has an address dependent energy consumption. E.g

43 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

44 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

45 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

46 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

47 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

48 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

49 Modelling Each consecutive memory access has an address dependent energy consumption. E.g byte

50 Modelling - formula

51 Modelling - formula

52 Modelling - formula

53 Modelling - formula

54 Modelling - formula

55 Modelling multiple instructions Addr Code loop: add ldr cmp bne r1, #1 r0, [r3, #4] r1, r2 loop Each instruction is 2 byte Each memory access is two bytes

56 Modelling multiple instructions Addr Code loop: add ldr cmp bne r1, #1 r0, [r3, #4] r1, r2 loop 0 2

57 Modelling multiple instructions Addr ? Code loop: add ldr cmp bne r1, #1 r0, [r3, #4] r1, r2 loop.word 0x ?

58 Modelling multiple instructions Addr ? Code loop: add ldr cmp bne r1, #1 r0, [r3, #4] r1, r2 loop.word 0x ?? 4

59 Modelling multiple instructions Addr Code loop: add ldr cmp bne r1, #1 r0, [r3, #4] r1, r2 loop 0 2 2?? 4 4 6

60 Modelling multiple instructions Addr Code loop: add r1, #1 ldr r0, [r3, #4] cmp r1, r2 bne loop prefetched 0 2 2??

61 Modelling multiple instructions Addr Code loop: add r1, #1 ldr r0, [r3, #4] cmp r1, r2 bne loop prefetched 0 2 2??

62 Modelling multiple instructions Addr Code loop: add r1, #1 ldr r0, [r3, #4] cmp r1, r2 bne loop prefetched 0 2 2??

63 Approximating Ignore the data accesses Mostly to RAM Infrequently to flash (if so, usually one time loads) Almost allows us to statically analyse code for flash energy consumption Conditional branches still unknown

64 Parameters

65 Cross validation STM32F0

66 A potential optimisation For code which is executed frequently, ensure it crosses as few costly boundaries as possible. Use the model to make these decisions.

67 A potential optimisation For code which is executed frequently, ensure it crosses as few costly boundaries as possible. Use the model to make these decisions.

68 A potential optimisation For code which is executed frequently, ensure it crosses as few costly boundaries as possible. Use the model to make these decisions.

69 Conclusion 5-15% energy reduction Flexible model Validated on 5 SoCs Average error: 11% Compiler optimisation to be implemented Analysis indicates 30-40% of all loops can be better aligned 4% increase in executable size

70 Thanks! Preprint: Final version to appear in CASES, ESWEEK 2014.

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved.

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved. Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com Internal Memory http://www.yildiz.edu.tr/~naydin 1 2 Outline Semiconductor main memory Random Access Memory

More information

Who Ate My Battery? Why Free and Open Source Systems Are Solving the Problem of Excessive Energy Consumption. Jeremy Bennett

Who Ate My Battery? Why Free and Open Source Systems Are Solving the Problem of Excessive Energy Consumption. Jeremy Bennett Who Ate My Battery? Why Free and Open Source Systems Are Solving the Problem of Excessive Energy Consumption Jeremy Bennett Why? Ericsson T65 released 2001 Li-Ion 720 mah standby 300 h talk time 11 h includes

More information

CS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory

CS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory CS65 Computer Architecture Lecture 9 Memory Hierarchy - Main Memory Andrew Sohn Computer Science Department New Jersey Institute of Technology Lecture 9: Main Memory 9-/ /6/ A. Sohn Memory Cycle Time 5

More information

arxiv: v2 [cs.oh] 3 Jun 2014

arxiv: v2 [cs.oh] 3 Jun 2014 Optimizing the flash-ram energy trade-off in deeply embedded systems James Pallister 1, Kerstin Eder 1 and Simon J. Hollis 1 1 University of Bristol arxiv:1406.0403v2 [cs.oh] 3 Jun 2014 Abstract Deeply

More information

arxiv: v2 [cs.oh] 3 Jun 2014

arxiv: v2 [cs.oh] 3 Jun 2014 Optimizing the flash-ram energy trade-off in deeply embedded systems James Pallister 1, Kerstin Eder 1 and Simon J. Hollis 1 1 University of Bristol arxiv:1406.0403v2 [cs.oh] 3 Jun 2014 Abstract Deeply

More information

Scalability in Compiler Development How to Get Testing and Optimization Done in a Reasonable Time. Jeremy Bennett

Scalability in Compiler Development How to Get Testing and Optimization Done in a Reasonable Time. Jeremy Bennett Scalability in Compiler Development How to Get Testing and Optimization Done in a Reasonable Time Jeremy Bennett Machine Learning Compilers Do Compilers Affect Energy? Initial research in 2012 by Embecosm

More information

ECEN 449 Microprocessor System Design. Memories. Texas A&M University

ECEN 449 Microprocessor System Design. Memories. Texas A&M University ECEN 449 Microprocessor System Design Memories 1 Objectives of this Lecture Unit Learn about different types of memories SRAM/DRAM/CAM Flash 2 SRAM Static Random Access Memory 3 SRAM Static Random Access

More information

Who Ate My Battery? Why Free and Open Source Systems Are Solving the Problem of Excessive Energy Consumption

Who Ate My Battery? Why Free and Open Source Systems Are Solving the Problem of Excessive Energy Consumption Who Ate My Battery? Why Free and Open Source Systems Are Solving the Problem of Excessive Energy Consumption Jeremy Bennett, Embecosm Kerstin Eder, Computer Science, University of Bristol Why? Ericsson

More information

Chapter 5 Internal Memory

Chapter 5 Internal Memory Chapter 5 Internal Memory Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM) Read-write memory Electrically, byte-level Electrically Volatile Read-only memory (ROM) Read-only

More information

Topic 21: Memory Technology

Topic 21: Memory Technology Topic 21: Memory Technology COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Old Stuff Revisited Mercury Delay Line Memory Maurice Wilkes, in 1947,

More information

Topic 21: Memory Technology

Topic 21: Memory Technology Topic 21: Memory Technology COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Old Stuff Revisited Mercury Delay Line Memory Maurice Wilkes, in 1947,

More information

Pipelining, Branch Prediction, Trends

Pipelining, Branch Prediction, Trends Pipelining, Branch Prediction, Trends 10.1-10.4 Topics 10.1 Quantitative Analyses of Program Execution 10.2 From CISC to RISC 10.3 Pipelining the Datapath Branch Prediction, Delay Slots 10.4 Overlapping

More information

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013 ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013 Professor: Sherief Reda School of Engineering, Brown University 1. [from Debois et al. 30 points] Consider the non-pipelined implementation of

More information

CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda

CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda biswap@cse.iitk.ac.in https://www.cse.iitk.ac.in/users/biswap/cs698y.html Row decoder Accessing a Row Access Address

More information

CS152 Computer Architecture and Engineering Lecture 16: Memory System

CS152 Computer Architecture and Engineering Lecture 16: Memory System CS152 Computer Architecture and Engineering Lecture 16: System March 15, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson

More information

Computer Organization. 8th Edition. Chapter 5 Internal Memory

Computer Organization. 8th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM)

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Cache 11232011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review Memory Components/Boards Two-Level Memory Hierarchy

More information

CPU Structure and Function

CPU Structure and Function Computer Architecture Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com http://www.yildiz.edu.tr/~naydin CPU Structure and Function 1 2 CPU Structure Registers

More information

The Memory Hierarchy 1

The Memory Hierarchy 1 The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow

More information

Module 6 : Semiconductor Memories Lecture 30 : SRAM and DRAM Peripherals

Module 6 : Semiconductor Memories Lecture 30 : SRAM and DRAM Peripherals Module 6 : Semiconductor Memories Lecture 30 : SRAM and DRAM Peripherals Objectives In this lecture you will learn the following Introduction SRAM and its Peripherals DRAM and its Peripherals 30.1 Introduction

More information

Slide Set 9. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 9. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 9 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary March 2018 ENCM 369 Winter 2018 Section 01

More information

The Memory Hierarchy Part I

The Memory Hierarchy Part I Chapter 6 The Memory Hierarchy Part I The slides of Part I are taken in large part from V. Heuring & H. Jordan, Computer Systems esign and Architecture 1997. 1 Outline: Memory components: RAM memory cells

More information

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory Memory systems Memory technology Memory hierarchy Virtual memory Memory technology DRAM Dynamic Random Access Memory bits are represented by an electric charge in a small capacitor charge leaks away, need

More information

Memory Controller. Speaker: Tzu-Wei Tseng. Adopted from National Taiwan University SoC Design Laboratory. SOC Consortium Course Material

Memory Controller. Speaker: Tzu-Wei Tseng. Adopted from National Taiwan University SoC Design Laboratory. SOC Consortium Course Material Memory Controller Speaker: Tzu-Wei Tseng Adopted from National Taiwan University SoC Design Laboratory SOC Consortium Course Material Goal of This Lab Familiarize with ARM memory interface Know ARM Integrator

More information

! Memory Overview. ! ROM Memories. ! RAM Memory " SRAM " DRAM. ! This is done because we can build. " large, slow memories OR

! Memory Overview. ! ROM Memories. ! RAM Memory  SRAM  DRAM. ! This is done because we can build.  large, slow memories OR ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec 2: April 5, 26 Memory Overview, Memory Core Cells Lecture Outline! Memory Overview! ROM Memories! RAM Memory " SRAM " DRAM 2 Memory Overview

More information

COSC 6385 Computer Architecture - Memory Hierarchy Design (III)

COSC 6385 Computer Architecture - Memory Hierarchy Design (III) COSC 6385 Computer Architecture - Memory Hierarchy Design (III) Fall 2006 Reducing cache miss penalty Five techniques Multilevel caches Critical word first and early restart Giving priority to read misses

More information

Introduction to memory system :from device to system

Introduction to memory system :from device to system Introduction to memory system :from device to system Jianhui Yue Electrical and Computer Engineering University of Maine The Position of DRAM in the Computer 2 The Complexity of Memory 3 Question Assume

More information

Faculty of Science FINAL EXAMINATION

Faculty of Science FINAL EXAMINATION Faculty of Science FINAL EXAMINATION COMPUTER SCIENCE COMP 273 INTRODUCTION TO COMPUTER SYSTEMS Examiner: Prof. Michael Langer April 18, 2012 Associate Examiner: Mr. Joseph Vybihal 2 P.M. 5 P.M. STUDENT

More information

Design and Implementation of an AHB SRAM Memory Controller

Design and Implementation of an AHB SRAM Memory Controller Design and Implementation of an AHB SRAM Memory Controller 1 Module Overview Learn the basics of Computer Memory; Design and implement an AHB SRAM memory controller, which replaces the previous on-chip

More information

ARM ARCHITECTURE. Contents at a glance:

ARM ARCHITECTURE. Contents at a glance: UNIT-III ARM ARCHITECTURE Contents at a glance: RISC Design Philosophy ARM Design Philosophy Registers Current Program Status Register(CPSR) Instruction Pipeline Interrupts and Vector Table Architecture

More information

MIPS) ( MUX

MIPS) ( MUX Memory What do we use for accessing small amounts of data quickly? Registers (32 in MIPS) Why not store all data and instructions in registers? Too much overhead for addressing; lose speed advantage Register

More information

Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture. This Unit: Caches and Memory Hierarchies.

Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture. This Unit: Caches and Memory Hierarchies. Introduction to Computer Architecture Caches and emory Hierarchies Copyright 2012 Daniel J. Sorin Duke University Slides are derived from work by Amir Roth (Penn) and Alvin Lebeck (Duke) Spring 2012 Where

More information

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations? Brown University School of Engineering ENGN 164 Design of Computing Systems Professor Sherief Reda Homework 07. 140 points. Due Date: Monday May 12th in B&H 349 1. [30 points] Consider the non-pipelined

More information

Memories: Memory Technology

Memories: Memory Technology Memories: Memory Technology Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 * Memory Hierarchy

More information

William Stallings Computer Organization and Architecture 8th Edition. Chapter 5 Internal Memory

William Stallings Computer Organization and Architecture 8th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory The basic element of a semiconductor memory is the memory cell. Although a variety of

More information

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng.

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng. CS 265 Computer Architecture Wei Lu, Ph.D., P.Eng. Part 4: Memory Organization Our goal: understand the basic types of memory in computer understand memory hierarchy and the general process to access memory

More information

ARM Cortex-M4 Architecture and Instruction Set 3: Branching; Data definition and memory access instructions

ARM Cortex-M4 Architecture and Instruction Set 3: Branching; Data definition and memory access instructions ARM Cortex-M4 Architecture and Instruction Set 3: Branching; Data definition and memory access instructions M J Brockway February 17, 2016 Branching To do anything other than run a fixed sequence of instructions,

More information

Chapter 6 (Lect 3) Counters Continued. Unused States Ring counter. Implementing with Registers Implementing with Counter and Decoder

Chapter 6 (Lect 3) Counters Continued. Unused States Ring counter. Implementing with Registers Implementing with Counter and Decoder Chapter 6 (Lect 3) Counters Continued Unused States Ring counter Implementing with Registers Implementing with Counter and Decoder Sequential Logic and Unused States Not all states need to be used Can

More information

INSTRUCTION LEVEL PARALLELISM

INSTRUCTION LEVEL PARALLELISM INSTRUCTION LEVEL PARALLELISM Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 2 and Appendix H, John L. Hennessy and David A. Patterson,

More information

Chapter 4 Main Memory

Chapter 4 Main Memory Chapter 4 Main Memory Course Outcome (CO) - CO2 Describe the architecture and organization of computer systems Program Outcome (PO) PO1 Apply knowledge of mathematics, science and engineering fundamentals

More information

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 19: Main Memory Prof. Onur Mutlu Carnegie Mellon University Last Time Multi-core issues in caching OS-based cache partitioning (using page coloring) Handling

More information

CpE 442. Memory System

CpE 442. Memory System CpE 442 Memory System CPE 442 memory.1 Outline of Today s Lecture Recap and Introduction (5 minutes) Memory System: the BIG Picture? (15 minutes) Memory Technology: SRAM and Register File (25 minutes)

More information

VLIW DSP Processor Design for Mobile Communication Applications. Contents crafted by Dr. Christian Panis Catena Radio Design

VLIW DSP Processor Design for Mobile Communication Applications. Contents crafted by Dr. Christian Panis Catena Radio Design VLIW DSP Processor Design for Mobile Communication Applications Contents crafted by Dr. Christian Panis Catena Radio Design Agenda Trends in mobile communication Architectural core features with significant

More information

CISC RISC. Compiler. Compiler. Processor. Processor

CISC RISC. Compiler. Compiler. Processor. Processor Q1. Explain briefly the RISC design philosophy. Answer: RISC is a design philosophy aimed at delivering simple but powerful instructions that execute within a single cycle at a high clock speed. The RISC

More information

Contents. Slide Set 1. About these slides. Outline of Slide Set 1. Typographical conventions: Italics. Typographical conventions. About these slides

Contents. Slide Set 1. About these slides. Outline of Slide Set 1. Typographical conventions: Italics. Typographical conventions. About these slides Slide Set 1 for ENCM 369 Winter 2014 Lecture Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2014 ENCM 369 W14 Section

More information

A superscalar machine is one in which multiple instruction streams allow completion of more than one instruction per cycle.

A superscalar machine is one in which multiple instruction streams allow completion of more than one instruction per cycle. CS 320 Ch. 16 SuperScalar Machines A superscalar machine is one in which multiple instruction streams allow completion of more than one instruction per cycle. A superpipelined machine is one in which a

More information

Marching Memory マーチングメモリ. UCAS-6 6 > Stanford > Imperial > Verify 中村維男 Based on Patent Application by Tadao Nakamura and Michael J.

Marching Memory マーチングメモリ. UCAS-6 6 > Stanford > Imperial > Verify 中村維男 Based on Patent Application by Tadao Nakamura and Michael J. UCAS-6 6 > Stanford > Imperial > Verify 2011 Marching Memory マーチングメモリ Tadao Nakamura 中村維男 Based on Patent Application by Tadao Nakamura and Michael J. Flynn 1 Copyright 2010 Tadao Nakamura C-M-C Computer

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 251, Winter 2019, Assignment 5.1.1 3% of course mark Due Wednesday, March 27th, 5:30PM Lates accepted until 1:00pm March 28th with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823

More information

(Advanced) Computer Organization & Architechture. Prof. Dr. Hasan Hüseyin BALIK (5 th Week)

(Advanced) Computer Organization & Architechture. Prof. Dr. Hasan Hüseyin BALIK (5 th Week) + (Advanced) Computer Organization & Architechture Prof. Dr. Hasan Hüseyin BALIK (5 th Week) + Outline 2. The computer system 2.1 A Top-Level View of Computer Function and Interconnection 2.2 Cache Memory

More information

APPLICATION NOTE. SH3(-DSP) Interface to SDRAM

APPLICATION NOTE. SH3(-DSP) Interface to SDRAM APPLICATION NOTE SH3(-DSP) Interface to SDRAM Introduction This application note has been written to aid designers connecting Synchronous Dynamic Random Access Memory (SDRAM) to the Bus State Controller

More information

Reducing Instruction Cache Energy Using Gated Wordlines. Mukaya Panich. Submitted to the Department of Electrical Engineering and Computer Science

Reducing Instruction Cache Energy Using Gated Wordlines. Mukaya Panich. Submitted to the Department of Electrical Engineering and Computer Science Reducing Instruction Cache Energy Using Gated Wordlines by Mukaya Panich Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree

More information

Experiment No. 5 MEMORY DESIGN USING STATIC RANDOM ACCESS MEMORY (RAM) ECE 441

Experiment No. 5 MEMORY DESIGN USING STATIC RANDOM ACCESS MEMORY (RAM) ECE 441 Experiment No. 5 MEMORY DESIGN USING STATIC RANDOM ACCESS MEMORY (RAM) ECE 441 Peter Chinetti October 30, 2013 1 Introduction 1.1 Purpose Date Performed: October 10,17, & 24, 2013 Partners: Zelin Wu Instructor:

More information

Memory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1

Memory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 5 Large and Fast: Exploiting Memory Hierarchy 5 th Edition Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic

More information

F28HS Hardware-Software Interface: Systems Programming

F28HS Hardware-Software Interface: Systems Programming F28HS Hardware-Software Interface: Systems Programming Hans-Wolfgang Loidl School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh Semester 2 2017/18 0 No proprietary software has

More information

Writing ARM Assembly. Steven R. Bagley

Writing ARM Assembly. Steven R. Bagley Writing ARM Assembly Steven R. Bagley Introduction Previously, looked at how the system is built out of simple logic gates Last week, started to look at the CPU Writing code in ARM assembly language Assembly

More information

ECE 2300 Digital Logic & Computer Organization

ECE 2300 Digital Logic & Computer Organization ECE 2300 Digital Logic & Computer Organization Spring 201 Memories Lecture 14: 1 Announcements HW6 will be posted tonight Lab 4b next week: Debug your design before the in-lab exercise Lecture 14: 2 Review:

More information

A Scheme of Predictor Based Stream Buffers. Bill Hodges, Guoqiang Pan, Lixin Su

A Scheme of Predictor Based Stream Buffers. Bill Hodges, Guoqiang Pan, Lixin Su A Scheme of Predictor Based Stream Buffers Bill Hodges, Guoqiang Pan, Lixin Su Outline Background and motivation Project hypothesis Our scheme of predictor-based stream buffer Predictors Predictor table

More information

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity Donghyuk Lee Carnegie Mellon University Problem: High DRAM Latency processor stalls: waiting for data main memory high latency Major bottleneck

More information

Memory System Design. Outline

Memory System Design. Outline Memory System Design Chapter 16 S. Dandamudi Outline Introduction A simple memory block Memory design with D flip flops Problems with the design Techniques to connect to a bus Using multiplexers Using

More information

In-order vs. Out-of-order Execution. In-order vs. Out-of-order Execution

In-order vs. Out-of-order Execution. In-order vs. Out-of-order Execution In-order vs. Out-of-order Execution In-order instruction execution instructions are fetched, executed & committed in compilergenerated order if one instruction stalls, all instructions behind it stall

More information

CS152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Spring 2018 SOLUTIONS Caches and the Memory Hierarchy Assigned February 8 Problem Set #2 Due Wed, February 21 http://inst.eecs.berkeley.edu/~cs152/sp18

More information

ECE 341. Lecture # 16

ECE 341. Lecture # 16 ECE 341 Lecture # 16 Instructor: Zeshan Chishti zeshan@ece.pdx.edu November 24, 2014 Portland State University Lecture Topics The Memory System Basic Concepts Semiconductor RAM Memories Organization of

More information

Computer Architecture Sample Test 1

Computer Architecture Sample Test 1 Computer Architecture Sample Test 1 Question 1. Suppose we have 32-bit memory addresses, a byte-addressable memory, and a 512 KB (2 19 bytes) cache with 32 (2 5 ) bytes per block. a) How many total lines

More information

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten

More information

An introduction to SDRAM and memory controllers. 5kk73

An introduction to SDRAM and memory controllers. 5kk73 An introduction to SDRAM and memory controllers 5kk73 Presentation Outline (part 1) Introduction to SDRAM Basic SDRAM operation Memory efficiency SDRAM controller architecture Conclusions Followed by part

More information

East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2006

East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2006 Points missed: Student's Name: Total score: /100 points East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2006 Read

More information

Cache Designs and Tricks. Kyle Eli, Chun-Lung Lim

Cache Designs and Tricks. Kyle Eli, Chun-Lung Lim Cache Designs and Tricks Kyle Eli, Chun-Lung Lim Why is cache important? CPUs already perform computations on data faster than the data can be retrieved from main memory and microprocessor execution speeds

More information

William Stallings Computer Organization and Architecture. Chapter 11 CPU Structure and Function

William Stallings Computer Organization and Architecture. Chapter 11 CPU Structure and Function William Stallings Computer Organization and Architecture Chapter 11 CPU Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Registers

More information

ECSE-2610 Computer Components & Operations (COCO)

ECSE-2610 Computer Components & Operations (COCO) ECSE-2610 Computer Components & Operations (COCO) Part 18: Random Access Memory 1 Read-Only Memories 2 Why ROM? Program storage Boot ROM for personal computers Complete application storage for embedded

More information

A+3 A+2 A+1 A. The data bus 16-bit mode is shown in the figure below: msb. Figure bit wide data on 16-bit mode data bus

A+3 A+2 A+1 A. The data bus 16-bit mode is shown in the figure below: msb. Figure bit wide data on 16-bit mode data bus 3 BUS INTERFACE The ETRAX 100 bus interface has a 32/16-bit data bus, a 25-bit address bus, and six internally decoded chip select outputs. Six additional chip select outputs are multiplexed with other

More information

A Bit of History. Program Mem Data Memory. CPU (Central Processing Unit) I/O (Input/Output) Von Neumann Architecture. CPU (Central Processing Unit)

A Bit of History. Program Mem Data Memory. CPU (Central Processing Unit) I/O (Input/Output) Von Neumann Architecture. CPU (Central Processing Unit) Memory COncepts Address Contents Memory is divided into addressable units, each with an address (like an array with indices) Addressable units are usually larger than a bit, typically 8, 16, 32, or 64

More information

EECS 366: Computer Architecure. Memory Technology. Lecture Notes # 15. University of Illinois at Chicago. Instructor: Shantanu Dutt Department of EECS

EECS 366: Computer Architecure. Memory Technology. Lecture Notes # 15. University of Illinois at Chicago. Instructor: Shantanu Dutt Department of EECS EECS 366: Computer Architecure Instructor: Shantanu Dutt Department of EECS University of Illinois at Chicago Lecture Notes # 15 Memory Technology c Shantanu Dutt MEMORY ORGANIZATION Physical Characteristics

More information

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed Instructions: Print your name in the space provided below. This examination is closed book and closed notes, aside from the permitted fact sheet, with a restriction: 1) one 8.5x11 sheet, both sides, handwritten

More information

Advanced Assembly, Branching, and Monitor Utilities

Advanced Assembly, Branching, and Monitor Utilities 2 Advanced Assembly, Branching, and Monitor Utilities 2.1 Objectives: There are several different ways for an instruction to form effective addresses to acquire data, called addressing modes. One of these

More information

Cycle Approximate Simulation of RISC-V Processors

Cycle Approximate Simulation of RISC-V Processors Cycle Approximate Simulation of RISC-V Processors Lee Moore, Duncan Graham, Simon Davidmann Imperas Software Ltd. Felipe Rosa Universidad Federal Rio Grande Sul Embedded World conference 27 February 2018

More information

Low-Power SRAM and ROM Memories

Low-Power SRAM and ROM Memories Low-Power SRAM and ROM Memories Jean-Marc Masgonty 1, Stefan Cserveny 1, Christian Piguet 1,2 1 CSEM, Neuchâtel, Switzerland 2 LAP-EPFL Lausanne, Switzerland Abstract. Memories are a main concern in low-power

More information

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory 5.1 Semiconductor Main Memory 5.2 Error Correction 5.3 Advanced DRAM Organization 5.1 Semiconductor Main Memory

More information

AN4777 Application note

AN4777 Application note Application note Implications of memory interface configurations on low-power STM32 microcontrollers Introduction The low-power STM32 microcontrollers have a rich variety of configuration options regarding

More information

EE414 Embedded Systems Ch 5. Memory Part 2/2

EE414 Embedded Systems Ch 5. Memory Part 2/2 EE414 Embedded Systems Ch 5. Memory Part 2/2 Byung Kook Kim School of Electrical Engineering Korea Advanced Institute of Science and Technology Overview 6.1 introduction 6.2 Memory Write Ability and Storage

More information

A Comparative Study of Power Efficient SRAM Designs

A Comparative Study of Power Efficient SRAM Designs A Comparative tudy of Power Efficient RAM Designs Jeyran Hezavei, N. Vijaykrishnan, M. J. Irwin Pond Laboratory, Department of Computer cience & Engineering, Pennsylvania tate University {hezavei, vijay,

More information

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory Semiconductor Memory Types Semiconductor Memory RAM Misnamed as all semiconductor memory is random access

More information

ECEN 449 Microprocessor System Design. Memories

ECEN 449 Microprocessor System Design. Memories ECEN 449 Microprocessor System Design Memories 1 Objectives of this Lecture Unit Learn about different types of memories SRAM/DRAM/CAM /C Flash 2 1 SRAM Static Random Access Memory 3 SRAM Static Random

More information

Mark Redekopp, All rights reserved. EE 352 Unit 10. Memory System Overview SRAM vs. DRAM DMA & Endian-ness

Mark Redekopp, All rights reserved. EE 352 Unit 10. Memory System Overview SRAM vs. DRAM DMA & Endian-ness EE 352 Unit 10 Memory System Overview SRAM vs. DRAM DMA & Endian-ness The Memory Wall Problem: The Memory Wall Processor speeds have been increasing much faster than memory access speeds (Memory technology

More information

For all questions, please show your step-by-step solution to get full-credits.

For all questions, please show your step-by-step solution to get full-credits. For all questions, please show your step-by-step solution to get full-credits. Question 1: Assume a 5 stages microprocessor, calculate the number of cycles of below instructions for 1) no pipeline system

More information

Memory System Overview. DMA & Endian-ness. Technology. Architectural. Problem: The Memory Wall

Memory System Overview. DMA & Endian-ness. Technology. Architectural. Problem: The Memory Wall The Memory Wall EE 357 Unit 13 Problem: The Memory Wall Processor speeds have been increasing much faster than memory access speeds (Memory technology targets density rather than speed) Large memories

More information

Exam 1 Fun Times. EE319K Fall 2012 Exam 1A Modified Page 1. Date: October 5, Printed Name:

Exam 1 Fun Times. EE319K Fall 2012 Exam 1A Modified Page 1. Date: October 5, Printed Name: EE319K Fall 2012 Exam 1A Modified Page 1 Exam 1 Fun Times Date: October 5, 2012 Printed Name: Last, First Your signature is your promise that you have not cheated and will not cheat on this exam, nor will

More information

Computer Architecture Memory hierarchies and caches

Computer Architecture Memory hierarchies and caches Computer Architecture Memory hierarchies and caches S Coudert and R Pacalet January 23, 2019 Outline Introduction Localities principles Direct-mapped caches Increasing block size Set-associative caches

More information

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 18-447: Computer Architecture Lecture 25: Main Memory Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 Reminder: Homework 5 (Today) Due April 3 (Wednesday!) Topics: Vector processing,

More information

A Comparison of File. D. Roselli, J. R. Lorch, T. E. Anderson Proc USENIX Annual Technical Conference

A Comparison of File. D. Roselli, J. R. Lorch, T. E. Anderson Proc USENIX Annual Technical Conference A Comparison of File System Workloads D. Roselli, J. R. Lorch, T. E. Anderson Proc. 2000 USENIX Annual Technical Conference File System Performance Integral component of overall system performance Optimised

More information

ELCT 912: Advanced Embedded Systems

ELCT 912: Advanced Embedded Systems Advanced Embedded Systems Lecture 2: Memory and Programmable Logic Dr. Mohamed Abd El Ghany, Memory Random Access Memory (RAM) Can be read and written Static Random Access Memory (SRAM) Data stored so

More information

Computer Organization and Assembly Language (CS-506)

Computer Organization and Assembly Language (CS-506) Computer Organization and Assembly Language (CS-506) Muhammad Zeeshan Haider Ali Lecturer ISP. Multan ali.zeeshan04@gmail.com https://zeeshanaliatisp.wordpress.com/ Lecture 2 Memory Organization and Structure

More information

Using MSI Logic To Build An Output Port

Using MSI Logic To Build An Output Port Using MSI Logic To Build An Output Port Many designs use standard MSI logic for microprocessor expansion This provides an inexpensive way to expand microprocessors One MSI device often used in such expansions

More information

AMULET2e. AMULET group. What is it?

AMULET2e. AMULET group. What is it? 2e Jim Garside Department of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL, U.K. http://www.cs.man.ac.uk/amulet/ jgarside@cs.man.ac.uk What is it? 2e is an implementation

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COEN-4710 Computer Hardware Lecture 7 Large and Fast: Exploiting Memory Hierarchy (Chapter 5) Cristinel Ababei Marquette University Department

More information

Mobile Multimedia Processor Technical Information

Mobile Multimedia Processor Technical Information Mobile Multimedia Processor Technical Information Document No. IMB YB2 000585 1/13 EMMA Mobile TM 1 Date issued Oct 23 rd.,2009 (MC-10118A μpd77630a) Usage Restrictions Issued by Mobile Platform Group

More information

Random Access Memory (RAM)

Random Access Memory (RAM) Random Access Memory (RAM) EED2003 Digital Design Dr. Ahmet ÖZKURT Dr. Hakkı YALAZAN 1 Overview Memory is a collection of storage cells with associated input and output circuitry Possible to read and write

More information

Lecture Notes on Cache Iteration & Data Dependencies

Lecture Notes on Cache Iteration & Data Dependencies Lecture Notes on Cache Iteration & Data Dependencies 15-411: Compiler Design André Platzer Lecture 23 1 Introduction Cache optimization can have a huge impact on program execution speed. It can accelerate

More information

Logic and Computer Design Fundamentals. Chapter 8 Memory Basics

Logic and Computer Design Fundamentals. Chapter 8 Memory Basics Logic and Computer Design Fundamentals Memory Basics Overview Memory definitions Random Access Memory (RAM) Static RAM (SRAM) integrated circuits Arrays of SRAM integrated circuits Dynamic RAM (DRAM) Read

More information

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan ARM Programmers Model Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan chanhl@maili.cgu.edu.twcgu Current program status register (CPSR) Prog Model 2 Data processing

More information