Mode-Controlled Dataflow Modeling of Real-Time Memory Controllers
1 Mode-Controlled Dataflow Modeling of Real-time Memory Controllers
Yonghui Li (1), Hrishikesh Salunkhe (1), Joao Bastos (1), Orlando Moreira (2), Benny Akesson (3) and Kees Goossens (1)
(1) Eindhoven University of Technology, the Netherlands; (2) Intel Corporation; (3) CISTER/INESC TEC, ISEP, Portugal
yonghui.li@tue.nl
2 Introduction: DTV & STB
- NXP Viper2 (PNX8550): 0.13 µm, ~50 M transistors, ~100 clock domains, more than 70 IP blocks
- Question: what is the worst-case bandwidth (WCB) of the memory controller?
[Figure: block diagram of the Viper2 SoC, in which a MIPS core, two TriMedia cores and the remaining IP blocks share a single memory controller]
3 Outline
- Background
- DRAM
- Dynamically scheduled memory controller
- Dataflow modeling of command scheduling
- Why dataflow modeling?
- Mode-controlled dataflow (MCDF)
- MCDF modeling of a memory controller
- Experimental results
- Conclusions
4 DRAM Memories
- DRAM is accessed by scheduling commands: ACT, PRE, RD, WR, REF, NOP
- Commands are subject to timing constraints
- Commands shown: Activate (ACT), Read (RD), Write (WR), Precharge (PRE)
[Figure: DRAM with cmd, addr. and data interfaces and Bank 0 to Bank 7, each with its own row buffer; example command sequence: ACT NOP RD NOP ACT NOP NOP PRE NOP PRE NOP]
5 Dynamically Scheduled Memory Controller
- A transaction is translated into a sequence of commands
[Figure: transactions (logical address, data) pass through the memory map (physical address) and the command generator into the command queue; the scheduler, guided by timing counters, issues commands to the DRAM banks (Bank 0 to Bank 7, each with a row buffer)]
- Scheduling algorithm: First-Come First-Serve (FCFS) for transactions; RD or WR commands have higher priority than ACT
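The arbitration rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_command`, the `(command, bank)` tuple layout and the idea that the caller has already filtered out commands whose timing constraints are not yet met are all hypothetical, and the slide does not say how PRE is prioritized, so PRE is simply treated like ACT here.

```python
def pick_command(ready):
    """Pick the next command to issue, or None (i.e. NOP) if nothing is ready.

    `ready` is a list of (command, bank) pairs whose timing constraints are
    already satisfied, ordered oldest transaction first (FCFS).
    Assumption: PRE has no special priority; the slide only states that
    RD/WR commands beat ACT.
    """
    # RD or WR commands have higher priority than ACT.
    for cmd in ready:
        if cmd[0] in ("RD", "WR"):
            return cmd
    # Otherwise fall back to the oldest ready command (FCFS order).
    return ready[0] if ready else None


if __name__ == "__main__":
    print(pick_command([("ACT", 1), ("RD", 0)]))  # the RD is chosen first
```

Keeping the timing check outside the function mirrors the slide's split between timing counters and scheduler, but that separation is a design choice of this sketch, not a claim about the real controller.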
6 Scheduling Dependencies of a Transaction
- A transaction T_i is executed by scheduling commands to successive banks
- Worst-case analyses have been carried out based on analyzing individual dependencies [4][10][11][12]
[Figure: timing diagram of the ACT, RD/WR and PRE commands of transaction T_i across its successive banks (Bank 0, Bank 1, ..., up to BI_i banks), annotated with the timing constraints tSwitch, tRRD, tFAW, tRCD, tCCD, tRAS, tRTP and tRP]
7 Dataflow Modeling of Command Scheduling
From command scheduling to actor firing:
- Commands are represented by actors
- Timing constraints are captured by delay actors
- Scheduling dependencies are depicted by the edges between actors
[Figure: RD actor with a CCD delay actor (tCCD) on its self-loop, and an ACT actor connected to the RD actor through an RCD delay actor (tRCD)]
Max-plus algebra: $t_{RD}^{j} = \max\{\, t_{ACT}^{j} + tRCD,\; t_{RD}^{j-1} + tCCD \,\}$
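The max-plus recurrence on this slide can be evaluated directly. The sketch below is illustrative only: the function name `rd_start_times` and the numeric values used for tRCD and tCCD are made up for the example and are not taken from the slides or from any data sheet. It reads the recurrence as "the j-th RD waits for its own ACT plus tRCD and for the previous RD plus tCCD".

```python
def rd_start_times(act_times, t_rcd, t_ccd):
    """Earliest RD firing times given ACT firing times (all in clock cycles).

    Implements t_RD[j] = max(t_ACT[j] + tRCD, t_RD[j-1] + tCCD);
    the first RD only depends on its own ACT.
    """
    rd_times = []
    for t_act in act_times:
        earliest = t_act + t_rcd                 # ACT -> RD constraint
        if rd_times:                             # RD -> RD constraint
            earliest = max(earliest, rd_times[-1] + t_ccd)
        rd_times.append(earliest)
    return rd_times


if __name__ == "__main__":
    # Hypothetical values: tRCD = 5 cycles, tCCD = 4 cycles.
    print(rd_start_times([0, 1, 20], t_rcd=5, t_ccd=4))  # [5, 9, 25]
```

In the example the second RD is limited by tCCD (5 + 4 = 9 rather than 1 + 5 = 6), while the third is limited by its own ACT; this is the "max" in max-plus at work.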
8 Dataflow Modeling of Command Scheduling
- Dataflow model of transactions (e.g., a 32-byte read): the command scheduling dependencies form a single-rate dataflow graph (SRDF)
- Dynamism: the order of accessing different banks is unknown, and variable transaction sizes need different numbers of banks
[Figure: ACT, RD and PRE commands of Bank 0 and Bank 1 with their timing constraints (tRCD, tCCD, tRTP, tRAS, tRP, tRRD), and the corresponding SRDF graph with command actors ACT, RD, PRE and delay actors RCD, CCD, RTP, RAS, RP]
10 Mode-Controlled Dataflow Model (MCDF)
- MCDF is a restricted variant of Boolean dataflow that captures dynamism while remaining analyzable
- The structure includes actors behaving as switch and select, which are controlled by an actor named the mode controller (MC)
- Dynamism is captured by defining mode sequences (MS)
[Figure: MCDF graph with a source actor Src, the mode controller MC, switch, tunnel and select actors, actor A in mode M_0 and actor B in mode M_1; example mode sequence MS = M_0 M_1 M_0 M_0 M_1 M_1]
11 Mode-Controlled Dataflow Model (MCDF)
- An MCDF graph together with a mode sequence (MS) yields a single-rate dataflow graph (SRDF)
[Figure: the example MCDF graph and the two SRDF graphs obtained for MS_0 = M_0 (actors Src, A, MC and the switch/select actors) and MS_1 = M_1 (actors Src, B, MC and the switch/select actors)]
12 Mode-Controlled Dataflow Model (MCDF)
[Figure: for MS = M_0 M_1, the SRDF graph contains two copies of the actors, suffixed _1 and _2 (Src_1, A_1, MC_1, ... and Src_2, B_1, MC_2, ...)]
13 Mode-Controlled Dataflow Model (MCDF)
[Figure: the SRDF graph for MS = M_0 M_1, redrawn]
14 Mode-Controlled Dataflow Model (MCDF)
[Figure: the SRDF graph for MS = M_0 M_1 with the actors of each mode grouped as M_0_0 and M_1_0]
15 Mode-Controlled Dataflow Model (MCDF)
[Figure: successive iterations of MS = M_0 M_1 as a chain of mode instances: M_0_0 M_1_0, M_0_1 M_1_1, M_0_2 M_1_2]
17 MCDF Modeling of Dynamic Command Scheduling
- Mode_0 ... Mode_7: ACT, 2 for Bank 0 ... Bank 7
- Mode_8 ... Mode_15: PRE, 1 for Bank 0 ... Bank 7
- Mode_16: RD, 1
- Mode_17: WR, 1
- Timing constraints (TC) are attached to the modes: TC: ACT; TC: ACT PRE; TC: PRE
18 [Figure: complete MCDF model of the memory controller: a source, a mode controller, a mode switch and a mode select surround the command modes (Mode_0 to Mode_7 for ACT, Mode_8 to Mode_15 for PRE, Mode_16 for RD, Mode_17 for WR); mode tunnels carry timing constraints between modes, including RRD, FAW, RCD, RAS, RP, CCD and RTP]
19 MCDF Modeling of Dynamic Command Scheduling
From transactions to mode sequences (example: 32-byte reads, each spanning two banks):
- Bank 0, Bank 1: MS_0 = M_0 M_16 M_8 M_1 M_16 M_9
- Bank 2, Bank 3: MS_1 = M_2 M_16 M_10 M_3 M_16 M_11
- Bank 4, Bank 5: MS_2 = M_4 M_16 M_12 M_5 M_16 M_13
- Bank 6, Bank 7: MS_3 = M_6 M_16 M_14 M_7 M_16 M_15
- MS = MS_0 MS_1 MS_2 MS_3
[Figure: ACT, RD and PRE commands of Bank 0 and Bank 1 with the timing constraints tRCD, tCCD, tRTP and tRRD]
From maximum cycle mean (MCM) to worst-case bandwidth (WCB):
- MCM: the maximum, over all cycles C in the graph G, of the normalized cycle weight ω(C)
- WCB: a minimum over the cycles C in G, scaled by f_mem and e_ref
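To make the MCM step concrete, here is a brute-force sketch for small single-rate dataflow graphs. It assumes a common definition of the maximum cycle mean for such graphs, namely the largest ratio of summed actor execution times to initial tokens over all simple cycles; the slide's own normalization is not legible in this transcription, so treat that as an assumption. The function name, the graph encoding and the numbers are all hypothetical, and the final conversion to WCB is deliberately left out.

```python
from fractions import Fraction


def max_cycle_mean(exec_time, edges):
    """Brute-force MCM of a small SRDF graph.

    exec_time: {actor_name: execution time in cycles}
    edges:     list of (src, dst, initial_tokens)
    Assumption: MCM = max over simple cycles of
                (sum of execution times on the cycle) / (tokens on the cycle).
    """
    succ = {}
    for src, dst, tokens in edges:
        succ.setdefault(src, []).append((dst, tokens))
    best = None

    def dfs(start, node, weight, tokens, visited):
        nonlocal best
        for nxt, tok in succ.get(node, []):
            if nxt == start:                      # cycle closed
                total = tokens + tok
                if total == 0:
                    raise ValueError("token-free cycle: the graph deadlocks")
                ratio = Fraction(weight, total)
                if best is None or ratio > best:
                    best = ratio
            elif nxt not in visited and nxt > start:
                # Only visit actors "larger" than the start actor so that
                # every simple cycle is enumerated exactly once.
                dfs(start, nxt, weight + exec_time[nxt],
                    tokens + tok, visited | {nxt})

    for actor in sorted(exec_time):
        dfs(actor, actor, exec_time[actor], 0, {actor})
    return best


if __name__ == "__main__":
    times = {"A": 2, "B": 3}
    graph = [("A", "B", 0), ("B", "A", 1), ("B", "B", 1)]
    print(max_cycle_mean(times, graph))  # 5: cycle A->B->A dominates B's self-loop
```

Cycle enumeration is exponential in general, so this only suits toy graphs; analysis tools typically use dedicated maximum-cycle-ratio algorithms instead.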
20 MCDF Modeling of Dynamic Command Scheduling
[Figure: commands (ACT, ...) and timing constraints (tRCD, ...) become command actors and delay actors, which are grouped into modes (Mode_0, Mode_1, ...) of the MCDF graph; a transaction trace T_0 T_1 ... T_N of the memory traffic is translated into mode sequences MS_0 MS_1 ... MS_N that drive command scheduling]
MCDF model:
- Commands are captured by command actors
- Timing constraints are described by delay actors
- A mode is constructed from these actors
- Transactions are translated into static mode sequences (MS)
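The translation from a transaction to a mode sequence can be sketched as follows. The mode numbering follows slide 17 (ACT modes 0 to 7, PRE modes 8 to 15, one RD mode 16 and one WR mode 17) and the pattern follows the 32-byte read example on slide 19; the function name and parameters are hypothetical, and the sketch assumes exactly one RD or WR burst per accessed bank.

```python
N_BANKS = 8      # Bank 0 ... Bank 7
RD_MODE = 16     # mode index of the RD command
WR_MODE = 17     # mode index of the WR command


def mode_sequence(first_bank, n_banks, is_read=True):
    """Return the list of mode indices for one transaction.

    For every accessed bank b the sequence is ACT(b), RD or WR, PRE(b),
    i.e. modes b, 16 or 17, and 8 + b.
    Assumption: one burst per bank, banks accessed in increasing order.
    """
    access = RD_MODE if is_read else WR_MODE
    seq = []
    for bank in range(first_bank, first_bank + n_banks):
        b = bank % N_BANKS
        seq += [b, access, N_BANKS + b]
    return seq


if __name__ == "__main__":
    # 32-byte read over Bank 0 and Bank 1, as in the slide-19 example.
    print(mode_sequence(0, 2))  # [0, 16, 8, 1, 16, 9]
```

Concatenating the sequences of consecutive transactions gives the overall mode sequence that is fed to the mode controller.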
21 Experiments
Goal:
- Validate the MCDF model
- Obtain the worst-case bandwidth results
Setup:
- Heracles: a temporal analysis tool developed at Ericsson
- RTMemController: an open-source analysis tool for a real-time memory controller with dynamic command scheduling
- 16-bit DDR3-800D/1600G/2133K DRAMs with 2 Gb
- Transaction sizes: 16, 32, 64, 128 and 256 bytes
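As a rough aid for reading the setup, the number of bursts needed per transaction size can be computed as below. This is a sketch under assumptions that are not stated on the slide: a burst length of 8 (the usual DDR3 setting), so that a 16-bit interface moves 16 bytes per burst. The function name is hypothetical.

```python
def bursts_per_transaction(size_bytes, bus_width_bits=16, burst_length=8):
    """Number of RD/WR bursts needed to serve one transaction.

    Assumptions: burst length 8 and transaction sizes that are whole
    multiples of the burst size.
    """
    bytes_per_burst = bus_width_bits // 8 * burst_length
    if size_bytes % bytes_per_burst:
        raise ValueError("size is not a multiple of the burst size")
    return size_bytes // bytes_per_burst


if __name__ == "__main__":
    for size in (16, 32, 64, 128, 256):
        print(size, bursts_per_transaction(size))
```

Under these assumptions a 32-byte transaction needs two bursts, which matches the two banks used in the 32-byte read example of slide 19.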
22 Experiment 1: Validation of the MCDF model
Simulating the MCDF model with Heracles gives command schedules identical to those of the cycle-accurate RTMemController simulator.
23 Experiment 2: Fixed transaction sizes
- MCDF >= Analytical
- MCDF < Scheduled only for 16-byte transactions, due to command collisions
[Figure: WCB (MB/s) versus transaction size (bytes) for MCDF, dynamic (scheduled [4]) and dynamic (analytical [4])]
[4] Y. Li et al. Architecture and analysis of a dynamically-scheduled real-time memory controller. Real-Time Systems Journal, pp. 1-55, Springer, 2015.
24 Experiment 3: Variable transaction sizes
- Known order: 64-byte transactions followed by 128-byte transactions
- Unknown order: random mix of 64-byte and 128-byte transactions
[Figure: WCB (MB/s) for known and unknown transaction order, comparing MCDF, scheduled [4] and analytical [4]]
[4] Y. Li et al. Architecture and analysis of a dynamically-scheduled real-time memory controller. Real-Time Systems Journal, pp. 1-55, Springer, 2015.
25 Conclusions
A mode-controlled dataflow (MCDF) model of dynamic command scheduling for real-time memory controllers:
- Supports simulation
- Provides worst-case bandwidth results
- Is easy to adapt to other memory controllers
The MCDF model outperforms several existing analysis approaches.
26 Thank You.
More informationReal-Time (Paradigms) (47)
Real-Time (Paradigms) (47) Memory: Memory Access Protocols Tasks competing for exclusive memory access (critical sections, semaphores) become interdependent, a common phenomenon especially in distributed
More informationMemory Hierarchy. Reading. Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 (2) Lecture notes from MKP, H. H. Lee and S.
Memory Hierarchy Lecture notes from MKP, H. H. Lee and S. Yalamanchili Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 Reading (2) 1 SRAM: Value is stored on a pair of inerting gates Very fast but
More informationA Reconfigurable Real-Time SDRAM Controller for Mixed Time-Criticality Systems
A Reconfigurable Real-Time SDRAM Controller for Mixed Time-Criticality Systems Sven Goossens, Jasper Kuijsten, Benny Akesson, Kees Goossens Eindhoven University of Technology {s.l.m.goossens,k.b.akesson,k.g.w.goossens}@tue.nl
More informationregisters data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.
13 1 CMPE110 Computer Architecture, Winter 2009 Andrea Di Blas 110 Winter 2009 CMPE Cache Direct-mapped cache Reads and writes Cache associativity Cache and performance Textbook Edition: 7.1 to 7.3 Third
More informationOn Generalized Processor Sharing with Regulated Traffic for MPLS Traffic Engineering
On Generalized Processor Sharing with Regulated Traffic for MPLS Traffic Engineering Shivendra S. Panwar New York State Center for Advanced Technology in Telecommunications (CATT) Department of Electrical
More informationEmbedded Systems: Hardware Components (part II) Todor Stefanov
Embedded Systems: Hardware Components (part II) Todor Stefanov Leiden Embedded Research Center, Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded
More informationregisters data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.
Cache associativity Cache and performance 12 1 CMPE110 Spring 2005 A. Di Blas 110 Spring 2005 CMPE Cache Direct-mapped cache Reads and writes Textbook Edition: 7.1 to 7.3 Second Third Edition: 7.1 to 7.3
More information