Best Practice for Caching of Single-Path Code
1 Best Practice for Caching of Single-Path Code
Martin Schoeberl, Bekim Cilku, Daniel Prokesch, and Peter Puschner
Technical University of Denmark
Vienna University of Technology
2 Context
- Real-time systems
  - Worst-case execution time (WCET) counts
- Different from average-case performance
  - Standard processors are optimized for average-case performance
- Design a processor and a compiler for real-time systems
  - The T-CREST approach
3 T-CREST
- Time-predictable multicore
  - Processor
  - Network-on-chip
  - Memory hierarchy
  - Compiler
  - WCET analysis (AbsInt aiT and platin)
- Most parts open-source
[Figure: T-CREST chip block diagram; labels include Patmos, R, NI, SPM, M$, S/D$, Dec, Memory Tree, Memory Controller, and SDRAM Memory]
4 Patmos Processor
- Time-predictable processor
- Called Patmos
- Flexibility to define the instruction set
  - LLVM compiler adapted for Patmos
- Co-design for low WCET of
  - Patmos
  - Compiler
  - WCET analysis
5 Patmos Processor
[Figure: pipeline diagram with the stages Fetch, Decode, Execute, Memory, and Writeback; labels include M$, PC, IR, RF, Dec, S$, D$, and SP]
- RISC pipeline
- Dual issue
- Special caches
- No time dependency between instructions
6 Hardware Description
- Chisel
  - Scala embedded language
  - Higher level than VHDL/Verilog
- Generates two versions
  - C++ based emulator
  - Verilog based hardware description
- Cycle accurate emulation in C++ faster than VHDL/Verilog simulation
  - Based on the hardware description
7 Single-path Programming
- Remove input data dependent control flow decisions
  - Gives constant execution time
  - Uses (heavily) predicates
- If-conversion
  - Execute both branches
  - Use if condition for result write back
- Constant loop iterations
  - Use loop bounds
  - Exit condition for result write back
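The if-conversion idea on this slide can be illustrated with a small C sketch. The function names `clamp_branchy`, `clamp_single_path`, and `select_i32` are invented for this illustration and do not come from the slides. The bitmask select stands in for a predicated write-back; on a conventional target a compiler is free to emit branches or selects for either version, so this shows the principle only and not the output of the T-CREST compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional version: the executed path depends on the input value. */
int32_t clamp_branchy(int32_t x, int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    if (x < lo) {
        return lo;
    }
    if (x > hi) {
        return hi;
    }
    return x;
}

/* Returns a if cond is 1 and b if cond is 0, without branching.
 * This models "use the if condition for the result write back". */
static int32_t select_i32(int32_t cond, int32_t a, int32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)cond; /* all ones or all zeros */
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

/* Single-path version: both comparisons are always evaluated and the
 * conditions only decide which value is kept. */
int32_t clamp_single_path(int32_t x, int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    int32_t below = x < lo; /* 1 or 0 */
    int32_t above = x > hi; /* 1 or 0 */
    int32_t r = x;
    r = select_i32(below, lo, r);
    r = select_i32(above, hi, r);
    return r;
}
```

Both functions return the same values; the single-path variant always performs the same sequence of operations regardless of `x`.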
8 Single-path Programming
- Loops need to be bounded
  - In WCET analyzable programs anyway
- T-CREST compiler can generate single path code from C programs
  - For non-recursive programs
- Simply measure execution time
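The bounded-loop rule can be sketched in C as follows. This is a hand-written illustration under stated assumptions, not code produced by the T-CREST compiler: the bound `MAX_N` and the function name `find_first_single_path` are hypothetical, and the caller is assumed to always pass an array with `MAX_N` elements so that every load is valid.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_N 16 /* assumed loop bound, chosen for this example */

/* Returns the index of the first element among a[0..n-1] that equals key,
 * or -1 if there is none. The loop always runs MAX_N iterations; the
 * original exit condition only guards the write to the result. */
int find_first_single_path(const int a[MAX_N], size_t n, int key)
{
    assert(n <= MAX_N);
    int result = -1;
    int found = 0;
    for (size_t i = 0; i < MAX_N; i++) {
        /* hit is 1 only for the first matching element inside the range */
        int hit = (i < n) & (a[i] == key) & !found;
        result = hit ? (int)i : result; /* guarded write back */
        found |= hit;
    }
    return result;
}
```

A conventional search would leave the loop at the first match, so its iteration count would depend on the data. Here the iteration count is fixed by `MAX_N`, which is what makes a single measurement representative.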
9 Single-Path Support in Patmos
- Constant execution time of all instructions
- Predicated instructions
  - 8 predicates
  - One is constant true
  - Write result when predicate is true
  - Otherwise do nothing (NOP instruction)
- All instructions are predicated
  - Execution time independent from predicate
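A toy C model may help to picture predicated execution as described on this slide. Everything in it is an assumption made for illustration: the type `cpu_t`, the functions `cpu_init`, `cmp_lt`, and `pred_add`, the register count, and the choice of predicate 0 as the constant-true predicate. It is not the Patmos instruction set or its emulator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PREDS 8  /* eight predicates, as on the slide */
#define NUM_REGS 32  /* assumed register count for this toy model */

typedef struct {
    int32_t r[NUM_REGS];
    bool p[NUM_PREDS]; /* p[0] is treated as constant true (assumption) */
} cpu_t;

void cpu_init(cpu_t *c)
{
    for (int i = 0; i < NUM_REGS; i++) {
        c->r[i] = 0;
    }
    for (int i = 0; i < NUM_PREDS; i++) {
        c->p[i] = false;
    }
    c->p[0] = true;
}

/* Compare: sets predicate pd to (r[rs1] < r[rs2]).
 * Kept unpredicated here for brevity; writes to p[0] are ignored. */
void cmp_lt(cpu_t *c, unsigned pd, unsigned rs1, unsigned rs2)
{
    assert(pd < NUM_PREDS && rs1 < NUM_REGS && rs2 < NUM_REGS);
    bool v = c->r[rs1] < c->r[rs2];
    if (pd != 0) {
        c->p[pd] = v;
    }
}

/* Predicated add: the sum is always computed, but it is written to the
 * destination only when the predicate is true. Otherwise the instruction
 * behaves like a NOP. */
void pred_add(cpu_t *c, unsigned pred, unsigned rd, unsigned rs1, unsigned rs2)
{
    assert(pred < NUM_PREDS);
    assert(rd < NUM_REGS && rs1 < NUM_REGS && rs2 < NUM_REGS);
    int32_t sum = (int32_t)((uint32_t)c->r[rs1] + (uint32_t)c->r[rs2]);
    c->r[rd] = c->p[pred] ? sum : c->r[rd];
}
```

In this model the same work is done whether or not the predicate holds, which mirrors the slide's point that execution time is independent of the predicate.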
10 Caches in Patmos
- Configurable: type and size
- For data: normal data cache, stack cache, and scratchpad memory
- For instructions:
  - Standard instruction cache
  - Prefetching instruction cache (SP)
  - Method cache
  - Scratchpad memory
Currently only single core (Loader issue)
11 Method Cache
- Originally developed for the Java processor JOP
  - Therefore called method cache
- Now also used in
  - SHAP
  - Merasa processor (CarCore)
  - Metzlaff PhD thesis
- Also in Patmos
12 Method Cache
- Caches whole methods/functions
  - May load unused instructions
- Misses only on call or return
  - Other instructions guaranteed hits
- Cache is divided in blocks
- Method can span several blocks
- Contiguous blocks for a method
- Replacement FIFO
- Tag memory: One entry per block
[Figure: cache blocks holding the methods foo, a, and b]
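The behaviour listed on this slide can be sketched as a small C model. It is a simplified illustration under assumptions, not the Patmos or JOP implementation: the names `mcache_t`, `mcache_init`, and `mcache_access` are hypothetical, the cache has only four blocks, method sizes are given directly in blocks, and a method's tag is stored at its first block.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BLOCKS 4    /* assumed small cache for this example */
#define NO_METHOD (-1)

typedef struct {
    int tag[NUM_BLOCKS]; /* id of the method starting at this block */
    int next;            /* FIFO pointer: first block to overwrite next */
} mcache_t;

void mcache_init(mcache_t *m)
{
    for (int i = 0; i < NUM_BLOCKS; i++) {
        m->tag[i] = NO_METHOD;
    }
    m->next = 0;
}

/* Invoked on call and on return only. Returns true on a hit.
 * On a miss the whole method is loaded into consecutive blocks
 * (wrapping around), evicting every method whose first block is
 * overwritten. */
bool mcache_access(mcache_t *m, int method_id, int blocks)
{
    assert(method_id >= 0);
    assert(blocks >= 1 && blocks <= NUM_BLOCKS);

    for (int i = 0; i < NUM_BLOCKS; i++) {
        if (m->tag[i] == method_id) {
            return true;
        }
    }

    for (int i = 0; i < blocks; i++) {
        m->tag[(m->next + i) % NUM_BLOCKS] = NO_METHOD;
    }
    m->tag[m->next] = method_id;
    m->next = (m->next + blocks) % NUM_BLOCKS;
    return false;
}
```

Because the lookup happens only at call and return, all instruction fetches inside a loaded method are hits by construction, which is the property the slide highlights.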
13 Evaluation
- TACLeBench benchmarks V 1.9
  - Self-contained benchmarks
- Patmos configured for DE2-115 FPGA board
- 8 KB instruction cache
  - 16 methods when method cache
- Cycle accurate emulator to collect the data
14 Method vs. Standard Cache
[Figure: relative performance per benchmark, from adpcm dec to statemate]
15 2-way vs Direct Mapped
[Figure: relative performance per benchmark, from adpcm dec to statemate]
16 Dynamic Benchmark Sizes
[Figure: size in bytes per benchmark, from adpcm dec to statemate]
17 Method vs Standard Cache 2 KB
[Figure: relative performance per benchmark, from adpcm dec to statemate]
18 Reproducing the Results
19 Conclusion
- Single-path code gives constant execution time
- Compared different caching organizations
- No single winner
- In FPGA we can use application specific caching