Predictable Programming on a Precision Timed Architecture
- Melissa Morrison
Transcription
1 Predictable Programming on a Precision Timed Architecture Ben Lickly, Isaac Liu, Hiren Patel, Edward Lee, University of California, Berkeley Sungjun Kim, Stephen Edwards, Columbia University, New York Presented By Ashutosh Dhekne PhD Student, University of Illinois at Urbana-Champaign
2 Goal of the Paper Rethink processor architecture to provide predictable timing. Why such a stance? Current computers are optimized for average performance, with too many time-saving tricks that complicate WCET analysis (caching, pipelined execution, virtual memory, frequency scaling). How to achieve it? Exposed memory hierarchies; thread-interleaved pipelining; deadline instructions.
3 Words that Stick [link]
4 The Familiar Architecture (x86) (external material, drawn from memory) [Diagram labels: Processor (CISC) with instruction pipeline, ALUs, internal registers, and task-switch registers; MMU; cache (cache try: low latency; cache miss: high latency); main memory; paging; HDD; IO via DMA; transparent to program]
5 The PRET Architecture (paper innovations) [Diagram labels: MMU; Processor (RISC) with thread-interleaved pipeline, ALUs, thread controller, and multiple register files; scratchpad memory (part of the memory address space) for code and data; main memory; memory wheel; memory-mapped IO; DMA]
6 Main Memory 0x x00000FFF 0x3F x x405FFFFF 0x xFFFFFFFF Boot code: used by each thread on startup; initializes all the registers. Shared data: 8 MB, shared between multiple threads. Thread-local instructions and data: 1 MB per thread (512 KB for instructions, 512 KB for data). Memory-mapped IO. The Memory Wheel: main memory is accessed only through the memory wheel. Each thread gets a 13-cycle time slot in which to access main memory. TDMA access can make main memory look busy to a thread even when no other thread is using it. In the worst case, a memory access takes 90 cycles, so the worst case is bounded.
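The 13-cycle slot and the 90-cycle bound on the slide can be related with a little arithmetic. The sketch below is one possible reading, not the paper's implementation: it assumes an access may start only on the first cycle of the requesting thread's own window and then occupies that whole window. The function names are hypothetical, and the thread count of six is an assumption that does not appear on the slide; it is chosen because it reproduces the 90-cycle figure under this model.

```c
#include <assert.h>

/* Hypothetical model of a TDMA memory wheel (names are illustrative).
 * num_threads windows of window_cycles each rotate round-robin. An access
 * may start only on the first cycle of the requesting thread's own window
 * and then occupies that whole window.
 * offset is how many cycles after the start of its own window the thread
 * issues the request. */
unsigned wheel_latency(unsigned num_threads, unsigned window_cycles,
                       unsigned offset)
{
    assert(num_threads > 0 && window_cycles > 0);
    unsigned period = num_threads * window_cycles; /* one full rotation */
    unsigned off = offset % period;
    /* Offset 0: the window is just opening, no wait.
     * Otherwise: wait for the wheel to come around again. */
    unsigned wait = (off == 0) ? 0 : period - off;
    return wait + window_cycles;
}

/* Worst case under this model: the request arrives one cycle too late. */
unsigned wheel_worst_case(unsigned num_threads, unsigned window_cycles)
{
    return wheel_latency(num_threads, window_cycles, 1);
}
```

With six threads and 13-cycle windows, one rotation is 78 cycles, so a request that just misses its window waits 77 cycles and then spends 13 cycles on the access, giving 90. The best case under the same assumptions is 13 cycles.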
7 Instruction Pipelines (external material, drawn from memory) Can we keep the pipeline always running? What about data hazards, control hazards, and structural hazards? [Diagram: pipeline slots labeled Instruction 0 through Instruction 7]
8 Thread-Interleaved Pipelines (derived from: Precision Timed Machines, Isaac Liu) What if we thread-interleave pipelines instead? Can we avoid all pipeline hazards? [Diagram: pipeline slots labeled Thread 0, Thread 1, Thread 2, Thread 3, Thread 4, Thread 0, Thread 1, Thread 2]
9 Hazardless Pipeline? Not Quite (derived from: Precision Timed Machines, Isaac Liu) Can we ensure no hazards in thread-interleaved pipelines? Always fill the pipeline with instructions from distinct threads. No explicit control dependencies between threads: no control hazard. Long-latency instructions: prevent two instructions from the same thread from being in the pipeline together: no data hazard. Very few concurrent threads: push in NOPs: no data hazard. Access to multi-cycle shared resources (e.g. memory): structural hazard. TDMA access to the shared resources removes timing dependencies. Nonetheless, removing interdependence between pipeline units eases timing analysis.
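The rule "always fill the pipeline with instructions from distinct threads" can be checked with a small model. This is a minimal sketch assuming strict round-robin fetch, one instruction per cycle; the function names and the stage and thread counts used in the example are hypothetical and not taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Thread whose instruction occupies `stage` at `cycle`, assuming strict
 * round-robin fetch (thread = fetch cycle modulo num_threads).
 * Returns -1 while the pipeline is still filling and the stage is empty. */
int thread_in_stage(unsigned cycle, unsigned stage, unsigned num_threads)
{
    assert(num_threads > 0);
    if (cycle < stage)
        return -1;
    /* The instruction in `stage` was fetched `stage` cycles ago. */
    return (int)((cycle - stage) % num_threads);
}

/* True if no two occupied stages hold instructions from the same thread. */
bool stages_hold_distinct_threads(unsigned cycle, unsigned num_stages,
                                  unsigned num_threads)
{
    for (unsigned a = 0; a < num_stages; a++) {
        for (unsigned b = a + 1; b < num_stages; b++) {
            int ta = thread_in_stage(cycle, a, num_threads);
            int tb = thread_in_stage(cycle, b, num_threads);
            if (ta >= 0 && ta == tb)
                return false;
        }
    }
    return true;
}
```

In this model the check holds whenever there are at least as many threads as pipeline stages. With fewer threads, two instructions from the same thread end up in flight together, which is the situation the slide resolves by pushing in NOPs.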
10 Deadline hit Deadline miss Derived from: Precision Timed Machines, Isaac Liu Deadline Handling Deadline of Task 1A) Finish the task and detect at the end, if the deadline was missed 1B) Immediately handle a missed deadline 2A) Continue with next task 2B) Stall before next task Task Next Task Deadline Miss Handler Preemption Stall
11 Deadline hit Deadline miss Derived from: Precision Timed Machines, Isaac Liu Deadline Handling Deadline of Task 1A) Finish the task and detect at the end, if the deadline was missed 1B) Immediately handle a missed deadline Future Work 2A) Continue with next task 2B) Stall before next task Task Next Task Deadline Miss Handler Preemption Stall
12 The Deadline Instruction A per-thread deadline register t_i. DEAD(x) blocks until t_i reaches zero. It then loads the value x into the register and executes the next instruction. The paper does not handle missed deadlines. The deadline register is checked in the register access stage, and the instruction is replayed until the register becomes zero. Producer:
    int main() {
        DEAD(28);   /* Register t_i is loaded with value 28 */
        volatile unsigned int *buf = (unsigned int*) (0x3F800200);
        unsigned int i = 0;
        for (i=0; ; i++) {
            DEAD(26);   /* Program waits here until t_i becomes zero, then loads 26. */
                        /* If program returns here due to the loop, it might wait again. */
            *buf = i;
        }
        return 0;
    }
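The blocking behaviour of DEAD(x) can be mimicked in plain C to see why each loop iteration of the producer takes a fixed amount of time. This is a software sketch only, not the hardware mechanism: the `thread_clock` type and the `run_for` and `dead` functions are hypothetical names, and time is counted in abstract units because the slide does not say what the register counts.

```c
#include <assert.h>

/* Hypothetical software model of one thread's deadline register. */
typedef struct {
    unsigned long now;  /* time units elapsed for this thread */
    unsigned timer;     /* deadline register t_i, counts down to zero */
} thread_clock;

/* Execute ordinary code for `units`; the timer counts down but stops at 0. */
void run_for(thread_clock *t, unsigned units)
{
    assert(t != 0);
    t->now += units;
    t->timer = (units >= t->timer) ? 0 : t->timer - units;
}

/* DEAD(x): stall until the timer reaches zero, then load x.
 * Returns how many units were spent stalled. */
unsigned dead(thread_clock *t, unsigned x)
{
    assert(t != 0);
    unsigned stalled = t->timer; /* remaining time is spent waiting */
    t->now += stalled;
    t->timer = x;
    return stalled;
}
```

In this model, a loop body that needs 10 units between two `dead(&t, 26)` calls stalls for the remaining 16, so every iteration lasts 26 units. A body that overruns, say 30 units, is not detected: the next `dead` simply returns at once, which matches the slide's remark that missed deadlines are not handled.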
13 Example Game Commands Command Queues Commands Pixel Data Even Buffer Odd Buffer Pixel Data Game Logic Thread Swap 2 Graphics Controller Thread Swap 2 Video Driver Thread 1 New graphics available (Sync Request) 1 Refresh Screen (VSync Request) Sync Complete (Queue Swapped) 3 VSync (Frame Buffer Swapped) 3
14 VGA Real-time Constraints VGA Vsync Time VGA Hsync Time Sixteen Pixels at a time
15 Experiences from the Two Samples It is possible to provide timing guarantees using the PRET architecture. But timing calculations by hand are error-prone; automated tools will be provided in the future. The underlying architecture lacks synchronization primitives; simple synchronization can be achieved using the deadline instructions.
16 Comparison with the LEON3 Average-case time degradation is studied. PRET shows significant degradation due to lack of parallel threads. None of the special PRET features are used. Degradation factor < 6; no pipeline hazard advantage?
17 Conclusions The paper builds a remarkable architecture using a SystemC model. It introduces a new instruction for one type of deadline. PRET keeps the memory hierarchy and its timing differences exposed to the user. The model runs actual C programs and a small game. The comparison between LEON3 and PRET at the end is somewhat unfair. It is possible to modify a RISC processor to have predictable timing.
18 Some Observations With a project of this scale, it is difficult to fit all details in a paper; I had to refer to one author's thesis to gain insights. The memory wheel assumes all threads will use memory equally. I would suggest reducing the LEON3 comparison and including more fundamental insights instead. Overall, the work is commendable: it provides some thoughts not discussed in any previous paper and is a true systems-level work. Can off-the-shelf architectures provide a strict WCET mode?
19 Thanks!