On Simulation
Jakob Engblom, PhD
Virtutech & Uppsala University

Simulation: Modeling + Execution
- Build a model of the system
- Try various scenarios on this model
- Experimental, not analytical approach
- Understand the real system by working with the model
  - More available
  - More inspectable
  - Less dangerous

Simulation or Analysis
- Simulation gets closer to real world
  - More details
  - Fewer assumptions
  - High computational workload
- Analytical models
  - Efficient predictors
  - Low computational workload
  - ...but more removed from world = less accurate

Sufficient Level of Detail
- Maintain sufficient details
  - To observe relevant aspects of reality
  - To avoid artifacts of experiment
- Abstract away unimportant aspects
  - Newtonian vs. quantum physics
  - Timing vs. function
- Danger: bad abstractions = bad simulation

Scope versus Abstraction
[Figure: level of abstraction (from string theory up to galaxies) against scope of model (from an atom up to the universe)]
- Simulating a single atom, we can use the incredible detail of quantum mechanics and string theory
- To simulate the universe, the units of simulation have to be galaxies
- Reasonable to simulate: scope proportional to abstraction

Example: Scope/Detail tradeoff
- Grand-Prix Legends (GPL)
- Life-like action:
  - Momentum
  - Friction
  - Steering
  - Engine torque
- Not nuts & bolts of cars

Simulation is never perfect
- It is never quite the real thing...
- ...but it can be very close indeed

Simulating Computers

Simulating Computer Systems
- We need to decide the level of abstraction
  - More detail = smaller scope
  - Less detail = larger scope
- Scope:
  - Size of systems that can be investigated
  - Number of different systems
- Measure of scope: speed
  - As number of software instructions per second

Simulating Computer Systems
- What do we need to simulate?
  - Processor
  - Peripherals
  - Program
  - Stimuli

Detailed Hardware Models
- Transistor-level model
  - Very close to actual implementation
- Small scope
  - Small piece of HW
  - Small programs
  - Stimuli at bit level
- Speed: 100s of instructions per second
  - With 25MUSD hardware: KIPS
- Necessary for hardware development

Instruction-Set Simulation
- Model computer at instruction set level
  - Stable & defined interface
  - The level where hardware & software meet
  - Stimuli at transaction level
- Abstractions to increase scope:
  - Keep functionality correct
  - Vary fidelity in timing
  - Simplify some behavior
- Speed: 10 KIPS to 100 MIPS 700 MIPS
- Key issue: there can be no software visible difference (including to the OS)

Sufficient Detail of Model
- Complete from a software perspective
  - All readable values represented
  - All registers of CPU implemented
  - Software = OS, drivers, applications, middleware,...
- Hardware considered as a set of devices
  - I/O-space or memory mapped
  - Behavior at level seen by device drivers
  - No abstract networks, all concrete
- Next slide: example of detail required

Instruction-Set Simulation

Full-System Simulation
- To run real workloads, you need
  - Hardware: CPU & devices
  - OS and other services
  - Stimuli to feed them
- Common methods to achieve this
  - Full-system simulation
  - Virtualization
  - User-level simulation

[Figure: Virtutech Simics. One physical computer, virtual computer systems of many different types]

Not Full-System Simulation: Virtualization
[Figure: virtualization system. One physical computer, several virtual computers of the same type]

Full-System Simulation
[Figure: software stack on simulated hardware]
- Real OS & software: user program, middleware, DB, servers, operating system
- Simulated hardware: CPU, RAM, disk, network, GPU, device controller

Not Full-System Simulation: User-level simulation
[Figure: user program on a simulated environment]
- Real user program
- Simulated OS, services, some HW (middleware, DB, servers, operating system, CPU, RAM, hardware)

Speed
- Depends on level of timing detail in model
- Slowest: cycle-accurate simulation
  - Hardware timing modeled in great detail
- Fastest: emulation (user-level only)
- Sweet spot: somewhere in between
  - Simics tries to hit this spot
  - Configurable level of detail

Speed
[Figure: accuracy against speed, on a speed axis running from 10 KIPS to 1000 MIPS]
- Detailed hardware sim
- Cycle-accurate simulator (>10,000x)
- Fast full-system simulation (20-400x)
- Emulator (5x)
- Virtualization

Going up in Scope
- Interesting systems are larger than single CPU
- Multiprocessors
  - Homogeneous like servers
  - Heterogeneous like mobile phones
- Distributed systems
  - Local-Area Networks
  - Embedded CAN buses
  - Networks-on-chips
- = Simulated shared memory, networks

Distributed Network Simulation
- Level of simulation
  - Entire packets, not physical layer
  - Simulate the network cards in nodes
- Spread simulation across multiple machines
  - Necessary increase of speed
- Still, maintain determinism
  - Synchronize simulated machines
  - One machine stops, all machines stop
  - Global checkpointing & restore

Network Simulation
[Figure: simulated network of simulated machines, with an interface to a real network of physical machines if needed]
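
The determinism rule on the Distributed Network Simulation slide (synchronize simulated machines; one machine stops, all machines stop) can be illustrated with a small lockstep loop. This is only a sketch under stated assumptions, not how Simics or any other simulator implements it: the `Machine` class, the fixed time `quantum`, and the one-packet-per-quantum workload are all hypothetical. Each machine runs up to a shared deadline, and whole packets are exchanged only at that barrier, in a fixed order, so repeated runs give identical traces.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical simulated node with its own virtual clock."""
    name: str
    now: int = 0                      # local virtual time, in cycles
    inbox: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def run_until(self, deadline, outbox):
        # Consume packets delivered at the previous barrier in a fixed order.
        for sent_at, src, payload in sorted(self.inbox):
            self.log.append((self.now, src, payload))
        self.inbox.clear()
        # Illustrative workload: send one whole packet per quantum.
        outbox.append((self.now, self.name, f"hello@{self.now}"))
        self.now = deadline


def simulate(names, quantum, quanta):
    """Advance all machines in lockstep, one time quantum at a time."""
    machines = [Machine(n) for n in names]
    for step in range(1, quanta + 1):
        deadline = step * quantum
        outbox = []
        for m in machines:
            m.run_until(deadline, outbox)
        # Barrier: every machine has stopped at `deadline`; only now are
        # packets handed over, so delivery order never depends on host timing.
        for sent_at, src, payload in outbox:
            for m in machines:
                if m.name != src:
                    m.inbox.append((sent_at, src, payload))
    return {m.name: m.log for m in machines}


if __name__ == "__main__":
    print(simulate(["a", "b"], quantum=1000, quanta=3))
```

A larger quantum means fewer barriers and higher speed, at the cost of coarser packet timing; that is one instance of the scope versus detail tradeoff discussed earlier.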

Simulation Advantages
- Configurability
  - Simulate anything, independent of available hardware
  - Target architecture
  - System configuration
- Availability
  - Easy to copy setup, no manufacturing involved
- Determinism
  - Removes real-world indeterminism
  - Synchronization across machines and networks
- Checkpoint & restart
  - Save state of machine to reload later
  - Parallelize & repeat runs
  - Distribute fixed starting points
- Non-intrusive inspection & tracing
  - Any events or state in the machine
  - Does not affect running system
  - IO events, hardware events
  - Deep inspection of system state
    - Caches, TLBs, registers, device registers, buffers...
- Sandboxing
  - Completely walled-in
  - No hidden communications
  - Undo state changes
  - Dangerous experiments possible
    - Viruses, worms, buffer overflows, ...
  - Magic instructions: allow programs to communicate with the outside

HW/SW Cosimulation

Integrated Systems
- Highly-integrated devices on the rise
- Develop HW & SW in parallel
- Simulate hardware and software together in development of entire system
[Figure: integrated device with Bluetooth, GSM radio, LCD driver, data memory, CPU, DSP, code memory]

Big Systems & Small Details
- To achieve speed: reduce level of detail
- To capture important effects: increase level
- Solution: model only parts at great detail
  - Finished hardware can be modeled simply
  - Model only what needs to be observed
  - Mostly, no need for RTL-level understanding

Transactions vs Pins
- Transaction-level modeling:
  - Model transactions as a unit
  - Level of model: Memory read / Network packet send /...
  - Only when something is activated
- Pin-level modeling:
  - Model detailed electronics of a transaction
  - Level of model: Individual pins
  - Clocked pulsing of transmission pins
  - Every clock cycle

Clocking vs Blocking
- Traditional hardware modeling:
  - One (or two) step per clock cycle
  - Clock to generate evolution of internal state
  - = All devices called each cycle
  - Large overhead for context switching
- Optimized hardware modeling (blocking):
  - Only call when events (read, writes) occur
  - Evolve internal state several cycles at a time
  - Count the time since last activation
  - Lower context switch overhead

Transactions/Events vs Pins/Clock
- Example: device read operations (CPU to device)
- Transaction:
  - Call device model: (op=read, address=0x17)
  - Immediate reply: data=0x42
- Pins:
  - Set address pins to
  - Drive clock pin to 1 and then 0... until data ready pin is 1
  - Then read from data pins

HW/SW Cosimulation: fast
[Figure: Simics running memory, application, (RT)OS, drivers, CPU core and devices]
- Devices as behavioral model
- Interface: transactions, events, maybe clock cycles

HW/SW Cosimulation: detailed
[Figure: Simics running memory, application, (RT)OS, drivers, CPU core and devices, coupled to a VHDL/Verilog simulator]
- RTL-level simulation
- Interface: pins, clock cycles
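
The device read example above can be sketched in code to show why the transaction style is cheaper for the simulator. Everything here is illustrative: the class names, the three-cycle `LATENCY`, and the `calls` counters are assumptions made for the sketch, not part of any simulator API. The transaction-level model is entered once per read, while the pin-level model is entered once per clock pulse until its data-ready pin goes high.

```python
class TransactionDevice:
    """Transaction-level model: called only when something happens."""

    def __init__(self, memory):
        self.memory = memory
        self.calls = 0            # how often the simulator entered the model

    def access(self, op, address):
        self.calls += 1
        if op != "read":
            raise ValueError("this sketch only models reads")
        return self.memory[address]   # immediate reply


class ClockedDevice:
    """Pin-level model: state only evolves when the clock pin is pulsed."""

    LATENCY = 3                   # hypothetical cycles until data is ready

    def __init__(self, memory):
        self.memory = memory
        self.address_pins = 0
        self.data_pins = 0
        self.data_ready = 0
        self._countdown = None
        self.calls = 0

    def set_address(self, address):
        self.address_pins = address
        self.data_ready = 0
        self._countdown = self.LATENCY

    def clock(self):
        # One call stands for driving the clock pin to 1 and then 0.
        self.calls += 1
        if self._countdown is None:
            return
        self._countdown -= 1
        if self._countdown == 0:
            self.data_pins = self.memory[self.address_pins]
            self.data_ready = 1
            self._countdown = None


def pin_level_read(device, address):
    device.set_address(address)
    while not device.data_ready:  # pulse until the data ready pin is 1
        device.clock()
    return device.data_pins       # then read from the data pins


if __name__ == "__main__":
    memory = {0x17: 0x42}
    fast, slow = TransactionDevice(memory), ClockedDevice(memory)
    print(hex(fast.access("read", 0x17)), fast.calls)
    print(hex(pin_level_read(slow, 0x17)), slow.calls)
```

Both models return the same data; only the number of times the simulator has to enter the device model differs, which is the overhead the Clocking vs Blocking slide refers to.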

Device Modeling
- Large part of work for a platform
  - Processors: few and standardized
  - Devices: (very) many and varied. But simpler.
- Still pretty fast, at transaction level
- Modeling devices:
  - C/C++/Python with simulator APIs
  - SystemC
  - VHDL/Verilog
  - Graphical languages (Magic-C)

Stimulating a Simulation

Stimuli
- Without proper stimuli, model is useless
- Feed mechanism
  - How to get information into the simulation
- Data generation
  - What to supply to the simulation
- Can get tricky

Regular Computers
- Fixed inputs
  - Spec benchmarks: loaded from disk
- Network
  - Load generation on simulated machines
  - Interface to a real network
  - Load generators on real machines
- Interactive use
  - Keyboard & mouse
  - Map directly to real device
  - Easy for PC-on-PC-style
  - Interactive user

Non-traditional Computers
- Phones, navigation computers, PDAs, etc.
- Application development
  - Use GUI to provide interactive sessions with user
  - Keyboard, joystick, touch screen
  - Not radio data etc.

Physical World Interaction
- Special simulated devices
  - Sensors & actuators
- Data sources
  - Statistical models of real system behavior
  - Simulation models of physical reality
  - Hardware-in-the-loop simulation

Configuration as Stimuli
- Stimuli = hardware configuration
- Booting an operating system
  - Test of OS software vs hardware
  - Reconfigure hardware, alter devices
- Self-configuring systems
  - Networks & other distributed systems
  - Master election, device discovery, etc.
  - Adding/removing simulated nodes

Workload Scaling
- Problem: simulation is slow
  - Especially for detailed architectural simulation
  - Slowdown 10000: 1 minute real time = 7 days simulation time
- Scale (down) workloads to fit
  - Smaller data sets
- How to make representative of full runs?
  - Tricky problem in its own right
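
The slowdown figure on the Workload Scaling slide is simple arithmetic, worked through below. The reading assumed here is that one minute of execution on the real machine costs about seven days of host time at a slowdown of 10000. The helper names and the eight-hour budget are hypothetical, chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def host_seconds(target_seconds, slowdown):
    """Host time needed to simulate `target_seconds` of target execution."""
    return target_seconds * slowdown


def target_seconds_within(host_budget_seconds, slowdown):
    """How much target execution fits into a given host-time budget."""
    return host_budget_seconds / slowdown


if __name__ == "__main__":
    days = host_seconds(60, 10_000) / SECONDS_PER_DAY
    print(f"1 minute at slowdown 10000 takes {days:.2f} days")   # about 6.94

    # Hypothetical budget: an overnight run of 8 hours.
    fits = target_seconds_within(8 * 3600, 10_000)
    print(f"8 hours of host time covers {fits:.2f} s of target time")
```

The second number is what workload scaling has to work with: at this level of detail, an overnight run covers only a few seconds of target execution, so the data set must shrink accordingly.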

Using Simulation

Software Development
- Low-level software development
  - Supervisor-level (OS) & interrupt code debug
  - Inspection of system state
  - Device access tracing & breakpoints
  - Debugging unfinished operating systems
  - Developing drivers
- High-level software development
  - Powerful debugger, with checkpointing

Hardware Replacement
- Embedded HW
  - Cheaper, more convenient, available, stable
  - Often USD development platforms
- Boards under development
  - AMD64 (Hammer, Opteron, Athlon 64)
  - Saved months for the Linux/AMD64 ports
  - Next-gen UltraSparcs,...

Hardware Development
- Model hardware in development
  - Test components before physical prototype
  - Virtual platform for early software dev
- Stimulate HW with real workloads
  - Requires ability to run operating systems
- HW/SW cosimulation
  - At various levels of detail
- Shortens time to market dramatically

Parallelization of Development
[Figure: two development timelines]
- Without simulation: board design, board prototype, handoff to the software team when working hardware exists, software development
- With simulation: board design & build simulator (simulator = reference), handoff to the software team using a simulation of the hardware platform, software development in parallel with the board prototype

Network Software
- Develop network stacks & protocols
  - Easy to instrument the network, trace traffic
  - Easy to inject packets
  - No interference from other traffic
  - Synchronous breaks at important events
- Try network configurations
  - Large networks
  - Pathological topologies

Performance Tuning
- Performance tuning of software
- Trace & statistics on performance events
  - Cache misses, TLB misses, disk accesses
  - Memory access patterns
  - Get first-order estimates from event counts
- Absolute performance measurements
  - Requires very detailed models
  - Not a design goal of large-scale simulators

Faults and Boundary Cases
- Fault injection
  - Repeatable, no physical damage necessary
  - Fault tolerant systems, safety critical systems
  - Examples: next slide
- Boundary case testing
  - Extremely small or large configurations
  - Intense bursts of interrupts
  - Communications latencies and intensity
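
Repeatable fault injection, as listed above, is commonly built on checkpoints: run the workload once from a saved state, restore that state, inject a fault, and run again. The sketch below shows the idea on a toy model. `ToyMachine`, its two registers, and the bit-flip fault are hypothetical stand-ins for a real simulator's checkpoint and inspection facilities.

```python
import copy


class ToyMachine:
    """Hypothetical deterministic machine whose state is a register file."""

    def __init__(self):
        self.registers = {"acc": 0, "step": 0}

    def run(self, steps):
        for _ in range(steps):
            self.registers["step"] += 1
            self.registers["acc"] += self.registers["step"]
        return self.registers["acc"]

    def checkpoint(self):
        return copy.deepcopy(self.registers)

    def restore(self, snapshot):
        self.registers = copy.deepcopy(snapshot)


def fault_affects_result(machine, warmup, steps, fault):
    machine.run(warmup)               # boot the system and position the workload
    snapshot = machine.checkpoint()   # take checkpoint
    golden = machine.run(steps)       # run 1: fault-free reference
    machine.restore(snapshot)         # restore checkpoint
    fault(machine)                    # inject the fault
    faulty = machine.run(steps)       # run 2: same start state, plus the fault
    return golden != faulty, golden, faulty


def flip_bit_4(machine):
    """Illustrative fault: corrupt a register value by flipping one bit."""
    machine.registers["acc"] ^= 1 << 4


if __name__ == "__main__":
    print(fault_affects_result(ToyMachine(), warmup=10, steps=5, fault=flip_bit_4))
```

Because both runs start from the identical checkpoint and the model is deterministic, any difference in the result can be attributed to the injected fault.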

Fault Injection Examples
[Figure: system with CPU, sensor, RAM, devices, bridge and network, annotated with faults]
- CPU: corrupt register values
- Sensor: corrupt measurements
- RAM: permanent bit errors
- Device: unplug a device
- Transient errors in xmit
- Kill entire subsystem
- Network: corrupt network packets; unplug

Fault Injection & Checkpointing
[Figure: flow of a checkpointed fault-injection experiment]
- Boot system SW
- Position workload
- Take checkpoint (CKP)
- Run 1, check results
- Restore checkpoint, injected fault
- Run 2, check results
- Did the fault affect the result?

Teaching
- Enable hands-on experience
  - Computer architecture
  - Embedded systems programming
  - Operating systems
- Debug half-finished systems
- Same setup for all students, easy handins
- System management
  - Easy to restore system state
  - No risk to real machines and networks

Simulate with Care

Obtaining Significant Results
- Computer architecture research
  - 90% or more done in simulation
- Measure of success: effect of modification to a reference machine
- What is a significant result?
  - -5%? +10%?
- How real is the machine modified?
  - SimpleScalar is not a real processor
  - = need for quite extensive modeling

Wisconsin Experiments
- Mark Hill et al, IEEE Computer Feb 2003
- Investigating potential pitfalls of simulation
- Detailed microarchitectural modeling
  - Pipeline, caches, reordering, the works
- Randomized L2 miss time (80-89 cycles)
- Several runs with same workload
- Variable results!

Wisconsin Experiments
[Figure: cycles per transaction (millions) against ROB size, showing max, avg and min over runs]
- WCR = "Wrong Conclusion Ratio"
- WCR (16,32) = 18%
- WCR (16,64) = 7.5%
- WCR (32,64) = 26%

Wisconsin Experiments
[Figure: cycles per transaction (millions) against sample size (number of runs)]
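
One way to guard against wrong conclusions of the kind measured above is to compare configurations by confidence intervals over several randomized runs instead of by single runs. The sketch below is a minimal illustration, not the procedure from the cited article: it uses a normal approximation (a Student t interval would be wider for so few runs), and the sample values are made up.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / len(samples) ** 0.5
    center = mean(samples)
    return center - half_width, center + half_width


def significantly_different(a, b, confidence=0.95):
    """True only if the two confidence intervals do not overlap."""
    lo_a, hi_a = confidence_interval(a, confidence)
    lo_b, hi_b = confidence_interval(b, confidence)
    return hi_a < lo_b or hi_b < lo_a


if __name__ == "__main__":
    # Made-up cycles-per-transaction figures (millions) from repeated runs.
    config_small = [10.0, 10.2, 9.8, 10.1, 9.9]
    config_large = [12.0, 12.1, 11.9, 12.2, 11.8]
    config_other = [10.0, 10.3, 9.7, 10.2, 9.8]
    print(significantly_different(config_small, config_large))
    print(significantly_different(config_small, config_other))
```

Requiring non-overlapping intervals is a conservative test: it can miss a real difference, but it rarely reports one that is only run-to-run noise.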

Wisconsin Experiments
- Conclusions:
  - Simulation no different from runs on real HW
  - Use standard statistics
  - Non-overlapping confidence intervals
- Danger of determinism in simulation
  - Testing a single path of a program
  - Induce variability by randomization

Implementations

Full-System Simulators
- Integrated full-system simulators
  - Virtutech Simics (better at abstraction)
  - Virtio (more on the pin-level)
- Frameworks for combining discrete simulators
  - Combine ISS with VHDL and Verilog simulators
  - Mentor Graphics Seamless
  - Cadence Incisive
- Virtualization environments
  - VmWare (fast PC-on-PC)
  - Connectix/Microsoft VirtualPC (fast PC-on-PC)

Almost done...

Cost of Simulation
- Sim in 1977: on DEC VAX-11/ ,000 USD (1977 dollars)
  - 1 VAX MIPS
  - simulation technology ~200
  - cost for simulated server hour: 4,000 USD
  - Photo courtesy of The Computer Museum History Center
- Sim in 2002: on Dell PC (P4 2.2 GHz), 1,500 USD
  - approx 3100 VAX MIPS
  - simulation technology ~40
  - cost for simulated server hour: 2 USD
  - Photo courtesy of Intel
- a factor of 2000 in 25 years!

[Figure: simulated systems, all running on a Linux host]
- Linux on Itanium
- VxWorks on PowerPC
- Windows NT on x86
- Solaris on Sun SunFire
- Linux on x86
- Windows XP/64 on AMD Hammer

Demo Time!
- Booting Linux
- Lifting checkpoints of Windows and Solaris
- Kernel debugging
- IO access history
- Configuration file syntax
- ... and more...

Thank You
info@virtutech.com
More informationCache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance
6.823, L11--1 Cache Performance and Memory Management: From Absolute Addresses to Demand Paging Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Cache Performance 6.823,
More informationSoftware Quality is Directly Proportional to Simulation Speed
Software Quality is Directly Proportional to Simulation Speed CDNLive! 11 March 2014 Larry Lapides Page 1 Software Quality is Directly Proportional to Test Speed Intuitively obvious (so my presentation
More informationChapter 2. OS Overview
Operating System Chapter 2. OS Overview Lynn Choi School of Electrical Engineering Class Information Lecturer Prof. Lynn Choi, School of Electrical Eng. Phone: 3290-3249, Kong-Hak-Kwan 411, lchoi@korea.ac.kr,
More informationMemory Subsystem Profiling with the Sun Studio Performance Analyzer
Memory Subsystem Profiling with the Sun Studio Performance Analyzer CScADS, July 20, 2009 Marty Itzkowitz, Analyzer Project Lead Sun Microsystems Inc. marty.itzkowitz@sun.com Outline Memory performance
More informationNetconf RCU and Breakage. Paul E. McKenney IBM Distinguished Engineer & CTO Linux Linux Technology Center IBM Corporation
RCU and Breakage Paul E. McKenney IBM Distinguished Engineer & CTO Linux Linux Technology Center Copyright 2009 IBM 2002 IBM Corporation Overview What the #$I#@(&!!! is RCU-bh for??? RCU status in mainline
More informationSimulation-Based FlexRay TM Conformance Testing an OVM success story
Simulation-Based FlexRay TM Conformance Testing an OVM success story Mark Litterick, Co-founder & Verification Consultant, Verilab Abstract This article presents a case study on how the Open Verification
More informationLecture 3: Evaluating Computer Architectures. How to design something:
Lecture 3: Evaluating Computer Architectures Announcements - (none) Last Time constraints imposed by technology Computer elements Circuits and timing Today Performance analysis Amdahl s Law Performance
More informationEECS 452 Lecture 9 TLP Thread-Level Parallelism
EECS 452 Lecture 9 TLP Thread-Level Parallelism Instructor: Gokhan Memik EECS Dept., Northwestern University The lecture is adapted from slides by Iris Bahar (Brown), James Hoe (CMU), and John Shen (CMU
More informationProtoFlex Tutorial: Full-System MP Simulations Using FPGAs
rotoflex Tutorial: Full-System M Simulations Using FGAs Eric S. Chung, Michael apamichael, Eriko Nurvitadhi, James C. Hoe, Babak Falsafi, Ken Mai ROTOFLEX Computer Architecture Lab at Our work in this
More informationCombining Arm & RISC-V in Heterogeneous Designs
Combining Arm & RISC-V in Heterogeneous Designs Gajinder Panesar, CTO, UltraSoC gajinder.panesar@ultrasoc.com RISC-V Summit 3 5 December 2018 Santa Clara, USA Problem statement Deterministic multi-core
More informationReal-Time Component Software. slide credits: H. Kopetz, P. Puschner
Real-Time Component Software slide credits: H. Kopetz, P. Puschner Overview OS services Task Structure Task Interaction Input/Output Error Detection 2 Operating System and Middleware Application Software
More informationMultiprocessor System. Multiprocessor Systems. Bus Based UMA. Types of Multiprocessors (MPs) Cache Consistency. Bus Based UMA. Chapter 8, 8.
Multiprocessor System Multiprocessor Systems Chapter 8, 8.1 We will look at shared-memory multiprocessors More than one processor sharing the same memory A single CPU can only go so fast Use more than
More informationIntroduction to Embedded Systems
Introduction to Embedded Systems Minsoo Ryu Hanyang University Outline 1. Definition of embedded systems 2. History and applications 3. Characteristics of embedded systems Purposes and constraints User
More informationProfiling: Understand Your Application
Profiling: Understand Your Application Michal Merta michal.merta@vsb.cz 1st of March 2018 Agenda Hardware events based sampling Some fundamental bottlenecks Overview of profiling tools perf tools Intel
More informationSoftware Driven Verification at SoC Level. Perspec System Verifier Overview
Software Driven Verification at SoC Level Perspec System Verifier Overview June 2015 IP to SoC hardware/software integration and verification flows Cadence methodology and focus Applications (Basic to
More informationECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Evaluation Metrics, Simulation, and Workloads
Advanced Computer Architecture II (Parallel Computer Architecture) Evaluation Metrics, Simulation, and Workloads Copyright 2010 Daniel J. Sorin Duke University Outline Metrics Methodologies Modeling Simulation
More informationOptimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs
Optimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs Niu Feng Technical Specialist, ARM Tech Symposia 2016 Agenda Introduction Challenges: Optimizing cache coherent subsystem
More informationThe Computer Revolution. Classes of Computers. Chapter 1
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition 1 Chapter 1 Computer Abstractions and Technology 1 The Computer Revolution Progress in computer technology Underpinned by Moore
More informationEmbedded Systems. 7. System Components
Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
More information7/28/ Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc.
Technology in Action Technology in Action Chapter 9 Behind the Scenes: A Closer Look a System Hardware Chapter Topics Computer switches Binary number system Inside the CPU Cache memory Types of RAM Computer
More informationOperating Systems. Operating System Structure. Lecture 2 Michael O Boyle
Operating Systems Operating System Structure Lecture 2 Michael O Boyle 1 Overview Architecture impact User operating interaction User vs kernel Syscall Operating System structure Layers Examples 2 Lower-level
More informationCodesign Framework. Parts of this lecture are borrowed from lectures of Johan Lilius of TUCS and ASV/LL of UC Berkeley available in their web.
Codesign Framework Parts of this lecture are borrowed from lectures of Johan Lilius of TUCS and ASV/LL of UC Berkeley available in their web. Embedded Processor Types General Purpose Expensive, requires
More informationChecker Processors. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India
Advanced Department of Computer Science Indian Institute of Technology New Delhi, India Outline Introduction Advanced 1 Introduction 2 Checker Pipeline Checking Mechanism 3 Advanced Core Checker L1 Failure
More informationBasic Concepts COE 205. Computer Organization and Assembly Language Dr. Aiman El-Maleh
Basic Concepts COE 205 Computer Organization and Assembly Language Dr. Aiman El-Maleh College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals [Adapted from slides of
More informationApril 4, 2001: Debugging Your C24x DSP Design Using Code Composer Studio Real-Time Monitor
1 This presentation was part of TI s Monthly TMS320 DSP Technology Webcast Series April 4, 2001: Debugging Your C24x DSP Design Using Code Composer Studio Real-Time Monitor To view this 1-hour 1 webcast
More informationChapter 1 Computer System Overview
Operating Systems: Internals and Design Principles Chapter 1 Computer System Overview Seventh Edition By William Stallings Objectives of Chapter To provide a grand tour of the major computer system components:
More informationMultiprocessor Systems. COMP s1
Multiprocessor Systems 1 Multiprocessor System We will look at shared-memory multiprocessors More than one processor sharing the same memory A single CPU can only go so fast Use more than one CPU to improve
More informationARM Processors for Embedded Applications
ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or
More informationSimXMD Co-Debugging Software and Hardware in FPGA Embedded Systems
University of Toronto FPGA Seminar SimXMD Co-Debugging Software and Hardware in FPGA Embedded Systems Ruediger Willenberg and Paul Chow High-Performance Reconfigurable Computing Group University of Toronto
More informationEmbedded Computation
Embedded Computation What is an Embedded Processor? Any device that includes a programmable computer, but is not itself a general-purpose computer [W. Wolf, 2000]. Commonly found in cell phones, automobiles,
More informationEI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)
EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building
More informationEITF20: Computer Architecture Part 5.1.1: Virtual Memory
EITF20: Computer Architecture Part 5.1.1: Virtual Memory Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache optimization Virtual memory Case study AMD Opteron Summary 2 Memory hierarchy 3 Cache
More informationSpectre and Meltdown. Clifford Wolf q/talk
Spectre and Meltdown Clifford Wolf q/talk 2018-01-30 Spectre and Meltdown Spectre (CVE-2017-5753 and CVE-2017-5715) Is an architectural security bug that effects most modern processors with speculative
More information