System Simulator for x86
1 MARSS: Micro Architecture & System Simulator for x86. CAPS, SUNY Binghamton. Presenter: Avadh Patel
2 Present State of Academic Simulators. Majority of academic simulators: are for non-x86 platforms; do not support full-system simulation; simulate single-core CPUs. Current state of x86 simulators: very few open-source, cycle-accurate, full-system simulators; some are based on PTLsim (which was developed by Matt Yourst of the CAPS group); most target single-core designs. QUF'11, DATE 2011, MARSSx86
3 Future Computing Systems. Server/desktop space: integrated systems; many modules in one chip; IO-bound software applications; hardware-software co-design. Mobile space: SoCs will integrate more modules; operating systems will become more complex. Researchers will require more powerful tools for innovative designs.
4 Simulator for Future. Enabling research of future microprocessors and SoCs: hardware-software co-design; heterogeneous architecture; real IO activities; allow existing models to integrate with ease. MUST run unmodified binaries.
5 PTLsim. Cycle-accurate x86 simulator with: a detailed out-of-order pipeline; simple cache/memory simulation; Xen used to support full-system simulation. Limitations: setup required a custom kernel with Xen support; difficult to tap into IO activities; no support for using external modules/libraries.
6 QEMU: Port of Choice. Fully user-space-level full-system emulation (PTLsim/X was suffering because of the Xen setup). Runs unmodified binaries and OSes. Tons of emulated devices.
7 QEMU: Port of Choice. Support for checkpoints/snapshots: simulators are very slow, so it is not feasible to boot an OS for each benchmark run; leverage the QCOW2 format's support to create snapshots at user-specified addresses in benchmarks.
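The checkpoint idea on this slide can be illustrated with a toy copy-on-write disk. This is a conceptual sketch only, not QEMU's QCOW2 implementation or MARSS code; the class `CowDisk`, its methods, and the snapshot name `roi_entry` are hypothetical names chosen for illustration.

```python
class CowDisk:
    """Toy copy-on-write disk: writes land in an overlay, the base image is never modified."""

    def __init__(self, base_blocks):
        self.base = dict(base_blocks)  # read-only base image: block number -> bytes
        self.overlay = {}              # blocks written since the base image was created
        self.snapshots = {}            # snapshot name -> frozen copy of the overlay

    def read(self, block):
        # Prefer the overlay; fall back to the base image, then to a zero block.
        if block in self.overlay:
            return self.overlay[block]
        return self.base.get(block, b"\x00")

    def write(self, block, data):
        self.overlay[block] = data

    def snapshot(self, name):
        # Freeze the current overlay so this state can be restored later.
        self.snapshots[name] = dict(self.overlay)

    def restore(self, name):
        self.overlay = dict(self.snapshots[name])


# Boot once, snapshot at the start of the benchmark, then restore for each run.
disk = CowDisk({0: b"boot"})
disk.write(1, b"benchmark-start")
disk.snapshot("roi_entry")
disk.write(1, b"benchmark-done")   # state changes during a run
disk.restore("roi_entry")          # next run starts from the checkpoint again
```

Restoring the snapshot brings back the state at the start of the benchmark without repeating the boot, which is the property the slide relies on.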
8 QEMU: Port of Choice. 100% open source: many simulators are based on closed-source platforms, like GEMS and FeS2, which are based on Simics. Active community with great support.
9 MARSSx86: An Overview. [Diagram: a simulated software stack (user-space applications, shared libraries, services, operating system) running on the Marss framework, which comprises the CPU model (emulation and simulation), memory management (DRAM, IO memory), disk, IO devices, and the user interface.]
10 MARSSx86 Simulator Model. CPU core: front end, dispatch, out-of-order issue, re-order buffer, function unit clusters, register file. Various modules in the CPU core simulate detailed pipeline logic. Cache/memory: private caches, shared caches, coherent caches, interconnect, DRAM controller. Cache/memory modules provide highly configurable memory hierarchy designs.
11 MARSSx86 Execution Control Flow. The CPU context is shared between the emulated and simulated models.
12 MARSSx86 Key Features: True Full System. Not only kernel-space simulation but also real-time IO simulation. Supports running unmodified OSes. Simulates real multi-threaded workloads. Real-time IO simulation allows quick modeling of new IO devices and their simulation.
13 MARSSx86 Key Features. Hardware-software co-design: simulates the full stack of software; communication between software and simulated hardware; no special requirement to build software for MARSS. Co-simulation: emulation and simulation models in one framework; seamless switch between the two models; fast-forward to interesting regions of benchmarks.
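The co-simulation idea (one CPU context handed back and forth between a fast functional model and a slower timing model) can be sketched as follows. This is a toy illustration under stated assumptions, not MARSS code: `CpuContext`, `emulate`, `simulate`, and the fixed cost of 2 cycles per instruction are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CpuContext:
    """Architectural state shared by both execution models."""
    pc: int = 0                      # toy program counter: advances by 1 per instruction
    regs: dict = field(default_factory=dict)
    instructions: int = 0            # instructions executed by either model
    cycles: int = 0                  # advanced only by the timing model


def emulate(ctx, n):
    """Fast functional execution: updates architectural state, models no timing."""
    for _ in range(n):
        ctx.pc += 1
        ctx.instructions += 1


def simulate(ctx, n, cycles_per_insn=2):
    """Timing execution: same architectural effect, plus a (made-up) cycle cost."""
    for _ in range(n):
        ctx.pc += 1
        ctx.instructions += 1
        ctx.cycles += cycles_per_insn


ctx = CpuContext()
emulate(ctx, 1000)   # fast-forward to the region of interest
simulate(ctx, 100)   # measure the region with the timing model
emulate(ctx, 50)     # switch back; the same context carries on
```

Because both functions operate on the same `ctx` object, switching models needs no state transfer, which is what makes the switch seamless in this sketch.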
14 Co-Design & Co-Simulation. [Diagram: the software stack (user-space applications/benchmarks, shared libraries, services, operating system) connects through MMIO to the simulated hardware: CPU model (emulator, simulator), DRAM, and IO devices (disk, USB, PCI, etc.). Device models are shared between the CPU models. Cores C0 to C3, each with an MLC, connect through an interconnect to shared cache / DRAM.]
15 MARSSx86 Key Features: Heterogeneous Core Modeling. Performance models for: an aggressive out-of-order design with RISC substrate; in-order cores like Atom; a multi-threaded core design for both out-of-order and in-order models. Simulate a mix-n-match of different types of core models.
16 Heterogeneous Core Modeling. Highly configurable cores and memory models. [Diagrams: multicore configuration; MT configuration; MT multicore configuration; hybrid core configuration 1 (Oo and InC cores); hybrid core configuration 2 (MT and InC cores). In each, cores with private caches connect through an interconnect to shared cache / DRAM.] Legend: Oo = out-of-order core; MT = multi-threaded core; InC = in-order core; PC = private cache.
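A mix-and-match machine description like the hybrid configurations in the diagrams could be expressed along these lines. This is a hypothetical sketch, not the MARSS configuration format: the `Core` class, `build_machine`, and the 256 KB private cache size are assumptions; only the core-type abbreviations come from the slide legend.

```python
from collections import Counter
from dataclasses import dataclass

# Abbreviations taken from the slide legend.
CORE_TYPES = {"Oo": "out-of-order", "InC": "in-order", "MT": "multi-threaded"}


@dataclass(frozen=True)
class Core:
    kind: str                     # one of the CORE_TYPES keys
    private_cache_kb: int = 256   # assumed size, for illustration only


def build_machine(kinds):
    """Return one Core per entry, rejecting core types the legend does not define."""
    unknown = [k for k in kinds if k not in CORE_TYPES]
    if unknown:
        raise ValueError(f"unknown core types: {unknown}")
    return [Core(kind=k) for k in kinds]


# Hybrid core configuration 1 from the slide: Oo and InC cores side by side.
hybrid1 = build_machine(["Oo", "InC", "Oo", "InC"])
summary = Counter(core.kind for core in hybrid1)
```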
17 More Features. Easily integrate external modules: DRAMSim2; SystemC; Phase Change RAM (PCRAM). New statistics framework enables separate collection of statistics from different regions: collects separate statistics for kernel and user space; separate statistics collection for a user-specified ROI.
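Region-separated statistics can be sketched as a counter table keyed by region. This is an illustrative toy, not the MARSS statistics framework; `RegionStats`, `roi_begin`, `roi_end`, and the event name `commits` are hypothetical.

```python
from collections import defaultdict


class RegionStats:
    """Keeps separate counters for kernel space, user space, and a region of interest."""

    def __init__(self):
        self.counters = defaultdict(lambda: defaultdict(int))
        self.in_roi = False

    def roi_begin(self):
        self.in_roi = True

    def roi_end(self):
        self.in_roi = False

    def record(self, mode, event, count=1):
        if mode not in ("kernel", "user"):
            raise ValueError(f"unknown mode: {mode}")
        self.counters[mode][event] += count
        # Events inside the ROI are additionally counted in the "roi" region.
        if self.in_roi:
            self.counters["roi"][event] += count


stats = RegionStats()
stats.record("kernel", "commits", 500)   # boot and setup, outside the ROI
stats.roi_begin()
stats.record("user", "commits", 300)
stats.record("kernel", "commits", 20)    # a system call inside the ROI
stats.roi_end()
stats.record("user", "commits", 100)
```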
18 Simulator Performance. One of the most important requirements. Marss runs cycle-accurate simulation in the range of 200 to 400 KIPS. Fast simulation allows users to test wide ranges of benchmark behavior in one simulation run.
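To put the KIPS figures in perspective, a quick calculation converts simulation speed into wall-clock time. The one-billion-instruction workload is an assumed example size, not a number from the slides.

```python
def simulation_seconds(instructions, kips):
    """Wall-clock seconds to simulate `instructions` at `kips` thousand instructions/second."""
    return instructions / (kips * 1_000)


workload = 1_000_000_000                    # assumed workload: one billion instructions
slow = simulation_seconds(workload, 200)    # lower end of the quoted range
fast = simulation_seconds(workload, 400)    # upper end of the quoted range
print(f"{slow:.0f} s at 200 KIPS, {fast:.0f} s at 400 KIPS")
```

At these speeds a billion instructions take roughly 42 to 83 minutes, which is why booting an OS before every benchmark run is impractical and checkpoints matter.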
19 Simulator Performance. [Chart: instructions committed per second, in thousands (KIPS), for SPECInt2006 benchmarks running with test input (full application run): sjeng, omnetpp, mcf, xalanc, gcc, astar, gobmk, libquantum, hmm, perl, bzip, and the average. Error bars show maximum and minimum speed in KIPS.] Native system configuration: quad-core Intel Xeon 2.67 GHz (Nehalem) with 8 GB RAM.
20 A Case Study: Benchmark Regions. [Plots: IPC per 100 million cycles versus cycles (in billions), for astar and bzip2.]
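A per-window IPC series like the one plotted here can be derived from cumulative counters sampled at fixed cycle intervals. The sketch below assumes such samples are available; the function name `windowed_ipc` and the sample values are invented for illustration and are not data from the slides.

```python
def windowed_ipc(samples):
    """IPC for each interval between consecutive (cumulative_cycles, cumulative_instructions) samples."""
    ipc = []
    for (c0, i0), (c1, i1) in zip(samples, samples[1:]):
        ipc.append((i1 - i0) / (c1 - c0))
    return ipc


WINDOW = 100_000_000  # 100 million cycles, the window size used in the plots

# Made-up cumulative samples taken at the end of each window.
samples = [
    (0, 0),
    (1 * WINDOW, 80_000_000),
    (2 * WINDOW, 200_000_000),
    (3 * WINDOW, 250_000_000),
]
ipcs = windowed_ipc(samples)
```

Plotting `ipcs` against the window end cycles gives the kind of phase-behavior curve shown on the slide, where changes in IPC mark different benchmark regions.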
21 Future Work. ARM port? Don't have enough expertise; maybe in the next 6-12 months. More cache coherence models. More CPU models.
22 Q & A. Grab a copy to hack from: Open source under the GPL v2 license. Send your comments/questions to apatel@cs.binghamton.edu
23 Backup
24 Technical Details. Functional model: QEMU. JIT code generator for emulation; more than 100 IO device models; supports multiple ISAs.
25 Technical Details. Performance model: based on PTLsim (older x86 simulator from the CAPS group). Components used from PTLsim: decoder; core components of the out-of-order datapath; SuperSTL and Logic libraries. Fast models for coherent caches and the memory system. Several added optimizations for performance, correctness, and flexibility.
26 Technical Details: Benchmark Regions. [Plots: mcf, IPC per 100 million cycles versus cycles (in billions); gcc, IPC per 10 million cycles versus cycles (in millions).]
27 Simulator Performance. [Chart: IPC of Marss versus Nehalem for SPECInt2006 benchmarks running with test input (full application run): sjeng, omnetpp, mcf, xalanc, gcc, astar, gobmk, quantum, hmm, perl, bzip, and the average.] The default OoO core model is configured with Nehalem parameters; it is not an exact model.
28 [Diagram: a simulated software stack (user-space applications, shared libraries, services, operating system) running on the framework, which comprises the CPU model (emulation and simulation), memory management (DRAM, IO memory), disk, IO devices, and the user interface.]
29 [Diagrams: two groups of four Oo cores, each core with a private cache (PC), connected through an interconnect to shared cache / DRAM; cores C0 to C3, each with an MLC, connected through an interconnect to shared cache & DRAM.]
30 [Diagram: the software (Sw) stack (user-space applications/benchmarks, shared libraries, services, operating system) connects through MMIO to the hardware (Hw): CPU model (emulated, simulated), DRAM, and IO devices (disk, USB, PCI, etc.).]
31 Key Features: True Full System; Co-Design & Co-Simulation; Heterogeneous Models; Simulator Performance.
32 Modules in the Framework. CPU emulation: JIT, soft MMU, exceptions, interrupts. CPU simulation: out-of-order pipeline, in-order pipeline, multicore, heterogeneous, coherent caches, on-chip interconnect. Memory management unit: guest-to-host mapping, DRAM, DMA, IO memory, page-fault handling. IO emulation & BIOS support: VGA, NIC, USB, etc.; IO APIC; local APIC. Guest disk image management: raw, copy-on-write (QCOW), snapshots (checkpoints). User interface: monitor, full graphic support, VNC, serial port.
33 True Full System Simulation. Not only kernel-space simulation but also real-time IO simulation. Supports running unmodified OSes. Simulates real multi-threaded workloads. Real-time IO simulation allows quick modeling of new IO devices and their simulation.
MARSSx86: A Full System Simulator for x86 CPUs
MARSSx86: A Full System Simulator for x86 CPUs Avadh Patel, Furat Afram, Shunfei Chen, and Kanad Ghose Dept. of Computer Science, State University of New York at Binghamton {apatel, fafram1, schen, ghose}@cs.binghamton.edu
More informationA Fast Instruction Set Simulator for RISC-V
A Fast Instruction Set Simulator for RISC-V Maxim.Maslov@esperantotech.com Vadim.Gimpelson@esperantotech.com Nikita.Voronov@esperantotech.com Dave.Ditzel@esperantotech.com Esperanto Technologies, Inc.
More informationHow to use the BigDataBench simulator versions
How to use the BigDataBench simulator versions Zhen Jia Institute of Computing Technology, Chinese Academy of Sciences BigDataBench Tutorial MICRO 2014 Cambridge, UK INSTITUTE OF COMPUTING TECHNOLOGY Objec8ves
More informationZSIM: FAST AND ACCURATE MICROARCHITECTURAL SIMULATION OF THOUSAND-CORE SYSTEMS
ZSIM: FAST AND ACCURATE MICROARCHITECTURAL SIMULATION OF THOUSAND-CORE SYSTEMS DANIEL SANCHEZ MIT CHRISTOS KOZYRAKIS STANFORD ISCA-40 JUNE 27, 2013 Introduction 2 Current detailed simulators are slow (~200
More informationZSIM: FAST AND ACCURATE MICROARCHITECTURAL SIMULATION OF THOUSAND-CORE SYSTEMS
ZSIM: FAST AND ACCURATE MICROARCHITECTURAL SIMULATION OF THOUSAND-CORE SYSTEMS DANIEL SANCHEZ MIT CHRISTOS KOZYRAKIS STANFORD ISCA-40 JUNE 27, 2013 Introduction 2 Current detailed simulators are slow (~200
More informationMicroarchitecture Overview. Performance
Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 15, 2007 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make
More informationIntroduction to gem5. Nizamudheen Ahmed Texas Instruments
Introduction to gem5 Nizamudheen Ahmed Texas Instruments 1 Introduction A full-system computer architecture simulator Open source tool focused on architectural modeling BSD license Encompasses system-level
More informationEvaluation of RISC-V RTL with FPGA-Accelerated Simulation
Evaluation of RISC-V RTL with FPGA-Accelerated Simulation Donggyu Kim, Christopher Celio, David Biancolin, Jonathan Bachrach, Krste Asanovic CARRV 2017 10/14/2017 Evaluation Methodologies For Computer
More informationProtoFlex: FPGA-Accelerated Hybrid Simulator
ProtoFlex: FPGA-Accelerated Hybrid Simulator Eric S. Chung, Eriko Nurvitadhi James C. Hoe, Babak Falsafi, Ken Mai Computer Architecture Lab at Multiprocessor Simulation Simulating one processor in software
More informationOpenPrefetch. (in-progress)
OpenPrefetch Let There Be Industry-Competitive Prefetching in RISC-V Processors (in-progress) Bowen Huang, Zihao Yu, Zhigang Liu, Chuanqi Zhang, Sa Wang, Yungang Bao Institute of Computing Technology(ICT),
More informationCS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines
CS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines Sreepathi Pai UTCS September 14, 2015 Outline 1 Introduction 2 Out-of-order Scheduling 3 The Intel Haswell
More informationEmerging NVM Memory Technologies
Emerging NVM Memory Technologies Yuan Xie Associate Professor The Pennsylvania State University Department of Computer Science & Engineering www.cse.psu.edu/~yuanxie yuanxie@cse.psu.edu Position Statement
More informationProtoFlex Tutorial: Full-System MP Simulations Using FPGAs
rotoflex Tutorial: Full-System M Simulations Using FGAs Eric S. Chung, Michael apamichael, Eriko Nurvitadhi, James C. Hoe, Babak Falsafi, Ken Mai ROTOFLEX Computer Architecture Lab at Our work in this
More informationEnergy-centric DVFS Controlling Method for Multi-core Platforms
Energy-centric DVFS Controlling Method for Multi-core Platforms Shin-gyu Kim, Chanho Choi, Hyeonsang Eom, Heon Y. Yeom Seoul National University, Korea MuCoCoS 2012 Salt Lake City, Utah Abstract Goal To
More informationHybrid Cache Architecture (HCA) with Disparate Memory Technologies
Hybrid Cache Architecture (HCA) with Disparate Memory Technologies Xiaoxia Wu, Jian Li, Lixin Zhang, Evan Speight, Ram Rajamony, Yuan Xie Pennsylvania State University IBM Austin Research Laboratory Acknowledgement:
More informationResource-Conscious Scheduling for Energy Efficiency on Multicore Processors
Resource-Conscious Scheduling for Energy Efficiency on Andreas Merkel, Jan Stoess, Frank Bellosa System Architecture Group KIT The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe
More informationHow Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC
How Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC Three Consortia Formed in Oct 2016 Gen-Z Open CAPI CCIX complex to rack scale memory fabric Cache coherent accelerator
More informationChapter 5 B. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 B Large and Fast: Exploiting Memory Hierarchy Dependability 5.5 Dependable Memory Hierarchy Chapter 6 Storage and Other I/O Topics 2 Dependability Service accomplishment Service delivered as
More informationComprehensive Kernel Instrumentation via Dynamic Binary Translation
Comprehensive Kernel Instrumentation via Dynamic Binary Translation Peter Feiner Angela Demke Brown Ashvin Goel University of Toronto 011 Complexity of Operating Systems 012 Complexity of Operating Systems
More informationProtoFlex: FPGA Accelerated Full System MP Simulation
ProtoFlex: FPGA Accelerated Full System MP Simulation Eric S. Chung, Eriko Nurvitadhi, James C. Hoe, Babak Falsafi, Ken Mai Computer Architecture Lab at Our work in this area has been supported in part
More informationNested Virtualization and Server Consolidation
Nested Virtualization and Server Consolidation Vara Varavithya Department of Electrical Engineering, KMUTNB varavithya@gmail.com 1 Outline Virtualization & Background Nested Virtualization Hybrid-Nested
More informationCycle-Accurate HSA Simulator
Cycle-Accurate HSA Simulator Chien-Chih Chen Advisor: Tien-Fu Chen SoC & ESW Lab 國立交通大學 National Chiao Tung University 2015/5/18 Outline Why do we need a fast cycle-accurate HSA simulator Basics of SW
More informationVirtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University
Virtual Memory Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Precise Definition of Virtual Memory Virtual memory is a mechanism for translating logical
More informationChip-Multithreading Systems Need A New Operating Systems Scheduler
Chip-Multithreading Systems Need A New Operating Systems Scheduler Alexandra Fedorova Christopher Small Daniel Nussbaum Margo Seltzer Harvard University, Sun Microsystems Sun Microsystems Sun Microsystems
More informationVirtually Impossible
Virtually Impossible The Reality of Virtualization Security Gal Diskin / Chief Research Officer / Cyvera LTD. /WhoAmI? Chief Research Officer @ Cvyera LTD Formerly Security Evaluation Architect of the
More informationSecurity-Aware Processor Architecture Design. CS 6501 Fall 2018 Ashish Venkat
Security-Aware Processor Architecture Design CS 6501 Fall 2018 Ashish Venkat Agenda Common Processor Performance Metrics Identifying and Analyzing Bottlenecks Benchmarking and Workload Selection Performance
More informationMemory Mapped ECC Low-Cost Error Protection for Last Level Caches. Doe Hyun Yoon Mattan Erez
Memory Mapped ECC Low-Cost Error Protection for Last Level Caches Doe Hyun Yoon Mattan Erez 1-Slide Summary Reliability issues in caches Increasing soft error rate (SER) Cost increases with error protection
More informationLightweight Memory Tracing
Lightweight Memory Tracing Mathias Payer*, Enrico Kravina, Thomas Gross Department of Computer Science ETH Zürich, Switzerland * now at UC Berkeley Memory Tracing via Memlets Execute code (memlets) for
More informationComputer Architecture. Fall Dongkun Shin, SKKU
Computer Architecture Fall 2018 1 Syllabus Instructors: Dongkun Shin Office : Room 85470 E-mail : dongkun@skku.edu Office Hours: Wed. 15:00-17:30 or by appointment Lecture notes nyx.skku.ac.kr Courses
More informationOptimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs
Optimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs Niu Feng Technical Specialist, ARM Tech Symposia 2016 Agenda Introduction Challenges: Optimizing cache coherent subsystem
More informationFull-System Timing-First Simulation
Full-System Timing-First Simulation Carl J. Mauer Mark D. Hill and David A. Wood Computer Sciences Department University of Wisconsin Madison The Problem Design of future computer systems uses simulation
More informationFrom Application to Technology OpenCL Application Processors Chung-Ho Chen
From Application to Technology OpenCL Application Processors Chung-Ho Chen Computer Architecture and System Laboratory (CASLab) Department of Electrical Engineering and Institute of Computer and Communication
More informationCSC501 Operating Systems Principles. OS Structure
CSC501 Operating Systems Principles OS Structure 1 Announcements q TA s office hour has changed Q Thursday 1:30pm 3:00pm, MRC-409C Q Or email: awang@ncsu.edu q From department: No audit allowed 2 Last
More informationCOMPUTER ARCHITECTURE. Virtualization and Memory Hierarchy
COMPUTER ARCHITECTURE Virtualization and Memory Hierarchy 2 Contents Virtual memory. Policies and strategies. Page tables. Virtual machines. Requirements of virtual machines and ISA support. Virtual machines:
More informationVirtualization Overview NSRC
Virtualization Overview NSRC Terminology Virtualization: dividing available resources into smaller independent units Emulation: using software to simulate hardware which you do not have The two often come
More informationDesigning Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen
Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services Presented by: Jitong Chen Outline Architecture of Web-based Data Center Three-Stage framework to benefit
More informationA Heterogeneous Multiple Network-On-Chip Design: An Application-Aware Approach
A Heterogeneous Multiple Network-On-Chip Design: An Application-Aware Approach Asit K. Mishra Onur Mutlu Chita R. Das Executive summary Problem: Current day NoC designs are agnostic to application requirements
More informationQEMU for Xilinx ZynqMP. V Aug-20
QEMU for Xilinx ZynqMP Edgar E. Iglesias V2 2015-Aug-20 ZynqMP SoC New Chip (Zynq NG) Aggressive target for QEMU as early SW platform emulating WiP chip BootROMs, Boot-loaders,
More informationRAMP-White / FAST-MP
RAMP-White / FAST-MP Hari Angepat and Derek Chiou Electrical and Computer Engineering University of Texas at Austin Supported in part by DOE, NSF, SRC,Bluespec, Intel, Xilinx, IBM, and Freescale RAMP-White
More informationCS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives
CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives Virtual Machines Resource Virtualization Separating the abstract view of computing resources from the implementation of these resources
More informationPower Control in Virtualized Data Centers
Power Control in Virtualized Data Centers Jie Liu Microsoft Research liuj@microsoft.com Joint work with Aman Kansal and Suman Nath (MSR) Interns: Arka Bhattacharya, Harold Lim, Sriram Govindan, Alan Raytman
More informationChapter 1. and Technology
Chapter 1 Computer Abstractions Computer Abstractions and Technology The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications feasible Computers in automobiles
More informationEvaluating STT-RAM as an Energy-Efficient Main Memory Alternative
Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Emre Kültürsay *, Mahmut Kandemir *, Anand Sivasubramaniam *, and Onur Mutlu * Pennsylvania State University Carnegie Mellon University
More informationIntroduction to Microprocessor
Introduction to Microprocessor Slide 1 Microprocessor A microprocessor is a multipurpose, programmable, clock-driven, register-based electronic device That reads binary instructions from a storage device
More informationPerformance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews
Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models Jason Andrews Agenda System Performance Analysis IP Configuration System Creation Methodology: Create,
More informationThis Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?
This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital
More informationThe DragonBeam Framework: Hardware-Protected Security Modules for In-Place Intrusion Detection
: Hardware-Protected Security Modules for In-Place Intrusion Detection Man-Ki Yoon, Mihai Christodorescu, Lui Sha, Sibin Mohan University of Illinois at Urbana-Champaign Qualcomm Research Silicon Valley
More informationDifference Engine: Harnessing Memory Redundancy in Virtual Machines (D. Gupta et all) Presented by: Konrad Go uchowski
Difference Engine: Harnessing Memory Redundancy in Virtual Machines (D. Gupta et all) Presented by: Konrad Go uchowski What is Virtual machine monitor (VMM)? Guest OS Guest OS Guest OS Virtual machine
More informationScheduling the Intel Core i7
Third Year Project Report University of Manchester SCHOOL OF COMPUTER SCIENCE Scheduling the Intel Core i7 Ibrahim Alsuheabani Degree Programme: BSc Software Engineering Supervisor: Prof. Alasdair Rawsthorne
More informationEmbedded Systems: Architecture
Embedded Systems: Architecture Jinkyu Jeong (Jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong (jinkyu@skku.edu)
More informationCSE 120 Principles of Operating Systems
CSE 120 Principles of Operating Systems Spring 2018 Lecture 16: Virtual Machine Monitors Geoffrey M. Voelker Virtual Machine Monitors 2 Virtual Machine Monitors Virtual Machine Monitors (VMMs) are a hot
More informationThis Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources
This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital
More informationMission-Critical Enterprise Linux. April 17, 2006
Mission-Critical Enterprise Linux April 17, 2006 Agenda Welcome Who we are & what we do Steve Meyers, Director Unisys Linux Systems Group (steven.meyers@unisys.com) Technical Presentations Xen Virtualization
More informationKeyStone II. CorePac Overview
KeyStone II ARM Cortex A15 CorePac Overview ARM A15 CorePac in KeyStone II Standard ARM Cortex A15 MPCore processor Cortex A15 MPCore version r2p2 Quad core, dual core, and single core variants 4096kB
More informationComputer Architecture and OS. EECS678 Lecture 2
Computer Architecture and OS EECS678 Lecture 2 1 Recap What is an OS? An intermediary between users and hardware A program that is always running A resource manager Manage resources efficiently and fairly
More informationBalancing DRAM Locality and Parallelism in Shared Memory CMP Systems
Balancing DRAM Locality and Parallelism in Shared Memory CMP Systems Min Kyu Jeong, Doe Hyun Yoon^, Dam Sunwoo*, Michael Sullivan, Ikhwan Lee, and Mattan Erez The University of Texas at Austin Hewlett-Packard
More informationMicro VMMs and Nested Virtualization
Micro VMMs and Nested Virtualization For the TCE 4th summer school on computer security, big data and innovation Baruch Chaikin, Intel 9 September 2015 Agenda Virtualization Basics The Micro VMM Nested
More informationI/O and virtualization
I/O and virtualization CSE-C3200 Operating systems Autumn 2015 (I), Lecture 8 Vesa Hirvisalo Today I/O management Control of I/O Data transfers, DMA (Direct Memory Access) Buffering Single buffering Double
More informationDataflow: The Road Less Complex
Dataflow: The Road Less Complex Steven Swanson Ken Michelson Andrew Schwerin Mark Oskin University of Washington Sponsored by NSF and Intel Things to keep you up at night (~2016) Opportunities 8 billion
More informationParallel Simulation Accelerates Embedded Software Development, Debug and Test
Parallel Simulation Accelerates Embedded Software Development, Debug and Test Larry Lapides Imperas Software Ltd. larryl@imperas.com Page 1 Modern SoCs Have Many Concurrent Processing Elements SMP cores
More informationDatacenter application interference
1 Datacenter application interference CMPs (popular in datacenters) offer increased throughput and reduced power consumption They also increase resource sharing between applications, which can result in
More informationComputer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic
More informationISA-L Performance Report Release Test Date: Sept 29 th 2017
Test Date: Sept 29 th 2017 Revision History Date Revision Comment Sept 29 th, 2017 1.0 Initial document for release 2 Contents Audience and Purpose... 4 Test setup:... 4 Intel Xeon Platinum 8180 Processor
More informationOverview of System Virtualization: The most powerful platform for program analysis and system security. Zhiqiang Lin
CS 6V81-05: System Security and Malicious Code Analysis Overview of System Virtualization: The most powerful platform for program analysis and system security Zhiqiang Lin Department of Computer Science
More informationLecture 1: Course Introduction and Overview Prof. Randy H. Katz Computer Science 252 Spring 1996
Lecture 1: Course Introduction and Overview Prof. Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Computer Architecture Is the attributes of a [computing] system as seen by the programmer, i.e.,
More informationBERKELEY PAR LAB. RAMP Gold Wrap. Krste Asanovic. RAMP Wrap Stanford, CA August 25, 2010
RAMP Gold Wrap Krste Asanovic RAMP Wrap Stanford, CA August 25, 2010 RAMP Gold Team Graduate Students Zhangxi Tan Andrew Waterman Rimas Avizienis Yunsup Lee Henry Cook Sarah Bird Faculty Krste Asanovic
More informationProfiling and Debugging OpenCL Applications with ARM Development Tools. October 2014
Profiling and Debugging OpenCL Applications with ARM Development Tools October 2014 1 Agenda 1. Introduction to GPU Compute 2. ARM Development Solutions 3. Mali GPU Architecture 4. Using ARM DS-5 Streamline
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationMulticore and MIPS: Creating the next generation of SoCs. Jim Whittaker EVP MIPS Business Unit
Multicore and MIPS: Creating the next generation of SoCs Jim Whittaker EVP MIPS Business Unit www.imgtec.com Many new opportunities Wearables Home wireless for everything Automation & Robotics ADAS and
More informationFirst QEMU Users Forum
Cooperative Computing & Communication Laboratory First QEMU Users Forum Alpexpo Grenoble, March 18 th 2011 Frédéric Pétrot & Wolfgang Mueller What is QEMU? Open source library for hardware emulation and
More informationFlexible Cache Error Protection using an ECC FIFO
Flexible Cache Error Protection using an ECC FIFO Doe Hyun Yoon and Mattan Erez Dept Electrical and Computer Engineering The University of Texas at Austin 1 ECC FIFO Goal: to reduce on-chip ECC overhead
More informationHY225 Lecture 12: DRAM and Virtual Memory
HY225 Lecture 12: DRAM and irtual Memory Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS May 16, 2011 Dimitrios S. Nikolopoulos Lecture 12: DRAM and irtual Memory 1 / 36 DRAM Fundamentals Random-access
More informationChapter 5 (Part II) Large and Fast: Exploiting Memory Hierarchy. Baback Izadi Division of Engineering Programs
Chapter 5 (Part II) Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Virtual Machines Host computer emulates guest operating system and machine resources Improved isolation of multiple
More informationImpact of Cache Coherence Protocols on the Processing of Network Traffic
Impact of Cache Coherence Protocols on the Processing of Network Traffic Amit Kumar and Ram Huggahalli Communication Technology Lab Corporate Technology Group Intel Corporation 12/3/2007 Outline Background
More informationThis Unit: Putting It All Together. CIS 501 Computer Architecture. What is Computer Architecture? Sources
This Unit: Putting It All Together CIS 501 Computer Architecture Unit 12: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital Circuits
More informationTDT4255 Computer Design. Lecture 1. Magnus Jahre
1 TDT4255 Computer Design Lecture 1 Magnus Jahre 2 Outline Practical course information Chapter 1: Computer Abstractions and Technology 3 Practical Course Information 4 TDT4255 Computer Design TDT4255
More informationMicroarchitecture Overview. Performance
Microarchitecture Overview Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu January 18, 2005 Performance 4 Make operations faster Process improvements Circuit improvements Use more transistors to make
More informationSimulating Multi-Core RISC-V Systems in gem5
Simulating Multi-Core RISC-V Systems in gem5 Tuan Ta, Lin Cheng, and Christopher Batten School of Electrical and Computer Engineering Cornell University 2nd Workshop on Computer Architecture Research with
More informationArchitecture Exploration of High-Performance PCs with a Solid-State Disk
Architecture Exploration of High-Performance PCs with a Solid-State Disk D. Kim, K. Bang, E.-Y. Chung School of EE, Yonsei University S. Yoon School of EE, Korea University April 21, 2010 1/53 Outline
More informationOperating System Support for Shared-ISA Asymmetric Multi-core Architectures
Operating System Support for Shared-ISA Asymmetric Multi-core Architectures Tong Li, Paul Brett, Barbara Hohlt, Rob Knauerhase, Sean McElderry, Scott Hahn Intel Corporation Contact: tong.n.li@intel.com
More informationOverview ESESC Tutorial
ESESC Tutorial Speaker: Department of Computer Engineering, University of California, Santa Cruz http://masc.soe.ucsc.edu Tutorial Logistics 08:00-08:30: Breakfast 08:30-09:00: 09:00-09:30: Building and
Industry Collaboration and Innovation
Industry Collaboration and Innovation OpenCAPI Topics Industry Background Technology Overview Design Enablement OpenCAPI Consortium Industry Landscape Key changes occurring in our industry Historical microprocessor
Xen. past, present and future. Stefano Stabellini
Xen past, present and future Stefano Stabellini Xen architecture: PV domains Xen arch: driver domains Xen: advantages - small surface of attack - isolation - resilience - specialized algorithms (scheduler)
ECE 331 Hardware Organization and Design. UMass ECE Discussion 11 4/12/2018
ECE 331 Hardware Organization and Design UMass ECE Discussion 11 4/12/2018 Today's Discussion Topics Hamming Codes For error detection and correction Virtual Machines Virtual Memory The Hamming SEC Code
Virtual Machine Monitors (VMMs) are a hot topic in
CSE 120 Principles of Operating Systems Winter 2007 Lecture 16: Virtual Machine Monitors Keith Marzullo and Geoffrey M. Voelker Virtual Machine Monitors Virtual Machine Monitors (VMMs) are a hot topic
Modeling and Simulation of System-on-Chip Platforms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano
Modeling and Simulation of System-on-Chip Platforms Donatella Sciuto 10/01/2007 Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci 32, 20131, Milano Key SoC Market
A Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan
LegoOS A Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang Y 4 1 2 Monolithic Server OS / Hypervisor 3 Problems? 4 cpu mem Resource
ARM big.LITTLE Technology Unleashed An Improved User Experience Delivered
ARM big.LITTLE Technology Unleashed An Improved User Experience Delivered Govind Wathan Product Specialist Cortex-A Mobile & Consumer CPU Products 1 Agenda Introduction to big.LITTLE Technology Benefits
Chapter 1. Computer Abstractions and Technology. Adapted by Paulo Lopes, IST
Chapter 1 Computer Abstractions and Technology Adapted by Paulo Lopes, IST The Computer Revolution Progress in computer technology Sustained by Moore's Law Makes novel and old applications feasible Computers
How much energy can you save with a multicore computer for web applications?
How much energy can you save with a multicore computer for web applications? Peter Strazdins Computer Systems Group, Department of Computer Science, The Australian National University seminar at Green
Native Simulation of Complex VLIW Instruction Sets Using Static Binary Translation and Hardware-Assisted Virtualization
Native Simulation of Complex VLIW Instruction Sets Using Static Binary Translation and Hardware-Assisted Virtualization Mian-Muhammad Hamayun, Frédéric Pétrot and Nicolas Fournel System Level Synthesis
LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY
LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal
Virtualization. Virtualization
Virtualization Virtualization Memory virtualization Process feels like it has its own address space Created by MMU, configured by OS Storage virtualization Logical view of disks connected to a machine
UCB CS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 36 Performance 2010-04-23 Lecturer SOE Dan Garcia How fast is your computer? Every 6 months (Nov/June), the fastest supercomputers in
Last class: Today: Course administration OS definition, some history. Background on Computer Architecture
1 Last class: Course administration OS definition, some history Today: Background on Computer Architecture 2 Canonical System Hardware CPU: Processor to perform computations Memory: Programs and data I/O
Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console
Computer Architecture Unit 11: Putting it All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Milo Martin & Amir Roth at University of Pennsylvania! Computer Architecture
The Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1
Virtual Memory Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. L16-1 Reminder: Operating Systems Goals of OS: Protection and privacy: Processes cannot access each other's data Abstraction:
Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD
Leveraging OpenSPARC ESA Round Table 2006 on Next Generation Microprocessors for Space Applications G.Furano, L.Messina TEC- OpenSPARC T1 The T1 is a new-from-the-ground-up SPARC microprocessor implementation
A Global Operating System for HPC Clusters
A Global Operating System Emiliano Betti 1 Marco Cesati 1 Roberto Gioiosa 2 Francesco Piermaria 1 1 System Programming Research Group, University of Rome Tor Vergata 2 BlueGene Software Division, IBM TJ