AND THEN THERE WERE NONE
Transcription
1 AND THEN THERE WERE NONE A Stall-Free Real-Time Garbage Collector for Reconfigurable Hardware David F. Bacon, Perry Cheng, Sunil Shukla, IBM Research
2 IMPLEMENTING A PROGRAMMING LANGUAGE Program Source Code Interpreter Instruction Set Processor Circuit
3 IMPLEMENTING A PROGRAMMING LANGUAGE Program Program Compiler Compiler Source Code Interpreter Machine Code Instruction Set Processor Instruction Set Interpreter Circuit Circuit
4 IMPLEMENTING A PROGRAMMING LANGUAGE Program Program Program Compiler Compiler Hardware Compiler Source Code Interpreter Machine Code Instruction Set Processor Instruction Set Interpreter Circuit Circuit Circuit Layout Circuit
5 PROGRAMMING RECONFIGURABLE HARDWARE (FPGAS) Programmed at very low level of abstraction same as designing custom circuits (ASICs) Verilog, VHDL prevail: bits and bit arrays are main abstraction
6 PROGRAMMING RECONFIGURABLE HARDWARE (FPGAS) Programmed at very low level of abstraction same as designing custom circuits (ASICs) Verilog, VHDL prevail: bits and bit arrays are main abstraction HIGH LEVEL LANGUAGE
7 PROGRAMMING RECONFIGURABLE HARDWARE (FPGAS) Programmed at very low level of abstraction same as designing custom circuits (ASICs) Verilog, VHDL prevail: bits and bit arrays are main abstraction HIGH LEVEL LANGUAGE GARBAGE COLLECTION
8 SYSTEM = APPLICATION + COLLECTOR HAND-WRITTEN HDL COLLECTOR & MEMORY
9 RECONFIGURABLE HARDWARE BACKGROUND
10 CONFIGURABLE LOGIC UP TO 300K SLICES = 2.4M FLIP-FLOPS
11 PROGRAMMABLE ROUTING NETWORK SOURCE: WIKIMEDIA (CC) 2007
12 BLOCK-RAM MEMORIES (BRAMS) (diagram: one memory port with R/W, Address, Data In, Data Out)
13 BLOCK-RAM MEMORIES (BRAMS) (diagram: two ports, each with R/W, Address, Data In, Data Out)
14 BLOCK-RAM MEMORIES (BRAMS) (diagram: four ports, each with R/W, Address, Data In, Data Out) 36 KBIT 36K X 1 18K X 2 1K X RAM OR FIFO
15 BLOCK-RAM MEMORIES (BRAMS) (diagram: four ports, each with R/W, Address, Data In, Data Out)
16 (diagram: six ports, each with R/W, Address, Data In, Data Out)
17 (diagram: four ports, each with R/W, Address, Data In, Data Out)
18 WHAT WE BUILT
19 COLLECTOR IN HARDWARE FOR HARDWARE
20 COLLECTOR IN HARDWARE FOR HARDWARE Complete garbage collector NOT hardware-assist instructions (e.g. Azul, Lisp Machine) For on-chip FPGA memory NOT for large, general-purpose CPU DRAM With fixed object geometry (2 pointers + data) NOT for arbitrarily sized/shaped objects Snapshot-at-the-Beginning Algorithm [Yuasa 1990]
21 (diagram) Alloc, Addr Alloc'd, Addr to Read/Write, Pointer to Write, Pointer Value, Memory Subsystem
22 (diagram) Alloc, Addr Alloc'd, Addr to Read/Write, Pointer to Write, Pointer Value, Allocator, Mark Engine, Memory Subsystem, Sweep Engine, Memory
23 (diagram) Alloc, Addr Alloc'd, Addr to Read/Write, Pointer to Write, Pointer Value, Allocator, Mark Engine, Memory Subsystem, Sweep Engine, Memory, GC, Snapshot Engine, ROOT
24-30 MALLOCATOR (INCL. 1 MEMORY COLUMN) (diagram, animated over seven slides) Alloc, Addr to Free, Addr to Read/Write, Pointer to Write, Pointer Value, Addr Alloc'd, Address to Clear, Stack Top, Free Stack, Pointer Memory, Address Allocated; slides 29-30 show the example value 5
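The allocator diagram names a Free Stack, a Stack Top, an Alloc request and an Addr to Free input. A minimal software sketch of that idea follows, written in Python rather than HDL. The class and method names are hypothetical, and reserving address 0 as the null pointer is an assumption of this sketch, not something the slides state.

```python
class FreeStackAllocator:
    """Software model of a free-stack allocator (hypothetical names).

    Assumption: address 0 is reserved as the null pointer, so a heap of
    `size` slots hands out addresses 1..size-1.
    """

    def __init__(self, size):
        # The free stack initially holds every non-null address,
        # arranged so that address 1 is on top.
        self.free_stack = list(range(size - 1, 0, -1))

    def alloc(self):
        """Pop the stack top; return 0 (null) if the heap is exhausted."""
        if not self.free_stack:
            return 0
        return self.free_stack.pop()

    def free(self, addr):
        """Push a reclaimed address back onto the free stack."""
        self.free_stack.append(addr)


heap = FreeStackAllocator(8)
a = heap.alloc()
b = heap.alloc()
heap.free(a)
c = heap.alloc()  # the most recently freed address is reused first
```

In this model both allocation and freeing are a single stack operation, which is one reason a stack of free addresses maps naturally onto a one-operation-per-cycle hardware design.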
31-33 WRITING (POINTER) VALUE (diagram, animated over three slides) Alloc, Addr to Free, Addr to Read/Write, Pointer to Write, Pointer Value, Addr Alloc'd, Address to Clear, Stack Top, Free Stack, Pointer Memory, Address Allocated
34 THE TRACE ENGINE 3 OPERATIONS (a) Get a root pointer and mark it (b) Dequeue a pointer from mark queue and mark it (c) Perform write barrier and mark overwritten pointer
35-40 (a) (diagram, animated over six slides) Addr to Clear, Addr to Read/Write, Pointer to Write, Pointer Value, Root to Add, Mark Map, Barrier Reg, MUX, 1, Mark Queue, Pointer Memory, MUX, Pointer to Trace; example pointer value 3
41-44 (b) (diagram, animated over four slides) Addr to Clear, Addr to Read/Write, Pointer to Write, Pointer Value, Root to Add, Mark Map, Barrier Reg, MUX, 1, Mark Queue, Pointer Memory, MUX, Pointer to Trace; example pointer values 3 and 5
45-48 (c) (diagram, animated over four slides) Addr to Clear, Addr to Read/Write, Pointer to Write, Pointer Value, Root to Add, Mark Map, Barrier Reg, MUX, 1, Mark Queue, Pointer Memory, MUX, Pointer to Trace; example pointer values 3 and 5
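The three trace-engine operations (root marking, dequeue-and-trace, write barrier) can be sketched in software as follows. This is one plausible reading of the slides, not the authors' HDL: the class name, the method names, marking an object at the moment it is enqueued, and treating address 0 as null are all assumptions. The object layout follows the fixed geometry named earlier in the deck (two pointer fields per object).

```python
from collections import deque

NULL = 0  # assumption: address 0 is the null pointer


class TraceEngine:
    """Software model of snapshot-at-the-beginning marking (hypothetical names)."""

    def __init__(self, pointer_memory):
        # pointer_memory[addr] is a list of two pointer fields.
        self.mem = pointer_memory
        self.mark_map = [False] * len(pointer_memory)
        self.mark_queue = deque()

    def _mark(self, ptr):
        # Set the mark bit once and queue the object for tracing.
        if ptr != NULL and not self.mark_map[ptr]:
            self.mark_map[ptr] = True
            self.mark_queue.append(ptr)

    def add_root(self, root):
        """Operation (a): take a root pointer and mark it."""
        self._mark(root)

    def trace_step(self):
        """Operation (b): dequeue one pointer and mark what it points to."""
        if not self.mark_queue:
            return False
        obj = self.mark_queue.popleft()
        for child in self.mem[obj]:
            self._mark(child)
        return True

    def write(self, addr, field, new_ptr):
        """Operation (c): write barrier marks the pointer being overwritten."""
        self._mark(self.mem[addr][field])
        self.mem[addr][field] = new_ptr


# Heap: 1 -> 2 -> 3, object 4 is unreachable garbage.
mem = [[0, 0], [2, 0], [3, 0], [0, 0], [0, 0]]
engine = TraceEngine(mem)
engine.add_root(1)
engine.write(1, 0, NULL)      # mutator cuts the 1 -> 2 link mid-collection
while engine.trace_step():
    pass
marked = engine.mark_map
```

Objects 2 and 3 stay marked even though the mutator disconnected them, because the barrier preserves the heap as it was when collection began; object 4 is never marked and would be reclaimed by the sweep. For brevity the sketch runs the barrier on every write, whereas a real collector only needs it while a collection is in progress.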
49 RESULTS
50 EVALUATE 3 SYSTEMS (a) Malloc Allocator Memory (b) Stop-the-World GC Allocator Sweep Engine Mark Engine Memory (c) Real-Time GC Allocator Sweep Engine Mark Engine Memory Snapshot Engine
51 EVALUATE SYSTEMS IN 3 CONTEXTS COLLECTOR & MEMORY (a) Collector in isolation (no application)
52 EVALUATE SYSTEMS IN 3 CONTEXTS BINARY TREE (HAND-WRITTEN HDL) COLLECTOR & MEMORY (a) Collector in isolation (no application) (b) Collector with Binary Tree benchmark
53 EVALUATE SYSTEMS IN 3 CONTEXTS DEQUEUE BINARY TREE (HAND-WRITTEN HDL) COLLECTOR & MEMORY (a) Collector in isolation (no application) (b) Collector with Binary Tree benchmark (c) Collector with Double-ended Queue benchmark
54 LOGIC (SLICE) USAGE - NO APPLICATION Xilinx Virtex-5 LX330T 51,840 Slices Tiny fraction of chip STW almost as complex as RTGC
55 SYNTHESIZED CLOCK FREQUENCY - NO APPLICATION Frequency goes down with design complexity Malloc is faster, but advantage narrows
56 EXECUTION TIME - DEQUEUE
57 EXECUTION TIME - DEQUEUE RTGC uniformly faster than STW Malloc is faster, but not by that much (almost even for Binary Tree)
58 CONCLUSIONS First complete garbage collector in hardware First garbage collector that NEVER pauses mutator Greatly expands expressiveness of hardware programs RTGC is faster, smaller, and cooler than STW RTGC in hardware is MUCH SIMPLER than in software Is something wrong with our processor designs?
59 Questions?
60 Questions? Suggestions: You only have 2 microbenchmarks. Isn't that bogus? Isn't a fixed object layout totally bogus? Can determinism be preserved with a more complex heap? Could this technique be applied to general-purpose systems? I don't believe you never stall. Do you have a proof? Don't you lose performance by reserving one of the ports? What unique hardware features made stall-freedom possible?
61 BACKUP
62 ROLE OF THE GARBAGE COLLECTOR COLLECTOR & MEMORY
63 ROLE OF THE GARBAGE COLLECTOR APPLICATION COLLECTOR & MEMORY
64 ROLE OF THE GARBAGE COLLECTOR APPLICATION HAND-WRITTEN HDL COLLECTOR & MEMORY
65 ROLE OF THE GARBAGE COLLECTOR APPLICATION LIME TASK HAND-WRITTEN HDL COLLECTOR & MEMORY
66-67 PIPELINES IN THE LIME LANGUAGE var pipeline = task worker1 => task worker2 => task worker3; port-to-stream connection port-to-stream connection char char int int[[5]] worker1( ) { } worker2( ) { } worker3( ) { } compound filter
68-69 GARBAGE COLLECTING LIME TASKS var pipeline = task worker1 => task worker2 => task worker3; char char int int[[5]] worker1( ) { } worker2( ) { } worker3( ) { }
70 REGISTER MODULE (diagram) W_EN, DATA_IN, DATA_OUT, Mutator Register; example value 101
71-73 REGISTER MODULE + SNAPSHOT COMPONENT (diagram, animated over three slides) W_EN, DATA_IN, DATA_OUT, GC, ROOT_OUT, Mutator Register, Shadow Register; example value 101
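The register module pairs each mutator register with a shadow register that is driven by the GC signal and read through ROOT_OUT. A small software model of that arrangement is sketched below. The class name and the `clock` method are hypothetical, and the sketch assumes the shadow register captures the value held before the clock edge when a write arrives in the same cycle; the slides do not say which value is captured.

```python
class RootRegister:
    """Software model of a mutator register with a snapshot shadow (hypothetical)."""

    def __init__(self):
        self.mutator = 0   # value the application reads and writes
        self.shadow = 0    # value the collector reads as a root

    def clock(self, w_en=False, data_in=0, gc=False):
        """Simulate one clock edge; return (DATA_OUT, ROOT_OUT) after it."""
        old = self.mutator
        if gc:
            # Assumption: the snapshot takes the pre-edge register value.
            self.shadow = old
        if w_en:
            self.mutator = data_in
        return self.mutator, self.shadow


reg = RootRegister()
first = reg.clock(w_en=True, data_in=0b101)
snap = reg.clock(w_en=True, data_in=0b011, gc=True)  # write and snapshot together
later = reg.clock(w_en=True, data_in=0b010)          # mutator keeps running
```

The point of the model is that the mutator's write is never delayed: the snapshot happens on the same edge, and the collector can then read the shadow copy over as many cycles as it needs.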
74 (diagram) GC, Push/Pop, Push Value, Pop Value, Root to Add, Write Reg, Read Reg, Stack Top, MUX, Shadow Register, Scan Pointer, Mutator Stack, Mutator Register
75 (diagram) Addr to Clear, Addr to Read/Write, Pointer to Write, Pointer Value, Root to Add, Mark Map, 000, Barrier Reg, MUX, 1, Mark Queue, Pointer Memory, MUX, Pointer to Trace
76 (diagram) Alloc, GC, Addr Alloc'd, Addr to Clear, Address to Free, =10?, Stack Top, Free Stack, Used Map, MUX, Address Allocated, Sweep Pointer, Mark Map
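The sweep diagram combines a Used Map, a Mark Map, a Sweep Pointer and a "=10?" comparison that feeds Address to Free. A hedged software sketch follows; it reads "=10?" as "used bit is 1 and mark bit is 0", which is an interpretation rather than something the slide spells out. The function name is hypothetical, slot 0 is assumed to be null, and clearing mark bits during the sweep is a simplification of this sketch.

```python
def sweep(used_map, mark_map, free_stack):
    """Free every slot that is in use but was not marked (hypothetical helper)."""
    for addr in range(1, len(used_map)):       # sweep pointer; slot 0 assumed null
        if used_map[addr] and not mark_map[addr]:
            used_map[addr] = False             # slot is garbage: release it
            free_stack.append(addr)            # push onto the free stack
        mark_map[addr] = False                 # reset the mark for the next cycle
    return free_stack


used = [False, True, True, True, False]
marks = [False, True, False, True, False]
free_stack = sweep(used, marks, [4])
```

Slot 2 is the only one that is used but unmarked, so it alone is pushed onto the free stack; slots 1 and 3 survive with their mark bits cleared.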
77 (diagram) GC, W_EN, PUSH, DATA_IN (PUSH), DATA_OUT (POP), ROOT_OUT, 1, DATA_IN_, +/-, W_EN_, ADDR_IN_, DATA_OUT_, DATA_IN, W_EN, 101, Top of Stack, 1, -, MUX, 0, W_EN_, ADDR_IN_, DATA_OUT_, MUX, Mutator Stack, State Machine, 101, Scan Index
78 ENABLERS FOR STALL-FREEDOM Dual-ported Memory Read-before-Write Memory and Registers Simple, ubiquitous synchronization (clock edge) Forward reasoning about remote states (clock cycles) Determinism
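Read-before-write behaviour is what lets a write barrier obtain the overwritten pointer in the same cycle as the write. The sketch below models a single memory port with that behaviour; the class and method names are hypothetical, and a real dual-ported BRAM would expose two such ports over the same cells.

```python
class ReadBeforeWriteRAM:
    """Software model of one read-before-write memory port (hypothetical names)."""

    def __init__(self, size):
        self.cells = [0] * size

    def cycle(self, addr, write_enable=False, data_in=0):
        """One clock cycle: return the old contents, then apply the write."""
        data_out = self.cells[addr]       # the read sees the pre-write value
        if write_enable:
            self.cells[addr] = data_in
        return data_out


ram = ReadBeforeWriteRAM(8)
old0 = ram.cycle(3, write_enable=True, data_in=7)
old1 = ram.cycle(3, write_enable=True, data_in=9)  # returns the pointer being overwritten
now = ram.cycle(3)
```

The second write returns 7, the value it replaces, without a separate read cycle. That returned value is what a snapshot write barrier needs to mark.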
79 EXECUTION TIME IN CYCLES - DEQUEUE
80 EXECUTION TIME IN CYCLES - DEQUEUE STW burns cycles while stopping the world Malloc pays (a little) for explicit free operation Malloc can run in a smaller heap (but not as bad as software)
81-85 FIELD PROGRAMMABLE GATE ARRAYS (diagram, animated over five slides; the last adds IO)
86 CPU Backend GPU Backend Node Backend Verilog Backend bytecode binary binary bitfile CPU GPU PowerEN FPGA
87 THE LIQUID METAL PROGRAMMING LANGUAGE Lime CPU Backend GPU Backend Node Backend Verilog Backend Lime Compiler bytecode binary binary bitfile CPU GPU PowerEN FPGA
88-91 EXECUTION, COMMUNICATION, AND REPLACEMENT LVM (diagram, animated over four slides)
92 STATEFUL TASKS var averager = task Averager().avg; instance variables (local state) primitive filter double double total; long count; double double avg(double x) { total += x; return total/++count; }
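For readers unfamiliar with Lime, the stateful averager filter on this slide behaves like the following rough Python analogue. This is not Lime code and does not model task creation or stream connections; it only illustrates how the instance variables `total` and `count` persist between invocations of `avg`.

```python
class Averager:
    """Python analogue of the Lime Averager task's local state."""

    def __init__(self):
        self.total = 0.0   # running sum of all inputs seen so far
        self.count = 0     # number of inputs seen so far

    def avg(self, x):
        """Consume one input value and emit the running average."""
        self.total += x
        self.count += 1
        return self.total / self.count


averager = Averager()
outputs = [averager.avg(x) for x in [2.0, 4.0, 9.0]]
```

Each call consumes one value and produces one value, which matches the double-in, double-out shape of the primitive filter on the slide.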
93 VIRTUALIZATION OF DATA MOVEMENT =>
94-95 INTERPRETATION VERSUS COMPILATION PROGRAM getfield invokevirtual INTERPRETER MOV LR INSTRUCTION SET PROCESSOR
96 GARBAGE COLLECTION Frees programmer from managing memory Simpler interfaces, easier debugging, memory safety Invented 1960 for IBM 704 with 18K Current large FPGAs have memory commensurate with a VAX 11/780 Recent results: We built a garbage collector for data in on-chip BRAMs Able to handle a memory op each cycle without ever stalling Cost in slices and energy is ~0; cost in frequency and BRAM is small Algorithmically simpler than SW GC, yet achieves vastly better results Potential game-changer in scope of synthesizable code
More informationParsing Scheme (+ (* 2 3) 1) * 1
Parsing Scheme + (+ (* 2 3) 1) * 1 2 3 Compiling Scheme frame + frame halt * 1 3 2 3 2 refer 1 apply * refer apply + Compiling Scheme make-return START make-test make-close make-assign make- pair? yes
More informationGarbage Collection Basics
Garbage Collection Basics 1 Freeing memory is a pain Need to decide on a protocol (who frees what when?) Pollutes interfaces Errors hard to track down Remember 211 / 213?... but lets try an example anyway
More informationCS577 Modern Language Processors. Spring 2018 Lecture Garbage Collection
CS577 Modern Language Processors Spring 2018 Lecture Garbage Collection 1 BASIC GARBAGE COLLECTION Garbage Collection (GC) is the automatic reclamation of heap records that will never again be accessed
More informationFPGA Implementation and Validation of the Asynchronous Array of simple Processors
FPGA Implementation and Validation of the Asynchronous Array of simple Processors Jeremy W. Webb VLSI Computation Laboratory Department of ECE University of California, Davis One Shields Avenue Davis,
More informationSynthesizable Verilog
Synthesizable Verilog Courtesy of Dr. Edwards@Columbia, and Dr. Franzon@NCSU http://csce.uark.edu +1 (479) 575-6043 yrpeng@uark.edu Design Methodology Structure and Function (Behavior) of a Design HDL
More informationManual Allocation. CS 1622: Garbage Collection. Example 1. Memory Leaks. Example 3. Example 2 11/26/2012. Jonathan Misurda
Manual llocation Dynamic memory allocation is an obvious necessity in a programming environment. S 1622: Garbage ollection Many programming languages expose some functions or keywords to manage runtime
More informationImplementing Symmetric Multiprocessing in LispWorks
Implementing Symmetric Multiprocessing in LispWorks Making a multithreaded application more multithreaded Martin Simmons, LispWorks Ltd Copyright 2009 LispWorks Ltd Outline Introduction Changes in LispWorks
More informationCSE 153 Design of Operating Systems
CSE 153 Design of Operating Systems Winter 19 Lecture 7/8: Synchronization (1) Administrivia How is Lab going? Be prepared with questions for this weeks Lab My impression from TAs is that you are on track
More informationRobust Memory Management Schemes
Robust Memory Management Schemes Prepared by : Fadi Sbahi & Ali Bsoul Supervised By: Dr. Lo ai Tawalbeh Jordan University of Science and Technology Robust Memory Management Schemes Introduction. Memory
More informationSchism: Fragmentation-Tolerant Real-Time Garbage Collection. Fil Pizlo *
Schism: Fragmentation-Tolerant Real-Time Garbage Collection Fil Pizlo Luke Ziarek Peta Maj * Tony Hosking * Ethan Blanton Jan Vitek * * Why another Real Time Garbage Collector? Why another Real Time Garbage
More informationLast week, David Terei lectured about the compilation pipeline which is responsible for producing the executable binaries of the Haskell code you
Last week, David Terei lectured about the compilation pipeline which is responsible for producing the executable binaries of the Haskell code you actually want to run. Today, we are going to look at an
More informationSystems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits
Systems I Logic Design I Topics Digitl logic Logic gtes Simple comintionl logic circuits Simple C sttement.. C = + ; Wht pieces of hrdwre do you think you might need? Storge - for vlues,, C Computtion
More informationCS429: Computer Organization and Architecture
CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: January 2, 2018 at 11:23 CS429 Slideset 5: 1 Topics of this Slideset
More informationEECS150 - Digital Design Lecture 20 - Finite State Machines Revisited
EECS150 - Digital Design Lecture 20 - Finite State Machines Revisited April 2, 2009 John Wawrzynek Spring 2009 EECS150 - Lec20-fsm Page 1 Finite State Machines (FSMs) FSM circuits are a type of sequential
More informationProcessor Architecture I. Alexandre David
Processor Architecture I Alexandre David Overview n Introduction: from transistors to gates. n and from gates to circuits. n 4.1 n 4.2 n Micro+macro code. 12-04-2011 CART - Aalborg University 2 Evolution
More informationLab 1: FPGA Physical Layout
Lab 1: FPGA Physical Layout University of California, Berkeley Department of Electrical Engineering and Computer Sciences EECS150 Components and Design Techniques for Digital Systems John Wawrzynek, James
More informationComputer Systems Lecture 9
Computer Systems Lecture 9 CPU Registers in x86 CPU status flags EFLAG: The Flag register holds the CPU status flags The status flags are separate bits in EFLAG where information on important conditions
More informationCMSC 330: Organization of Programming Languages. Ownership, References, and Lifetimes in Rust
CMSC 330: Organization of Programming Languages Ownership, References, and Lifetimes in Rust CMSC330 Spring 2018 1 Memory: the Stack and the Heap The stack constant-time, automatic (de)allocation Data
More informationSustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)
COMP 412 FALL 2017 Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java) Copyright 2017, Keith D. Cooper & Zoran Budimlić, all rights reserved. Students enrolled in Comp 412 at Rice
More informationHigh-Level Language VMs
High-Level Language VMs Outline Motivation What is the need for HLL VMs? How are these different from System or Process VMs? Approach to HLL VMs Evolutionary history Pascal P-code Object oriented HLL VMs
More informationERCBench An Open-Source Benchmark Suite for Embedded and Reconfigurable Computing
ERCBench An Open-Source Benchmark Suite for Embedded and Reconfigurable Computing Daniel Chang Chris Jenkins, Philip Garcia, Syed Gilani, Paula Aguilera, Aishwarya Nagarajan, Michael Anderson, Matthew
More informationQualifying Exam in Programming Languages and Compilers
Qualifying Exam in Programming Languages and Compilers University of Wisconsin Fall 1991 Instructions This exam contains nine questions, divided into two parts. All students taking the exam should answer
More informationNovel Architecture for Designing Asynchronous First in First out (FIFO)
I J C T A, 10(8), 2017, pp. 343-349 International Science Press ISSN: 0974-5572 Novel Architecture for Designing Asynchronous First in First out (FIFO) Avinash Yadlapati* and Hari Kishore Kakarla* ABSTRACT
More informationOverview of Lecture 4. Memory Models, Atomicity & Performance. Ben-Ari Concurrency Model. Dekker s Algorithm 4
Concurrent and Distributed Programming http://fmt.cs.utwente.nl/courses/cdp/ Overview of Lecture 4 2 Memory Models, tomicity & Performance HC 4 - Tuesday, 6 December 2011 http://fmt.cs.utwente.nl/~marieke/
More information