Delft-Java Dynamic Translation
1 Delft-Java Dynamic Translation
John Glossner (1,2) and Stamatis Vassiliadis (2)
(1) IBM Research, DSP and Embedded Computing, Yorktown Heights, NY, glossner@us.ibm.com (formerly with Lucent Technologies)
(2) Delft University of Technology, CARDIT - Computer Architecture and Digital Technique, Delft, The Netherlands, {glossner,stamatis}@cardit.et.tudelft.nl
2 Overview
- Java Properties
- Delft-Java Engine
- Dynamic Translation
- Indirect Register Access
- Translated Instructions
- Non-translated Instructions
- Results
- Related Work
- Conclusions
3 Java Properties
- Object-Oriented Programming Language
  - Inheritance and Polymorphism Supported
- Programmer Supplied Parallelism (Threads)
- Dynamically Linked
  - Resolves C++'s fragile class problem but imposes performance constraints on class access
  - Entire set of objects in system not required at compile time
- Strongly Typed
  - Statically determinable type state enables simple on-the-fly translation of bytecodes into efficient machine code [Gos95]
- Compiled to Platform Independent Virtual Machine
4 Delft-Java Engine
- RISC-style Architecture
  - 32-bit Instructions
  - Multiple Register Files
- Concurrent Multithreaded Organization
  - Multiple Hardware Thread Units
  - Multiple Instruction Issue Per Thread
- Indirect Register Access
- Supervisory Instructions
  - Branch Java View (bex)
- Integer & Floating Point
  - 8, 16, 32, and 64-bit Signed & Unsigned Integers
  - IEEE-754 Floating Point
- Multimedia Instructions
  - SIMD Parallelism
- DSP Arithmetic Extensions
  - Saturation Logic
  - Rounding Modes
- 32-bit Address Space
  - Base + Offset + Displacement
5 Java Hardware Support
- Transparent Extraction of Parallelism
  - Multiple concurrent thread units
- Dynamic Java Instruction Translation
  - Register file caches stack with indirect access
- JVM Reserved Instruction Used For BEX
- Link Translation Buffer For Dynamic Linking
  - Associates a caller's object reference and constant pool entry ID with a linked object invocation
- Logical Controller For Non-Supported Translations
  - Thin interpretive layer and Java run-time
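The Link Translation Buffer bullet describes an association from (caller's object reference, constant pool entry ID) to a linked invocation. The block below is only a software analogy of that association, not the hardware design: the class name, the use of plain `int` values for the reference and the constant pool index, the string used as a "linked target", and the slow-path resolver callback are all hypothetical choices made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Software analogy (assumption): a lookup table keyed by the pair
// (caller object reference, constant pool entry ID).
class LinkTranslationBufferSketch {
    private final Map<Long, String> entries = new HashMap<>();
    private int misses = 0;

    // Pack the two 32-bit values into one 64-bit key.
    private static long key(int callerRef, int cpIndex) {
        return ((long) callerRef << 32) | (cpIndex & 0xFFFFFFFFL);
    }

    // Return the linked target; on a miss, fall back to the (hypothetical)
    // slow resolver and remember its answer for later invocations.
    String lookup(int callerRef, int cpIndex,
                  BiFunction<Integer, Integer, String> slowResolve) {
        long k = key(callerRef, cpIndex);
        String target = entries.get(k);
        if (target == null) {
            misses++;
            target = slowResolve.apply(callerRef, cpIndex);
            entries.put(k, target);
        }
        return target;
    }

    int misses() {
        return misses;
    }

    public static void main(String[] args) {
        LinkTranslationBufferSketch ltb = new LinkTranslationBufferSketch();
        BiFunction<Integer, Integer, String> resolver =
                (ref, cp) -> "method@" + ref + ":" + cp; // stand-in for run-time linking
        System.out.println(ltb.lookup(7, 12, resolver)); // miss: resolved by the slow path
        System.out.println(ltb.lookup(7, 12, resolver)); // hit: served from the buffer
        System.out.println("misses = " + ltb.misses());
    }
}
```

The point of the sketch is only that a repeated invocation from the same caller and constant pool entry can skip the dynamic-linking work after the first resolution.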
6 Delft-Java Organization
7 Indirect Access
[Figure: indirect access datapath. Labels: index registers idx[0] ... idx[7]; fields ix, iy, it; post / pre update; IsJava; imm (three inputs); adjust; overflow / underflow; update offset register; resolved registers]
8 Register Mapping
[Figure: register mapping. Labels: resolved register; register file r0 ... r31 with VALID and MODIFIED tags; main memory image of r0 ... r31; stack llimit and stack ulimit; base address + offset + displacement]
9 Java Dynamic Translation
- Step 1: Translate Java Instruction To Indirect Instruction
  - iadd translates to add.ind.w32 idx[7] ++it, it, ++it
  - idx[7] is an index into the register file
- Step 2: Translate Indirect To Direct Instruction
  - if idx[7].rt = 20, then the translated instruction becomes: add.w32 r21, r20, r21 ; (r21 = r20 + r21)
- In Java-mode, ix, iy, and it are locked together.
- In native-mode, they are free to contain independent offsets for Vector operations.
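Step 2 can be illustrated with a small resolver. This is a minimal sketch, assuming that every operand specifier in one instruction is resolved against the same starting value of it (which is what the iadd example implies: with it = 20, "++it" names r21 and "it" names r20). The class and method names are hypothetical, and the string-based specifiers stand in for whatever encoding the hardware actually uses.

```java
// Hypothetical sketch: resolve indirect operand specifiers into
// direct register numbers, given the current value of `it`.
class IndirectOperandSketch {

    // Map one specifier ("it", "++it", "--it", or "N+it") to a register number.
    static int resolve(String spec, int it) {
        String s = spec.trim();
        if (s.equals("it")) {
            return it;
        }
        if (s.equals("++it")) {
            return it + 1;
        }
        if (s.equals("--it")) {
            return it - 1;
        }
        if (s.endsWith("+it")) {
            // Forms such as "2+it": a constant offset above `it`.
            return Integer.parseInt(s.substring(0, s.length() - 3)) + it;
        }
        throw new IllegalArgumentException("unknown specifier: " + spec);
    }

    // Build the direct three-operand instruction text.
    static String toDirect(String opcode, int it,
                           String dst, String src1, String src2) {
        return opcode + " r" + resolve(dst, it)
                + ", r" + resolve(src1, it)
                + ", r" + resolve(src2, it);
    }

    public static void main(String[] args) {
        // iadd -> add.ind.w32 idx[7] ++it, it, ++it, resolved with it = 20
        System.out.println(toDirect("add.w32", 20, "++it", "it", "++it"));
        // prints "add.w32 r21, r20, r21"
    }
}
```

Resolving all specifiers against one snapshot of it, and only then updating it, is the reading that reproduces the slide's add.w32 r21, r20, r21; a sequential left-to-right reading of the pre-increments would not.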
10 Vector Multiply
    class VectorMultiply {
        public static final int MAXVEC = 100;

        public static void main(String[] args) {
            int[] a, b, c;
            a = new int[MAXVEC];
            b = new int[MAXVEC];
            c = new int[MAXVEC];
            for (int i = 0; i < MAXVEC; i++) {   // Init Arrays
                a[i] = i;
                b[i] = 2 * i;
                c[i] = 0;
            }
            for (int i = 0; i < MAXVEC; i++) {   // Element-wise multiply
                c[i] = a[i] * b[i];
            }
        }
    }
11 Compiled Inner Loop
    aload_3     ; address of c[0] on heap
    iload 5     ; index into c[]
    aload_1     ; address of a[0]
    iload 5     ; index into a[]
    iaload      ; element from a[index]
    aload_2     ; address of b[0]
    iload 5     ; index into b[]
    iaload      ; element from b[index]
    imul        ; multiply a[i]*b[i]
    iastore     ; store it into c[index]
12 Translated Bytecode
    Opcode            Indirect Register
    load    [idx7]    --it, base_lv + #3
    load    [idx7]    --it, base_lv + #5
    load    [idx7]    --it, base_lv + #1
    load    [idx7]    --it, base_lv + #5
    load    [idx7]    ++it, ++it + it
    load    [idx7]    --it, base_lv + #2
    load    [idx7]    --it, base_lv + #5
    load    [idx7]    ++it, ++it + it
    mpy     [idx7]    ++it, it, ++it
    store   [idx7]    2+it + 1+it, it
13 DJ Executed Instructions
Initial value of idx[7] = 24
    Opcode    Direct Register
    load      r23 <- Mem[base_LV + #3]
    load      r22 <- Mem[base_LV + #5]
    load      r21 <- Mem[base_LV + #1]
    load      r20 <- Mem[base_LV + #5]
    load      r21 <- Mem[r21 + r20]
    load      r20 <- Mem[base_LV + #2]
    load      r19 <- Mem[base_LV + #5]
    load      r20 <- Mem[r20 + r19]
    mpy       r21 <- r20 * r21
    store     Mem[r23 + r22] <- r21
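The register numbers in this table follow from treating it as the register that holds the top of the operand stack, with a push moving it down by one and a pop moving it up by one. The sketch below replays the ten operations of the inner loop under that reading, starting from it = 24. The class and method names are hypothetical, and the final adjustment of it after the store (popping all three operands) is an assumption; the slide does not show it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: replay the inner loop as stack operations and
// emit the direct-register form of each one.
class StackToRegisterTrace {
    private int it;                                   // register holding top of stack
    private final List<String> emitted = new ArrayList<>();

    StackToRegisterTrace(int initialIt) {
        this.it = initialIt;
    }

    // Local-variable load: push, so the destination is the register below `it`.
    void loadLocal(int n) {
        it = it - 1;
        emitted.add("r" + it + " <- Mem[base_LV + #" + n + "]");
    }

    // Array element load: pops index and array reference, pushes the element.
    void arrayLoad() {
        int index = it;
        int ref = it + 1;
        emitted.add("r" + ref + " <- Mem[r" + ref + " + r" + index + "]");
        it = ref;
    }

    // Integer multiply: pops two operands, pushes the product.
    void multiply() {
        int top = it;
        int next = it + 1;
        emitted.add("r" + next + " <- r" + top + " * r" + next);
        it = next;
    }

    // Array element store: value at it, index at it+1, reference at it+2.
    void arrayStore() {
        int value = it;
        int index = it + 1;
        int ref = it + 2;
        emitted.add("Mem[r" + ref + " + r" + index + "] <- r" + value);
        it = it + 3;                                  // assumption: all three operands popped
    }

    List<String> emitted() {
        return emitted;
    }

    int top() {
        return it;
    }

    // The inner loop body c[i] = a[i] * b[i], with locals a=#1, b=#2, c=#3, i=#5.
    static StackToRegisterTrace innerLoop() {
        StackToRegisterTrace t = new StackToRegisterTrace(24);
        t.loadLocal(3);
        t.loadLocal(5);
        t.loadLocal(1);
        t.loadLocal(5);
        t.arrayLoad();
        t.loadLocal(2);
        t.loadLocal(5);
        t.arrayLoad();
        t.multiply();
        t.arrayStore();
        return t;
    }

    public static void main(String[] args) {
        for (String line : innerLoop().emitted()) {
            System.out.println(line);
        }
    }
}
```

Run as written, the emitted lines match the Direct Register column row for row, which is a useful check that the push-down, pop-up reading of it is consistent with the slide.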
14 Translated Instructions
anewarray, arraylength, athrow, checkcast, getfield, getstatic, goto_w [1], instanceof, invokeinterface [1], invokespecial, invokestatic, invokevirtual, jsr_w [1], lookupswitch [1], monitorenter, monitorexit, multianewarray, new, newarray, putfield, putstatic, tableswitch, wide
[1] (traps)
- These instructions have special hardware support
- All other instructions are transparently translated
15 Model Characteristics
- IS: Ideal Stack
  - Does not remove stack bottlenecks
- IX: Ideal Translated
  - Uses Delft-Java Translation
  - Multiple Issue (in-order)
  - No Renaming
- IR: Ideal Translated with Renaming
  - Out-of-order multiple issue
- PS: Pipelined Stack
  - Pipeline Latency of 4 Cycles for Memory
  - Pipeline Latency of 1 Cycle for Arithmetic
  - No Delft-Java Translation
- PX: Pipelined Stack with Translation
  - In-order Multiple Issue
- PR: Pipelined Translation and Renaming
  - Out-of-order Multiple Issue
- BR: Bounded Resource PR
  - 2 Concurrent Memory Space Accesses

    Model   Rename   Issue      L/S Units   Latency
    IS      No       In-order
    IX      No       In-order
    IR      Yes      OOO
    PS      No       In-order
    PX      No       In-order
    PR      Yes      OOO
    BR      Yes      OOO        2LV/2H      4
16 Results
- PS (Pipelined Stack) Chosen as Basis for Comparison
  - A potentially realizable implementation
- IS (Ideal Stack) is 3.5x faster than PS (Pipelined Stack)
- Stack bottlenecks were reduced by 40% with the IX (Ideal Translated) model versus the IS (Ideal Stack) model.
- When register renaming was applied (IR model), stack bottlenecks were reduced 60%.
- Bounded Resources (BR) still performed 3.2x better than the pipelined stack.
- Register renaming with out-of-order execution successfully enhanced performance by 50%
[Table: Peak Issue, IPC, and Speedup for models IS, IX, IR, PS, PX, PR, BR; values not preserved]
17 Results
[Figure: bar chart of speedup (axis 0 to 8) on Vector Multiply for models IS, IX, IR, PS, PX, PR, and BR, shown with the same model legend as the Model Characteristics slide]
18 Comparison
Sun picoJava
- Direct Execution
- Stack Cache
  - Implemented with registers
  - Automatic stack spill/fill
- Acceleration
  - instruction folding
- Instruction and Data Cache
  - Global L1
- Extended bytecodes
- Complex instructions trap
- Contiguous Stack Frame
Delft-Java
- Dynamic Translation
  - Translated to RISC instructions
  - Indirect register access
  - Runtime register allocation
- Acceleration
  - compounding instruction issue
  - multiple thread units
- Link Translation Buffer
- Instruction and Data Cache
  - Global L1 Cache
  - Per thread L0 Instruction, Stack, Local Variable Cache
- Superset of instructions (w/ BEX)
- Complex instructions trap
- Contiguous Stack Frame
19 Comparison
- Tomasulo Dynamic Scheduling
  - Munsil and Wang, 1998
  - Reduced stack usage
- Stack Folding
  - Sun, 1998
  - Chang et al.
  - Detect true dependencies in instruction stream and execute as single compound instruction
  - Similar to our 1997 collapsing ALU technique
- Virtual Registers
  - Li et al., 1998
  - Allows arithmetic instructions to obtain source operands from a virtual register which may reference operands below the top of the stack
  - Multiple instructions can be issued in parallel
  - Similar to our 1997 translation technique
20 Conclusion
- Dynamic Translation
  - A form of hardware register allocation
  - Transforms stack bottlenecks into pipeline dependencies
  - Pipeline dependencies are removed using superscalar techniques
- 3.2x speedup achieved over a pipelined stack model
- Up to 60% of stack bottlenecks removed
- For translated instruction streams, out-of-order execution realized a 50% performance improvement when compared with in-order execution
CSE 820 Graduate Computer Architecture week 6 Instruction Level Parallelism Based on slides by David Patterson Review from Last Time #1 Leverage Implicit Parallelism for Performance: Instruction Level
More informationCISC 662 Graduate Computer Architecture Lecture 13 - Limits of ILP
CISC 662 Graduate Computer Architecture Lecture 13 - Limits of ILP Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer
More informationCase Study IBM PowerPC 620
Case Study IBM PowerPC 620 year shipped: 1995 allowing out-of-order execution (dynamic scheduling) and in-order commit (hardware speculation). using a reorder buffer to track when instruction can commit,
More informationMultithreaded Processors. Department of Electrical Engineering Stanford University
Lecture 12: Multithreaded Processors Department of Electrical Engineering Stanford University http://eeclass.stanford.edu/ee382a Lecture 12-1 The Big Picture Previous lectures: Core design for single-thread
More informationOut of Order Processing
Out of Order Processing Manu Awasthi July 3 rd 2018 Computer Architecture Summer School 2018 Slide deck acknowledgements : Rajeev Balasubramonian (University of Utah), Computer Architecture: A Quantitative
More informationComputer Organization and Technology Processor and System Structures
Computer Organization and Technology Processor and System Structures Assoc. Prof. Dr. Wattanapong Kurdthongmee Division of Computer Engineering, School of Engineering and Resources, Walailak University
More informationOn the Design of the Local Variable Cache in a Hardware Translation-Based Java Virtual Machine
On the Design of the Local Variable Cache in a Hardware Translation-Based Java Virtual Machine Hitoshi Oi The University of Aizu June 16, 2005 Languages, Compilers, and Tools for Embedded Systems (LCTES
More informationPrecise Exceptions and Out-of-Order Execution. Samira Khan
Precise Exceptions and Out-of-Order Execution Samira Khan Multi-Cycle Execution Not all instructions take the same amount of time for execution Idea: Have multiple different functional units that take
More informationDynamic Scheduling. CSE471 Susan Eggers 1
Dynamic Scheduling Why go out of style? expensive hardware for the time (actually, still is, relatively) register files grew so less register pressure early RISCs had lower CPIs Why come back? higher chip
More informationCMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago
CMSC 22200 Computer Architecture Lecture 2: ISA Prof. Yanjing Li Department of Computer Science University of Chicago Administrative Stuff! Lab1 is out! " Due next Thursday (10/6)! Lab2 " Out next Thursday
More information