Dynamic Dispatch and Duck Typing. L25: Modern Compiler Design

Late Binding
- Static dispatch (e.g. a C function call) is a jump to a specific address
- Object-oriented languages decouple the method name from the method address
- One name can map to multiple implementations
- The destination must be computed somehow

VTable-based Dispatch
- Tied to the class (or interface) hierarchy
- An array of pointers (the virtual function table) is used for method dispatch

    struct Foo {
        int x;
        virtual void foo();
    };
    void Foo::foo() {}
    void callVirtual(Foo &f) { f.foo(); }
    void create() { Foo f; callVirtual(f); }

Calling the method via the vtable

    define void @_Z11callVirtualR3Foo(%struct.Foo* %f) uwtable ssp {
      ; Treat the start of the object as a pointer to the vtable
      %1 = bitcast %struct.Foo* %f to void (%struct.Foo*)***
      ; Load the vtable pointer from the object
      %2 = load void (%struct.Foo*)*** %1, align 8, !tbaa !0
      ; Load the method pointer from the first vtable slot
      %3 = load void (%struct.Foo*)** %2, align 8
      tail call void %3(%struct.Foo* %f)
      ret void
    }

Creating the object

    @_ZTV3Foo = unnamed_addr constant [3 x i8*] [
        i8* null,
        i8* bitcast ({ i8*, i8* }* @_ZTI3Foo to i8*),
        i8* bitcast (void (%struct.Foo*)* @_ZN3Foo3fooEv to i8*)]

    define linkonce_odr void @_ZN3FooC2Ev(%struct.Foo* nocapture %this) {
      %1 = getelementptr inbounds %struct.Foo* %this, i64 0, i32 0
      ; Store the address of the first method slot (index 2) into the object
      store i32 (...)** bitcast (i8** getelementptr inbounds ([3 x i8*]* @_ZTV3Foo, i64 0, i64 2) to i32 (...)**), i32 (...)*** %1
      ret void
    }
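
The two slides above can be mirrored in plain C++ without the `virtual` keyword. The following is a minimal sketch, not the layout any particular compiler emits: `FooObj`, `FooVTable`, `make_foo`, `make_bar` and `call_virtual` are hypothetical names, and the methods return `int` only so the result is observable.

```cpp
// Hand-written stand-in for a class with one virtual method.
// All names are illustrative assumptions, not compiler-generated symbols.
struct FooObj;
using FooMethod = int (*)(FooObj *);

struct FooVTable {
    FooMethod foo;          // slot 0: the only virtual method
};

struct FooObj {
    const FooVTable *vptr;  // vtable pointer stored at the start of the object
    int x;
};

int Foo_foo(FooObj *self) { return self->x; }
int Bar_foo(FooObj *self) { return self->x * 2; }  // an overriding implementation

const FooVTable foo_vtable = { Foo_foo };
const FooVTable bar_vtable = { Bar_foo };

// "Constructors" store the vtable address into the new object.
FooObj make_foo(int x) { return FooObj{ &foo_vtable, x }; }
FooObj make_bar(int x) { return FooObj{ &bar_vtable, x }; }

// Two dependent loads followed by an indirect call.
int call_virtual(FooObj *f) {
    const FooVTable *vt = f->vptr;  // load the vtable pointer
    FooMethod m = vt->foo;          // load the method pointer from its slot
    return m(f);                    // indirect call
}
```

Calling `call_virtual` on objects built by `make_foo` and `make_bar` reaches different implementations through the same call site, which is the point of late binding.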

Problems with VTable-based Dispatch
- VTable layout is per-class
- Languages with duck typing do not tie dispatch to the class hierarchy
- Selectors must be more abstract than vtable offsets (e.g. globally unique integers for method names)

Ordered Dispatch Tables
- All methods for a specific class are kept in a sorted list
- Binary (or linear) search for lookup
- Lots of conditional branches for binary search
- Either very big dtables or multiple searches to look at superclasses
- Cache friendly for small dtables (the entire search is in cache)
- Expensive to add methods (requires lock / RCU)
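
A minimal sketch of an ordered dispatch table, assuming selectors are globally unique integers and taking the "multiple searches" option for superclasses. `Class`, `Entry`, `add_method` and `lookup` are hypothetical names, and no locking is shown.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Selector = std::uint32_t;   // assumed: one unique integer per method name
using Method = int (*)(int);

struct Entry { Selector sel; Method method; };

struct Class {
    const Class *super;           // nullptr for a root class
    std::vector<Entry> dtable;    // kept sorted by selector
};

// Sample methods used only for illustration.
int add_one(int v) { return v + 1; }
int twice(int v) { return v * 2; }

// Insert while keeping the table sorted. A real runtime would need a
// lock or RCU here, because readers may be searching concurrently.
void add_method(Class &cls, Selector sel, Method m) {
    auto pos = std::lower_bound(cls.dtable.begin(), cls.dtable.end(), sel,
        [](const Entry &e, Selector s) { return e.sel < s; });
    if (pos != cls.dtable.end() && pos->sel == sel) pos->method = m;
    else cls.dtable.insert(pos, Entry{sel, m});
}

// Binary search in each class, moving to the superclass on a miss.
Method lookup(const Class *cls, Selector sel) {
    for (; cls != nullptr; cls = cls->super) {
        auto pos = std::lower_bound(cls->dtable.begin(), cls->dtable.end(), sel,
            [](const Entry &e, Selector s) { return e.sel < s; });
        if (pos != cls->dtable.end() && pos->sel == sel) return pos->method;
    }
    return nullptr;  // no class in the chain implements the selector
}
```

The alternative mentioned above, one very big dtable per class, would copy inherited entries into each subclass so that `lookup` needs only a single search.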

Sparse Dispatch Tables
- Tree structure, 2-3 pointer accesses + offset calculations
- Fast if in cache
- Pointer chasing is suboptimal for superscalar chips (inherently serial)
- Copy-on-write tree nodes work well for inheritance and reduce memory pressure
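
A minimal two-level sketch of a sparse dispatch table with copy-on-write leaves, assuming 16-bit selectors split into a high and a low byte. `SparseTable`, `Leaf`, `install` and `inherit` are hypothetical names, and the reference-count check is only safe single-threaded.

```cpp
#include <array>
#include <cstdint>
#include <memory>

using Selector = std::uint16_t;   // assumed: small integer selectors
using Method = int (*)(int);

// Sample methods used only for illustration.
int add_one(int v) { return v + 1; }
int twice(int v) { return v * 2; }

struct Leaf {
    std::array<Method, 256> slots{};   // nullptr means "no method"
};

struct SparseTable {
    std::array<std::shared_ptr<Leaf>, 256> leaves{};  // indexed by the high byte

    // Two pointer accesses plus offset calculations, with no searching.
    Method lookup(Selector sel) const {
        const std::shared_ptr<Leaf> &leaf = leaves[sel >> 8];
        return leaf ? leaf->slots[sel & 0xff] : nullptr;
    }

    // Copy-on-write: a leaf shared with another table is cloned before it changes.
    void install(Selector sel, Method m) {
        std::shared_ptr<Leaf> &leaf = leaves[sel >> 8];
        if (!leaf) leaf = std::make_shared<Leaf>();
        else if (leaf.use_count() > 1) leaf = std::make_shared<Leaf>(*leaf);
        leaf->slots[sel & 0xff] = m;
    }
};

// A subclass starts out sharing every leaf with its superclass.
SparseTable inherit(const SparseTable &super) { return super; }
```

In this sketch a subclass that overrides nothing costs only 256 shared pointers. Later changes to the superclass do not propagate to existing subclasses, which a real runtime would have to handle.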

Inverted Dispatch Tables
- Normal dispatch tables are a per-class (or per-object) map from selector to method
- Inverted dispatch tables are a per-selector map from class (or object) to method
- If method overriding is rare, this provides smaller maps (but more of them)
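
A minimal sketch of the inverted arrangement, assuming string selectors and hash maps for brevity. `Runtime`, `define` and `lookup` are hypothetical names.

```cpp
#include <string>
#include <unordered_map>

struct Class { const Class *super; };   // nullptr for a root class
using Method = int (*)();

int answer() { return 42; }             // sample method for illustration

struct Runtime {
    // One map per selector, keyed by the class that implements it.
    std::unordered_map<std::string,
                       std::unordered_map<const Class *, Method>> by_selector;

    void define(const std::string &sel, const Class *cls, Method m) {
        by_selector[sel][cls] = m;
    }

    // Pick the selector's map first, then walk up the class chain.
    Method lookup(const std::string &sel, const Class *cls) const {
        auto table = by_selector.find(sel);
        if (table == by_selector.end()) return nullptr;
        for (; cls != nullptr; cls = cls->super) {
            auto hit = table->second.find(cls);
            if (hit != table->second.end()) return hit->second;
        }
        return nullptr;
    }
};
```

A method that is never overridden leaves its selector's map with a single entry, however many subclasses inherit it.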

Lookup Caching
- Method lookup can be slow or use a lot of memory (data cache)
- Caching lookups can give a performance boost
- In most object-oriented programs, only a small number of classes are used at each callsite
- So keep a per-callsite cache

Callsite Categorisation
- Monomorphic: only one method is ever called
  - Huge benefit from inline caching
- Polymorphic: a small number of methods are called
  - Can benefit from simple inline caching, depending on the pattern
  - Polymorphic inline caching (if sufficiently cheap) helps
- Megamorphic: lots of different methods are called
  - Cache usually slows things down
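
The three categories can be illustrated with a small polymorphic inline cache that stops caching once a callsite has seen too many classes. This is a sketch under assumptions: the limit of four entries, the `CallSite` type and the `method_lookup` stand-in are hypothetical.

```cpp
#include <array>
#include <cstddef>

struct Class;
struct Object { const Class *isa; int value; };
using Method = int (*)(Object *);
struct Class { Method describe; };   // stand-in for a full method table

// Sample methods used only for illustration.
int get_value(Object *o) { return o->value; }
int negated(Object *o) { return -o->value; }

// Stand-in for the slow path; counts calls so the cache's effect is visible.
int slow_lookups = 0;
Method method_lookup(const Class *cls) { ++slow_lookups; return cls->describe; }

struct CallSite {
    static constexpr std::size_t kMaxEntries = 4;  // assumed threshold
    struct Entry { const Class *cls; Method method; };
    std::array<Entry, kMaxEntries> entries{};
    std::size_t used = 0;
    bool megamorphic = false;

    int send(Object *obj) {
        if (!megamorphic) {
            // One entry: monomorphic. A few entries: polymorphic.
            for (std::size_t i = 0; i < used; ++i)
                if (entries[i].cls == obj->isa) return entries[i].method(obj);
        }
        Method m = method_lookup(obj->isa);
        if (!megamorphic) {
            if (used < kMaxEntries) entries[used++] = Entry{obj->isa, m};
            else megamorphic = true;  // too many classes: stop checking the cache
        }
        return m(obj);
    }
};
```

Once `megamorphic` is set, every send goes straight to the slow path, so the callsite no longer pays for cache checks that would usually miss.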

Simple Inline Cache

    // Code generated for the message send: [object amethod: foo];
    static struct { Class cls; Method method; } cache = {0, 0};
    static SEL sel = compute_selector("amethod");
    if (object->isa != cache.cls) {
        // Cache miss: remember the receiver's class and the method found for it
        cache.cls = object->isa;
        cache.method = method_lookup(cache.cls, sel);
    }
    cache.method(object, sel, foo);

What's wrong with this approach?
- Updates?
- Thread-safety?
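
The update problem can be reproduced in a few lines. This is a minimal single-threaded sketch with hypothetical names (`InlineCache`, `send`, `old_impl`, `new_impl`), and it models a class as a single method pointer.

```cpp
struct Object;
using Method = int (*)(Object *);
struct Class { Method amethod; };   // stand-in for a method table
struct Object { Class *isa; };

int old_impl(Object *) { return 1; }
int new_impl(Object *) { return 2; }

struct InlineCache {
    Class *cls = nullptr;
    Method method = nullptr;
};

int send(InlineCache &cache, Object *obj) {
    if (obj->isa != cache.cls) {     // miss: refill from the class
        cache.cls = obj->isa;
        cache.method = obj->isa->amethod;
    }
    return cache.method(obj);        // hit: the class is not consulted at all
}
```

If `amethod` is replaced on the class after the first `send`, later sends through the same cache still call `old_impl`, because only the class pointer is compared. The two cache fields are also written separately, so another thread could observe a new `cls` paired with the old `method`.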

Cache Scope
- Global cache
  - Needs locking or lockless updates for multithreaded languages
  - False sharing problems
- Per-thread cache
  - No contention
  - Accessing TLS can be expensive in shared libraries
- Method-local cache
  - Allocated on the stack
  - No synchronisation needed
  - Access is cheap
  - Cache only persists for the duration of the current function
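
A minimal sketch of the method-local option: the cache is a pair of local variables that lives only for one call of the enclosing function. `sum_values` and the `method_lookup` stand-in are hypothetical names.

```cpp
#include <vector>

struct Object;
using Method = int (*)(const Object *);
struct Class { Method value; };            // stand-in for a method table
struct Object { const Class *isa; int payload; };

int payload_of(const Object *o) { return o->payload; }  // sample method

// Stand-in for the slow path; counts calls so the cache's effect is visible.
int lookups = 0;
Method method_lookup(const Class *cls) { ++lookups; return cls->value; }

int sum_values(const std::vector<const Object *> &objects) {
    // Method-local cache: on the stack, so no synchronisation is needed,
    // but it is lost when this function returns.
    const Class *cached_cls = nullptr;
    Method cached_method = nullptr;
    int total = 0;
    for (const Object *obj : objects) {
        if (obj->isa != cached_cls) {      // cheap check on each iteration
            cached_cls = obj->isa;
            cached_method = method_lookup(cached_cls);
        }
        total += cached_method(obj);
    }
    return total;
}
```

Declaring the two cache variables `thread_local` at namespace scope instead would give the per-thread variant: the cache would survive across calls, at the cost of a TLS access.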

Statically Determining Cache Sites
Policies from the GNUstep Objective-C Runtime:
- Superclass method invocation is always cached (rarely changes)
- Class methods are always cached (rarely change)
- Message sends in loops are cached with an on-stack cache (cheap cache checks)
- Run-time type feedback (as in Self) can improve accuracy

Thread-Safe Inline Caching
- Method lookup returns a cacheable slot (structure) pointer
- Slot contains the method pointer and a version
- Cache contains the slot pointer and a version
- Caches must be updated in a way that avoids data races

Thread-Safe Inline Caching Algorithm

    static int cache_version;
    static struct slot *cached_slot;

    struct slot *slot = lookup_slot(cls, selector);
    cache_version = 0;
    slot->cached_for = cls;
    cached_slot = slot;
    cache_version = slot->version;

- Store 0 in version
- Store slot pointer in cache
- Store version from slot in cache
- All must be sequentially consistent atomic stores
- Slot and version either match, or version is 0

Thread-Safe Inline Caching Lookup

    struct slot *slot = atomic_read(cached_slot);
    if (slot->version == atomic_read(cache_version) &&
        slot->cached_for == cls->isa) {
        slot->method(obj, sel);
    }

- Read slot
- Read version
- Check class matches expected
- Call method

Inline Cache-Safe Method Update
- Replace method in slot
- Increment version for all slots with the same selector in subclasses
- No version increment for the same class (cached slot is still safe to use)
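
The fill, lookup and invalidation steps above can be sketched with `std::atomic`, whose default loads and stores are sequentially consistent. This is a sketch of the store ordering only, not a verified concurrent implementation. `Slot`, `InlineCache`, `cache_fill`, `cache_lookup` and `slot_invalidate` are hypothetical names, and slot versions are assumed to start at 1 so that 0 can mean "invalid".

```cpp
#include <atomic>

struct Object;
using Method = int (*)(Object *);
struct Class {};
struct Object { const Class *isa; };

int sample_method(Object *) { return 42; }   // sample method for illustration

struct Slot {
    std::atomic<int> version{1};             // assumed never 0
    std::atomic<const Class *> cached_for{nullptr};
    Method method = nullptr;
};

struct InlineCache {
    std::atomic<int> version{0};             // 0 marks the cache as invalid
    std::atomic<Slot *> slot{nullptr};
};

// Fill: invalidate, publish the slot, then publish the matching version.
void cache_fill(InlineCache &cache, Slot *slot, const Class *cls) {
    cache.version.store(0);
    slot->cached_for.store(cls);
    cache.slot.store(slot);
    cache.version.store(slot->version.load());
}

// Returns the cached method, or nullptr if the caller must do a full lookup.
Method cache_lookup(InlineCache &cache, const Class *cls) {
    Slot *slot = cache.slot.load();
    if (slot == nullptr) return nullptr;
    if (slot->version.load() == cache.version.load() &&
        slot->cached_for.load() == cls)
        return slot->method;
    return nullptr;
}

// Bumping the slot version invalidates every cache that captured the old one.
void slot_invalidate(Slot &slot) { slot.version.fetch_add(1); }
```

A reader that runs during a fill sees either version 0 or a version that matches the slot, so it never calls through a half-updated cache.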

Prototype-based Languages
- Prototype-based languages (e.g. JavaScript) don't have classes
- Any object can have methods
- Caching per class is likely to hit a lot more cases than caching per object

Hidden Class Transforms
- Observation: most objects don't have methods added to them after creation
- Create a hidden class for every constructor
- Also speeds up property access: the hidden class contains fixed offsets for common properties
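
A minimal sketch of hidden classes with transitions, assuming properties are only ever added and values are doubles. `HiddenClass`, `JSObject` and `with_property` are hypothetical names, not taken from any particular JavaScript engine.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// A hidden class maps property names to fixed slot indices and records
// the transitions taken when a property is added.
struct HiddenClass {
    std::map<std::string, std::size_t> offsets;
    std::map<std::string, std::unique_ptr<HiddenClass>> transitions;

    // Return the hidden class reached by adding `name`, creating it once.
    HiddenClass *with_property(const std::string &name) {
        auto it = transitions.find(name);
        if (it != transitions.end()) return it->second.get();
        auto next = std::make_unique<HiddenClass>();
        next->offsets = offsets;
        next->offsets[name] = offsets.size();  // next free slot index
        HiddenClass *raw = next.get();
        transitions[name] = std::move(next);
        return raw;
    }
};

struct JSObject {
    HiddenClass *hidden;
    std::vector<double> slots;   // property values, indexed by fixed offset

    void set(const std::string &name, double v) {
        auto it = hidden->offsets.find(name);
        if (it == hidden->offsets.end()) {
            hidden = hidden->with_property(name);  // transition to a new class
            slots.push_back(v);
        } else {
            slots[it->second] = v;
        }
    }

    double get(const std::string &name) const {
        return slots.at(hidden->offsets.at(name));
    }
};
```

Objects that receive the same properties in the same order end up sharing one hidden class, so a cache keyed on the hidden class pointer behaves like a per-class cache.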

Type specialisation
- Code paths can be optimised for specific types
- For example, elide dynamic lookup
- Can use static hints, but works best with dynamic profiling
- Must have a fallback for when the guess is wrong
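
A minimal sketch of a specialised path guarded by a type check, with a generic fallback. It assumes a toy dynamic value that is either an integer or a string. `Value`, `add_generic` and `add_specialised` are hypothetical names.

```cpp
#include <string>
#include <variant>

using Value = std::variant<long, std::string>;

std::string to_text(const Value &v) {
    if (const long *n = std::get_if<long>(&v)) return std::to_string(*n);
    return std::get<std::string>(v);
}

// Generic path: inspects both operands and handles every combination.
Value add_generic(const Value &a, const Value &b) {
    const long *x = std::get_if<long>(&a);
    const long *y = std::get_if<long>(&b);
    if (x && y) return Value(*x + *y);
    return Value(to_text(a) + to_text(b));
}

int fallbacks = 0;   // counts how often the guess was wrong

// Path specialised for "both operands are integers".
Value add_specialised(const Value &a, const Value &b) {
    if (a.index() == 0 && b.index() == 0)   // guard: one cheap check
        return Value(std::get<0>(a) + std::get<0>(b));
    ++fallbacks;                            // guess was wrong: take the fallback
    return add_generic(a, b);
}
```

A profiling runtime would only emit something like `add_specialised` at sites where it has observed integers, and could use a counter like `fallbacks` to decide when the specialisation no longer pays off.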

Decompilation
- Branches are expensive
- Code can be emitted to trap on other types (e.g. an illegal instruction causing SIGILL)
- NOPs provide places to insert new instructions
- Stack maps allow mapping from register / stack values to IR values
- New code paths can be created on demand

Trace-based optimisation
- Branching is expensive
- Dynamic programming languages have lots of method calls
- Hot code commonly follows a single path
- Chain together basic blocks from different methods into a trace
- Compile the trace so that the only branches are those leaving it
- Contrast: trace vs basic block (single entry point in both, multiple exit points in a trace)

Questions?