Efficient Separate Compilation of Object-Oriented Languages

Jean Privat, Floréal Morandat, and Roland Ducournau
LIRMM, Université Montpellier II, CNRS
161 rue Ada, 34392 Montpellier cedex 5, France
{privat,morandat,ducour}@lirmm.fr

Position paper at the ICOOOLPS Workshop at ECOOP 2006.

Abstract. Compilers of object-oriented languages used in industry are mainly based on a separate compilation framework. However, knowledge of the whole program improves the efficiency of compilation; therefore the most efficient implementation techniques are global. In this paper, we propose a compromise by including three global compilation techniques in a genuine separate compilation framework.

1 Introduction

According to software engineering, programmers must write modular software. Object-oriented programming has become a major trend because it fulfils this need: heavy use of inheritance and late binding is likely to make software more extensible and reusable.

According to software engineering, programmers also need to produce software in a modular way. Typically, we can identify three advantages: (i) a software component (e.g. a library) can be distributed in a compiled form; (ii) a small modification in the source should not require a recompilation of the whole program; (iii) a single compilation of a software component is enough even if it is shared by many programs. Separate compilation frameworks offer these advantages since source files are compiled independently of future uses, and then linked to produce an executable program.

The problem is that knowledge of the whole program allows more efficient implementation techniques. Previous work therefore uses these techniques in a global compilation framework, which is incompatible with the modular production of software. Global techniques allow an efficient implementation of the three main object-oriented mechanisms: late binding, read and write access to attributes, and dynamic type checking.

In this paper, we present a genuine separate compilation framework that includes three global optimisation techniques. The framework described here can be used for any statically typed class-based language.

The remainder of the paper is organised as follows. Section 2 presents the global optimisation techniques we consider. Section 3 introduces our separate compilation framework. We conclude in Section 4.

2 Global Techniques

Knowledge of the whole program source permits a precise analysis of the behaviour of each component and an analysis of the class hierarchy structure. Each of these allows important optimisations and may be used in any global compiler.

Type Analysis. Statistics show that most method calls are actually monomorphic. In order to detect them, type analysis approximates three mutually dependent sets: the set of classes that have instances (live classes), the concrete type of each expression (the concrete type is the set of potential dynamic types), and the set of called methods for each call site. There are many kinds of type analysis [1]. Even simple ones give good results and can detect many monomorphic calls [2].

Coloring. Coloring is an implementation technique based on Virtual Function Tables (VFT) that avoids the overhead of multiple inheritance [3, 4]. It can be applied to attributes, to methods, and to classes for subtype checks [5–8, 4, 9]. Coloring is a global optimisation that requires knowledge of the whole class hierarchy, and finding an optimal coloring is an NP-hard problem similar to the minimum graph coloring problem. Fortunately, class hierarchies seem to be simple instances of this problem, and many efficient heuristics are proposed in [6, 10, 8].

Binary Tree Dispatch. SmartEiffel [11] introduces an implementation technique for object-oriented languages called binary tree dispatch (BTD). It is a systematisation of techniques known as polymorphic inline caches and type prediction [12]. BTD gives good results because VFT-based calls do not schedule well on modern processors: their unpredictable indirect branches break the pipeline [13].
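
As a minimal sketch of the select-tree idea behind BTD, and not SmartEiffel's actual code generator, the following Python fragment builds a binary decision tree over class identifiers and walks it to reach a statically known target. The integer class identifiers, the names build_select_tree and dispatch, and the three example methods are hypothetical and serve only the illustration; a real compiler would emit the tree as nested comparisons and direct calls in the target language.

```python
def build_select_tree(cases):
    """Build a binary select tree from (class_id, target) pairs sorted by class_id.

    A leaf is a target (here a plain function). An inner node is a tuple
    (pivot, low, high): ids below the pivot go to `low`, the others to `high`.
    """
    if len(cases) == 1:
        return cases[0][1]
    mid = len(cases) // 2
    pivot = cases[mid][0]
    return (pivot, build_select_tree(cases[:mid]), build_select_tree(cases[mid:]))


def dispatch(tree, class_id):
    """Select a target with integer comparisons only; no table is consulted.

    Assumes class_id belongs to the set of ids the tree was built from.
    """
    while isinstance(tree, tuple):
        pivot, low, high = tree
        tree = low if class_id < pivot else high
    return tree


# Hypothetical stand-ins for three compiled methods.
def point_area(_):
    return 0


def square_area(side):
    return side * side


def circle_area(radius):
    return 3 * radius * radius  # crude approximation, for illustration only


# Hypothetical concrete type of one call site: three classes with ids 3, 7, 12.
tree = build_select_tree([(3, point_area), (7, square_area), (12, circle_area)])
selected = dispatch(tree, 7)
print(selected.__name__, selected(4))
```

The sketch relies on the receiver's class always being one of the enumerated cases, which is the property that the type analysis is expected to provide for each call site.
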
BTD requires a global type analysis in order to reduce the number of expected types at each call site. Once the analysis is performed, the knowledge of concrete types makes it possible to implement polymorphism with an efficient select tree that enumerates the types of the concrete type and provides a static resolution for each possible case.

3 Separate Compilation

Separate compilation frameworks are divided into two phases: a local one (compiling) and a global one (linking).

The local phase compiles a single software component (without loss of generality, we consider the compilation units to be classes) independently of the other components. We call the results of this phase binary components (see footnote 1). Binary components are written in the target language of the whole compilation process (e.g. machine language), but they are not functional because some missing information is replaced by symbols. The binary components also contain metadata: debug information, symbol table, etc.

The global phase gathers the binary components of the whole program, collects some metadata, resolves symbols and substitutes them. The result of this phase is a functional executable that is the compiled version of the whole program.

Global techniques can only be applied to this framework during the global phase, since knowledge of the whole program is needed. The problem is that the source of the program has already been compiled into binary components and is no longer available.

The idea of performing optimisations during the global phase is not new. Computing a coloring at link-time was first proposed by [6] but, to our knowledge, this has never been implemented. Other works, [14] and [15], propose a separate compilation framework with global optimisation for Modula-3 and for functional languages respectively. In both cases, the main difference with our approach is that their local phases generate code in an intermediate language. On linking, global optimisations are performed on the whole program, then a genuine global compilation translates this intermediate language into the final language.

[Fig. 1. Local Phase. From the source of a class A, and the external schemas of the classes B and C it depends on, the local phase produces the external schema, the internal schema and the binary of A.]

3.1 Local Phase

The local phase takes as its input the source of a class, and produces as its results the binary and two kinds of metadata: the external schema and the internal schema (Fig. 1). These three parts can be included in the same file or in distinct files, but the external schema should be separately available. The external schema of a class describes its interface: superclasses and definitions of methods and attributes.
Even if the local phase compiles classes independently of their future use, classes still depend on their superclasses and on the classes they use. Thus, the external schemas of these classes must be available or be generated from the source file. In the latter case, a recursive generation may be performed.

Footnote 1. Traditionally, the results of separate compilation are called object files. Because this paper is about object-oriented languages, we chose not to use the traditional name to avoid confusion.

The binary contains symbols. As in standard separate compilation, symbols are used for the addresses of functions and static variables. In our proposition, we also introduce other symbols related to the object-oriented mechanisms: (i) each late binding site is associated with a unique symbol, and compiled as a static direct call to this symbol; (ii) attribute accesses are compiled with a symbol representing the color of the attribute, i.e. the attribute index in the instance; (iii) type checks are compiled with two symbols representing the color and the identifier of the class to test.

The internal schema of a class describes the behaviour of its methods. It gathers class instantiations, late binding sites, attribute accesses and type checks. It also contains the information about the associated symbols. Using a type flow analysis, the internal schema of a method also contains a graph that represents the circulation of types between the entries of the method (the receiver, a parameter, the reading of an attribute, or the result of a method call) and its exits (the result of the method, the writing of an attribute, or the arguments of a method call).

[Fig. 2. Global Phase. The local phase turns the sources of A and B into their external schemas, internal schemas and binaries; the global phase then performs the interclass analysis that yields the live global schema, followed by coloring and symbol substitution.]

3.2 Global Phase

The global phase is divided into three stages: (i) type analysis, which determines the live global schema; (ii) coloring, which computes the colors and identifiers of classes and attributes; and (iii) symbol substitution in the binary (Figure 2).

Type analysis is based on the internal and external schemas of all classes.
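
As a rough illustration of how liveness and concrete types can be derived from such schemas, the sketch below runs a fixpoint in the style of Rapid Type Analysis [2], which is simpler than the type flow analysis used by the framework. The dictionary layout (parents, defines, bodies), the depth-first method lookup and the example classes are all assumptions made for the example, not the format of the actual schemas.

```python
def is_subclass(parents, cls, ancestor):
    """True if cls is ancestor or inherits from it (multiple inheritance allowed)."""
    return cls == ancestor or any(
        is_subclass(parents, p, ancestor) for p in parents[cls])


def lookup(parents, defines, cls, selector):
    """Depth-first method lookup; a simplification of real conflict resolution."""
    if selector in defines[cls]:
        return (cls, selector)
    for parent in parents[cls]:
        found = lookup(parents, defines, parent, selector)
        if found is not None:
            return found
    return None


def type_analysis(parents, defines, bodies, entry):
    """Compute live classes, live methods and the targets of each call site.

    A call site is keyed by (static receiver type, selector); sites sharing
    both are merged, which is harmless at this level of precision.
    """
    live_classes, live_methods, targets = set(), {entry}, {}
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for method in list(live_methods):
            for cls in bodies[method]["new"]:
                if cls not in live_classes:
                    live_classes.add(cls)
                    changed = True
        for method in list(live_methods):
            for site in bodies[method]["calls"]:
                static_type, selector = site
                for cls in live_classes:
                    if is_subclass(parents, cls, static_type):
                        target = lookup(parents, defines, cls, selector)
                        if target not in targets.setdefault(site, set()):
                            targets[site].add(target)
                            changed = True
                        if target not in live_methods:
                            live_methods.add(target)
                            changed = True
    return live_classes, live_methods, targets


# Hypothetical program: Main.run creates a Circle and calls area on a Shape.
parents = {"Shape": [], "Circle": ["Shape"], "Square": ["Shape"], "Main": []}
defines = {"Shape": {"area"}, "Circle": {"area"}, "Square": {"area"}, "Main": {"run"}}
bodies = {
    ("Main", "run"): {"new": ["Circle"], "calls": [("Shape", "area")]},
    ("Shape", "area"): {"new": [], "calls": []},
    ("Circle", "area"): {"new": [], "calls": []},
    ("Square", "area"): {"new": ["Square"], "calls": []},
}
live_classes, live_methods, targets = type_analysis(
    parents, defines, bodies, ("Main", "run"))
print(sorted(live_classes), targets)
```

In this example Square is never instantiated by a live method, so the only call site has a single possible target and would be classified as monomorphic.
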
The live classes and their live attributes and methods are identified, as well as the concrete types of the live call sites.

The coloring stage is performed once the live global schema is obtained. A heuristic [6, 10] produces the identifiers and the colors of the live classes, methods and attributes, as well as the size of the instances.
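
To illustrate what the coloring stage computes for subtype checks, here is a minimal sketch that assumes a plain greedy strategy; it is not the heuristic of [6, 10]. Classes that share a subclass receive different colors, every class gets a table indexed by color, and a subtype check compares one table entry with the identifier of the tested class. The function names, the identifier numbering and the example hierarchy are hypothetical.

```python
def superclasses(parents):
    """Map each class to the set of its superclasses, itself included."""
    closure = {}

    def visit(cls):
        if cls not in closure:
            closure[cls] = {cls}.union(*(visit(p) for p in parents[cls]))
        return closure[cls]

    for cls in parents:
        visit(cls)
    return closure


def color_classes(parents):
    """Greedy coloring: two classes with a common subclass get distinct colors."""
    supers = superclasses(parents)
    conflicts = {cls: set() for cls in parents}
    for sups in supers.values():
        for cls in sups:
            conflicts[cls] |= sups - {cls}
    colors = {}
    # A subclass has strictly more superclasses than any of its superclasses,
    # so this order colors superclasses before their subclasses.
    for cls in sorted(parents, key=lambda c: (len(supers[c]), c)):
        used = {colors[other] for other in conflicts[cls] if other in colors}
        colors[cls] = next(k for k in range(len(parents)) if k not in used)
    return colors


def build_tables(parents, colors):
    """Give each class an identifier and a table holding its superclasses' ids."""
    supers = superclasses(parents)
    ids = {cls: i + 1 for i, cls in enumerate(sorted(parents))}  # 0 marks an empty slot
    width = max(colors.values()) + 1
    tables = {}
    for cls, sups in supers.items():
        row = [0] * width
        for sup in sups:
            row[colors[sup]] = ids[sup]
        tables[cls] = row
    return ids, tables


def is_subtype(tables, ids, colors, sub, sup):
    """One indexed read and one comparison, whatever the hierarchy."""
    return tables[sub][colors[sup]] == ids[sup]


# Hypothetical hierarchy with a diamond (C inherits from A and B).
hierarchy = {"Object": [], "A": ["Object"], "B": ["Object"],
             "C": ["A", "B"], "D": ["Object"]}
colors = color_classes(hierarchy)
ids, tables = build_tables(hierarchy, colors)
print(colors)
print(is_subtype(tables, ids, colors, "C", "B"), is_subtype(tables, ids, colors, "D", "A"))
```

In this sketch all tables have the same width, so the check needs no bounds test; unrelated classes such as A and D may share a color because no class inherits from both.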

The last stage substitutes values for symbols. Colors and identifiers computed during the coloring stage are substituted for the corresponding symbols. For each late binding site, the symbol is replaced according to the polymorphism of the call site. On a monomorphic site, the symbol is replaced by the address of the single method: the result is a direct call. On a polymorphic site, the symbol is replaced by the address of a resolver. Resolvers are small functions, generated at link-time, that select the correct method. On an oligomorphic site, BTD is the most efficient technique, therefore the resolver only contains a select tree whose leaves are static jumps to the correct function. On a megamorphic site, VFT is the most efficient technique, therefore the resolver only contains a jump to the required method in the function table.

4 Conclusion

We have presented a genuine separate compilation framework for statically typed object-oriented languages with multiple inheritance. It includes three global techniques of optimisation and implementation: type analysis, coloring, and binary tree dispatch. Our proposition is a compromise between efficiency and modularity. It brings the efficiency of these global techniques without losing the advantages of separate compilation.

For experiments [16], we developed a compiler prototype called prmc for Prm, an Eiffel-like language. It mainly follows the separate compilation scheme presented in this paper. The only difference is that there is no external schema: the global phase uses the source to perform the type analysis. However, the compilation is still truly separate, since units are compiled separately and then linked.

In comparison with classical separate compilation, the space and time reductions are significant. Monomorphic, oligomorphic and megamorphic method calls are detected by the type analysis, then implemented with the most efficient technique (respectively direct call, BTD, and VFT).
Attribute accesses and subtype checks are implemented with direct access.

Compared with pure global compilers, the performance is honourable. However, from the point of view of efficiency, even if the quality of the type analysis is the same, SmartEiffel and other global compilers keep a strong advantage with their specialisation techniques: method inlining, customisation [17] or heterogeneous generic class compilation [18]. At least, like global compilers, our framework removes the justification for the two uses of the virtual keyword in C++, because the overheads of multiple inheritance (virtual inheritance) and of monomorphic late binding (virtual functions) are removed.

The question of libraries linked at load-time or dynamically loaded at run-time remains open.

References

1. Grove, D., Chambers, C.: A framework for call graph construction algorithms. ACM Trans. Program. Lang. Syst. 23(6) (2001) 685–746
2. Bacon, D.F., Wegman, M., Zadeck, K.: Rapid type analysis for C++. Technical report, IBM Thomas J. Watson Research Center (1996)
3. Lippman, S.B.: Inside the C++ Object Model. Addison-Wesley, New York (NY), USA (1996)
4. Ducournau, R.: Implementing statically typed object-oriented programming languages. Technical Report 02-174, LIRMM, Montpellier (2002)
5. Dixon, R., McKee, T., Schweitzer, P., Vaughan, M.: A fast method dispatcher for compiled languages with multiple inheritance. In: Proc. OOPSLA '89, New Orleans, ACM Press (1989)
6. Pugh, W., Weddell, G.: Two-directional record layout for multiple inheritance. In: Proc. ACM Conf. on Programming Language Design and Implementation (PLDI '90). ACM SIGPLAN Notices, 25(6) (1990) 85–91
7. Cohen, N.H.: Type-extension type tests can be performed in constant time. Programming languages and systems 13(4) (1991) 626–629
8. Vitek, J., Horspool, R.N., Krall, A.: Efficient type inclusion tests. In: Proc. OOPSLA '97. SIGPLAN Notices, 32(10), ACM Press (1997) 142–157
9. Ducournau, R.: Coloring, a versatile technique for implementing object-oriented languages. Technical Report 06-001, LIRMM, Montpellier (2006)
10. Takhedmit, P.: Coloration de classes et de propriétés : étude algorithmique et heuristique. Mémoire de DEA, Université Montpellier II (2003)
11. Zendra, O., Colnet, D., Collin, S.: Efficient dynamic dispatch without virtual function tables: The SmallEiffel compiler. In: Proc. OOPSLA '97. SIGPLAN Notices, 32(10), ACM Press (1997) 125–141
12. Hölzle, U., Chambers, C., Ungar, D.: Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. In America, P., ed.: Proc. ECOOP '91. Volume 512 of LNCS, Springer-Verlag (1991) 21–38
13. Driesen, K., Hölzle, U.: The direct cost of virtual function calls in C++. In: Proc. OOPSLA '96. SIGPLAN Notices, 31(10), ACM Press (1996) 306–323
14. Fernandez, M.F.: Simple and effective link-time optimization of Modula-3 programs. In: SIGPLAN Conference on Programming Language Design and Implementation. (1995) 103–115
15. Boucher, D.: Analyse et Optimisations Globales de Modules Compilés Séparément. PhD thesis, Université de Montréal (1999)
16. Privat, J., Ducournau, R.: Link-time static analysis for efficient separate compilation of object-oriented languages. In Ernst, M., Jensen, T., eds.: Workshop on Program Analysis for Software Tools and Engineering (PASTE '05). (2005) 29–36
17. Chambers, C., Ungar, D.: Customization: Optimizing compiler technology for SELF, a dynamically-typed object-oriented language. In: Proc. OOPSLA '89, New Orleans, ACM Press (1989) 146–160
18. Odersky, M., Wadler, P.: Pizza into Java: Translating theory into practice. In: Proc. POPL '97, ACM Press (1997) 146–159