Introduction. CS 2210 Compiler Design Wonsun Ahn

Similar documents
What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

Jazelle ARM. By: Adrian Cretzu & Sabine Loebner

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

Compilers and Code Optimization EDOARDO FUSELLA

Compilers and Interpreters

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

Introduction to Java Programming

1DL321: Kompilatorteknik I (Compiler Design 1) Introduction to Programming Language Design and to Compilation

1DL321: Kompilatorteknik I (Compiler Design 1)

Computers in Engineering COMP 208. Computer Structure. Computer Architecture. Computer Structure Michael A. Hawker

Compiler Design Spring 2018

Introduction to Programming Languages and Compilers. CS164 11:00-12:30 TT 10 Evans. UPRM ICOM 4029 (Adapted from: Prof. Necula UCB CS 164)

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

Welcome to CSE131b: Compiler Construction

Advances in Compilers

Just-In-Time Compilation

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program.

SKILL AREA 304: Review Programming Language Concept. Computer Programming (YPG)

Computer Components. Software{ User Programs. Operating System. Hardware

1DL321: Kompilatorteknik I (Compiler Design 1) Introduction to Programming Language Design and to Compilation

Usually, target code is semantically equivalent to source code, but not always!

Compiler Design 1. Introduction to Programming Language Design and to Compilation

BEAMJIT, a Maze of Twisty Little Traces

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui

Computer Components. Software{ User Programs. Operating System. Hardware

BEAMJIT: An LLVM based just-in-time compiler for Erlang. Frej Drejhammar

Pioneering Compiler Design

Compiling Techniques

Swift: A Register-based JIT Compiler for Embedded JVMs

CS 360 Programming Languages Interpreters

LESSON 13: LANGUAGE TRANSLATION

HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015!

Trace-based JIT Compilation

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM.

Introduction to Programming: Variables and Objects. HORT Lecture 7 Instructor: Kranthi Varala

CST-402(T): Language Processors

Programming 1. Lecture 1 COP 3014 Fall August 28, 2017

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.

Computer Organization & Assembly Language Programming (CSE 2312)

LECTURE 2. Compilers and Interpreters

What do Compilers Produce?

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Chapter 1. Preliminaries

Compilation I. Hwansoo Han

ECE 2400 / ENGRD 2140 Computer Systems Programming Course Overview

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

Introduction to Mobile Development

Spectre and Meltdown. Clifford Wolf q/talk

Programming 1 - Honors

Concepts of Programming Languages

Accelerating Stateflow With LLVM

Compilers. Lecture 2 Overview. (original slides by Sam

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Chapter 1 Preliminaries

Interaction of JVM with x86, Sparc and MIPS

From High Level to Machine Code. Compilation Overview. Computer Programs

Chapter 9. Introduction to High-Level Language Programming. INVITATION TO Computer Science

Introduction to Java. Lecture 1 COP 3252 Summer May 16, 2017

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

Introduction. L25: Modern Compiler Design

HSA Foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017!

Instruction Set Architecture

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

Principles of Programming Languages. Lecture Outline

A Simple Syntax-Directed Translator

Basic Concepts. Computer Science. Programming history Algorithms Pseudo code. Computer - Science Andrew Case 2

Adding custom instructions to the Simplescalar infrastructure

Compilers. Prerequisites

CS Computer Architecture

Combining Analyses, Combining Optimizations - Summary

Operating- System Structures

Intermediate Code & Local Optimizations

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

Performance Per Watt. Native code invited back from exile with the Return of the King:

CS102 Unit 2. Sets and Mathematical Formalism Programming Languages and Simple Program Execution

Design & Implementation Overview

Chapter 1. Preliminaries

Java and C II. CSE 351 Spring Instructor: Ruth Anderson

The role of semantic analysis in a compiler

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation

Advanced Object-Oriented Programming Introduction to OOP and Java

WebAssembly what? why? whither? Ben L. Titzer Andreas Rossberg Google Germany

COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

Grade Weights. Language Design and Overview of COOL. CS143 Lecture 2. Programming Language Economics 101. Lecture Outline

Ahead of Time (AOT) Compilation

Advanced Topics in MNIT. Lecture 1 (27 Aug 2015) CADSL

1) What is the first step of the system development life cycle (SDLC)? A) Design B) Analysis C) Problem and Opportunity Identification D) Development

CS321 Languages and Compiler Design I. Winter 2012 Lecture 1

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

The MPI API s baseline requirements

Undergraduate Compilers in a Day

Administration. Prerequisites. Meeting times. CS 380C: Advanced Topics in Compilers

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

Chapter 1 INTRODUCTION SYS-ED/ COMPUTER EDUCATION TECHNIQUES, INC.

Java Performance Analysis for Scientific Computing

Compilers and Compiler-based Tools for HPC

Programming for Engineers in Python

Dynamic Languages Strike Back. Steve Yegge Stanford EE Dept Computer Systems Colloquium May 7, 2008

CS 220: Introduction to Parallel Computing. Beginning C. Lecture 2

Transcription:

Introduction CS 2210 Compiler Design Wonsun Ahn

What is a Compiler? Compiler: A program that translates source code written in one language to a target code written in another language Source code: Input to compiler Typically a program written in human readable language (e.g. C) Target code: Output of compiler Typically binary code written in machine language Native machines: x86, ARM, MIPS code Virtual machines: Java bytecode, LLVM bitcode Binary code can later be executed on the given machine Target code for one compiler can be source code for another Javac compiler: Java source code è Java bytecode Java Virtual Machine compiler: Java bytecode è native code Cython: Python source code è C source code Invariant: original semantics must stay the same 2

What is an Interpreter? Interpreter: A program that reads in source code and executes it on the target machine by interpreting it line by line + Software becomes more portable Binary code is tied to a specific machine architecture Source code can run anywhere with correct interpreter + Software becomes more secure and safe Interpreter can perform runtime checks (e.g. to prevent reads of illegal memory locations) + Interpreter is relatively simpler to implement than compiler (Popular for languages with small user base) Interpretation much slower compared to binary execution (Source line has to be interpreted each time it s executed) 3

Just-In-Time (JIT) Compiler Just-In-Time (JIT) Compiler: A compiler that performs translation at runtime (just in time before execution) Traditional compilers are called Ahead-Of-Time (AOT) Typically implemented when an interpreted language gains enough of a user base to warrant its development Pros / Cons compared to AOT compiler + Software can be run in a portable, secure, safe way Slight performance overhead due to JIT compilation time (But much less than interpretation since compiled code can be cached and reused) + Opportunities for better performance By profiling program behavior and optimizing against it By tailoring target code to details of underlying machine 4

AOT vs. JIT Compilation Ahead-Of-Time (AOT) Compilation Just-In-Time (JIT) Compilation Compile Binary Distribute Binary Hardware Translate/ Minify Bytecode/ Source Distribute Bytecode/ Source JIT Compiler Languages: Languages: Hardware 5

Topics Covered

Compiler Phases 7

Front End Group of phases that analyzes the source code and builds one or more internal representations (IRs) out of that analysis Comprised of lexical, syntax, and semantic analyses IRs can be syntax trees, 3-address codes, etc. Lexical Analysis, Input: Source code text Output: Sequence of tokens (smallest unit with meaning) Tokens: Identifiers, keywords, constants, operators Scans text left to right and generates tokens one by one, using language token definitions Also checks for illegal tokens (e.g. malformed identifier) 8

Front End Syntax Analysis Input: Sequence of tokens Output: Syntax tree Adds tokens to a hierarchical structure called a syntax tree that represents program, using language grammar rules Also checks for syntax errors (e.g. malformed if statement) 9

Front End Semantic Analysis Input: Syntax tree Output: Low-level IR (e.g. 3-address code) Allocates memory location for each variable definition Associates variable uses with variable definitions Generates a pseudo-code IR easily translatable to machine code Also checks for semantic errors (e.g. undefined variable) 10

Back End Group of phases that synthesizes machine code from the internal representations (IRs) generated by the front end Comprised of code optimization and code generation Code Optimization Input: Low-level IR (e.g. 3-address code) Output: Optimized low-level IR Modifications to IR such that code runs faster and/or consumes less memory Comprised of multiple optimizations applied in sequence Typically does not change the IR format 11

Back End Code Generation Input: (Optimized) low-level IR Output: Target Code Perform following tasks to transform IR to target code: Instruction Selection select actual machine ISA instructions to implement computation in IR Register Allocation allocate frequently used locations in processor registers An additional code optimization phase may follow code generation to apply target machine specific optimizations 12

Multiple Front Ends and Back Ends Modern compilers typically have multiple front ends E.g. GCC (GNU Compiler Collection) has front ends for C, C++, Fortran, and Java among others Means all front ends generate the same IR format that can be passed to the back end for code generation Modern compilers typically have multiple back ends E.g. GCC has back ends for x86, ARM, SPARC, etc. Means same IR can be translated to multiple targets by the code generator A common IR is central in enabling this diversity Instead of M X N implementations for M languages and N machines, only need M + N implementations 13

Why Learn Compilers?

Why Learn Compilers? Will allow you to write more robust, more efficient programs You will have a deep understanding of how your program will execute on the actual machine Compiler analysis techniques can be used for other purposes Can be used to analyze any kind of structured text data Can be used to translate any format to another format Your research may involve compilers 15

Why Compilers is Still a Research Issue You may think The C language has been around since the 1970s We are still using the C language There is probably not much left to do in terms of compilers But you may be wrong Changes in programming environment is producing new challenges in the front end of the compiler Changes in computer hardware design is producing new pressures in the back end of the compiler 16

Popularity of Programming Languages * Number of programmers coding in given language over time (Reference: www.ohloh.net) Mobile Web AOT Big Data Analysis JIT C/C++ dominated in 2005 Now, JavaScript and Python are the most used languages 17

Front End Challenges Prevalence of scripting languages (JavaScript, Python, ) Need for fast prototyping in a fast changing industry Need for computing in non-cs fields (physics, finance, ) Very flexible by design. E.g. Dynamic Typing: var x = 1, y, z; if ( ) y = 2; else y = 2 ; z = x + y; // z == 3 or z == 12 Challenge: Generating efficient code for such languages Increasingly complex software è increasingly complex bugs Most insidious are heisenbugs (particularly data races) Challenge: Removing or detecting bugs through code analysis Increasing sophisticated security exploits Side-channel attacks (e.g. Spectre / Meltdown exploit) Challenge: Removing information leakage from generated code 18

Back End Challenges Limits in device miniaturization is driving a sea change in HW CPU frequencies scale no more due to power wall CPUs are less reliable due to transistor doping variability Performance must come from parallelism Must run code in parallel to make up for lack of freq. scaling Challenge: Automatically parallelize or vectorize code Processor efficiency must come from heterogeneous architectures Different types of processors on chip for different types of code (e.g. CPUs, GPGPUs, Accelerators, etc ) Challenge: Generating code for this type of architecture Memory efficiency must come from software cache coherence HW manages caches in a reactive manner: not efficient enough Challenge: Generating code that manages caches in software using knowledge of program provided by programmer or analysis 19