Language Translation
Compilation vs. interpretation

Compilation diagram
  Step 1 (compile): program -> compiler -> compiled program
  Step 2 (run):     input -> compiled program -> output

Language Translation
- compilation is translation from one language to another, where the translated form is typically easier to execute; a pure compiler produces code that is executed directly by the hardware
- compilation allows one translation followed by many executions of the executable file (sometimes called a binary file or load module); the compiler can therefore spend a fairly large amount of time on analysis and optimization once, in order to produce an executable that runs quickly each time it is run
- a compiled program typically runs fast but is harder to debug
- compiler example: gcc

Language Translation
Interpretation diagram
  Single step: program + input -> interpreter -> output

Language Translation
- interpretation skips the intermediate step of producing a form of the program in another language; it combines translation and execution
- interpretation starts from the source code each time you want to run the program; it performs the same analysis as a compiler, but on a source-line-by-source-line basis
- a pure interpreter keeps no results from this analysis, even when it encounters the same source line repeatedly within the body of a loop (this means an interpreted program will run faster if you shorten all the variable and function names to one or two characters and remove all the comments, but I don't recommend doing this!)

Language Translation
- an interpreted program typically runs slowly but is easier to debug because of better run-time error diagnostics
- interpreted languages easily support dynamic typing and dynamic scoping of variables
- interpreter examples: shells, m4, or python on the command line; formatted I/O (e.g., printf) also relies on interpretation, since the format string is interpreted at run time
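The remark that formatted I/O relies on interpretation can be illustrated with a small sketch. The helper below, mini_format, is a hypothetical name and not part of any library; it assumes only two directives (%d and %%) and walks the format string character by character at run time, which is the sense in which a printf-style routine acts as an interpreter.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: interprets a format string one character at a
   time, as a printf-style routine must do at run time. Only %d and %%
   are handled; every other character is copied through unchanged.
   Returns the number of characters written, not counting the NUL. */
size_t mini_format(char *out, size_t cap, const char *fmt, int value)
{
    size_t used = 0;
    if (cap == 0)
        return 0;
    for (const char *p = fmt; *p != '\0' && used + 1 < cap; p++) {
        if (p[0] == '%' && p[1] == 'd') {
            /* Directive found: convert the integer argument to text. */
            int n = snprintf(out + used, cap - used, "%d", value);
            if (n < 0)
                break;
            used += ((size_t)n < cap - used) ? (size_t)n : cap - used - 1;
            p++;                      /* skip the 'd' */
        } else if (p[0] == '%' && p[1] == '%') {
            out[used++] = '%';        /* escaped percent sign */
            p++;
        } else {
            out[used++] = *p;         /* ordinary character */
        }
    }
    out[used] = '\0';
    return used;
}
```

Calling mini_format(buf, sizeof buf, "value of a is %d\n", 7) fills buf with the same text that printf would print for that format and argument. The work of finding and decoding %d is repeated on every call, just as a pure interpreter repeats its analysis of a source line.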

Language Translation
Hybrid approach diagram
  Step 1: program -> compiler -> byte code
  Step 2: byte code + input -> JVM -> output

Language Translation
- Java compiler and JVM interpreter: a hybrid translation model
  - "javac" produces byte code, which is easy to interpret
  - "java" interprets byte code
  - provides for portability of byte code files across numerous systems
- Perl also has a hybrid translation model

Language Translation
- other hybrid translation models include just-in-time (JIT) compilers, which compile functions/procedures at run time, on the first call
- terminology: source code that needs to be compiled is typically called a "program", while source code that is interpreted may be called a "script" (but may also be called a "program")

Major translators in the compilation model
1. language preprocessor - textual substitution and conditional compilation (direct execution of special statements)
2. compiler - lexical analysis, parsing, code generation, optimization
3. macro processor - textual substitution and conditional assembly
4. assembler - translates symbols into addresses and machine code

Major translators in the compilation model
5. linker - external symbol resolution plus relocation, produces executable
6. loader - relocation according to load address, produces memory image
(note that many compilers generate object code directly, without calling a separate assembler)

Compile steps
  source (.c)
    -> language preprocessor (cpp): macro expansion and conditional compilation
  expanded source code
    -> compiler (ccom) [compile time]
  assembly language (.s) (.asm)
    -> assembler (as) [assembly time]
  object code (.o) (.obj)
    -> linker (ld) [link time]
  executable load module (a.out) (.exe)

  Side inputs:
  - assembly source w/ macros (.m) -> macro processor (m4): macro expansion and conditional assembly -> assembly language
  - library routines -> linker (static linking)

Load and run steps
  command interpreter (shell): search for file name
    -> executable load module (a.out) (.exe)
    -> loader
    -> memory image (machine language instructions and data)
    -> fetch/decode/execute in CPU

  Dynamic linking of library files (Microsoft DLL) and shared objects (.so):
  - load-time linking (early Windows)
  - run-time linking (most systems)

Translators (language preprocessor, e.g., for C)
- special syntax for preprocessor statements, e.g., #include
- macro facility, #define - trivially used for constant substitution
- conditional compilation, #ifdef - used for versioning

    #ifdef VERBOSE
    printf( "value of a is %d\n", a );
    #endif

  where "#define VERBOSE" is included in the program source or where you compile with "gcc -DVERBOSE"
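As a minimal sketch of conditional compilation, the two functions below (verbose_enabled and scaled, both hypothetical names chosen for illustration) behave differently depending on whether the file is compiled with "gcc -DVERBOSE". The assumption is a standard C compiler whose preprocessor runs before compilation proper.

```c
#include <stdio.h>

/* Returns 1 when this file was compiled with VERBOSE defined
   (for example via "gcc -DVERBOSE"), and 0 otherwise. */
int verbose_enabled(void)
{
#ifdef VERBOSE
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical function: the printf call is present in the object code
   only when VERBOSE is defined; otherwise the preprocessor removes the
   line before the compiler proper ever sees it. */
int scaled(int a)
{
#ifdef VERBOSE
    printf("value of a is %d\n", a);
#endif
    return 2 * a;
}
```

The return value of scaled is the same in both builds; only the diagnostic output differs, which is why the technique suits versioning.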

Translators (compiler)
- lexical analysis: extracting lexical items ("tokens") from the input
- syntactic analysis: parsing statements according to the grammar rules of the language; generates a parse tree
- semantic analysis: determining the meaning of operations according to the datatypes of the variables in the parse tree; may involve adding conversion operators to the parse tree
- intermediate code generation

Translators (compiler)
- machine-independent optimizations, e.g., loop transformations
- machine-specific code generation and register allocation
- machine-dependent optimizations, e.g., branch delay slot scheduling

Translators (compiler)
consider the statement a = b + 2*c; in the following code

    float a,b;
    extern float c;
    ...
    a = b + 2*c;
    ...

lexical analysis extracts eight tokens and assigns symbolic identifiers to entries in the symbol table

    `a'        `='  `b'        `+'  `2'  `*'  `c'        `;'
    symtab[0]  `='  symtab[1]  `+'  `2'  `*'  symtab[2]  `;'
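The token count can be checked with a small sketch of a scanner. The function count_tokens is a hypothetical name; it assumes a simplified token model in which identifiers and integer literals are multi-character tokens and every other non-blank character is a one-character token, which is far less than a real C lexer handles.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the tokens in a C-like statement under a simplified model:
   identifiers and integer literals are multi-character tokens, and
   every other non-blank character is a one-character token. */
size_t count_tokens(const char *src)
{
    size_t count = 0;
    const char *p = src;
    while (*p != '\0') {
        unsigned char ch = (unsigned char)*p;
        if (isspace(ch)) {
            p++;                               /* skip white space */
        } else if (isalpha(ch) || ch == '_') {
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
            count++;                           /* identifier */
        } else if (isdigit(ch)) {
            while (isdigit((unsigned char)*p))
                p++;
            count++;                           /* integer literal */
        } else {
            p++;
            count++;                           /* operator or punctuation */
        }
    }
    return count;
}
```

For the statement on the slide, count_tokens("a = b + 2*c;") returns 8. A real lexer would also classify each token and enter the identifiers into the symbol table.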

Translators (compiler)
syntactic analysis builds a parse tree

            =
           / \
    symtab[0]  +
              / \
      symtab[1]  *
                / \
             `2'   symtab[2]

Translators (compiler)
semantic analysis determines meaning

                =:float
               /       \
     symtab[0]:float    +:float
                       /       \
         symtab[1]:float        *:float
                               /       \
                convert_to_float        symtab[2]:float
                       |
                      `2'

Translators (compiler)
intermediate code generation yields something like

    convert_to_float( 2, temp_float_0 )
    multiply_float( temp_float_0, symtab[2], temp_float_1 )
    add_float( symtab[1], temp_float_1, temp_float_2 )
    store_float( temp_float_2, symtab[0] )

Translators (compiler)
machine-independent optimization either performs the int-to-float conversion of the constant 2 at compile time or strength-reduces the multiply by 2 to an add

    add_float( symtab[2], symtab[2], temp_float_1 )
    add_float( symtab[1], temp_float_1, temp_float_2 )
    store_float( temp_float_2, symtab[0] )

from this, registers would be assigned and ARM code would be generated (including storage allocation and addressing for variables)
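The strength reduction can be written out at the source level as a sketch. The function names assign_original and assign_reduced are hypothetical; the two bodies mirror the intermediate code before and after the optimization, and the assumption is ordinary binary floating point, where doubling a value and adding it to itself give the same result.

```c
/* The statement as written: a = b + 2*c. The int constant 2 is
   converted to float before the multiply. */
float assign_original(float b, float c)
{
    return b + 2 * c;
}

/* After strength reduction: the multiply by 2 has become an add,
   mirroring add_float( symtab[2], symtab[2], temp_float_1 ). */
float assign_reduced(float b, float c)
{
    float temp_float_1 = c + c;
    return b + temp_float_1;
}
```

Both functions compute the same value for the same arguments; the second form simply avoids the multiply, which on many machines is the more expensive operation.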

Translators (macro processor)
- simple abstraction through textual substitution ("open" subroutines)
- provides either keyword or positional parameter substitution
- extends instruction set by synthesizing instructions using macro definitions

Translators (macro processor)
- the cost is paid at assembly time, when the macro definition is expanded, rather than at run time in the form of procedure call, register save/restore, and procedure return
- conditional assembly is the same idea as the #ifdef facility of the C preprocessor

Translators (macro processor)
comparison of macro with run-time functions

                   macro                         function
    invocation     in-line substitution          run-time call and return
    parameters     untyped; evaluated at         typed; evaluated once at
                   each appearance               time of call
    trade-offs     fast, but one copy of code    more overhead per call, but
                   at each call site             only one copy of code
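The "evaluated at each appearance" row can be demonstrated with the C preprocessor standing in for an assembler macro processor. This is a sketch under that substitution; SQUARE_MACRO, square_function, next_value, and the two counting functions are hypothetical names introduced only for illustration.

```c
/* Macro: in-line textual substitution; the argument text is repeated
   wherever the parameter appears. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Function: the argument is evaluated once, at the time of the call. */
static int square_function(int x)
{
    return x * x;
}

static int calls = 0;   /* how many times next_value() has run */

static int next_value(void)
{
    calls++;
    return 3;
}

/* Each function returns the number of times the argument expression
   was evaluated. */
int evaluations_with_macro(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());    /* expands to two calls */
    return calls;
}

int evaluations_with_function(void)
{
    calls = 0;
    (void)square_function(next_value()); /* a single call */
    return calls;
}
```

The macro version evaluates its argument twice and the function version once, which matters whenever the argument has side effects or is expensive to compute.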

Translators (assembler)
- translates a program written in assembly language to binary machine code, resolving local symbolic addresses
- typically this is a 1-to-1 translation (one machine instruction per assembly statement)

Translators (assembler)
- forward references generally require 2-pass assemblers
  - pass 1: find symbolic labels and assign them addresses
    - run location counter (virtual instruction pointer)
    - determine instruction size
    - record addresses in symbol table
  - pass 2: use symbol table information to construct instructions (symbolic -> binary)
- alternative to the 2-pass approach is 1-pass with fixup (i.e., backpatching)
- other assembler facilities include data layout directives (pseudo-ops)
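The two passes can be sketched for a toy program. Everything here is an assumption made for illustration: the struct layout, the names assemble and lookup, a fixed instruction size of one address unit, and a symbol table limited to 16 entries. A real assembler would also encode opcodes and handle variable instruction sizes.

```c
#include <string.h>

#define MAX_SYMBOLS 16

/* One line of a toy assembly program. Every instruction is assumed to
   occupy exactly one address unit. */
struct asm_line {
    const char *label;    /* label defined on this line, or NULL */
    const char *opcode;   /* mnemonic, kept only for readability */
    const char *target;   /* label referenced as operand, or NULL */
};

struct symbol {
    const char *name;
    int address;
};

static int lookup(const struct symbol *tab, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].address;
    return -1;            /* undefined symbol */
}

/* Fills resolved[i] with the address of line i's target (-1 if the line
   has no target or the label is undefined). Returns the number of
   symbols recorded in pass 1. */
int assemble(const struct asm_line *prog, int n, int *resolved)
{
    struct symbol symtab[MAX_SYMBOLS];
    int nsyms = 0;

    /* Pass 1: run the location counter and record label addresses. */
    int location = 0;
    for (int i = 0; i < n; i++) {
        if (prog[i].label != NULL && nsyms < MAX_SYMBOLS) {
            symtab[nsyms].name = prog[i].label;
            symtab[nsyms].address = location;
            nsyms++;
        }
        location += 1;    /* instruction size */
    }

    /* Pass 2: use the symbol table to turn symbolic operands into
       addresses. Forward references now resolve like any other. */
    for (int i = 0; i < n; i++)
        resolved[i] = (prog[i].target != NULL)
                          ? lookup(symtab, nsyms, prog[i].target)
                          : -1;
    return nsyms;
}
```

A jump to a label defined later in the program cannot be resolved when it is first read, which is exactly why pass 1 must finish before pass 2 starts. A 1-pass assembler with fixup would instead remember the unresolved jump and patch it when the label appears.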

Translators (linker)
- separate assembly or compilation means the assembler does not know all the addresses; thus the assembler produces only partially resolved object files
- linker combines separate object files into a single executable
  - lay out pieces of code & data (storage allocation based on sizes)
  - resolve external references
  - perform relocation of absolute addresses

Translators (linker)
two pass:
1. assign code and data to memory addresses and build symbol table from public symbols
2. use table to resolve external addresses and produce load module

Translators (linker)
object module file format (this is early UNIX; ELF is more complex)
- header (includes sizes of text, data, and bss sections)
- text section (read only)
- data section (read/write)
- relocation/external symbol entries for text section
- relocation/external symbol entries for data section
- symbol table
- string table (symbol table entries index into string table)

Translators (command interpreter)
command interpreter (shell) - a program that reads command lines from the keyboard (or from a script file) and either executes the command directly or searches for an executable file with that command name, then loads it and branches to the loaded program

Translators (loader)
bring a program into memory in preparation for execution
- read file header to find size of pieces
- allocate memory area(s)
- read instructions and data from file into memory
- relocation - adjusting absolute addresses relative to load point
- jump to startup code
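The relocation step can be sketched as follows. The function relocate and its parameters are hypothetical; the sketch assumes the image is an array of words and that the positions of the words holding absolute addresses are already known, as they would be from the relocation entries in the object file.

```c
#include <stddef.h>

/* Adds the load address to every word of the image that holds an
   absolute address. reloc_offsets lists the indices of those words;
   in a real object format this list comes from the relocation entries. */
void relocate(unsigned long *image, size_t image_len,
              const size_t *reloc_offsets, size_t reloc_count,
              unsigned long load_base)
{
    for (size_t i = 0; i < reloc_count; i++) {
        size_t where = reloc_offsets[i];
        if (where < image_len)          /* ignore out-of-range entries */
            image[where] += load_base;
    }
}
```

Words not named in the relocation list, such as constants and position-independent data, are left untouched, so the same file can be loaded at different addresses.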

Binding times
The assembler, linker, and loader are all programs taking input files and producing output. Decisions and translations made by these programs are said to be done at "assembly time", at "link time", and at "load time", respectively. Actual execution (i.e., instruction interpretation by the hardware, such as performing adds, branches, etc.) takes place at "run time".

Binding times
During execution, you can also talk of things happening at specific times, such as register saving at procedure call time.

Dynamic linking is an example of a late decision, or "late binding". It is the linking of separate procedures at either load time or run time, and it typically requires that the normal (static) linker include a simple table that names the needed routines (for load-time linking) or include simple "stub" routines that find and link to the shared library routines on their first calls (for run-time linking).
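The stub idea for run-time linking can be sketched with a function pointer. All names here (library_add, add_stub, add_entry, call_add, lookups_so_far) are hypothetical, and the "lookup" is simulated by a direct assignment; a real stub would search the shared library's symbol table.

```c
/* Stand-in for a routine that lives in a shared library. */
static int library_add(int x, int y)
{
    return x + y;
}

static int resolve_count = 0;          /* how often the stub did a lookup */

static int add_stub(int x, int y);

/* Call sites go through this pointer, which initially targets the stub. */
static int (*add_entry)(int, int) = add_stub;

/* Stub: on the first call, "find" the real routine, patch the pointer,
   then forward the call. Later calls bypass the stub entirely. */
static int add_stub(int x, int y)
{
    resolve_count++;
    add_entry = library_add;           /* a real stub would search the library */
    return add_entry(x, y);
}

int call_add(int x, int y)
{
    return add_entry(x, y);
}

int lookups_so_far(void)
{
    return resolve_count;
}
```

After any number of calls to call_add, lookups_so_far() reports 1: the binding decision is made once, at the first call, and routines that are never called are never looked up at all.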

Binding times
Another form of delayed binding is "just-in-time" (JIT). This is used in several Java compilers, where methods are not compiled until the first call.

Many storage allocation decisions are made at each step. For example, offsets are assigned to labels at assembly time, under the assumption that any absolute addresses will be updated by the linker and loader later. (When we later study virtual memory, we will see that it is also an example of late binding - specifically one where physical memory allocation decisions that might be made by a traditional loader are instead deferred to run time and made by the operating system.)

Other programming tools
other programming tools / components of a program development environment
- editors (e.g., vim, gedit, emacs)
- beautifiers (e.g., indent)
- project control (e.g., make)
- version control (e.g., sccs)
- GUI toolkit (e.g., widget library)
- test coverage (e.g., gcov)
- debuggers (e.g., gdb, dbx, ddd)

Other programming tools
debugging tools (e.g., Purify) detect
- reading or writing beyond the bounds of an array
- reading or writing freed memory
- freeing memory multiple times
- reading uninitialized memory
- reading or writing through null pointers
- overflowing the stack by recursive function calls
- reading or writing memory addresses on which a watch-point has been set

Other programming tools
- portability advisors (e.g., lint)
- style checkers (e.g., CodeCheck) flag
  - exceeding a given input line length
  - exceeding a given nesting depth of if-else statements
  - not aligning open and close curly braces (Horstmann)
- performance profilers (e.g., gprof)