Compiler Front-End. Compiler Back-End. Specific Examples

Similar documents
Yacc: A Syntactic Analysers Generator

Introduction. 1.1 Literate Programs

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Using an LALR(1) Parser Generator

An Introduction to LEX and YACC. SYSC Programming Languages

TDDD55- Compilers and Interpreters Lesson 3

Compiler Lab. Introduction to tools Lex and Yacc

Syntax-Directed Translation

Syntax Analysis Part IV

TDDD55 - Compilers and Interpreters Lesson 3

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

CS4850 SummerII Lex Primer. Usage Paradigm of Lex. Lex is a tool for creating lexical analyzers. Lexical analyzers tokenize input streams.

Lexical and Syntax Analysis

Lex & Yacc (GNU distribution - flex & bison) Jeonghwan Park

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

Yacc Yet Another Compiler Compiler

COMPILER CONSTRUCTION Seminar 02 TDDB44

Lexical and Parser Tools

Introduction to Lex & Yacc. (flex & bison)

Syntax-Directed Translation Part I

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Lexical Analyzer Scanner

Overview of Compiler. A. Introduction

Lexical Analyzer Scanner

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

CSE302: Compiler Design

Applications of Context-Free Grammars (CFG)

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

LECTURE 11. Semantic Analysis and Yacc

PRACTICAL CLASS: Flex & Bison

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Figure 2.1: Role of Lexical Analyzer

Static Checking and Intermediate Code Generation Pat Morin COMP 3002

CSCI Compiler Design

Module 8 - Lexical Analyzer Generator. 8.1 Need for a Tool. 8.2 Lexical Analyzer Generator Tool

Edited by Himanshu Mittal. Lexical Analysis Phase

Reverse Engineering Low Level Software. CS5375 Software Reverse Engineering Dr. Jaime C. Acosta

Summary: Direct Code Generation

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

UNIT III & IV. Bottom up parsing

The structure of a compiler

Compiler construction assignments

Compiler Code Generation COMP360

A programming language requires two major definitions A simple one pass compiler

VIVA QUESTIONS WITH ANSWERS

Preparing for the ACW Languages & Compilers

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Over-simplified history of programming languages

Compil M1 : Front-End

Automatic Scanning and Parsing using LEX and YACC

Principles of Programming Languages

TDDD55- Compilers and Interpreters Lesson 2

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process

Advances in Compilers

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

CSC 2400: Computer Systems. Using the Stack for Function Calls

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

Chapter 4. Lexical and Syntax Analysis

Syntax Analysis The Parser Generator (BYacc/J)

Parsing and Pattern Recognition

LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

Chapter 2: Syntax Directed Translation and YACC

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam

CIS 341 Midterm February 28, Name (printed): Pennkey (login id): SOLUTIONS

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Intermediate Code Generation

Parsing How parser works?

Compiler construction in4303 lecture 9

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Introduction to Compiler Design

Chapter 2 A Quick Tour

Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Syntax-Directed Translation. Lecture 14

Programming Language Syntax and Analysis

Instruction Set Architecture

Compiler Construction Assignment 3 Spring 2018

Syntax-Directed Translation. Introduction

CS606- compiler instruction Solved MCQS From Midterm Papers

Flex and lexical analysis

Context-free grammars

General Overview of Compiler

G Programming Languages - Fall 2012

Group B Assignment 9. Code generation using DAG. Title of Assignment: Problem Definition: Code generation using DAG / labeled tree.

Digital Forensics Lecture 3 - Reverse Engineering

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

CSE302: Compiler Design

Semantic actions for declarations and expressions

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

4. Lexical and Syntax Analysis

Project 2 Interpreter for Snail. 2 The Snail Programming Language

The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

Compiler Construction: Parsing

Compiler Design 1. Yacc/Bison. Goutam Biswas. Lect 8

x86 assembly CS449 Fall 2017

Syntax Directed Translation

CIS 341 Midterm March 2, 2017 SOLUTIONS

tokens parser 1. postfix notation, 2. graphical representation (such as syntax tree or dag), 3. three address code

Transcription:

Compiler design

Overview Compiler Front-End What is a compiler? Lexical Analysis Syntax Analysis Parsing Compiler Back-End Code Generation Register Allocation Optimization Specific Examples lex yacc lcc

What is a Compiler? Example of tasks of compiler 1. Add two numbers 2. Move numbers from one location to another 3. Move information between CPU and memory Software Translator

Lexical Analysis First phase of compiler isolate words/tokens Example of tokens: key words while, procedure, var, for,.. identifier declared by the programmer Operators +, -, *, /, <>, Numeric numbers such as 124, 12.35, 0.09E-23, 09E etc. Character constants Special characters Comments

Syntax Analysis What is Syntax Analysis? Second phase of the compiler Also called Parser What is the Parsing Problem? How is the Parsing problem solved? Top-down and Bottom-up algorithm

Top-Down Parsing What does it do? One Method: Pushdown Machine Example: Consider the simple grammar: 1. S 0 S 1 A 2. S 1 0 A 3. A 0 S 0 4. A 1

Example Process to construct a Pushdown Machine 1. Build a table with each column labeled by a terminal symbol (and endmarker ) and each row labeled by a nonterminal or terminal symbol (and bottom marker ) 2. For each grammar rule of the form A aα, fill in the cell in row A and column a with with: REP(αra), retain, where αr represents α reversed 3. Fill in the cell in row a and column a with pop, advance, for each terminal symbol a. 4. Fill in the cell in row and column with Accept. 5. Fill in all other cells with Reject. 6. Initialize the stack with and the starting terminal.

Bottom-Up Parsing What does it do? Two Basic Operations: 1. Shift Operation 2. Reduce Risk Operation

Why Split the Compiler Front- End is Machine Independent Front-End can be written in a high level language Re-use Oriented Programming Back-End is Machine Dependent Lessens Time Required to Generate New Compilers Makes developing new programming languages simpler

Code Generation Convert functions into simple instructions Simple Complex Addressing the operands Base Register Offset Examples

Single Pass vs. Multiple pass Single pass Creates a table of Jump Instructions Forward Jump Locations are generated incompletely Jump Addresses entered into a fix-up table along with the label they are jumping to As label destinations encountered, it is entered into the table of labels l After all inputs are read, CG revisits all of these problematic jump instructions ti Multiple pass No Fix-Up table In the first pass through the inputs, CG does nothing but generate table of labels. Since all labels are now defined, whenever a jump is encountered, all labels already have pre-defined memory location. Possible problem: In first pass, CG needs to know how many MLI correspond to a label. Major Drawback-Speed

Register Allocation Assign specific CPU registers for specific values CG must maintain information on which registers: Are used for which purposes Are available for reuse Main objective: Maximize the utilization of the CPU registers Minimize references to memory locations Possible uses for CPU registers Values used many times in a program Values that are computationally expensive Importance? Efficiency Speed

An Example Example - For the following 2 statement program segment, determine a smart register allocation scheme: A = B + C * D B = A C * D Simple Register Allocation LOD (R1,C) MUL (R1,D) STO (R1,Temp) LOD (R1,B) ADD (R1,Temp) STO (R1,A) LOD (R1,C) MUL (R1,D) STO (R1,Temp2) LOD (R1,A) SUB (R1,Temp2) STO (R1,B) Net Result 12 instructions and memory ref. Smart Register Allocation LOD (R1,C) MUL (R1,D) C*D LOD (R2,B) ADD (R2,R1) B+C*D STO (R2,A) SUB (R2,R1) A-C*D STO (R2,B) Net Result 7 instruc. And 5 mem. refs.

Register Allocation Algorithm RAA determines how many registers will be needed to evaluate an expression. D t i th S i hi h b Determines the Sequence in which subexpressions should be evaluated to minimize register use

How does RAA work? Construct a tree starting at the bottom nodes Assign each leaf node a weight of: 1 if it is the left child 0 is it is the right child The weight of each parent node will be computed by the weights of the 2 children as follows: If the 2 children have different weights, take the max. If the weights are the same, the parent s weight is w+1 The number of CPU registers is determined by the The number of CPU registers is determined by the highest summed weight at any stage in the tree.

Example of RAA Example - For the following 2 statement program segment, determine a smart register allocation scheme: A*B (C+D) * (E+F) LOD (R1,c) ADD (R1,d) R1 = c +d LOD (R2,e) ADD (R2,f) R2 = e + f MUL (R1,R2) R1 = (c + d) * (e + f) LOD (R2,a) MUL (R2,b) R2 = a * b SUB (R2,R1) R2 = a * b - (c + d) * (e + f)

Optimization Global Directed Acyclic Graphs (DAGs) Data Flow Analysis Moving Loop Invariant Code Other Mathematical Transformations Local Load Store Optimization Jump over Jump Optimization Simple Algebraic Optimization

Analysis of specific compilers Programs to be discussed: lex Programming utility that generates a lexical analyzer yacc Parser generator lcc -ANSI C compiler Platforms: All three programs designed for use on Unix lcc runs under DOS and Unix

lex Programming Utility General Information: Input is stored in a file with *.l extension File consists of three main sections lex generates C function stored in lex.yy.c Using lex: 1) Specify words to be used as tokens (Extension of regular expressions) 2) Run the lex utility on the source file to generate yylex( ), a C function 3) Declares global variables char* yytext and int yyleng

lex Programming Utility Three sections of a lex input file: /* C declarations and #includes lex definitions */ %{ #include header.c int i; }% %% /* lex patterns and actions */ {INT} {sscanf (yytext, %d, &i); printf( INTEGER\n );} %% /* C functions called by the above actions */ {yy yylex(): }

yacc Parser Generator General Information: Input is specification of a language Output is a compiler for that language yacc generates C function stored in y.tab.c Public domain version available bison Using yacc: 1) Generates a C function called yyparse() 2) yyparse() may include calls to yylex() 3) Compile this function to obtain the compiler

yacc Parser Generator yacc source yacc y.tab.c cc aout a.out #include lex.yy.c lex source lex lex.yy.c Input source file similar to lex input file Declarations, Rules, Support routines Four parts of output atom: (Operation, Left Operand, Right Operand, Result)

lcc Compiler General Information: Retargetable ANSI C compiler (machine specific parts that are easy to replace) Different stages of code: 1. Preprocessed code 2. Tokens 3. Trees 4. DAG (directed acyclic graphs) 5. Assembly language Test program: int round(f) float f; { return f+0.5; /* truncates t the variable f */ }

lcc Compiler Token Stream Tokens Values Tokens Values INT inttype { ID round RETURN ( ID f ID f + ) FCON 0.5 FLOAT floattype ; ID f } ; EOI

lcc Compiler ASGN+F Syntax Trees RET+I ADDRF+P f (float) callee CVD+F INDIR+D ADDRF+P f (Double) caller CVF+D INDIR+F ADDRF+P CVD+I ADD+D CNST+D (0.5)

lcc Compiler Register Assembler Template fld qword ptr %a[ebp] \n fstp dword ptr %a[ebp] \n fld dword ptr %a[ebp] \n #nop \n fadd qword ptr %a \n sub esp, 4 \n fistp dword ptr 0[esp] \n eax pop %c \n #ret \n

Conclusion Compiler Front-End Compiler Back-End Specific Examples