CSE302: Compiler Design

Similar documents
Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Compiler course. Chapter 3 Lexical Analysis

Chapter 3 Lexical Analysis

Lexical Analysis. Prof. James L. Frankel Harvard University

Lexical Analysis. Implementing Scanners & LEX: A Lexical Analyzer Tool

CSE302: Compiler Design

PRACTICAL CLASS: Flex & Bison

PRINCIPLES OF COMPILER DESIGN UNIT II LEXICAL ANALYSIS 2.1 Lexical Analysis - The Role of the Lexical Analyzer

Lexical Analyzer Scanner

Lexical Analyzer Scanner

David Griol Barres Computer Science Department Carlos III University of Madrid Leganés (Spain)

Lex Spec Example. Int installid() {/* code to put id lexeme into string table*/}

An introduction to Flex

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions.

UNIT -2 LEXICAL ANALYSIS

Module 8 - Lexical Analyzer Generator. 8.1 Need for a Tool. 8.2 Lexical Analyzer Generator Tool

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Figure 2.1: Role of Lexical Analyzer

CS308 Compiler Principles Lexical Analyzer Li Jiang

Lexical Analysis (ASU Ch 3, Fig 3.1)

Typical tradeoffs in compiler design are: speed of compilation size of the generated code speed of the generated code Speed of Execution Foundations

Using Lex or Flex. Prof. James L. Frankel Harvard University

CSE302: Compiler Design

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

Scanning. COMP 520: Compiler Design (4 credits) Professor Laurie Hendren.

2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS

CSC 467 Lecture 3: Regular Expressions

Recognition of Tokens

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Implementation of Lexical Analysis

Lexical Analysis - 2

Compiler Construction

Lexical Analysis. Chapter 2

Lexical Analysis. Lecture 3. January 10, 2018

[Lexical Analysis] Bikash Balami

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

Implementation of Lexical Analysis

Compiler Construction

Lexical Analysis. Lecture 2-4

Lecture Outline. COMP-421 Compiler Design. What is Lex? Lex Specification. ! Lexical Analyzer Lex. ! Lex Examples. Presented by Dr Ioanna Dionysiou

Chapter 4. Lexical analysis. Concepts. Lexical scanning Regular expressions DFAs and FSAs Lex. Lexical analysis in perspective

CSE302: Compiler Design

Introduction to Lexical Analysis

2. λ is a regular expression and denotes the set {λ} 4. If r and s are regular expressions denoting the languages R and S, respectively

Edited by Himanshu Mittal. Lexical Analysis Phase

THE COMPILATION PROCESS EXAMPLE OF TOKENS AND ATTRIBUTES

Lexical Analysis. Implementation: Finite Automata

CSE302: Compiler Design

CS415 Compilers. Lexical Analysis

CS 403: Scanning and Parsing

Lexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual

COMPILER DESIGN UNIT I LEXICAL ANALYSIS. Translator: It is a program that translates one language to another Language.

Parsing and Pattern Recognition

Zhizheng Zhang. Southeast University

Scanning. COMP 520: Compiler Design (4 credits) Alexander Krolik MWF 13:30-14:30, MD 279

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

Type 3 languages. Regular grammars Finite automata. Regular expressions. Deterministic Nondeterministic. a, a, ε, E 1.E 2, E 1 E 2, E 1*, (E 1 )

1. INTRODUCTION TO LANGUAGE PROCESSING The Language Processing System can be represented as shown figure below.

CS4850 SummerII Lex Primer. Usage Paradigm of Lex. Lex is a tool for creating lexical analyzers. Lexical analyzers tokenize input streams.

Syntax Analysis Part IV

The Language for Specifying Lexical Analyzer

Concepts. Lexical scanning Regular expressions DFAs and FSAs Lex. Lexical analysis in perspective

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

2. Lexical Analysis! Prof. O. Nierstrasz!

Front End: Lexical Analysis. The Structure of a Compiler

Lexical Analysis. Introduction

Lexical Analysis. Lecture 3-4

UNIT II LEXICAL ANALYSIS

Principles of Compiler Design Presented by, R.Venkadeshan,M.Tech-IT, Lecturer /CSE Dept, Chettinad College of Engineering &Technology

Compiler phases. Non-tokens

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

Implementation of Lexical Analysis

Ulex: A Lexical Analyzer Generator for Unicon

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

Introduction to Lex & Yacc. (flex & bison)

Preparing for the ACW Languages & Compilers

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday

TDDD55- Compilers and Interpreters Lesson 2

Chapter 3 -- Scanner (Lexical Analyzer)

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Outline. 1 Scanning Tokens. 2 Regular Expresssions. 3 Finite State Automata

LEX/Flex Scanner Generator

Automatic Scanning and Parsing using LEX and YACC

Lexical Analysis 1 / 52

Flex and lexical analysis

Projects for Compilers

Lexical and Parser Tools

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

The structure of a compiler

CMSC 350: COMPILER DESIGN

Writing a Lexical Analyzer in Haskell (part II)

Lexical Analysis. Finite Automata

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

Yacc: A Syntactic Analysers Generator

CSCI312 Principles of Programming Languages!

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

Transcription:

CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 13, 2007

Outline Recap The lexical-analyzer generator Lex Implementing lexical-analyzer generators Summary and homework

Overview of Flex Flex is a scanner generator Input is description of patterns and actions Output is a C program which contains a function yylex() which when called matches patterns and performs actions per input Execute the unix command man flex for full information

Overview of Flex Compile using Flex tool Results in C code Compile using C compiler Link to the flex library (-lfl) Run the executable and recognize tokens Lex source program: lex.l Lex compiler lex.yy.c lex.yy.c C compiler a.out Input stream a.out sequence of tokens

Flex Source Program Format %{ declarations %} regular definitions %% translation rules %% auxiliary procedures/functions

Commands flex <prog_name>.l On CSE Department Suns, flex is in /usr/sfw/bin/flex gcc o sample lex.yy.c -lfl sample < input.text flex generates a main routine that is not needed when parsing with Yacc-generated parser

Some Functions and Variables yylex() The primary function generated input() Returns the next char from the input unput(int c) Returns char c to input yylval // Used to pass values to parser yytext // String with token from input yyleng // Length of string yyin // File handle yyin = fopen(args[0], r )

Regular Expressions For Tokens ws (blank tab newline)+

Example Lex Source Programs %{ /* definitions of constants LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ID, NUMBER, RELOP */ %} delim [ \t\n] ws {delim}+ letter [A-Za-z] digit [0-9] id {letter}({letter} {digit})* number {digit}+(\.{digit}+)?(e[+-]?{digit}+)? %% {ws} {/* no action */} if {return("if");} then {return("then");} else {return("else");} {id} {return("id );} {number} {return( NUMBER );} "<" {return("relop");} "<=" {return("relop");} "=" {return("relop");} "<>" {return("relop");} ">" {return("relop");} ">=" {return("relop");} %%

Conflict Resolution in Lex When several prefixes of the input match one or more patterns Always prefer a longer prefix to a shorter prefix If the longest possible prefix matches two or more patterns, prefer the pattern listed first in the Lex source program if i>0 then i=1 else i=0

Outline Recap The lexical-analyzer generator Lex Implementing lexical-analyzer generators Summary and homework

Implementing Lexical-Analyzer Generators Regular expressions Nondeterministic finite automata Nondeterministic finite automata Deterministic finite automata Deterministic finite automata A lexer Regular expressions Deterministic finite automata Deterministic finite automata A lexer

Deterministic Finite Automata A Lexer A transition table based approach s = 1; while( s!=acceptstate and s!=errorstate) { c = next input character; s = T[s,c]; } Characters in the alphabet c States s States representing transitions T(s,c)

Deterministic Finite Automata A finite set of states S. A set of input symbols or characters Σ as the input alphabet. The empty string ε is not a member of Σ A transition function T: S Σ S that gives a next state for each state and each symbol/character A state s 0 from S as the initial state. A set of states F that is a subset of S as the final/accepting states.

DFA/NFA Accepting Regular Expressions The language or regular expression accepted by a DFA or NFA D, written as L(D) The set of strings of symbols c 1 c 2 c n with each c i such as there exist states s 1 =T(s 0,c 1 ),, s n =T(s n-1,c n ), with s n an accepting/final state

Nondeterministic Finite Automata A finite set of states S. A set of input symbols or characters Σ as the input alphabet. The empty string ε is not a member of Σ A transition function T: S Σ S that gives a set of next states for each state and each symbol/character in Σ {ε} A state s 0 from S as the initial state. A set of states F that is a subset of S as the final/accepting states.

Implementing Lexical-Analyzer Generators Regular expressions Nondeterministic finite automata Nondeterministic finite automata Deterministic finite automata Deterministic finite automata A lexer Regular expressions Deterministic finite automata Deterministic finite automata A lexer

MYT Algorithm Constructing an NFA from a regular expression r by McNaughton-Yamada- Thompson algorithm Organizing r into its constituent sub-expressions Sub-expressions with no operators Operators Using basic rules to construct NFA for Subexpressions with no operators Using inductive rules to construct larger NFA based on the constructed NFA for operations of sub-expressions

Basic Rules to Construct NFA For expression ε start i ε f For any subexpression a, i.e. {a} start i a f

Inductive Rules to Construct Larger NFA For Operations Assume N(s) and N(t) are NFA for regular expressions s and t, respectively Parenthesis operation r=(s) Use the NFA N(s) as N(r) Union operation r=s t Concatenation operation r=st Repetition operation r=s*

Examples

Implementing Lexical-Analyzer Generators Regular expressions Nondeterministic finite automata Nondeterministic finite automata Deterministic finite automata Deterministic finite automata A lexer

Conversion of NFA to DFA Subset construction algorithm Input: An NFA N Output: A DFA D accepting the same language as N Algorithm: construct a transition table Dtran corresponding to D Initially, ε-closure(s 0 ) is the only state in Dstates, and it is unmarked; while ( there is an unmarked state T in Dstates ) { mark T; for ( each input symbol a) { U = ε-closure(move(t,a)); if ( U is not in Dstates ) add U as an unmarked state to Dstates; Dtran[T,a] = U; } }

ε-closure(s) and ε-closure(t) ε-closure(s): a set of NFA states reachable from NFA state s on ε-transitions alone ε-closure(t): a set of NFA states reachable from some NFA state s in the set T on ε-transitions alone s in T ε-closure(s) push all states of T onto stack; initialize ε-closure(t) to T; while ( stack is not empty ) { pop t, the top element, off stack; for ( each state u with an edge from t to u labeled ε) if ( u is not in ε-closure(t) ) { add u to ε-closure(t); push u onto stack; } }

move(t,a) A set of NFA states to which there is a transition on input symbol a from some state s in T

Examples

Outline Recap The lexical-analyzer generator Lex Implementing lexical-analyzer generators Summary and homework

Homework (Due on 02/19 at 11:55 PM) 5.1. (10 points). Using flex and based on the Example 3.8 (pages 128-129 in the textbook), generate a lexer that scans the following input stream and outputs the following output stream. Input stream: if i>0 then i=1 else i=0 Output stream: IF ID:i RELOP:GT NUMBER:0 THEN ID:i RELOP:EQ NUMBER:1 ELSE ID:i RELOP:EQ NUMBER:0 Please provide a readme file explaining how you generate and test your lexer. 5.2. Conversion of a NFA to a DFA will be posted at the Blackboard.