Compiler construction 2002 week 2


Compiler construction in4020 lecture 2
Koen Langendoen
Delft University of Technology, The Netherlands

Overview
  Generating a lexical analyzer
    generic methods
    specific tool: lex

Lex
  (f)lex: a lexical analyzer generator for UNIX that produces C code
  format of the lex input file:
    definitions: regular descriptions
    rules: regular expressions + actions
    user code: auxiliary C code

Lex: recognizing integers
  An integer is a nonzero-length sequence of digits, optionally followed by a
  letter denoting the base class (b for binary and o for octal).
    base    → [bo]
    digit   → [0-9]
    integer → digit+ base?
  rule = expr + action
  {} signal application of a description

    %{
    #include "lex.h"
    %}
    base    [bo]
    digit   [0-9]
    %%
    {digit}+{base}?    {return INTEGER;}

Lex: resulting C code
  automatic generation of a finite state automaton

    char yytext[];      /* token representation */
    int yylex(void);    /* returns type of next token */

  wrapper function to add token attributes
  extra rule to count lines:

    \n    {line_number++;}

    void get_next_token(void) {
        Token.class = yylex();
        if (Token.class == 0) {    /* yylex() returns 0 at end of input */
            Token.class = EOF;
            Token.repr = "<EOF>";
            return;
        }
        Token.pos.line_number = line_number;
        Token.repr = strdup(yytext);
    }
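To make the integer rule concrete, here is a minimal hand-written sketch in C of what the pattern "one or more digits, optionally followed by b or o" accepts. The function name match_integer is an assumption made for illustration; it is not part of lex or of the lecture code, and a generated scanner would use a table-driven automaton instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the longest prefix of s that
 * matches  digit+ base?  (base is 'b' or 'o'), or 0 if there is no match. */
size_t match_integer(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) {
        n++;                          /* digit+ */
    }
    if (n == 0) {
        return 0;                     /* at least one digit is required */
    }
    if (s[n] == 'b' || s[n] == 'o') {
        n++;                          /* optional base letter */
    }
    return n;
}
```

For example, match_integer("101b") consumes all four characters, while match_integer("b1") returns 0 because the input does not start with a digit.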

Finite-state automaton (FSA)
  Recognize input character by character
  Transfer between states
  An FSA consists of:
    an initial state
    a set of accepting states
    a transition function: State x Char → State
  (figure: FSA with transitions on i and f)

FSA examples
  integral_number    → [0-9]+
  fixed_point_number → [0-9]* '.' [0-9]+
  recognize both tokens in one pass
    naïve approach: merge initial states
    correct approach: share common prefix transitions

FSA implementation: transition table
  concurrent recognition of integers and fixed point numbers
  table columns: state | character (digit, dot, other) | recognized token (integer, fixed point)
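The transition-table idea can be sketched in C as follows. This is an illustrative reconstruction, not the table from the slide (whose entries did not survive): the state numbering, the names classify, char_class and TokenKind, and the choice to classify a whole string rather than scan a stream are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOKEN_NONE, TOKEN_INTEGER, TOKEN_FIXED_POINT } TokenKind;
enum { CLASS_DIGIT, CLASS_DOT, CLASS_OTHER, NUM_CLASSES };
enum { ERR = -1, NUM_STATES = 4 };

/* Rows are states, columns are character classes (digit, dot, other). */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* 0: start                  */ { 1,   2,   ERR },
    /* 1: digits seen            */ { 1,   2,   ERR },
    /* 2: dot seen               */ { 3,   ERR, ERR },
    /* 3: digits after the dot   */ { 3,   ERR, ERR },
};

/* Token recognized when the input ends in the given state. */
static const TokenKind recognized[NUM_STATES] = {
    TOKEN_NONE, TOKEN_INTEGER, TOKEN_NONE, TOKEN_FIXED_POINT
};

static int char_class(char ch) {
    if (ch >= '0' && ch <= '9') return CLASS_DIGIT;
    if (ch == '.') return CLASS_DOT;
    return CLASS_OTHER;
}

/* Runs the automaton over the whole string and reports which token,
 * if any, the complete string represents. */
TokenKind classify(const char *s) {
    assert(s != NULL);
    int state = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        state = transition[state][char_class(s[i])];
        if (state == ERR) return TOKEN_NONE;
    }
    return recognized[state];
}
```

States 0 and 1 share the digit and dot transitions, which is the "share common prefix transitions" approach: the automaton does not have to decide between integer and fixed point until it sees a dot or the end of the input.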

FSA exercise (6 min.)
  draw an FSA to recognize integers
    base    → [bo]
    digit   → [0-9]
    integer → digit+ base?
  draw an FSA to recognize the regular expression (a|b)*bab

Answers
  (figure: answer diagrams)

Automatic generation: FSA
  start with the initial set (S0) of all tokens to be recognized
  for each character (ch), find the set (S_ch) of tokens that can start with ch
  extend the FSA with transition (S0, ch, S_ch)
  repeat adding transitions (to S_ch) until no new set is generated

Dotted items
  keeping track of matched characters in a token description: T → R
  split the regular expression R into α β:
    T → α • β
    α: already matched (input)
    β: still to be matched

Types of dotted items
  shift item: dot in front of a basic pattern
    if → • i f
    if → i • f
    identifier → • [a-z] [a-z0-9]*
  reduce item: dot at the end
    if → i f •
    identifier → [a-z] [a-z0-9]* •
  nonbasic item: dot in front of repeated pattern or parenthesis
    identifier → [a-z] • [a-z0-9]*

Character moves
  input c:           T → α • c β         ⇒  T → α c • β
  input c in class:  T → α • [class] β   ⇒  T → α [class] • β
  input c:           T → α • . β         ⇒  T → α . • β
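A character move over a shift item can be sketched in C as below. This is a simplified illustration under stated assumptions: the types BasicPattern and DottedItem and the functions character_move and is_reduce_item are hypothetical names, the item is restricted to a plain sequence of basic patterns (no repetition or parentheses, so no nonbasic items arise), and the "." pattern is taken to match every character.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { PAT_CHAR, PAT_CLASS, PAT_ANY } PatKind;

typedef struct {
    PatKind kind;
    const char *chars;   /* PAT_CHAR: the character; PAT_CLASS: its members */
} BasicPattern;

typedef struct {
    const char *token;          /* T, the token being recognized */
    const BasicPattern *pats;   /* the basic patterns of the regular expression */
    size_t len;                 /* number of basic patterns */
    size_t dot;                 /* patterns before the dot are already matched */
} DottedItem;

/* A reduce item has its dot at the very end. */
bool is_reduce_item(const DottedItem *it) {
    assert(it != NULL);
    return it->dot == it->len;
}

/* Character move: if the basic pattern after the dot matches ch,
 * shift the dot over it and return true; otherwise leave the item alone. */
bool character_move(DottedItem *it, char ch) {
    assert(it != NULL);
    if (is_reduce_item(it)) return false;
    const BasicPattern *p = &it->pats[it->dot];
    bool ok = false;
    switch (p->kind) {
    case PAT_CHAR:  ok = (ch == p->chars[0]); break;
    case PAT_CLASS: ok = (ch != '\0' && strchr(p->chars, ch) != NULL); break;
    case PAT_ANY:   ok = true; break;
    }
    if (ok) it->dot++;
    return ok;
}
```

Starting from the item for the keyword if with the dot at position 0, a move on 'i' followed by a move on 'f' yields a reduce item, at which point the token has been recognized.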

ε moves (expanding nonbasic items)
  T → α • (R)? β       ⇒  T → α (R)? • β       T → α (• R)? β
  T → α (R •)? β       ⇒  T → α (R)? • β
  T → α • (R)* β       ⇒  T → α (R)* • β       T → α (• R)* β
  T → α (R •)* β       ⇒  T → α (R)* • β       T → α (• R)* β
  T → α • (R)+ β       ⇒  T → α (• R)+ β
  T → α (R •)+ β       ⇒  T → α (R)+ • β       T → α (• R)+ β
  T → α • (R1|R2) β    ⇒  T → α (• R1|R2) β    T → α (R1|• R2) β
  T → α (R1 •|R2) β    ⇒  T → α (R1|R2) • β
  T → α (R1|R2 •) β    ⇒  T → α (R1|R2) • β

FSA construction
  a state corresponds to a set of basic items
  a character move yields a new set
  expand nonbasic items into basic items using ε moves
  see if the resulting set was produced before; if not, introduce a new state
  add transition

Example
  token descriptions (D stands for a digit):
    integer:     I → (D)+
    fixed point: F → (D)* '.' (D)+
  initial state:
    I → • (D)+
    F → • (D)* '.' (D)+
  after expansion:
    I → (• D)+
    F → (• D)* '.' (D)+
    F → (D)* • '.' (D)+

Example: character moves
  S0:  I → (• D)+
       F → (• D)* '.' (D)+
       F → (D)* • '.' (D)+
  S0 on D → S1:
       I → (• D)+
       I → (D)+ •
       F → (• D)* '.' (D)+
       F → (D)* • '.' (D)+
  S0 or S1 on '.' → S2:
       F → (D)* '.' (• D)+
  S2 on D → S3:
       F → (D)* '.' (• D)+
       F → (D)* '.' (D)+ •

Exercise (7 min.)
  draw the FSA (with item sets) for recognizing an identifier:
    identifier → letter (letter_or_digit_or_und* letter_or_digit+)?
  extend the above FSA to recognize the keyword if as well:
    if → i f
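The state construction for the integer and fixed-point example can be sketched in C as follows. To keep it short, the basic items are encoded as bits and the result of each character move, with its ε-move expansion already applied by hand, is written out in a small table; a real generator would derive these from the regular expressions. All names (I1 to F4, next_set, build_states) and the encoding are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Basic items of  I → (D)+  and  F → (D)* '.' (D)+ , one bit each. */
enum {
    I1 = 1 << 0,   /* I → (•D)+             shift on digit */
    I2 = 1 << 1,   /* I → (D)+ •            reduce: integer */
    F1 = 1 << 2,   /* F → (•D)* '.' (D)+    shift on digit */
    F2 = 1 << 3,   /* F → (D)* • '.' (D)+   shift on dot */
    F3 = 1 << 4,   /* F → (D)* '.' (•D)+    shift on digit */
    F4 = 1 << 5    /* F → (D)* '.' (D)+ •   reduce: fixed point */
};
enum { DIGIT, DOT, NUM_CLASSES };
#define MAX_STATES 16

/* Character move on one item, followed by expansion into basic items. */
typedef struct { unsigned item; int cls; unsigned result; } Move;
static const Move moves[] = {
    { I1, DIGIT, I1 | I2 },
    { F1, DIGIT, F1 | F2 },
    { F2, DOT,   F3 },
    { F3, DIGIT, F3 | F4 },
};

/* Applies a character move to every item in the set. */
unsigned next_set(unsigned set, int cls) {
    unsigned out = 0;
    for (size_t i = 0; i < sizeof moves / sizeof moves[0]; i++) {
        if ((set & moves[i].item) && moves[i].cls == cls) out |= moves[i].result;
    }
    return out;
}

/* Generates states until no new item set appears; returns the state count.
 * table[s][c] is the successor of state s on class c, or -1 if none. */
size_t build_states(unsigned states[MAX_STATES],
                    int table[MAX_STATES][NUM_CLASSES]) {
    size_t n = 0;
    states[n++] = I1 | F1 | F2;              /* expanded initial item set */
    for (size_t s = 0; s < n; s++) {
        for (int c = 0; c < NUM_CLASSES; c++) {
            unsigned t = next_set(states[s], c);
            table[s][c] = -1;
            if (t == 0) continue;            /* empty set: no transition */
            size_t k = 0;
            while (k < n && states[k] != t) k++;   /* produced before? */
            if (k == n) {
                assert(n < MAX_STATES);
                states[n++] = t;             /* introduce a new state */
            }
            table[s][c] = (int)k;            /* add transition */
        }
    }
    return n;
}
```

Under these assumptions the loop terminates with four states, and a state is accepting when its item set contains a reduce item (I2 for integer, F4 for fixed point).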

Answers
  (figure: answer diagrams)

Transition table compression
  redundant rows
  empty transitions
  row displacement
  table columns: state | character (i, f, L, D, U) | recognized token (identifier, keyword if)

Summary: generating a lexical analyzer
  tool: lex
    token descriptions + actions
    wrapper interface
  FSA construction
    dotted items
    character moves
    ε moves

Homework
  study sections 2.1.10-2.1.12
    lexical identification of tokens
    symbol tables
    macro processing
  print handout lecture 3 [blackboard]
  find a partner for the practicum
  register your group: send email to koen@pds.twi.tudelft.nl
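Row displacement can be sketched in C as below: the rows of a sparse transition table are overlaid in one array so that the empty transitions of one row are filled by entries of another, and a parallel check array records which row owns each slot. This is a minimal first-fit sketch under assumed names (PackedTable, pack, next_state) and fixed table dimensions; it does not merge redundant rows, which is a separate technique.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_STATES  4
#define NUM_CLASSES 3
#define NO_STATE    (-1)
#define MAX_ENTRIES (NUM_STATES * NUM_CLASSES)

typedef struct {
    int displacement[NUM_STATES];  /* offset of each row in entry[] */
    int entry[MAX_ENTRIES];        /* packed next-state values */
    int check[MAX_ENTRIES];        /* owning row of each slot, or NO_STATE */
    int used;                      /* number of slots up to the last one used */
} PackedTable;

/* True if every non-empty entry of row lands on a free slot at offset d. */
static int row_fits(const PackedTable *p, const int row[NUM_CLASSES], int d) {
    for (int c = 0; c < NUM_CLASSES; c++) {
        if (row[c] != NO_STATE && p->check[d + c] != NO_STATE) return 0;
    }
    return 1;
}

/* First-fit packing: each row gets the smallest displacement that fits. */
void pack(const int table[NUM_STATES][NUM_CLASSES], PackedTable *p) {
    assert(p != NULL);
    for (int i = 0; i < MAX_ENTRIES; i++) {
        p->entry[i] = NO_STATE;
        p->check[i] = NO_STATE;
    }
    p->used = 0;
    for (int s = 0; s < NUM_STATES; s++) {
        int d = 0;
        while (!row_fits(p, table[s], d)) d++;
        p->displacement[s] = d;
        for (int c = 0; c < NUM_CLASSES; c++) {
            if (table[s][c] == NO_STATE) continue;
            p->entry[d + c] = table[s][c];
            p->check[d + c] = s;
            if (d + c + 1 > p->used) p->used = d + c + 1;
        }
    }
}

/* Lookup: the slot is valid only if it is owned by the requesting row. */
int next_state(const PackedTable *p, int state, int cls) {
    int i = p->displacement[state] + cls;
    return (p->check[i] == state) ? p->entry[i] : NO_STATE;
}
```

The check array is what makes overlapping safe: a lookup that lands on a slot owned by another row is reported as an empty transition. The cost is one extra array and one comparison per lookup, in exchange for not storing most of the empty entries.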