CSE302: Compiler Design

Instructor: Dr. Liang Cheng
Department of Computer Science and Engineering
P.C. Rossin College of Engineering & Applied Science
Lehigh University
February 01, 2007

Outline
  Recap
    Intermediate code generation (Section 2.8)
    Symbol tables (Section 2.7)
  Lexical analysis (Chapter 3)
  Summary and homework

Two Kinds of Intermediate Representation
  Syntax tree representation
    Expressions: E1 op E2
    Statements: while (expr) stmt; do stmt while expr
    Use a translation scheme: semantic rules or semantic actions
  Three-address-code representation
  [Figure: tree diagrams for op (children E1, E2), while (children expr, stmt), and do (children stmt, expr)]
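
The relation between the two representations can be sketched in C: a walk over a syntax tree for E1 op E2 that emits one three-address instruction per operator. This is only an illustration under stated assumptions; the Node type, the gen function, and the temporary names t1, t2, ... are hypothetical and do not come from the course code.

```c
#include <stdio.h>
#include <string.h>

/* A syntax-tree node for "E1 op E2"; a leaf has op == 0 and only a name.
   (Hypothetical type, introduced for this sketch.) */
typedef struct Node {
    char op;                    /* operator character, or 0 for a leaf */
    const char *name;           /* identifier or constant, leaves only */
    const struct Node *left;    /* E1 */
    const struct Node *right;   /* E2 */
} Node;

/* Appends three-address instructions for the tree rooted at n to out and
   stores the name that holds the value of n in result.
   Names longer than 15 characters are truncated in this sketch. */
static void gen(const Node *n, char *out, size_t outsz,
                char *result, size_t ressz, int *temps)
{
    if (n->op == 0) {                       /* leaf: no code, just its name */
        snprintf(result, ressz, "%s", n->name);
        return;
    }
    char l[16], r[16];
    gen(n->left, out, outsz, l, sizeof l, temps);    /* code for E1 */
    gen(n->right, out, outsz, r, sizeof r, temps);   /* code for E2 */
    snprintf(result, ressz, "t%d", ++*temps);        /* fresh temporary */
    size_t used = strlen(out);
    snprintf(out + used, outsz - used, "%s = %s %c %s\n", result, l, n->op, r);
}
```

Starting from an empty output buffer and a temporary counter of 0, calling gen on the tree for a + b * c emits the instruction for b * c first and then the instruction for the addition, because operands are translated before the operator that uses them.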

Symbol Tables
  Hold information about source program constructs
    Collected during analysis
    Used for synthesis
  Support multiple declarations of the same identifier within a program
    A separate symbol table for each scope
      A program block
      A class
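
One way to picture "a separate symbol table for each scope" is a chain of tables, each linked to the table of its enclosing scope. The sketch below assumes fixed-size arrays and string-valued types; Env, env_init, env_put, env_get and MAX_SYMS are hypothetical names chosen for illustration, not the course's implementation.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 32   /* capacity per scope, an arbitrary choice for this sketch */

/* One symbol table per scope; prev points to the enclosing scope. */
typedef struct Env {
    const char *names[MAX_SYMS];
    const char *types[MAX_SYMS];
    int count;
    struct Env *prev;
} Env;

static void env_init(Env *e, Env *prev)
{
    e->count = 0;
    e->prev = prev;
}

/* Declares name in this scope only; returns 0 if the table is full. */
static int env_put(Env *e, const char *name, const char *type)
{
    if (e->count == MAX_SYMS)
        return 0;
    e->names[e->count] = name;
    e->types[e->count] = type;
    e->count++;
    return 1;
}

/* Searches from the innermost scope outward; returns the type or NULL. */
static const char *env_get(const Env *e, const char *name)
{
    for (; e != NULL; e = e->prev)
        for (int i = e->count - 1; i >= 0; i--)
            if (strcmp(e->names[i], name) == 0)
                return e->types[i];
    return NULL;
}
```

With x declared as int in an outer table and as float in an inner one, a lookup from the inner scope finds float, while a lookup from the outer scope still finds int. That is how the same identifier can be declared more than once in a program.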

Outline
  Recap
  Lexical analysis in a nutshell
    Overview
    Regular expressions
    Finite automata
    Implementation of a scanner
  Summary and homework

Overview
  Pattern matching
    Input: characters in a source file
    Output: tokens
  Based on the theories of regular expressions and finite automata
  [Figure: the source program feeds the lexical analyzer, which returns a token each time the parser asks for the next token; both use the symbol table]

Tokens and Lexemes
  A lexeme is the lowest-level syntactic unit of a language, as described by a lexical specification
  A token is a category (abstraction) of lexemes

Tokens
  Defined as an enumerated type
    in C: typedef enum {IF, THEN, ELSE, EQ, GE, LE, NE, NUM, ID, } TokenType;
    in Java: Appendix A: Tag.java
  Fall into several categories
    Reserved words
      The lexeme or string value of the token IF is if
    Special symbols
      The lexeme or string value of the token EQ is ==
    Identifiers
      Represent multiple lexemes
    Literals or constants
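
The token/lexeme distinction can be made concrete with a small function that maps one lexeme to its token. The sketch reuses the enumerators listed above; the spellings assumed for GE, LE and NE (>=, <=, !=), the extra UNKNOWN member, and the function name classify are assumptions made for illustration only.

```c
#include <ctype.h>
#include <string.h>

/* Token categories from the slide, plus UNKNOWN (an addition for this sketch). */
typedef enum { IF, THEN, ELSE, EQ, GE, LE, NE, NUM, ID, UNKNOWN } TokenType;

/* Returns the token category of one complete lexeme. */
static TokenType classify(const char *lexeme)
{
    /* Reserved words and special symbols: exactly one lexeme per token. */
    static const struct { const char *text; TokenType type; } fixed[] = {
        {"if", IF}, {"then", THEN}, {"else", ELSE},
        {"==", EQ}, {">=", GE}, {"<=", LE}, {"!=", NE},
    };
    for (size_t i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
        if (strcmp(lexeme, fixed[i].text) == 0)
            return fixed[i].type;

    /* Literals: a non-empty run of digits. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (const char *p = lexeme; *p != '\0'; p++)
            if (!isdigit((unsigned char)*p))
                return UNKNOWN;
        return NUM;
    }

    /* Identifiers: many different lexemes share the single token ID. */
    if (isalpha((unsigned char)lexeme[0])) {
        for (const char *p = lexeme; *p != '\0'; p++)
            if (!isalnum((unsigned char)*p))
                return UNKNOWN;
        return ID;
    }
    return UNKNOWN;
}
```

The fixed table is checked before the identifier rule, so the lexeme if becomes the token IF and not ID. Lexemes such as count and x1 both map to ID.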

Overview
  The scanner operates under the control of the parser
    In Parser.java: move() { look = lex.scan(); }
    In Lexer.java: public Token scan() { }

Regular Expressions
  Represent patterns of strings of characters
  The set of strings generated by a regular expression r is written L(r)

Basic Regular Expressions
  Single characters from the alphabet Σ (the set of legal symbols)
    L(a) = {a}
    L(ε) = {ε}
    L(∅) = {}
  Regular expression operations
    Choice among alternatives: L(r | s) = L(r) ∪ L(s)
    Concatenation: L(rs) = L(r)L(s)
    Repetition (zero or more times): L(r*) = L(r)*
  A regular expression for a sequence of one or more numeric digits
    (0 | 1 | ... | 9)(0 | 1 | ... | 9)*
    digit digit* where digit = 0 | 1 | ... | 9
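
The three operations map directly onto code. As a minimal sketch, the function below recognizes the longest prefix of a string that belongs to L(digit digit*); the names is_digit and match_number are hypothetical.

```c
#include <stddef.h>

/* Choice: digit = 0 | 1 | ... | 9, tested as a character range. */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Returns the length of the longest prefix of s in L(digit digit*),
   or 0 if s does not start with a digit. */
static size_t match_number(const char *s)
{
    size_t n;
    if (!is_digit(s[0]))        /* concatenation: the first digit is required */
        return 0;
    n = 1;
    while (is_digit(s[n]))      /* repetition: digit* matches zero or more */
        n++;
    return n;
}
```

For the input 123+4 the function returns 3: the first digit satisfies the leading digit, and the loop for digit* consumes 2 and 3 before stopping at the plus sign.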

Extensions to Regular Expressions
  One or more repetitions: r+
    digit+ where digit = 0 | 1 | ... | 9
  A range of characters in the alphabet
    a | b | c: [abc]
    a | b | ... | z: [a-z]
    0 | 1 | ... | 9: [0-9]
  Any character in the alphabet; any character not in a given set
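
These extensions exist in most practical regex dialects. The sketch below tries them with the POSIX regex API from <regex.h> (regcomp, regexec, regfree), so it assumes a POSIX system. Note that the `.` and `[^...]` spellings are POSIX notation and may differ from the notation used on the slides; the helper name full_match is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if s matches the POSIX extended regular expression in pattern,
   0 if it does not, and -1 if the pattern fails to compile.
   Callers anchor the pattern with ^ and $ to require a whole-string match. */
static int full_match(const char *pattern, const char *s)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, s, 0, NULL, 0);   /* 0 means a match was found */
    regfree(&re);
    return rc == 0;
}
```

For example, "^[0-9]+$" accepts 2007 but rejects the empty string, because + requires at least one repetition; "^[^abc]$" accepts d and rejects a.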

Regular Expressions for Identifiers
  An identifier starts with a letter, followed by zero or more letters or digits
    letter (letter | digit)*

A Finite Automaton for Identifiers
  There is an algorithm, e.g. Thompson's construction, that constructs a finite automaton for the regular expression of identifiers
  [Figure: state 1 moves to state 2 on a letter; state 2 loops back to itself on a letter or a digit]
  States in the pattern recognition process
    State 1: start state
    State 2: the state after a single letter has been matched
  Accepting states are drawn with a double-line border
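
The two-state automaton can be hand-coded with one switch case per state. The sketch below assumes that state 2 is the accepting state, which follows from the identifier pattern (a single letter is already a valid identifier); the function name accepts_identifier is hypothetical.

```c
#include <ctype.h>

/* Runs the two-state automaton over the whole string s.
   Returns 1 if s is accepted as an identifier, 0 otherwise. */
static int accepts_identifier(const char *s)
{
    int state = 1;                          /* state 1: start state */
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        switch (state) {
        case 1:
            if (isalpha(c))
                state = 2;                  /* letter: 1 -> 2 */
            else
                return 0;                   /* no transition defined */
            break;
        case 2:
            if (isalnum(c))
                state = 2;                  /* letter or digit: 2 -> 2 */
            else
                return 0;
            break;
        }
    }
    return state == 2;                      /* accept only if we end in state 2 */
}
```

The empty string is rejected because the automaton is still in state 1 when the input ends, and 1x is rejected because state 1 has no transition on a digit.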

Implementation of Finite Automata and Demo
  A transition table based approach
    s = 1;
    while (s != acceptstate and s != errorstate) {
      c = next input character;
      s = T[s,c];
    }
  [Table: rows are the states s, columns are the characters c in the alphabet, and each entry T(s,c) is the state reached by that transition]
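
A runnable version of the loop above, specialized to identifiers, might look as follows. Several details are assumptions made for this sketch and are not taken from the slides: characters are grouped into three classes to keep the table small, a separate ACCEPT state is entered on the first character that cannot extend the identifier, and that character is treated as lookahead. All names (T, char_class, scan_identifier, the state constants) are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

enum { START = 1, IN_ID = 2, ACCEPT = 3, ERROR = 4, NUM_STATES = 5 };
enum { LETTER, DIGIT, OTHER, NUM_CLASSES };

/* T[s][class] is the next state; row 0 is unused so states start at 1. */
static const int T[NUM_STATES][NUM_CLASSES] = {
    /* unused */ { ERROR,  ERROR,  ERROR  },
    /* START  */ { IN_ID,  ERROR,  ERROR  },
    /* IN_ID  */ { IN_ID,  IN_ID,  ACCEPT },
    /* ACCEPT */ { ACCEPT, ACCEPT, ACCEPT },
    /* ERROR  */ { ERROR,  ERROR,  ERROR  },
};

/* Maps a character to its column in T. */
static int char_class(int c)
{
    if (isalpha(c)) return LETTER;
    if (isdigit(c)) return DIGIT;
    return OTHER;                       /* includes the terminating '\0' */
}

/* Returns the length of the identifier at the start of input,
   or 0 if the input does not begin with an identifier. */
static size_t scan_identifier(const char *input)
{
    int s = START;
    size_t pos = 0;
    while (s != ACCEPT && s != ERROR) {
        int c = (unsigned char)input[pos++];    /* c = next input character */
        s = T[s][char_class(c)];
    }
    /* The last character read is lookahead, not part of the lexeme. */
    return s == ACCEPT ? pos - 1 : 0;
}
```

The loop never reads past the end of the string, since the terminating '\0' falls in class OTHER and drives the automaton into ACCEPT or ERROR. Changing the language recognized means changing only the table, not the loop.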

Outline
  Recap
  Lexical analysis (Chapter 3)
  Summary and homework