PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering


TEST 1
Date: 24-02-2015    Marks: 50
Subject & Code: Compiler Design (10CS63)    Class: VI CSE A & B
Name of faculty: Mrs. Shanthala P.T / Mrs. Swati Gambhire    Time: 8:30-10:00 AM

SOLUTION MANUAL

1. a. Define compiler. What are the phases of the compiler? Explain with a neat diagram, mentioning the input and output of each.

Answer: A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language), often having a binary form known as object code.

Phases of a compiler: The first three phases form the bulk of the analysis portion of a compiler. Two other activities, symbol table management and error handling, interact with all six phases.

Symbol table management

An essential function of a compiler is to record the identifiers used in the source program and to collect information about various attributes of each identifier. A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier. The data structure allows us to find the record for each identifier quickly, and to store or retrieve data from that record quickly. When an identifier in the source program is detected by the lexical analyzer, the identifier is entered into the symbol table.

Error detection and reporting

Each phase can encounter errors, and a compiler that stops when it finds the first error is not as helpful as it could be. The syntax and semantic analysis phases usually handle a large fraction of the errors detectable by the compiler. The lexical phase can detect errors in which the characters remaining in the input do not form any token of the language; errors in which the token stream violates the syntax of the language are detected by the syntax analysis phase. During semantic analysis the compiler tries to detect constructs that have the right syntactic structure but no meaning for the operation involved.

The analysis phases

As translation progresses, the compiler's internal representation of the source program changes. The lexical analysis phase reads the characters of the source program and groups them into a stream of tokens, in which each token represents a logically cohesive sequence of characters, such as an identifier or a keyword. The character sequence forming a token is called the lexeme for the token. Certain tokens are augmented by a lexical value: for any identifier, the lexical analyzer generates not only the token id but also enters the lexeme into the symbol table, if it is not already present there. The lexical value associated with this occurrence of id points to the symbol table entry for the lexeme.
Syntax analysis imposes a hierarchical structure on the token stream, which is shown by syntax trees.

Intermediate code generation

After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. This intermediate representation can have a variety of forms.

Code optimization

The code optimization phase attempts to improve the intermediate code so that faster-running machine code will result. Some optimizations are trivial, and there is great variation in the amount of code optimization different compilers perform. In those that do the most, called optimizing compilers, a significant fraction of the compiler's time is spent on this phase.

Code generation

The final phase of the compiler is the generation of target code, consisting normally of relocatable machine code or assembly code. Memory locations are selected for each of the variables used by the program; then the intermediate instructions are each translated into a sequence of machine instructions that perform the same task. A crucial aspect is the assignment of variables to registers.

In summary: the lexical analyzer takes the source program as input and produces a stream of tokens; the syntax analyzer takes the output of the lexical analyzer and produces a parse tree; the semantic analyzer takes the output of the syntax analyzer and produces another (annotated) tree; and the intermediate code generator takes the tree produced by the semantic analyzer and produces intermediate code.

-------------------------------------------------------------------------------------------------------------------

b. What are compiler construction tools? Explain their specifications in detail.

Answer: The compiler writer, like any software developer, can profitably use modern software development environments containing tools such as language editors, debuggers, version managers, profilers, test harnesses, and so on. In addition to these general software-development tools, other more specialized tools have been created to help implement various phases of a compiler. These tools use specialized languages for specifying and implementing specific components, and many use quite sophisticated algorithms. The most successful tools are those that hide the details of the generation algorithm and produce components that can be easily integrated into the remainder of the compiler. Some commonly used compiler-construction tools include:

1. Parser generators, which automatically produce syntax analyzers from a grammatical description of a programming language.
2. Scanner generators, which produce lexical analyzers from a regular-expression description of the tokens of a language.

3. Syntax-directed translation engines, which produce collections of routines for walking a parse tree and generating intermediate code.
4. Code-generator generators, which produce a code generator from a collection of rules for translating each operation of the intermediate language into the machine language for a target machine.
5. Data-flow analysis engines, which facilitate the gathering of information about how values are transmitted from one part of a program to each other part. Data-flow analysis is a key part of code optimization.
6. Compiler-construction toolkits, which provide an integrated set of routines for constructing various phases of a compiler.

------------------------------------------------------------------------------------------------------------------

2. a. Define the role of the input buffer in lexical analysis.

Answer: The input buffer is used to read characters from the source program at speed. If the source program is large, only some lexemes of the source program are loaded into the buffer at a time; if a lexeme extends beyond the size of the buffer, two input buffers are used. Together, the two buffers handle large lookaheads safely.

Buffer pairs: An important scheme involves two buffers that are alternately reloaded. Each buffer is of size N, where N can be the size of a disk block, e.g. 4096 bytes. The end of each buffer is marked with a sentinel character eof, which saves the time of checking for the end of the buffer on every read, and which is not part of the source program. Any eof that appears other than at the end of a buffer marks the end of the input. Two pointers into the input are maintained:

1. lexemeBegin, a pointer that marks the beginning of the current lexeme.
2. forward, a pointer that scans ahead until the next lexeme is determined.

Once the next lexeme is determined, forward is set to the character at its right end; then, after the lexeme has been recorded as an attribute value of a token returned to the parser, lexemeBegin is set to the character immediately after the lexeme just found.

Code (sentinel check after advancing forward):

forward++;
if (*forward == eof) {
    if (forward is at end of first buffer) {
        reload second buffer;
        forward = beginning of second buffer;
    } else if (forward is at end of second buffer) {
        reload first buffer;
        forward = beginning of first buffer;
    } else
        /* eof within a buffer marks the end of input */
        terminate lexical analysis;
}

The code shows that once we reach the end of one buffer we reload the other and continue scanning from its beginning; an eof anywhere other than at the end of a buffer terminates the processing.

----------------------------------------------------------------------------------------------------------------

b. Construct the transition diagrams to recognize the tokens given below:
i) Relational operator.
ii) Unsigned number.

Answer: As an intermediate step in the construction of a lexical analyzer, we first convert patterns into stylized flowcharts called transition diagrams.

i) Relational operators: the diagram recognizes the lexemes matching the token relop. In the start state 0, if the first input symbol is <, then among the lexemes matching the pattern for relop we could still see <, <= or <>, so we go to state 1 and look at the next character. If that character is =, we recognize the lexeme <=, enter state 2, and return the token relop with attribute LE, the constant representing this particular comparison operator. If in state 1 the next character is >, we instead have the lexeme <>, and we enter state 3 to return an indication that the not-equal operator has been found. On any other character the lexeme is <, and we enter state 4 to return that information; note, however, that state 4 has a * to indicate that we must retract the input by one position. If in state 0 the first character is =, then that one character must be the lexeme, and we immediately return from state 5. The remaining possibility is >: we enter state 6 and decide, on the basis of the next character, whether the lexeme is >= or just > (again with retraction).

ii) Unsigned numbers: the transition diagram for the token number is the most complex so far. Beginning in state 12, if we see a digit we go to state 13; in that state we can read any number of additional digits. If we then see anything but a digit or a dot, we have seen a number in integer form, e.g. 123; that case is handled by entering state 20, where we return the token number and a pointer to a table of constants in which the found lexeme is entered. If we instead see a dot in state 13, we have an optional fraction: state 14 is entered, and we look for one or more digits, with state 15 used for that purpose. If we then see an E, we have an optional exponent, whose recognition is the job of states 16 through 19. If, in state 15, we instead see anything but E or a digit, we have come to the end of the fraction; there is no exponent, and we return the lexeme found via state 21.

3. a. Show the translation of the assignment statement position = initial + rate * 60, clearly indicating the output of each phase.

b. For the grammar

S -> cAd
A -> ab | a

trace the input cad through the recursive descent parser.

Answer (trace for the given example):

Input string: w = cad

Step 1: S has only one production, so we expand S. The first character of the input w = cad matches the leftmost leaf c of the tree.

Step 2: We expand A using the first alternative, A -> ab. We have a match for the second input character a, so we advance to the next input symbol, d. The leaf b does not match d, so we report failure, go back to A to try another alternative, and reset the input pointer to position 2.

Step 3: The second alternative for A is A -> a. The leaf a matches the second input symbol, and the leaf d matches the third symbol, so we halt with a successful-parsing message.

----------------------------------------------------------------------------------------------------

4. a. Explain left recursion. Describe the algorithm used for eliminating left recursion.

Left recursion: A grammar is left recursive if it has a non-terminal A such that there is a derivation A =>+ Aα for some string α.

Algorithm for eliminating left recursion:

Arrange the non-terminals in some order A1, ..., An.
for i from 1 to n do {
    for j from 1 to i-1 do {
        replace each production Ai -> Aj γ by
        Ai -> δ1 γ | δ2 γ | ... | δk γ,
        where Aj -> δ1 | δ2 | ... | δk are all the current Aj-productions
    }
    eliminate the immediate left recursion among the Ai-productions
}

----------------------------------------------------------------------------------------------------

b. Define the FIRST and FOLLOW rules used in the predictive parsing technique.

Rules to compute FIRST:
1. If X is a terminal, then FIRST(X) = {X}.
2. If X is a non-terminal and X -> Y1 Y2 ... Yk is a production for some k >= 1, place a in FIRST(X) if for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1), ..., FIRST(Yi-1). If ε is in FIRST(Yj) for all j = 1, ..., k, then add ε to FIRST(X).
3. If X -> ε is a production, then add ε to FIRST(X).

Rules to compute FOLLOW:
1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right end-marker.
2. If there is a production A -> αBβ, then everything in FIRST(β) except ε is in FOLLOW(B).
3. If there is a production A -> αB, or a production A -> αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).

5. Consider the following CFG, which has the set of terminals T = {id, (, ), [, ], ;}:

E -> id | id(A) | id[E]
A -> E | E ; A

(a) Left-factor this grammar so that no two productions with the same left-hand side have right-hand sides with a common prefix.
(b) Construct an LL(1) parsing table for the left-factored grammar.
(c) Show the operation of an LL(1) parser on the input string id(id[id]; id).

Answer:

(a)
E -> id X
X -> ε | (A) | [E]
A -> E Y
Y -> ε | ; A

(b) The FIRST and FOLLOW sets of the non-terminals are as follows:

FIRST(E) = {id}         FOLLOW(E) = {$, ], ;, )}
FIRST(X) = {(, [, ε}    FOLLOW(X) = {$, ], ;, )}
FIRST(A) = {id}         FOLLOW(A) = {)}
FIRST(Y) = {;, ε}       FOLLOW(Y) = {)}

Here is an LL(1) parsing table for the grammar:

      id          (           )         [           ]         ;         $
E     E -> idX
X                 X -> (A)   X -> ε    X -> [E]    X -> ε    X -> ε    X -> ε
A     A -> EY
Y                             Y -> ε                          Y -> ;A

(c) The operation of the LL(1) parser on id(id[id]; id) (stack top on the left):

Stack        Input              Action
E$           id(id[id];id)$     E -> idX
idX$         id(id[id];id)$     match id
X$           (id[id];id)$       X -> (A)
(A)$         (id[id];id)$       match (
A)$          id[id];id)$        A -> EY
EY)$         id[id];id)$        E -> idX
idXY)$       id[id];id)$        match id
XY)$         [id];id)$          X -> [E]
[E]Y)$       [id];id)$          match [
E]Y)$        id];id)$           E -> idX
idX]Y)$      id];id)$           match id
X]Y)$        ];id)$             X -> ε
]Y)$         ];id)$             match ]
Y)$          ;id)$              Y -> ;A
;A)$         ;id)$              match ;
A)$          id)$               A -> EY
EY)$         id)$               E -> idX
idXY)$       id)$               match id
XY)$         )$                 X -> ε
Y)$          )$                 Y -> ε
)$           )$                 match )
$            $                  accept

***************************************************************************