cmps104a 2002q4 Assignment 2 Lexical Analyzer page 1


$Id: asg2-scanner.mm,v :20: $

1. The Scanner (Lexical Analyzer)

Write a main program, string table manager, and lexical analyzer for the language c0 that you will be compiling this quarter. The usage and options were described in the first assignment. The main program will scan the input with the following code, which will be removed in assignment 3 and replaced by a call to the parser:

    int token_code;
    yyin = fopen( /*... argv[?] or something...*/ );
    for(;;){
       token_code = yylex();
       if( token_code == YYEOF ) break;
       fprintf( stderr, "yylex() returned %d (yytext=%s).\n",
                token_code, yytext );
    }

Flex reads characters from the FILE* yyin, which must point at a valid file structure before yylex() is called. Whatever you called this file in the first assignment, change it to yyin. Note that yylex() returns YYEOF (which is 0) when it hits end of file. The scanner should dump its tokens itself from a semantic action when the -t flag is set.

Warning: This is where the course project really starts. The string table assignment was really just a Data Structures assignment, which you should have found rather easy. This assignment, together with the parser of the next assignment, is the «real stuff». A failing grade in the scanner or parser assignment will result in failing the course.

The scanner specification should be placed in a file with a .l suffix, such as scanner.l. At the beginning of this file, ensure that at least the following #includes are present in the C declarations:

    %{
    #include "yyexternals.h"
    #include "tokenast.h"
    %}

2. Options

You must implement all of the options from the previous assignment, and all options for any assignment must carry forward to future assignments. In this case, the -t option will cause the tokens to be dumped into program.tok and the L option will cause the flex-generated scanner to produce its debug output by setting yy_flex_debug to 1. See assignment 1 for information pointing at dbx.

3. Global interface

You will need a set of global declarations for communication among the various modules. Try not to make too much of a hash of things, and do not use globals when it is possible to avoid them. The file yyexternals.h should contain:

    int yylex( void );
    int yyparse( void );
    extern FILE *yyin;
    extern char *yytext;
    #define YYEOF 0

4. The Token AST ADT

You must also implement a Token Abstract Syntax Tree. For the current assignment, you don't need any tree implementation code, as each token is a stand-alone unit. For the parser assignment, you must add tree management code to your ADT. The file tokenast.h should contain:

    #define YYSTYPE TokenAST_ref
    typedef struct TokenAST *TokenAST_ref;
    #include "parser.h"

The ordering of things above is important. YYSTYPE is a macro which defines the type of the objects on the parser's semantic stack. It is used by parser.h, and must be defined before parser.h is included. Hence, including parser.h from inside tokenast.h ensures that things will always be defined in the correct order. A sample parser.h is to be found in the dummy-parser subdirectory.

With every token recognized there should be a semantic action which creates a new struct TokenAST with malloc() and initializes the various fields as appropriate. The external declaration yylval will automatically be generated from the scanner and will be of type TokenAST_ref, so an appropriate statement to create a token node is:

    yylval = malloc( sizeof (struct TokenAST) );

Then fill the various fields. Note that you don't need to bother free()ing the nodes in this assignment. That, of course, leads to a storage leak, but in the next project, instead of abandoning the nodes, you will link them into a parse tree. In your implementation file, you will declare the various fields:

int token_code;
    is a copy of the token code to be returned by yylex(). It will be useful later when walking the parse tree. It also means that every lexical semantic action that returns a token may terminate with the statement (actually, you will need to write some access functions to have the equivalent effect):

        return yylval->token_code;

int serial_nr;
    is a token serial number consisting of line_nr * offset where offset is either the character number of the token within the current line or a unique integer within the current line. This will be used for two purposes: generating semantic error messages so that they can properly reference input line numbers; and choosing unique label numbers in the generated intermediate code.
StringNode_ref lex_info;
    is a pointer to a string node created from the lexical information found by yylex(). Strictly speaking, this is unnecessary for tokens that carry no semantic information, but it is easier to include it in every token. When lexical information needs to be associated with a token, it can be done as follows, after the malloc() of a new token. Note: yytext is declared by the scanner to point at the text of the last-recognized token.

        yylval->lex_info = intern_stringtable( stringtable, yytext );

In the next assignment, struct TokenASTs will be the nodes in the abstract syntax tree, and hence a facility to enter them into an n-way tree will be needed. Note that the parser's semantic stack needs to have a uniform type, and so it should be made into a stack of TokenAST_refs.

5. Tokens in the c0 language

The language c0 has the following tokens in it:

special symbols: = + - * / % & == != > >= < <= ; , ( ) [ ] { }

reserved words: int char void return if else while

tokens with lexical information: identifiers and literals (character, integer, and string), all with C syntax. You do not need to interpret the semantics of literal tokens, just write a pattern to recognize them.

Comments in c0 are just like in C and are skipped over and never returned to the parser. They are not tokens. Comments also begin with the hash (#) character and continue up to but not including the newline character. Thus, C #includes are treated as comments as well. This is a hack so that gcc can compile c0 programs with the inclusion of appropriate header files. According to the flex manual, here is a scanner which discards C comments and white space while maintaining the current input line counter:

    %x comment
    %%
    "/*"                    { BEGIN( comment ); }
    <comment>[^*\n]*        { }
    <comment>"*"+[^*/\n]*   { }
    <comment>\n             { line_count++; }
    <comment>"*"+"/"        { BEGIN( INITIAL ); }
    \n                      { line_count++; }
    [\t ]+                  { }

6. Dumping to the .tok file

The function make_token() should dump each node to the debug file as the node is created. Each token dumped to program.tok should have the format:

    TOK_KW_RETURN (return)
    = (=)
    TOK_IDENT (hello)
    TOK_LIT_INT (1234)
    { ({)
    TOK_LIT_STRING ("beep\007")

The first column contains (double) serial_nr / in %8.3f format, followed by the integer token_code, followed by the symbolic name of the token code. Lastly, if there is any lexical information associated with the token, it is printed between parentheses exactly as stored in the string table, except that any character that is not isgraph() is printed as three octal digits following a backslash, and the backslash is printed as two backslashes.

The following function, if it appears in the third part of the parser source, can be used to translate an integer symbol number into a symbolic name for a grammar symbol:

    const char *token_code_name( int token_code )
    /* input:  numeric token code (symbol)
     * result: symbolic (char*) name of input token_code
     */
    {
       return yytname[ YYTRANSLATE( token_code ) ];
    }

Do not worry about the contents of the c0_lib.h file until the symbol table assignment. Specifically, the sample test data shows these symbols generated into the string table. They will not be there until such time as you have the symbol table assignment done. The sample output is thus a little advanced for the current assignment.

You should still link in the dummy parser in this project in order to make some undefined external references disappear at link time. Doing this is also necessary in order to make the function token_code_name() available to the scanner.
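The escaping rule for lexical information in the .tok dump can be sketched as follows. This is only an illustration under stated assumptions: the helper name escape_lex_info() and its buffer-based interface are hypothetical and not part of the assignment, and a real make_token() might write straight to the debug file instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not named in the assignment): copy `text` into
 * `out`, rewriting a backslash as two backslashes and any character
 * that is not isgraph() as a backslash plus three octal digits.
 * `out` is always NUL-terminated; output stops early rather than
 * overflowing. Returns the number of characters written. */
size_t escape_lex_info(const char *text, char *out, size_t out_size)
{
    const unsigned char *p;
    size_t used = 0;

    assert(text != NULL && out != NULL && out_size > 0);
    for (p = (const unsigned char *)text; *p != '\0'; p++) {
        char piece[5];              /* longest piece: "\ooo" plus NUL */
        int len;

        if (*p == '\\')
            len = snprintf(piece, sizeof piece, "\\\\");
        else if (isgraph(*p))
            len = snprintf(piece, sizeof piece, "%c", *p);
        else
            len = snprintf(piece, sizeof piece, "\\%03o", (unsigned)*p);

        if (used + (size_t)len >= out_size)
            break;                  /* no room left for this piece */
        memcpy(out + used, piece, (size_t)len);
        used += (size_t)len;
    }
    out[used] = '\0';
    return used;
}
```

Going through unsigned char avoids undefined behavior when isgraph() is handed a negative char value from 8-bit input.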
The function token_code_name() must be defined in the parser file since it uses the macro YYTRANSLATE, which is defined therein. The command

    bison -dtv -o parser.c parser.y

can be used to generate the output C parser.

7. Debugging generated C programs

Why am I getting the following error message?

    /cats/gnu/sparclib/bison/bison.simple:270: parse error before )

This is a recurring problem caused by the stupid way that the C compiler works (or doesn't). It first runs a preprocessor over the program and then compiles the output thereof. In order to be «helpful», bison and flex put in #line directives to point errors at the original source, but with matchfix operator errors, this can lead to confusing error messages. bison.simple is the prototype parser into which your actions are merged. The message means that the error occurred somewhere in front of where it is reported, but that could be anywhere and the printed line numbers are not necessarily of any use at all.

First, edit the parser.c file and delete all #line directives. Recompile, and see if the error message refers to a more meaningful location. The problem is in code you wrote in your .y file and which was propagated to the .c file.

Second, if that doesn't work, use the command:

    gcc -E y_parser.c > plain.c

This will preprocess the program so that you can see exactly what is being compiled when you gcc plain.c.

If that doesn't work, apply the binary search technique to the program. Comment out all your semantic actions: {/*... */} or /*{... }*/, and #ifdef out your section 3 code:

    #ifdef COMMENT_OUT
    your section 3 code
    #endif

Do the same in your section 1 %{... %} declarations. If you recompile, the error (hopefully) will be gone, because the offending code will be gone. Then put the code back in a little at a time until the error comes back. Especially: check for mismatched matchfix operators like {}[]()/**/.

Of course, if you use the options -ansi -Wall -pedantic when compiling the generated K&R code, you'll get a ton of warnings. So compile the generated code without those options and only use the «friendly» options when compiling code you wrote yourself.

8. Avoid keywords in the lexical grammar

The following is a very poor way of recognizing reserved words:

    "if"      { return make_token( KW_IF ); }
    "while"   { return make_token( KW_WHILE ); }
    ...etc...
    {IDENT}   { return make_token( IDENTIFIER ); }

A much better way to do it is as follows:

    {IDENT}   { return make_ident_token( IDENTIFIER ); }

where the function make_ident_token() first searches for yytext in a reserved word table and then returns either the code for IDENTIFIER or one of the keyword codes, as appropriate. Searching a keyword table can be done with the C library function bsearch(). A linear search is NOT acceptable, NOR is a sequence of if-else statements.

Alternatively, instead of a reserved word table, you could statically initialize an array of String_nodes and then insert them into the string table by a function similar to the intern function, but which does not allocate any new storage. That way, looking up a string in the string table will automatically distinguish between an identifier and a reserved word. Of course, it would require an extra bit in the string table.

As an experiment, let's take all of the C++ keywords and drop them into a scanner and see what is produced. If there are no keywords in the lexical grammar, flex produces the following:

    221/2000 NFA states
    57/1000 DFA states (266 words)
    509 state/nextstate pairs created
    101/408 unique/duplicate transitions
    57/1000 base-def entries created
    655/2000 (peak 0) nxt-chk entries created

    static const struct yy_trans_info yy_transition[683] =

If all of the C++ keywords are put in the lexical grammar, flex produces the following:

    632/2000 NFA states
    275/1000 DFA states (1536 words)
    6681 state/nextstate pairs created
    773/5908 unique/duplicate transitions
    275/1000 base-def entries created
    12271/14000 (peak 0) nxt-chk entries created

    static const struct yy_trans_info yy_transition[12323] =

The statistics come from the output of running flex, and the line of C declaration is from the generated scanner. As you can see, the numbers for the second scanner are MUCH larger: 2732 bytes for the scanner without keywords (683 entries of 4 bytes each), against 12323 entries for the scanner with keywords, which at the same 4 bytes per entry works out to 49292 bytes. It does not take that many bytes of memory to store a keyword table. And these numbers are just for the array containing the FSM integer codes.

9. Flex options: -pp -8bdsv -CeF

A good set of options to use with flex is: -pp -8bdsv -CeF.

    -pp   generate a performance report for both major and minor performance losses.
    -8    generate an 8-bit clean scanner.
    -b    generate backup information.
    -d    compile the scanner in debug mode.
    -s    suppress the default rule to find holes in the rule set.
    -v    generate summary stats.
    -Ce   construct equivalence classes to reduce the scanner size.
    -CF   generate an alternate fast scanner.

You can use whichever options work for you.

10. The Error reporting module

You must have an error handling module which will accept error messages in various formats. One of its functions must be called yyerror() with a specific format. Error messages should be printed in a format similar to that printed by gcc, namely with the filename, line number, and specific message. For the scanner and the parser, yyerror() will be used, and the current line number maintained by your scanner code can be printed. For other phases, the line number from the token node can be printed.
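A minimal sketch of such an error module might look like the following. The name put_error() is taken from the yyerror() wrapper in this handout, but its exact signature is an assumption, and set_error_filename(), get_error_count(), and error_exit_status() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Module state. The filename would normally be taken from argv[]. */
static const char *error_filename = "<unknown>";
static int error_count = 0;

/* Hypothetical setter: remember which source file is being compiled. */
void set_error_filename(const char *filename)
{
    assert(filename != NULL);
    error_filename = filename;
}

/* Print "file:line: message" to stderr, roughly in the style of gcc,
 * and count the error. The signature is assumed from the call
 * put_error( yylineno, message ) shown in this handout. */
void put_error(int line_nr, const char *message)
{
    assert(message != NULL);
    fprintf(stderr, "%s:%d: %s\n", error_filename, line_nr, message);
    error_count++;
}

/* Hypothetical accessor for the number of errors reported so far. */
int get_error_count(void)
{
    return error_count;
}

/* Value for main() to return: zero only if no errors were reported. */
int error_exit_status(void)
{
    return error_count == 0 ? 0 : 1;
}
```

Keeping the count behind accessor functions, rather than exposing a global, follows the earlier advice to avoid globals where possible.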
One thing you will need when you link in the dummy parser is a function:

    void yyerror( const char *message ){
       put_error( yylineno, message );
    }

It should in turn call your own error message function. You should have an error message function which prints to stderr the name of the file in error (i.e., the file whose name you got from argv[]), the line number in that file most closely associated with the error, and the text of the error message. It should also maintain an error count so that main() knows whether to return a zero or non-zero return code.

11. Gcc options

Both flex and bison produce old-style K&R code which, when compiled with the -ansi option, generates many warnings. Suppress this option when you compile the generated code, but only for that code. Also, never put more than the absolute minimum amount of C code in either the .l or the .y file. Use function calls and includes and put the code elsewhere whenever possible. This will tend to reduce the number of times the compiler fails to warn you about non-ANSI things in your code.

In addition, flex and bison do not understand C code. They simply take whatever you have in the semantic actions between squiggle brackets and in sections one and three and copy them to the output file. Errors in the C code will not show up during the flex or bison phase, but only when you get to compile the generated code.
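Returning to the reserved word lookup of section 8, a bsearch()-based table could be sketched roughly as below. The token code values and the function name classify_identifier() are assumptions made for illustration; in the project the codes would come from parser.h, and the lookup would sit inside make_ident_token().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; the real ones would come from parser.h. */
enum {
    IDENTIFIER = 258,
    KW_CHAR, KW_ELSE, KW_IF, KW_INT, KW_RETURN, KW_VOID, KW_WHILE
};

struct keyword {
    const char *text;
    int token_code;
};

/* The c0 reserved words. The table must stay sorted in strcmp()
 * order, because bsearch() requires a sorted array. */
static const struct keyword keywords[] = {
    { "char",   KW_CHAR   },
    { "else",   KW_ELSE   },
    { "if",     KW_IF     },
    { "int",    KW_INT    },
    { "return", KW_RETURN },
    { "void",   KW_VOID   },
    { "while",  KW_WHILE  },
};

/* bsearch() comparison: the key is a string, the entry a table row. */
static int compare_keyword(const void *key, const void *entry)
{
    const struct keyword *kw = entry;
    return strcmp((const char *)key, kw->text);
}

/* Return the keyword code for `text`, or IDENTIFIER if it is not
 * a reserved word. */
int classify_identifier(const char *text)
{
    const struct keyword *hit;

    assert(text != NULL);
    hit = bsearch(text, keywords,
                  sizeof keywords / sizeof keywords[0],
                  sizeof keywords[0], compare_keyword);
    return hit != NULL ? hit->token_code : IDENTIFIER;
}
```

With something like this in place, the {IDENT} action could pass yytext to classify_identifier() and store the result as the token code, so that no keyword patterns are needed in the lexical grammar.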


PRACTICAL CLASS: Flex & Bison Master s Degree Course in Computer Engineering Formal Languages FORMAL LANGUAGES AND COMPILERS PRACTICAL CLASS: Flex & Bison Eliana Bove eliana.bove@poliba.it Install On Linux: install with the package

More information

Programming Assignment II

Programming Assignment II Programming Assignment II 1 Overview of the Programming Project Programming assignments II V will direct you to design and build a compiler for Cool. Each assignment will cover one component of the compiler:

More information

Appendix. Grammar. A.1 Introduction. A.2 Keywords. There is no worse danger for a teacher than to teach words instead of things.

Appendix. Grammar. A.1 Introduction. A.2 Keywords. There is no worse danger for a teacher than to teach words instead of things. A Appendix Grammar There is no worse danger for a teacher than to teach words instead of things. Marc Block Introduction keywords lexical conventions programs expressions statements declarations declarators

More information

Control flow and string example. C and C++ Functions. Function type-system nasties. 2. Functions Preprocessor. Alastair R. Beresford.

Control flow and string example. C and C++ Functions. Function type-system nasties. 2. Functions Preprocessor. Alastair R. Beresford. Control flow and string example C and C++ 2. Functions Preprocessor Alastair R. Beresford University of Cambridge Lent Term 2007 #include char s[]="university of Cambridge Computer Laboratory";

More information

CS113: Lecture 7. Topics: The C Preprocessor. I/O, Streams, Files

CS113: Lecture 7. Topics: The C Preprocessor. I/O, Streams, Files CS113: Lecture 7 Topics: The C Preprocessor I/O, Streams, Files 1 Remember the name: Pre-processor Most commonly used features: #include, #define. Think of the preprocessor as processing the file so as

More information

Decaf PP2: Syntax Analysis

Decaf PP2: Syntax Analysis Decaf PP2: Syntax Analysis Date Assigned: 10/10/2013 Date Due: 10/25/2013 11:59pm 1 Goal In this programming project, you will extend the Decaf compiler to handle the syntax analysis phase, the second

More information

Compiler construction in4020 lecture 5

Compiler construction in4020 lecture 5 Compiler construction in4020 lecture 5 Semantic analysis Assignment #1 Chapter 6.1 Overview semantic analysis identification symbol tables type checking CS assignment yacc LLgen language grammar parser

More information

Semantic actions for declarations and expressions

Semantic actions for declarations and expressions Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

CSCI 171 Chapter Outlines

CSCI 171 Chapter Outlines Contents CSCI 171 Chapter 1 Overview... 2 CSCI 171 Chapter 2 Programming Components... 3 CSCI 171 Chapter 3 (Sections 1 4) Selection Structures... 5 CSCI 171 Chapter 3 (Sections 5 & 6) Iteration Structures

More information

Semantic actions for declarations and expressions. Monday, September 28, 15

Semantic actions for declarations and expressions. Monday, September 28, 15 Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square) CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square) Introduction This semester, through a project split into 3 phases, we are going

More information

Chapter 11 Introduction to Programming in C

Chapter 11 Introduction to Programming in C Chapter 11 Introduction to Programming in C C: A High-Level Language Gives symbolic names to values don t need to know which register or memory location Provides abstraction of underlying hardware operations

More information

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community http://csc.cs.rit.edu History and Evolution of Programming Languages 1. Explain the relationship between machine

More information

The Structure of a Syntax-Directed Compiler

The Structure of a Syntax-Directed Compiler Source Program (Character Stream) Scanner Tokens Parser Abstract Syntax Tree Type Checker (AST) Decorated AST Translator Intermediate Representation Symbol Tables Optimizer (IR) IR Code Generator Target

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Recap: First-Longest-Match Analysis The Extended Matching

More information

Compiler Construction

Compiler Construction Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Recap: First-Longest-Match Analysis Outline of Lecture

More information

A Fast Review of C Essentials Part I

A Fast Review of C Essentials Part I A Fast Review of C Essentials Part I Structural Programming by Z. Cihan TAYSI Outline Program development C Essentials Functions Variables & constants Names Formatting Comments Preprocessor Data types

More information

Using an LALR(1) Parser Generator

Using an LALR(1) Parser Generator Using an LALR(1) Parser Generator Yacc is an LALR(1) parser generator Developed by S.C. Johnson and others at AT&T Bell Labs Yacc is an acronym for Yet another compiler compiler Yacc generates an integrated

More information

Module 8 - Lexical Analyzer Generator. 8.1 Need for a Tool. 8.2 Lexical Analyzer Generator Tool

Module 8 - Lexical Analyzer Generator. 8.1 Need for a Tool. 8.2 Lexical Analyzer Generator Tool Module 8 - Lexical Analyzer Generator This module discusses the core issues in designing a lexical analyzer generator from basis or using a tool. The basics of LEX tool are also discussed. 8.1 Need for

More information

Cooking flex with Perl

Cooking flex with Perl Cooking flex with Perl Alberto Manuel Simões (albie@alfarrabio.di.uminho.pt) Abstract There are a lot of tools for parser generation using Perl. As we know, Perl has flexible data structures which makes

More information

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT C Review MaxMSP Developers Workshop Summer 2009 CNMAT C Syntax Program control (loops, branches): Function calls Math: +, -, *, /, ++, -- Variables, types, structures, assignment Pointers and memory (***

More information

Compiler Lab. Introduction to tools Lex and Yacc

Compiler Lab. Introduction to tools Lex and Yacc Compiler Lab Introduction to tools Lex and Yacc Assignment1 Implement a simple calculator with tokens recognized using Lex/Flex and parsing and semantic actions done using Yacc/Bison. Calculator Input:

More information

Short Notes of CS201

Short Notes of CS201 #includes: Short Notes of CS201 The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with < and > if the file is a system

More information

Lecture Outline. COMP-421 Compiler Design. What is Lex? Lex Specification. ! Lexical Analyzer Lex. ! Lex Examples. Presented by Dr Ioanna Dionysiou

Lecture Outline. COMP-421 Compiler Design. What is Lex? Lex Specification. ! Lexical Analyzer Lex. ! Lex Examples. Presented by Dr Ioanna Dionysiou Lecture Outline COMP-421 Compiler Design! Lexical Analyzer Lex! Lex Examples Presented by Dr Ioanna Dionysiou Figures and part of the lecture notes taken from A compact guide to lex&yacc, epaperpress.com

More information

Figure 2.1: Role of Lexical Analyzer

Figure 2.1: Role of Lexical Analyzer Chapter 2 Lexical Analysis Lexical analysis or scanning is the process which reads the stream of characters making up the source program from left-to-right and groups them into tokens. The lexical analyzer

More information

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell Handout written by Julie Zelenski with minor edits by Keith. flex is a fast lexical analyzer generator. You specify the scanner you want in

More information

Compilers Project 3: Semantic Analyzer

Compilers Project 3: Semantic Analyzer Compilers Project 3: Semantic Analyzer CSE 40243 Due April 11, 2006 Updated March 14, 2006 Overview Your compiler is halfway done. It now can both recognize individual elements of the language (scan) and

More information

Compiler construction 2002 week 5

Compiler construction 2002 week 5 Compiler construction in400 lecture 5 Koen Langendoen Delft University of Technology The Netherlands Overview semantic analysis identification symbol tables type checking assignment yacc LLgen language

More information

CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer

CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer Assigned: Thursday, September 16, 2004 Due: Tuesday, September 28, 2004, at 11:59pm September 16, 2004 1 Introduction Overview In this

More information

CS201 - Introduction to Programming Glossary By

CS201 - Introduction to Programming Glossary By CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with

More information

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD GROUP - B EXPERIMENT NO : 07 1. Title: Write a program using Lex specifications to implement lexical analysis phase of compiler to total nos of words, chars and line etc of given file. 2. Objectives :

More information

LEX/Flex Scanner Generator

LEX/Flex Scanner Generator Compiler Design 1 LEX/Flex Scanner Generator Compiler Design 2 flex - Fast Lexical Analyzer Generator We can use flex a to automatically generate the lexical analyzer/scanner for the lexical atoms of a

More information

DDMD AND AUTOMATED CONVERSION FROM C++ TO D

DDMD AND AUTOMATED CONVERSION FROM C++ TO D 1 DDMD AND AUTOMATED CONVERSION FROM C++ TO D Daniel Murphy (aka yebblies ) ABOUT ME Using D since 2009 Compiler contributor since 2011 2 OVERVIEW Why convert the frontend to D What s so hard about it

More information

Chapter 11 Introduction to Programming in C

Chapter 11 Introduction to Programming in C Chapter 11 Introduction to Programming in C C: A High-Level Language Gives symbolic names to values don t need to know which register or memory location Provides abstraction of underlying hardware operations

More information

Syntax-Directed Translation

Syntax-Directed Translation Syntax-Directed Translation ALSU Textbook Chapter 5.1 5.4, 4.8, 4.9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is syntax-directed translation? Definition: The compilation

More information

COMPILER DESIGN. For COMPUTER SCIENCE

COMPILER DESIGN. For COMPUTER SCIENCE COMPILER DESIGN For COMPUTER SCIENCE . COMPILER DESIGN SYLLABUS Lexical analysis, parsing, syntax-directed translation. Runtime environments. Intermediate code generation. ANALYSIS OF GATE PAPERS Exam

More information

G52CPP C++ Programming Lecture 6. Dr Jason Atkin

G52CPP C++ Programming Lecture 6. Dr Jason Atkin G52CPP C++ Programming Lecture 6 Dr Jason Atkin 1 Last lecture The Stack Lifetime of local variables Global variables Static local variables const (briefly) 2 Visibility is different from lifetime Just

More information

Programming in C++ 4. The lexical basis of C++

Programming in C++ 4. The lexical basis of C++ Programming in C++ 4. The lexical basis of C++! Characters and tokens! Permissible characters! Comments & white spaces! Identifiers! Keywords! Constants! Operators! Summary 1 Characters and tokens A C++

More information

Crafting a Compiler with C (V) Scanner generator

Crafting a Compiler with C (V) Scanner generator Crafting a Compiler with C (V) 資科系 林偉川 Scanner generator Limit the effort in building a scanner to specify which tokens the scanner is to recognize Some generators do not produce an entire scanner; rather,

More information

Project 1: Scheme Pretty-Printer

Project 1: Scheme Pretty-Printer Project 1: Scheme Pretty-Printer CSC 4101, Fall 2017 Due: 7 October 2017 For this programming assignment, you will implement a pretty-printer for a subset of Scheme in either C++ or Java. The code should

More information

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division CS 164 Spring 2010 P. N. Hilfinger CS 164: Final Examination (revised) Name: Login: You have

More information

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #54. Organizing Code in multiple files

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #54. Organizing Code in multiple files Introduction to Programming in C Department of Computer Science and Engineering Lecture No. #54 Organizing Code in multiple files (Refer Slide Time: 00:09) In this lecture, let us look at one particular

More information

Compiler, Assembler, and Linker

Compiler, Assembler, and Linker Compiler, Assembler, and Linker Minsoo Ryu Department of Computer Science and Engineering Hanyang University msryu@hanyang.ac.kr What is a Compilation? Preprocessor Compiler Assembler Linker Loader Contents

More information

Building a Parser Part III

Building a Parser Part III COMP 506 Rice University Spring 2018 Building a Parser Part III With Practical Application To Lab One source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda

More information

Lecture 03 Bits, Bytes and Data Types

Lecture 03 Bits, Bytes and Data Types Lecture 03 Bits, Bytes and Data Types Computer Languages A computer language is a language that is used to communicate with a machine. Like all languages, computer languages have syntax (form) and semantics

More information

DOID: A Lexical Analyzer for Understanding Mid-Level Compilation Processes

DOID: A Lexical Analyzer for Understanding Mid-Level Compilation Processes www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 5 Issue 12 Dec. 2016, Page No. 19507-19511 DOID: A Lexical Analyzer for Understanding Mid-Level Compilation

More information

A simple syntax-directed

A simple syntax-directed Syntax-directed is a grammaroriented compiling technique Programming languages: Syntax: what its programs look like? Semantic: what its programs mean? 1 A simple syntax-directed Lexical Syntax Character

More information