Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process

Similar documents
Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

Lex and YACC primer/howto

Lex and YACC primer/howto

PRACTICAL CLASS: Flex & Bison

Using Lex or Flex. Prof. James L. Frankel Harvard University

LECTURE 11. Semantic Analysis and Yacc

Lecture Outline. COMP-421 Compiler Design. What is Lex? Lex Specification. ! Lexical Analyzer Lex. ! Lex Examples. Presented by Dr Ioanna Dionysiou

An Introduction to LEX and YACC. SYSC Programming Languages

Syntax Analysis Part IV

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

Edited by Himanshu Mittal. Lexical Analysis Phase

Lexical and Parser Tools

CS143 Handout 04 Summer 2011 June 22, 2011 flex In A Nutshell

CS4850 SummerII Lex Primer. Usage Paradigm of Lex. Lex is a tool for creating lexical analyzers. Lexical analyzers tokenize input streams.

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

The structure of a compiler

CSC 467 Lecture 3: Regular Expressions

EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD

Yacc: A Syntactic Analysers Generator

An introduction to Flex

Introduction to Lex & Yacc. (flex & bison)

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Flex and lexical analysis

Compil M1 : Front-End

Module 8 - Lexical Analyzer Generator. 8.1 Need for a Tool. 8.2 Lexical Analyzer Generator Tool

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Lex & Yacc (GNU distribution - flex & bison) Jeonghwan Park

Flex and lexical analysis. October 25, 2016

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Chapter 3 -- Scanner (Lexical Analyzer)

Marcello Bersani Ed. 22, via Golgi 42, 3 piano 3769

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Using an LALR(1) Parser Generator

Compiler Lab. Introduction to tools Lex and Yacc

Lexical Analysis. Implementing Scanners & LEX: A Lexical Analyzer Tool

Preparing for the ACW Languages & Compilers

CSE302: Compiler Design

TDDD55- Compilers and Interpreters Lesson 2

LEX/Flex Scanner Generator

Handout 7, Lex (5/30/2001)

Automatic Scanning and Parsing using LEX and YACC

Lexical and Syntax Analysis

Concepts Introduced in Chapter 3. Lexical Analysis. Lexical Analysis Terms. Attributes for Tokens

TDDD55 - Compilers and Interpreters Lesson 3

Parsing and Pattern Recognition

IBM. UNIX System Services Programming Tools. z/os. Version 2 Release 3 SA

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Figure 2.1: Role of Lexical Analyzer

Applications of Context-Free Grammars (CFG)

Lexical and Syntax Analysis

Gechstudentszone.wordpress.com

Type 3 languages. Regular grammars Finite automata. Regular expressions. Deterministic Nondeterministic. a, a, ε, E 1.E 2, E 1 E 2, E 1*, (E 1 )

Principles of Programming Languages

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

Concepts. Lexical scanning Regular expressions DFAs and FSAs Lex. Lexical analysis in perspective

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Ray Pereda Unicon Technical Report UTR-02. February 25, Abstract

Etienne Bernard eb/textes/minimanlexyacc-english.html

CS143 Handout 12 Summer 2011 July 1 st, 2011 Introduction to bison

Ulex: A Lexical Analyzer Generator for Unicon

PRINCIPLES OF COMPILER DESIGN UNIT II LEXICAL ANALYSIS 2.1 Lexical Analysis - The Role of the Lexical Analyzer

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

A program that performs lexical analysis may be termed a lexer, tokenizer, or scanner, though scanner is also a term for the first stage of a lexer.

Chapter 4. Lexical analysis. Concepts. Lexical scanning Regular expressions DFAs and FSAs Lex. Lexical analysis in perspective

CS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017

TDDD55- Compilers and Interpreters Lesson 3

COMPILER CONSTRUCTION Seminar 02 TDDB44

CSCI Compiler Design

Lexical Analyzer Scanner

LECTURE 7. Lex and Intro to Parsing

UNIT - 7 LEX AND YACC - 1

Parsing How parser works?

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Structure of Programming Languages Lecture 3

Programming Assignment II

Gechstudentszone.wordpress.com

Part 5 Program Analysis Principles and Techniques

Lexical Analyzer Scanner

Yacc Yet Another Compiler Compiler

Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

CSCI312 Principles of Programming Languages!

Programming Assignment I Due Thursday, October 9, 2008 at 11:59pm

Programming Assignment I Due Thursday, October 7, 2010 at 11:59pm

Lex Spec Example. Int installid() {/* code to put id lexeme into string table*/}

Group A Assignment 3(2)

LECTURE 6 Scanning Part 2

Syntax-Directed Translation

Department : Computer Science & Engineering

Compiler course. Chapter 3 Lexical Analysis

CSE302: Compiler Design

Compiler Design 1. Yacc/Bison. Goutam Biswas. Lect 8

Compiler Construction D7011E

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

File I/O in Flex Scanners

Lexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual

Compiler Construction

Lex (Lesk & Schmidt[Lesk75]) was developed in the early to mid- 70 s.

Compiler Construction

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

Transcription:

Big Picture: Compilation Process Source program CSCI: 4500/6500 Programming Languages Lex & Yacc Symbol Table Scanner Lexical Parser Syntax Intermediate Code Generator Semantic Lexical units, token stream Parse tree Optimizer (optional) Abstract syntax tree or other intermediate form Code Generator Machine Language Maria Hybinette, UGA 1 Computer Maria Hybinette, UGA 2 Big Picture: Compilation Process Big Picture: Compilation Process Scanner Lexical Parser Syntax Source program Lexical units, token stream Parse tree Scanner Lexical Parser Syntax Source program Lexical units, token stream Parse tree a = b + c * d id1 = id2 + id3 * id4 = id1 + * id2 id3 id4 Code Generator Machine Language Computer Maria Hybinette, UGA 3 Code Generator Machine/Assembly Language load id3 mul id4 Computer add id2 store id1 Maria Hybinette, UGA 4 General Process Overview! 1975 Lex & YACC automated the compilation process» The GNU version of these are called flex and bison and are free.! Lex take patterns and generate code for a lexical analyzer or scanner.» Converts strings of input using user defined patterns to tokens.! Yacc reads user specified grammars to generate code for a syntax analyzer or a parser, then the parser compiles your program in your language» The syntax analyzer uses grammar rules that allow it to analyze tokens from the lexical analyzer and create a syntax tree (a hierarchical data structure).» CC.! Final step - code generation, does a depth-first walk of the syntax tree to generate code (e.g., in machine code). Maria Hybinette, UGA 5 file.y file.l yacc y.tab.h lex (yyparse) y.tab.c lex.yy.c (yylex)! Pattern matching rules for tokens in file.l (defines the vocabulary of the language)» Tells lex what the strings/symbols of the language look like and so it can convert them to tokens (enters them into the symbol table, with attributes, such as data type, e.g., integer) which yacc understands! Grammar rules for language in file.y» Tells yack what the grammar rules are so it can analyze the tokens that it got from lex and creates a syntax tree. Maria Hybinette, UGA 6 gcc Source in language Hello.maria MariasNewCompiler.exe Compiled output

Lex and Yacc What is Lex?! Lex and Yacc are tools for generating language parsers! Each do a single function! This is what they do:» Get a token from the input stream (stream of characters) and» Parse a sequence of tokens to see if it matches a grammar! Yacc generates a parser (another program).» Calls Lex generated tokenizer function (yylex()) each time it wants a token.» You can define actions for particular grammar rules. For example, it can: print the string match if the input matched the grammar it expected (or something more complex) Maria Hybinette, UGA 7! Lex is a lexical analyzer generator (or tokenizer). It enables you to define the vocabulary of the language.» How?: It automatically generates a lexer or scanner given a lex specification file as input (.l file)! Purpose: Breaks up the input stream into tokens and it sees a group of characters that match a key, takes a certain action.» For example consider breaking up a file containing the story Moby Dick into individual words Ignore white space Ignore punctuation! Generates a C source file, e.g., maria.c, example1.c» contains a function called yylex() that obtains the next valid token in the input stream.» this source file in turn can then be compiled by the C compiler (e.g., gcc) to machine/assembly code. Maria Hybinette, UGA 8 Lex s Input File Syntax Lex Syntax! Lex input file consist of up to three sections» Lex and C definition section that can be used in the middle section. C definitions are wrapped in % and» Pattern action pairs, where the pattern is a regular expression and the action is in C syntax definitions.» Supplementary C-routines (later) % #include <stdio> Stop printf( Stop command received ) Start printf( Start command received ) example1.l patterns rules Maria Hybinette, UGA 9 subroutines! Lex input file consist of up to three sections (' ) %!.c is generated after running! This part will be embedded into *.c! Substitutions, code and start states will be copied into *.c! Define how to scan (pattern) and what action to take for each lexeme that translates to a token! Any user code for example main() calls yylex() one is provided by default. Maria Hybinette, UGA 10 % Stop printf( Stop command received ) Start printf( Start command received ) http://www.cs.uga.edu/~maria/classes/4500-spring-2012/lexyacc.zip example1.l Maximize compatibility to ATT s lex implementation Write to standard output instead of lex.yy.c atlas:maria:255 flex -l -t example1.l > example1.c atlas:maria:257 gcc example1.c -o example1 -lfl atlas:maria:261 example1 hello hello Start Start command received ^D You may wonder how the program runs, as we didn't define a main() function. This function is defined for you in libl (liblex) which we compiled in with the -lfl command. Regular expressions % [01234567890]+ printf( NUMBER\n ) [a-za-z][a-za-z0-9]* printf( WORD\n ) atlas:maria:422 flex -l -t example2.l > example2.c atlas:maria:423 gcc example2.c -o example2 -lfl atlas:maria:424 example2 hello WORD 5lkfsj NUMBER WORD lkjklj3245 WORD example2.l Make sure that you do not create zero length matches like '[0-9]*' - your lexer might get confused and start matching empty strings repeatedly.

More Complicated (C-like) Syntax! Supposing we know what we want the Syntax to look more C like! category lame-servers null category cname null type hint file /etc/bind/db.root Maria Hybinette, UGA 13 category lame-servers null category cname null type hint file /etc/bind/db.root % input3.txt example3.l [a-za-z][a-za-z0-9]* printf("word ") [a-za-z0-9\/.-]+ printf("filename ") \" printf( QUOTE ") \ printf("obrace ") \ printf("ebrace ") printf("semicolon ") \n printf("\n") [ \t]+ /* ignore whitespace */ category lame-servers null category cname null type hint file /etc/bind/db.root atlas:maria:470 example3 < input3.txt WORD OBRACE WORD FILENAME OBRACE WORD SEMICOLON EBRACE SEMICOLON WORD WORD OBRACE WORD SEMICOLON EBRACE SEMICOLON EBRACE SEMICOLON % input3.txt WORD QUOTE FILENAME QUOTE OBRACE WORD WORD SEMICOLON WORD QUOTE FILENAME QUOTE SEMICOLON EBRACE SEMICOLON [a-za-z][a-za-z0-9]* printf("word ") [a-za-z0-9\/.-]+ printf("filename ") \" printf("quote ") \ printf("obrace ") \ printf("ebrace ") printf("semicolon ") \n printf("\n") [ \t]+ /* ignore whitespace */ example3.l Some Rules of the Rules Patterns: Regular Expressions in Lex! The rules section of the Lex/Flex input contains a series of rules of the form:» pattern action! Patterns (more next slide):» Un-indented (starts in the first column) and the action starts on the same line» Ends or terminates at first non-escaped white space! Action:» If action is empty and a pattern match then input token is discarded (ignored)» If the action is enclosed in braces then the action may cross multiple lines Maria Hybinette, UGA 17! Operators:» \ [ ] ^ -?. * + ( ) $ / % < >» Use escape to use an operator as a character, the escape character is \» Examples: \$ = $ \\ = \! [ ]: Defines character classes - matches the string once.» [ab] a or b» [a-z] a or b or c or z» [^a-za-z] ^ negates - any character that is NOT a letter not the circumflex is in a character class[], outside a character class ^ means beginning of line.!. :» Matches all characters except newline (\n)! A? A* A+: Maria Hybinette, UGA» 0 or one instance of A, 0 or more, 1 or more instances of A 18

Regular Expressions in Lex Regular Expressions! More Examples:» ab?c ac or abc» [a-z]+ Lowercase strings» [a-za-z][a-za-z0-9] Alphanumeric strings, maria1, maria2 Maria Hybinette, UGA 19! Order of precedence» Kleene *,?, +» Concatenation» Alternation ( )» All operators are left associative» Example: a*b cd*! [ \t\n]! = ((a*)b) (c(d*))» if no action defined for this match, lex just ignores that input. Maria Hybinette, UGA 20 Pattern Matching Primitives Lex predefined variables Metacharacter Matches. any character except newline \n newline * zero or more copies of the preceding expression + one or more copies of the preceding expression? zero or one copy of the preceding expression ^ $ end of line a b (ab)+ [ab] a3 a+b beginning of line / complement (outside the [ ] character class). a or b one or more copies of ab (grouping) a or b 3 instances of a literal a+b Maria Hybinette, UGA 21 yytext String containing the lexeme yyleng Length of the string yyin Input stream pointer FILE * yyout Output stream pointer FILE * Example: Counting number of words in a file and their total size: [a-za-z]+ words++ chars += yyleng Maria Hybinette, UGA 22 %!! int num_words = 0, num_chars = 0!!!! [a-za-z]+!! int main()!! Printing out a Summary!num_words++ num_chars += yyleng! yylex()! printf("%d words %d chars\n", num_words, num_chars )!! atlas:maria:422 flex -l -t examplex.l > examplex.c! atlas:maria:423 gcc examplex.c -o examplex -lfl! atlas:maria:424 examplex! helkje helkhweoj lkwjerflkwj maria is great!! Skljdf! examplex.l! ^D! 7 words 44 chars! Maria Hybinette, UGA 23 Lex Library Routines yylex() Default main() calls yylex() yymore() Returns next token yyless(n) Retain the first n characters in yytext yywarp() Called at end of file EOF, returns 1 is default Maria Hybinette, UGA 24

More Lex Summary LEX! A regular expression in Lex finishes with a space, tab or newline! The choices of Lex:» Lex always matches the longest (number of characters) lexeme possible» If two or more lexemes are of the same length, then Lex choose the lexem corresponding to the first regular expression.» Example: matchlongest.l! Lex is a scanner generator» generates C code (or C++ code) to scan inputs for lexemes (string of the language) and convert them to tokens! Lex defines the tokens by using an input file (specification file) that specifies the lexemes as regular expressions! Regular expressions in the specification file terminates with» space, newline or tab! Rules:» Longest possible match first,» Same length, picks the expression that is specified first Maria Hybinette, UGA 25 Maria Hybinette, UGA 26 What is YACC: Yet Another Compiler-Compiler?! Last time: Big picture and tutorial on LEX! Today: Tutorial on YACC! Thursday Parse Trees / Ambiguity, and Conclusion on Parsing. Maria Hybinette, UGA 27! A tool that automatically generates a parser according to given specifications.» YACC is higher-level that lex: it deals with sentence instead of words.» YACC specification is given in a specification file, typically post fixed with a y, so a.y file.» Parses input streams coming from lex that now contains tokens (these tokens may have values associated with them, more on this shortly).! YACC uses grammar rules that allows it to analyze if the stream of tokens from lex is legal.» We will use a free version of YACC called bison (http://www.gnu.org/software/bison/manual/pdf/bison.pdf) Maria Hybinette, UGA 28 YACC File Format (looks a lot like a lex configuration file) YACC Syntax! Yacc input file consist of up to three sections %... C declarations...... yacc declarations...... rules... Declarations %!.c is generated after running! This part will be embedded into *.c! Contains token declarations. These are the tokens that are recognized by lex (y.tab.h)... Subroutines... Rules! Define how to understand the language and what actions to take for each sentence Maria Hybinette, UGA 29 Subroutines! Any user code for example main() calls yyparse() one is provided by default. Maria Hybinette, UGA 30

The YACC Specification File (.y) The Rule Section: BNF like! Definitions» Declarations of tokens» Type of values used on parser stack (if different from default type (INTEGER)).! Rules These definitions are then defined in a header file, y.tab.h that lex includes (processed separately d)» Lists grammar rules with corresponding routines! User Code production : symbol1 symbol2 action symbol3 symbol4 action Production : symbol1 symbol2 action Maria Hybinette, UGA 31 Maria Hybinette, UGA 32 Example (Preview) Diving In: Example Rule! Give me a rule in a grammar that includes» expressions that» add or subtract two NUMBERS.! Assume NUMBERS is already defined.! USES stacks to keep track of values and current symbols (current parsing state) in parsing a grammar.» Parse stack (current parsing state)» Value stack (type and value) Maria Hybinette, UGA 33 statement : expression printf ( = %g\n, $1) expression : expression + expression $$ = $1 + $3 expression - expression $$ = $1 - $3 number $$ = $1 statement expression According these two productions, expression expression 5 + 4 3 + 2 is parsed into: number expression expression number expression expression number number 5 + 4-3 + 2 Maria Hybinette, UGA 34 BUT First: Lets do a simpler example to see how YACC works with LEX Controlling a Thermostat: Lex input! Create a simple language to control a thermostat (red)! Need:» 2 states that are set to on or off» Target» Temperature» Number! First create a scanner (lexer) to define the tokens of the language:» example4.l next! Session: New temperature set! Tokens: on off temperature number Maria Hybinette, UGA 35 Tokens: on off temperature number New temperature set! % example4.l #include example4.tab.h /* generated from yacc */ extern YYSTYPE yylval /* need for lex/yacc */ [0-9]+ return NUMBER return TOKHEAT on off return STATE return TOKTARGET temperature return TOKTEMPERATURE \n /* ignore end of line */ [ \t]+ /* ignore whitespace */ Tokens are fed (returned) to yacc y.tab.h defines the tokens : generated from yacc. Maria Hybinette, UGA 36

Yacc Rule Section Controlling a Thermostat: YACC input Details: More later New temperature set! Tokens: on off temperature number commands: /* empty */ commands command command: _switch _set _switch: TOKHEAT STATE printf("\theat turned on or off\n") _set: TOKTARGET TOKTEMPERATURE NUMBER printf("\ttemperature set\n") example4.y Maria Hybinette, UGA 37! If components in a rule is empty, it means that the result can match the empty string ("). For example, how would you define a comma-separated sequence of zero or more letters?» A, B, C, D,!. E startlist: /* empty */ letterlist letterlist: letter letterlist, letter! Left recursion better on Bison because it can then use a bounded stack (we look at this at more detail next week or maybe this Thursday) Maria Hybinette, UGA 38 Controlling a Thermostat: The rest of the YACC specifications New temperature set! Tokens: on off temperature number % #include <string.h> void yyerror(const char *str) fprintf(stderr,"error: %s\n",str) int yywrap() /* called at EOF can open another then return 0 */ return 1 /* YES! I am really done */ main() yyparse() /* get things started */ example4.y! yyerror() called when yacc finds an error! yywrap() used if reading from multiple files, yywrap() called at EOF, so you can open up another file and return 0. OR you return 1 to indicate nope you are really done. %token NUMBER TOKHEAT STATE TOKTARGET TOKTEMPERATURE Maria Hybinette, UGA 39 Maria Hybinette, UGA 40 Flex, YACC and Run example4.l example4.y atlas:maria:194 flex -l -t example4.l > example4l.c atlas:maria:195 bison -d example4.y rm example4.tab.c // declaration atlas:maria:196 bison -v example4.y -o example4y.c atlas:maria:197 gcc example4l.c example4y.c -o example4 atlas:maria:202 example4 Heat turned on or off Heat turned on or off temperature 10 Temperature set temperature 10 atlas:maria:202 example4 Heat turned on or off Heat turned on or off temperature 10 Temperature set temperature 10 error: parse error Notice: No link command to gcc, i.e. no -lfl Do you know why? Wanted this New temperature set! Temperature set to 22 Maria Hybinette, UGA 41 Need access to the value of parameters

Add parameters to YACC Controlling a Thermostat: Lex input! When lex matches a :» The matched string is in yytext» How is it communicated to YACC? To communicate a value to yacc you can use the variable yylval (so far we have not seen how to do this, but we will now). Maria Hybinette, UGA 43 Temperature set to 22 Tokens: on off temperature number % example5.l #include example5.tab.h /* generated from yacc */ extern YYSTYPE yylval [0-9]+ yylval=atoi(yytext) return NUMBER return TOKHEAT on off yylval=!strcmp(yytext,"on") return STATE return TOKTARGET temperature return TOKTEMPERATURE \n /* ignore end of line */ [ \t]+ /* ignore whitespace */ Tokens are fed (returned) to yacc y.tab.h defines the tokens Maria Hybinette, UGA 44 Temperature set to 22 % example5.l #include example4.tab.h /* generated from yacc */ extern YYSTYPE yylval [0-9]+ yylval=atoi(yytext) return NUMBER return TOKHEAT on off yylval=!strcmp(yytext,"on") return STATE return TOKTARGET temperature return TOKTEMPERATURE \n /* ignore end of line */ [ \t]+ /* ignore whitespace */ Temperature set to 22 % example5.l #include example4.tab.h /* generated from yacc */ extern YYSTYPE yylval [0-9]+ yylval=atoi(yytext) return NUMBER return TOKHEAT on off yylval=!strcmp(yytext,"on") return STATE return TOKTARGET temperature return TOKTEMPERATURE \n /* ignore end of line */ [ \t]+ /* ignore whitespace */ _switch: TOKHEAT STATE if($2) else example5.y printf("\theat turned on\n") printf("\theat turned off\n") _set: example5.y TOKTARGET TOKTEMPERATURE NUMBER printf( \ttemperature set to %d\n, $3) _switch: example5.y TOKHEAT STATE if($2) printf("\theat turned on\n") else printf("\theat turned off\n") _set: TOKTARGET TOKTEMPERATURE NUMBER printf("\ttemperature set to %d\n",$3) atlas:maria:248 flex -l -t example5.l > example5l.c atlas:maria:249 bison -v example5.y -o example5y.c atlas:maria:250 bison -d example5.y rm example5.tab.c rm: remove example5.tab.c (yes/no)? yes atlas:maria:251 gcc example5l.c example5y.c -o example5 atlas:maria:253 example5 Heat turned on temperature 10 Temperature set to 10 Continue lame-servers! Write YACC grammar! Need to translate the lexer so that it returns values to YACC (e.g., file names and zone names) example6.l category lame-servers null category cname null type hint file /etc/bind/db.root Maria Hybinette, UGA 48

category lame-servers null category cname null type hint file /etc/bind/db.root % #include example5.tab.h" extern YYSTYPE yylval % zone return ZONETOK file return FILETOK [a-za-z][a-za-z0-9]* yylval = strdup(yytext) return WORD [a-za-z0-9\/.-]+ yylval = strdup(yytext) return FILENAME \" return QUOTE \ return OBRACE \ return EBRACE return SEMICOLON \n /* ignore EOL */ [ \t]+ /* ignore whitespace */ example6.l [a-za-z][a-za-z0-9]* printf("word ") [a-za-z0-9\/.-]+ printf("filename ") \" printf("quote ") \ printf("obrace ") \ printf("ebrace ") printf("semicolon ") \n printf("\n") [ \t]+ /* ignore whitespace */ example6.y category lame-servers null category cname null type hint #define YYSTYPE char * file /etc/bind/db.root other routines commands: commands command SEMICOLON command: zone_set zone_set: ZONETOK quotedname zonecontent printf( Complete zone for %s found\n, $2 ) example6.y category lame-servers null category cname null type hint file /etc/bind/db.root! Generic statements to catch statements within the zone block example6.y category lame-servers null category cname null type hint file /etc/bind/db.root zonecontent: OBRACE zonestatements EBRACE quotedname: QUOTE FILENAME QUOTE $$=$2! quotedname s value is without the quotes,» $$=$2 zonestatements: zonestatements zonestatement SEMICOLON zonestatement: statements FILETOK quotedname printf( A zonefile name %s was encountered\n, $2 )! Generic statements to catch block statements too block: OBRACE zonestatements EBRACE SEMICOLON statements: statements statement statement: WORD block quotedname example6.y category lame-servers null category cname null type hint file /etc/bind/db.root Flex, Yacc and Run atlas:maria:194 flex -l -t example6.l > example6l.c atlas:maria:195 bison -d example6.y rm example6.tab.c atlas:maria:196 bison -v example6.y -o example6y.c atlas:maria:197 gcc example6l.c example6y.c -o example6 type hint file /etc/bind/db.root example6.l example6.y atlas:maria:202 example6 < input6.txt A zonefile name '/etc/bind/db.root' was encountered Complete zone for '.' found Maria Hybinette, UGA 54

Summary! YACC file you write your own main() which calls yyparse()» yyparse() is created by YACC and ends up in y.tab.c! yyparse() reads a stream of token/value pairs from yylex()» Code yylex() yourself or have lex do it for you! yylex() returns an integer value representing a token type you can optionally define a value for the token in yylval (default int)» tokens have numeric id s starting from 256 Maria Hybinette, UGA 55