More Examples. Lex/Flex/JLex
- Amberlynn Jenkins
- 5 years ago
More Examples

A FORTRAN-like real literal (which requires digits on either or both sides of a decimal point, or just a string of digits) can be defined as

    RealLit = (D+ (λ | .)) | (D* . D+)

where D stands for any digit.

[Figure: DFA for RealLit]

An identifier consisting of letters, digits, and underscores, which begins with a letter and allows no adjacent or trailing underscores, may be defined as

    ID = L (L | D)* ( _ (L | D)+ )*

where L stands for any letter. This definition includes identifiers like sum or unit_cost, but excludes _one and two_ and grand total.

[Figure: DFA for ID]

Lex/Flex/JLex

Lex is a well-known Unix scanner generator. It builds a scanner, in C, from a set of regular expressions that define the tokens to be scanned.

Flex is a newer and faster version of Lex.

JLex is a Java version of Lex. It generates a scanner coded in Java, though its regular expression definitions are very close to those used by Lex and Flex.

Lex, Flex and JLex are largely nonprocedural. You don't need to tell the tools how to scan. All you need to do is tell them what you want scanned (by giving them definitions of valid tokens). This approach greatly simplifies building a scanner, since most of the details of scanning (I/O, buffering, character matching, etc.) are handled automatically.
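The RealLit and ID definitions under More Examples can be spot-checked mechanically. The following is only a sketch: it transcribes the two definitions into java.util.regex syntax (which is not JLex syntax), and it assumes that D means [0-9] and L means [A-Za-z]. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class MoreExamplesSketch {
    // RealLit = (D+ (λ | .)) | (D* . D+), with D assumed to be [0-9]
    static final Pattern REAL_LIT = Pattern.compile("([0-9]+\\.?)|([0-9]*\\.[0-9]+)");

    // ID = L (L | D)* ( _ (L | D)+ )*, with L assumed to be [A-Za-z]
    static final Pattern ID = Pattern.compile("[A-Za-z][A-Za-z0-9]*(_[A-Za-z0-9]+)*");

    // True only if the whole string is a real literal.
    static boolean isRealLit(String s) {
        return REAL_LIT.matcher(s).matches();
    }

    // True only if the whole string is an identifier.
    static boolean isId(String s) {
        return ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "1.", ".5", "3.14", "."}) {
            System.out.println("RealLit? '" + s + "' : " + isRealLit(s));
        }
        for (String s : new String[] {"sum", "unit_cost", "_one", "two_", "grand total"}) {
            System.out.println("ID? '" + s + "' : " + isId(s));
        }
    }
}
```

A lone decimal point is rejected because both alternatives of RealLit demand at least one digit, and every underscore in ID must be followed by at least one letter or digit, which is what rules out trailing and adjacent underscores.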
JLex

JLex is coded in Java. To use it, you enter

    java JLex.Main f.jlex

Your CLASSPATH should be set to search the directories where JLex's classes are stored. (The CLASSPATH we gave you includes JLex's classes.)

After JLex runs (assuming there are no errors in your token specifications), the Java source file f.jlex.java is created. (f stands for any file name you choose. Thus csx.jlex might hold token definitions for CSX, and csx.jlex.java would hold the generated scanner.)

You compile f.jlex.java just like any Java program, using your favorite Java compiler. After compilation, the class file Yylex.class is created. It contains the methods:

    Token yylex()
        The actual scanner. The constructor for Yylex takes the file you want scanned, so new Yylex(System.in) will build a scanner that reads from System.in. Token is the token class you want returned by the scanner; you can tell JLex what class you want returned.

    String yytext()
        Returns the character text matched by the last call to yylex.

A simple example of using JLex is in ~cs536-1/public/jlex

Just enter

    make test

Input to JLex

There are three sections, delimited by %%. The general structure is:

    User Code
    %%
    JLex Directives
    %%
    Regular Expression Rules

The User Code section is Java source code to be copied into the generated Java source file. It contains utility classes or return type classes you need. Thus if you want to return a class IntlitToken (for integer literals that are scanned), you include its definition in the User Code section.
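As an illustration of what the User Code section might hold, here is a minimal sketch of a token class and an IntlitToken subclass. The text names IntlitToken but does not give its definition, so the fields, the constructor signatures, and the INTLIT constant below are all assumptions. The classes are nested in one wrapper class only to keep the sketch self-contained.

```java
final class UserCodeSketch {
    // Hypothetical token code; a real scanner would take this from its own symbol class.
    static final int INTLIT = 1;

    // Hypothetical base class for everything the scanner returns.
    static class Token {
        final int sym; // which kind of token this is

        Token(int sym) {
            this.sym = sym;
        }
    }

    // Hypothetical token for integer literals: it also carries the literal's value.
    static class IntlitToken extends Token {
        final int value;

        IntlitToken(int sym, String text) {
            super(sym);
            // text is the matched character string, e.g. the result of yytext()
            this.value = Integer.parseInt(text);
        }
    }

    public static void main(String[] args) {
        IntlitToken t = new IntlitToken(INTLIT, "536");
        System.out.println(t.sym + " " + t.value);
    }
}
```

Keeping the conversion from text to int inside the token class means a rule's action only has to pass along the matched text.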
JLex directives are various instructions you can give JLex to customize the scanner you generate. These are detailed in the JLex manual. The most important are:

    %{
    Code copied into the Yylex class (extra fields or methods you may want)
    %}

    %eof{
    Java code to be executed when the end of file is reached
    %eof}

    %type classname
    classname is the return type you want for the scanner method, yylex()

Macro Definitions

In section two you may also define macros, which are used in section three. A macro allows you to give a name to a regular expression or character class. This allows you to reuse definitions and makes regular expression rules more readable.

Macro definitions are of the form

    name = def

Macros are defined one per line. Here are some simple examples:

    Digit=[0-9]
    AnyLet=[A-Za-z]

In section three, you use a macro by placing its name within { and }. Thus {Digit} expands to the character class defining the digits 0 to 9.

Regular Expression Rules

The third section of the JLex input file is a series of token definition rules of the form

    RegExpr {Java code}

When a token matching the given RegExpr is matched, the corresponding Java code (enclosed in { and }) is executed. JLex figures out what RegExpr applies; you need only say what the token looks like (using RegExpr) and what you want done when the token is matched (this is usually to return some token object, perhaps with some processing of the token text).

Here are some examples:

    "+"       {return new Token(sym.Plus);}
    (" ")+    {/* skip white space */}
    {Digit}+  {return new IntToken(sym.Intlit, new Integer(yytext()).intValue());}
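To make the effect of the three example rules concrete, the sketch below imitates them by hand with java.util.regex. This is not what JLex generates, and it is not JLex syntax. The DIGIT constant stands in for the Digit macro, the token names are plain strings rather than token objects, and the class and method names are hypothetical. The three patterns cannot match the same text, so Java's ordered alternation is enough here; it would not be a faithful model for overlapping rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RuleSketch {
    // Stand-in for the macro definition Digit=[0-9]
    static final String DIGIT = "[0-9]";

    // One named alternative per rule: "+", (" ")+ and {Digit}+
    static final Pattern RULES = Pattern.compile(
            "(?<plus>\\+)|(?<ws> +)|(?<intlit>" + DIGIT + "+)");

    // Returns a printable form of each token found in the input.
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = RULES.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("no rule matches at offset " + pos);
            }
            if (m.group("plus") != null) {
                tokens.add("Plus");
            } else if (m.group("intlit") != null) {
                tokens.add("Intlit(" + Integer.parseInt(m.group("intlit")) + ")");
            }
            // White space produces no token: the action returns nothing, so scanning continues.
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("12 + 7"));
    }
}
```

The white-space branch shows the point made about actions: a rule whose action does not return simply consumes its text and lets scanning continue.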
Regular Expressions in JLex

To define a token in JLex, the user associates a regular expression with commands coded in Java. When input characters that match a regular expression are read, the corresponding Java code is executed. As a user of JLex you don't need to tell it how to match tokens; you need only say what you want done when a particular token is matched.

Tokens like white space are deleted simply by having their associated command not return anything. Scanning continues until a command with a return in it is executed.

The simplest form of regular expression is a single string that matches exactly itself. For example,

    if {return new Token(sym.If);}

If you wish, you can quote the string representing the reserved word ("if"), but since the string contains no delimiters or operators, quoting it is unnecessary. For a regular expression operator, like +, quoting is necessary:

    "+" {return new Token(sym.Plus);}

Character Classes

Our specification of the reserved word if, as shown earlier, is incomplete. We don't (yet) handle upper or mixed case. To extend our definition, we'll use a very useful feature of Lex and JLex: character classes.

Characters often naturally fall into classes, with all characters in a class treated identically in a token definition. In our definition of identifiers all letters form a class, since any of them can be used to form an identifier. Similarly, in a number, any of the ten digit characters can be used.

Character classes are delimited by [ and ]; individual characters are listed without any quotation or separators. However \, ^, ] and -, because of their special meaning in character classes, must be escaped. The character class [xyz] can match a single x, y, or z. The character class [\])] can match a single ] or ). (The ] is escaped so that it isn't misinterpreted as the end of the character class.)

Ranges of characters are separated by a -; [x-z] is the same as [xyz].
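The character classes just described can be tried out in Java. This is a sketch under an assumption: it uses java.util.regex, whose syntax happens to agree with JLex for these particular classes, rather than JLex itself. The [iI][fF] pattern is one possible way to accept the reserved word if in any case; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class CharClassSketch {
    // True only if the whole string matches the regular expression.
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    // The reserved word "if" in upper, lower, or mixed case, built from two character classes.
    static boolean isIf(String s) {
        return matches("[iI][fF]", s);
    }

    public static void main(String[] args) {
        System.out.println(matches("[xyz]", "y"));    // a class matches one character from its list
        System.out.println(matches("[x-z]", "y"));    // the range form denotes the same class
        System.out.println(matches("[\\])]", "]"));   // escaped ] inside a class
        System.out.println(isIf("If") && isIf("iF") && isIf("IF"));
    }
}
```

Note that the doubled backslash is only Java string-literal escaping; the pattern the regex engine sees is [\])], exactly as written in the text.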
[0-9] is the set of all digits and [a-zA-Z] is the set of all letters, upper- and lower-case.

\ is the escape character, used to represent unprintables and to escape special symbols. Following C and Java conventions, \n is the newline (that is, end of line), \t is the tab character, \\ is the backslash symbol itself, and \010 is the character corresponding to octal 10.

The ^ symbol complements a character class (it is JLex's representation of the Not operation). [^xy] is the character class that matches any single character except x and y. The ^ symbol applies to all characters that follow it in a character class definition, so [^0-9] is the set of all characters that aren't digits. [^] can be used to match all characters.

Here are some examples of character classes:

    Character Class    Set of Characters Denoted
    [abc]              Three characters: a, b and c
    [cba]              Three characters: a, b and c
    [a-c]              Three characters: a, b and c
    [aabbcc]           Three characters: a, b and c
    [^abc]             All characters except a, b and c
    [\^\-\]]           Three characters: ^, - and ]
    [^]                All characters
    "[abc]"            Not a character class. This is one five character string: [abc]

Regular Operators in JLex

JLex provides the standard regular operators, plus some additions.

Catenation is specified by the juxtaposition of two expressions; no explicit operator is used. Outside of character class brackets, individual letters and numbers match themselves; other characters should be quoted (to avoid misinterpretation as regular expression operators).

    Regular Expr       Characters Matched
    a b cd             Four characters: abcd
    (a)(b)(cd)         Four characters: abcd
    [ab][cd]           Four different strings: ac or ad or bc or bd
    while              Five characters: while
    "while"            Five characters: while
    [w][h][i][l][e]    Five characters: while

Case is significant.

The alternation operator is |. Parentheses can be used to control grouping of subexpressions.

If we wish to match the reserved word while allowing any mixture of upper- and lowercase, we can use

    (w|W)(h|H)(i|I)(l|L)(e|E)

or

    [wW][hH][iI][lL][eE]

    Regular Expr       Characters Matched
    ab|cd              Two different strings: ab or cd
    (ab)|(cd)          Two different strings: ab or cd
    [ab]|[cd]          Four different strings: a or b or c or d
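The complement, catenation and alternation examples can be checked the same way. As before, this is a sketch that assumes java.util.regex as a stand-in for JLex; only patterns whose meaning is the same in both notations are used (for instance, [^] is left out because Java does not accept it). The class, constant, and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class OperatorSketch {
    // Two spellings of "while in any mixture of upper- and lowercase".
    static final String WHILE_ALT = "(w|W)(h|H)(i|I)(l|L)(e|E)";
    static final String WHILE_CLASS = "[wW][hH][iI][lL][eE]";

    // True only if the whole string matches the regular expression.
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        // Both spellings accept the same mixed-case word.
        System.out.println(matches(WHILE_ALT, "WhIle") && matches(WHILE_CLASS, "WhIle"));
        // Catenation of two classes: one character from each, in order.
        System.out.println(matches("[ab][cd]", "ad"));
        // Alternation: either ab or cd, but not both in sequence.
        System.out.println(matches("ab|cd", "cd") && !matches("ab|cd", "abcd"));
        // Complemented class: any single character that is not a digit.
        System.out.println(matches("[^0-9]", "x") && !matches("[^0-9]", "7"));
    }
}
```

The character-class spelling of while is usually the easier one to read, since each position lists its allowed characters directly.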
Postfix operators:

    *   Kleene closure: 0 or more matches.
        (ab)* matches λ or ab or abab or ababab ...

    +   Positive closure: 1 or more matches.
        (ab)+ matches ab or abab or ababab ...

    ?   Optional inclusion: expr? matches expr zero times or once.
        expr? is equivalent to (expr) | λ and eliminates the need for an explicit λ symbol.
        [-+]?[0-9]+ defines an optionally signed integer literal.

Single match: The character "." matches any single character (other than a newline).

Start of line: The character ^ (when used outside a character class) matches the beginning of a line.

End of line: The character $ matches the end of a line. Thus,

    ^A.*e$

matches an entire line that begins with A and ends with e.
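The postfix operators and the line anchors can be exercised with a short Java sketch. It assumes java.util.regex rather than JLex; in Java, ^ and $ match at line boundaries only when Pattern.MULTILINE is set, so that flag is used to approximate the behavior described above. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PostfixSketch {
    // [-+]?[0-9]+ : an optionally signed integer literal
    static final Pattern SIGNED_INT = Pattern.compile("[-+]?[0-9]+");

    // ^A.*e$ : a whole line that begins with A and ends with e
    static final Pattern A_TO_E_LINE = Pattern.compile("^A.*e$", Pattern.MULTILINE);

    // True only if the whole string is an optionally signed integer literal.
    static boolean isSignedInt(String s) {
        return SIGNED_INT.matcher(s).matches();
    }

    // Collects every line of the text that begins with A and ends with e.
    static List<String> linesFromAToE(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = A_TO_E_LINE.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("(ab)*", ""));  // Kleene closure accepts the empty string
        System.out.println(Pattern.matches("(ab)+", ""));  // positive closure does not
        System.out.println(isSignedInt("-42"));
        System.out.println(linesFromAToE("Apple\nBanana\nAble"));
    }
}
```

Because "." does not match a newline, .* cannot run past the end of a line, which is why each match stays within a single line.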
CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 3
fpp: Fortran preprocessor March 9, 2009 1 Name fpp the Fortran language preprocessor for the NAG Fortran compiler. 2 Usage fpp [option]... [input-file [output-file]] 3 Description fpp is the preprocessor
More informationLexical Analysis (ASU Ch 3, Fig 3.1)
Lexical Analysis (ASU Ch 3, Fig 3.1) Implementation by hand automatically ((F)Lex) Lex generates a finite automaton recogniser uses regular expressions Tasks remove white space (ws) display source program
More informationC Language, Token, Keywords, Constant, variable
C Language, Token, Keywords, Constant, variable A language written by Brian Kernighan and Dennis Ritchie. This was to be the language that UNIX was written in to become the first "portable" language. C
More informationA First Look at ML. Chapter Five Modern Programming Languages, 2nd ed. 1
A First Look at ML Chapter Five Modern Programming Languages, 2nd ed. 1 ML Meta Language One of the more popular functional languages (which, admittedly, isn t saying much) Edinburgh, 1974, Robin Milner
More informationLexical Considerations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout Decaf Language Tuesday, Feb 2 The project for the course is to write a compiler
More informationVENTURE. Section 1. Lexical Elements. 1.1 Identifiers. 1.2 Keywords. 1.3 Literals
VENTURE COMS 4115 - Language Reference Manual Zach Adler (zpa2001), Ben Carlin (bc2620), Naina Sahrawat (ns3001), James Sands (js4597) Section 1. Lexical Elements 1.1 Identifiers An identifier in VENTURE
More informationCS415 Compilers. Lexical Analysis
CS415 Compilers Lexical Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Lecture 7 1 Announcements First project and second homework
More informationLexical Analysis. Textbook:Modern Compiler Design Chapter 2.1
Lexical Analysis Textbook:Modern Compiler Design Chapter 2.1 A motivating example Create a program that counts the number of lines in a given input text file Solution (Flex) int num_lines = 0; %% \n ++num_lines;.
More information1. Lexical Analysis Phase
1. Lexical Analysis Phase The purpose of the lexical analyzer is to read the source program, one character at time, and to translate it into a sequence of primitive units called tokens. Keywords, identifiers,
More informationIBM. UNIX System Services Programming Tools. z/os. Version 2 Release 3 SA
z/os IBM UNIX System Services Programming Tools Version 2 Release 3 SA23-2282-30 Note Before using this information and the product it supports, read the information in Notices on page 305. This edition
More informationCS321 Languages and Compiler Design I. Winter 2012 Lecture 4
CS321 Languages and Compiler Design I Winter 2012 Lecture 4 1 LEXICAL ANALYSIS Convert source file characters into token stream. Remove content-free characters (comments, whitespace,...) Detect lexical
More informationParsing and Pattern Recognition
Topics in IT 1 Parsing and Pattern Recognition Week 10 Lexical analysis College of Information Science and Engineering Ritsumeikan University 1 this week mid-term evaluation review lexical analysis its
More information