FSASIM: A Simulator for Finite-State Automata

FSASIM: A Simulator for Finite-State Automata P. N. Hilfinger

1 Overview

The fsasim program reads in a description of a finite-state recognizer (either deterministic or non-deterministic) and a sequence of strings. It reports which strings are recognized. For example, here is a description of a (deterministic) FSR for the language described by the regular expression (01)*00:

    alphabet [01]
    start state Begin
        [0] -> Zero
    state Zero
        [0] -> Done
        [1] -> Begin
    final state Done

The alphabet declaration describes the alphabet used by the machine, in the format used by the lex program to denote a pattern that matches one of a set of single characters. The start state, state, and final state declarations introduce new states. Use start state to introduce the initial state (by default, the first state declared). Use final state to introduce a final state. The lines containing -> denote transitions out of the last state declared. Each transition has the form of a character-set pattern as in lex, followed by an arrow, followed by a destination state.

Here is another example, this time for Ada identifiers, which start with a letter, followed by letters, digits, and underscores, but not containing adjacent underscores and not ending with an underscore:

    alphabet [a-zA-Z0-9_]
    start state 0
        [a-zA-Z] -> 1
    final state 1
        [a-zA-Z0-9] -> 1
        [_] -> 2
    state 2
        [^_] -> 1

As in lex, the notation [^...] used under state 2 means "any character in the alphabet other than ...".

Here is an example of a non-deterministic machine's description.

    # Recognizer for 0(01)*1 | (0|1)*0
    alphabet [01]
    start state A
        [0] -> B
        [01] -> A
        [0] -> C
    final state C
    state B
        [0] -> B1
        [1] -> D
    state B1
        [1] -> B
    final state D

The following alternative machine description recognizes the same language, and uses epsilon transitions.

    # Recognizer for 0(01)*1 | (0|1)*0
    alphabet [01]
    start state Init
        -> A1        # Epsilon transition
        -> B1        # Epsilon transition
    state A1
        [0] -> A2
        [1] -> A1
    state A2
        -> F
        -> A1
    state B1
        [0] -> B2
    state B2
        -> B4
        [0] -> B3
    state B3
        [1] -> B2
    state B4
        [1] -> F
    final state F

After reading in the machine description, fsasim will process any number of quoted input strings from a file or the standard input. For the machine just above, one might have the following session:

    % fsasim desc.fsm sample.inp
    "" rejected.
    "001010" accepted.
    "001011" accepted.
    "110001" rejected.
    "110000" accepted.

where sample.inp contains

    ""
    "001010"
    "001011"
    "110001"
    "110000"
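To make the session above concrete, here is a minimal Python sketch of how a recognizer with epsilon transitions can be simulated by tracking the set of states the machine might be in. It is not fsasim's own code: the TRANSITIONS table, START, FINAL, and the function names are all hypothetical, and the machine is transcribed by hand from the epsilon-transition description above.

```python
# Hypothetical hand encoding of the epsilon-transition machine above.
# Each state maps to a list of (charset, destination) pairs; a charset of
# None marks an epsilon transition. None of these names come from fsasim.
TRANSITIONS = {
    "Init": [(None, "A1"), (None, "B1")],
    "A1": [("0", "A2"), ("1", "A1")],
    "A2": [(None, "F"), (None, "A1")],
    "B1": [("0", "B2")],
    "B2": [(None, "B4"), ("0", "B3")],
    "B3": [("1", "B2")],
    "B4": [("1", "F")],
    "F": [],
}
START = "Init"
FINAL = {"F"}


def epsilon_closure(states):
    """Return every state reachable from `states` by epsilon moves alone."""
    closure = set(states)
    pending = list(states)
    while pending:
        state = pending.pop()
        for charset, dest in TRANSITIONS[state]:
            if charset is None and dest not in closure:
                closure.add(dest)
                pending.append(dest)
    return closure


def accepts(string):
    """Follow all possible paths at once; accept if any ends in a final state."""
    current = epsilon_closure({START})
    for ch in string:
        moved = {
            dest
            for state in current
            for charset, dest in TRANSITIONS[state]
            if charset is not None and ch in charset
        }
        current = epsilon_closure(moved)
    return bool(current & FINAL)


def report(strings):
    """Format one result line per string, in the style of the session above."""
    return [
        '"{}" {}.'.format(s, "accepted" if accepts(s) else "rejected")
        for s in strings
    ]


if __name__ == "__main__":
    for line in report(["", "001010", "001011", "110001", "110000"]):
        print(line)
```

Run on the five strings from sample.inp, this sketch produces the same accepted/rejected verdicts as the session shown above. Tracking a set of states avoids backtracking, so the same loop also handles deterministic machines, where the set never holds more than one state.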

2 FSA Descriptions

FSA descriptions are in free format; that is, you may insert additional whitespace (blanks, tabs, newlines) freely except in the middle of keywords or charsets (described below). You may also insert comments between declarations (but not within them). A comment starts with the character # and proceeds to the end of the line.

An FSA description has the following format:

  * An optional alphabet declaration of the form

        alphabet charset

    (see the description of charsets below). If this declaration is absent, the alphabet is taken to consist of all characters between blank and delete (i.e., between ASCII codes 32 and 127), inclusive.

  * Any number of state declarations, each consisting of

      - A header line having one of the three forms

            state name
            start state name
            final state name

        denoting a state labeled name that is, respectively, an ordinary state, the starting state (there should be only one), or a final state (of which there may be any number). The label name may be any sequence of letters, digits, periods, hyphens, underscores, and dollar signs.

      - Zero or more transition declarations having one of the two formats

            charset -> state
            -> state

        denoting, respectively, a transition on any of the characters in charset or an epsilon transition.

Charsets

Ordinary transitions and the alphabet declaration use charsets to describe sets of characters. These consist of a sequence of characters or character ranges enclosed in square brackets ([]). If the first character is ^, the charset denotes the complement of the character set specified by the remaining characters, relative to the declared alphabet (this form may not be used in the alphabet declaration itself). You may use any characters in a charset, but to include the special characters ], \, ^, or -, you should precede them with a backslash (\). You can use the standard C escape characters for tabs, newlines, returns, etc.

You may denote a range of characters as c1-c2, which is the same as listing all characters whose codes are greater than or equal to that of c1 and less than or equal to that of c2.
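The charset rules above can be sketched in Python as follows. This is an assumed reading of the description, not fsasim's implementation: expand_charset, DEFAULT_ALPHABET, and C_ESCAPES are hypothetical names, only a few C escapes are handled, and the sketch is more lenient than the manual in that an unescaped hyphen at the start or end of a charset is taken literally.

```python
# A rough, assumed reading of the charset rules; fsasim itself may differ
# in details such as which escape sequences it accepts.
DEFAULT_ALPHABET = frozenset(chr(code) for code in range(32, 128))  # blank..delete
C_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "0": "\0"}  # partial list only


def expand_charset(text, alphabet=DEFAULT_ALPHABET):
    """Expand a charset such as '[a-zA-Z0-9_]' or '[^_]' into a set of characters."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError("charset must be enclosed in square brackets")
    body = text[1:-1]
    complement = body.startswith("^")
    if complement:
        body = body[1:]

    # First pass: resolve backslash escapes. Each item is a pair
    # (character, is_range_operator); an escaped hyphen is an ordinary character.
    items = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash in charset")
            nxt = body[i + 1]
            items.append((C_ESCAPES.get(nxt, nxt), False))
            i += 2
        else:
            items.append((ch, ch == "-"))
            i += 1

    # Second pass: collect single characters and c1-c2 ranges.
    chars = set()
    i = 0
    while i < len(items):
        first = items[i][0]
        if i + 2 < len(items) and items[i + 1][1]:
            last = items[i + 2][0]
            if ord(last) < ord(first):
                raise ValueError("range end precedes range start")
            chars.update(chr(code) for code in range(ord(first), ord(last) + 1))
            i += 3
        else:
            chars.add(first)
            i += 1

    # A leading ^ means the complement relative to the declared alphabet.
    return (alphabet - chars) if complement else chars
```

For instance, passing the Ada-identifier alphabet as the second argument, `expand_charset("[^_]", expand_charset("[a-zA-Z0-9_]"))` yields the letters and digits only, which is the set a transition written [^_] would cover under that alphabet.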


3 Using fsasim

To invoke fsasim from a shell, type

    fsasim [-v] [--verbose] [--deterministic] [--limit=limit] [inputfile...]

The first inputfile contains the machine description. In the absence of an inputfile, all input comes from the standard input (which is also denoted by an inputfile of -).

The input strings to be tested may be appended to the FSA description, following the keyword input; placed in subsequent inputfiles; or (if there is at least one inputfile) supplied on the standard input. Each input string is enclosed in double quotes. Any embedded double quotes must be preceded by a backslash (\). You may also use the usual C/C++ conventions for special characters (e.g., \n for newline). You may optionally use whitespace to separate input strings.

The --deterministic option causes fsasim to reject any non-deterministic machine description. The --limit option issues a warning if the machine has more than limit states. The --verbose option causes fsasim to print out various explanatory messages. Finally, -v simply prints the fsasim version number and exits.
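The quoting rules for input strings can be illustrated with a short Python sketch. It is an assumption-laden reading of the paragraph above rather than fsasim's actual reader: read_input_strings and C_ESCAPES are hypothetical names, and only a handful of C escapes are recognized.

```python
# Assumed subset of the C/C++ escape conventions; fsasim may accept more.
C_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "0": "\0", "\\": "\\", '"': '"'}


def read_input_strings(text):
    """Split `text` into the double-quoted strings it contains."""
    strings = []
    i = 0
    while i < len(text):
        if text[i].isspace():  # whitespace between strings is optional
            i += 1
            continue
        if text[i] != '"':
            raise ValueError("expected a double quote at offset {}".format(i))
        i += 1  # skip the opening quote
        chars = []
        while True:
            if i >= len(text):
                raise ValueError("unterminated input string")
            ch = text[i]
            if ch == '"':  # closing quote
                i += 1
                break
            if ch == "\\":  # backslash escape, including \" for a literal quote
                if i + 1 >= len(text):
                    raise ValueError("dangling backslash in input string")
                nxt = text[i + 1]
                chars.append(C_ESCAPES.get(nxt, nxt))
                i += 2
            else:
                chars.append(ch)
                i += 1
        strings.append("".join(chars))
    return strings


if __name__ == "__main__":
    sample = '"" "001010"\n"001011" "110001""110000"'
    print(read_input_strings(sample))
```

Because the quotes delimit each string, the empty string can be tested by writing "", and adjacent strings need no separator at all.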
