CSE Theory of Computing Fall 2017 Project 2-Finite Automata

Similar documents
CSE Theory of Computing Fall 2017 Project 3: K-tape Turing Machine

CSE Theory of Computing Spring 2018 Project 2-Finite Automata

CSE Theory of Computing Spring 2018 Project 2-Finite Automata

CSE Theory of Computing Fall 2017 Project 1-SAT Solving

CSE Theory of Computing Fall 2017 Project 4-Combinator Project

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 5

CS103 Handout 13 Fall 2012 May 4, 2012 Problem Set 5

Course Project 2 Regular Expressions

CS2 Practical 2 CS2Ah

CS 374 Fall 2014 Homework 2 Due Tuesday, September 16, 2014 at noon

I have read and understand all of the instructions below, and I will obey the Academic Honor Code.

The Language for Specifying Lexical Analyzer

ASSIGNMENT TOOL TRAINING

Lexical Analysis. Lecture 2-4

Midterm I (Solutions) CS164, Spring 2002

CS103 Handout 14 Winter February 8, 2013 Problem Set 5

Implementation of Lexical Analysis

Lexical Analysis. Chapter 2

Lab Assignment. Lab 1, Part 2: Experimental Evaluation of Algorithms. Assignment Preparation. The Task

NFAs and Myhill-Nerode. CS154 Chris Pollett Feb. 22, 2006.

User s Guide: JFLAP Finite Automata Feedback/Grading Tool

Lexical Analysis. Implementation: Finite Automata

CS 342 Software Design Spring 2018 Term Project Part III Saving and Restoring Exams and Exam Components

Compilers. Computer Science 431

CSCI 204 Introduction to Computer Science II

Lexical Analysis. Lecture 3-4

Implementation of Lexical Analysis

Theory of Computation Dr. Weiss Extra Practice Exam Solutions

Buzz Student Guide BUZZ STUDENT GUIDE

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

6 NFA and Regular Expressions

Lab 4: Google Forms. Armand Poblete ( 2016)

University of Waterloo Midterm Examination

Lexical Error Recovery

CS 1510: Intro to Computing - Fall 2017 Assignment 8: Tracking the Greats of the NBA

CS 361 Computer Systems Fall 2017 Homework Assignment 1 Linking - From Source Code to Executable Binary

Formal Definition of Computation. Formal Definition of Computation p.1/28

Homework Assignment #3

Behaviour Diagrams UML

CS 342 Software Design Spring 2018 Term Project Part II Development of Question, Answer, and Exam Classes

Instructor Feedback Location and Printing. Locating Instructor Feedback When Available within Canvas

CS 432 Fall Mike Lam, Professor. Finite Automata Conversions and Lexing

Computation, Computers, and Programs. Administrivia. Resources. Course texts: Kozen: Introduction to Computability Hickey: Introduction to OCaml

Lexical Analysis. COMP 524, Spring 2014 Bryan Ward

CSE450. Translation of Programming Languages. Lecture 20: Automata and Regular Expressions

Programming Standards: You must conform to good programming/documentation standards. Some specifics:

The e-marks System: Instructions for Faculty of Arts and Science Users

Working with Drop Box Activities

CIS 505: Software Systems

CS 403 Compiler Construction Lecture 3 Lexical Analysis [Based on Chapter 1, 2, 3 of Aho2]

Theory of Computations Spring 2016 Practice Final Exam Solutions

Lab 19: Excel Formatting, Using Conditional Formatting and Sorting Records

Implementation of Lexical Analysis. Lecture 4

Regular Expression Constrained Sequence Alignment

CSCI 320 Group Project

2018 Pummill Relay problem statement

UNIT -2 LEXICAL ANALYSIS

1. [5 points each] True or False. If the question is currently open, write O or Open.

Lexical Analysis - 2

The Journals Tool. How to Create a Journal

University of Nevada, Las Vegas Computer Science 456/656 Fall 2016

How to Use Turnitin An Introduction for Instructors -Teachers/ Faculty

Comparing and Contrasting different Approaches of Code Generator(Enum,Map-Like,If-else,Graph)

CS/ECE 374 Fall Homework 1. Due Tuesday, September 6, 2016 at 8pm

AEV Project Portfolio

The print queue was too long. The print queue is always too long shortly before assignments are due. Print your documentation

Finite automata. We have looked at using Lex to build a scanner on the basis of regular expressions.

ECS 120 Lesson 7 Regular Expressions, Pt. 1

D2L Assignment Grader App Faculty Support Guide

Figure 2.1: Role of Lexical Analyzer

Project 4: ATM Design and Implementation

Over the Summer, we might have more new tools, features, updates, and workflow changes as we get ready for the Fall semester.

Zhizheng Zhang. Southeast University

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Instructor Quick Guide for Springboard 10.6

FACULTY OF ENGINEERING LAB SHEET INFORMATION THEORY AND ERROR CODING ETM 2126 ETN2126 TRIMESTER 2 (2011/2012)

Call Detail Reporting

Important Project Dates

SOLAR GRADE ROSTER UPLOAD - QUICK GUIDE

CSE450. Translation of Programming Languages. Automata, Simple Language Design Principles

COS 226 Algorithms and Data Structures Fall Final

Decidable Problems. We examine the problems for which there is an algorithm.

CS 2704 Project 2: Elevator Simulation Fall 1999

Introduction to elearning for Instructors

Programming Assignment 3

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama

Introduction to Lexical Analysis

Instructions for Using the e-learning Module on Service-Learning for Students

Angel Turnitin Drop Box User Guide (updated )

ADVANCED ENERGY VEHICLE. AEV Project Portfolio

Optimizing Finite Automata

Introduction to the SAM Student Guide 4. How to Use SAM 5. Logging in the First Time as a Pre-registered Student 5 Profile Information 7

Announcements! P1 part 1 due next Tuesday P1 part 2 due next Friday

Translator Design CRN Test 3 Revision 1 CMSC 4173 Spring 2012

Blackboard Essentials

Compiler Design. 2. Regular Expressions & Finite State Automata (FSA) Kanat Bolazar January 21, 2010

Implementation of Lexical Analysis

CT32 COMPUTER NETWORKS DEC 2015

Viewing and Managing a Grade book. Entering grades

Transcription:

CSE 30151 Theory of Computing Fall 2017 Project 2-Finite Automata Version 1: Sept. 27, 2017 1 Overview The goal of this project is to have each student understand at a deep level the functioning of a finite automaton (FA), how programs for such a machine should be written, and what we mean by complexity of such programs when executed on real inputs. This design should set the stage for later projects on alternative classes of automaton. 2 Project Structure Completion of this project will involve two parts. First is writing and running a program that, when given one file that represents a set of possibly many input strings and another that represents a rule set for a deterministic finite automaton (DFA), computes the states that the DFA transitions through for each input string, and provides execution statistics. Second is writing a program that takes the rule set for a NFA and converts it into a rule set for an equivalent DFA. This latter output should be compatible with the first program so that it can be run with some string as input, and produce a correct accept/reject. We are not after world-class performance here, just some relatively simple code that is functional and from which you can draw insight. You are free to go beyond the minimal requirements and produce enhanced code, and if in the instructor s view it truly represents valuable extensions, then it will be considered for extra credit. You are free to use C, C++, or Python. The latter in particular has proven in the past as perhaps the most efficient way to get decent working code (the overall goal). If you use Python, feel free to use the time, csv, sys, and copy modules. Use of any other modules requires preapproval of the instructor. 2.1 DFA This program, to be called dfa-team, where team is the name of your team, must be written by each group, and must take the following inputs (in this order): An input file defining the FA to be simulated An input file providing multiple character strings to run one at a time against the machine. 1

The input file containing the FA s program should be read in and internally processed to whatever representation the group is using. As each rule is read in, an integer Rule # should be associated with it, assigned in sequential order. The Rule #, followed by a :, and an echo of the rule should be dumped to stdout so that later traces can be followed. After reading in the program file, and before each character string is processed from the second file, the FA should be reset to its start state, and all statistics should be cleared. The name of the test file shall be echoed to stdout. Then the FA should be allowed to run against the next line of text from the input file until one of: There are no more input characters in the current string, The input character is not part of the alphabet, There is no rule covering the current state and input character. At each transition, output to stdout the current-state comma the input character = next-state. At this point, if there are more input strings (as signified by more lines in the input file), the FA is reset and execution repeated on the next string. 2.2 NFA-to-DFA Translator This program, to be called nfa2dfa-team, where team is the name of your team, should read in a machine file in the same format as for the DFA, and produce an output file that represents a DFA version of the original input. It is expected that this original input file is for an NFA, that is, there may be either multiple transitions from the same state and same input symbol, and/or transitions with ɛ as the character. 2.3 Test Files Under the Projects tab of the class website https://www3.nd.edu/ kogge/courses/cse30151- fa17/index.html there is a directory called Project2. Within this directory there are several test files associated with dfa-team: Four different machine definition: M1.txt, M2.txt, M3.txt, Mystery.txt For the first three machines, sets of strings that should be accepted or rejected by the associated machines For the Mystery machine, a set of strings whose acceptance is unknown 2.4 Student-supplied Machines In addition to instructor-supplied machines and test program, each individual student shall create their own machine and test strings for each of the dfa and nfa cases. These should be documented in the readme and included with the code. If there are 3 students in a group, then 3 different machines and test code shall be included. The name field of each machine should thus include the netid of the student doing the design. 2

3 Documentation Two pieces of documentation, both in PDF format, are required. One (entitled readmeteam.pdf is submitted once by the entire team; the other teamwork-netid.pdf is submitted separately by each member of the team. 3.1 readme-team A key part of what your teams submit is a readme-team.pdf that includes the following in this order: 1. The members of the team. 2. Approximately how much time was spent in total on the project, and how much by each student. 3. A description of how you managed the code development and testing. Use of github in particular is a particularly strong suggestion to simplifying this process, as is some sort of organized code review. 4. The language you used, and a list of libraries you invoked. 5. A description of the key data structures you used, especially for the internal representation of states and the state machines and the transitions. 6. For each of the NFA (N) and its equivalent DFA (D), plot the number of states of D v. the number of states of N. 7. If you did any extra programs, or attempted any extra test cases, describe them separately. Only one member of the team need submit this report to their dropbox. 3.2 teamwork-netid In addition, each team member should prepare a brief discussion of their own personal view of the team dynamics, and stored in a PDF called teamwork-netid.pdf, where netid is your netid. The contents should include: 1. Who were the other team members. 2. Under whose netid is the readme-team.pdf, code, and other material saved. 3. How much time did you personally spend on the project, and what did you do? 4. Which machines did you design and how did you generate test strings? 5. What did you personally learn from the project, both about the topic, above programming and code development techniques, and about algorithms. 6. In your own words, how did the team dynamics work? What could be improved? (e.g. did you use github and if so did it help, did you meet frequently enough, etc.) 3

7. From your own perspective, what was the role of each team member, and did any member exceed expectations, or vice versa. Each student s submission here should be in their own words and SHOULD NOT be a copy of any other team members submission, nor should they be shared with the other team members. These reports will be kept private by the graders and instructor, and will be used to ensure healthy team dynamics. The instructor retains the right to adjust the score of an individual team member from the base score (both up and down) on the basis of these reports. Also, a composite of all such reports from all projects will be used to create an overall lessons learned at the end of the project in what techniques seemed to work better, and where problems arose. These hopefully will be of use for the next project. 4 File Formats 4.1 Machine Format The file provided with the rules for the FA shall consist of a file where each line provides distinct information: Line 1: The name of the FA program. This should be echoed to stdout before the rules are echoed. Line 2: The alphabet to be used: only single ASCII letters are allowed, comma separated, as in a,b,c,... Any ASCII character other than is allowed. That character is reserved to stand for ɛ in transition rules for NFAs. Line 3: The names of the states, separated by commas, as in q0,q1,... constraint on the length or character set of a state name. Line 4: The name of the state that should be considered the start state. There is no Line 5: A comma separated list of state names that should be marked as accepting states. Line 6 and beyond: one transition rule per line, in a comma-separated format: Initial State Name, Input Symbol, New State Name The Input Symbol is any symbol from the defined Σ from line 2, or (when describing an NFA) optionally a that stands for an ɛ. Processing of rules stops with the end of file. The comma-separated format should be compatible with a.csv form, allowing a machine description to be prepared easily by a spreadsheet such as excel if desired. In fact a valid extension for such files could be.csv. Note that it may be that such files are produced on a Windows machine that adds a carriage return and a linefeed to the end of each line. 4

4.2 Test File Format Each test file is a text file, with one line for each string to be input to the machine. Note that it may be that such files are produced on a Windows machine that adds a carriage return and a linefeed to the end of each line. 4.3 Output Format After processing the rules file, the output from each string run shall be sent to stdout and consist of: The input string being processed, prefixed by String: For each transition, the following on a separate line: Step number, Rule Number, Initial State Name, Input Symbol, New State Name After all transitions have been performed, print either Accepted (if the last state was an accepting state) or Rejected (otherwise) The comma-separated format should be compatible with a.csv form, allowing a redirect to a.csv file so that the output can be read by a spreadsheet program such as excel. 5 Submission Each team member should create a directory in their own class dropbox called Project2- team, where team is your team name (all members in the should use the same team name). When the team is ready to submit, one (and only one) team member will place copies of all code at output in the designated common directory. Your code should be runnable on any of the studentnn.cse.nd.edu machines so that if there is an issue the graders can run the code themselves. If you wrote in a compiled language like C++, include all needed source files (excepting standard libraries), a make file, and a compiled executable. The source code is there to allow the graders to look at the code for comments and to resolve any discrepancies that may arise in looking at your results. If you wrote in a language like Python make sure your code is compatible with one of the versions supported on the studentnn.cse.nd.edu machines, again to allow the graders to check something if there is an issue. Also in the same dropbox as the code the team should place an output file for each of the test files that you ran. The format should be as described in Section 4.3, and the name should be the same name as the test file but with a.csv rather than a.cnf extension. For this project there should be 5 of them. In addition, every team member should include in their own Project1 directory: A copy of their readme-team file, again in pdf. For the team members who did not upload code and readme s 5

Machine and test files for the DFA and NFA designs that the student did themselves. The output files from running the above machine files against the string files. A short readme describing what the machines are, what language they were supposed to accept, and whether or not the test set showed proper operation. 6 Grading Grading of the project will be based on a 100 points, divided up as the following Points off for late submissions (10 points per day). 5 points for following naming and submission conventions. 10 points for reasonably commented source code. 50 points: 25 points each based on the percent of cases you got correct for running dfa against the provided test cases, and 25 points for similarly running nfa2dfa through dfa against the provided test cases. 20 points for completeness and quality of the readme file. 10 points for the student-designed machines and their tests. 5 points for the teamwork report. All but the last two items will be common to all members of a team. The last two are specific to each member. Special consideration will be given for successful projects done by 1 or 2 person teams, and/or for exceptional algorithm performance (as judged by the instructor). 6