CS2 Practical 2 CS2Ah


CS2 Practical 2: Finite automata

This practical is based on material in the language processing thread. It is made up of two parts. Part A consists of four paper-and-pencil exercises designed to test your understanding of the lecture material. Submit your answers to this part to your tutor by the deadline below, using whatever mechanism your tutor arranges for receiving submissions. For part B you must write and submit part of a Java application that can be used to convert NFAs into equivalent DFAs and to simulate the execution of DFAs. Submit your answer to this part electronically, using the handin command, as described below. Each part is worth 50% of the mark for the practical.

Note that there are no dependencies between parts A and B of the practical. You are strongly encouraged to start on both parts as soon as possible. In particular, it is not advisable to leave part B until you have completed part A.

This practical was issued on Monday 5th November, and the deadline for submitting answers to both parts is 5pm on Monday 19th November. Late submissions will only be accepted in the case of genuine mitigating circumstances, for example illness, and with the agreement of the course organiser. Your marked work for this practical will be returned to you by your tutor during the tutorial in the week starting Monday 3rd December.

Remember that you will need at least a grade C (i.e. at least 50%) in your final CS2 mark to proceed to any of the Computer Science or Software Engineering single or joint honours courses. You should also bear in mind the guidelines on plagiarism, which can be found via a link on the CS2 Web page.

Resources

The practical handout refers to various files. These can all be found on the webpage http://www.dcs.ed.ac.uk/teaching/cs2/www/practicals.html.

Part A [50]

1. (a) Design a DFA that recognises the language $L$ over the alphabet $\{a, b, c\}$ consisting of all strings that contain the substring $ababc$, i.e., the language

   $L = \{\, x\,ababc\,y \mid x, y \in \{a, b, c\}^* \,\}$.

   Draw a picture of this DFA and write out the formal specification of the DFA, including its transition table.

   (b) Design a DFA that accepts precisely those strings over $\{a, b, c\}$ that are not contained in the language $L$ defined in (a), i.e., a DFA that recognises the language

   $\overline{L} = \{\, x \in \{a, b, c\}^* \mid x \notin L \,\}$.

   Draw a picture of this DFA.

2. Which of the following languages over the alphabet $\{0, 1, 2\}$ are regular?

   $L_1 = \{\, 0^n 1 1 2^{2n} \mid n \in \mathbb{N} \,\}$,
   $L_2 = \{\, (01122)^n \mid n \in \mathbb{N} \,\}$,
   $L_3 = \{\, 0^m 1 1 2^{2n} \mid m, n \in \mathbb{N} \,\}$.

   Justify your answers, either by providing a DFA, an NFA, or a regular expression for the language to show that it is regular, or by proving that the language is not regular. Use the Pumping Lemma to prove that a language is not regular.

3. (a) Give regular expressions for the following two languages over the alphabet $\{0, 1\}$:

   $L_1 = \{\, x \in \{0, 1\}^* \mid x \text{ contains an even number of 0s} \,\}$,
   $L_2 = \{\, x \in \{0, 1\}^* \mid x \text{ contains an even number of 0s and an odd number of 1s} \,\}$.

   Hint: To find the regular expressions, you may first want to design DFAs recognising the languages and then convert them to regular expressions using the method described in Lecture Note 4.

   (b) Construct an NFA with ε-transitions that recognises the language $L(R)$ of the following regular expression $R$ over the alphabet $\{a, b\}$:

   $R = (a + b(ab^*a)^*b)^*$.

4. For a language $L \subseteq \Sigma^*$, let

   $SH(L) = \{\, y \in \Sigma^* \mid \exists x \in \Sigma^* : |x| = |y| \text{ and } xy \in L \,\}$,

   the language consisting of the second halves of all strings of even length in $L$. For example, if $L = \{\varepsilon, a, abba, aba, aaba, aa\}$ then $SH(L) = \{\varepsilon, ba, a\}$. Show that if $L$ is regular then $SH(L)$ is also regular. (Note: This problem is quite difficult. It is recommended that you complete the rest of the practical before attempting to solve it.)
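The definition of SH(L) in Problem 4 can be checked on the finite example language with a few lines of Java. This is only an illustrative sketch for finite languages given as explicit lists of strings; the class and method names are made up for this example, and it says nothing about how to prove the claim for arbitrary regular languages.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SecondHalves {
    /** Returns SH(L) for a finite language L given as a collection of strings. */
    static Set<String> secondHalves(Iterable<String> language) {
        Set<String> result = new TreeSet<>();
        for (String w : language) {
            // Only strings of even length can be written as xy with |x| = |y|.
            if (w.length() % 2 == 0) {
                result.add(w.substring(w.length() / 2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The example from Problem 4; the empty string stands for epsilon.
        List<String> l = List.of("", "a", "abba", "aba", "aaba", "aa");
        System.out.println(secondHalves(l));
    }
}
```

On the example language the method returns the three strings ε, a and ba: the odd-length strings a and aba contribute nothing, and abba and aaba both contribute ba.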

Problem 1 is worth 10% of your mark, Problems 2 and 3 are worth 15% each, and Problem 4 is worth 10%. Submit your answers to Part A following the instructions given to you by your tutor. You must submit before the deadline of 5pm on Monday 19th November.

Part B [50]

The directory for this practical contains a partially implemented Java application, Prac2, for simulating finite automata. Your task is to complete this application and apply it to a pattern matching problem.

The files and Java classes

The directory contains the following files:

- A file FA.java, which contains the code of an abstract class FA implementing most of the features of finite automata. States of finite automata are represented by String objects, and letters of the alphabet by chars. The most important public methods of the class are:
  - the constructor public FA(), which creates an empty automaton object,
  - the constructor public FA(String filename), which reads an automaton from a file (the format of this file is explained below),
  - a number of public methods for modifying an existing automaton object, for example a method public void addState(String q) that adds a state to an automaton, or a method public void setStartState(String q) that defines the start state,
  - a number of methods for obtaining information about an existing automaton object, for example a method public boolean isState(String q) that tests whether the string q represents a state of the automaton,
  - a method public String toString() that writes the whole automaton into a string in the same format as the file used in the constructor.
  For details, look at the source code.
- Files DFA.java and NFA.java, which contain the code of two subclasses DFA and NFA of FA implementing the remaining features of DFAs and NFAs, respectively. The reason for this design is that DFAs and NFAs share most of their basic functionality, but have different types of transition functions, so most features related to the transitions have to be implemented separately. Most notably, the method public String getNextState(String q, char c) of the class DFA returns an object of type String representing the next state of a DFA when reading c in state q, and the method public LLSet getNextStates(String q, char c) of the class NFA returns an object of type LLSet representing the set of possible next states of an NFA when reading c in state q.
- A file LLSet.java, which contains the implementation of a class LLSet that can be used to represent sets. The implementation is based on linked lists; technically, LLSet is a subclass of class LinkedList. The important property of sets, as opposed to simple linked lists, is that no object can occur twice in a set. We use LLSets to represent sets of next states in an NFA.
- A file FAGui.java, which contains the code of the graphical user interface.
- A file Prac2.java, which contains the main method of the application.
- Files RunDFA.java and NFA2DFA.java, in which you are supposed to implement the following three methods:
  public static boolean accepts(DFA M, String s),
  public static LinkedList extract(DFA M, String filename),
  public static DFA convert(NFA N).
- A file TokExample.java, which contains a class illustrating the use of the Java class StreamTokenizer.
- Example files nfa1, nfa1a, dfa1, dfa2, which contain a few example automata.

You should begin by copying all the files to a new directory in your own filespace (and remember to read-protect the directory). Look at the CS2 Practical 1 handout if you don't remember how to do these things. You can then compile the currently incomplete application:

  javac Prac2.java

To execute it, type

  java Prac2
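The idea of a set built on top of a linked list can be sketched as follows. This is not the LLSet class from the practical directory, whose source is not reproduced in this handout; ListSetSketch is a hypothetical stand-in that only illustrates the "no object occurs twice" property.

```java
import java.util.Collection;
import java.util.LinkedList;

/** Hypothetical stand-in for a list-backed set: a linked list that refuses duplicates. */
final class ListSetSketch<E> extends LinkedList<E> {
    private static final long serialVersionUID = 1L;

    /** Appends the element only if it is not already present. */
    @Override
    public boolean add(E element) {
        if (contains(element)) {
            return false;
        }
        return super.add(element);
    }

    /** Adds the elements one by one so that the duplicate check applies to each of them. */
    @Override
    public boolean addAll(Collection<? extends E> elements) {
        boolean changed = false;
        for (E e : elements) {
            changed |= add(e);
        }
        return changed;
    }

    // Note: other insertion methods inherited from LinkedList (addFirst, add(int, E), ...)
    // are not overridden in this sketch and would still allow duplicates.
}
```

Membership tests on such a class take time linear in the size of the set, which is acceptable for the small state sets that arise in the example automata.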

[Figure 1: The NFA stored in file nfa1, with states q0, q1, q2, q3, q4.]

The graphical user interface

Starting the application produces a GUI containing an input field, a result field, and eight buttons with the following functionalities:

- Load NFA: Asks you to enter a filename in the input field, then loads the NFA specified in this file and displays it in the result field.
- Load DFA: Same for a DFA instead of an NFA.
- Save DFA: Asks you to enter a filename in the input field and then saves the currently loaded DFA into a file with that name.
- Save Result: Asks you to enter a filename in the input field and then saves the current content of the result field into a file with that name.
- Convert: Converts the currently loaded NFA into a DFA accepting the same language and displays this DFA in the result field. This functionality is not yet implemented.
- Run DFA: Asks you to enter a string in the input field and tests whether the currently loaded DFA accepts this string. This functionality is not yet implemented.
- Extract: Asks you to enter a filename in the input field and displays in the result field all words in the file with this name that are accepted by the currently loaded DFA. This functionality is not yet implemented.
- Quit: Quits the application.

The file format

The format in which finite automata are stored for this application is largely self-explanatory. Open the file nfa1. It contains a description of the automaton in Figure 1. Since it is often tedious to write out the full alphabet, intervals of consecutive letters may be represented using a dash, as in A-Z. This is allowed both in the alphabet section and in the transitions section of the file. Open the file nfa1a to see how this allows a more compact representation of the automaton in Figure 1.
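The dash notation for intervals can be illustrated with a small helper that expands a single token into the letters it stands for. This is a minimal sketch, not the parser used by the FA class; the class name AlphabetRanges and the method expand are hypothetical, and the treatment of malformed tokens (an exception) and of reversed intervals such as Z-A (an empty result) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class AlphabetRanges {
    /** Expands "c" to [c] and "c1-c2" to all characters from c1 to c2 inclusive. */
    static List<Character> expand(String token) {
        List<Character> letters = new ArrayList<>();
        if (token.length() == 3 && token.charAt(1) == '-') {
            // Interval: no whitespace is allowed around the dash, so the token has length 3.
            for (char c = token.charAt(0); c <= token.charAt(2); c++) {
                letters.add(c);
            }
        } else if (token.length() == 1) {
            // A single letter of the alphabet.
            letters.add(token.charAt(0));
        } else {
            throw new IllegalArgumentException("not a letter or interval: " + token);
        }
        return letters;
    }
}
```

For instance, expand("A-Z") yields the 26 upper-case letters, and expand("0-9") yields the ten digits.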

In general, a file representing a DFA or an NFA consists of:

- A field enclosed by tags <states> and </states>, which contains strings representing the states of the automaton, separated by whitespace. Strings representing states may consist of all printable ASCII symbols except the blank, or more precisely, of all ASCII symbols whose code is between 33 and 126.
- A field enclosed by tags <alphabet> and </alphabet>, which contains either single characters (whose ASCII code is between 33 and 126) representing letters of the alphabet of the automaton, or expressions of the form c1-c2, where c1 and c2 are characters, representing all characters whose ASCII code is between that of c1 and that of c2. Note that no whitespace is allowed between the dash and the enclosing letters.
- A field enclosed by tags <startstate> and </startstate>, which contains a string representing the start state of the automaton.
- A field enclosed by tags <finalstates> and </finalstates>, which contains strings representing the final states of the automaton, separated by whitespace.
- A field enclosed by tags <transitions> and </transitions>, which contains the transitions of the automaton. Transitions are represented by lines

    q1 c q2

  saying that if the automaton is in state q1 and reads character c, it may proceed to state q2. As in the alphabet section, transitions may also be represented in the form

    q1 c1-c2 q2

  saying that if the automaton is in state q1 and reads any character c whose ASCII code is between that of c1 and that of c2, it may proceed to state q2.

As a final example, look at the file dfa1, which contains a DFA accepting the same language as the NFA in nfa1.

Your Tasks

1. Implement the method

     public static boolean accepts(DFA M, String s)

   of the class RunDFA (in the file RunDFA.java). Given an object M of type DFA and an object s of type String, this method is supposed to return true if the DFA represented by M accepts the string s and false otherwise.

2. Implement the method

     public static LinkedList extract(DFA M, String filename)

   of the class RunDFA (in the file RunDFA.java). Given an object M of type DFA, representing a DFA whose alphabet is {A, ..., Z, a, ..., z}, and an object filename of type String, this method is supposed to read the text file specified by filename, split it into words that only contain letters (i.e., characters between A and Z and between a and z), and return a linked list consisting of all of these words that are accepted by the DFA represented by M.

3. Implement the method

     public static DFA convert(NFA N)

   of the class NFA2DFA (in the file NFA2DFA.java). Given an object N of type NFA, this method is supposed to return an object M of type DFA such that the NFA represented by N and the DFA represented by M recognise the same language.

4. Specify an NFA N that accepts the language L consisting of all strings over the alphabet Σ = {A, B, ..., Z, a, b, ..., z} that contain the letters a, b, c in any order. (Thus, for example, tobacco and subtraction are in L, and char is not.) Write N into a file abc-nfa in the format specified above. Use the convert function to transform it into an equivalent DFA, and save this DFA in a file abc-dfa. Then use the extract function to extract all words that contain the letters a, b, c from the file /usr/dict/words. Save the result in a file abc-words.

When implementing the methods, it is advisable to split the tasks into subtasks handled by separate methods (which you can implement as private methods of the respective classes). It is important that you comment your code appropriately.

A complete solution consists of the (modified) files RunDFA.java and NFA2DFA.java and the files abc-nfa and abc-words. Ensure that all your files contain your name, email address, tutor's name, and submission date. Submit your solution using the commands

  handin 2 RunDFA.java
  handin 2 NFA2DFA.java
  handin 2 abc-nfa
  handin 2 abc-words

You may do this any time after 5pm on Monday 12th November, up until the deadline of 5pm on Monday 19th November. Bear in mind that the computer labs are likely to be busy just before the practical deadline. It is up to you to ensure that you plan your work in order to submit in time. (Note that if you modify your code once you have submitted, you can resubmit and your earlier submission will be overwritten.)
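The ideas behind Tasks 1 and 2 can be sketched independently of the classes supplied with the practical. The following is an illustration only: it does not use the DFA class from DFA.java, but represents a DFA by a hypothetical nested map from states and letters to next states, and the names RunDfaSketch, accepts and letterWords are invented for this example. Treating a missing transition as rejection is an assumption. The word splitting relies on java.io.StreamTokenizer and is intended for ASCII text.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RunDfaSketch {
    /** Simulates a DFA given by its start state, final states and transition table. */
    static boolean accepts(String start, Set<String> finals,
                           Map<String, Map<Character, String>> delta, String input) {
        String state = start;
        for (char c : input.toCharArray()) {
            Map<Character, String> row = delta.get(state);
            state = (row == null) ? null : row.get(c);
            if (state == null) {
                return false; // assumption: an undefined transition means rejection
            }
        }
        return finals.contains(state);
    }

    /** Splits ASCII text into maximal runs of the letters A-Z and a-z. */
    static List<String> letterWords(Reader text) {
        StreamTokenizer tokens = new StreamTokenizer(text);
        tokens.resetSyntax();          // make every character "ordinary"
        tokens.wordChars('A', 'Z');    // then declare only letters as word characters
        tokens.wordChars('a', 'z');
        List<String> words = new ArrayList<>();
        try {
            while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokens.ttype == StreamTokenizer.TT_WORD) {
                    words.add(tokens.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

An extraction in the spirit of Task 2 would then run accepts on every word returned by letterWords and keep the accepted ones.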

Assessment

Assessment for part B will include marks assigned on the basis of how your program behaves on a test suite, and also marks for good programming style (including commenting the code). To get reasonable marks:

- You must submit code that compiles.
- You should test your code yourself on a number of suitably chosen test cases to ensure that it works.
- You must submit following the instructions above (i.e., you must give the files the correct filenames, and you must use the handin command).

Tasks 1, 2, and 4 are worth 10% of your mark each, and task 3 is worth 20%.

Hints

- After you have implemented task 1, test the accepts functionality by running the example DFAs, for instance those in the files dfa1 and dfa2, on a couple of example strings.
- To process the input file for task 2, you may want to use a StreamTokenizer. The file TokExample.java gives an example of how this can be done. Also recall CS1 Lecture Notes 25 and 27. Consult the Java API documentation to learn more about StreamTokenizers.
- For task 3, use the method described in Lecture Note 3. Recall that in this method, the states of the DFA you obtain are sets of states of the original NFA. Sets can be represented as objects of the class LLSet (defined in the file LLSet.java). Since the states of an automaton are represented as strings, you have to turn the LLSets into strings using the method LLSet.toString().
- Familiarise yourselves with the Java class java.util.LinkedList and its subclass LLSet. Recall CS1 Lecture Note 26 on Lists and Iterators and also look at the Java API documentation.
- Note that we are exclusively dealing with NFAs without ε-transitions. This simplifies the conversion to DFAs, because in this case, for every state $q$ of an NFA and every letter $a$ of its alphabet, the set $s$ with $q \xrightarrow{a} s$ (in the notation of Lecture Note 3) is simply the $(q, a)$-entry in the transition table, that is, the set returned by the method getNextStates of the class NFA.
- To see the result of an example conversion, consider the file nfa1 describing an NFA and the file dfa1 showing the result of converting this NFA into a DFA recognising the same language.
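The conversion outlined in the hints, in which DFA states are sets of NFA states, is commonly known as the subset construction. The following is a minimal sketch for NFAs without ε-transitions; whether it matches the presentation in Lecture Note 3 in every detail is not something this handout confirms. It does not use the NFA, DFA or LLSet classes of the practical: the nested-map representation and the names SubsetConstructionSketch and convert are assumptions made for illustration, and sets are named by the string form of a sorted java.util.TreeSet.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class SubsetConstructionSketch {
    /**
     * Returns the transition table of a DFA whose states are names of sets of NFA states.
     * Only sets reachable from {start} are generated; the empty set "[]" acts as a dead state.
     */
    static Map<String, Map<Character, String>> convert(
            String start, Set<Character> alphabet,
            Map<String, Map<Character, Set<String>>> nfaDelta) {
        Map<String, Map<Character, String>> dfaDelta = new HashMap<>();
        Deque<TreeSet<String>> todo = new ArrayDeque<>();
        TreeSet<String> startSet = new TreeSet<>(Set.of(start));
        dfaDelta.put(startSet.toString(), new HashMap<>());
        todo.push(startSet);
        while (!todo.isEmpty()) {
            TreeSet<String> current = todo.pop();
            for (char a : alphabet) {
                // The successor set is the union of the (q, a)-entries over all q in current.
                TreeSet<String> next = new TreeSet<>();
                for (String q : current) {
                    next.addAll(nfaDelta.getOrDefault(q, Map.of())
                                        .getOrDefault(a, Set.of()));
                }
                dfaDelta.get(current.toString()).put(a, next.toString());
                if (!dfaDelta.containsKey(next.toString())) {
                    dfaDelta.put(next.toString(), new HashMap<>());
                    todo.push(next); // first time this set is seen: process it later
                }
            }
        }
        return dfaDelta;
    }
}
```

In this sketch the start state of the resulting DFA is the name of the singleton set containing the NFA's start state, and a set would be final exactly when it contains at least one final state of the NFA. The names produced by TreeSet.toString() contain blanks, which the file format of this practical does not allow in state names, so a real solution would need a different naming scheme.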

To test whether your conversion algorithm works properly, you can (but do not have to) use the method public boolean isDeterministic() of the class NFA.

Do not make changes to any classes except RunDFA and NFA2DFA, or your submitted code may not work properly when we compile it and test it with the original classes.

Martin Grohe