A Characterization of the Chomsky Hierarchy by String Turing Machines

Hans W. Lang
University of Applied Sciences, Flensburg, Germany

Abstract

A string Turing machine is a variant of a Turing machine designed for easy manipulation of strings. In contrast to the standard Turing machine, a string Turing machine can insert and delete squares on the tape. It is easy to see that the models of standard and string Turing machines are equivalent in computing power. However, in the case of the string Turing machine, imposing certain restrictions on the allowed actions of the machine yields exactly the recognizers for the type-i language classes of the Chomsky hierarchy.

Keywords: Turing machine, Chomsky hierarchy

1. Introduction

In standard textbooks on formal languages and automata such as [1] or [4], the hierarchy of automata corresponding to the Chomsky hierarchy comprises the nondeterministic Turing machine, the linear bounded automaton, the pushdown automaton, and the finite automaton.

In this paper, a variant of the Turing machine, called string Turing machine, is introduced. A string Turing machine is able to manipulate strings and, especially, to reduce words to the start symbol according to the productions of a grammar. Restrictions on the form of the productions correspond in a beautiful way to restrictions on the actions of the string Turing machine. Thus, the Chomsky hierarchy, which is based on restrictions on grammars, can also be based on restrictions on string Turing machines.

In the following, the Chomsky hierarchy of language classes as characterized by certain restricted forms of grammars is revisited. Then, the new characterization by certain restrictions of string Turing machines is introduced.

2. The Chomsky hierarchy

The Chomsky hierarchy identifies language classes $L_0$, $L_1$, $L_2$, $L_3$ where

  $L_0 \supset L_1 \supset L_2 \supset L_3$.

All the inclusions are proper. The language classes are denoted as type-i languages for $i = 0, 1, 2, 3$.
$L_1$ is also known as the class of context-sensitive languages, $L_2$ as the class of context-free languages, and $L_3$ as the class of regular languages, because their languages are generated by context-sensitive, context-free, and regular grammars, respectively.

Definition: A grammar is a tuple $G = (V, T, P, S)$ with

  $V$  the alphabet of variables or nonterminal symbols,
  $T$  the alphabet of terminal symbols, where $V \cap T = \emptyset$; moreover, let $A = V \cup T$,
  $P$  a finite relation with $P \subseteq A^+ \times A^*$; the elements of $P$ are called productions or replacement rules,
  $S$  a special variable with $S \in V$, the start symbol.

The replacement rules are written in the form $u \to v$, indicating that the subword $u$ occurring in some word $w$ may be replaced by the subword $v$.

Given a grammar, the words of a language are generated by applying a sequence of such replacements to the start symbol until a word consisting of terminal symbols only is reached. The sequence of replacements is called a derivation of the word.

Languages can also be recognized by grammars in the sense that all words that can be reduced to the start symbol belong to the language [3]. Such a reduction is a derivation in the opposite direction. The string Turing machine described later makes use of reductions.

3. Special grammars

Certain restrictions on the form of the productions of a grammar may or may not restrict its power to generate a language. However, the restriction $P \subseteq V \times A^*$, which requires that the left side of each production consist of exactly one variable, restricts the languages generated by such grammars to the class $L_2$, which is a proper subset of $L_0$.

It turns out that the following restrictions imposed on the form of the productions of a grammar correspond to the language classes of the Chomsky hierarchy. Observe that each form of the productions is a special case of the preceding one.

  Type  Productions of the form           Name
  0     $u \to w$                         Recursively enumerable
  1     $u \to v$ with $|u| \le |v|$      Context-sensitive
  2     $X \to v$                         Context-free
  3     $X \to aY$ or $X \to a$           Regular

where $u, v \in A^+$, $w \in A^*$, $X, Y \in V$, $a \in T$. If necessary, as an exception the production $S \to \varepsilon$ is allowed in order to produce the empty word $\varepsilon$.

4. String Turing machine

A string Turing machine is a device as shown in Figure 1. It has access to the symbols of a string, one at a time. A cursor points to the current position. The machine can read the symbol at the cursor position (the cursor symbol) and it can overwrite that symbol by some other symbol. It can also move the cursor to the left and to the right. Moreover, a string Turing machine can insert a symbol at the cursor position and it can delete the cursor symbol.

Fig. 1: String Turing machine (a control unit whose cursor points into the string $w_0 w_1 \ldots w_5$, which is enclosed by the delimiter symbols)

Initially, the string consists of an input word enclosed by two special delimiter symbols, a left delimiter and the right delimiter &. The string has finite length; however, by inserting symbols it can be made arbitrarily long.

The insert action is performed as depicted in Figure 2(a). The prefix of the string including the cursor symbol is moved one position to the left. A blank symbol is inserted and becomes the new cursor symbol.

The delete action is shown in Figure 2(b). The cursor symbol is deleted. The prefix of the string to the left of the cursor is moved one position to the right. Its last symbol becomes the new cursor symbol.

Fig. 2: (a) Insert action I and (b) delete action D. In (b), deleting the cursor symbol c turns the string a b c d & into a b d &.

Formally, a string Turing machine is defined as follows.

Definition: A nondeterministic string Turing machine is a tuple $M = (Z, E, A, d, q, p)$ with

  $Z$  a finite, non-empty set of states,
  $E$  the input alphabet,
  $A$  the string alphabet, where $E \subset A$,
  $d$  the transition relation with $d \subseteq Z \times A \times A' \times Z$, where $A' = A \cup \{L, R, I, D\}$,
  $q$  the start state, with $q \in Z$,
  $p$  the start position, which is one of the two delimiter symbols.

The string alphabet $A$ contains the two delimiter symbols and the blank symbol; these symbols do not belong to the input alphabet. The elements of the set $\{L, R, I, D\}$ do not belong to $A$; these elements are called cursor actions.

At the beginning, the string Turing machine is in its start state, and the cursor points to one of the two delimiter symbols that enclose the input word.

An element $(s, a, a', s')$ of the transition relation is interpreted in the following way. If the string Turing machine is in state $s$ and reads symbol $a$ at the cursor position, it replaces symbol $a$ by symbol $a'$ and enters state $s'$. However, if $a'$ is not a symbol but one of the cursor actions, the string Turing machine does not overwrite symbol $a$ but performs the corresponding cursor action: move left (L), move right (R), insert (I) or delete (D).

The string Turing machine accepts an input word $w$ if there is a sequence of transitions that deletes its string completely.

5. Equivalence with standard Turing machine

A standard Turing machine can simulate a string Turing machine. The standard Turing machine uses as tape alphabet the alphabet $A$ of the string Turing machine plus a marked symbol $a'$ for each symbol $a$ in $A$. It then simulates the insert action as follows. It overwrites the symbol $a$ under its read/write head by the symbol $a'$. Then it moves all symbols to the right of symbol $a'$ by one position to the right. It returns to symbol $a'$, overwrites it by $a$, moves one position to the right and prints a blank symbol.

In a similar way, the standard Turing machine simulates the delete action. It overwrites the symbol $a$ under its read/write head by the symbol $a'$. Then it moves left to the first non-blank symbol on its tape and moves all these symbols by one position to the right until it reaches $a'$.

All other actions are identical to those of the string Turing machine. The standard Turing machine enters the accepting state when it has deleted all symbols on the tape.

A string Turing machine can simulate a standard Turing machine. It performs the same actions as the standard Turing machine, except when it reaches one of the delimiter symbols (which do not belong to the tape alphabet of the simulated standard Turing machine).
Then it performs insert actions if it needs space. When the standard Turing machine enters an accepting state and stops, the string Turing machine deletes its string so that it accepts, too. Otherwise, the string Turing machine does not accept, because at least the two delimiter symbols remain.

6. Recognition of languages by string Turing machines

Given a grammar $G$ and a nonempty word $w$ as input string, the string Turing machine tries to reduce the word $w$ to the start symbol $S$ of the grammar. It does so by nondeterministically choosing an occurrence of the right side of some production in $w$ and replacing it by the corresponding left side. It repeats this procedure until only the start symbol $S$ remains. Finally, it deletes $S$ and the two delimiter symbols and thereby recognizes the word $w$. Otherwise, if no such reduction sequence to the start symbol $S$ is possible, it does not recognize the word $w$.

Example: Let $G$ be a grammar with the productions

  $S \to aSBc \mid abc$
  $cB \to Bc$
  $bB \to bb$

and let the input word be $w = aabbcc$. The string Turing machine searches nondeterministically for occurrences of right sides of the productions in the input word $w$. It finds $bb$ and replaces it with $bB$, yielding $aabBcc$. Then it replaces $Bc$ with $cB$, yielding $aabcBc$. Then it replaces $abc$ with $S$, yielding $aSBc$. Finally, it replaces $aSBc$ with $S$ and deletes $S$ and the two delimiter symbols. Thus, it has recognized the word $w$.

When the right side of a production is longer than the left side, the string Turing machine needs to perform delete actions. When the right side is shorter than the left side, it needs to perform insert actions. However, this last case occurs only in type-0 grammars. Thus, any type-1 language is recognizable by a string Turing machine without insert actions.
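The reduction in the example above can be traced with a short program. The following is only a sketch under stated assumptions: it replaces the nondeterministic choice of the machine by a breadth-first search over all possible reduction steps, it works on plain Python strings rather than on a cursor-based string, and the names `PRODUCTIONS`, `reduce_once` and `find_reduction` are illustrative and do not appear in the paper.

```python
from collections import deque

# Productions of the example grammar as (left side, right side) pairs.
PRODUCTIONS = [
    ("S", "aSBc"),
    ("S", "abc"),
    ("cB", "Bc"),
    ("bB", "bb"),
]


def reduce_once(word, productions):
    """Yield every word obtained by replacing one occurrence of a
    right side by the corresponding left side (one reduction step)."""
    for left, right in productions:
        start = word.find(right)
        while start != -1:
            yield word[:start] + left + word[start + len(right):]
            start = word.find(right, start + 1)


def find_reduction(word, productions, start_symbol="S"):
    """Search breadth-first for a reduction sequence from word to the
    start symbol. Returns the list of intermediate words, or None.

    The search terminates for monotonic grammars, because a reduction
    step never lengthens the word and only finitely many words of
    bounded length exist.
    """
    parent = {word: None}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        if current == start_symbol:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for successor in reduce_once(current, productions):
            if successor not in parent:
                parent[successor] = current
                queue.append(successor)
    return None


print(find_reduction("aabbcc", PRODUCTIONS))
# ['aabbcc', 'aabBcc', 'aabcBc', 'aSBc', 'S']
print(find_reduction("aabbc", PRODUCTIONS))
# None
```

Every step in the reported sequence shortens the word or keeps its length, which is why a machine performing these steps needs delete actions but no insert actions.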

This observation is part of the following hierarchy theorem for string Turing machines. Recall that the cursor actions L, R, I, D denote left move, right move, insert and delete, respectively.

Theorem:
  Any type-0 language is recognizable by a string Turing machine with cursor actions $\{I, L, R, D\}$.
  Any type-1 language is recognizable by a string Turing machine with cursor actions $\{L, R, D\}$.
  Any type-2 language is recognizable by a string Turing machine with cursor actions $\{R, D\}$.
  Any type-3 language is recognizable by a string Turing machine with cursor actions $\{D\}$.

Proof sketch: The string Turing machine recognizes a word $w$ of a type-0 language by applying reduction steps in a nondeterministic way until it reaches the start symbol $S$ of the grammar, which it subsequently deletes together with the two delimiter symbols.

In the same way, a string Turing machine recognizes a word $w$ of a type-1 language. However, every type-1 language has a monotonic grammar, i.e. a grammar where no right side of a production is shorter than the left side. Thus, each reduction step can be performed without insert actions.

Again in the same way, a string Turing machine recognizes a word $w$ of a type-2 language. Every type-2 (context-free) language has a grammar in reverse Greibach normal form, with each production of the form

  $X \to Y_{k-1} \ldots Y_0 \, a$

where $k \in \mathbb{N}_0$, $X, Y_0, \ldots, Y_{k-1} \in V$ and $a \in T$. Observe that when $k = 0$ the production has the form $X \to a$.

Recognition of a context-free language is based on such a grammar in reverse Greibach normal form. When processing an input word, the string Turing machine applies a production after each terminal symbol read. It does so by matching and deleting the symbols of the right side of the production from right to left, except for the last symbol it reaches (the leftmost one), which it overwrites by the left side of the production. Cursor actions $\{R, D\}$ suffice for this. In this way, the string Turing machine simulates a pushdown automaton.
The prefix of the string including the cursor position corresponds to the stack of the pushdown automaton.

Every type-3 language is generated by a right-linear grammar, i.e. a grammar with productions of the form $X \to aY$ or $X \to a$. A string Turing machine with cursor actions $\{D\}$ starts at the delimiter symbol & and reads the input word from right to left. When reading the first symbol $a$, it applies some production $X \to a$ and enters state $X$. Then it reads the next symbol $b$, applies some production $Y \to bX$ and enters state $Y$, and so on. If it reads the left delimiter symbol while in the state of the start symbol $S$, it deletes the delimiter and accepts; otherwise it overwrites the delimiter instead of deleting it, and rejects.

It may seem strange that in the case of type-3 languages the string Turing machine processes the input word from right to left. However, if acceptance by empty string is required, there is no other choice. Another possibility would have been to define acceptance by final state and to restrict the type-3 cursor actions to $\{R\}$.

The following example illustrates the way a word of a context-free language is recognized by a string Turing machine.

Example: Consider the grammar

  $S \to Xb \mid XSb$
  $X \to a$

which is in reverse Greibach normal form. Let $w = aabb$ be the input word. The string Turing machine first reduces each $a$ to $X$ by the production $X \to a$, yielding the string $XXbb$. When it reads the first $b$ it chooses production $S \to Xb$. It deletes $b$, with the result that $X$ appears at the cursor position. It overwrites $X$ by the left side of the production, $S$, yielding the string $XSb$. It then moves the cursor to the right, reads $b$, and applies the production $S \to XSb$. Namely, it deletes $b$, deletes $S$, and overwrites $X$ by the left side $S$. Now it has reduced the input word $w$ to the start symbol $S$. In order to make sure that the current word is just $S$, it moves the cursor to the right, deletes the delimiter symbol &, deletes $S$, and deletes the other delimiter symbol. Since the string is now empty, the string Turing machine accepts the input word.

7. Language classes recognized by string Turing machines

We have seen that every type-i language is recognizable by a corresponding type of nondeterministic string Turing machine, which we call type-i string Turing machine. We show now that, vice versa, any language recognized by a type-i string Turing machine is a type-i language.

Theorem: Any language recognized by a type-i string Turing machine is a type-i language ($i = 0, 1, 2, 3$).

Proof: We show that a type-i string Turing machine for $i = 0, 1, 2, 3$ can be simulated by a nondeterministic standard Turing machine, linear bounded automaton, pushdown automaton, and finite automaton, respectively. Thus, if a language is recognized, for instance, by a type-2 string Turing machine, it is recognized by a pushdown automaton and therefore is context-free or type-2.

For string Turing machines of type $i = 0, 1$ the construction of a corresponding standard Turing machine and linear bounded automaton, respectively, is obvious. We show in detail the construction of a nondeterministic pushdown automaton from a type-2 string Turing machine and of a nondeterministic finite automaton from a type-3 string Turing machine.

The input alphabet of the pushdown automaton is the same as that of the string Turing machine, the stack alphabet corresponds to the string alphabet, the sets of states are identical, and so is the start state. It is assumed that the stack initially contains the left delimiter symbol.

The transition relation of a pushdown automaton consists of 5-tuples of the form $(s, a, h, h', s')$ where $s$ is the current state, $a$ is the symbol read, $h$ is the topmost stack symbol that is popped from the stack, $h'$ is the symbol pushed to the stack, and $s'$ is the next state. Any of $a$, $h$, and $h'$ may be the empty word $\varepsilon$.
We construct the transition relation $d'$ of the pushdown automaton from the transition relation $d$ of the string Turing machine in the following way.

1) For all elements $(s, a, R, s')$ and $(s', b, b, s'')$ of $d$ where $b \in E$, let $(s, b, \varepsilon, b, s'')$ be an element of $d'$. That is, whenever the string Turing machine moves its cursor to the right, the pushdown automaton reads the next input symbol $b$ and pushes it to the stack. Moreover, the tuple $(s, \varepsilon, \varepsilon, \&, s'')$ is added; the pushdown automaton may choose this transition when it has read the input completely.

2) For each element $(s, a, a', s')$ of $d$ with $a' \in A$, let $(s, \varepsilon, a, a', s')$ be an element of $d'$. That is, when the string Turing machine overwrites symbol $a$ by symbol $a'$, the pushdown automaton pops symbol $a$ from the stack and pushes symbol $a'$ to the stack.

3) For each element $(s, a, D, s')$ of $d$, let $(s, \varepsilon, a, \varepsilon, s')$ be an element of $d'$. That is, when the string Turing machine deletes symbol $a$, the pushdown automaton pops symbol $a$ from the stack.

If the string Turing machine processes a word $w$, then the pushdown automaton processes $w$ in a corresponding way. The pushdown automaton accepts with empty stack if and only if the string Turing machine has reduced its string to the empty string and accepts. Thus, the pushdown automaton recognizes the same language. Any language recognized by a pushdown automaton is context-free. Therefore, any language recognized by a type-2 string Turing machine is context-free.

A type-3 string Turing machine essentially acts like a nondeterministic finite automaton that processes the input word from right to left. Thus, if $L$ is the language accepted by the string Turing machine, then the nondeterministic finite automaton accepts the words of $L^R$, the mirror image of $L$, enclosed between & and the left delimiter symbol. Since this delimited language is regular, so is $L^R$, and so is $L$, since the mirror image of a regular language is regular.
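The right-to-left, delete-only behaviour of a type-3 string Turing machine can be illustrated by a small simulation. This is a sketch under several assumptions that are not taken from the paper: the grammar uses productions of the form $X \to aY$ or $X \to a$ from Section 3, the left delimiter is written `$` purely as a stand-in, the function name `accepts_type3` and the sample grammar `ENDS_IN_B` are hypothetical, and the nondeterministic choices are followed simultaneously by tracking the set of variables the machine could be in.

```python
def accepts_type3(word, productions, start_symbol="S", left="$", right="&"):
    """Simulate a delete-only string Turing machine for a grammar with
    productions X -> aY and X -> a.

    productions is a list of triples (X, a, Y); Y is None for X -> a.
    The string is kept as a list whose last element is the cursor
    symbol, so list.pop() models the delete action D: the cursor
    symbol is removed and the symbol to its left becomes the cursor.
    """
    string = [left, *word, right]
    string.pop()  # D on the right delimiter; cursor is now the last input symbol

    # None stands for "nothing has been read yet", so that only
    # productions X -> a apply to the first symbol read.
    states = {None}
    while string[-1] != left:
        symbol = string.pop()  # read the cursor symbol, then delete it
        states = {x for (x, a, y) in productions
                  if a == symbol and y in states}

    if start_symbol in states:
        string.pop()  # D on the left delimiter: the string becomes empty
    return not string  # acceptance by empty string


# Hypothetical example grammar: S -> aS | bS | b (words over {a, b} ending in b).
ENDS_IN_B = [("S", "a", "S"), ("S", "b", "S"), ("S", "b", None)]

print(accepts_type3("aab", ENDS_IN_B))  # True
print(accepts_type3("aba", ENDS_IN_B))  # False
```

Because the cursor only ever sees the last symbol of the remaining string, the simulation needs no left or right moves, which mirrors the restriction of the cursor actions to $\{D\}$.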

8. Conclusions

We have introduced the concept of the string Turing machine as a natural device for recognizing languages. A string Turing machine can recognize a word by applying reduction steps according to the productions of some grammar. Depending on the type of the grammar, a corresponding type of string Turing machine suffices to recognize the language.

We have defined type-i string Turing machines for $i = 0, 1, 2, 3$, where each type $i$ is a special case of type $i - 1$, using only a proper subset of the cursor actions $\{I, L, R, D\}$. The main result is that the (nondeterministic) type-i string Turing machines correspond exactly to the type-i languages of the Chomsky hierarchy.

References

[1] J.E. Hopcroft, R. Motwani, J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, 3rd edition, Addison-Wesley, 2006.
[2] H.R. Lewis, C.H. Papadimitriou, Elements of the Theory of Computation, Prentice Hall, 1981.
[3] A. Salomaa, Formal Languages, Academic Press, 1973.
[4] M. Sipser, Introduction to the Theory of Computation, PWS Publishing Company, 1996.