Introduction to Automata Theory. BİL405 - Automata Theory and Formal Languages 1

Automata, Computability and Complexity

Automata, computability and complexity are linked by one question: what are the fundamental capabilities and limitations of computers?
The theories of computability and complexity are closely related. Complexity theory classifies problems as easy or hard; computability theory classifies them as solvable or unsolvable. Computability theory introduces several of the concepts used in complexity theory.
Automata theory deals with the definitions and properties of mathematical models of computation:
The finite automaton is used in text processing, compilers, and hardware design.
The context-free grammar is used in programming languages and artificial intelligence.
Turing machines represent computable functions.

Why Study Automata?

Automata theory is the core of computer science, and it presents many useful models for software and hardware.
In compilers, finite automata are used for lexical analyzers and pushdown automata for parsers.
In search engines, finite automata are used to determine tokens in web pages.
Finite automata model protocols and electronic circuits.
Context-free grammars are used to describe the syntax of essentially every programming language.
Automata theory offers many useful models for natural language processing.
When developing solutions to real problems, we often confront the limitations of what software can do.
Undecidable problems: no program whatsoever can solve them.
Intractable problems: programs exist, but no fast ones.

A Simple Finite Automaton: On/Off Switch

[Figure: two states, off and on. The start arrow enters off; an arc labeled push leads from off to on, and another arc labeled push leads from on back to off.]
States are represented by circles.
Accepting (final) states are represented by double circles.
One of the states is the start state.
Arcs represent external influences (inputs).
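The switch can be sketched as a transition table in Python. This is only an illustration: the names `TRANSITIONS`, `START` and `run` are made up here, and the sketch assumes that the start state is off and that each push toggles the state.

```python
# Transition table for the two-state switch: (state, input) -> next state.
# Assumption: "off" is the start state and every "push" toggles the switch.
TRANSITIONS = {
    ("off", "push"): "on",
    ("on", "push"): "off",
}
START = "off"


def run(inputs, state=START):
    """Feed each input symbol to the automaton and return the final state."""
    for symbol in inputs:
        state = TRANSITIONS[(state, symbol)]
    return state
```

Calling `run(["push", "push"])` walks off, on, off and returns the state reached after the last input.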

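A Simple Finite Automaton Recognizing a Word

[Figure: a chain of six states. The start state is unlabeled; the others are labeled i, il, ily, ilya and ilyas, the prefix read so far. The arcs between consecutive states are labeled i, l, y, a and s.]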
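A word recognizer of this kind can be sketched in Python by letting each state be the prefix read so far. Two assumptions go beyond the slide: the state ilyas is treated as the only accepting state, and any input that does not extend the current prefix leads to rejection. The names `WORD` and `accepts` are illustrative.

```python
WORD = "ilyas"


def accepts(text, word=WORD):
    """Return True only if the automaton ends in the state labeled `word`."""
    state = ""  # the unlabeled start state: nothing read yet
    for ch in text:
        nxt = state + ch
        if not word.startswith(nxt):
            # Assumption: no arc for this input, so the string is rejected.
            return False
        state = nxt
    return state == word
```

With this sketch, `accepts("ilyas")` is true, while a proper prefix such as `"ilya"` or an over-long input such as `"ilyass"` is rejected.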

Central Concepts of Automata Theory - Alphabets

An alphabet is a finite, nonempty set of symbols. We use the symbol $\Sigma$ for an alphabet.
$\Sigma = \{0,1\}$: the binary alphabet
$\Sigma = \{a,b,c,\ldots,z\}$: the lowercase letters
The set of ASCII characters is an alphabet.

Central Concepts of Automata Theory - Strings

A string is a sequence of symbols chosen from some alphabet. A string is sometimes called a word.
01101 is a string over the alphabet $\Sigma = \{0,1\}$. Some other strings: 11, 010, 1, 0.
The empty string, denoted $\varepsilon$, is the string with zero occurrences of symbols.
Length of a string: the number of symbols in the string. $|ab| = 2$, $|b| = 1$, $|\varepsilon| = 0$.

Central Concepts of Automata Theory - Strings

Powers of an alphabet: if $\Sigma$ is an alphabet, we can express the set of all strings of a certain length over $\Sigma$ using exponential notation. $\Sigma^k$ is the set of strings of length $k$ over $\Sigma$.
Let $\Sigma = \{0,1\}$. Then $\Sigma^0 = \{\varepsilon\}$, $\Sigma^1 = \{0,1\}$, $\Sigma^2 = \{00,01,10,11\}$.
The set of all strings over an alphabet $\Sigma$ is denoted by $\Sigma^*$: $\Sigma^* = \Sigma^0 \cup \Sigma^1 \cup \Sigma^2 \cup \cdots$
The set of nonempty strings is $\Sigma^+ = \Sigma^1 \cup \Sigma^2 \cup \cdots$
Concatenation of strings: if $x$ and $y$ are strings, $xy$ represents their concatenation. If $x = abc$ and $y = de$, then $xy = abcde$.
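The powers of an alphabet can be enumerated with a short Python sketch. The helper names `power` and `strings_up_to` are illustrative; the empty string stands in for the empty string symbol, and the alphabet is passed as a string of single-character symbols.

```python
from itertools import product


def power(sigma, k):
    """All strings of length k over the alphabet sigma (the k-th power)."""
    return ["".join(symbols) for symbols in product(sigma, repeat=k)]


def strings_up_to(sigma, max_len):
    """A finite slice of the set of all strings: lengths 0 through max_len."""
    result = []
    for k in range(max_len + 1):
        result.extend(power(sigma, k))
    return result
```

For example, `power("01", 2)` lists the four binary strings of length two, and concatenation of strings is ordinary `+` in Python, so `"abc" + "de"` gives `"abcde"`.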

Central Concepts of Automata Theory - Languages

A set of strings chosen from $\Sigma^*$ is called a language. If $\Sigma$ is an alphabet and $L \subseteq \Sigma^*$, then $L$ is a language over $\Sigma$.
A language over $\Sigma$ need not include strings that use every symbol of $\Sigma$.
Some languages:
The language of all strings consisting of $n$ 0's followed by $n$ 1's, for some $n \ge 0$: $\{\varepsilon, 01, 0011, 000111, \ldots\}$
$\Sigma^*$ is a language.
The empty set is a language. The empty language is denoted by $\emptyset$.
The set $\{\varepsilon\}$ is a language; $\{\varepsilon\}$ is not equal to the empty language.
The set of all identifiers in a programming language is a language.
The set of all syntactically correct C programs is a language.
Turkish and English are languages.

Set-Formers to Define Languages

A set-former is a common way to define a language.
Set-former: $\{w \mid \text{something about } w\}$
$\{w \mid w \text{ consists of an equal number of 0's and 1's}\}$
$\{w \mid w \text{ is a binary integer that is prime}\}$
Sometimes we replace $w$ with an expression:
$\{0^n 1^n \mid n \ge 1\}$
$\{0^i 1^j \mid 0 \le i \le j\}$
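One way to read a set-former is as a membership test: the condition on the right becomes a predicate on strings. The Python sketch below does this for three of the languages above; the function names are illustrative, and strings are assumed to be over the binary alphabet.

```python
def equal_zeros_ones(w):
    """Strings of 0's and 1's with an equal number of each."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


def zeros_then_ones(w):
    """n 0's followed by n 1's, with n at least 1."""
    n = len(w) // 2
    return n >= 1 and w == "0" * n + "1" * n


def fewer_zeros_then_ones(w):
    """i 0's followed by j 1's, with 0 <= i <= j."""
    i = len(w) - len(w.lstrip("0"))  # length of the leading block of 0's
    rest = w[i:]
    return set(rest) <= {"1"} and i <= len(rest)
```

Note that the empty string satisfies the first and third predicates but not the second, which requires at least one 0 and one 1.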

Language - Problem

In automata theory, a problem is the question of deciding whether a given string is a member of a particular language.
If $\Sigma$ is an alphabet and $L$ is a language over $\Sigma$, then the problem $L$ is: given a string $w$ in $\Sigma^*$, decide whether or not $w$ is in $L$.
Making this decision requires computational resources. Examples:
Deciding whether a given string is a correct C identifier.
Deciding whether a given string is a syntactically correct C program.
Some decision problems are simple; others are harder. A decision problem may require resources exponential in the size of its input, or it may be unsolvable.
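The identifier problem is one of the simple ones. A minimal Python sketch follows, under a simplified rule that is an assumption here: an identifier is a letter or underscore followed by letters, digits or underscores. Reserved keywords and extended characters are deliberately ignored, and the name `is_c_identifier` is illustrative.

```python
import re

# Simplified rule (assumption): letter or underscore first,
# then any number of letters, digits or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_c_identifier(w):
    """Decide membership of w in the (simplified) language of C identifiers."""
    return _IDENTIFIER.fullmatch(w) is not None
```

The check reads the string once from left to right, which is the kind of decision a finite automaton can make.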

Formal Proofs

When we study automata theory, we encounter theorems that we have to prove. There are different forms of proof:
Deductive proofs
Inductive proofs
Proof by contradiction
Proof by counterexample (disproof)
Creating a proof may not be easy.

Deductive Proofs

A deductive proof consists of a sequence of statements whose truth leads us from some initial statement (the hypothesis or given statements) to a conclusion statement.
Each step of a deductive proof MUST follow from a given fact or from previous statements (or their combination) by an accepted logical principle.
The theorem that is proved when we go from a hypothesis H to a conclusion C is the statement "if H then C". We say that C is deduced from H.

Deductive Proofs

Assume that the following theorem (initial statement) is given:
Given theorem (initial statement): If $x \ge 4$, then $2^x \ge x^2$.
Theorem to be proved: If $x$ is the sum of the squares of four positive integers, then $2^x \ge x^2$.

Proof of Theorem

1. $x = a^2 + b^2 + c^2 + d^2$ (given)
2. $a \ge 1$, $b \ge 1$, $c \ge 1$, $d \ge 1$ (given)
3. $a^2 \ge 1$, $b^2 \ge 1$, $c^2 \ge 1$, $d^2 \ge 1$ (from (2) and the principles of arithmetic)
4. $x \ge 4$ (from (1), (3), and the principles of arithmetic)
5. $2^x \ge x^2$ (from (4) and the given theorem)
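A numeric spot check is not a proof, but it can make the chain of steps concrete. The Python sketch below, with illustrative names, tests the given inequality on a range of values and confirms that every sum of four positive squares in a small range is at least 4 and satisfies the conclusion.

```python
from itertools import product


def given_theorem_holds(x):
    """The inequality of the given theorem: 2^x >= x^2."""
    return 2 ** x >= x ** 2


def check_sums_of_four_squares(limit):
    """Check steps 4 and 5 for all a, b, c, d in 1..limit."""
    for a, b, c, d in product(range(1, limit + 1), repeat=4):
        x = a * a + b * b + c * c + d * d
        if x < 4 or not given_theorem_holds(x):
            return False
    return True
```

The hypothesis that x is at least 4 matters: `given_theorem_holds(3)` is false, since 8 is less than 9.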

If-And-Only-If Statements

Sometimes theorems contain if-and-only-if statements. In this case we have to prove both directions.
Theorem: Let $x$ be a real number. Then $\lfloor x \rfloor = \lceil x \rceil$ if and only if $x$ is an integer.

Proof of Theorem

Let $x$ be a real number. Then $\lfloor x \rfloor = \lceil x \rceil$ if and only if $x$ is an integer.
If part: Suppose $x$ is an integer. By the definitions of floor and ceiling, $\lfloor x \rfloor = x$ and $\lceil x \rceil = x$. Thus $\lfloor x \rfloor = \lceil x \rceil$.
Only-if part: Suppose $\lfloor x \rfloor = \lceil x \rceil$. By the definitions of floor and ceiling, $\lfloor x \rfloor \le x$ and $\lceil x \rceil \ge x$. Since $\lfloor x \rfloor = \lceil x \rceil$, we have $\lfloor x \rfloor \le x$ and $\lfloor x \rfloor \ge x$. By the properties of arithmetic inequalities, $\lfloor x \rfloor = x$. Since $\lfloor x \rfloor$ is always an integer, $x$ must be an integer too.
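Both directions of the statement can be spot-checked on exact rational numbers. The Python sketch below uses `fractions.Fraction` to avoid floating-point rounding; it covers only rationals, so it illustrates the theorem rather than proving it, and the function names are illustrative.

```python
import math
from fractions import Fraction


def floor_equals_ceiling(x):
    """Left-hand side of the theorem: floor(x) == ceiling(x)."""
    return math.floor(x) == math.ceil(x)


def is_integer(x):
    """Right-hand side: a Fraction in lowest terms is an integer iff its denominator is 1."""
    return x.denominator == 1


def agree_on(samples):
    """True if both sides of the if-and-only-if agree on every sample."""
    return all(floor_equals_ceiling(x) == is_integer(x) for x in samples)
```

Checking that the two predicates agree on every sample tests the if part and the only-if part at once.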

Inductive Proofs

An inductive proof has three parts:
Basis case
Inductive hypothesis
Inductive step
There may be one base case or more than one.

Inductive Proofs

Theorem: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ for all $n \ge 1$.
Proof #1 (by induction on $n$):
Basis: $n = 1$. Then $\sum_{i=1}^{1} i = \frac{1(1+1)}{2}$, that is, $1 = 1$.

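Inductive hypothesis: Suppose that $\sum_{i=1}^{k} i = \frac{k(k+1)}{2}$ for some $k \ge 1$.
Inductive step: We will show that $\sum_{i=1}^{k+1} i = \frac{(k+1)(k+2)}{2}$.

$$
\begin{aligned}
\sum_{i=1}^{k+1} i &= \sum_{i=1}^{k} i + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) && \text{by the inductive hypothesis} \\
&= \frac{k(k+1) + 2(k+1)}{2} \\
&= \frac{(k+1)(k+2)}{2}
\end{aligned}
$$

It follows that $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ for all $n \ge 1$.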
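The closed form and the algebra of the inductive step can be spot-checked numerically. The Python sketch below uses illustrative names and is a sanity check on a finite range, not a substitute for the induction.

```python
def closed_form(n):
    """The right-hand side of the theorem, n(n+1)/2."""
    return n * (n + 1) // 2


def direct_sum(n):
    """The left-hand side: 1 + 2 + ... + n."""
    return sum(range(1, n + 1))


def inductive_step_holds(k):
    """Adding (k+1) to the closed form for k gives the closed form for k+1."""
    return closed_form(k) + (k + 1) == closed_form(k + 1)
```

The function `inductive_step_holds` mirrors the derivation: it starts from the value assumed for k and adds the next term.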

Theorem: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ for all $n \ge 1$.
Proof #2 (by induction on $n$):
Basis: $n = 1$. Then $\sum_{i=1}^{1} i = \frac{1(1+1)}{2}$, that is, $1 = 1$.

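Inductive hypothesis: Suppose there exists a $k \ge 1$ such that $\sum_{i=1}^{m} i = \frac{m(m+1)}{2}$ for all $m$, where $1 \le m \le k$.
Inductive step: We will show that $\sum_{i=1}^{k+1} i = \frac{(k+1)(k+2)}{2}$.

$$
\begin{aligned}
\sum_{i=1}^{k+1} i &= \sum_{i=1}^{k} i + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) && \text{by the inductive hypothesis} \\
&= \frac{k(k+1) + 2(k+1)}{2} \\
&= \frac{(k+1)(k+2)}{2}
\end{aligned}
$$

It follows that $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ for all $n \ge 1$.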

What is the difference between proof #1 and proof #2?

The inductive hypothesis for proof #2 assumes more than that for proof #1.
Proof #1 is sometimes called weak induction.
Proof #2 is sometimes called strong induction.
Often weak induction is sufficient, but at other times strong induction is more convenient.

Strict Binary Trees

Definition #1: Let $T = (V, E)$ be a tree with $|V| \ge 1$. Then $T$ is a strict binary tree if and only if each vertex in $T$ is either a leaf or has exactly two children.
Definition #2 (recursive):
1. A single vertex is a strict binary tree.
2. Let $T_1 = (V_1, E_1)$ and $T_2 = (V_2, E_2)$ be strict binary trees with roots $r_1$ and $r_2$, respectively, where $V_1$ and $V_2$ do not intersect. In addition, let $r$ be a new vertex that is not in $V_1$ or $V_2$. Finally, let
$V_3 = V_1 \cup V_2 \cup \{r\}$
$E_3 = E_1 \cup E_2 \cup \{(r, r_1), (r, r_2)\}$
Then $T_3 = (V_3, E_3)$ is a strict binary tree.
3. Nothing else is a strict binary tree.
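The recursive definition translates almost directly into code. In the Python sketch below, a tree is a dictionary holding its root, vertex set and edge set; this representation, the counter used to produce fresh vertex names, and the function names are all assumptions made for illustration.

```python
from itertools import count

_fresh = count()  # hypothetical source of new, unused vertex names


def single_vertex():
    """Rule 1: a single vertex is a strict binary tree."""
    r = next(_fresh)
    return {"root": r, "V": {r}, "E": set()}


def join(t1, t2):
    """Rule 2: combine two disjoint strict binary trees under a new root r."""
    assert not (t1["V"] & t2["V"]), "vertex sets must not intersect"
    r = next(_fresh)  # a new vertex that is in neither vertex set
    return {
        "root": r,
        "V": t1["V"] | t2["V"] | {r},
        "E": t1["E"] | t2["E"] | {(r, t1["root"]), (r, t2["root"])},
    }
```

Because trees can only be produced by these two functions, rule 3 (nothing else is a strict binary tree) is enforced by construction.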

Theorem: Let $L(T)$ denote the number of leaves in a strict binary tree $T = (V, E)$. Then $2 \cdot L(T) - 2 = |E|$.
Proof (by induction on $L(T)$):
Basis: $L(T) = 1$.
Observation: The only strict binary tree with $L(T) = 1$ has a single vertex and 0 edges, i.e., $|E| = 0$.
Let's evaluate $2 \cdot L(T) - 2$ and see if we can show it is equal to $|E|$.

$$
\begin{aligned}
2 \cdot L(T) - 2 &= 2 \cdot 1 - 2 && \text{since } L(T) = 1 \\
&= 0 \\
&= |E| && \text{since the observation tells us that } |E| = 0
\end{aligned}
$$

Inductive hypothesis: Suppose there exists a $k \ge 1$ such that $2 \cdot L(T) - 2 = |E|$ for every strict binary tree $T = (V, E)$ with $1 \le L(T) \le k$.
Inductive step: Let $T = (V, E)$ be a strict binary tree where $L(T) = k + 1$. We must show that $2 \cdot L(T) - 2 = |E|$.
Notes about $T$:
1. Since $k \ge 1$ and $L(T) = k + 1$, it follows that $L(T) > 1$. Therefore $T$ consists of a root $r$ and two strict binary trees $T_1 = (V_1, E_1)$ and $T_2 = (V_2, E_2)$, where $1 \le L(T_1) \le k$ and $1 \le L(T_2) \le k$.
2. $L(T) = L(T_1) + L(T_2)$, by definition of $T$.
3. $|E| = |E_1| + |E_2| + 2$, also by definition of $T$.
Let's start with $2 \cdot L(T) - 2$ and, using 1-3, see if we can show that it is equal to $|E|$.

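$$
\begin{aligned}
2 \cdot L(T) - 2 &= 2 \cdot (L(T_1) + L(T_2)) - 2 && \text{by (2)} \\
&= 2 \cdot L(T_1) + 2 \cdot L(T_2) - 2 \\
&= (2 \cdot L(T_1) - 2) + (2 \cdot L(T_2) - 2) + 2
\end{aligned}
$$

Since $1 \le L(T_1) \le k$ and $1 \le L(T_2) \le k$ from (1), it follows from the inductive hypothesis that $2 \cdot L(T_1) - 2 = |E_1|$ and $2 \cdot L(T_2) - 2 = |E_2|$.
Substituting these in the preceding equation gives:

$$
\begin{aligned}
2 \cdot L(T) - 2 &= |E_1| + |E_2| + 2 \\
&= |E| && \text{by (3)}
\end{aligned}
$$

Hence, for all strict binary trees, $2 \cdot L(T) - 2 = |E|$.
Finally, note the use of strong induction in the above proof.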
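The relation between leaves and edges, taken here to be 2·L(T) - 2 = |E|, can be checked exhaustively on small trees. The Python sketch below represents a tree shape as either the empty tuple (a single vertex) or a pair of subtrees; this encoding and the function names are assumptions chosen for brevity.

```python
LEAF = ()  # a single vertex; any other tree is a pair (left, right)


def leaves(t):
    """L(T): a leaf counts 1, otherwise add the leaves of both subtrees."""
    return 1 if t == LEAF else leaves(t[0]) + leaves(t[1])


def edges(t):
    """|E|: both subtrees' edges plus the 2 edges from the new root."""
    return 0 if t == LEAF else edges(t[0]) + edges(t[1]) + 2


def all_trees(n):
    """Yield every strict binary tree shape with exactly n leaves."""
    if n == 1:
        yield LEAF
        return
    for left_leaves in range(1, n):
        for left in all_trees(left_leaves):
            for right in all_trees(n - left_leaves):
                yield (left, right)


def theorem_holds(t):
    return 2 * leaves(t) - 2 == edges(t)
```

The two recursive functions follow notes (2) and (3) of the proof, and `all_trees` splits the leaves between two smaller subtrees in the same way the strong induction does.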

Proving Equivalences About Sets

In order to prove $S = T$, we have to prove that
1. if $x$ is a member of $S$, then $x$ is also a member of $T$ ($S \subseteq T$), and
2. if $x$ is a member of $T$, then $x$ is also a member of $S$ ($T \subseteq S$).
Theorem: $R \cup (S \cap T) = (R \cup S) \cap (R \cup T)$
We have to show that
1. if $x$ is in $R \cup (S \cap T)$, then $x$ is in $(R \cup S) \cap (R \cup T)$, and
2. if $x$ is in $(R \cup S) \cap (R \cup T)$, then $x$ is in $R \cup (S \cap T)$.

Proof of $R \cup (S \cap T) = (R \cup S) \cap (R \cup T)$
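As a sanity check on the statement, the identity can be tested on every triple of subsets of a small universe. The Python sketch below assumes the theorem is the distributive law of union over intersection, R ∪ (S ∩ T) = (R ∪ S) ∩ (R ∪ T); the function names are illustrative. An exhaustive check on a finite universe does not replace the two-direction membership argument.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as Python sets."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def law_holds(R, S, T):
    """Assumed identity: R | (S & T) == (R | S) & (R | T)."""
    return R | (S & T) == (R | S) & (R | T)


def check(universe):
    """Test the identity on every triple of subsets of the universe."""
    subs = subsets(universe)
    return all(law_holds(R, S, T) for R in subs for S in subs for T in subs)
```

For a three-element universe this tests 8 × 8 × 8 = 512 triples.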
