Proof Techniques Alphabets, Strings, and Languages. Foundations of Computer Science Theory

Proof By Case Enumeration
Sometimes the most straightforward way to prove that a property holds for all elements of a set is to divide the set into two or more subsets (i.e., partition the set) and then prove the property separately for each subset.
Example: Suppose that the postage required to mail a letter is always at least 6¢. Prove that it is possible to apply any required postage to a letter given only 2¢ and 7¢ stamps.
You can prove this general claim by dividing it into two cases (i.e., two equivalence classes), based on the value of n (the required postage in cents):
1. If n is even, apply n/2 2¢ stamps.
2. If n is odd, then n ≥ 7, so n − 7 ≥ 0 and n − 7 is even. The 7¢ can be applied with one 7¢ stamp, so apply one 7¢ stamp and (n − 7)/2 2¢ stamps.
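The two cases translate directly into a small function. This is only a sketch of the case split; the name `stamps` and the (twos, sevens) return convention are assumptions made for illustration.

```python
def stamps(n):
    """Return (twos, sevens) such that 2*twos + 7*sevens == n, for n >= 6."""
    if n < 6:
        raise ValueError("the example assumes postage of at least 6")
    if n % 2 == 0:
        # Case 1: n is even, so n/2 two-cent stamps suffice.
        return (n // 2, 0)
    # Case 2: n is odd, so n >= 7 and n - 7 is even and non-negative.
    return ((n - 7) // 2, 1)
```

Checking a finite range of values this way does not replace the proof; it only confirms that the two cases cover the values tried.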

Proof By Counter-Example
Consider the set of numbers of the form 2^n − 1, for some positive integer n.
Prove or disprove the following claim: If n is prime, then 2^n − 1 is prime.
Hundreds of years ago this was believed to be true. In 1536, Hudalricus Regius refuted it by giving a counter-example: 2^11 − 1 = 2047 is not prime (2047 = 23 × 89).
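A brute-force search finds the same counter-example. The helper names `is_prime` and `first_counterexample` are hypothetical, and trial division is used only because the numbers involved are small.

```python
def is_prime(k):
    """Trial division; adequate for the small values used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def first_counterexample(limit):
    """Smallest prime n <= limit for which 2**n - 1 is not prime, or None."""
    for n in range(2, limit + 1):
        if is_prime(n) and not is_prime(2 ** n - 1):
            return n
    return None
```

For n = 2, 3, 5, and 7 the values 3, 7, 31, and 127 are all prime, so the search stops at n = 11.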

Proof By Counter-Example
Consider any sets A, B, and C.
Prove or disprove the following claim: If A − C = A − B then B = C.
You can show that this claim is false with a counter-example: Let A = ∅, B = {1}, and C = {2}. Then A − C = A − B = ∅, but B ≠ C.
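The counter-example can be checked mechanically with Python's built-in sets. The function name `claim_holds` is an assumption; it evaluates the implication "if A − C = A − B then B = C" for one particular choice of sets.

```python
def claim_holds(A, B, C):
    """True unless (A, B, C) is a counter-example to the claim."""
    # An implication is false only when its premise is true and its conclusion is false.
    return (A - C != A - B) or (B == C)
```

Calling `claim_holds(set(), {1}, {2})` evaluates to `False`, which is exactly what it means for these three sets to be a counter-example.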

Proof By Contradiction
Prove that a statement P is true using proof by contradiction:
Assume that the negation of the statement, ¬P, is true.
Then show that this assumption leads to an incorrect conclusion (a contradiction).
The statement P must therefore be true, because the initial assumption of ¬P turned out to be wrong.

Proof By Contradiction
Example: Prove that √2 is not rational.
Assume that √2 is rational.
If √2 is rational, it can be written as the ratio of two integers; i.e., √2 = n/m, where n and m have no common factors (if they do, then divide both by the common factor).
Square both sides and rearrange to yield 2m^2 = n^2.
Since n^2 is even (it is 2 times an integer), n must also be even, so we can write n = 2k. Then 2m^2 = 4k^2, or m^2 = 2k^2.
(proof continued on the next slide)

Proof By Contradiction
Since m^2 = 2k^2, m must also be even, so m and n have a common factor (namely 2).
But recall that we assumed m and n do not have any common factors. Contradiction!
Therefore, since our assumption that √2 is rational led to a contradiction, √2 cannot be rational.

Fallacies
A fallacy of affirming the conclusion is a type of incorrect reasoning.
p → q implies ¬q → ¬p (the contrapositive).
p → q does not imply ¬p → ¬q.
Example: Suppose it's true that if you do every homework assignment for this class, then you will know computer science theory. Suppose you did not do every homework assignment for this class. Does this mean that you do not know computer science theory?

Proof By Induction
The principle of mathematical induction: If P(b) is true for some integer b, and if for all integers n ≥ b, P(n) → P(n+1), then for all integers n ≥ b, P(n) is true.
An induction proof has three parts:
1. A clear statement of the proposition P.
2. An example showing that P holds for some base case b (the smallest value with which we are concerned).
3. A proof that, for all integers n ≥ b, if P(n) is true, then P(n+1) is also true.
We call the claim P(n) the inductive hypothesis.

Proof By Induction
Example: Prove that n^3 − n is divisible by 3 whenever n is a positive integer.
Let P(n) be the proposition: n^3 − n is divisible by 3.
Basis: P(1) is true because 1^3 − 1 = 0 is divisible by 3.
Inductive hypothesis: Assume that P(n) is true (i.e., assume that n^3 − n is divisible by 3 for an arbitrary positive integer n).
(proof continued on the next slide)

Proof By Induction
Inductive step: Show that (n + 1)^3 − (n + 1) is divisible by 3.
Note that (n + 1)^3 − (n + 1) = (n^3 + 3n^2 + 3n + 1) − (n + 1) = (n^3 − n) + 3(n^2 + n).
Using the inductive hypothesis, we conclude that the first term, n^3 − n, is divisible by 3.
The second term is divisible by 3 because it is 3 times an integer.
So (n + 1)^3 − (n + 1) is also divisible by 3.
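The algebraic identity used in the inductive step can be sanity-checked numerically. This is a sketch only, and the function names are hypothetical. A finite check like this is not a proof, but it catches slips in the expansion.

```python
def identity_holds(n):
    """Check (n+1)^3 - (n+1) == (n^3 - n) + 3(n^2 + n) for one value of n."""
    lhs = (n + 1) ** 3 - (n + 1)
    rhs = (n ** 3 - n) + 3 * (n ** 2 + n)
    return lhs == rhs

def divisible_by_3(n):
    """Check the proposition P(n): n^3 - n is divisible by 3."""
    return (n ** 3 - n) % 3 == 0
```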

The Pigeonhole Principle
Consider any function f: A → B.
The pigeonhole principle says: If |A| > |B| then f is not one-to-one.
If you drop n + 1 pigeons into n holes, then at least one hole will have more than one pigeon.

The Pigeonhole Principle
Question: How many students must be in a class to guarantee that at least two students receive the same score on the final exam, assuming the exam is graded on a scale from 0 to 100?

The Pigeonhole Principle
Question: What is the minimum number of students required to be enrolled in FCT to be sure that at least six will receive the same grade? Assume there are five possible grades: A, B, C, D, and F.

The Pigeonhole Principle
Question: Let (x_i, y_i, z_i), i = 1, 2, …, 9, be a set of nine distinct points with integer coordinates in xyz space. Prove that the midpoint of at least one pair of these points has integer coordinates.
Recall from analytic geometry that the midpoint of a segment whose endpoints are (a, b, c) and (d, e, f) is ((a+d)/2, (b+e)/2, (c+f)/2).
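One way to see the pigeonhole structure in this question is to classify each point by the parity of its coordinates: there are only 2 × 2 × 2 = 8 parity patterns, and two points with the same pattern have a midpoint with integer coordinates. The sketch below finds such a pair; the function name `integer_midpoint_pair` is an assumption made for illustration.

```python
def integer_midpoint_pair(points):
    """Return two points whose coordinates have matching parities, or None."""
    seen = {}  # parity pattern -> first point seen with that pattern
    for p in points:
        key = tuple(c % 2 for c in p)
        if key in seen:
            # Same parity in every coordinate, so every coordinate sum is even.
            return seen[key], p
        seen[key] = p
    return None
```

With nine points and only eight possible keys, the dictionary must see a repeated key, so the function cannot return `None` on nine or more points.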

Cardinality
We will be concerned with three cases: finite sets, countably infinite sets, and uncountably infinite sets.
A set A is finite and has a cardinality of 0, or some n ∈ N, iff either: A = ∅, or there is a bijection from {1, 2, …, n} to A, for some n.
A set is infinite iff it is not finite.

Countably Infinite Sets
To prove that a set A is countably infinite, it suffices to find a bijection from N to the set A.
For example, the set E of even natural numbers is countably infinite. To prove this, we offer the bijection Even : N → E, Even(x) = 2x. Taking N to include 0, the first few values are:
N → E
0 → 0
1 → 2
2 → 4
3 → 6

Uncountably Infinite Sets
The power set P(S) of the integers S is not countable (it is uncountably infinite).
Diagonalization proof: suppose the subsets of S could be listed, one per row, with a 1 in a column when that element belongs to the subset:
            Elem 1  Elem 2  Elem 3  Elem 4  Elem 5  …
Subset 1      1       0       0       0       0     …
Subset 2      0       1       0       0       0     …
Subset 3      1       1       0       0       0     …
Subset 4      0       0       1       0       0     …
Subset 5      1       0       1       0       0     …
…
Below is a set that cannot be any of Subset 1 through Subset 5 because it is the complement of the diagonal. The same argument applies to every other subset in the list.
              0       0       1       1       1     …
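The diagonal construction can be illustrated on the finite corner of the table shown on the slide. This is a sketch only: the real argument concerns an infinite list, and the function name `diagonal_complement` is an assumption.

```python
def diagonal_complement(rows):
    """Flip the i-th entry of the i-th row, for each row of a square 0/1 table."""
    return [1 - rows[i][i] for i in range(len(rows))]

# The five rows from the slide, restricted to the first five elements of S.
table = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
]
new_row = diagonal_complement(table)
```

By construction, `new_row` differs from row i in position i, so it cannot equal any row of the table.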

Alphabets
An alphabet Σ is any finite set of symbols.
Examples: ASCII, Unicode; {0, 1} (the binary alphabet); {a, b, c}; {s, o}; the set of signals used by a protocol.

Strings
A string over an alphabet Σ is a list, each element of which is a member of Σ.
Strings are usually shown without commas or quotes, e.g., abc or 01101.
Σ* = the set of all strings over alphabet Σ.
The length of a string is its number of characters.
ε stands for the empty string (the string of length 0).

Functions on Strings
Cardinality (length): |s| is the number of characters in string s. |ε| = 0 and |1001101| = 7.
#_c(s) is the number of times that character c occurs in s. #_a(abbaaa) = 4.
Concatenation: st is the concatenation of strings s and t. If x = good and y = bye, then xy = goodbye. Note that |xy| = |x| + |y|.
ε is the identity for concatenation of strings. So: ∀x (xε = εx = x).
Concatenation is associative. So: ∀s, t, w ((st)w = s(tw)).

Functions on Strings
Replication: For each string w and each natural number i, the string w^i is defined by: w^0 = ε, w^(i+1) = w^i w.
Examples: a^3 = aaa, (bye)^2 = byebye, a^0 b^3 = bbb.
Reverse: If w and x are strings, then (wx)^R = x^R w^R.
Example: Let w = name and x = tag. Then (nametag)^R = (tag)^R (name)^R = gateman.
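Replication and reversal map directly onto Python strings. The sketch below follows the recursive definition of replication given above; the function names are hypothetical, and `reverse` uses slicing rather than a recursive definition.

```python
def replicate(w, i):
    """w^0 is the empty string; w^(i+1) is w^i followed by w."""
    if i == 0:
        return ""
    return replicate(w, i - 1) + w

def reverse(w):
    """Return w with its characters in the opposite order."""
    return w[::-1]
```

With these definitions, `reverse("name" + "tag")` and `reverse("tag") + reverse("name")` produce the same string, matching the identity for the reverse of a concatenation.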

Relations on Strings
aaa is a substring of aaabbbaaa.
aaaaaa is not a substring of aaabbbaaa.
aaa is a proper substring of aaabbbaaa.
Every string is a substring of itself (but not a proper substring of itself).
ε is a substring of every string.

The Prefix Relation
s is a prefix of t iff: ∃x ∈ Σ* (t = sx).
s is a proper prefix of t iff: s is a prefix of t and s ≠ t.
Examples:
The prefixes of abba are: ε, a, ab, abb, abba.
The proper prefixes of abba are: ε, a, ab, abb.
Every string is a prefix of itself. ε is a prefix of every string.

The Suffix Relation
s is a suffix of t iff: ∃x ∈ Σ* (t = xs).
s is a proper suffix of t iff: s is a suffix of t and s ≠ t.
Examples:
The suffixes of abba are: ε, a, ba, bba, abba.
The proper suffixes of abba are: ε, a, ba, bba.
Every string is a suffix of itself. ε is a suffix of every string.
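The prefix and suffix relations can be explored by listing every prefix and suffix of a string. A minimal sketch with hypothetical function names, in which the empty Python string `""` plays the role of ε:

```python
def prefixes(t):
    """All s such that t = s + x for some string x, shortest first."""
    return [t[:i] for i in range(len(t) + 1)]

def suffixes(t):
    """All s such that t = x + s for some string x, shortest first."""
    return [t[len(t) - i:] for i in range(len(t) + 1)]

def proper(parts, t):
    """Drop t itself from a list of its prefixes or suffixes."""
    return [s for s in parts if s != t]
```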

Defining a Language
A language is a (finite or infinite) set of strings over a finite alphabet Σ.
Examples: Let Σ = {a, b}. Some languages over Σ: ∅, {ε}, {a, b}, {ε, a, aa, aaa, aaaa, aaaaa}.
The language Σ* contains an infinite number of strings, including: ε, a, b, ab, ababaa.

Example of a Language Definition
L = {x ∈ {a, b}* : all a's precede all b's}
ε, a, aa, aabbb, and bb are in L.
aba, ba, and abc are not in L.

Example of a Language Definition
L = {x : ∃y ∈ {a, b}* (x = ya)}
Simple English description: L consists of all strings that can be formed by taking some string in {a, b}* and concatenating a single a onto the end of it (i.e., L consists of all strings over the alphabet {a, b} that end in a).
The strings a, aa, aaa, bbaaa, and ba are in L.
The strings ε, bab, and bca are not in L.
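A membership test for this language takes only a couple of lines. The sketch below uses the hypothetical name `ends_in_a` and checks both conditions in the definition: every character comes from {a, b}, and the string ends in a.

```python
def ends_in_a(x):
    """True iff x is a string over {a, b} of the form y + "a"."""
    over_alphabet = all(c in "ab" for c in x)
    return over_alphabet and x.endswith("a")
```

The alphabet check is what excludes bca: it ends in a, but c is not a symbol of {a, b}.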

The Empty Language
L = { } = ∅, i.e., L is the empty language (the language that contains no strings).
The empty language is different from the language containing only the empty string: L = {ε}, i.e., L is the language that contains a single string, ε.

A Halting Language
L = {w: w is a C program that halts on all inputs}.
Can we decide what strings it contains?

The Perils of English
L = {w: w is a grammatically correct sentence in English}.
Examples:
Kerry hit the ball. /* In L
Colorless green ideas sleep furiously. /* In L
The panda bear eats shoots and leaves. /* In L
Ball the Stacy hit blue. /* Not in L

Languages are Sets
To provide a computational definition of a language, we could specify either:
1. A language generator, which enumerates (lists) all of the strings in the language, or
2. A language recognizer, which decides whether or not a candidate string is in the language and returns true if it is, or false if it is not.

Language Enumeration
In some cases, when considering an enumerator for a language L, we may care about the order in which the elements of L are generated.
If there exists a total order D of the elements of the alphabet, then we can use D to define on L a useful total order called lexicographic order (written <_L).

Lexicographic Order
We will say that a program lexicographically enumerates the elements of L if and only if it enumerates them in lexicographic order:
1. Shorter strings precede longer ones, and
2. Strings of the same length are sorted in dictionary order using D.
Example: The lexicographic enumeration of L = {x ∈ {a, b}* : all a's precede all b's} is ε, a, b, aa, ab, bb, aaa, aab, abb, bbb, aaaa, aaab, aabb, abbb, bbbb, aaaaa, …
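A lexicographic enumerator can be sketched by generating all strings over the alphabet in order of length, in dictionary order within each length, and keeping those a recognizer accepts. The names below are hypothetical, and Python's default character ordering is assumed to stand in for the total order D.

```python
from itertools import count, islice, product

def lex_enumerate(alphabet, in_language):
    """Yield the strings of a language in lexicographic order.

    Runs forever, even after a finite language has been exhausted.
    """
    symbols = sorted(alphabet)  # assumption: sorted() order plays the role of D
    for length in count(0):
        # product() yields tuples in dictionary order with respect to symbols.
        for chars in product(symbols, repeat=length):
            w = "".join(chars)
            if in_language(w):
                yield w

def a_before_b(w):
    """Recognizer for strings over {a, b} in which all a's precede all b's."""
    return "ba" not in w

def first(n, alphabet, in_language):
    """The first n strings produced by the enumerator."""
    return list(islice(lex_enumerate(alphabet, in_language), n))
```

This pairs the two computational views of a language: the recognizer decides membership, and the generator is built on top of it by filtering an enumeration of Σ*.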

How Many Strings are in a Language?
What is the cardinality of a language (the number of strings in the language)?
The smallest language over any Σ is ∅, with cardinality 0.
The cardinality of a language L is written |L|. The cardinality of Σ* is written |Σ*|.
What is |Σ*|? Suppose that Σ = ∅; then Σ* = {ε} and |Σ*| = 1.
What about when Σ is not empty?

How Many Strings are in a Language?
Theorem: If Σ ≠ ∅ then Σ* (the set of all strings over the alphabet Σ) is countably infinite.
Proof: The elements of Σ* can be lexicographically enumerated by the following procedure:
1. Enumerate all strings of length 0, then length 1, then length 2, and so forth.
2. Within the strings of a given length, enumerate them in dictionary order.
This enumeration is infinite since there is no longest string in Σ*. Since there exists an infinite enumeration of Σ*, it is countably infinite.

How Many Languages Are There?
Theorem: If Σ ≠ ∅ then the set of languages over Σ is uncountably infinite.
Proof: The set of languages defined on Σ is the power set of Σ*, denoted P(Σ*). Recall that we have already proven that Σ* is countably infinite. Recall also that if S is a countably infinite set, P(S) is uncountably infinite. So P(Σ*) is uncountably infinite.

Functions on Languages
Set operations: union, intersection, difference, complement, reversal.
Language operations: concatenation, Kleene star.

Set Functions Applied to Languages
Let Σ = {a, b}. Also, let L_1 = {strings with an even number of a's and any number of b's} and L_2 = {strings with no b's}.
L_1 ∪ L_2 = {strings with an even number of a's and any number of b's, plus all strings with no b's}
L_1 ∩ L_2 = {ε, aa, aaaa, aaaaaa, aaaaaaaa, …}
L_2 − L_1 = {a, aaa, aaaaa, aaaaaaa, …}
¬(L_2 − L_1) = {strings with at least one b} ∪ {strings with an even number of a's}
(L_1 L_2)^R = L_2^R L_1^R

Concatenation of Languages
If L_1 and L_2 are languages over Σ: L_1 L_2 = {w ∈ Σ* : ∃s ∈ L_1 (∃t ∈ L_2 (w = st))}
Examples: L_1 = {cat, dog}, L_2 = {apple, pear}, L_1 L_2 = {catapple, catpear, dogapple, dogpear}

Concatenation of Languages
{ε} is the identity for concatenation: L{ε} = {ε}L = L
∅ is a zero for concatenation: L∅ = ∅L = ∅
Concatenation is associative, so for all languages L_1, L_2, and L_3: ((L_1 L_2)L_3) = (L_1(L_2 L_3))
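For finite languages, concatenation is a one-line set comprehension. A minimal sketch, with the hypothetical name `concat`, representing a language as a Python set of strings and ε as the empty string `""`:

```python
def concat(L1, L2):
    """All strings s + t with s drawn from L1 and t drawn from L2."""
    return {s + t for s in L1 for t in L2}
```

The identity and zero laws fall out of the comprehension: concatenating with `{""}` leaves every string unchanged, and concatenating with the empty set offers no t to pair with, so the result is empty.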

Concatenation of Languages
The scope of any variable used in an expression that invokes replication will be assumed to be the entire expression. For example, let:
L_1 = {a^n : n ≥ 0}, which is the same as {w : ∃n ≥ 0 (w = a^n)}
L_2 = {b^n : n ≥ 0}, which is the same as {w : ∃n ≥ 0 (w = b^n)}
Then: L_1 L_2 = {a^n b^m : n, m ≥ 0}, but notice L_1 L_2 ≠ {a^n b^n : n ≥ 0}

The Kleene Star, *
Strings in L* are formed by concatenating together any number of strings from L.
Example 1:
L = {dog, cat, fish} // L consists of 3 strings
L* = {ε, dog, cat, fish, dogdog, dogcat, dogfish, fishcatfish, fishdogdogfishcat, …} // L* is countably infinite

The Kleene Star, *
L* = {ε} ∪ {w ∈ Σ* : ∃k ≥ 1 (∃w_1, w_2, …, w_k ∈ L (w = w_1 w_2 … w_k))}
L* always contains an infinite number of strings as long as L is not equal to either ∅ or {ε} (i.e., as long as there is at least one nonempty string and any number of them can be concatenated together).
If L = ∅, then L* = {ε}, since there are no strings that could be concatenated to ε to make it longer.
If L = {ε}, then L* is also {ε}.
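Because L* is usually infinite, a program can only compute a finite slice of it. The sketch below returns the strings of L* up to a length bound; the function name `kleene_star` and the `max_len` parameter are assumptions made for illustration, and ε is the empty string `""`.

```python
def kleene_star(L, max_len):
    """Strings of L* with length at most max_len, for a finite language L."""
    result = {""}    # the empty string is always in L*
    frontier = {""}  # strings added in the previous round
    while frontier:
        new = set()
        for w in frontier:
            for s in L:
                candidate = w + s
                if len(candidate) <= max_len and candidate not in result:
                    new.add(candidate)
        result |= new
        frontier = new
    return result
```

The two special cases are visible here: with an empty L the inner loop never runs, and with L containing only the empty string every candidate is already in the result, so both calls return just the empty string.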

The Kleene Star, *
Example 2: Let L = {w ∈ {a, b}* : #_a(w) is odd and #_b(w) is even}. Then L* ⊆ {w ∈ {a, b}* : #_b(w) is even}, with no constraint on the parity of the number of a's.
The constraint on the number of a's disappears in the description of L* because strings in L* are formed by concatenating together any number of strings from L. If an odd number of strings are concatenated together, the result will contain an odd number of a's. If an even number are concatenated together, the result will contain an even number of a's. Thus a string in L* may have either an odd or an even number of a's, while its number of b's is always even.
The containment is proper: for example, bb has an even number of b's but is not in L*, because every string in L contains at least one a and bb contains none.

The + Operator
It is sometimes useful to require that at least one element of L be selected.
L+ = LL*
L+ is called the closure of L under concatenation.
Example: {0, 1}+ is the set of all nonempty binary strings (note: L+ includes ε only if L itself contains ε).

Assigning Meaning to Strings
Let L = A^n B^n = {a^n b^n : n ≥ 0}.
Do the strings in L mean anything?
What exactly does a language define?