Diagonalization

Cardinalities. The cardinality of a finite set is easy to grasp: |{1,3,4}| = 3. But what about infinite sets?

We say that a set S has at least as great cardinality as a set T, written |S| ≥ |T|, if there is an onto function f from S to T. For example, |R| ≥ |N|: there are at least as many reals as there are natural numbers. An onto function from R to N maps every natural number to itself and every real that is not a natural number to, say, 0. Similarly, |Z| ≥ |N| (it is always trivial that the cardinality of a set is at least that of a subset). But it turns out that also |N| ≥ |Z|, and so we can write |Z| = |N|. An onto function that establishes this is, for example, f(2i) = i, f(2i+1) = −i.

We also know from the previous lecture that |N| = |Q| = |N²| (by Cantor's sweeping trick). How about R and R², do they also have the same cardinality? The answer is yes. An onto function from R to R² works as follows (by example): f(3718.056921…) = (31.062…, 78.591…): you separate the digits in the odd positions from those in the even positions. It is also easy to see that |R| = |[0,1]| (exercise: find an onto function). Finally, |[0,1]| = |2^N| (remember, 2^S, the power set of S, is the set of all subsets of the set S). To see this, just encode both reals and sets of integers as infinite binary strings (a small trick is needed to take care of the double binary encoding problem, as in .011111… = .100000…).

Recall now the proof that |N| < |[0,1]| from last time (here |T| < |S| means |S| ≥ |T| and not |T| ≥ |S|; that theorem said that there are more real numbers in the interval [0,1] than there are integers). In view of the equation |[0,1]| = |2^N| above, this is a special case of the following general result:

Theorem (Cantor, 1880): For every set S, |S| < |2^S|.

Proof: We want to show that there is no onto function from S to 2^S. Suppose that such a function f existed. Then consider the following subset of S, called D (for "diagonal"):

D = { a in S : a is not in f(a) }

This is an element of 2^S, and so, since f is onto, there is an element of S, call it d, such that f(d) = D.

And here is the unanswerable question, a contradiction establishing that f cannot exist: is d an element of D? It is iff it is not (think about it!).

This is the famous diagonalization argument. It can be thought of as defining a table (see below for the first few rows and columns) which displays the function f, denoting the set f(a1), for example, by a bit vector with one bit for each element of S: 1 if the element is in f(a1) and 0 otherwise.

            a1  a2  a3  a4  ...
    f(a1)    0   1   1   0
    f(a2)    1   1   1   1
    f(a3)    0   1   0   1
    f(a4)    0   1   1   0

The diagonal of this table is 0100…; if we flip it we get the set D = 1011… This is a set that is guaranteed not to be a row of the table (because it differs from the i-th row in the i-th bit), proving that f is not onto.

It follows from Cantor's Theorem that |N| < |2^N| = |R| < |2^R| < … : an infinite hierarchy of ever-increasing cardinalities.

A last question: is there a cardinality strictly between |N| and |R|? This is the famous Continuum Hypothesis, proposed by Cantor; it is now known that both this possibility and its negation are consistent with set theory.

The halting problem

The diagonalization method was invented by Cantor in 1881 to prove the theorem above. It was used again by Gödel in 1931 to prove the famous Incompleteness Theorem (stating that in every mathematical system that is general enough to contain the integers, there must be theorems that have no proofs). And again by Turing in 1937, to establish that there are perfectly reasonable computational problems that are unsolvable: there is no algorithm (program) for solving them. Let's talk about the latter (and keep the Incompleteness Theorem for last).

Suppose that somebody told you that they have written a program with the following specs:

    halts(M, I: files)

It returns true if the program in file M, when supplied with the data in file I, runs for a finite amount of time and then stops (in other words, it does not loop); it returns false if M loops on I. Would you believe them?
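Cantor's diagonal construction can be checked concretely on finite sets: for any function f from a finite set S to its power set, the diagonal set D = {a : a not in f(a)} differs from f(a) on the element a itself, so it is never in the range of f. The following Python sketch (my own illustration, not part of the original notes) verifies this exhaustively for a three-element set:

```python
from itertools import combinations, product

S = [1, 2, 3]
# All 8 subsets of S, as frozensets (the "rows" available to f).
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# Enumerate every function f: S -> 2^S (8^3 = 512 of them).
for images in product(subsets, repeat=len(S)):
    f = dict(zip(S, images))
    # The diagonal set D = { a in S : a not in f(a) }.
    D = frozenset(a for a in S if a not in f[a])
    # D differs from f(a) at a, so it is never a value of f: f is not onto.
    assert D not in f.values()

print("no f: S -> 2^S is onto; checked", len(subsets) ** len(S), "functions")
```

Of course, for finite S this only re-proves that 2^|S| > |S|; the point of the diagonal argument is that the same construction works verbatim for infinite sets.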
Theorem (Turing, 1937): There is no program that does the above. Proof: Suppose such a program exists. Then we write the following program (with the ominous name D, from the previous proof):

    D(M: file)
    1: if halts(M, M) goto 1

Notice what D does: if the program in its input file halts when supplied with itself as input, then D loops forever; otherwise D halts. But now suppose that we put program D in a file called D, and then we run D(D). Will this computation halt? It will iff it will not (check!), and this is a contradiction.

The unsolvability of the halting problem is a very important insight for CS. It tells us that there are limits to computation, to algorithms, to software engineering. You cannot tell whether a program will halt; but neither can you tell whether a program has a buffer overrun, or is redundant in the sense that lines can be deleted with impunity, etc. etc. It means that programming and debugging and analyzing code will forever remain forms of art and hackery. And what is doubly amazing is that this important CS insight was articulated at the very moment CS was born (because Turing's paper that contained this theorem was the same one that defined the Turing machine, the mathematical abstraction of a computer that started us on the way to the von Neumann computer and its software).

The Incompleteness Theorem

Logic is a mathematical field that studies mathematics itself. We have seen Boolean logic and first-order logic at the beginning of the course. The latter, augmented with operations like +, *, and ^ (exponentiation), relations like <, and constants like 0 and 1 (and of course variables, the quantifiers "for all" and "there exists", and the Boolean connectives AND, OR, etc.), can be used to express very complicated mathematical sentences about the natural numbers. This language is called First-Order Arithmetic, or FOA. For example, these are sentences in FOA:

    ∀x ∀y ∃z [z = x + y]
    ∀x ∃y [x = y + y OR x = y + y + 1]
    ∀x ∃y [x = y + y]
    ∀x [x = 1 + 1 + 1]
    ∃x [x = y]

The first two are theorems (true sentences about the natural numbers), stating that addition is a fully defined operation and that every number is either even or odd, respectively, while the next two are false.
(The last is not even a sentence, because its truth depends on the value of the variable y.) What would a proof of such a true sentence be? Presumably it would start from some axioms, stating basic facts about the Boolean connectives and properties of the quantifiers, plus some important properties of the natural numbers, like ∀x [0 + x = x]. Then it would allow us to deduce new true sentences from the axioms and already established ones, until the sought theorem is produced. We don't need to define proofs precisely; we will just specify that proofs are finite strings with the following two properties:

1. If P is a proof of statement A, then it can be checked as such. That is, there is a program proves(A, P) that always halts and decides whether P is a valid proof of A.

2. If P is a valid proof of A, then A is a true sentence, a theorem about the natural numbers. That is, our proof system is sound: it does not prove false statements.

We can now state the Incompleteness Theorem:

Incompleteness Theorem (Gödel, 1931): In any proof system with these properties, there are (true) theorems in FOA that have no proofs.

Proof: We show that it follows from the unsolvability of the halting problem (which Turing proved having himself been influenced, however, by Gödel's proof of the Incompleteness Theorem). It proceeds in these steps:

1. The halting problem is unsolvable even if it is tightened as follows: given a program with no input statements, will it halt? This is shown by reducing the original form of the halting problem to this one: given a program M and a data bitstring I, define a new program M′, without input, that does the following: it starts by setting internal variables to the data bitstring I, and then it simulates M, looking up any needed input in those variables. Then telling whether M′ halts is the same as telling whether M halts on input I.

2. Given such a program M, the statement "program M will halt" can be restated as a sentence S(M) about the natural numbers. Either S(M) is a theorem (if M indeed halts) or its negation is a theorem. Of course, to carry out this step we have to fix a programming language, and here we use the simple programming language called the Turing machine, due to Turing himself. We show how this is done in the Appendix below.

3. Now we are done: if every theorem had a proof, then we could use the proves routine to come up with a program that looks through all possible proofs and solves the halting problem, a contradiction.
In more detail, given a program M, the following algorithm decides whether M halts or not:

    Construct the statement S(M)
    For i = 0, 1, 2, … do:
        For each possible proof P of length i do:
            If proves(P, S(M)) then output "M halts" and stop
            If proves(P, NOT S(M)) then output "M loops" and stop

This contradicts the unsolvability of the halting problem, completing the proof of the Incompleteness Theorem.
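The proof-search loop above can be sketched as a program. Everything here is an illustrative stub of my own: proves stands for the hypothetical always-halting proof checker from property 1, and for the demo it simply recognizes one canned "proof" of one canned statement, so that the search terminates.

```python
from itertools import product

ALPHABET = "01"  # proofs are finite strings over some fixed alphabet

def proves(P, A):
    """Stub of the hypothetical checker from property 1: always halts and
    decides whether string P is a valid proof of statement A. In this toy
    demo, the statement "S" is 'proved' exactly by the string "101"."""
    return (A, P) == ("S", "101")

def decide_halting(S_M, not_S_M):
    """If every true sentence had a proof, this loop would decide halting:
    enumerate candidate proofs in order of length, checking each one
    against S(M) and against NOT S(M)."""
    for i in range(0, 10):            # in the real argument: i = 0, 1, 2, ...
        for letters in product(ALPHABET, repeat=i):
            P = "".join(letters)
            if proves(P, S_M):
                return "M halts"
            if proves(P, not_S_M):
                return "M loops"
    return None                        # unreachable if a proof always exists

print(decide_halting("S", "NOT S"))    # prints: M halts
```

The contradiction in the theorem comes precisely from the fact that no real proves routine can make this loop always terminate with the right answer.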

APPENDIX

Here we show how, given a program M, we can construct a statement S(M) in FOA such that S(M) is a theorem iff M halts.

First we must define a programming language. We choose a very primitive, but still completely general, programming language known as the Turing machine, invented by Turing. In this language, a program M operates on a data structure consisting of an infinite string. This string is initially all 0s (think of 0 as a blank). Each position in the string contains a character from an alphabet, say 0–9 (different programs may have alphabets of different sizes). At each point in time, the program is looking at a particular position in the string, and acts on this position in a way specified by the current instruction.

The machine M has a number of instructions, say four instructions A, B, C, D. One of these instructions (by convention A) is the starting instruction. All instructions are of the same type: depending on the character at the current position of the string, three things are done: a new character is written over the current one; the current position is moved to the right (R), to the left (L), or is unchanged (U); and a new instruction is executed next. That is, every instruction is a case statement; for example, instruction A could be

    A: if current = 0 then {write 2, move R, goto C}
       if current = 1 then {write 0, move L, goto A}
       if current = 2 then {write 2, move U, goto B}
       ...
       if current = 9 then {write 7, move L, goto F}

The other instructions of M, instructions B, C, and D, are similar. There is also a special instruction F ("freeze!") on which the program halts. Programs in this language may have many more instructions or symbols. The point is that this primitive programming language can simulate any other programming language, like C++! (Does this sound believable? It's true!)
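This instruction format translates directly into a small simulator. In the sketch below, instruction A's rules for the symbols 0, 1, 2, and 9 are taken from the example above, while the rules for B and C (and the whole machine being this small) are my own hypothetical choices, made so that the run terminates; each step prints a configuration in the style used in these notes, with the instruction name just to the left of the current position.

```python
# program[instr][symbol] = (symbol_to_write, move, next_instr)
# A's cases for 0, 1, 2, 9 come from the text; B and C are hypothetical.
program = {
    "A": {0: (2, "R", "C"), 1: (0, "L", "A"), 2: (2, "U", "B"), 9: (7, "L", "F")},
    "B": {0: (0, "U", "F"), 2: (2, "U", "F")},
    "C": {0: (0, "L", "A")},
}

def run(start="A", max_steps=100):
    tape = {}                  # sparse tape; unwritten cells hold 0 (blank)
    pos, instr = 0, start
    visited = {0}              # extent of the "active part" of the string
    trace = []
    for _ in range(max_steps):
        lo, hi = min(visited), max(visited)
        # Configuration: active part of the string, with the instruction
        # name inserted just to the left of the current position.
        trace.append("".join((instr if p == pos else "") + str(tape.get(p, 0))
                             for p in range(lo, hi + 1)))
        if instr == "F":       # freeze: the program halts
            break
        write, move, nxt = program[instr][tape.get(pos, 0)]
        tape[pos] = write
        pos += {"R": 1, "L": -1, "U": 0}[move]
        visited.add(pos)
        instr = nxt
    return trace

print(" => ".join(run()))      # prints: A0 => 2C0 => A20 => B20 => F20
```

Note that "=>" here stands in for the configuration separator; the run starts from the all-blank configuration A0, exactly as in the computation traced below.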
To represent the operation of this program, we make the convention that we omit the infinite runs of 0s that lie to the left and right of the active part of the string, and represent a state of the program by the remaining string (without initial or trailing 0s) with the name of the instruction being executed inserted just to the left of the current position. This is called a configuration. For example, the configuration 18A93452 means that the program is about to execute instruction A, that the string is …0001893452000…, and that the current position is the one holding the 9 (the symbol immediately to the right of the instruction name). According to our description of instruction A above, at the next time step the configuration will be 1F873452, and the execution will stop.

A computation of the program M above can be denoted as a sequence of configurations, separated by the symbol ⇒. The first configuration is thus A0 (the current symbol is a 0, surrounded by infinitely many other 0s, and the instruction about to be executed is the start instruction A). From that (according to our example instruction A above) we go to configuration 2C0, and from there perhaps (if it is the case that C contains "if current = 0 then {write 0, move L, goto A}") to A20, and from there to B20, and so on. We represent this as

    A0 ⇒ 2C0 ⇒ A20 ⇒ B20 ⇒ …

Notice that, if we encode ⇒ as E, this becomes a hexadecimal number! (The alphabet 0–9, the instruction names A, B, C, D, F, and the separator E make exactly sixteen symbols.) The bottom line is that a sequence of configurations of a program can be thought of as a number.

We are now ready to construct S(M), the sentence of FOA that says "program M halts". S(M) is of the form

    S(M) = ∃x [ B(x) AND W(x) AND H(x) ],

stating that there is a number x which encodes a halting computation of machine M. The three conjuncts B(x), W(x), and H(x) are themselves FOA statements declaring that the computation x starts right (its first three symbols are A0⇒); that it works correctly (the parts between any three consecutive ⇒s follow from one another according to the instructions of the program); and that it halts (the symbol F appears somewhere).

Here is how you write these parts. B(x) is the following:

    ∃z [ x div 16^z = (A0E)_16 ].

It states that there is a number z (it will end up being the length of x minus 3) such that, if you divide the integer x by 16^z (thus effectively obtaining the first 3 hexadecimal digits of x), what you get is A0E, meaning that x starts with A0⇒ as needed.

H(x) is equally simple:

    ∃z ∃w [ x div 16^z = w*16 + F_16 ],

that is, if you delete the last z symbols of x, then the last symbol of the result is F; in other words, the freeze instruction F appears somewhere in the computation (presumably near the end), and thus M halts.

W(x) is a little more complicated.
It states that for any three consecutive occurrences of E (that is, of ⇒) in x, the two strings u and v between these occurrences represent configurations of M such that v is the next configuration after u:

    ∀z ∀z′>z ∀z″>z′ ∀w ∀w′ ∀w″ ∀u ∀v
      [ [ x div 16^z  = w*16  + E_16  AND
          x div 16^z′ = w′*16 + E_16  AND
          x div 16^z″ = w″*16 + E_16  AND
          u = w mod 16^(z′−z−1)       AND
          v = w′ mod 16^(z″−z′−1)     AND
          C(u) AND C(v) ]
        ⟹ N(u,v) ]

After the universal quantifiers of W(x), we have a sequence of conditions stating that there are ⇒s (that is, Es) at the positions z, z′, and z″ from the right end of the hexadecimal representation of x; that u and v are the numbers whose hexadecimal representations are the strings between these ⇒s; and that u and v are configurations (this is what C(u) and C(v) mean), that is, each contains exactly one instruction symbol (A, B, C, D, or F) and no ⇒s. The precise details of formula C(u) are very simple and are left as an exercise. Then W(x) goes on to say that, if all those conditions are met, then v is the next configuration after u (this is what the formula N(u,v) means). The writing of N(u,v) is quite cumbersome, as it has to take into account the details of all the instructions of M, and is also left as an exercise.
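The arithmetization trick can be checked with ordinary integer arithmetic: encode a computation as a hexadecimal number (with E standing for the separator ⇒) and test B(x)- and H(x)-style conditions by integer division. A small sketch; the particular halting trace used here, ending in the configuration F20, is my own assumed continuation of the example computation:

```python
# Encode the computation  A0 => 2C0 => A20 => B20 => F20  as one hex number,
# with E standing for the separator symbol.
x = int("A0E2C0EA20EB20EF20", 16)

def hex_len(n):
    """Length of n in hexadecimal digits."""
    return len(format(n, "X"))

# B(x): there exists z with  x div 16^z = (A0E)_16,
# i.e. the computation begins with A0 followed by a separator.
assert any(x // 16**z == 0xA0E for z in range(hex_len(x)))

# H(x): there exist z, w with  x div 16^z = w*16 + F_16,
# i.e. the freeze instruction F appears somewhere in x.
assert any((x // 16**z) % 16 == 0xF for z in range(hex_len(x)))

print("B(x) and H(x) hold for x =", format(x, "X"))
```

W(x) would be checked the same way, by extracting the substrings u and v between consecutive Es with div and mod and comparing them against the machine's transition table.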