Reasoning and writing about algorithms: some tips

Theory of Algorithms, Winter 2016, U. Chicago
Notes by A. Drucker

The suggestions below address common issues arising in student homework submissions. Absorbing these lessons should be worthwhile in this class and beyond.

Outline your writeup. Here is a widely applicable outline for the HW problems:

1. We define the following algorithm A: [blah blah]

2. We claim that on every input x, A(x) halts with the desired output (as specified in the problem statement). Proof: [blah blah]

3. We claim that on any input x with input size-parameter n, the runtime of A is at most F(n) [for some explicit function F]. Proof: [blah blah]

You may want to write a similar outline for the individual parts. For example, if you are writing a divide-and-conquer algorithm, you will probably want an algorithm with: a base case; a way of taking a problem instance and generating recursive calls to related, smaller instances; and a way of combining these smaller problems' solutions to obtain a solution to the main problem. It may help to draw an outline of these parts and then fill in the details of what should happen in each part (see the sketch below). Similarly, the proofs of correctness and efficiency will use induction and have a predictable structure, and it may help to diagram these.
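To make the three parts concrete, here is a minimal divide-and-conquer sketch in Python. The problem (finding the maximum of a nonempty list) and the name max_dc are hypothetical choices for illustration only; they are not taken from these notes.

    # A minimal divide-and-conquer sketch (illustrative only).
    def max_dc(a):
        # 1. Base case: a one-element list is handled directly.
        if len(a) == 1:
            return a[0]
        # 2. Recursive calls on related, smaller instances.
        mid = len(a) // 2
        left_max = max_dc(a[:mid])
        right_max = max_dc(a[mid:])
        # 3. Combine the smaller solutions into a solution to the main problem.
        return left_max if left_max >= right_max else right_max

    # Example: max_dc([3, 1, 4, 1, 5]) returns 5.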

The 3-part solution outline listed above might sound obvious, but student submissions often deviate from this format and usually are worse as a result, for one of several reasons:

- omission: forgetting to provide a clear proof of correctness or efficiency;

- redundancy: writing down more thoughts and observations than are needed to solve the problem, e.g. explaining things once vaguely and then again with more details;

- confusion: the submission is poorly organized and it's not clear what is being proved at any given point in the argument.

An important, closely related point:

Separate algorithm from analysis. This is extremely important and comes up often. It is almost always best to first state your algorithm in clear pseudocode, then start making and proving assertions about what it does. Students who fail to follow this approach may include some partial analysis observations while motivating or specifying the algorithm, and mistakenly believe this is sufficient to show correctness.

The only common exception to the separation rule is that you might want to first define a subroutine, prove that it correctly solves some sub-task, and then use that subroutine to specify a larger algorithm that solves your main problem. For example, when we use efficient data structures, we like to specify and analyze their implementations first, and then we can use them freely in various applications.

As with ordinary programming, you have the right to include comments in your pseudocode, indicating what it is supposed to be achieving at a given point in the code. But these are just helpful hints to the reader that may aid in verifying the actual analysis, which comes after.

Separate program variables from the quantities we wish to compute. This is part of the previous point. Here is a simple example: suppose that on input an integer N > 2, our goal is to compute the Nth Fibonacci number f_N.

Algorithm: Define the algorithm A as follows:

    Input: integer N > 2.
    Create an integer array a[1:N] of length N;
    Set a[1] ← 1, a[2] ← 1;
    For i = 3 to N:
        Set a[i] ← a[i-1] + a[i-2];
        // Comment: I will claim that a[i] = f_i ...
    Output a[N];
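For reference, here is a direct Python transcription of algorithm A (a sketch added for illustration, not part of the original notes). The separately defined fib_target is a hypothetical helper standing in for the mathematical targets f_i, and the assert records the claim a[i] = f_i that the correctness argument below establishes by induction.

    # Illustrative Python version of algorithm A (a sketch, not the notes' code).
    def fib_target(i):
        # Mathematical definition of f_i (for checking only, not part of A).
        return 1 if i <= 2 else fib_target(i - 1) + fib_target(i - 2)

    def algorithm_A(N):
        a = [0] * (N + 1)      # program variables a[1..N]; index 0 unused
        a[1], a[2] = 1, 1      # base cases
        for i in range(3, N + 1):
            a[i] = a[i - 1] + a[i - 2]
            assert a[i] == fib_target(i)   # the claim: a[i] equals f_i
        return a[N]

    # Example: algorithm_A(10) returns 55.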

Now to prove correctness, one can just prove that for any fixed input value N, each entry a[i] equals f_i, which implies that the output a[N] is the desired value. This is very easy, since the array entries obey the same base cases and recurrence relation that one typically uses to define f_i in the first place. But in more interesting examples, the algorithm may use a different method of computation from that in the definition, yet arrive at the same quantities.

In this class we need to prove that the program variables end up with the desired values that we have defined mathematically as our targets. For this reason, it is key that you give the program variables different names from the target values. Otherwise you will confuse the reader, and you might confuse yourself into thinking that there is no need to prove correctness.

For example, it is typical to use OPT to denote the maximum achievable value of some optimization problem, and then we often define an ith subproblem whose optimum value is referred to as OPT(i). You might observe that these values obey some recurrence relation, and it is natural to write an iterative program to compute these quantities. Just give the program variables a different name, say Y_i, and prove by induction on i that your computed values satisfy Y_i = OPT(i).

Use induction. It is always worth emphasizing this. When analyzing an algorithm that takes many steps, we are always trying to show some kind of gradual progress over time. The additional progress we make at each execution step relies on what we have accomplished in prior steps, so we should analyze step by step. It is usually best to formalize such an argument by induction (instead of something hand-wavy like "the pattern becomes clear"). Being explicit about this makes it clear to graders that you know how the proof is supposed to look, even if some element of your inductive proof ends up being flawed.
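Here is a hedged sketch of the OPT-versus-Y convention in Python. The problem is hypothetical, chosen only for illustration: given nonnegative values v_1, ..., v_n, let OPT(i) be the maximum total of a subset of the first i values with no two adjacent values chosen, so that OPT(i) = max(OPT(i-1), OPT(i-2) + v_i). The program array is deliberately named Y, and the inductive claim one would prove is that Y[i] = OPT(i) for every i.

    # Illustrative dynamic program (not from the notes). Y holds the program's
    # computed values; OPT(i) is the mathematically defined target it should match.
    def compute_Y(v):
        n = len(v)
        Y = [0] * (n + 1)      # program variables Y[0..n]; v is 1-indexed via v[i-1]
        if n >= 1:
            Y[1] = v[0]
        for i in range(2, n + 1):
            Y[i] = max(Y[i - 1], Y[i - 2] + v[i - 1])
            # Inductive claim to prove: Y[i] equals OPT(i).
        return Y[n]

    # Example: compute_Y([3, 2, 5, 10, 7]) returns 15 (take 3, 5, and 7).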

When using induction, clearly identify the induction variable k and the statement S(k) to be proved for a given value of k. In the Fibonacci algorithm I gave above, it might be tempting to state and inductively prove the statement S(k): "algorithm A halts with the correct output f_k on input k." The difficulty with this is that A is an iterative algorithm, not a recursive one. A does not call itself on smaller inputs, so it is not so clear how to use the inductive hypothesis to extend to the value k + 1. Instead, we fix an input N, and we prove by induction on k in {1, 2, ..., N} that during this execution of A(N), each program variable a[k] eventually gets set to f_k. This is easy.

In general, the two key features S(k) should have are:

1. S(k) should capture the essential progress we have made that makes further progress possible (so we need S(k) to contain the information that allows us to derive S(k + 1));

2. the final statement S(N) should imply that our algorithm has reached our goal.

These points suggest that we should make S(k) as strong and informative as we can. This is not completely true, however, because the stronger we make S(k), the more we also have to prove when deriving S(k + 1).

Avoid logical pitfalls by paying close attention to language. Here is a proposal for a "linear-time sorting algorithm," ULTRASORT: on input a_1, ..., a_n, if n = 1 then do nothing, else split the list in half and apply ULTRASORT to each half. I claim the algorithm is correct on any input a_1, ..., a_n. Proof: induction on n. The base case is easy: any 1-element list is sorted. Now suppose the claim is true for n = 1, ..., N, and consider n = N + 1. By the inductive hypothesis, the left half ends up sorted and the right half ends up sorted, so the whole list is sorted!

Of course this is bogus. The source of the error is failing to analyze the meaning of "sortedness" carefully enough and to distinguish between a list whose two halves are internally sorted and one that is globally sorted.
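What the bogus proof silently assumes is a combining step that ULTRASORT does not have. For contrast, here is a standard merge-sort sketch in Python (added for illustration, not from the notes); the explicit merge is exactly what lets the inductive argument go through, turning two internally sorted halves into a globally sorted list.

    # Standard merge sort (illustrative). Unlike ULTRASORT, it merges the two
    # sorted halves, which is the step the bogus inductive proof skipped over.
    def merge_sort(a):
        if len(a) <= 1:                 # base case: a 1-element list is sorted
            return a
        mid = len(a) // 2
        left = merge_sort(a[:mid])      # inductive hypothesis: left half sorted
        right = merge_sort(a[mid:])     # inductive hypothesis: right half sorted
        # Merge: repeatedly take the smaller front element of the two halves.
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    # Example: merge_sort([3, 1, 2]) returns [1, 2, 3], whereas ULTRASORT would
    # leave both "sorted" halves in place and return the list still unsorted.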

A similar but trickier issue appeared in one of our HW problems. Suppose we are trying to find a local minimum of a rectangular array of numbers. One approach is to just chop the array in half and find a local minimum on one of the two sides. But what does this mean? Values on the rightmost edge of the left half may be considered local minima when we restrict our attention to their adjacent cells within the left half, but they also have neighbors in the right half, and we need to consider these too when trying to argue that what a recursive call gives us is a solution to our larger problem.

To obtain a correct, efficient algorithm for this problem, it turns out to be helpful to think about the more subtle notion of an internal local minimum within a grid: one that does not live on the outer boundary. This distinction should get us thinking about making a stronger inductive hypothesis about what a recursive call on a subgrid returns, so that we do not get back a faulty local minimum that lives on the edge and may fail to be a local minimum in the larger grid.
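To make the distinction concrete, here is a small illustrative check in Python (a sketch under assumed conventions, not the HW algorithm): a cell is a local minimum of the full grid only if none of its neighbors in the full grid is smaller, including neighbors that sit across the cut.

    # Is cell (r, c) a local minimum of the FULL grid? A cell that looked like a
    # local minimum within one half can fail this test because of a smaller
    # neighbor just across the dividing line.
    def is_local_min(grid, r, c):
        rows, cols = len(grid), len(grid[0])
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:   # the four adjacent cells
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] < grid[r][c]:
                return False
        return True

    # Example: in the grid below, the cell holding 2 is a local minimum of the
    # left half (columns 0 and 1) taken alone, but not of the full grid, because
    # its neighbor across the cut holds 1.
    grid = [[5, 2, 1],
            [6, 7, 8]]
    # is_local_min(grid, 0, 1) -> False    is_local_min(grid, 0, 2) -> True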