Fall 2018 Lecture 7. Stephen Brookes


15-150 Fall 2018 Lecture 7 Stephen Brookes

last time
- Sorting a list of integers
- Specifications and proofs
- Helper functions that really help

principles
- Every function needs a spec
- Every spec needs a proof
- Recursive functions need inductive proofs
- Learn to pick an appropriate method...
- Choose helper functions wisely!
The proof of msort was easy, because of split and merge.

so far
- Sorting for integer lists
- Specifications and correctness
- But what about efficiency?

work
An estimate of the number of evaluation steps on a sequential processor:
- basic ops count as one unit of work
- add the work for sub-expressions
- some ops, like @, incur an extra cost
Many ML constructs use left-to-right evaluation:
  e1 + e2    e1 :: e2    e1 @ e2

work rules
W(e) is the work for e.
  W(e1 + e2) = W(e1) + W(e2) + 1
  W(e1 :: e2) = W(e1) + W(e2) + 1
  W(e1 @ e2) = W(e1) + W(e2) + length e1 + 1
  W(if e0 then e1 else e2) = W(e0) + max(W(e1), W(e2)) + 1
  W((f x, f y)) = W(f x) + W(f y) + 1
Sequential evaluation: f x, then f y, then build the pair.

span
An estimate of the number of evaluation steps, assuming unlimited parallelism:
- for dependent sub-expressions, add the spans
- for independent sub-expressions, max the spans
Independent: no data or control dependency, so the sub-expressions can be evaluated in parallel.
GOVERNMENT HEALTH WARNING: ML constructs are inherently sequential.

span rules
S(e) is the span for e.
  S(e1 + e2) = S(e1) + S(e2) + 1
  S(e1 :: e2) = S(e1) + S(e2) + 1
  S(e1 @ e2) = S(e1) + S(e2) + length e1 + 1
  S(if e0 then e1 else e2) = S(e0) + max(S(e1), S(e2)) + 1
For sequential e, work = span.
  S((f x, f y)) = max(S(f x), S(f y)) + 1
  W((f x, f y)) = W(f x) + W(f y) + 1
For parallelizable e, span < work.

example
fun fib (n : int) : int =
  if n <= 1 then 1
  else let val (x, y) = (fib(n-1), fib(n-2))   (* could be evaluated independently *)
       in x + y end

  Wfib(n) = Wfib(n-1) + Wfib(n-2) + 1
  Sfib(n) = max(Sfib(n-1), Sfib(n-2)) + 1
Wfib(n) is exponential; Sfib(n) is linear.

example
- fun W_fib n = if n <= 1 then 1 else W_fib(n-1) + W_fib(n-2) + 1;
val W_fib = fn : int -> int
- [W_fib 10, W_fib 20, W_fib 30, W_fib 40];
val it = [177,21891,2692537,331160281] : int list
- fun S_fib n = if n <= 1 then 1 else Int.max(S_fib(n-1), S_fib(n-2)) + 1;
val S_fib = fn : int -> int
- [S_fib 10, S_fib 20, S_fib 30, S_fib 40];
val it = [10,20,30,40] : int list

caveat
In ML, pairs are not parallel; execution is sequential. So in ML, fib n has work = span.
The span speed-up is only achievable on parallel hardware. Nevertheless, it's important for you to be aware of the potential for speed-up!
Evaluating fib n, even on parallel hardware, causes lots of duplicated effort: there are faster ways to fib, even sequentially!
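To make the last remark concrete, here is one way to compute the same sequence with O(n) work (an illustrative sketch, not the lecture's code; fastfib and fib' are hypothetical names). It carries a pair of consecutive Fibonacci numbers instead of recomputing them:

```sml
(* fastfib n = (fib n, fib (n+1)), using the slide's convention
   fib 0 = fib 1 = 1.  One recursive call per step: O(n) work. *)
fun fastfib (0 : int) : int * int = (1, 1)
  | fastfib n =
      let val (a, b) = fastfib (n - 1)
      in (b, a + b) end

fun fib' (n : int) : int = #1 (fastfib n)

(* fib' 10 = 89, agreeing with fib 10 from the slide's definition *)
```

Each level of recursion now does constant work, so there is no duplicated effort to parallelize away.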

work and span
- work is the number of evaluation steps, assuming sequential processing
- span is the number of evaluation steps, assuming unlimited parallelism
- span is always ≤ work
- for sequential code, span = work

work
msort L = let val (A, B) = split L in merge (msort A, msort B) end,  when length L > 0
Let Wmsort(n) = work of msort L when length L = n.
  Wmsort(n) = Wsplit(n) + 2 Wmsort(n div 2) + Wmerge(n)
  Wmsort(n) = O(n) + 2 Wmsort(n div 2)
so Wmsort(n) is O(n log n).
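For reference, the msort under analysis might be written out as follows, with split and merge helpers in the spirit of the previous lecture (the helper definitions here are illustrative assumptions, not necessarily the lecture's exact code):

```sml
(* split L = (A, B): A and B interleave the elements of L and
   differ in length by at most one.  O(n) work. *)
fun split ([] : int list) : int list * int list = ([], [])
  | split [x] = ([x], [])
  | split (x :: y :: rest) =
      let val (A, B) = split rest
      in (x :: A, y :: B) end

(* merge (A, B): merge two sorted lists into one sorted list.  O(n) work. *)
fun merge ([] : int list, B : int list) : int list = B
  | merge (A, []) = A
  | merge (x :: A, y :: B) =
      if x <= y then x :: merge (A, y :: B)
      else y :: merge (x :: A, B)

fun msort ([] : int list) : int list = []
  | msort [x] = [x]
  | msort L =
      let val (A, B) = split L
      in merge (msort A, msort B) end
```

The two recursive calls are independent, which is exactly what the span analysis below exploits.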

assessment
msort(L) does O(n log n) work, where n is the length of L.
List operations are inherently sequential: e1 :: e2 evaluates e1 first, then e2, and split and merge are not easily parallelizable.
We could use parallel evaluation in msort(L) for the independent recursive calls msort A and msort B. How would this affect runtime?

span
msort L = let val (A, B) = split L in merge (msort A, msort B) end
(msort A and msort B are independent)
Let Smsort(n) = span of msort L when length L = n.
  Smsort(n) = Ssplit(n) + Smsort(n div 2) + Smerge(n)
  Smsort(n) = O(n) + Smsort(n div 2)
so Smsort(n) is O(n).

work and span
  work: Wmsort(n) = O(n) + 2 Wmsort(n div 2), so Wmsort(n) is O(n log n)
  span: Smsort(n) = O(n) + Smsort(n div 2), so Smsort(n) is O(n)
Since O(n) is smaller than O(n log n), mergesort is potentially worth parallelizing.

summary
msort(L) has O(n log n) work and O(n) span, so the potential speed-up factor from parallel evaluation is O(log n): in principle, we can speed up mergesort on lists by a factor of log n.
To do any better, we need a different data structure.

next
Trees are better than lists for parallel evaluation.
- Sorting a tree
- Specifications and proofs
- Asymptotic analysis
- Insertion
- Parallel mergesort

integer trees
datatype tree = Empty | Node of tree * int * tree
A user-defined type named tree, with constructors Empty and Node:
  Empty : tree
  Node : tree * int * tree -> tree

tree values
An inductive definition: a tree value is either Empty, or has the form Node(t1, x, t2), where t1 and t2 are tree values and x is an integer.
Contrast with integer lists: a list value is either nil, or has the form x :: L, where L is a list value and x is an integer.

structural definition
To define a function F on all trees:
- Base clause: define F(Empty).
- Inductive clause: define F(Node(t1, x, t2)) using x, F(t1), and F(t2).
That's enough! Why? Contrast with the structural definition scheme for lists.
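As a worked instance of this template (a hypothetical example, not from the lecture), the sum of a tree's entries has exactly one base clause and one inductive clause:

```sml
datatype tree = Empty | Node of tree * int * tree

(* Base clause: sum Empty.
   Inductive clause: sum (Node(t1, x, t2)) in terms of x, sum t1, sum t2. *)
fun sum (Empty : tree) : int = 0
  | sum (Node (t1, x, t2)) = sum t1 + x + sum t2

(* sum (Node (Node (Empty, 1, Empty), 2, Node (Empty, 3, Empty))) = 6 *)
```

The datatype declaration is repeated here only to keep the sketch self-contained.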

structural induction
To prove "for all tree values t, P(t) holds" by structural induction on t:
- Base case: prove P(Empty).
- Inductive case: assume the induction hypothesis, P(t1) and P(t2); prove P(Node(t1, x, t2)), for all integers x.
That's enough! Why? Contrast with structural induction for lists.

tree patterns
Patterns have the forms Empty and Node(p1, p, p2).
  pattern                  matching values
  Empty                    an empty tree
  Node(_, _, _)            a non-empty tree
  Node(Empty, _, Empty)    a tree with one node
  Node(_, 42, _)           a tree with 42 at root
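These patterns can be used directly in a case expression. The describe function below is a hypothetical illustration (not from the lecture); clauses are tried top to bottom, so the first matching pattern wins:

```sml
datatype tree = Empty | Node of tree * int * tree

(* Classify a tree using the patterns from the slide, most specific first. *)
fun describe (t : tree) : string =
    case t of
        Empty                    => "an empty tree"
      | Node (Empty, _, Empty)   => "a tree with one node"
      | Node (_, 42, _)          => "a tree with 42 at root"
      | Node (_, _, _)           => "a non-empty tree"

(* describe (Node (Empty, 7, Empty)) = "a tree with one node" *)
```

Note that the catch-all Node(_, _, _) must come last, or it would shadow the more specific clauses.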

patterns match values
Empty matches t iff t is Empty.
Node(p1, p, p2) matches t iff t is Node(t1, v, t2) such that p1 matches t1, p matches v, and p2 matches t2; the match combines all the bindings.
Example: Node(A, x, B) matches the tree Node(Node(Empty,4,Empty), 3, Node(Empty,2,Empty)) (root 3, with children 4 and 2), binding x to 3, A to Node(Empty,4,Empty), and B to Node(Empty,2,Empty).

tree code
fun Leaf (x : int) : tree = Node(Empty, x, Empty)
fun Full (x : int, n : int) : tree =
  if n = 0 then Empty
  else let val T = Full(x, n-1) in Node(T, x, T) end
[Figure: Leaf 42 is a single node labeled 42; Full(42,3) is a full binary tree of 42s with depth 3.]

size
fun size Empty = 0
  | size (Node(t1, _, t2)) = size t1 + size t2 + 1
Uses tree patterns, and the recursion is structural: size(Node(t1, v, t2)) calls size(t1) and size(t2).
Easy to prove by structural induction that for all trees t, size(t) = the number of nodes, a non-negative integer.

size matters
Size is always non-negative: size(t) ≥ 0.
Children have smaller size: size(ti) < size(Node(t1, x, t2)).
Many recursive functions on trees make recursive calls on trees with smaller size. Use induction on size to prove correctness.

depth (or height)
fun depth Empty = 0
  | depth (Node(t1, _, t2)) = Int.max(depth t1, depth t2) + 1
Can prove by structural induction that for all trees t, depth(t) = the length of the longest path from the root to a leaf node, a non-negative integer.

depth matters
For all trees t, depth(t) ≥ 0.
Children have smaller depth: depth(ti) < depth(Node(t1, x, t2)).
Many recursive functions on trees make recursive calls on trees with smaller depth. Can use induction on depth to prove properties or analyze efficiency.

exercises
Prove that for all n ≥ 0:
  size(Full(42, n)) = 2^n - 1
  depth(Full(42, n)) = n
Example: for Full(42, 3), size is 7 and depth is 3.
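Before proving the identities, it can help to check them on a small case. A self-contained sketch (repeating the definitions of size, depth, and Full from the earlier slides):

```sml
datatype tree = Empty | Node of tree * int * tree

fun size Empty = 0
  | size (Node (t1, _, t2)) = size t1 + size t2 + 1

fun depth Empty = 0
  | depth (Node (t1, _, t2)) = Int.max (depth t1, depth t2) + 1

fun Full (x : int, n : int) : tree =
    if n = 0 then Empty
    else let val T = Full (x, n - 1) in Node (T, x, T) end

(* Check the claimed identities at n = 3: 2^3 - 1 = 7. *)
val 7 = size (Full (42, 3))
val 3 = depth (Full (42, 3))
```

The `val 7 = ...` bindings fail at runtime if the equations don't hold, so they act as lightweight tests in the REPL.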

in-order traversal
inord : tree -> int list
fun inord Empty = [ ]
  | inord (Node(t1, x, t2)) = inord t1 @ (x :: inord t2)
Left before root before right: inord t = the in-order traversal list for t.
[Figure: a tree with root 42, left child 2, and right subtree rooted at 21 with children 3 and 7; inord gives [2, 42, 3, 21, 7].]

inord
fun inord Empty = [ ]
  | inord (Node(t1, x, t2)) = inord t1 @ (x :: inord t2)
For all trees T, length (inord T) = size T.
(Uses: for all lists L1, L2 of the same type, length (L1 @ L2) = length L1 + length L2.)

work analysis
fun inord Empty = [ ]
  | inord (Node(t1, x, t2)) = inord t1 @ (x :: inord t2)
Let Winord(n) be the work to evaluate inord(T) when T is a full binary tree of depth n, so depth(T) = n and size(T) = 2^n - 1.
  Winord(0) = 1
If T = Node(A, x, B) is full and depth(T) = n > 0, then size(A) = size(B) = 2^(n-1) - 1. Since the work for L1 @ L2 is O(length L1),
  Winord(n) = 2 Winord(n-1) + O(2^n), for n > 0
so Winord(n) is O(n 2^n).

faster inord
inorder : tree * int list -> int list
fun inorder (Empty, L) = L
  | inorder (Node(t1, x, t2), L) = inorder (t1, x :: inorder (t2, L))
Theorem: for all trees T and integer lists L, inorder (T, L) = (inord T) @ L.
The work for inorder(T, L), when T is a full tree of depth n, is O(2^n).
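Taking L = [] in the theorem gives inorder (T, []) = inord T, which can be spot-checked on a small tree (the test tree below is a hypothetical example; the definitions are repeated to keep the sketch self-contained):

```sml
datatype tree = Empty | Node of tree * int * tree

fun inord Empty = []
  | inord (Node (t1, x, t2)) = inord t1 @ (x :: inord t2)

fun inorder (Empty, L : int list) = L
  | inorder (Node (t1, x, t2), L) = inorder (t1, x :: inorder (t2, L))

(* A small test tree whose in-order traversal is [2, 42, 3, 21, 7]. *)
fun Leaf x = Node (Empty, x, Empty)
val T = Node (Leaf 2, 42, Node (Leaf 3, 21, Leaf 7))

val true = (inorder (T, []) = inord T)
```

The accumulator L replaces the append at each node with a single :: step, which is where the O(2^n) versus O(n 2^n) difference comes from.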