Lecture 6: Sequences II. Parallel and Sequential Data Structures and Algorithms, 15-210 (Fall 2013). Lectured by Danny Sleator, 12 September 2013.


Material in this lecture: Today's lecture is about reduction.

- Scan implementation (from the last lecture notes)
- Divide-and-conquer with reduction
- Cost of reduce when f is not constant work

1 Reduce Operation

Recall that the reduce function has the interface

  reduce f I S : (α × α → α) → α → α seq → α

When the combining function f is associative, i.e., f(f(x, y), z) = f(x, f(y, z)) for all x, y, and z of type α, reduce returns the "sum" with respect to f of the input sequence S. It is the same result returned by iter f I S. The reason we include reduce is that it is parallel, whereas iter is strictly sequential. Note, though, that iter can use a more general combining function, with type β × α → β.

The results of reduce and iter, however, may differ if the combining function is non-associative. In this case, the order in which the reduction is performed determines the result; because the function is non-associative, different orderings will lead to different answers. While we might try to apply reduce only to associative operations, unfortunately even some functions that seem to be associative are actually not. For instance, floating-point addition and multiplication are not associative. In SML/NJ, integer addition is also not associative because of the overflow exception.

To properly deal with combining functions that are non-associative, it is therefore important to specify the order in which the combining function is applied to the elements of a sequence. This order is part of the specification of the ADT Sequence. In this way, every (correct) implementation returns the same result when applying reduce; the results are deterministic regardless of what data structure and algorithm are used. For this reason, we define a specific combining tree, which is defined quite carefully in the library documentation for reduce. This tree is the same as if we rounded up the length of the input sequence to the next power of 2, i.e., |x| = 2^k, and then put a perfectly balanced binary tree¹ over the sequence with 2^k leaves. Wherever we are missing children in the tree, we don't apply the combining function. An example is shown in the following figure.

Lecture notes by Umut A. Acar, Guy E. Blelloch, Margaret Reid-Miller, and Kanat Tangwongsan.

¹ A perfect binary tree is a tree in which every node other than the leaves has exactly 2 children.
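To make this combining order concrete, here is a minimal sketch in SML over plain lists (not the 15-210 Sequence library; reduceList and its helper are illustrative assumptions, not the course implementation). It splits at the largest power of two strictly below the length, which corresponds exactly to the balanced tree over the next power of two with the missing "dummy" leaves never combined.

  (* reduceList f id s: reduce s with combining function f, following the
     combining tree described above.  For a singleton it returns the element
     itself (assuming id is an identity for f). *)
  fun reduceList (f : 'a * 'a -> 'a) (id : 'a) (s : 'a list) : 'a =
    case s of
      [] => id
    | [x] => x
    | _ =>
        let
          val n = List.length s
          (* largest power of two strictly less than n *)
          fun half p = if 2 * p < n then half (2 * p) else p
          val k = half 1
          val (l, r) = (List.take (s, k), List.drop (s, k))
        in
          f (reduceList f id l, reduceList f id r)
        end

  (* Integer addition is associative (ignoring overflow), so any order
     gives the same answer: 1+2+3+4+5+6 = 21. *)
  val twentyOne = reduceList (op +) 0 [1, 2, 3, 4, 5, 6]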

[Figure: the combining tree for a six-element sequence x_1, ..., x_6. The tree has 2^3 = 8 leaf positions; the last two hold "dummy" elements, and the combining function is never applied to them.]

Later on today, we will offer an explanation of why we chose this particular combining order.

1.1 Divide and Conquer with Reduce

Now, let's look back at the divide-and-conquer algorithms you have encountered so far. Many of these algorithms have a divide step that simply splits the input sequence in half; they then solve the subproblems recursively and finish with a combine step. This leads to the following structure, where everything except emptyVal, base, and someMessyCombine is generic, and those three pieces are specific to the particular algorithm.

  function myDandC(S) =
    case showt(S) of
      EMPTY => emptyVal
    | ELT(v) => base(v)
    | NODE(L, R) =>
        let (L', R') = (myDandC(L) || myDandC(R))
        in someMessyCombine(L', R')
        end

Algorithms that fit this pattern can be implemented in one line using the sequence reduce function. Turning such a divide-and-conquer algorithm into a reduce-based solution is as simple as invoking reduce with the following parameters:

  reduce someMessyCombine emptyVal (map base S)

We will take a look at two examples where reduce can be used to implement a relatively sophisticated divide-and-conquer algorithm. Both problems should be familiar to you.
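As a tiny concrete instantiation of this pattern (a sketch reusing the reduceList helper above; the names base, someMessyCombine, and emptyVal mirror the generic pattern, and counting elements is chosen only because its pieces are trivial):

  (* Counting the elements of a sequence with the generic pattern:
     base maps every element to 1, the combine step is addition,
     and the empty value is 0. *)
  val base = fn _ => 1
  val someMessyCombine = op +
  val emptyVal = 0

  fun count s = reduceList someMessyCombine emptyVal (List.map base s)

  val three = count [#"a", #"b", #"c"]   (* evaluates to 3 *)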

Algorithm 4: MCSS Using Reduce.

The first example is the Maximum Contiguous Subsequence Sum (MCSS) problem from last lecture. Given a sequence S of numbers, find the contiguous subsequence that has the largest sum; more formally:

  $$\textit{mcss}(S) = \max\Big\{ \sum_{k=i}^{j} S_k \;:\; 1 \le i \le n,\ i \le j \le n \Big\}.$$

Recall that the divide-and-conquer solution involved returning four values from each recursive call on a sequence S: the desired result mcss(S), the maximum prefix sum of S, the maximum suffix sum of S, and the total sum of S. We will denote these as M, P, S, T, respectively. To solve the MCSS problem we can then use the following implementations for combine, base, and emptyVal:

  function combine((M_L, P_L, S_L, T_L), (M_R, P_R, S_R, T_R)) =
    (max(S_L + P_R, M_L, M_R),
     max(P_L, T_L + P_R),
     max(S_R, S_L + T_R),
     T_L + T_R)

  function base(v) = (v, v, v, v)

  emptyVal = (-∞, -∞, -∞, 0)

and then solve the problem with:

  function mcss(S) = reduce combine emptyVal (map base S)

Stylistic Notes. We have just seen that we can either spell out the divide-and-conquer steps in detail or condense our code into just a few lines that take advantage of the almighty reduce. So which is preferable, using the divide-and-conquer code or using reduce? We believe this is a matter of taste. Clearly, your reduce code will be (a bit) shorter, and for simple cases, easy to write. But when the code is more complicated, the divide-and-conquer code is easier to read, and it exposes more clearly the inductive structure of the code and so is easier to prove correct.

Restriction. You should realize, however, that this pattern does not work in general for divide-and-conquer algorithms. In particular, it does not work for algorithms that do more than a simple split that partitions their input into two parts in the middle. For example, it cannot be used for implementing quicksort, as the divide step partitions the data with respect to a pivot. This step requires picking a pivot and then filtering the data into elements less than, equal to, and greater than the pivot.
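The following is a minimal sketch of this solution in runnable SML, using ordinary lists and reals (so that Real.negInf can stand in for -∞), together with the illustrative reduceList helper sketched in Section 1; the course Sequence library would be used in 15-210 itself, so treat the names here as assumptions for illustration.

  (* MCSS via reduce.  base maps each element to the summary
     (mcss, max prefix, max suffix, total) of its singleton sequence;
     combine merges the summaries of two adjacent subsequences. *)
  fun combine ((mL, pL, sL, tL), (mR, pR, sR, tR)) =
        (Real.max (sL + pR, Real.max (mL, mR)),   (* best contiguous sum *)
         Real.max (pL, tL + pR),                  (* best prefix of L.R  *)
         Real.max (sR, sL + tR),                  (* best suffix of L.R  *)
         tL + tR)                                 (* total sum           *)

  fun base v = (v, v, v, v)
  val emptyVal = (Real.negInf, Real.negInf, Real.negInf, 0.0)

  fun mcss (s : real list) : real =
        #1 (reduceList combine emptyVal (List.map base s))

  val five = mcss [1.0, ~2.0, 3.0, 2.0, ~4.0]   (* 3.0 + 2.0 = 5.0 *)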

2 Analyzing the Costs of Higher Order Functions

In Section 1 we looked at using reduce to solve divide-and-conquer problems. In the MCSS problem the combining function f had O(1) cost (i.e., both its work and span are constant). In that case the cost specification of reduce on a sequence of length n is simply O(n) work and O(log n) span. Does that hold true when the combining function does not have constant cost?

For map it is easy to find its costs based on the cost of the function applied:

  $$W(\textit{map}\ f\ S) = 1 + \sum_{s \in S} W(f(s))$$
  $$S(\textit{map}\ f\ S) = 1 + \max_{s \in S} S(f(s))$$

Tabulate is similar. But can we do the same for reduce?

Merge Sort. As an example, let's consider merge sort. As you have likely seen in previous courses, merge sort is a popular divide-and-conquer sorting algorithm with optimal work. It is based on a function merge that takes two already-sorted sequences and returns a sorted sequence containing all elements from both sequences. We can use our reduction technique for implementing divide-and-conquer algorithms to implement merge sort with a reduce. In particular, we can write a version of merge sort, which we refer to as reduceSort, as follows:

  combine  = merge_<
  base     = singleton
  emptyVal = empty()

  function reduceSort(S) = reduce combine emptyVal (map base S)

where merge_< is a merge function that uses an (abstract) comparison operator <. Note that merging is an associative function. Assuming a constant-work comparison function, two sequences S_1 and S_2 with lengths n_1 and n_2 can be merged with the following costs:

  $$W(\textit{merge}_<(S_1, S_2)) = O(n_1 + n_2)$$
  $$S(\textit{merge}_<(S_1, S_2)) = O(\log(n_1 + n_2))$$

What do you think the cost of reduceSort is?

2.1 Reduce: Cost Specifications

We want to analyze the cost of reduceSort. Does the reduction order matter? As mentioned before, if the combining function is associative, which it is in this case, all reduction orders give the same answer, so it seems like it should not matter.
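Here is a sketch of merge_< and reduceSort over SML lists, again reusing the illustrative reduceList helper from Section 1 (the comparison function is passed explicitly, and Int.< is used in the example only for concreteness):

  (* merge lt (xs, ys): merge two already-sorted lists into one sorted
     list, using the strict comparison lt. *)
  fun merge (lt : 'a * 'a -> bool) (xs : 'a list, ys : 'a list) : 'a list =
        case (xs, ys) of
          ([], _) => ys
        | (_, []) => xs
        | (x :: xs', y :: ys') =>
            if lt (y, x) then y :: merge lt (xs, ys')
            else x :: merge lt (xs', ys)

  (* reduceSort: map every element to a singleton, then reduce with merge. *)
  fun reduceSort lt s =
        reduceList (merge lt) [] (List.map (fn x => [x]) s)

  val sorted = reduceSort Int.< [5, 2, 9, 1, 5, 6]   (* [1, 2, 5, 5, 6, 9] *)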

To answer this question, let's consider the reduction order we get by sequentially adding in the elements one after the other. On input x = ⟨x_1, x_2, ..., x_n⟩, the sequence of merge_< calls looks like the following:

  merge_<(... merge_<(merge_<(merge_<(I, x_1), x_2), x_3), ...)

i.e., we first merge I and x_1, then merge in x_2, then x_3, etc. With this order, merge_< is called with a left argument that is a sequence of varying size between 1 and n-1, while its right argument is always a singleton sequence. The final merge combines an (n-1)-element sequence with a 1-element sequence, the second-to-last merge combines an (n-2)-element sequence with a 1-element sequence, and so forth. Therefore, the total work for an input sequence S of length n is

  $$W(\textit{reduceSort}\ S) \le \sum_{i=1}^{n-1} c\,(1 + i) \in O(n^2)$$

since merge on sequences of lengths n_1 and n_2 has O(n_1 + n_2) work.

Note that this reduction order is the order that the iter function uses, and hence it is equivalent to:

  function reduceSort'(S) = iter merge_< (empty()) (map singleton S)

Furthermore, using this reduction order, the algorithm is effectively working from the front to the rear, inserting each element into a sorted prefix, where it is placed at the correct location to maintain the sorted order. This corresponds to the well-known insertion sort. Clearly, this reduction order has no parallelism except within each merge, and therefore the span is

  $$S(\textit{reduceSort}\ x) \le \sum_{i=1}^{n-1} c \log(1 + i) \in O(n \log n)$$

since merge on sequences of lengths n_1 and n_2 has O(log(n_1 + n_2)) span.

Notice that in the reduction order above, the reduction tree was extremely unbalanced. Would the costs change if the merges were balanced? For ease of exposition, let's suppose that the length of our sequence is a power of 2, i.e., |x| = 2^k. Now we lay on top of the input sequence a full binary tree² with 2^k leaves and merge according to the tree structure. As an example, the merge sequence for |x| = 2^3 is shown below.

² This is simply a binary tree in which every node either has exactly 2 children or is a leaf, and all leaves are at the same depth.
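In SML this sequential order is just a left fold: foldl plays the role of iter here, and merge is the sketch from the previous example (again an illustration over plain lists rather than the Sequence library).

  (* Sequential reduction order: fold the singletons in one at a time.
     Each step merges the sorted prefix accumulated so far with a
     one-element list, i.e., performs one insertion; the total work is
     therefore O(n^2), with no parallelism across insertions. *)
  fun iterSort lt s =
        List.foldl (fn (x, sorted) => merge lt (sorted, [x])) [] s

  val sortedSeq = iterSort Int.< [5, 2, 9, 1]   (* [1, 2, 5, 9] *)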

[Figure: the balanced merge tree for x_1, ..., x_8; each internal node is a merge of its two children.]

Clearly, using this balanced combining tree gives a smaller span than the unbalanced tree. But does it also reduce the work cost?

At the bottom level, where the leaves are, there are n = |x| nodes with constant cost each (these were generated using a map). Stepping up one level, there are n/2 nodes, each corresponding to a merge call, each costing c(1 + 1). In general, at level i (with i = 0 at the root), we have 2^i nodes, where each node is a merge whose inputs are two sequences of length n/2^{i+1}. Therefore, the work of reduceSort using this reduction order is the familiar sum

  $$W(\textit{reduceSort}\ x) = \sum_{i=0}^{\log n} 2^i \cdot c\left(\frac{n}{2^{i+1}} + \frac{n}{2^{i+1}}\right) = \sum_{i=0}^{\log n} c \cdot 2^i \cdot \frac{n}{2^i}.$$

This sum, as you have seen before, evaluates to O(n log n). In fact, this algorithm is essentially the merge sort algorithm. Therefore we see that merge sort and insertion sort are just two special cases of reduceSort that use different reduction orders.

As discussed earlier, we need to precisely define the reduction order to determine the result of reduce when applied to a non-associative combining function. These two examples illustrate, however, that even when the combining function is associative, the particular reduction order we choose can lead to drastically different costs in both work and span. When the combining function has O(1) work, using a balanced reduction tree improves the span from O(n) to O(log n) but does not change the work. But with merge_< as the combining function, we see that the balanced reduction tree also improves the work from O(n^2) to O(n log n).

In general, how would we go about defining the cost of reduce with higher-order functions? Given a reduction tree, we'll first define R(reduce f S) as

  R(reduce f S) = the set of all function applications f(a, b) that occur in the reduction tree.

Following this definition, we can state the cost of reduce as follows:

  $$W(\textit{reduce}\ f\ S) = O\Big(n + \sum_{f(a,b) \in R(\textit{reduce}\ f\ S)} W(f(a, b))\Big)$$
  $$S(\textit{reduce}\ f\ S) = O\Big(\log n \cdot \max_{f(a,b) \in R(\textit{reduce}\ f\ S)} S(f(a, b))\Big)$$
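As a quick sanity check (a worked instance added here, not part of the original notes): for a combining function with constant work and span, the reduction tree contains at most n - 1 applications, so these bounds recover the familiar cost specification quoted at the start of this section:

  $$W(\textit{reduce}\ f\ S) = O\Big(n + \sum_{R} O(1)\Big) = O(n), \qquad S(\textit{reduce}\ f\ S) = O(\log n \cdot O(1)) = O(\log n).$$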

The work bound is simply the total work performed, which we obtain by summing across all combine operations. The span bound is more interesting. The log n term expresses the fact that the tree is at most O(log n) deep. Since each node in the tree has span at most max_{f(a,b)} S(f(a, b)), any root-to-leaf path, including the critical path, has span at most O(log n · max_{f(a,b)} S(f(a, b))).

This can be used, for example, to prove the following lemma:

Lemma 2.1. For any combining function f : α × α → α and a monotone size measure s : α → ℝ⁺, if for any x, y,

  1. s(f(x, y)) ≤ s(x) + s(y), and
  2. W(f(x, y)) ≤ c_f (s(x) + s(y))

for some universal constant c_f depending on the function f, then

  $$W(\textit{reduce}\ f\ S) = O\Big(\log|S| \sum_{x \in S} (1 + s(x))\Big).$$

Applying this lemma to the merge sort example, we have

  $$W(\textit{reduce}\ \textit{merge}_<\ \langle\rangle\ \langle \langle a \rangle : a \in A \rangle) = O(|A| \log |A|).$$
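To spell out why the lemma applies here (a short check added for completeness, taking the size measure s to be sequence length): merging never changes the total number of elements, and its work is linear in that total, so both conditions hold; moreover each singleton ⟨a⟩ has size 1, so the sum in the lemma is 2|A|:

  $$s(\textit{merge}_<(x, y)) = |x| + |y| = s(x) + s(y), \qquad W(\textit{merge}_<(x, y)) \le c\,(s(x) + s(y)),$$
  $$\text{hence}\quad W(\textit{reduce}\ \textit{merge}_<\ \langle\rangle\ \langle \langle a \rangle : a \in A \rangle) = O\Big(\log|A| \sum_{a \in A} (1 + 1)\Big) = O(|A| \log |A|).$$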