Data Structures and Algorithms. Roberto Sebastiani

Similar documents
Data Structures and Algorithms Week 4

Data Structures and Algorithms Chapter 4

Next. 1. Covered basics of a simple design technique (Divideand-conquer) 2. Next, more sorting algorithms.

1. Covered basics of a simple design technique (Divideand-conquer) 2. Next, more sorting algorithms.

CSE 3101: Introduction to the Design and Analysis of Algorithms. Office hours: Wed 4-6 pm (CSEB 3043), or by appointment.

Lecture 5: Sorting Part A

Comparisons. Θ(n 2 ) Θ(n) Sorting Revisited. So far we talked about two algorithms to sort an array of numbers. What is the advantage of merge sort?

Comparisons. Heaps. Heaps. Heaps. Sorting Revisited. Heaps. So far we talked about two algorithms to sort an array of numbers

Heapsort. Algorithms.

Introduction to Algorithms 3 rd edition

The Heap Data Structure

403: Algorithms and Data Structures. Heaps. Fall 2016 UAlbany Computer Science. Some slides borrowed by David Luebke

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

Chapter 6 Heap and Its Application

Lecture 3. Recurrences / Heapsort

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

Sorting. Popular algorithms: Many algorithms for sorting in parallel also exist.

CPSC 311 Lecture Notes. Sorting and Order Statistics (Chapters 6-9)

Algorithms, Spring 2014, CSE, OSU Lecture 2: Sorting

Sorting Shabsi Walfish NYU - Fundamental Algorithms Summer 2006

Heaps and Priority Queues

Algorithms Lab 3. (a) What is the minimum number of elements in the heap, as a function of h?

EECS 2011M: Fundamentals of Data Structures

A data structure and associated algorithms, NOT GARBAGE COLLECTION

Deliverables. Quick Sort. Randomized Quick Sort. Median Order statistics. Heap Sort. External Merge Sort

Basic Data Structures and Heaps

Lecture: Analysis of Algorithms (CS )

Randomized Algorithms, Quicksort and Randomized Selection

CS 303 Design and Analysis of Algorithms

7. Sorting I. 7.1 Simple Sorting. Problem. Algorithm: IsSorted(A) 1 i j n. Simple Sorting

Algorithms and Data Structures

Properties of a heap (represented by an array A)

Topic: Heaps and priority queues

Searching, Sorting. part 1

We can use a max-heap to sort data.

CSE5311 Design and Analysis of Algorithms. Administrivia Introduction Review of Basics IMPORTANT

Questions from the material presented in this lecture

Heapsort. Heap data structure

8. Sorting II. 8.1 Heapsort. Heapsort. [Max-]Heap 6. Heapsort, Quicksort, Mergesort. Binary tree with the following properties Wurzel

Lower Bound on Comparison-based Sorting

Heaps, Heapsort, Priority Queues

HEAP. Michael Tsai 2017/4/25

CS 5321: Advanced Algorithms Sorting. Acknowledgement. Ali Ebnenasir Department of Computer Science Michigan Technological University

Overview of Presentation. Heapsort. Heap Properties. What is Heap? Building a Heap. Two Basic Procedure on Heap

Data Structures and Algorithms. Roberto Sebastiani

Data Structures and Algorithms. Part 2

COMP Data Structures

Data Structures and Algorithms. Roberto Sebastiani

Data Structures and Algorithms Chapter 2

How much space does this routine use in the worst case for a given n? public static void use_space(int n) { int b; int [] A;

Total Points: 60. Duration: 1hr

Sorting. Bubble Sort. Pseudo Code for Bubble Sorting: Sorting is ordering a list of elements.

COSC Advanced Data Structures and Algorithm Analysis Lab 3: Heapsort

Design and Analysis of Algorithms

Sorting Algorithms. For special input, O(n) sorting is possible. Between O(n 2 ) and O(nlogn) E.g., input integer between O(n) and O(n)

O(n): printing a list of n items to the screen, looking at each item once.

Analysis of Algorithms - Quicksort -

Appendix A and B are also worth reading.

CS 310 Advanced Data Structures and Algorithms

quiz heapsort intuition overview Is an algorithm with a worst-case time complexity in O(n) data structures and algorithms lecture 3

1 Tree Sort LECTURE 4. OHSU/OGI (Winter 2009) ANALYSIS AND DESIGN OF ALGORITHMS

BM267 - Introduction to Data Structures

Design and Analysis of Algorithms

Tirgul 4. Order statistics. Minimum & Maximum. Order Statistics. Heaps. minimum/maximum Selection. Overview Heapify Build-Heap

CS 3343 (Spring 2018) Assignment 4 (105 points + 15 extra) Due: March 9 before class starts

CSC263 Week 2. Larry Zhang

Partha Sarathi Manal

Question 7.11 Show how heapsort processes the input:

Problem. Input: An array A = (A[1],..., A[n]) with length n. Output: a permutation A of A, that is sorted: A [i] A [j] for all. 1 i j n.

CS S-11 Sorting in Θ(nlgn) 1. Base Case: A list of length 1 or length 0 is already sorted. Recursive Case:

4. Sorting and Order-Statistics

CSCE 750, Fall 2002 Notes 2 Page Bubble Sort // sort the array `a[*]' of length `count' // perform only the first `howmany' sort steps // keep t

IS 709/809: Computational Methods in IS Research. Algorithm Analysis (Sorting)

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

CS 241 Analysis of Algorithms

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 6. Sorting Algorithms

arxiv: v3 [cs.ds] 18 Apr 2011

CSE373: Data Structure & Algorithms Lecture 21: More Comparison Sorting. Aaron Bauer Winter 2014

Sorting and Selection

QuickSort and Others. Data Structures and Algorithms Andrei Bulatov

The divide-and-conquer paradigm involves three steps at each level of the recursion: Divide the problem into a number of subproblems.

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

Data structures. Organize your data to support various queries using little time and/or space

Heapsort. Why study Heapsort?

Introduction to Computer Science and Programming for Astronomers

Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, Merge Sort & Quick Sort

Lecture 8: Mergesort / Quicksort Steven Skiena

Plotting run-time graphically. Plotting run-time graphically. CS241 Algorithmics - week 1 review. Prefix Averages - Algorithm #1

Analysis of Algorithms - Medians and order statistic -

Introduction to Algorithms

Computational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs

Outline. Computer Science 331. Heap Shape. Binary Heaps. Heap Sort. Insertion Deletion. Mike Jacobson. HeapSort Priority Queues.

Data Structures. Sorting. Haim Kaplan & Uri Zwick December 2013

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

CSE 241 Class 17. Jeremy Buhler. October 28, Ordered collections supported both, plus total ordering operations (pred and succ)

CS 234. Module 8. November 15, CS 234 Module 8 ADT Priority Queue 1 / 22

COSC242 Lecture 7 Mergesort and Quicksort

Deterministic and Randomized Quicksort. Andreas Klappenecker

Advanced Algorithms and Data Structures

HOMEWORK 4 CS5381 Analysis of Algorithm

Transcription:

Data Structures and Algorithms Roberto Sebastiani roberto.sebastiani@disi.unitn.it http://www.disi.unitn.it/~rseba - Week 0 - B.S. In Applied Computer Science Free University of Bozen/Bolzano academic year 00-0

Acknowledgements & Copyright Notice These slides are built on top of slides developed by Michael Boehlen. Moreover, some material (text, figures, examples) displayed in these slides is courtesy of Kurt Ranalter. Some examples displayed in these slides are taken from [Cormen, Leiserson, Rivest and Stein, ``Introduction to Algorithms'', MIT Press], and their copyright is detained by the authors. All the other material is copyrighted by Roberto Sebastiani. Every commercial use of this material is strictly forbidden by the copyright laws without the authorization of the authors. No copy of these slides can be displayed in public or be publicly distributed without containing this copyright notice.

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average

Why Sorting When in doubt, sort one of the principles of algorithm design. Sorting is used as a subroutine in many algorithms: Searching in databases: we can do binary search on sorted data Element uniqueness, duplicate elimination A large number of computer graphics and computational geometry problems... 5

Why Sorting/ Sorting algorithms represent different algorithm design techniques. The lower bound for (worst case of) sorting (n log n) is used to prove lower bounds of other problems. 6

Sorting Algorithms so far Insertion sort, selection sort, bubble sort Worst case running time (n ) In place Merge sort Worst case running time (n log n) Requires additional memory (n) 7

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average 8

Selection Sort SelectionSort(A[..n]): for for i := := to to n- n- A: A: Find the the smallest element among A[i..n] B: B: Exchange it it with A[i] A takes (n) and B takes (), (n ) in total Idea for improvement: smart data structure to do A and B in () spend O(log n) time per iteration to maintain the data structure get a total running time of O(n log n) 9

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average 0

Binary Trees Each node may have a left and right child. The left child of 7 is The right child of 7 is 8 has no left child 6 has no children Each node has at most one parent. is the parent of The root has no parent. 9 is the root A leaf has no children. 6, and 8 are leafs 6 9 7 8

Binary Trees/ The depth (or level) of a node x is the length of the path from the root to x. The depth of is The depth of 9 is 0 The height of a node x is the length of the longest path from x to a leaf. The height of 7 is The height of a tree is the height of its root. The height of the tree is 6 9 7 8

Binary Trees/ The right subtree of a node x is the tree rooted at the right child of x. The right subtree of 9 is the tree shown in blue. The left subtree of a node x is the tree rooted at the left child of x. 6 9 7 8 The left subtree of 9 is the tree shown in red.

Complete Binary Trees A complete binary tree is a binary tree where all leaves have the same depth. all internal (non leaf) nodes have two children. A nearly complete binary tree is a binary tree where the depth of two leaves differs by at most. all leaves with the maximal depth are as far left as possible.

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average 5

Heaps A binary tree is a binary heap iff it is a nearly complete binary tree each node is greater than or equal to all its children The properties of a binary heap allow an efficient storage as an array (because it is a nearly complete binary tree) a fast sorting (because of the organization of the values) 6

Heaps/ Heap property A[Parent(i)] >= A[i] Parent(i) return i/ Left(i) return i 5 8 7 6 0 9 Right(i) 5 6 7 8 9 0 return i+ 6 5 0 8 7 9 8 9 0 Level: 0 5 6 7 7

Heaps/ Notice the implicit tree links in the array: children of node i are i and i+ The heap data structure can be used to implement a fast sorting algorithm. The basic elements are Heapify: reconstructs a heap after an element was modified BuildHeap: constructs a heap from an array HeapSort: the sorting algorithm 8

Heaps/: heaps with arrays In java/c/c++ array indexes start from 0: Parent(i) return i / Left(i) return i+ in java/c/c++ 5 8 7 6 0 9 Right(i) 0 5 6 7 8 9 return i+ 6 5 0 8 7 9 7 8 9 Level: 0 0 5 6 9

Heapify Input: index i in array A, number n of elements Binary trees rooted at Left(i) and Right(i) are heaps. A[i] might be smaller than its children, thus violating the heap property. Heapify makes A a heap by moving A[i] down the heap until the heap property is satisfied again. 0

Heapify Example 7 8 8 9 0 6 8 0 5 6 7 8 9 0 9 6 8 6 7 9 0.Call Heapify(A,) 5 7 6 9 7.Exchange A[] with A[] and recursively call Heapify(A,).Exchange A[] with A[9] and recursively call Heapify(A,9).Node 9 has no children, so we are done 8 9 0 0 5 6 7

Heapify Algorithm Heapify(A, i, i, n) n) l := := *i; // // l := := Left(i) r := := *i+; // // r := := Right(i) if if l <= <= n and A[l] > A[i] then max := := l else max := := i if if r <= <= n and A[r] > A[max] max := := r if if max!=!= i swap A[i] and A[max] Heapify(A, max, n) n)

Heapify: Running Time The running time of Heapify on a subtree of size n rooted at i includes the time to determine relationship between elements: Θ() run Heapify on a subtree rooted at one of the children of i n/ is the worst case size of this subtree (half filled last row of the tree, s.t left is twice as big as right) T ( n) T ( n /) + Θ() T ( n) = O(log n) Alternatively Running time on a node of height h: O(h) = O(log n) 6

Building a Heap Convert an array A[...n] into a heap. Notice that the elements in the subarray A[( n/ + )...n] are element heaps (i.e., leaves) to begin with. BuildHeap(A) for i i := := n/ n/ to do do Heapify(A, i, i, n) n)

Building a Heap/ 6 9 0 8 9 0 8 7 6 9 0 8 7 5 6 7 6 9 0 8 9 0 8 7 6 9 0 8 7 5 6 7 Heapify(A, 7, 0) Heapify(A, 6, 0) Heapify(A, 5, 0) 5

Building a Heap/ 6 9 0 8 9 0 8 7 6 9 0 8 7 5 6 7 6 9 0 8 9 0 8 7 6 9 0 8 7 5 6 7 Heapify(A,, 0) 6

Building a Heap/ 6 9 0 8 9 0 8 7 6 9 0 8 7 5 6 7 0 6 9 8 9 0 8 7 0 6 9 8 7 5 6 7 Heapify(A,, 0) 7

Building a Heap/5 6 9 0 8 9 0 8 7 0 6 9 8 7 5 6 7 6 0 7 9 8 9 0 8 6 0 7 9 8 5 6 7 Heapify(A,, 0) 8

Building a Heap/6 6 0 7 9 8 9 0 8 6 0 7 9 8 5 6 7 6 0 8 7 9 8 9 0 6 0 8 7 9 5 6 7 Heapify(A,, 0) 9

Building a Heap: Analysis Correctness: induction on i, all trees rooted at m > i are heaps. Running time: n/ calls to Heapify = n/*o(log n) = O(n log n) Non tight bound but good enough for an overall O(n log n) bound for Heapsort. Intuition for a tight bound: most of the time Heapify works on less than n element heaps 0

Building a Heap: Analysis/ Tight bound: T ( An n element heap has height log n. The heap has n/ h+ nodes of height h. Cost for one call of Heapify is O(h). n) = Math: log n h= 0 log n n O( h) = O( n h+ k kx = k = 0 ( x) h= 0 T ( n) O( n ) = O( n h= 0 x h log n h / = ) = h ( / ) h ) k / x x k = k(/ x) k = k = 0 x k = 0 ( / ) O( n)

HeapSort The total running time of heap sort is O(n) + n * O(log n) = O(n log n) HeapSort(A) BuildHeap(A) O(n) for i i := := n to to do do n times exchange A[] and A[i] O() n := := n O() Heapify(A,,, n) n) O(log n) n)

Heap Sort 6 8 7 9 0 8 7 9 6 0 0 8 7 6 9 9 8 7 8 7 7 9 8 9 0 6 0 6 0 6 7 8 9 7 8 9 7 8 9 0 6 0 6 0 6 7 8 9 7 8 9 0 6 0 6

Heap Sort: Summary Heap sort uses a heap data structure to improve selection sort and make the running time asymptotically optimal. Running time is O(n log n) like merge sort, but unlike selection, insertion, or bubble sorts. Sorts in place like insertion, selection or bubble sorts, but unlike merge sort. The heap data structure is used for other things than sorting.

Suggested exercises Implement Heapify Implement HeapSort Prove the correctness of Heapify Prove the correctness of HeapSort 5

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average 6

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average 7

Quick Sort Characteristics Like insertion sort, but unlike merge sort, sorts in place, i.e., does not require an additional array. Very practical, average sort performance O(n log n) (with small constant factors), but worst case O(n ). 8

Quick Sort the Principle To understand quick sort, let s look at a high level description of the algorithm. A divide and conquer algorithm Divide: partition array into subarrays such that elements in the lower part <= elements in the higher part. Conquer: recursively sort the subarrays Combine: trivial since sorting is done in place 9

Partitioning i i 7 6 9 8 5 j 0 j 6 Partition(A,l,r) 0 x := A[r] 0 i := l- 0 j := r+ X=0 0 while TRUE 05 repeat j := j- 06 until A[j] x 07 repeat i := i+ 08 until A[i] x 09 if i<j 0 then switch A[i] A[j] else return i 0 0 0 i 5 5 6 6 6 9 i 9 j 8 i 8 j 8 9 j 5 7 7 7 0 5 6 8 9 7 0

Quick Sort Algorithm 9 Initial call Quicksort(A,, n) Quicksort(A, l, r) 0 if l < r 0 m := Partition(A, l, r) 0 Quicksort(A, l, m ) 0 Quicksort(A, m, r)

Analysis of Quicksort Assume that all input elements are distinct. The running time depends on the distribution of splits.

Best Case If we are lucky, Partition splits the array log n evenly T ( n) = T ( n / ) + Θ( n) n/8 n n/ n/ n/ n/ n/8 n/8 n/8 n/8 n/ n/ n/8 n/8 n/8 n n n n n (n log n)

Worst Case What is the worst case? One side of the parition has one element. T(n) = T(n ) + T() + Θ(n) = T(n ) + 0 + Θ(n) = = n k = Θ( Θ( k) n k = k) = Θ(n )

Worst Case/ n n n n n n n n n (n ) 5

Worst Case/ When does the worst case appear? input is sorted input reverse sorted Same recurrence for the worst case of insertion sort (reverse order, all elements have to be moved). Sorted input yields the best case for insertion sort. 6

Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average 7

Analysis of Quicksort Suppose the split is /0 : 9/0 0/9 log n 0 log n n (/0)n (9/0)n (/00)n (9/00)n (9/00)n (8/0)n n n n (8/000)n (79/000)n n n n (n log n) 8

An Average Case Scenario Suppose, we alternate lucky and unlucky cases to get an average behavior n (n) L(n) = U(n/) + (n) lucky U(n) = L(n ) + (n) unlucky we consequently get L(n) = (L(n/ ) + (n)) + (n) = L(n/ ) + (n) = (n log n) n n (n) (n )/ (n )/ n/ n/ 9

An Average Case Scenario/ How can we make sure that we are usually lucky? Partition around the middle (n/th) element? Partition around a random element (works well in practice) Randomized algorithm running time is independent of the input ordering. no specific input triggers worst case behavior. the worst case is only determined by the output of the random number generator. 50

Randomized Quicksort 7 5 Assume all elements are distinct. Partition around a random element. Consequently, all splits (:n, :n,..., n :) are equally likely with probability /n. Randomization is a general tool to improve algorithms with bad worst case but good average case complexity. 5

Randomized Quicksort/ RandomizedPartition(A,l,r) 0 i := Random(l,r) 0 exchange A[r] and A[i] 0 return Partition(A,l,r) RandomizedQuicksort(A,l,r) 0 if l < r then 0 m := RandomizedPartition(A,l,r) 0 RandomizedQuicksort(A,l,m) 0 RandomizedQuicksort(A,m+,r) 5

Summary Nearly complete binary trees Heap data structure Heapsort based on heaps worst case is n log n Quicksort: partition based sort algorithm popular algorithm very fast on average worst case performance is quadratic 5

Summary/ Comparison of sorting methods. ordered random inverse Absolute values are not important; relate values to each other. Insertion Selection 0. 58.8 50.7 58. 0.8 7.6 Relate values to the complexity (n log n, n ). Bubble Heap 80.8. 8.8. 78.66. Running time in seconds, n=08. Quick 0.7. 0.76 5

Suggested exercises Implement Partition Prove the correctness of Partition Implement QuickSort Implement Randomized QuickSort Prove the correctness of QuickSort 55

Next Week Dynamic data structures Pointers Lists, trees Abstract data types (ADTs) Definition of ADTs Common ADTs 56