Searching Algorithms/Time Analysis

CSE21 Fall 2017, Day 8, Oct 16, 2017. https://sites.google.com/a/eng.ucsd.edu/cse21-fall-2017-miles-jones/

(MinSort) loop invariant induction
Loop invariant: After the $t$-th iteration of the loop, the first $t$ elements are:
1. The smallest $t$ elements of the original list.
2. In sorted order.
Inductive hypothesis: Suppose that for some $t \ge 0$, the loop invariant is true after $t$ iterations.
Show that it is true after $t + 1$ iterations: During the $(t+1)$-st iteration, the algorithm sets $a_m$ to be the minimum value of $a_{t+1}, \ldots, a_n$ and swaps $a_m$ and $a_{t+1}$. So, after the $(t+1)$-st iteration, $a_{t+1}$ is the minimum value of $a_{t+1}, \ldots, a_n$.
By the inductive hypothesis, after $t$ iterations, $a_1, \ldots, a_t$ are the smallest $t$ elements of the original list, in sorted order.
So $a_1, \ldots, a_t$ are all at most $a_{t+1}$, and $a_{t+1}$ is at most each of $a_{t+2}, \ldots, a_n$:
1. $a_1, \ldots, a_{t+1}$ are the smallest $t + 1$ elements of the original list.
Since $a_1, \ldots, a_t$ are in sorted order and are all at most $a_{t+1}$:
2. $a_1, \ldots, a_{t+1}$ are in sorted order.
Therefore: for any $0 \le t \le n$, the loop invariant is true after $t$ iterations.

(MinSort) loop invariant conclusion
Therefore: for any $0 \le t \le n$, the loop invariant is true after $t$ iterations.
In particular: the loop invariant is true after $n$ iterations, and so after the algorithm terminates, the first $n$ elements are:
1. The smallest $n$ elements.
2. In sorted order.
This is exactly the algorithm specification!
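The invariant can also be checked mechanically. The following is a minimal Python sketch, not the course's own code: the function name `min_sort` is made up for illustration, indices are 0-based, and Python's built-in `sorted` is used only as an oracle for "the smallest t elements in sorted order".

```python
def min_sort(values):
    """Selection sort (MinSort) that asserts the loop invariant after every pass."""
    a = list(values)          # work on a copy; indices are 0-based here
    n = len(a)
    oracle = sorted(a)        # assumption: used only to state the invariant
    for t in range(n):        # t passes have completed before this one
        m = t
        for j in range(t + 1, n):
            if a[j] < a[m]:   # track the position of the minimum of a[t:]
                m = j
        a[t], a[m] = a[m], a[t]
        # Invariant after t + 1 passes: the first t + 1 entries are the
        # smallest t + 1 elements of the original list, in sorted order.
        assert a[:t + 1] == oracle[:t + 1]
    return a
```

If the invariant ever failed, the `assert` would raise; running the function on a few lists is a sanity check, not a substitute for the induction proof.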

Proving Loop Invariants
Induction variable (k): the number of times through the loop.
Base case: Need to show the statement holds for k = 0, before the loop.
Inductive step: Let k be a positive integer.
Induction hypothesis: Suppose the statement holds after k-1 times through the loop.
Need to show that the statement holds after k times through the loop.
Need help? See website for extra practice problems on invariants and induction.

Why sort? A TA facing a stack of exams needs to input all 400 scores into a spreadsheet where the students are listed in alphabetical order. OR You want to find all the duplicate values in a long list.

Sorting helps with searching Two searching algorithms: One that works for any data, sorted or not One that is much faster, but relies on the data being sorted

The search problem Given an array A[1], A[2], A[3],..., A[n] and a target value x, find an index j where x = A[j] or determine that no such index exists because x is not in the array; Return j=0 if no index exists

Example Suppose our array is A[1]=2, A[2]=5, A[3]=0, A[4]=1. If x = 1, what j would the search problem find? A) j=2 B) j=4 C) j=0 D) no such j exists

Searching an unsorted array If you needed to find a DVD on this shelf, where the DVDs are in no particular order, how would you proceed?

Linear Search
Search through the array one index at a time, asking at each position, "Is this my target value?" If you answer yes, return the position where you found it.

Searching: Specification (WHAT)
Rosen page 194
Given a list $a_1, a_2, \ldots, a_n$ and a target value $x$, find an index $j$ where $x = a_j$ or determine that there is no such index. Values can be any type. For simplicity, use integers.

HOW (reverse engineering)
Pseudocode:
procedure mystery(x: integer, a_1, a_2, ..., a_n: distinct integers)
  i := 1
  while (i <= n and x ≠ a_i)
    i := i + 1
  if i <= n then location := i
  else location := 0
  return location
{ location is the subscript of the term that equals x, or is 0 if x is not found }
In groups, describe in English (high-level) what the algorithm is doing.

Correctness of Linear Search: WHY What's the loop invariant? Try to write it down yourself, first.

Proof of correctness Why is this algorithm correct? Easy case: if the algorithm returns some i in the range 1 through n, it must be because x= A[i] Remaining case:?

Invariant for main loop Invariant: If after the loop, i=k, then????

Invariant for main loop
Invariant: If after the loop, i=k, then x is not among A[1], ..., A[k-1].
Base case: k=1; x is not among the empty set.
Induction step: Assume i=k+1. Then in the previous iteration, i was k. Thus, x is not among the first k-1 elements of the array. In the current iteration, we compared x to A[k]. Since we incremented i, x was also not A[k].

If Linear Search returns 0
Linear search only returns 0 if i = n+1. Then the invariant says x is not in the first n positions of the array. But this means x is not in the array, so 0 is the correct answer.

How fast is linear search? Suppose you have an unsorted array and you are searching for a target value x that you know is in the array somewhere. What position for x will cause linear search to take the longest amount of time? A) x is the first element of the array B) x is the last element of the array C) x is somewhere in the middle of the array

How fast is linear search? Say our array A has n elements and we are searching for x. The time it takes to find x (or determine it is not present) depends on the number of probes, that is the number of entries we have to retrieve and compare to x.

How many probes? Worst-case scenario: Best-case scenario: On average:

How many probes?
Worst-case scenario: The target is the last element in the array or not present at all. n probes.
Best-case scenario: The target is the first element in the array. 1 probe.
On average: We'd expect to have to search about half of the array until we find the target. About n/2 probes.
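The probe counts above can be observed directly. Below is a small Python sketch of linear search that also reports how many entries it compared to x; the name `linear_search` and the `(location, probes)` return pair are choices made for this illustration, and locations are 1-based with 0 meaning "not found", as in the slides.

```python
def linear_search(a, x):
    """Return (location, probes): 1-based location of x in a, or 0 if absent."""
    probes = 0
    for i, value in enumerate(a, start=1):
        probes += 1               # one probe: retrieve a[i] and compare it to x
        if value == x:
            return i, probes
    return 0, probes              # ran off the end: x is not in the list


a = [2, 5, 0, 1]
print(linear_search(a, 2))   # best case: target is first
print(linear_search(a, 1))   # worst case: target is last
print(linear_search(a, 7))   # worst case: target is absent
```

Averaging the probe count over every element of a list with n distinct entries gives (n+1)/2, which matches the "about n/2" estimate.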

Searching a sorted array How would you search through a pile of alphabetized papers to find the one with your name on it?

Using order in search
Suppose our list is sorted in increasing order. If we probe the list at position m, what do we learn in each of these three cases: $x = a_m$, $x < a_m$, $x > a_m$?
A. If x is in the list, it occurs BEFORE position m
B. If x is in the list, it occurs AFTER position m
C. x occurs at position m
D. x is not in the list

How does having a sorted array help us?
Suppose we probe A at position m. What do we learn?
If x = A[m], we are done.
If x < A[m], then if x is in the array, it is in position p for p < m. m-1 positions remain to be checked.
If x > A[m], then if x is in the array, it is in position p for p > m. n-m positions remain to be checked.

How do we choose where to probe?
If our array has 100 elements and we probe at position 5, it is most likely that the target lies after position 5, so we'll have to search 95 entries in the worst case. How do we make the worst case as favorable as possible?
Probe in the middle. If we are searching the subarray $A[i], \ldots, A[j]$, probe at index $\lfloor (i + j)/2 \rfloor$.

Binary Search: The Approach Probe in the middle of the array. Based on what you find there, determine which half to search in next. Continue until the target is found or you can be sure the target is not in the array. (When can you be sure the target is not in the array?) Return the location of the target, or 0 if it's not in the array

Binary Search: HOW
Starting at the middle of the list and based on what you find, determine which half to search next. Continue until target is found or we are sure that it isn't there.
procedure binary search(x: integer, a_1, a_2, ..., a_n: increasing integers)
  i := 1
  j := n
  location := 0
  while i < j
    m := floor((i+j)/2)
    if x = a_m then location := m
    if x > a_m then i := m+1
    if x < a_m then j := m-1
  return location
{ location is the subscript of the term that equals x, or is 0 if x is not found }
Choices for the loop condition: A. i<n B. j<n C. i<j D. j<i E. None of the above

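Binary Search: WHY
procedure binary search(x: integer, a_1, a_2, ..., a_n: increasing integers)
  i := 1
  j := n
  location := 0
  while i < j
    m := floor((i+j)/2)
    if x = a_m then location := m
    if x > a_m then i := m+1
    if x < a_m then j := m-1
  return location
Note: with the condition i < j, the loop never probes a range of size one (for example, n = 1), and it does not stop after setting location := m, because neither i nor j changes in that case. The Rosen version later in these notes (else j := m, followed by a final test of a_i) has neither problem.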

Binary Search is Telling the Truth? (correctness)
We must show two things:
1) If binary search returns some nonzero position p, then $a_p = x$.
2) If binary search returns 0, then there is no position p for which $a_p = x$.

1) If it returns nonzero p, then $a_p = x$.
In the binary search pseudocode above, location starts at 0 and is changed only by the line "if x = a_m then location := m". So a nonzero return value p was assigned at a moment when $x = a_p$.

What is the contrapositive of (2)?
2) If binary search returns 0, then there is no position p for which $a_p = x$.
Contrapositive: If there is a position p for which $a_p = x$, then binary search doesn't return 0.

Loop Invariant
Want to show: If there is a position p for which $a_p = x$, then binary search doesn't return 0.
Loop invariant: Suppose there is a position p for which $a_p = x$. Then $i \le p \le j$.

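Invariant: Proof
Refer to the binary search pseudocode above, and suppose $a_p = x$.
Base case: before the loop, i = 1 and j = n, so $i \le p \le j$.
Induction step: assume $i \le p \le j$ at the start of an iteration that probes position m.
If $x > a_m$, then $a_p > a_m$; the list is increasing, so $p > m$, and $p \ge m+1$, the new value of i.
If $x < a_m$, then $a_p < a_m$, so $p < m$, and $p \le m-1$, the new value of j.
If $x = a_m$, then i and j do not change.
In every case, $i \le p \le j$ still holds after the iteration.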
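The invariant can be exercised in code. The sketch below is an illustration under stated assumptions rather than the slides' procedure: it loops while `i <= j` and returns as soon as a match is found, so that it terminates and also handles one-element ranges. The name `binary_search` and the bookkeeping variable `p` (found with `list.index`, purely to check the invariant) are hypothetical.

```python
def binary_search(a, x):
    """a: increasing list. Return the 1-based location of x, or 0 if absent."""
    # p is the true position of x, if any; used only to assert the invariant.
    p = a.index(x) + 1 if x in a else None
    i, j = 1, len(a)
    while i <= j:                 # assumption: <= instead of the slide's <
        if p is not None:
            assert i <= p <= j    # loop invariant: x can only be in a_i..a_j
        m = (i + j) // 2          # probe in the middle
        if x == a[m - 1]:
            return m              # assumption: return at once on a match
        if x > a[m - 1]:
            i = m + 1             # x can only be after position m
        else:
            j = m - 1             # x can only be before position m
    return 0
```

When the loop ends without a match, i > j, which is incompatible with the invariant; that is the contrapositive argument in executable form.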

Binary Search Terminates
With each probe, you reduce the size of the subarray you are searching to at most half of its previous size.
Number of probes    Max size of subarray
1                   n/2
2                   n/4
3                   n/8
4                   n/16
...                 ...
w                   n/2^w

Binary Search Terminates
How many probes w do we need to make sure our search has terminated? That is, what value of w is sure to make the subarray size less than 1?
$\frac{n}{2^w} < 1$
$n < 2^w$
$\log_2 n < w$
The smallest integer w that is sure to be greater than $\log_2 n$ is $w = \lfloor \log_2 n \rfloor + 1$.
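As a sanity check on this bound, the sketch below counts probes for a three-way binary search and compares the worst case with $\lfloor \log_2 n \rfloor + 1$. It assumes the terminating variant (loop while `i <= j`, stop on a match); the helper names are invented for the example, and `n.bit_length()` is used as an exact integer way to compute $\lfloor \log_2 n \rfloor + 1$ for n >= 1.

```python
def count_probes(a, x):
    """Number of entries of the increasing list a that are compared to x."""
    i, j, probes = 1, len(a), 0
    while i <= j:                     # assumption: terminating variant
        m = (i + j) // 2
        probes += 1
        if x == a[m - 1]:
            break
        if x > a[m - 1]:
            i = m + 1
        else:
            j = m - 1
    return probes


def worst_case_probes(n):
    """Maximum probe count over present and absent targets, list of size n."""
    a = list(range(0, 2 * n, 2))      # even numbers 0, 2, ..., 2n-2
    targets = range(-1, 2 * n)        # odd targets are absent from a
    return max(count_probes(a, x) for x in targets)


for n in (1, 2, 7, 8, 100):
    print(n, worst_case_probes(n), n.bit_length())
```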

Comparison with Linear Search
Therefore, we have shown that it takes at most $\lfloor \log_2 n \rfloor + 1$ probes for binary search to terminate. This is the worst case. By comparison, linear search takes as many as n probes. Even in the average case, linear search takes about n/2 probes.

The cost of binary search While binary search is faster than linear search, it depends upon your array being sorted. What if you have an unsorted array you need to search? Should you sort it first so you can use the better search?

The cost of binary search
There's a tradeoff between the cost of sorting an array and the benefit of being able to search faster. If you need to search the same array many times, it becomes more worthwhile to invest the initial time in sorting your array.
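To make the tradeoff concrete, here is a rough Python sketch under loudly stated assumptions: the array is sorted with MinSort at a cost of n(n-1)/2 comparisons, every search is charged its worst-case probe count (n for linear search, $\lfloor \log_2 n \rfloor + 1$ for binary search), and comparisons and probes are treated as equally expensive. The function names are invented for this example, and a faster sorting algorithm would move the break-even point considerably.

```python
def cost_linear(n, k):
    """Worst-case probes for k linear searches of an unsorted array of size n."""
    return k * n


def cost_sort_then_binary(n, k):
    """MinSort comparisons plus worst-case probes for k binary searches."""
    sort_comparisons = n * (n - 1) // 2
    probes_per_search = n.bit_length()      # floor(log2 n) + 1 for n >= 1
    return sort_comparisons + k * probes_per_search


def break_even_searches(n):
    """Smallest k for which sorting first is strictly cheaper in this model."""
    if n <= n.bit_length():
        raise ValueError("for n < 3, sorting first never wins in this model")
    k = 1
    while cost_sort_then_binary(n, k) >= cost_linear(n, k):
        k += 1
    return k


print(break_even_searches(1000))
```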

General questions to ask about algorithms
1) What problem are we solving? SPECIFICATION
2) How do we solve the problem? ALGORITHM DESCRIPTION
3) Why do these steps solve the problem? CORRECTNESS
4) When do we get an answer? RUNNING TIME PERFORMANCE

Counting comparisons: WHEN
Measure time by the number of operations: comparisons of list elements!
For selection sort (MinSort), how many times do we have to compare the values of some pair of list elements?

Selection Sort (MinSort) Pseudocode
Rosen page 203, exercises 41-42
procedure selection sort(a_1, a_2, ..., a_n: real numbers with n >= 2)
  for i := 1 to n-1
    m := i
    for j := i+1 to n
      if (a_j < a_m) then m := j
    interchange a_i and a_m
{ a_1, ..., a_n is in increasing order }
How many times do we compare pairs of elements?
A. $n$ B. $\log n$ C. $n!$ D. $\binom{n}{2}$ E. $n^2$

Selection Sort (MinSort) Pseudocode
Rosen page 203, exercises 41-42
In the body of the outer loop of the selection sort pseudocode above, when i=1, how many times do we compare pairs of elements?
A. 1 B. n C. n-1 D. n+1 E. None of the above.

Selection Sort (MinSort) Pseudocode
i = 1: n-1 comparisons
i = 2: n-2 comparisons
...
i = n-1: 1 comparison
Total number of comparisons: 1+2+...+(n-1)

Sum of consecutive integers
$X = 1 + 2 + 3 + \cdots + (n-3) + (n-2) + (n-1)$
$X = (n-1) + (n-2) + (n-3) + \cdots + 3 + 2 + 1$
$2X = n + n + n + \cdots + n + n + n$ (there are $n-1$ terms)
$2X = n(n-1)$
$X = n(n-1)/2$
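The closed form can be confirmed by instrumenting the algorithm. This Python sketch (0-based indices; the function name is chosen for the example) runs MinSort and counts every comparison between two list elements.

```python
def count_min_sort_comparisons(values):
    """Run MinSort on a copy of values and return the number of comparisons."""
    a = list(values)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):            # outer loop: i = 1 .. n-1 in the slides
        m = i
        for j in range(i + 1, n):     # inner loop: j = i+1 .. n
            comparisons += 1          # one comparison of a_j with a_m
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]
    return comparisons


for n in (2, 5, 10):
    print(n, count_min_sort_comparisons(range(n, 0, -1)), n * (n - 1) // 2)
```

The count depends only on n, not on the order of the input, because both loops always run to completion.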

Counting operations When do we get an answer? RUNNING TIME PERFORMANCE Counting number of times list elements are compared

Runtime performance
Algorithm: problem solving strategy as a sequence of steps
Examples of steps
- Comparing list elements (which is larger?)
- Accessing a position in a list (probe for value)
- Arithmetic operation (+, -, *, /)
- etc.
"Single step" depends on context

Runtime performance
How long does a single step take? Some factors
- Hardware
- Software
Let's discuss and list factors that could impact how long a single step takes

Runtime performance
How long does a single step take? Some factors
- Hardware (CPU, climate (!), cache)
- Software (programming language, compiler)

Runtime performance
The time our program takes will depend on
- Input size
- Number of steps the algorithm requires
- Time for each of these steps on our system

Runtime performance
Ignore what we can't control
Goal: Estimate time as a function of the size of the input, n
Rate of growth: focus on how time scales for large inputs

Selection Sort (MinSort) Performance (Again)
Rosen page 210, example 5
Number of comparisons of list elements: $(n-1) + (n-2) + \cdots + 1 = n(n-1)/2$ (sum of positive integers up to $n-1$)
Rewrite this expression using big-O notation:
A. $O(n)$ B. $O(n(n-1))$ C. $O(n^2)$ D. $O(1/2)$ E. None of the above
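One way to justify the big-O bound is to exhibit explicit witnesses. The worked inequality below assumes the usual definition ($f$ is $O(g)$ if there are constants $C$ and $k$ with $|f(n)| \le C\,|g(n)|$ for all $n > k$); the particular constants are one convenient choice, not the only one.

```latex
\[
\frac{n(n-1)}{2} \;=\; \frac{n^2}{2} - \frac{n}{2}
\;\le\; \frac{n^2}{2} \;\le\; n^2
\qquad \text{for all } n \ge 1 .
\]
% Taking C = 1 and k = 0 in the definition shows that n(n-1)/2 is O(n^2).
```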

Exercise: Why is $1 + 2 + 3 + 4 + \cdots + n = \Theta(n^2)$?

Linear Search: HOW
Starting at the beginning of the list, compare items one by one with x until we find it or reach the end.
procedure linear search(x: integer, a_1, a_2, ..., a_n: distinct integers)
  i := 1
  while (i <= n and x ≠ a_i)
    i := i + 1
  if i <= n then location := i
  else location := 0
  return location
{ location is the subscript of the term that equals x, or is 0 if x is not found }

How fast is Linear Search: WHEN
Rosen page 220, part of example 2
The time it takes to find x (or determine it is not present) depends on the number of probes, that is, the number of list entries we have to retrieve and compare to x.
How many probes do we make when doing Linear Search on a list of size n
- if x happens to equal the first element in the list?
- if x happens to equal the last element in the list?
- if x happens to equal an element somewhere in the middle of the list?
- if x doesn't equal any element in the list?

How fast is Linear Search: WHEN
Rosen p. 220
Best case: 1 probe (target appears first)
Worst case: n probes (target appears last or not at all)
Average case: n/2 probes (target appears in the middle; expect to have to search about half of the array... more on expected value later)
Running time depends on more than size of the input!

Binary Search: WHEN
procedure binary search(x: integer, a_1, a_2, ..., a_n: increasing integers)
  i := 1
  j := n
  while i < j
    m := floor((i+j)/2)
    if x > a_m then i := m+1
    else j := m
  if x = a_i then location := i
  else location := 0
  return location
{ location is the subscript of the term that equals x, or is 0 if x is not found }
Each time, divide the size of the "list in play" by 2.

Rosen page 220, example 3
How fast is Binary Search: WHEN
Number of comparisons (probes) depends on number of iterations of loop, hence size of set.
After ... iterations of loop    (Max) size of list "in play"
0                               n
1                               n/2
2                               n/4
3                               n/8
...                             ...
ceil(log_2 n)                   1
Rewrite this formula in order notation: A. B. C. D. E. None of the above
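The Rosen-style procedure and its iteration bound can be checked together. The sketch below is a Python rendering under a few assumptions: the function name and the `(location, iterations)` return pair are invented for this example, an empty list is handled by an added guard that is not in the pseudocode, and `(n - 1).bit_length()` is used as an exact integer way to compute $\lceil \log_2 n \rceil$ for n >= 1.

```python
def rosen_binary_search(a, x):
    """a: increasing list. Return (location, iterations); location is 1-based."""
    if not a:
        return 0, 0                   # assumption: guard for the empty list
    i, j, iterations = 1, len(a), 0
    while i < j:
        m = (i + j) // 2
        iterations += 1
        if x > a[m - 1]:
            i = m + 1                 # x can only be in a_{m+1} .. a_j
        else:
            j = m                     # x can only be in a_i .. a_m
    location = i if a[i - 1] == x else 0   # single equality test at the end
    return location, iterations


a = [1, 3, 5, 7, 9]
print(rosen_binary_search(a, 7))
print(rosen_binary_search(a, 4))
```

Because the equality test happens only after the range has shrunk to one element, the loop runs roughly $\log_2 n$ times even when the target sits at the first midpoint.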

Comparing linear search and binary search
Rosen pages 220-221
                            Linear Search    Binary Search
Assumptions                 None             Sorted list
# probes in best case       1                1 or about log_2 n
# probes in worst case      n                ceil(log_2 n)
# probes in average case    about n/2        ?
Best case analysis depends on whether we check if midpoint agrees with target right away or wait until list size gets to 1 (as in Rosen).
Is it worth it to sort our list first?