205/09/30 UNIT 5B Binary Search Course Announcements Written exam next week (Wed. Oct 7 ) Practice exam available on the Resources page Exam reviews: Sunday afternoon; watch Piazza for times and places Problem Set 5 will be due as usual on the Friday after the exam. Programming Assignment 5 will be due Monday after exam. 2
This Lecture A new search technique for lists called binary search Application of recursion to binary search Logarithmic worst-case complexity 3 but first, return to Monday s slides on FIBONACCI NUMBERS 4 2
Fibonacci Numbers A sequence of numbers: 0 + + + 2 + 3 + 5 + 8 3... 7 Fibonacci Numbers in Nature 0,,, 2, 3, 5, 8, 3, 2, 34, 55, 89, 44, 233, etc. Number of branches on a tree, petals on a flower, spirals on a pineapple. Vi Hart's video on Fibonacci numbers (http:// www.youtube.com/watch?v=ahximuksxx0) 8 3
Recursive Definition Let fib(n) = the nth Fibonacci number, n 0 fib(0) = 0 (base case) fib() = (base case) fib(n) = fib(n-) + fib(n-2), n > 0,,, 2, 3, 5, 8, 3, 2, 34, 55, 89, 44, 233, etc def fib(n): if n == 0 or n == : return n" Two recursive calls! " else: " " return fib(n-) + fib(n-2)" 20 Recursive Call Tree fib(5) fib(4) fib(3) fib(3) fib(2) fib(2) fib() fib(2) fib() fib() fib(0) fib() fib(0) fib() fib(0) fib(0) = 0 fib() = fib(n) = fib(n-) + fib(n-2), n > 22 4
Recursive Call Tree 4 fib(5) 3 fib(4) 2 fib(3) 2 fib(3) fib(2) fib(2) fib() fib(2) fib() 0 fib() fib(0) fib() 0 fib(0) 0 fib() fib(0) fib(0) = 0 fib() = fib(n) = fib(n-) + fib(n-2), n > 22 the idea BINARY SEARCH 0 5
Number guessing I'm thinking of a number between and 6 You get to ask me, yes or no, is it greater than some number you choose Which questions will you ask to get the answer quickest? How many questions do you need to ask? Binary Search in an Ordered List? 2 6
Binary Search in an Ordered List 00 pixels 87 pixels? 3 Binary Search in an Ordered List 9 pixels 87 pixels? 4 7
Binary Search in an Ordered List 83 pixels? 87 pixels 5 Binary Search in an Ordered List 87 pixels 87 pixels! 6 8
from idea to ALGORITHM 7 Specification: the Search Problem Input: A list of n unique elements and a key to search for The elements are sorted in increasing order. Result: The index of an element matching the key, or None if the key is not found. 4 9
Recursive Algorithm BinarySearch(list, key):. If the list is empty, return None. 2. Compare the key to the middle element of the list. 3. Return the index of the middle element if they match. 4. If the key is less than the middle element then return BinarySearch(first half of list, key) Otherwise, return BinarySearch(second half of list, key). 8 Example : Search for 73 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 Found: return 9 0
Example 2: Search for 42 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 Not found: return None Controlling the range of the search Maintain three numbers: lower, upper, mid Initially lower is -, upper is length of the list lower = - upper = 2 0
Controlling the range of the search mid is the midpoint of the range: mid = (lower + upper) // 2 (integer division) Example: lower = -, upper = 9 (range has 9 elements) mid = 4 What happens if the range has an even number of elements? Example: lower = -, upper = 8 mid = 3 Example D? 0 2 3 4 5 6 7 8 A B C D E F G H I lower = - upper = 9 mid = 4 List sorted in ascending order. Suppose we are searching for D. 5 2
D? Example 0 2 3 4 5 6 7 8 A B C D E F G H I lower = - upper = 4 mid = Each @me we look at a smaller por@on of the list within the window and ignore all the elements outside of the window 6 D? Example 0 2 3 4 5 6 7 8 A B C D E F G H I lower = upper = 4 mid = 2 Each @me we look at a smaller por@on of the array within the window and ignore all the elements outside of the window 7 3
Example D! 0 2 3 4 5 6 7 8 A B C D E F G H I lower = 2 upper = 4 mid = 3 Each @me we look at a smaller por@on of the array within the window and ignore all the elements outside of the window 7 towards a Python program: DESIGNING THE RECURSION 28 4
Base case: range empty How do we determine if the range is empty? lower + == upper What should we return then? None 2 Base case: key found The key is compared to the element at mid: list[mid] == key What should we return then? mid 2 5
Recursive Case Non-empty range: what subproblem(s) should we solve? search left half or search right half What should we return then? result of searching left or right half New value for lower? value for upper? left half: lower, mid right half: mid, upper 2 Parameters for recursion Inputs: key and list of items But we also need lower and upper bounds since they change throughout the search, they have to be parameters of the search function Design: main function and recursive helper function 32 6
Recursive Binary Search in Python first value for upper " first value for lower # main function" def bsearch(items, key):" return bs_helper(items, key, -, len(items))" " # recursive helper function" def bs_helper(items, key, lower, upper):" if lower + == upper: # Base case: empty" return None" mid = (lower + upper) // 2" if key == items[mid]: # Base case: found " return mid" new value for upper # Recursive cases " same value for lower if key < items[mid]: # Go left" return bs_helper(items, key, lower, mid)" else: # Go right" return bs_helper(items, key, mid, upper)" new value for lower same value for upper 3 Caveat: specification The algorithm and the code was developed on the assumption that the input list is sorted. If the function is called with an unsorted list it has no obligation to behave correctly. 5 7
reflections MEASUREMENT AND ANALYSIS 35 Trace: Search for 73 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 key lower upper bs_helper(items, 73, -, 5)" " " " " " mid = 7 and 73 > items[7]" bs_helper(items, 73, 7, 5)" " " " " " mid = and 73 < items[]" bs_helper(items, 73, 7, )" " " " " " mid = 9 and 73 == items[9]" " " " " " --->"return 9" 8
Trace: Search for 42 0 2 3 4 5 6 7 8 9 0 2 3 4 2 25 32 37 4 48 58 60 66 73 74 79 83 9 95 key lower upper bs_helper(items, 42, -, 5)" " " " " " mid = 7 and 42 < items[7]" bs_helper(items, 42, -, 7)" " " " " " mid = 3 and 42 > items[3]" bs_helper(items, 42, 3, 7)" " " " " " mid = 5 and 42 < items[5]" bs_helper(items, 42, 3, 5)" " " " " " mid = 4 and 42 > items[4]" bs_helper(items, 73, 4, 5)" " " " " " lower + == upper " " " " " " ---> Return None" Instrumenting Binary Search Code count = 0 # count of comparisons" " def bsearch(list, key):" global count" count = 0" print("searching list of length ", len(list))" result = bs_helper(list, key, -, len(list))" print("number of comparisons:", count)" return result" " def bs_helper(list, key, lower, upper):" global count" if lower + == upper:" print("not found")" return None" mid = (lower + upper) // 2" print("mid:", mid, "lower:", lower, "upper", upper)" count = count + " if key == list[mid]: " return mid" if key < list[mid]:" return bs_helper(list, key, lower, mid)" else:" return bs_helper(list, key, mid, upper)" 8 9
Instrumented Output >>> bsearch(list(range(,500,2)), 2) Searching list of length 250 mid: 24 lower: - upper 250 mid: 6 lower: - upper 24 mid: 30 lower: - upper 6 mid: 4 lower: - upper 30 mid: 6 lower: - upper 4 mid: 0 lower: 6 upper 4 Number of comparisons: 6 0 >>> bsearch(list(range(,500,2)), 256) Searching list of length 250 mid: 24 lower: - upper 250 mid: 87 lower: 24 upper 250 mid: 55 lower: 24 upper 87 mid: 39 lower: 24 upper 55 mid: 3 lower: 24 upper 39 mid: 27 lower: 24 upper 3 mid: 29 lower: 27 upper 3 mid: 28 lower: 27 upper 29 Not found Number of comparisons: 8 >>> bsearch(list(range(,500000,2)), 256) Searching list of length 250000 mid: 24999 lower: - upper 250000 mid: 27 lower: 26 upper 28 Not found Number of comparisons: 8 >>> bsearch(list(range(,5000000,2)), 256) Searching list of length 2500000 mid: 249999 lower: - upper 2500000 mid: 28 lower: 27 upper 29 Not found Number of comparisons: 2 When is the worst case? 39 Analyzing Binary Search Suppose we search for a key larger than anything in the list. Example sequences of range sizes: 8, 4, 2, (4 key comparisons) 6, 8, 4, 2, (5 key comparisons) 7, 8, 4, 2, (5 key comparisons) 8, 9, 4, 2, (5 key comparisons)... 3, 5, 7, 3, (still 5 key comparisons) 32, 6, 8, 4, 2, (at last, 6 key comparisons) Notice: 8 = 2 3, 6 = 2 4, 32 = 2 5, Therefore: log 2 8 = 3, log 2 6 = 4, log 2 32 = 5 40 20
Generalizing the Analysis floor Some notation: x means round x down, so 2.5 =2 Binary search of n elements will do at most + log 2 n comparisons + log 2 8 = + log 2 9 =... + log 2 5 = 4 + log 2 6 = + log 2 7 =... + log 2 3 = 5 Why? We can split search region in half + log 2 n times before it becomes empty. "Big O" notation: we ignore the + and the floor function. We say Binary Search has complexity O(log n). 22 O(log n) ( logarithmic time ) 2(log 2 n) + 5 Number of Operations log 2 n log 0 n log a (x) = log a (b) log b (x) constant n (amount of data) 24 2
Number of Operations O(log n) For a log 2 n algorithm, If you double the number of data elements the amount of work you do increases by just one unit Add one 6 5 4 log 2 n 6 32 64 Double n (amount of data) 25 Binary Search (Worst Case) Number of elements Number of Comparisons 5 4 3 5 63 6 27 7 255 8 5 9 023 0 million 20 billion 30 26 22
Binary Search Pays Off Finding an element in an list with a million elements requires only 20 comparisons! BUT... The list must be sorted. What if we sort the list first using insertion sort? Insertion sort O(n 2 ) (worst case) Binary search O(log n) (worst case) Total time: O(n 2 ) + O(log n) = O(n 2 ) Luckily there are faster ways to sort in the worst case... 27 Next Time Merge Sort: a faster sorting algorithm image: https://www.bamsoftware.com/hacks/sorts/ 46 23