ECE 122: Engineering Problem Solving Using Java
Lecture 27: Linear and Binary Search
Overview
- Problem: how can I efficiently locate data within a data structure?
- Searching for data is a fundamental function of computers.
- Many different search algorithms exist; the simplest ones are discussed today.
- We evaluate search algorithms based on their complexity.
- The lowest-complexity algorithm may not be the best for all data sets.
Observations
- We can search either ordered or unordered lists.
- The unordered case is straightforward.
- Ordered search requires an ordered list. How can data be sorted? Discussed next lecture.
- Search applications:
  - search within documents or over a collection
  - database applications
- Sorting applications:
  - anywhere data is organized
  - presenting the results of a search (e.g. Google)
Search Algorithm
- A step-by-step description of how to solve the search problem.
- Some search algorithms:
  - sequential (linear) search
  - binary search
- Linear search is a simple search algorithm: compare the target key to the key of each record, one by one, starting from the first record.
- Easy to implement for an array.
Linear Search
- How can we locate a specific element in an array?
- Brute-force approach: start at the beginning and examine each element in turn.
- It takes linear time in both the worst and average cases.
- Sometimes called sequential search.
- [Figure: an array holding 2, 3, 5, 8, 13, 21 in positions 1 to 6; target 8]
Pseudo-code of Linear Search for an Array
- An array db stores integers.
- Search for the first occurrence of the value 8 in this array.
- If found, return its index; otherwise, return -1.
- [Figure: db holds 2, 3, 5, 8, 13, 21; target 8]

    int LinearSearch( int key, int[] db ) {
        int index;
        for( index = 0; index < db.length; index++ ) {
            // Return the position of the first match.
            if( db[index] == key ) return index;
        }
        // No element matched the key.
        return -1;
    }
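The same loop applies when the array holds records that are compared by key. The following is a minimal sketch under that assumption; the `Student` type, its fields, and the sample data are hypothetical and only serve the illustration.

```java
// Hypothetical record type for illustration; not part of the lecture.
class Student {
    final int id;        // the key we search on
    final String name;

    Student(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class RecordSearch {
    // Returns the index of the first record whose id equals key, or -1 if none does.
    static int linearSearch(int key, Student[] db) {
        for (int index = 0; index < db.length; index++) {
            if (db[index].id == key) {
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Student[] db = {
            new Student(42, "Ada"),
            new Student(17, "Grace"),
            new Student(8, "Alan")
        };
        System.out.println(linearSearch(8, db));   // prints 2
        System.out.println(linearSearch(99, db));  // prints -1
    }
}
```

Only the comparison changes (`db[index].id == key` instead of `db[index] == key`); the scan order and the -1 convention stay the same.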
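Linear Search in a Sorted Array
- If the array is sorted, the search can stop as soon as it passes the position where the target would belong.
- Worst case: no faster than the old version.
- Average case: still O(n), but reduced by a factor of 2.
- For the analysis, let event i be the event that the target, if it were present, would belong right before element i.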
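Assuming this slide refers to a linear search over a sorted array that gives up once it passes the spot where the target would belong, the idea might be sketched as follows. The class and method names are illustrative, and ascending order is an assumption.

```java
class SortedLinearSearch {
    // Assumes db is sorted in ascending order.
    // Returns the index of key, or -1 if key is absent.
    static int search(int key, int[] db) {
        for (int index = 0; index < db.length; index++) {
            if (db[index] == key) {
                return index;
            }
            if (db[index] > key) {
                // We have passed the position where key would belong, so stop early.
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] db = {2, 3, 5, 8, 13, 21};
        System.out.println(search(8, db));  // prints 3
        System.out.println(search(4, db));  // prints -1 after examining 2, 3, 5
    }
}
```

The early exit helps only when the target is missing or small; a target larger than every element still forces a full scan, which is why the worst case does not improve.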
Linear Search for a Linked List
- Linear search also works for a linked list.
- Example: a linked list db in which each node stores a string; search for "B" in this list.
- If found, return that node; otherwise, return null.
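A minimal sketch of the linked-list case, assuming a hand-written singly linked node type. `StringNode` and `find` are illustrative names, not taken from the lecture.

```java
// Hypothetical singly linked node holding a string.
class StringNode {
    final String value;
    final StringNode next;

    StringNode(String value, StringNode next) {
        this.value = value;
        this.next = next;
    }
}

class LinkedListSearch {
    // Walks the list from head and returns the first node whose value equals key,
    // or null if no node matches.
    static StringNode find(String key, StringNode head) {
        for (StringNode current = head; current != null; current = current.next) {
            if (current.value.equals(key)) {
                return current;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        StringNode db = new StringNode("A", new StringNode("B", new StringNode("C", null)));
        System.out.println(find("B", db).value);  // prints B
        System.out.println(find("Z", db));        // prints null
    }
}
```

Strings are compared with `equals` rather than `==`, since `==` on objects compares references, not contents.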
Complexity of Linear Search
- Suppose there are n records.
- Best case: the target is the first entry, so the search takes only 1 step.
- Worst case: the target is the last entry or is not present, so sequential search scans all n records and takes n steps in total.
- Cost: O(n)
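The best and worst cases can be checked by counting how many elements the loop examines. This is a small sketch with an illustrative helper, `countSteps`, which is not part of the lecture code.

```java
class LinearSearchSteps {
    // Returns how many elements are examined while searching db for key.
    static int countSteps(int key, int[] db) {
        int steps = 0;
        for (int value : db) {
            steps++;
            if (value == key) {
                break;  // found: stop counting
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] db = {2, 3, 5, 8, 13, 21};          // n = 6
        System.out.println(countSteps(2, db));    // prints 1 (best case: first entry)
        System.out.println(countSteps(21, db));   // prints 6 (worst case: last entry)
        System.out.println(countSteps(7, db));    // prints 6 (worst case: not present)
    }
}
```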
Binary Search
- If the records are sorted by key, we can do better.
- Think about how you look up a word in an English dictionary, where the words are sorted alphabetically.
- If the target is present in the list, it must lie between positions low and high.
- Take a look at the middle one: middle = (low+high) / 2
- [Figure: the letters A to Z in an array, with low=0, middle=12 (M), high=25]
Binary Search: Case 1
- Assume the target is M.
- If the target equals the key in the middle position, we have found the record.
- Return the value of middle.
- [Figure: A to Z with low=0, middle at M, high=25]
Binary Search: Case 2
- Assume the target is E.
- If target < the key in the middle position, the target cannot be in the right half, so we only need to search the left half.
- high = middle - 1;
- [Figure: A to Z with low=0 and high moved to 11 (L)]
Binary Search: Case 3
- Assume the target is P.
- If target > the key in the middle position, the target cannot be in the left half, so we only need to search the right half.
- low = middle + 1;
- [Figure: A to Z with low moved to 13 (N) and high=25]
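The three cases can be followed on the A to Z example by logging low, middle, and high on every pass. This is a minimal sketch, assuming the array holds the 26 uppercase letters in order; `BinarySearchTrace` and `trace` are illustrative names.

```java
class BinarySearchTrace {
    // Runs binary search for target and returns a log with one line per pass.
    static String trace(char target, char[] array) {
        StringBuilder log = new StringBuilder();
        int low = 0, high = array.length - 1;
        while (low <= high) {
            int middle = (low + high) / 2;
            log.append("low=").append(low)
               .append(" middle=").append(middle)
               .append(" high=").append(high).append('\n');
            if (array[middle] == target) {
                // Case 1: the middle key is the target.
                return log.append("found at ").append(middle).toString();
            } else if (array[middle] < target) {
                // Case 3: the target can only be in the right half.
                low = middle + 1;
            } else {
                // Case 2: the target can only be in the left half.
                high = middle - 1;
            }
        }
        return log.append("not found").toString();
    }

    public static void main(String[] args) {
        char[] alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();
        System.out.println(trace('M', alphabet));  // one pass, found at 12
        System.out.println(trace('P', alphabet));  // three passes, found at 15
    }
}
```

For target P the passes are (low=0, middle=12, high=25), then (13, 19, 25), then (13, 15, 18), where the search ends at index 15.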
Binary Search
- Start the search in the middle of a sorted array.
- A constant amount of work allows us to divide the data in half.
Pseudo-code of Binary Search
- Keep splitting the array in half until the result is found.
- The array must initially be ordered.

    int BinarySearch(char target, char[] array) {
        int low = 0, high = array.length - 1, middle;
        while (low <= high) {
            middle = (low + high) / 2;
            if( array[middle] == target )
                return middle;          // found the target
            else if( array[middle] < target )
                low = middle + 1;       // continue in the right half
            else
                high = middle - 1;      // continue in the left half
        }
        return -1;                      // target is not in the array
    }
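A runnable variant of this pseudo-code, offered as a sketch rather than the lecture's own implementation. Two things here are additions: the midpoint is written as `low + (high - low) / 2`, which gives the same value but cannot overflow `int` on very large arrays, and the result is compared against the standard library's `java.util.Arrays.binarySearch`. The class name and sample data are illustrative.

```java
import java.util.Arrays;

class BinarySearchDemo {
    // Returns the index of target in the sorted array, or -1 if it is absent.
    static int binarySearch(char target, char[] array) {
        int low = 0, high = array.length - 1;
        while (low <= high) {
            int middle = low + (high - low) / 2;  // overflow-safe midpoint
            if (array[middle] == target) {
                return middle;
            } else if (array[middle] < target) {
                low = middle + 1;
            } else {
                high = middle - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] letters = {'X', 'B', 'M', 'A'};
        Arrays.sort(letters);                                    // now A, B, M, X
        System.out.println(binarySearch('M', letters));          // prints 2
        System.out.println(Arrays.binarySearch(letters, 'M'));   // prints 2
        System.out.println(binarySearch('Q', letters));          // prints -1
    }
}
```

Note that `Arrays.binarySearch` reports a missing key with a negative value that encodes the insertion point, not necessarily -1.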
Complexity of Binary Search
- Suppose there are n records.
- Best case: the target is the middle entry, so the search takes only 1 step.
- Worst case: it takes about log_2 n + 1 steps.
- Cost: O(log n)
Logarithmic versus Linear
- How big is the practical difference?
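One way to get a feel for the gap is to print the worst-case step counts side by side. The sketch below assumes the worst case of binary search is floor(log_2 n) + 1 passes, computed here with integer bit operations; the class name and the chosen values of n are illustrative.

```java
class SearchCostTable {
    // Worst-case passes of binary search over n >= 1 items: floor(log2 n) + 1,
    // which equals the number of bits needed to write n in binary.
    static int binaryWorstCase(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }

    // Worst-case steps of linear search over n items.
    static int linearWorstCase(int n) {
        return n;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 1000, 1000000, 1000000000};
        for (int n : sizes) {
            System.out.println(n + " items: linear up to " + linearWorstCase(n)
                    + " steps, binary up to " + binaryWorstCase(n) + " steps");
        }
    }
}
```

For a million items this gives at most 20 passes for binary search against up to 1,000,000 steps for linear search; for a billion items, 30 passes.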
Binary Search
- The running time of binary search is proportional to the number of times the loop runs, that is, the number of times we have to divide the data in half before we run out of data.
- In the worst case, we always have to look in the larger piece.
- The number of passes through the loop is p + 1, where 2^p = n.
- If n = 8, we have one pass where there are 8 candidate elements, one where there are 4, one where there are 2, and one where there is 1. This is four passes. Notice that 2^3 = 8.
- If n were 2^4 = 16, we would need 5 passes.
- Running time: O(log n)
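The pass counts for n = 8 and n = 16 can be reproduced by simulating the loop for a target larger than every element, which always sends the search into the larger (right) piece. A minimal sketch with an illustrative helper name:

```java
class BinarySearchPasses {
    // Counts loop passes of binary search over n items when the target
    // is larger than every element, so the search always moves right.
    static int worstCasePasses(int n) {
        int low = 0, high = n - 1, passes = 0;
        while (low <= high) {
            int middle = (low + high) / 2;
            passes++;
            low = middle + 1;  // target > array[middle]: discard the left half
        }
        return passes;
    }

    public static void main(String[] args) {
        System.out.println(worstCasePasses(8));   // prints 4 (p = 3, p + 1 = 4)
        System.out.println(worstCasePasses(16));  // prints 5 (p = 4, p + 1 = 5)
    }
}
```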
Comparing Search Algorithms
- On average, a linear search examines n/2 elements before finding the target, so a linear search is O(n).
- The worst case for a binary search is about log_2 n comparisons.
- A binary search is a logarithmic algorithm: it has a time complexity of O(log_2 n).
- But keep in mind that the search pool must be sorted.
- For large n, a binary search is much faster.
Summary
- Linear search is easy to implement.
  - Basically, just a for loop
  - Has O(n) complexity for n stored items
  - More efficient than binary search for small n
- Binary search is harder to implement but more efficient.
  - Requires data to be sorted
  - Has O(log n) complexity for n stored items
  - Frequently used for reasonably sized data sets, e.g. more than 10-20 stored items