DATA STRUCTURES/UNIT 3
- Stuart Porter
UNIT III  SORTING AND SEARCHING

General Background - Exchange Sorts - Selection and Tree Sorting - Insertion Sorts - Merge and Radix Sorts - Basic Search Techniques - Tree Searching - General Search Trees - Hashing

Introduction

Sorting and searching are fundamental operations in computer science. Sorting refers to the operation of arranging data in some given order. Searching refers to the operation of finding a particular record in the existing information. Information retrieval normally involves searching, sorting and merging. In this chapter we discuss searching and sorting techniques in detail.

After going through this unit you will be able to:

- Know the fundamentals of sorting techniques
- Know the different searching techniques
- Discuss the algorithms of internal sorting and external sorting
- Explain the difference between internal sorting and external sorting
- State the complexity of each sorting technique
- Discuss the algorithms of various searching techniques
- Discuss merge sort
- Discuss the algorithms of sequential search, binary search and binary tree search
- Analyze the performance of searching methods

SEARCHING

Searching refers to the operation of finding the location of a given item in a collection of items. The search is said to be successful if ITEM appears in DATA and unsuccessful otherwise. The following searching algorithms are discussed in this chapter:

1. Sequential search
2. Binary search

CCET/MCA Page 1
3. Binary tree search

Sequential Search

This is the most natural searching method. The most intuitive way to search for a given ITEM in DATA is to compare ITEM with each element of DATA one by one. The algorithm for a sequential search procedure is now presented.

Algorithm: SEQUENTIAL SEARCH
INPUT  : List of size N, target value T
OUTPUT : Position of T in the list, or -1 if T is not present

BEGIN
    Set FOUND := false
    Set I := 0
    While (I < N) and (FOUND is false)
        If List[I] = T then
            Set FOUND := true
        Else
            Set I := I + 1
    If FOUND is false then
        T is not present in the list (return -1)
    Else
        Return I
END

Binary Search

Suppose DATA is an array which is sorted in increasing numerical order. Then there is an extremely efficient searching algorithm, called binary search, which can be used to find the location LOC of a given ITEM of information in DATA.

The binary search algorithm applied to our array DATA works as follows. During each stage of the algorithm, our search for ITEM is reduced to a segment of elements of DATA:

    DATA[BEG], DATA[BEG + 1], DATA[BEG + 2], ..., DATA[END]
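As an illustrative sketch (not part of the original notes), the sequential search above might be written in Python; the function and variable names are our own, and indexing is 0-based:

```python
def sequential_search(data, target):
    """Return the position of target in data, or -1 if it is not present."""
    i = 0
    while i < len(data):          # While (I < N) and not FOUND
        if data[i] == target:
            return i              # successful search
        i += 1
    return -1                     # unsuccessful search

# Searching an unsorted list: every element may have to be examined.
print(sequential_search([44, 11, 30, 77], 30))  # prints 2
print(sequential_search([44, 11, 30, 77], 99))  # prints -1
```

In the worst case (the target is absent or last) all N elements are compared, so the running time is O(n).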
Note that the variables BEG and END denote the beginning and end locations of the segment respectively. The algorithm compares ITEM with the middle element DATA[MID] of the segment, where MID is obtained by

    MID = INT((BEG + END) / 2)

(We use INT(A) for the integer value of A.) If DATA[MID] = ITEM, then the search is successful and we set LOC := MID. Otherwise a new segment of DATA is obtained as follows:

(a) If ITEM < DATA[MID], then ITEM can appear only in the left half of the segment:

        DATA[BEG], DATA[BEG + 1], ..., DATA[MID - 1]

    So we reset END := MID - 1 and begin searching again.

(b) If ITEM > DATA[MID], then ITEM can appear only in the right half of the segment:

        DATA[MID + 1], DATA[MID + 2], ..., DATA[END]

    So we reset BEG := MID + 1 and begin searching again.

Initially, we begin with the entire array DATA; i.e. we begin with BEG = 1 and END = n. If ITEM is not in DATA, then eventually we obtain

    END < BEG

This condition signals that the search is unsuccessful, and in this case we assign LOC := NULL. Here NULL is a value that lies outside the set of indices of DATA. We now formally state the binary search algorithm.

Algorithm 2.9: (Binary Search) BINARY(DATA, LB, UB, ITEM, LOC)

Here DATA is a sorted array with lower bound LB and upper bound UB, and ITEM is a given item of information. The variables BEG, END and MID denote, respectively, the beginning, end and middle locations of a segment of elements of DATA. This algorithm finds the location LOC of ITEM in DATA or sets LOC := NULL.

1. [Initialize segment variables.]
   Set BEG := LB, END := UB and MID := INT((BEG + END) / 2).
2. Repeat Steps 3 and 4 while BEG <= END and DATA[MID] != ITEM.
3. If ITEM < DATA[MID], then:
        Set END := MID - 1.
   Else:
        Set BEG := MID + 1.
   [End of If structure.]
4. Set MID := INT((BEG + END) / 2).
   [End of Step 2 loop.]
5. If DATA[MID] = ITEM, then:
        Set LOC := MID.
   Else:
        Set LOC := NULL.
   [End of If structure.]
6. Exit.

Example 2.9

Let DATA be the following sorted 13-element array:

    DATA: 11, 22, 30, 33, 40, 44, 55, 60, 66, 77, 80, 88, 99

We apply the binary search to DATA for different values of ITEM.

(a) Suppose ITEM = 40. The search for ITEM in the array DATA is pictured below, where the values of DATA[BEG] and DATA[END] at each stage of the algorithm are indicated by parentheses and the value of DATA[MID] by square brackets. Specifically, BEG, END and MID will have the following successive values:

(1) Initially, BEG = 1 and END = 13. Hence, MID = INT[(1 + 13) / 2] = 7 and so DATA[MID] = 55.
(2) Since 40 < 55, END = MID - 1 = 6. Hence, MID = INT[(1 + 6) / 2] = 3 and so DATA[MID] = 30.
(3) Since 40 > 30, BEG = MID + 1 = 4. Hence, MID = INT[(4 + 6) / 2] = 5 and so DATA[MID] = 40.

The search is successful and LOC = MID = 5.

(1) (11), 22, 30, 33, 40, 44, [55], 60, 66, 77, 80, 88, (99)
(2) (11), 22, [30], 33, 40, (44), 55, 60, 66, 77, 80, 88, 99
(3) 11, 22, 30, (33), [40], (44), 55, 60, 66, 77, 80, 88, 99   [Successful]
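A runnable Python sketch of Algorithm 2.9, assuming 0-based indexing (so the text's LOC = 5 for ITEM = 40 corresponds to index 4 here) and using None in place of the text's NULL:

```python
def binary_search(data, item):
    """Return the index of item in the sorted list data, or None (NULL)."""
    beg, end = 0, len(data) - 1
    while beg <= end:
        mid = (beg + end) // 2       # MID = INT((BEG + END) / 2)
        if data[mid] == item:
            return mid               # successful search: LOC := MID
        elif item < data[mid]:
            end = mid - 1            # item can only be in the left half
        else:
            beg = mid + 1            # item can only be in the right half
    return None                      # END < BEG: unsuccessful search

data = [11, 22, 30, 33, 40, 44, 55, 60, 66, 77, 80, 88, 99]
print(binary_search(data, 40))       # prints 4 (0-based index)
print(binary_search(data, 41))       # prints None
```

Each pass halves the segment, which is where the floor(log2 n) + 1 comparison bound discussed next comes from.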
Complexity of the Binary Search Algorithm

The complexity is measured by the number of comparisons f(n) to locate ITEM in DATA, where DATA contains n elements. Observe that each comparison reduces the sample size by half. Hence we require at most f(n) comparisons to locate ITEM, where

    f(n) = floor(log2 n) + 1

That is, the running time for the worst case is approximately log2 n. The running time for the average case is approximately equal to the running time for the worst case.

Limitations of the Binary Search Algorithm

The algorithm requires two conditions: (1) the list must be sorted, and (2) one must have direct access to the middle element of any sublist.

Binary Search Tree

Suppose T is a binary tree. Then T is called a binary search tree if each node N of T has the following property: the value at N is greater than every value in the left subtree of N and is less than every value in the right subtree of N.

[Figure: Binary search tree T]

SEARCHING AND INSERTING IN BINARY SEARCH TREES

Suppose an ITEM of information is given. The following algorithm finds the location of ITEM in the binary search tree T, or inserts ITEM as a new node in its appropriate place in the tree.

(a) Compare ITEM with the root node N of the tree.
    (i) If ITEM < N, proceed to the left child of N.
    (ii) If ITEM > N, proceed to the right child of N.

(b) Repeat Step (a) until one of the following occurs:

    (i) We meet a node N such that ITEM = N. In this case the search is successful.
    (ii) We meet an empty subtree, which indicates that the search is unsuccessful, and we insert ITEM in place of the empty subtree.

In other words, proceed from the root R down through the tree T until finding ITEM in T or inserting ITEM as a terminal node in T.

Example 2.11

Consider the binary search tree T in Fig. Suppose ITEM = 20 is given.

1. Compare ITEM = 20 with the root, 38, of the tree T. Since 20 < 38, proceed to the left child of 38, which is 14.
2. Compare ITEM = 20 with 14. Since 20 > 14, proceed to the right child of 14, which is 23.
3. Compare ITEM = 20 with 23. Since 20 < 23, proceed to the left child of 23, which is 18.
4. Compare ITEM = 20 with 18. Since 20 > 18 and 18 does not have a right child, insert 20 as the right child of 18.

[Figure: ITEM = 20 inserted]

DELETING IN A BINARY SEARCH TREE

Suppose T is a binary search tree, and suppose an ITEM of information is given. This section gives an algorithm which deletes ITEM from the tree T.
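The search-and-insert procedure above can be sketched in Python. The values 38, 14, 23 and 18 come from Example 2.11; the remaining node values and all names are our own illustrative additions:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, item):
    """Search for item; if absent, insert it as a terminal node. Returns root."""
    if root is None:
        return Node(item)            # empty subtree: insert ITEM here
    if item < root.value:
        root.left = insert(root.left, item)
    elif item > root.value:
        root.right = insert(root.right, item)
    return root                      # item already present: tree unchanged

def search(root, item):
    """Return the node containing item, or None if the search fails."""
    while root is not None and root.value != item:
        root = root.left if item < root.value else root.right
    return root

# Build a tree rooted at 38 (extra values 56, 8, 45, 82 are illustrative).
root = None
for v in [38, 14, 56, 8, 23, 18, 45, 82]:
    root = insert(root, v)
root = insert(root, 20)              # ends up as the right child of 18
print(search(root, 20) is not None)  # prints True
```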
Let N denote the node of T that contains ITEM.

Case 1. N has no children. Then N is deleted from T by simply replacing the location of N in the parent node P(N) by the null pointer.

Case 2. N has exactly one child. Then N is deleted from T by simply replacing the location of N in P(N) by the location of the only child of N.

Case 3. N has two children. Let S(N) denote the inorder successor of N. (The reader can verify that S(N) does not have a left child.) Then N is deleted from T by first deleting S(N) from T (by using Case 1 or Case 2) and then replacing node N in T by the node S(N).

Observe that the third case is much more complicated than the first two cases. In all three cases, the memory space of the deleted node N is returned to the AVAIL list.

[Figure: (a) Before deletions. (b) Linked representation.]
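The three deletion cases might be implemented as follows. This is a hedged Python sketch with a minimal node type of our own; for Case 3 it copies the inorder successor's value into N and then deletes the successor, which is one common way to realize "replacing node N by S(N)":

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def delete(root, item):
    """Delete item from the BST rooted at root; return the new root."""
    if root is None:
        return None
    if item < root.value:
        root.left = delete(root.left, item)
    elif item > root.value:
        root.right = delete(root.right, item)
    else:
        # Case 1 (no children) and Case 2 (one child): splice N out.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3 (two children): find inorder successor S(N) ...
        succ = root.right
        while succ.left is not None:   # S(N) has no left child
            succ = succ.left
        root.value = succ.value        # ... replace N's value by S(N)'s
        root.right = delete(root.right, succ.value)  # delete S(N) (Case 1/2)
    return root

# Illustrative tree (values are our own): delete the root, which has two children.
root = Node(50, Node(30, Node(20), Node(40)), Node(70, Node(60), Node(80)))
root = delete(root, 50)
print(root.value)                      # prints 60, the inorder successor of 50
```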
[Figure: (a) Node 44 is deleted. (b) Linked representation.]

Sorting Methods

The function of sorting, or ordering a list of objects according to some linear order, is so fundamental that it is ubiquitous in engineering applications in all disciplines. There are two broad categories of sorting methods:

- Internal sorting takes place in main memory, where we can take advantage of the random access nature of main memory.
- External sorting is necessary when the number and size of objects are prohibitive to be accommodated in main memory.

Given records r1, r2, ..., rn with key values k1, k2, ..., kn, sorting produces the records in the order ri1, ri2, ..., rin such that

    ki1 <= ki2 <= ... <= kin
The complexity of a sorting algorithm can be measured in terms of:

- the number of algorithm steps to sort n records
- the number of comparisons between keys (appropriate when the keys are long character strings)
- the number of times records must be moved (appropriate when record size is large)

Any sorting algorithm that uses comparisons of keys needs at least O(n log n) time to accomplish the sorting.

Sorting methods:

- Internal (in memory): quick sort, heap sort, bubble sort, insertion sort, selection sort, shell sort
- External (appropriate for secondary storage): merge sort, radix sort, polyphase sort

Insertion Sort

The general idea of the insertion sort method is that for each element, we find the slot where it belongs.

Example: The element in position Array[0] is certainly sorted. Thus, move on to insert the second character, D, into the appropriate location to maintain the alphabetical order.

How does it work? Each element Array[j] is taken one at a time from j = 0 to n-1. Before insertion of Array[j], the subarray from Array[0] to Array[j-1] is sorted, and the remainder of the array is not. After insertion, Array[0..j] is correctly ordered, while the subarray with elements Array[j+1]..Array[n-1] is unsorted.

Insertion Sort Algorithm
for i = 1 to n-1
    temp = a[i]
    loc = i
    while (loc > 0) and (a[loc-1] > temp)
        a[loc] = a[loc-1]
        loc = loc - 1
    a[loc] = temp

Insertion sort

The initial state is that the first element, considered by itself, is sorted. The final state is that all elements, considered as a group, are sorted. The basic action is to arrange the elements in positions 0 through i; in each stage i increases by 1, and the outer loop controls this. When the body of the outer for loop is entered, we know that the elements at positions 0 through i-1 are sorted, and we need to extend this to positions 0 through i; after the last stage, positions 0 to n-1 are sorted. At each step the element indexed by i needs to be added to the sorted part of the array. This is done by placing it in a temporary variable and sliding all elements larger than it one position to the right. Then the temporary element is copied into the vacated position; the counter loc indicates this position.

Complexity

Best case: the data is already sorted. The inner loop is never executed, and the outer loop is executed n - 1 times, for a total complexity of O(n).

Worst case: the data is in reverse order. The inner loop is executed the maximum number of times. Thus the complexity of the insertion sort in this worst possible case is quadratic, or O(n^2).

Selection Sort

In this sorting we find the smallest element in the list and put it in the first position. Then we find the second smallest element in the list and put it in the second position, and so on.

Pass 1. Find the location LOC of the smallest in the list of N elements A[1], A[2], ..., A[N], and then interchange A[LOC] and A[1]. Then A[1] is sorted.
Pass 2. Find the location LOC of the smallest in the sublist of N - 1 elements A[2], A[3], ..., A[N], and then interchange A[LOC] and A[2]. Then A[1], A[2] is sorted, since A[1] < A[2].

Pass 3. Find the location LOC of the smallest in the sublist of N - 2 elements A[3], A[4], ..., A[N], and then interchange A[LOC] and A[3]. Then A[1], A[2], A[3] is sorted, since A[2] < A[3].

...

Pass N - 1. Find the location LOC of the smaller of the elements A[N - 1], A[N], and then interchange A[LOC] and A[N - 1]. Then A[1], A[2], ..., A[N] is sorted, since A[N - 1] < A[N]. Thus A is sorted after N - 1 passes.

Hashing

Accessing elements in an array is extremely efficient: array elements are accessed by index. If we can find a mapping between the search keys and indices, we can store each record in the element with the corresponding index. Each element would then be found with one operation only.

Advantage: the records can be referenced directly; ideally the search time is a constant, complexity O(1).

Question: how do we find such a correspondence?

Answers: direct-address tables, hash tables.

Direct-address tables

Direct-address tables are the most elementary form of hashing. The assumption is a direct one-to-one correspondence between the keys and the numbers 0, 1, ..., m-1, with m not very large. Searching is fast, but there is a cost: the size of the array we need is the size of the largest key. This is not very useful if only a few keys are widely distributed.
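A direct-address table as just described can be sketched in Python; the class name and the choice of None as the "empty" sentinel are our own:

```python
class DirectAddressTable:
    """Direct addressing: the record with key k (0 <= k < m) lives in slot k."""

    def __init__(self, m):
        self.slots = [None] * m    # table size = size of the key universe

    def insert(self, key, value):
        self.slots[key] = value    # one operation: O(1)

    def search(self, key):
        return self.slots[key]     # O(1); None means the key is absent

    def delete(self, key):
        self.slots[key] = None     # O(1)

# A table for keys 0..99: fast, but 100 slots exist even if few are used.
t = DirectAddressTable(100)
t.insert(42, "record for key 42")
print(t.search(42))                # prints: record for key 42
```

Note the cost the text mentions: the array must be as large as the largest possible key, even when only a few keys actually occur.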
Hash functions

A hash function is a function that transforms the search key into a table address. Hash functions transform the keys into numbers within a predetermined interval. These numbers are then used as indices in an array (table, hash table) to store the records.

Keys that are numbers: if M is the size of the array, then h(key) = key % M. This will map all the keys into numbers within the interval [0, M-1].

Keys that are strings of characters: treat the binary representation of the key as a number, and then apply the first case. How keys are treated as numbers: if each character is represented with m bits, then the string can be treated as a base-2^m number.

Hash tables: basic concepts

Once we have found the method of mapping keys to indices, the questions to be solved are how to choose the size of the table (array) to store the records, and how to perform the basic operations:

- insert
- search
- delete

Let N be the number of records to be stored, and M the size of the array (hash table). The integer between 0 and M-1 generated by a hash function is used as an index in a hash table of M elements. Initially all slots in the table are blank; this is shown either by a sentinel value or by a special field in each slot.

To insert, use the hash function to generate an address for each value to be inserted.
To search for a key in the table, the same hash function is used.

To delete a record with a given key, we first apply the search method, and when the key is found we delete the record.

Size of the table: ideally we would like to store N records in a table of size N. However, in many cases we don't know in advance the exact number of records. Also, the hash function can map two keys to one and the same index, so some cells in the array will not be used. Hence we assume that the size of the table can be different from the number of records. We use M to denote the size of the table.

A characteristic of the hash table is its load factor L = N/M: the ratio between the number of records to be stored and the size of the table. The method used to choose the size of the table depends on the chosen method of collision resolution, discussed below. M should be a prime number: it has been proved that if M is a prime number, we obtain a better (more even) distribution of the keys over the table.

Collision resolution

A collision is the case when two or more keys hash to one and the same index in the hash table. Collision resolution deals with keys that are mapped to the same indices. Methods:

- Separate chaining
- Open addressing
  - Linear probing
  - Quadratic probing
  - Double hashing
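The hash functions described above can be sketched in Python. M = 11 is an arbitrary prime of our own choosing, and the function names are ours; note how two different keys can already collide, which is what the methods below resolve:

```python
M = 11  # table size; a prime gives a more even distribution of keys

def hash_int(key):
    """h(key) = key % M, mapping any integer key into [0, M-1]."""
    return key % M

def hash_string(s):
    """Treat the string as a base-2^8 number (one byte per character),
    reducing modulo M as we go so the intermediate number stays small."""
    n = 0
    for ch in s:
        n = (n * 256 + ord(ch)) % M
    return n

print(hash_int(27), hash_int(16))  # prints 5 5 -- a collision:
                                   # 27 % 11 == 16 % 11 == 5
```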
SEPARATE CHAINING

Complexity of separate chaining

The time to compute the index of a given key is a constant. Then we have to search a list for the record, so the time depends on the length of the lists. It has been shown empirically that the average list length is N/M (the load factor L), provided M is prime and we use a function that gives a good distribution. Unsuccessful searches go to the end of some list, hence we have L comparisons. Successful searches are expected to go half way down some list, so on average the number of comparisons in a successful search is L/2. Therefore we can say that the runtime complexity of separate chaining is O(L). Note that what really matters is the load factor, rather than the size of the table or the number of records taken separately.

How to choose M in separate chaining? Since the method is used in cases when we cannot predict the number of records in advance, the choice of M basically depends on other factors such as available memory. Typically M is chosen relatively small so as not to use up a large area of contiguous memory, but large enough that the lists are short for more efficient sequential search. Recommendations in the literature vary from M being about one tenth of N (the number of records) to M being equal (or close) to N.

Other methods of chaining:

- Keep the lists ordered: useful if there are many more searches than inserts, and if most of the searches are unsuccessful.
- Represent the chains as binary search trees: the extra effort needed makes this not efficient.

Advantages of separate chaining: used when memory is of concern, easily implemented.

Disadvantages: with unevenly distributed keys there can be long lists and many empty spaces in the table.
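A minimal separate-chaining table in Python, using one Python list per slot as the chain; the table size and all names are our own illustrative choices:

```python
class ChainedHashTable:
    def __init__(self, m=11):                   # m prime, per the text
        self.m = m
        self.chains = [[] for _ in range(m)]    # one chain (list) per slot

    def _hash(self, key):
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.chains[self._hash(key)]
        for pair in chain:
            if pair[0] == key:
                pair[1] = value                 # key already present: update
                return
        chain.append([key, value])              # collision: extend the chain

    def search(self, key):
        for k, v in self.chains[self._hash(key)]:
            if k == key:
                return v                        # successful: ~L/2 comparisons
        return None                             # unsuccessful: L comparisons

    def delete(self, key):
        idx = self._hash(key)
        self.chains[idx] = [p for p in self.chains[idx] if p[0] != key]

t = ChainedHashTable()
t.insert(27, "a")
t.insert(16, "b")                 # 27 and 16 both hash to slot 5 when m = 11
print(t.search(27), t.search(16)) # prints: a b
```

Both colliding keys are stored and found; the cost is a short sequential search along the chain, O(L) on average.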
Open addressing

Invented by A. P. Ershov and W. W. Peterson in 1957 independently.

Idea: store collisions in the hash table itself. The method uses a collision resolution function in addition to the hash function. If a collision occurs, the next probes are performed following the formula:

    h_i(x) = (hash(x) + f(i)) mod TableSize

where hash(x) is the hash function, f(i) is the collision resolution function, and i is the number of the current attempt (probe) to insert an element.

Linear probing (linear hashing, sequential probing): f(i) = i

Insert: when there is a collision we just probe the next slot in the table. If it is unoccupied we store the key there; if it is occupied we continue probing the next slot.

Search: if the key hashes to a position that is occupied and there is no match, we probe the next position:

a) match: successful search
b) empty position: unsuccessful search
c) occupied and no match: continue probing

When the end of the table is reached, the probing continues from the beginning, until the original starting position is reached.

Problems with delete: a special flag is needed to distinguish deleted from empty positions. This is necessary for the search function: if we come to a "deleted" position,
the search has to continue, as the deletion might have been done after the insertion of the key we are looking for, and that key might be further along in the table.

[Figure: example of linear probing]

Advantage: the total amount of memory space is less, since no pointers are maintained.

Disadvantage: "primary clustering". Large clusters tend to build up: if an empty slot is preceded by i filled slots, the probability that the empty slot is the next one to be filled is (i+1)/M; if the preceding slot was empty, the probability is 1/M. This means that when the table begins to fill up, many other slots are examined. Linear probing runs slowly for nearly full tables.

Quadratic probing: f(i) = i^2

A quadratic function is used to compute the next index in the table to be probed. Example: in linear probing, if the i-th position is occupied we check the (i+1)-st position, next the (i+2)-nd, etc. In quadratic probing, if the i-th position is occupied we check the (i+1)-st, next the (i+4)-th, next the (i+9)-th, etc. The idea here is to skip over regions of the table with possible clusters.

Double hashing: f(i) = i * hash2(x)

The purpose is the same as in quadratic probing: to overcome the disadvantage of clustering. Instead of examining each successive entry following a collided position, we use a second hash function to get a fixed increment for the "probe" sequence.
The second function should be chosen so that the increment and M are relatively prime; otherwise there will be slots that remain unexamined. Example: hash2(x) = R - (x mod R), where R is a prime smaller than TableSize.

In open addressing the load factor L is less than 1. A good strategy is to keep L < 0.5.

Rehashing

If the table is close to full, the search time grows and may become equal to the table size. When the load factor exceeds a certain value (e.g. greater than 0.5) we do rehashing: build a second table twice as large as the original and rehash there all the keys of the original table. Rehashing is an expensive operation, with running time O(N). However, once done, the new hash table will have good performance.

Extendible hashing

Used when the amount of data is too large to fit in main memory and external storage is used. There are N records in total to store, with M records fitting in one disk block.

The problem: in ordinary hashing, several disk blocks may be examined to find an element, which is a time-consuming process. In extendible hashing no more than two blocks are examined.

Idea: keys are grouped according to the first m bits in their code. Each group is stored in one disk block.
If a block becomes full and no more records can be inserted, that group is split into two, and m+1 bits are considered to determine the location of a record.

Example: let's say we have 4 groups of keys according to the first two bits, and each disk block can contain 3 records only, so 4 blocks are needed to store the keys. When a new key is to be inserted into a block that is full, we start considering 3 bits for that group. The second group of keys is split onto two disk blocks: one for keys starting with 010, and one for keys starting with 011; the other groups stay on their blocks.

A directory is maintained in main memory with pointers to the disk blocks for each bit pattern. The size of the directory is

    2^D = O(N^(1 + 1/M) / M), where
    D = number of bits considered
    N = number of records
    M = number of records in a disk block

Conclusion

Hashing is the best search method (constant running time) if we don't need to have the records sorted. The choice of the hash function remains the most difficult part of the task and depends very much on the nature of the keys.

Separate chaining or open addressing? Open addressing is the preferred method if there is enough memory to keep a table twice as large as the number of records. Separate chaining is used when we don't know in advance the number of records to be stored. Though it requires additional time for list processing, it is simpler to implement.

Some application areas: dictionaries, on-line spell checkers, compiler symbol tables.
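As a closing sketch, linear probing with the "deleted" flag described in the open-addressing section might look like this in Python; the sentinel objects, class name and table size are our own choices:

```python
EMPTY, DELETED = object(), object()   # DELETED is the special flag that
                                      # distinguishes deleted from empty slots

class LinearProbingTable:
    def __init__(self, m=11):
        self.m = m
        self.slots = [EMPTY] * m

    def _probe(self, key):
        h = key % self.m
        for i in range(self.m):
            yield (h + i) % self.m    # f(i) = i, wrapping past the table end

    def insert(self, key):
        for idx in self._probe(key):
            if self.slots[idx] in (EMPTY, DELETED) or self.slots[idx] == key:
                self.slots[idx] = key
                return                 # table full: insertion silently fails

    def search(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY:
                return False           # empty position: unsuccessful
            if self.slots[idx] == key:
                return True            # match: successful
            # occupied (or deleted) and no match: continue probing
        return False

    def delete(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY:
                return
            if self.slots[idx] == key:
                self.slots[idx] = DELETED   # flag, don't blank the slot
                return

t = LinearProbingTable()
for k in (27, 16, 5):          # all three hash to index 5 when m = 11
    t.insert(k)
t.delete(16)
print(t.search(5))             # prints True: the search steps over the
                               # DELETED slot and keeps probing
```

If deletion blanked the slot instead of flagging it, the search for 5 would stop at the empty position and wrongly report an unsuccessful search.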
HO #13 Fall 2015 Gary Chan Hashing (N:12) Outline Motivation Hashing Algorithms and Improving the Hash Functions Collisions Strategies Open addressing and linear probing Separate chaining COMP2012H (Hashing)
More informationThe time and space are the two measure for efficiency of an algorithm.
There are basically six operations: 5. Sorting: Arranging the elements of list in an order (either ascending or descending). 6. Merging: combining the two list into one list. Algorithm: The time and space
More informationa) State the need of data structure. Write the operations performed using data structures.
Important Instructions to examiners: 1) The answers should be examined by key words and not as word-to-word as given in the model answer scheme. 2) The model answer and the answer written by candidate
More informationTable ADT and Sorting. Algorithm topics continuing (or reviewing?) CS 24 curriculum
Table ADT and Sorting Algorithm topics continuing (or reviewing?) CS 24 curriculum A table ADT (a.k.a. Dictionary, Map) Table public interface: // Put information in the table, and a unique key to identify
More information4. SEARCHING AND SORTING LINEAR SEARCH
4. SEARCHING AND SORTING SEARCHING Searching and sorting are fundamental operations in computer science. Searching refers to the operation of finding the location of a given item in a collection of items.
More informationSelection, Bubble, Insertion, Merge, Heap, Quick Bucket, Radix
Spring 2010 Review Topics Big O Notation Heaps Sorting Selection, Bubble, Insertion, Merge, Heap, Quick Bucket, Radix Hashtables Tree Balancing: AVL trees and DSW algorithm Graphs: Basic terminology and
More informationReview of Elementary Data. Manoj Kumar DTU, Delhi
Review of Elementary Data Manoj Kumar DTU, Delhi Structures (Part 2) Linked List: Problem Find the address/data of first common node. Use only constant amount of additional space. Your algorithm should
More informationAlgorithms in Systems Engineering ISE 172. Lecture 12. Dr. Ted Ralphs
Algorithms in Systems Engineering ISE 172 Lecture 12 Dr. Ted Ralphs ISE 172 Lecture 12 1 References for Today s Lecture Required reading Chapter 5 References CLRS Chapter 11 D.E. Knuth, The Art of Computer
More informationHashing. Dr. Ronaldo Menezes Hugo Serrano. Ronaldo Menezes, Florida Tech
Hashing Dr. Ronaldo Menezes Hugo Serrano Agenda Motivation Prehash Hashing Hash Functions Collisions Separate Chaining Open Addressing Motivation Hash Table Its one of the most important data structures
More informationOperations on Heap Tree The major operations required to be performed on a heap tree are Insertion, Deletion, and Merging.
Priority Queue, Heap and Heap Sort In this time, we will study Priority queue, heap and heap sort. Heap is a data structure, which permits one to insert elements into a set and also to find the largest
More informationCS301 - Data Structures Glossary By
CS301 - Data Structures Glossary By Abstract Data Type : A set of data values and associated operations that are precisely specified independent of any particular implementation. Also known as ADT Algorithm
More information1. Attempt any three of the following: 15
(Time: 2½ hours) Total Marks: 75 N. B.: (1) All questions are compulsory. (2) Make suitable assumptions wherever necessary and state the assumptions made. (3) Answers to the same question must be written
More informationCMSC 341 Lecture 16/17 Hashing, Parts 1 & 2
CMSC 341 Lecture 16/17 Hashing, Parts 1 & 2 Prof. John Park Based on slides from previous iterations of this course Today s Topics Overview Uses and motivations of hash tables Major concerns with hash
More informationCSCD 326 Data Structures I Hashing
1 CSCD 326 Data Structures I Hashing Hashing Background Goal: provide a constant time complexity method of searching for stored data The best traditional searching time complexity available is O(log2n)
More informationCSE 332 Spring 2013: Midterm Exam (closed book, closed notes, no calculators)
Name: Email address: Quiz Section: CSE 332 Spring 2013: Midterm Exam (closed book, closed notes, no calculators) Instructions: Read the directions for each question carefully before answering. We will
More informationCSE 332 Winter 2015: Midterm Exam (closed book, closed notes, no calculators)
_ UWNetID: Lecture Section: A CSE 332 Winter 2015: Midterm Exam (closed book, closed notes, no calculators) Instructions: Read the directions for each question carefully before answering. We will give
More informationOpen Addressing: Linear Probing (cont.)
Open Addressing: Linear Probing (cont.) Cons of Linear Probing () more complex insert, find, remove methods () primary clustering phenomenon items tend to cluster together in the bucket array, as clustering
More informationHashing for searching
Hashing for searching Consider searching a database of records on a given key. There are three standard techniques: Searching sequentially start at the first record and look at each record in turn until
More informationAlgorithm Efficiency & Sorting. Algorithm efficiency Big-O notation Searching algorithms Sorting algorithms
Algorithm Efficiency & Sorting Algorithm efficiency Big-O notation Searching algorithms Sorting algorithms Overview Writing programs to solve problem consists of a large number of decisions how to represent
More informationCSE 530A. B+ Trees. Washington University Fall 2013
CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key
More information4.1 COMPUTATIONAL THINKING AND PROBLEM-SOLVING
4.1 COMPUTATIONAL THINKING AND PROBLEM-SOLVING 4.1.2 ALGORITHMS ALGORITHM An Algorithm is a procedure or formula for solving a problem. It is a step-by-step set of operations to be performed. It is almost
More informationModule 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo
Module 5: Hashing CS 240 - Data Structures and Data Management Reza Dorrigiv, Daniel Roche School of Computer Science, University of Waterloo Winter 2010 Reza Dorrigiv, Daniel Roche (CS, UW) CS240 - Module
More informationCSCE 2014 Final Exam Spring Version A
CSCE 2014 Final Exam Spring 2017 Version A Student Name: Student UAID: Instructions: This is a two-hour exam. Students are allowed one 8.5 by 11 page of study notes. Calculators, cell phones and computers
More informationAdapted By Manik Hosen
Adapted By Manik Hosen Basic Terminology Question: Define Hashing. Ans: Concept of building a data structure that can be searched in O(l) time is called Hashing. Question: Define Hash Table with example.
More informationFinal Exam in Algorithms and Data Structures 1 (1DL210)
Final Exam in Algorithms and Data Structures 1 (1DL210) Department of Information Technology Uppsala University February 0th, 2012 Lecturers: Parosh Aziz Abdulla, Jonathan Cederberg and Jari Stenman Location:
More informationDirect File Organization Hakan Uraz - File Organization 1
Direct File Organization 2006 Hakan Uraz - File Organization 1 Locating Information Ways to organize a file for direct access: The key is a unique address. The key converts to a unique address. The key
More informationCOS 226 Midterm Exam, Spring 2009
NAME: login ID: precept: COS 226 Midterm Exam, Spring 2009 This test is 10 questions, weighted as indicated. The exam is closed book, except that you are allowed to use a one page cheatsheet. No calculators
More informationIntroducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved
Introducing Hashing Chapter 21 Contents What Is Hashing? Hash Functions Computing Hash Codes Compressing a Hash Code into an Index for the Hash Table A demo of hashing (after) ARRAY insert hash index =
More informationCSE373: Data Structures & Algorithms Lecture 17: Hash Collisions. Kevin Quinn Fall 2015
CSE373: Data Structures & Algorithms Lecture 17: Hash Collisions Kevin Quinn Fall 2015 Hash Tables: Review Aim for constant-time (i.e., O(1)) find, insert, and delete On average under some reasonable assumptions
More informationCSE Data Structures and Introduction to Algorithms... In Java! Instructor: Fei Wang. Mid-Term Exam. CSE2100 DS & Algorithms 1
CSE 2100 Data Structures and Introduction to Algorithms...! In Java!! Instructor: Fei Wang! Mid-Term Exam CSE2100 DS & Algorithms 1 1. True or False (20%=2%x10)! (1) O(n) is O(n^2) (2) The height h of
More informationCSIT5300: Advanced Database Systems
CSIT5300: Advanced Database Systems L08: B + -trees and Dynamic Hashing Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR,
More informationTopic HashTable and Table ADT
Topic HashTable and Table ADT Hashing, Hash Function & Hashtable Search, Insertion & Deletion of elements based on Keys So far, By comparing keys! Linear data structures Non-linear data structures Time
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL
More informationHash Table. A hash function h maps keys of a given type into integers in a fixed interval [0,m-1]
Exercise # 8- Hash Tables Hash Tables Hash Function Uniform Hash Hash Table Direct Addressing A hash function h maps keys of a given type into integers in a fixed interval [0,m-1] 1 Pr h( key) i, where
More informationUNIT 7. SEARCH, SORT AND MERGE
UNIT 7. SEARCH, SORT AND MERGE ALGORITHMS Year 2017-2018 Industrial Technology Engineering Paula de Toledo CONTENTS 7.1. SEARCH 7.2. SORT 7.3. MERGE 2 SEARCH Search, sort and merge algorithms Search (search
More informationChapter 10. Sorting and Searching Algorithms. Fall 2017 CISC2200 Yanjun Li 1. Sorting. Given a set (container) of n elements
Chapter Sorting and Searching Algorithms Fall 2017 CISC2200 Yanjun Li 1 Sorting Given a set (container) of n elements Eg array, set of words, etc Suppose there is an order relation that can be set across
More informationSorting. Sorting in Arrays. SelectionSort. SelectionSort. Binary search works great, but how do we create a sorted array in the first place?
Sorting Binary search works great, but how do we create a sorted array in the first place? Sorting in Arrays Sorting algorithms: Selection sort: O(n 2 ) time Merge sort: O(nlog 2 (n)) time Quicksort: O(n
More informationChapter 20 Hash Tables
Chapter 20 Hash Tables Dictionary All elements have a unique key. Operations: o Insert element with a specified key. o Search for element by key. o Delete element by key. Random vs. sequential access.
More informationLecture 8 Index (B+-Tree and Hash)
CompSci 516 Data Intensive Computing Systems Lecture 8 Index (B+-Tree and Hash) Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 HW1 due tomorrow: Announcements Due on 09/21 (Thurs),
More informationHash-Based Indexing 1
Hash-Based Indexing 1 Tree Indexing Summary Static and dynamic data structures ISAM and B+ trees Speed up both range and equality searches B+ trees very widely used in practice ISAM trees can be useful
More information) $ f ( n) " %( g( n)
CSE 0 Name Test Spring 008 Last Digits of Mav ID # Multiple Choice. Write your answer to the LEFT of each problem. points each. The time to compute the sum of the n elements of an integer array is: # A.
More informationSection 05: Solutions
Section 05: Solutions 1. Memory and B-Tree (a) Based on your understanding of how computers access and store memory, why might it be faster to access all the elements of an array-based queue than to access
More informationData Structures and Algorithms
Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco 19-0: Indexing Operations: Add an element Remove an element Find an element,
More informationCSE100. Advanced Data Structures. Lecture 21. (Based on Paul Kube course materials)
CSE100 Advanced Data Structures Lecture 21 (Based on Paul Kube course materials) CSE 100 Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining Hash table cost
More informationImplementation with ruby features. Sorting, Searching and Haching. Quick Sort. Algorithm of Quick Sort
Implementation with ruby features Sorting, and Haching Bruno MARTI, University of ice - Sophia Antipolis mailto:bruno.martin@unice.fr http://deptinfo.unice.fr/ hmods.html It uses the ideas of the quicksort
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree
More informationCSE 332, Spring 2010, Midterm Examination 30 April 2010
CSE 332, Spring 2010, Midterm Examination 30 April 2010 Please do not turn the page until the bell rings. Rules: The exam is closed-book, closed-note. You may use a calculator for basic arithmetic only.
More informationIn-Memory Searching. Linear Search. Binary Search. Binary Search Tree. k-d Tree. Hashing. Hash Collisions. Collision Strategies.
In-Memory Searching Linear Search Binary Search Binary Search Tree k-d Tree Hashing Hash Collisions Collision Strategies Chapter 4 Searching A second fundamental operation in Computer Science We review
More informationData Structure for Language Processing. Bhargavi H. Goswami Assistant Professor Sunshine Group of Institutions
Data Structure for Language Processing Bhargavi H. Goswami Assistant Professor Sunshine Group of Institutions INTRODUCTION: Which operation is frequently used by a Language Processor? Ans: Search. This
More informationTables. The Table ADT is used when information needs to be stored and acessed via a key usually, but not always, a string. For example: Dictionaries
1: Tables Tables The Table ADT is used when information needs to be stored and acessed via a key usually, but not always, a string. For example: Dictionaries Symbol Tables Associative Arrays (eg in awk,
More informationRecitation 9. Prelim Review
Recitation 9 Prelim Review 1 Heaps 2 Review: Binary heap min heap 1 2 99 4 3 PriorityQueue Maintains max or min of collection (no duplicates) Follows heap order invariant at every level Always balanced!
More informationComputer Science 136 Spring 2004 Professor Bruce. Final Examination May 19, 2004
Computer Science 136 Spring 2004 Professor Bruce Final Examination May 19, 2004 Question Points Score 1 10 2 8 3 15 4 12 5 12 6 8 7 10 TOTAL 65 Your name (Please print) I have neither given nor received
More information2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS
Outline catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } Balanced Search Trees 2-3 Trees 2-3-4 Trees Slide 4 Why care about advanced implementations? Same entries, different insertion sequence:
More informationCS 137 Part 8. Merge Sort, Quick Sort, Binary Search. November 20th, 2017
CS 137 Part 8 Merge Sort, Quick Sort, Binary Search November 20th, 2017 This Week We re going to see two more complicated sorting algorithms that will be our first introduction to O(n log n) sorting algorithms.
More informationCS-301 Data Structure. Tariq Hanif
1. The tree data structure is a Linear data structure Non-linear data structure Graphical data structure Data structure like queue FINALTERM EXAMINATION Spring 2012 CS301- Data Structure 25-07-2012 2.
More informationLecture 5. Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs
Lecture 5 Treaps Find, insert, delete, split, and join in treaps Randomized search trees Randomized search tree time costs Reading: Randomized Search Trees by Aragon & Seidel, Algorithmica 1996, http://sims.berkeley.edu/~aragon/pubs/rst96.pdf;
More informationHash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell
Hash Tables CS 311 Data Structures and Algorithms Lecture Slides Wednesday, April 22, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005
More informationTHINGS WE DID LAST TIME IN SECTION
MA/CSSE 473 Day 24 Student questions Space-time tradeoffs Hash tables review String search algorithms intro We did not get to them in other sections THINGS WE DID LAST TIME IN SECTION 1 1 Horner's Rule
More informationCS 350 Algorithms and Complexity
CS 350 Algorithms and Complexity Winter 2019 Lecture 12: Space & Time Tradeoffs. Part 2: Hashing & B-Trees Andrew P. Black Department of Computer Science Portland State University Space-for-time tradeoffs
More informationChapter 12: Indexing and Hashing. Basic Concepts
Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition
More informationUnit 6 Chapter 15 EXAMPLES OF COMPLEXITY CALCULATION
DESIGN AND ANALYSIS OF ALGORITHMS Unit 6 Chapter 15 EXAMPLES OF COMPLEXITY CALCULATION http://milanvachhani.blogspot.in EXAMPLES FROM THE SORTING WORLD Sorting provides a good set of examples for analyzing
More informationCS:3330 (22c:31) Algorithms
What s an Algorithm? CS:3330 (22c:31) Algorithms Introduction Computer Science is about problem solving using computers. Software is a solution to some problems. Algorithm is a design inside a software.
More information
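The binary search technique introduced above repeatedly narrows the search to a segment DATA[BEG], ..., DATA[END] of the sorted array. The following is a minimal Python sketch of that idea; the function name and the convention of returning -1 for an unsuccessful search are illustrative assumptions, not part of the original notes.

```python
def binary_search(data, item):
    """Return the location of item in the sorted list data, or -1 if absent."""
    beg, end = 0, len(data) - 1
    while beg <= end:
        mid = (beg + end) // 2      # middle of the current segment
        if data[mid] == item:
            return mid              # successful search: ITEM found at LOC = mid
        elif data[mid] < item:
            beg = mid + 1           # ITEM can only lie in the upper half
        else:
            end = mid - 1           # ITEM can only lie in the lower half
    return -1                       # unsuccessful search

# Example on a sorted array:
# binary_search([5, 12, 35, 42, 77, 101], 42) -> 3
```

Each iteration halves the segment under consideration, which is why binary search runs in O(log n) time, compared with the O(n) time of the sequential search given earlier.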