Unit 8: Analysis of Algorithms 1: Searching

AP Computer Science
Unit 8: Analysis of Algorithms 1: Searching

Topics:
I. Sigma and Big-O notation
II. Linear Search
III. Binary Search

Materials:
I. Rawlins 1.6
II. Rawlins 2.1
III. Rawlins 2.3
IV. Sigma notation exercises
V. Analysis exercises #2
VI. Primes Specifications
VII. Translator Specifications
VIII. Review exercises

AP Computer Science
Sigma Notation and Big-O Notation Exercises

I. Write the following expressions using sigma notation:
a. 10 + 9 + 8 + 7 + 6 + 5
b. 1/1 + 1/2 + 1/3 + ... + 1/n
c. (n + 2n + 3n + 4n + 5n)
d. 1/1 + 1/2 + 1/4 + 1/8 + ... + 1/2^n

II. Evaluate the sigma-notation expressions: a. 4 1 2 i 1 4 b. c. 5 i 3 n i 1 3x n

III. For each of the following routines, what is the approximate run time? Use Big-O notation.
a. Reading a string of size n into memory.
b. Swapping two elements in an array.
c. Performing a serial (linear) search on n lists, all of which are size n.

AP Computer Science
Analysis of Algorithms Exercises #2

1. Find a closed n form for the following sums: i a. i 0 b. n i 1 i 3

2. Examine the code below:

    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum = sum - 1;
        for (int j = 1; j <= i; j++) {
            sum = sum + 2;
        }
    }

What value is sum set to as a result of the code? How many assignments does it do?

Data Structures

At the heart of virtually every computer program are its algorithms and its data structures. It is hard to separate these two items, for data structures are meaningless without algorithms to create and manipulate them, and algorithms are usually trivial unless there are data structures on which to operate. This category concentrates on four of the most basic structures: stacks, queues, binary search trees, and priority queues. Questions will cover these data structures and their implicit algorithms, not implementation language details.

A stack is usually used to save information that will need to be processed later. Items are processed in a last-in, first-out (LIFO) order. A queue is usually used to process items in the order in which requests are generated; a new item is not processed until all items currently on the queue are processed. This is also known as first-in, first-out (FIFO) order. A binary search tree is used when one is storing a set of items and needs to be able to efficiently process the operations of insertion, deletion and query (i.e., find out if a particular item is part of the set and, if not, which item in the set is close to the item in question). A priority queue is used like a binary search tree, except one cannot delete an arbitrary item, nor can one make an arbitrary query. One can only delete the smallest element of the set, and can only find out what is the smallest element of the set.

A stack supports two operations: PUSH and POP. A command of the form PUSH(A) puts the key A at the top of the stack; the command POP(X) removes the top item from the stack and stores its value into variable X. If the stack was empty (because nothing had ever been pushed on it, or because all elements have been popped off of it), then X is given the special value of NIL. An analogy to this is a stack of books on a desk: a new book is placed on the top of the stack (pushed) and a book is removed from the top also (popped).
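The PUSH and POP rules above, including the NIL result on an empty structure, can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `StackQueueDemo`, the method `run`, the string encoding of operations, and the particular operation sequence are all made up for this sketch. The same method also models a FIFO queue by removing from the opposite end, which makes the LIFO/FIFO contrast easy to see.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class StackQueueDemo {
    /**
     * Runs a sequence of operations and returns the popped values in order.
     * Each op is either "POP" or a key to PUSH (hypothetical encoding).
     * A POP on an empty structure yields "NIL".
     */
    static List<String> run(String[] ops, boolean lifo) {
        Deque<String> items = new ArrayDeque<>();
        List<String> popped = new ArrayList<>();
        for (String op : ops) {
            if (!op.equals("POP")) {
                items.addLast(op);                 // new items always go on top
            } else if (items.isEmpty()) {
                popped.add("NIL");                 // nothing left to remove
            } else {
                // stack: remove from the top; queue: remove from the bottom
                popped.add(lifo ? items.removeLast() : items.removeFirst());
            }
        }
        return popped;
    }

    public static void main(String[] args) {
        String[] ops = {"A", "M", "E", "POP", "R", "POP", "I",
                        "POP", "POP", "POP", "POP", "C", "A", "N"};
        System.out.println("stack pops: " + run(ops, true));
        System.out.println("queue pops: " + run(ops, false));
    }
}
```

Using one deque for both behaviors is a convenience of the sketch; a production program would more likely use separate stack and queue types.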
Some textbooks call a stack a push-down stack or a LIFO stack.

Queues operate just like stacks, except items are removed from the bottom instead of the top. A good physical analogy of this is the way a train conductor or newspaper boy uses a coin machine to give change: new coins are added to the tops of the piles, and change is given from the bottom of each. Some textbooks refer to this data structure as a FIFO stack.

Consider the following sequence of 14 operations:

PUSH(A), PUSH(M), PUSH(E), POP(X), PUSH(R), POP(X), PUSH(I), POP(X), POP(X), POP(X), POP(X), PUSH(C), PUSH(A), PUSH(N)

If these operations are applied to a stack, then the values of the pops are: E, R, I, M, A and NIL. After all of the operations, there are three items still on the stack: the N is at the top (it will be the next to be popped, if nothing else is pushed before the pop command), and C is at the bottom. If, instead of using a stack, we used a queue, then the values popped would be: A, M, E, R, I and NIL. There would be three items still on the queue: N at the top and C on the bottom. Since items are removed from the bottom of a queue, C would be the next item to be popped regardless of any additional pushes.

A binary search tree is composed of nodes having three parts: information (or a key), a pointer to a left child, and a pointer to a right child. It has the property that the key at every node is always greater than or equal to the key of its left child, and less than the key of its right child. The following tree is built from the keys A, M, E, R, I, C, A, N in that order:

        A
       / \
      A   M
         / \
        /   \
       E     R
      / \   /
     C   I N

The root of the resulting tree is the node containing the key A; note that duplicate keys are inserted into the tree as if they were less than their equal key. The tree has a depth (sometimes called height) of 3 because the deepest node is 3 nodes below the root. Nodes with no children are called leaf nodes; there are four of them in the tree: A, C, I and N. An external node is the name given to a place where a new node could be attached to the tree. In the final tree above, there are 9 external nodes; these are not drawn. The tree has an internal path length of 15: the sum of the depths of all nodes. It has an external path length of 31: the sum of the depths of all external nodes. To insert the N (the last key inserted), 3 comparisons were needed: against the root A, the M and the R.

To perform an inorder traversal of the tree, recursively traverse the tree by first visiting the left child, then the root, then the right child. In the tree above, the nodes are visited in the following order: A, A, C, E, I, M, N and R. A preorder traversal (root, left, right) visits in the following order: A, A, M, E, C, I, R and N. A postorder traversal (left, right, root) is: A, C, I, E, N, R, M, A. Inorder traversals are typically used to list the contents of the tree in order.

Binary search trees can support the operations insert, delete and search. Moreover, they handle the operations efficiently: in a tree with, say, 1 million items, one can search for a particular value in about log2(1,000,000) ≈ 20 steps. Items can be inserted or deleted in about as many steps, too. However, binary search trees can become unbalanced if the keys are not inserted in a reasonably random order. For example, consider the binary search tree resulting from inserting the keys A, E, I, O, U, Y: every key goes to the right of the previous one, so the tree degenerates into a chain. Sophisticated techniques are available to maintain balanced trees.
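The insertion rule, the three traversals, and the path-length figures quoted above can be checked with a short Java sketch. The names here (`BstDemo`, `Node`, `build`, and so on) are assumptions made for illustration, not part of the handout; the keys are single characters, and a duplicate key is sent to the left, matching the rule stated in the text.

```java
class BstDemo {
    static class Node {
        char key;
        Node left, right;
        Node(char key) { this.key = key; }
    }

    /** Inserts key; a duplicate goes left, as if it were less than its equal key. */
    static Node insert(Node root, char key) {
        if (root == null) return new Node(key);
        if (key <= root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        return root;
    }

    /** Builds a tree by inserting the characters of keys from left to right. */
    static Node build(String keys) {
        Node root = null;
        for (char c : keys.toCharArray()) root = insert(root, c);
        return root;
    }

    static String inorder(Node n) {
        return n == null ? "" : inorder(n.left) + n.key + inorder(n.right);
    }

    static String preorder(Node n) {
        return n == null ? "" : n.key + preorder(n.left) + preorder(n.right);
    }

    static String postorder(Node n) {
        return n == null ? "" : postorder(n.left) + postorder(n.right) + n.key;
    }

    /** Depth of the deepest node, counting the root as depth 0 (-1 if empty). */
    static int depth(Node n) {
        return n == null ? -1 : 1 + Math.max(depth(n.left), depth(n.right));
    }

    /** Sum of the depths of all nodes; d is the depth of n. */
    static int internalPathLength(Node n, int d) {
        if (n == null) return 0;
        return d + internalPathLength(n.left, d + 1) + internalPathLength(n.right, d + 1);
    }

    /** Sum of the depths of all external nodes (the empty child slots). */
    static int externalPathLength(Node n, int d) {
        if (n == null) return d;
        return externalPathLength(n.left, d + 1) + externalPathLength(n.right, d + 1);
    }

    public static void main(String[] args) {
        Node root = build("AMERICAN");
        System.out.println(inorder(root) + " " + preorder(root) + " " + postorder(root));
        System.out.println(depth(root) + " " + internalPathLength(root, 0)
                + " " + externalPathLength(root, 0));
    }
}
```

Building from the string "AMERICAN" inserts A, M, E, R, I, C, A, N in that order, so the results can be compared directly with the figures in the text.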
Binary search trees are dynamic data structures: they can support an unlimited number of operations, and in any order. To search for a node in a binary search tree, the following algorithm (in pseudo-code) is used:

    p = root
    found = FALSE
    loop while (p ≠ NIL) and (not found)
        if (x < p's key) then
            p = p's left child
        else if (x > p's key) then
            p = p's right child
        else
            found = TRUE
    repeat
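A direct Java rendering of the search pseudo-code might look like the sketch below. The class name `BstSearchDemo`, the integer keys, and the helper `insert` are assumptions added so the example is self-contained; only the loop in `contains` mirrors the pseudo-code.

```java
class BstSearchDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Helper for building a test tree; duplicates go to the left. */
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key <= root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        return root;
    }

    /** Iterative search: walks down from the root until x is found or p runs off the tree. */
    static boolean contains(Node root, int x) {
        Node p = root;
        boolean found = false;
        while (p != null && !found) {
            if (x < p.key) {
                p = p.left;        // x can only be in the left subtree
            } else if (x > p.key) {
                p = p.right;       // x can only be in the right subtree
            } else {
                found = true;      // keys are equal
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {50, 30, 70, 20, 40}) root = insert(root, k);
        System.out.println(contains(root, 40));
        System.out.println(contains(root, 60));
    }
}
```

Each pass through the loop moves one level down the tree, so the number of iterations is bounded by the depth of the tree plus one.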

Deleting from a binary search tree is a bit more complicated. The algorithm we'll use is as follows:

    p = node to delete
    f = father (parent) of p
    if (p has no children) then
        delete p
    else if (p has one child) then
        make p's child become f's child
        delete p
    else (p has two children)
        l = p's left child (it might also have children)
        r = p's right child (it might also have children)
        make l become f's child instead of p
        stick r onto the l tree
        delete p
    end

The following three diagrams illustrate the algorithm, each starting from the tree above. In the first, we delete the I (no children); in the second, we delete the R (one child); and in the third, we delete the M (two children).

After deleting I:

        A
       / \
      A   M
         / \
        E   R
       /   /
      C   N

After deleting R:

        A
       / \
      A   M
         / \
        E   N
       / \
      C   I

After deleting M:

        A
       / \
      A   E
         / \
        C   I
             \
              R
             /
            N

A priority queue is quite similar to a binary search tree, but one can only delete the smallest item and search for the smallest. These operations can be done in a guaranteed time proportional to the log of the number of items. One popular way to implement a priority queue is using a heap data structure. A heap uses a binary tree (that is, a tree in which each node has at most two children) and maintains the following two properties: every node is less than or equal to both of its children (nothing is said about the relative magnitude of the two children), and the resulting tree contains no holes. That is, all levels of the tree are completely filled, except the bottom level, which is filled in from the left to the right. You may want to stop for a moment to think about how you might make an efficient implementation of a priority queue.

The algorithm for insertion is not too difficult: put the new node at the bottom of the tree and then go up the tree, making exchanges with its parent, until the tree is valid. Consider inserting C into the following heap, which has been built by inserting A, M, E, R, I, C, A, N:

Before inserting C:

            A
          /   \
         I     A
        / \   / \
       N   M E   C
      /
     R

After inserting C (the new C is exchanged with N and then with I):

            A
          /   \
         C     A
        / \   / \
       I   M E   C
      / \
     R   N

The smallest value is always the root. To delete it (and one can only delete the smallest value), one replaces it with the bottom-most and right-most element, and then walks down the tree making exchanges with the smaller child in order to ensure that the tree is valid. The following pseudo-code formalizes this notion:

    b = bottom-most and right-most element
    p = root of tree
    p's key = b's key
    delete b
    loop while (p is larger than either child)
        exchange p with smaller child
        p = smaller child
    repeat

BUILDING A HEAP FROM A, M, E, R, I, C, A, N:

The heap after each insertion, with levels listed top to bottom and separated by "|":

    insert A:  A
    insert M:  A | M
    insert E:  A | M E
    insert R:  A | M E | R
    insert I:  A | I E | R M        (I exchanged with M)
    insert C:  A | I C | R M E      (C exchanged with E)
    insert A:  A | I A | R M E C    (A exchanged with C)
    insert N:  A | I A | N M E C | R  (N exchanged with R)
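The insertion and delete-smallest procedures can be sketched in Java with the heap stored level by level in a list, so that the children of position i sit at positions 2i+1 and 2i+2. This array layout is a common implementation choice and an assumption of the sketch, not something the text prescribes; the names `MinHeapDemo`, `insert`, `deleteMin`, and `levelOrder` are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MinHeapDemo {
    // Level-order storage: the root is at index 0, children of i are 2i+1 and 2i+2.
    private final List<Character> a = new ArrayList<>();

    int size() { return a.size(); }

    /** Puts the new key at the bottom, then exchanges it upward while it is smaller than its parent. */
    void insert(char key) {
        a.add(key);
        int i = a.size() - 1;
        while (i > 0 && a.get(i) < a.get((i - 1) / 2)) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    /** Removes and returns the smallest key (the root); assumes the heap is not empty. */
    char deleteMin() {
        char min = a.get(0);
        char last = a.remove(a.size() - 1);   // bottom-most, right-most element
        if (!a.isEmpty()) {
            a.set(0, last);
            int p = 0;
            while (true) {
                int left = 2 * p + 1, right = left + 1, smallest = p;
                if (left < a.size() && a.get(left) < a.get(smallest)) smallest = left;
                if (right < a.size() && a.get(right) < a.get(smallest)) smallest = right;
                if (smallest == p) break;      // p is not larger than either child
                swap(p, smallest);
                p = smallest;
            }
        }
        return min;
    }

    /** Keys in level order, top to bottom and left to right. */
    String levelOrder() {
        StringBuilder sb = new StringBuilder();
        for (char c : a) sb.append(c);
        return sb.toString();
    }

    private void swap(int i, int j) {
        Character t = a.get(i);
        a.set(i, a.get(j));
        a.set(j, t);
    }

    public static void main(String[] args) {
        MinHeapDemo h = new MinHeapDemo();
        for (char c : "AMERICAN".toCharArray()) h.insert(c);
        System.out.println(h.levelOrder());
        h.insert('C');
        System.out.println(h.levelOrder());
    }
}
```

Repeatedly calling `deleteMin` until the heap is empty returns the keys in ascending order, which is the idea behind heapsort.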

Sample Problems

Which of the following binary trees are valid binary search trees? (Empty nodes are not drawn.)

[Figure: candidate binary trees, labeled (a), (b), and so on]

Be careful! A binary tree is a tree where each node has at most 2 children. A binary search tree is a binary tree with the additional property that the letter of each node is greater than the value of its left child, and less than the value of its right child. The valid trees are: (a), (b), and (d).

If one traversed the following tree in preorder (visit the root, and then each of the subtrees from left to right), in what order would nodes be visited?

          A
        /   \
       B     C
     / | \    \
    D  E  F    G
         / \
        H   I

Answer: A B D E F H I C G

A common mistake is not to recursively visit all nodes in each subtree.
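A preorder traversal of a general tree (any number of children per node) can be sketched as follows. The class and method names are hypothetical, and the tree shape built in `sampleTree` is an assumption chosen to be consistent with the stated answer: B has children D, E, F; F has children H, I; and C has the single child G.

```java
import java.util.ArrayList;
import java.util.List;

class PreorderDemo {
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    /** Visits the root first, then each subtree in full, from left to right. */
    static void preorder(Node n, List<String> out) {
        out.add(n.label);
        for (Node child : n.children) {
            preorder(child, out);   // recurse into the whole subtree before moving on
        }
    }

    /** Assumed shape of the sample tree (see the note above). */
    static Node sampleTree() {
        Node f = new Node("F", new Node("H"), new Node("I"));
        Node b = new Node("B", new Node("D"), new Node("E"), f);
        Node c = new Node("C", new Node("G"));
        return new Node("A", b, c);
    }

    public static void main(String[] args) {
        List<String> order = new ArrayList<>();
        preorder(sampleTree(), order);
        System.out.println(String.join(" ", order));
    }
}
```

The recursive call inside the loop is what guards against the common mistake mentioned above: every node of a subtree is visited before the traversal moves to the next sibling.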

Challenge Questions:
1. How many nodes have only one child in the binary search tree for: DUKEUNIVERSITY
2. What is the depth of the binary search tree for the following: TELEPHONEWORKER
3. What is the internal path length of the binary search tree for: PHILADELPHIA

AP Computer Science
Programming Assignment No. 8: Prime Numbers

Specifications: The set of prime numbers consists of those integers greater than 1 that are evenly divisible only by 1 and themselves. For example, the following is a short list of primes: 2, 3, 5, 7, 11, 13, 17, 23, 29, 31. The set of prime numbers is known to be infinite (a result proved by Euclid). Prime numbers, especially large ones, are useful to computer scientists for their applications in cryptography.

In this project, we are going to test whether numbers are prime. A partial list of primes is stored in a file. The user is asked to query for a prime, and if the number is in the database, the computer will respond with an appropriate message. Otherwise, the number is tested for primality, and if the number is indeed prime, it is added to the list. This process should be repeated as long as the user wishes.

Input / Output: The integer list will come from a file whose name will be determined at run-time. The file will contain a number of lines, each of which contains a prime number. The list will end with a -1. The list will be in ascending order, although it will not necessarily be complete. The number of items is indeterminate, but you may assume a maximum of 100. A few sample input lines of one file could be as follows:

    2
    3
    5
    7
    11
    17
    31
    -1

After reading in the input file, I/O will be turned over to the console. The user will be asked to enter a number they wish to search for. After a few runs, the output screen could look like this:

    Enter data file: primes.in
    Reading database... done.
    Please enter a number (-1 terminates): 17
    17 is prime. # comparisons were used
    Please enter a number (-1 terminates): 33
    33 is NOT prime. # comparisons were used
    Please enter a number (-1 terminates): 29
    29 is not in the database, but it is prime - added to the list. # comparisons were used
    Please enter a number (-1 terminates): -1
    Have a nice day!

II. Design

Subdivide your program so that each method achieves naturally-encapsulated tasks. In particular, be sure that you have a separate method for linear search and for binary search. When running your program, you should have your program use binary search by default. If a new prime is added, it should be added to the end of the list, regardless of its order. A new prime that is numerically greater than the last item in the list will not change the ordering. If the new prime results in the list not being ordered, you should change all further searches to use linear search in place of binary search.

III. Implementation

For storing the numbers, you may choose between using either a partially-filled array of primitives (int) or an ArrayList of wrappers (Integer). The ordering (or non-ordering) of the data will determine which kind of search method can be used. If possible, the program will use the binary search algorithm as discussed in class. You may use either the looping method or the recursive method. If the set is not ordered, then the linear search must be used.

IV. Testing and Debugging

Test your work using appropriate test cases. Begin with a small set of test cases to be sure that your program works, then expand it to include at least 25 values. In order to measure the efficiency of your search, include a comparison counter that will keep track of the number of times the .compareTo method is called.

Answer the following:
- After running your binary search ten times, what is the average number of comparisons that need to be made?
- For binary search, what is the worst-case number of comparisons that need to be made for your data set size? How often does the worst case show up?
- After making a non-ordered list, does your search efficiency change accordingly?
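One possible shape for the search methods described in the Design and Testing sections is sketched below. It is not a reference solution: the class name `PrimeSearchDemo`, the static fields `comparisons` and `ordered`, the method `query`, and the choice of an ArrayList of Integer are all assumptions made for illustration. The counter is incremented once per call to `compareTo`, and `query` falls back to linear search once a new prime breaks the ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrimeSearchDemo {
    static int comparisons = 0;     // number of compareTo calls so far
    static boolean ordered = true;  // false once an out-of-order prime is appended

    /** Linear search; returns the index of target or -1. */
    static int linearSearch(List<Integer> list, Integer target) {
        for (int i = 0; i < list.size(); i++) {
            comparisons++;
            if (list.get(i).compareTo(target) == 0) return i;
        }
        return -1;
    }

    /** Looping binary search on an ascending list; returns the index of target or -1. */
    static int binarySearch(List<Integer> list, Integer target) {
        int lo = 0, hi = list.size() - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            comparisons++;
            int c = list.get(mid).compareTo(target);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1;   // middle item is too small: search the upper half
            else hi = mid - 1;         // middle item is too large: search the lower half
        }
        return -1;
    }

    /** Trial division up to the square root of n. */
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Returns true if n is prime, appending it to the list when it is a new prime. */
    static boolean query(List<Integer> primes, int n) {
        int index = ordered ? binarySearch(primes, n) : linearSearch(primes, n);
        if (index >= 0) return true;
        if (!isPrime(n)) return false;
        if (!primes.isEmpty() && n < primes.get(primes.size() - 1)) {
            ordered = false;           // appending n breaks the ascending order
        }
        primes.add(n);
        return true;
    }

    public static void main(String[] args) {
        List<Integer> primes = new ArrayList<>(Arrays.asList(2, 3, 5, 7, 11, 17, 31));
        for (int n : new int[] {17, 33, 29}) {
            comparisons = 0;
            System.out.println(n + ": prime=" + query(primes, n)
                    + ", comparisons=" + comparisons);
        }
    }
}
```

File reading and the console loop are left out on purpose so the sketch stays focused on the two searches and the comparison counter.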

AP Computer Science
Programming Assignment No. 8: Language Translator

Specifications: In this project, you are going to match Latin words (or words of another language) with their English equivalents. The list of words and definitions is stored in a file. The user is asked to query for a Latin word, and if the word is in the database, the computer will respond with the English definition. This process should be repeated as long as the user wishes.

Input / Output: Your input will come from two places. Initially, the input will come from a file whose name will be determined at run-time (a good example file might be called latinwords.in). The file will contain a number of lines, each of which contains: a string representing the English meaning (possibly multi-word), followed by a TAB, followed by the Latin word. If multiple Latin words are possible, only the first will be considered. The end of the file will be indicated with the English word *quit*. A few sample input lines of one file could be as follows:

    draw near       propinquo
    residence       domus
    to grow old     consenesco
    what is left    reliquum
    witty           lepidus
    *quit*

At the console, the user will be asked to enter a word they wish to search for. Each time the user types a valid string, it will be searched for as a Latin word, and if it is found, the English will be printed to the monitor. After a few runs, the output screen could look like this:

    Enter data file: latinwords.in
    Reading database... done.
    Please enter a Latin word (quit terminates): lepidus
    English: witty
    Please enter a Latin word (quit terminates): consenesco
    English: to grow old
    Please enter a Latin word (quit terminates): quit
    Have a nice day!

II. Design

Subdivide your program so that each method achieves naturally-encapsulated tasks. In particular, be sure that you have a separate method for linear search and for binary search. When running your program, you should have your program run a method isOrdered which will determine if the program uses binary search or linear search. (Note that this method should be run on the foreign list, not the English list.)

III. Implementation

For storing the Strings, you may choose between using either a pair of partially-filled arrays of Strings or a parallel set of ArrayLists. The ordering (or non-ordering) of the data will determine which kind of search method can be used. If possible, the program will use the binary search algorithm as discussed in class. You may use either the looping method or the recursive method. If the set is not ordered, then the linear search must be used.

IV. Testing and Debugging

Test your work using appropriate test cases. You will need to write at least two data files, one that is ordered and one that is not ordered. Begin with a small set of test cases to be sure that your program works, then expand it to include at least 25 values. In order to measure the efficiency of your search, include a comparison counter that will keep track of the number of times the .compareTo method is called.

Answer the following:
- After running your binary search ten times, what is the average number of comparisons that need to be made?
- For binary search, what is the worst-case number of comparisons that need to be made for your data set size? How often does the worst case show up?
- After making a non-ordered list, does your search efficiency change accordingly?
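The parallel-array design and the isOrdered check could be sketched as follows. This is an illustration under assumptions, not the required solution: the class name `TranslatorDemo`, the method signatures, and the use of a `count` parameter for the filled portion of each array are all choices made for this sketch. Position i of the Latin array is assumed to correspond to position i of the English array.

```java
class TranslatorDemo {
    /** True if the first count words are in ascending compareTo order. */
    static boolean isOrdered(String[] words, int count) {
        for (int i = 1; i < count; i++) {
            if (words[i - 1].compareTo(words[i]) > 0) return false;
        }
        return true;
    }

    /** Linear search over the first count words; returns the index or -1. */
    static int linearSearch(String[] words, int count, String target) {
        for (int i = 0; i < count; i++) {
            if (words[i].compareTo(target) == 0) return i;
        }
        return -1;
    }

    /** Looping binary search over the first count words (must be ordered); returns the index or -1. */
    static int binarySearch(String[] words, int count, String target) {
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int c = words[mid].compareTo(target);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    /** Returns the English meaning of a Latin word, or null if the word is not in the list. */
    static String translate(String[] latin, String[] english, int count, String word) {
        // The ordering check runs on the foreign (Latin) list, not the English list.
        int i = isOrdered(latin, count)
                ? binarySearch(latin, count, word)
                : linearSearch(latin, count, word);
        return i >= 0 ? english[i] : null;
    }

    public static void main(String[] args) {
        String[] latin = {"propinquo", "domus", "consenesco", "reliquum", "lepidus"};
        String[] english = {"draw near", "residence", "to grow old", "what is left", "witty"};
        System.out.println("English: " + translate(latin, english, latin.length, "lepidus"));
    }
}
```

A real program would call isOrdered once after loading the file rather than on every lookup; the check is repeated here only to keep the sketch short.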

AP Computer Science
Review Exercises: Analysis of Algorithms

1. Graph the following functions. Use the graph template below as a guideline (and only a guideline). Be sure to plot specific points and label your graph thoroughly.
a) O(n)
b) O(n^2)
c) n^2 / 2 - 100
d) O(lg n)
e) 2 * lg n

[Graph template: y-axis = # of run-time steps; x-axis = n, the # of input items, from 0 to 100]

2. Write the following in Sigma notation:
a. 9 + 10 + 11 + 12 + 13 + 14
b. 1 + n + n^2 + n^3 + n^4 + ... + n^n
c. 1 + ½ + 1/3 + ¼ + 1/5

3. Evaluate the sigma notation expressions: a. 19 i 11 1 b. c. d. 4 10 i 1 i 6 n i 1 i 2n ( i 1) i

4. True / False
a. A linear search of a list assumes that the list is in sorted order.
b. A binary search of a list assumes that the list is in sorted order.
c. A linear search is an O(n) algorithm.
d. A search on an ordered list cannot be made quicker than one on an unsorted list.

5. In deciding which search algorithm to use on a list, which of the following should not be a factor in your decision?
a. The length of the list to be searched
b. Whether the list contains numbers or characters
c. Whether or not the list is already in sorted order
d. The number of times the list is to be searched

6. Approximately how many comparisons are performed by a binary search of 1000 items if the search item is not in the list?
a. 1000
b. 500
c. 50
d. 10
e. 1