CSCI E-119 Section Notes, Section 11 Solutions

1. Double Hashing

Suppose we have a 7-element hash table, and we wish to insert the following words:

apple, cat, anvil, boy, bag, dog, cup, down

We use the hash functions:

h1(key) = index of the first letter of the word ("a" = 0, "b" = 1, ...)
h2(key) = length of the word (e.g., h2("apple") = 5)

Let's go through inserting the elements using double hashing and count the total length of the probes. If position h1(key) is occupied, we keep adding h2(key) to the position, wrapping around with % 7, until we reach an unoccupied position.

First we insert apple. h1("apple") = 0 is not occupied. Total probe length = 1:

0 apple | 1 _ | 2 _ | 3 _ | 4 _ | 5 _ | 6 _

Next we insert cat. h1("cat") = 2 is not occupied. Total probe length = 1 + 1 = 2:

0 apple | 1 _ | 2 cat | 3 _ | 4 _ | 5 _ | 6 _

Next we insert anvil. h1("anvil") = 0, which is occupied. h2("anvil") = 5. 0 + 5 = 5 is not occupied. Total probe length = 2 + 2 = 4:

0 apple | 1 _ | 2 cat | 3 _ | 4 _ | 5 anvil | 6 _

Next we insert boy. h1("boy") = 1, which is not occupied. Total probe length = 4 + 1 = 5:

0 apple | 1 boy | 2 cat | 3 _ | 4 _ | 5 anvil | 6 _

Next we insert bag. h1("bag") = 1, which is occupied. h2("bag") = 3. 1 + 3 = 4 is not occupied. Total probe length = 5 + 2 = 7:

0 apple | 1 boy | 2 cat | 3 _ | 4 bag | 5 anvil | 6 _

Next we insert dog. h1("dog") = 3, which is not occupied. Total probe length = 7 + 1 = 8:

0 apple | 1 boy | 2 cat | 3 dog | 4 bag | 5 anvil | 6 _

Next we insert cup. h1("cup") = 2, which is occupied. h2("cup") = 3.

2 + 3 = 5 is occupied.
2 + 2*3 = 8, and 8 % 7 = 1 is occupied.
2 + 3*3 = 11, and 11 % 7 = 4 is occupied.
2 + 4*3 = 14, and 14 % 7 = 0 is occupied.
2 + 5*3 = 17, and 17 % 7 = 3 is occupied.
2 + 6*3 = 20, and 20 % 7 = 6 is not occupied.

Total probe length = 8 + 7 = 15:

0 apple | 1 boy | 2 cat | 3 dog | 4 bag | 5 anvil | 6 cup

Finally, we cannot insert down because the table is full. h1("down") = 3 and h2("down") = 4, and all 7 positions in the probe sequence (3, 0, 4, 1, 5, 2, 6) are occupied. Total probe length = 15 + 7 = 22.

2. The probe() method in our HashTable class (REVISITED)

The return value of the probe() method is an integer. In some cases, it represents the index of the key that we're searching for. In other cases, it represents the index of the first empty or removed cell encountered during the search for the specified key.

0 aardvark
1 (removed)
2 cat
3 bear
4 (empty)
5 dog
6 (removed)

The hash table above has been partially filled using linear probing and the hash function h1 from problem 1. A gray cell, marked (removed) here, indicates that an item has been removed.

One of the items in the table has been inserted incorrectly. Which one, and how do you know?

dog is misplaced. Its hash code is 3, because it begins with d. Position 3 may have been filled when it was inserted, which explains why it wasn't put there. However, because position 4 is empty, it should have been inserted there, and it wasn't. Note that position 4 could not have been previously occupied, because it isn't gray.

For each of the keys below, determine:
i. the probe length
ii. the return value of the probe() method
Assume that none of these keys is actually inserted into the table, so each probe starts from the table as shown.

a. bear
h1("bear") = 1. Position 1 is a removed cell, so the probe() method takes note of that and continues probing. Position 2 is filled with a different key, so it moves on to position 3, which contains the key we are searching for. Thus, the method returns 3. Probe length = 3 (positions 1, 2, and 3).

b. cow
h1("cow") = 2. Position 2 is filled with a different key, so the probe() method moves on to position 3, which is also filled with a different key. Position 4 is empty, so the probe() method breaks out of the while loop and returns 4. Probe length = 3.

c. buffalo
h1("buffalo") = 1. Position 1 is a removed cell, so the probe() method takes note of that and continues probing. Position 2 is filled with a different key, so it moves on to position 3, which is also filled with a different key. Position 4 is empty, so the probe() method breaks out of the while loop. Because it encountered a removed cell (position 1), it returns that position, so that a newly inserted value could be put there. Return value = 1. Probe length = 4.

d. giraffe
h1("giraffe") = 6. Position 6 is a removed cell, so the probe() method takes note of that and moves on to position (6 + 1) % 7 = 0, which is filled with a different key, so it moves on to position 1. Position 1 is also a removed cell, but it is not the first one encountered, so the probe() method does not record its position, but moves on to position 2. Position 2 is filled with a different key, so it moves on to position 3, which is also filled with a different key. Position 4 is empty, so the probe() method breaks out of the while loop. It returns the position of the first removed cell encountered. Return value = 6. Probe length = 6.

What is the largest probe length that we could have for this table, regardless of its contents?

7, the length of the table. After 7 positions, the probe sequence repeats, so the probe() method will give up after trying 7 positions.
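The probe counts in problems 1 and 2 can be checked with a short sketch. This is a Python illustration, not the HashTable class from the course: the names EMPTY, REMOVED, probe, and insert are hypothetical, probe here returns the probe length alongside the index (the method described above returns only the index), and taking h1 modulo the table size is an assumption for letters beyond the table's length.

```python
EMPTY = None
REMOVED = object()  # hypothetical marker for a "gray" (removed) cell


def h1(key):
    """Index of the key's first letter: a = 0, b = 1, ..."""
    return ord(key[0]) - ord("a")


def probe(table, key, step=1):
    """Search for key; return (index, probe_length).

    index is the key's position if it is found. Otherwise it is the first
    removed cell seen, or else the empty cell that ended the search.
    It is -1 if every position was tried without finding any of these.
    step=1 gives linear probing; step=h2(key) gives double hashing.
    """
    size = len(table)
    i = h1(key) % size  # assumption: wrap h1 into the table
    first_removed = -1
    for num_probes in range(1, size + 1):  # give up after size probes
        cell = table[i]
        if cell is EMPTY:
            index = first_removed if first_removed != -1 else i
            return index, num_probes
        if cell is REMOVED:
            if first_removed == -1:
                first_removed = i
        elif cell == key:
            return i, num_probes
        i = (i + step) % size
    return first_removed, size


def insert(table, key):
    """Insert key using double hashing with h2(key) = len(key).

    Returns the number of positions probed, whether or not there was room.
    """
    index, num_probes = probe(table, key, step=len(key))
    if index != -1 and table[index] != key:
        table[index] = key
    return num_probes


# Problem 1: total probe length for the eight insertions.
words = ["apple", "cat", "anvil", "boy", "bag", "dog", "cup", "down"]
table1 = [EMPTY] * 7
total = sum(insert(table1, w) for w in words)
print(total)  # 22

# Problem 2: linear probing on the partially filled table.
table2 = ["aardvark", REMOVED, "cat", "bear", EMPTY, "dog", REMOVED]
for key in ["bear", "cow", "buffalo", "giraffe"]:
    print(key, probe(table2, key))  # (3, 3), (4, 3), (1, 4), (6, 6)
```

Because the loop is bounded by the table size, a full table costs exactly 7 probes, which is the largest probe length noted above.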

3. Comparing data structures

A local retailer wants to implement a simple in-memory database that can be used to access information about products. Although a snapshot of this database will be periodically copied to disk, the entire contents fit in memory, and your component of the application will operate only on data stored in memory. Here are the requirements specified by the retailer:

- She wants to be able to retrieve product records by specifying the name of the product.
- She wants to be able to specify the first n characters of a product name and to retrieve all records that begin with those characters.
- She wants the record retrieval to be as efficient as possible: on the order of 20 operations per retrieval, given a database of approximately one million records.
- She wants to be able to increase the size of the database, adding large sets of new records, without taking the system offline.

Given this list of requirements, which data structure would be the better choice for this application, a binary search tree or a hash table, or would these two data structures work equally well?

Let's consider each of the criteria in turn:

1) Retrieving product records by specifying the name of the product:
Search Tree: Assuming we use a balanced search tree, this takes O(log n). If the tree is unbalanced, this could take O(n).
Hash Table: This should take constant time as long as there are not too many collisions, but it could in theory be O(n) if the hash function doesn't work well or the table becomes too full.

2) Specifying the first n characters of a product name and retrieving all records that begin with these characters:
Search Tree: In the worst case, we have to go through the entire tree. If the tree is balanced, however, we should be able to prune much of the search space, because names that share a prefix are adjacent in sorted order. Worst case O(n). Best case is much better than O(n).
Hash Table: This is difficult, since we probably have to go through the entire table (depending on the hash function used). Most likely O(n).
3) Required time to retrieve:
Search Tree: One million is approximately 2^20, so a lookup in a balanced tree takes about log2(n) = 20 steps, which is within the specifications.
Hash Table: O(1), but could approach or exceed 20 if the hash function doesn't work well or the table becomes too full, that is, if there are many collisions.

4) Increasing the size of the database:
Search Tree: O(m log n) in the best case, where m is the number of records they want to add. O(mn) in the worst case. Can be done without taking the system offline.
Hash Table: Potentially O(m + n), because you may need to resize the hash table, and then copy the existing records and add the new ones, which takes O(m + n) steps. Additionally, this may require taking the system offline while the existing records are copied over to the new table.

Therefore, it seems that given the criteria, a search tree would work best, due to the ability to retrieve all records that begin with the first n characters of a product name without going through the entire tree, and the ability to add an arbitrary number of records without resizing or going offline. While the hash table has the potential for constant insertion and lookup time, this is not much better than O(log n) here, since log2(n) is only about 20 when n is one million.

4. Graph Terminology and Representation

Consider the highway graph from lecture:

[Figure: highway graph connecting Albany, Boston, Concord, New York, Portland, Portsmouth, Providence, and Worcester, with each edge labeled by its distance.]

What are Worcester's neighbors in the graph?

Albany, Boston, Concord, Portsmouth, and Providence, because it is connected to each of them by a single edge.

Is the graph connected? Why or why not?

Yes, because there is a path between every pair of vertices.

Is it complete? Why or why not?

No, because there isn't an edge between every pair of vertices. For example, there is no edge between Albany and Boston.

Is it acyclic? If not, what is one example of a cycle in the graph?

No. One example of a cycle is the path Worcester -> Boston -> Providence -> Worcester.

If we used an adjacency matrix to represent this graph, what would it look like? Assume that the vertices are numbered alphabetically: 0 = Albany, 1 = Boston, 2 = Concord, 3 = New York, 4 = Portland, 5 = Portsmouth, 6 = Providence, 7 = Worcester

       0    1    2    3    4    5    6    7
0      -    -    -    -    -    -    -  134
1      -    -   74    -    -   54   49   44
2      -   74    -    -   84    -    -   63
3      -    -    -    -    -    -  185    -
4      -    -   84    -    -   39    -    -
5      -   54    -    -   39    -    -   83
6      -   49    -  185    -    -    -   42
7    134   44   63    -    -   83   42    -

All of the empty cells (shown as -) would hold a special value indicating the absence of an edge.
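The adjacency-matrix representation and the terminology questions above can be sketched in a few lines of Python. The names CITIES, EDGES, build_matrix, neighbors, and is_complete are hypothetical, and None stands in for the special "no edge" value. The edge list is read off the answers above; the Worcester-Portsmouth distance of 83 is taken from the figure's labels and is an assumption.

```python
CITIES = ["Albany", "Boston", "Concord", "New York",
          "Portland", "Portsmouth", "Providence", "Worcester"]

# One (city, city, distance) triple per undirected edge.
EDGES = [
    ("Albany", "Worcester", 134),
    ("Boston", "Concord", 74),
    ("Boston", "Portsmouth", 54),
    ("Boston", "Providence", 49),
    ("Boston", "Worcester", 44),
    ("Concord", "Portland", 84),
    ("Concord", "Worcester", 63),
    ("New York", "Providence", 185),
    ("Portland", "Portsmouth", 39),
    ("Portsmouth", "Worcester", 83),  # assumption: weight from the figure
    ("Providence", "Worcester", 42),
]


def build_matrix(cities, edges):
    """Return an n x n matrix; None marks the absence of an edge."""
    index = {name: i for i, name in enumerate(cities)}
    n = len(cities)
    matrix = [[None] * n for _ in range(n)]
    for a, b, dist in edges:
        # The graph is undirected, so the matrix is symmetric.
        matrix[index[a]][index[b]] = dist
        matrix[index[b]][index[a]] = dist
    return matrix


def neighbors(matrix, cities, name):
    """Cities joined to name by a single edge, in vertex-number order."""
    row = matrix[cities.index(name)]
    return [cities[j] for j, dist in enumerate(row) if dist is not None]


def is_complete(matrix):
    """True if every pair of distinct vertices is joined by an edge."""
    n = len(matrix)
    return all(matrix[i][j] is not None
               for i in range(n) for j in range(n) if i != j)


matrix = build_matrix(CITIES, EDGES)
print(neighbors(matrix, CITIES, "Worcester"))
print(is_complete(matrix))  # False: e.g. no Albany-Boston edge
```

A matrix makes the "is there an edge?" test a single lookup, at the cost of storing all 64 cells even though only 22 are filled.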

5. Graph Traversals

Let's try some additional traversals on the highway graph from lecture.

a. What order would the cities be visited in if we performed a depth-first traversal from Boston, and what is the resulting spanning tree?

Order visited: Boston, Worcester, Providence, New York, Concord, Portland, Portsmouth, Albany.

Spanning tree (parent -> child, from the parent references set in the steps below):
Boston -> Worcester
Worcester -> Providence, Concord, Albany
Providence -> New York
Concord -> Portland
Portland -> Portsmouth

Steps:
1) dftrav(Boston, null): visit Boston, set its parent reference to null, and make a recursive call on the unvisited neighbor that is the smallest distance away (Worcester).
2) dftrav(Worcester, Boston): visit Worcester, set its parent reference to Boston, and make a recursive call on the unvisited neighbor that is the smallest distance away (Providence).
3) dftrav(Providence, Worcester): visit Providence, set its parent reference to Worcester, and make a recursive call on the unvisited neighbor that is the smallest distance away (New York).
4) dftrav(New York, Providence): visit New York, set its parent reference to Providence. It has no unvisited neighbors, so we return.
5) Providence has no other unvisited neighbors, so we return.
6) Worcester still has unvisited neighbors. Make a recursive call on the unvisited neighbor that is the smallest distance away (Concord).
7) dftrav(Concord, Worcester): visit Concord, set its parent reference to Worcester, and make a recursive call on the unvisited neighbor that is the smallest distance away (Portland).
8) dftrav(Portland, Concord): visit Portland, set its parent reference to Concord, and make a recursive call on the unvisited neighbor that is the smallest distance away (Portsmouth).
9) dftrav(Portsmouth, Portland): visit Portsmouth, set its parent reference to Portland. It has no unvisited neighbors, so we return.
10) Portland has no other unvisited neighbors, so we return.
11) Concord has no other unvisited neighbors, so we return.
12) Worcester still has one unvisited neighbor, Albany. Make a recursive call on it.
13) dftrav(Albany, Worcester): visit Albany, set its parent reference to Worcester. It has no unvisited neighbors, so we return.
14) Worcester has no other unvisited neighbors, so we return.
15) Boston has no other unvisited neighbors, so we return from the original invocation.
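The steps above can be sketched as a recursive Python function. This is an illustration, not the course's dftrav method: the names EDGES, build_graph, and df_trav are hypothetical, the graph is stored as a dictionary of neighbor distances, and the Worcester-Portsmouth distance of 83 is an assumption taken from the figure's labels. As in the steps, each call tries unvisited neighbors in order of increasing distance.

```python
EDGES = [
    ("Albany", "Worcester", 134),
    ("Boston", "Concord", 74),
    ("Boston", "Portsmouth", 54),
    ("Boston", "Providence", 49),
    ("Boston", "Worcester", 44),
    ("Concord", "Portland", 84),
    ("Concord", "Worcester", 63),
    ("New York", "Providence", 185),
    ("Portland", "Portsmouth", 39),
    ("Portsmouth", "Worcester", 83),  # assumption: weight from the figure
    ("Providence", "Worcester", 42),
]


def build_graph(edges):
    """Map each city to a dict of {neighbor: distance}."""
    graph = {}
    for a, b, dist in edges:
        graph.setdefault(a, {})[b] = dist
        graph.setdefault(b, {})[a] = dist
    return graph


def df_trav(graph, city, parent, parents, order):
    """Visit city, record its parent, then recurse on unvisited neighbors.

    parents doubles as the visited set: a city is visited exactly when
    it has an entry in parents.
    """
    parents[city] = parent
    order.append(city)
    # Try neighbors from the smallest distance to the largest.
    for neighbor in sorted(graph[city], key=graph[city].get):
        if neighbor not in parents:
            df_trav(graph, neighbor, city, parents, order)


graph = build_graph(EDGES)
parents, order = {}, []
df_trav(graph, "Boston", None, parents, order)
print(order)
print(parents)  # each city's parent in the spanning tree
```

The parents dictionary holds the spanning tree: following parent references from any city leads back to Boston.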

b. What order would the cities be visited in if we performed a breadth-first traversal from Boston, and what is the resulting spanning tree?

Order visited: Boston, Worcester, Providence, Portsmouth, Concord, Albany, New York, Portland.

Spanning tree (parent -> child, from the parent references set as cities are inserted):
Boston -> Worcester, Providence, Portsmouth, Concord
Worcester -> Albany
Providence -> New York
Portsmouth -> Portland

Evolution of the queue:

remove       insert                               contents
             Bos                                  Bos
Bos          Worc, Prov, Portsmouth, Conc         Worc, Prov, Portsmouth, Conc
Worc         Alb                                  Prov, Portsmouth, Conc, Alb
Prov         NY                                   Portsmouth, Conc, Alb, NY
Portsmouth   Portland                             Conc, Alb, NY, Portland
Conc         none (no unencountered neighbors)    Alb, NY, Portland
Alb          none                                 NY, Portland
NY           none                                 Portland
Portland     none                                 empty

Cities are marked as encountered before they are inserted in the queue, and their parent reference is set to the city that was just removed from the queue. Cities are visited upon removal from the queue.
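The queue evolution above can be reproduced with a short Python sketch. The names EDGES, build_graph, and bf_trav are hypothetical, and the Worcester-Portsmouth distance of 83 is an assumption taken from the figure's labels. Inserting neighbors in order of increasing distance is also an assumption; it is the order that matches the queue contents shown above.

```python
from collections import deque

EDGES = [
    ("Albany", "Worcester", 134),
    ("Boston", "Concord", 74),
    ("Boston", "Portsmouth", 54),
    ("Boston", "Providence", 49),
    ("Boston", "Worcester", 44),
    ("Concord", "Portland", 84),
    ("Concord", "Worcester", 63),
    ("New York", "Providence", 185),
    ("Portland", "Portsmouth", 39),
    ("Portsmouth", "Worcester", 83),  # assumption: weight from the figure
    ("Providence", "Worcester", 42),
]


def build_graph(edges):
    """Map each city to a dict of {neighbor: distance}."""
    graph = {}
    for a, b, dist in edges:
        graph.setdefault(a, {})[b] = dist
        graph.setdefault(b, {})[a] = dist
    return graph


def bf_trav(graph, origin):
    """Breadth-first traversal; returns (visit order, parent references)."""
    parents = {origin: None}  # a city is "encountered" once it has a parent entry
    order = []
    queue = deque([origin])
    while queue:
        city = queue.popleft()
        order.append(city)  # cities are visited upon removal from the queue
        # Assumption: insert neighbors from smallest distance to largest.
        for neighbor in sorted(graph[city], key=graph[city].get):
            if neighbor not in parents:
                parents[neighbor] = city  # mark as encountered before inserting
                queue.append(neighbor)
    return order, parents


order, parents = bf_trav(build_graph(EDGES), "Boston")
print(order)
print(parents)  # each city's parent in the spanning tree
```

Marking a city as encountered at insertion time, not at removal, is what keeps Portsmouth from being inserted a second time when Worcester is removed.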