CS 3114 Data Structures and Algorithms READ THIS NOW!

Similar documents
Solution READ THIS NOW! CS 3114 Data Structures and Algorithms

CS 3114 Data Structures and Algorithms Test 1 READ THIS NOW!

CS 3114 Data Structures and Algorithms READ THIS NOW!

Generic BST Interface

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring Instructions:

CS 3114 Data Structures and Algorithms DRAFT Project 2: BST Generic

Organizing Spatial Data

PR quadtree. public class prquadtree< T extends Compare2D<? super T> > {

CS 3114 Data Structures and Algorithms DRAFT Minor Project 3: PR Quadtree

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Summer I Instructions:

CS 2505 Computer Organization I

CS 2506 Computer Organization II Test 1. Do not start the test until instructed to do so! printed

CS 2604 Data Structures Midterm Summer 2000 U T. Do not start the test until instructed to do so!

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

CS 2506 Computer Organization II Test 2

CS 2506 Computer Organization II Test 1

Binary Tree Node Relationships. Binary Trees. Quick Application: Expression Trees. Traversals

Binary Trees. For example: Jargon: General Binary Trees. root node. level: internal node. edge. leaf node. Data Structures & File Management

CSE 332 Autumn 2013: Midterm Exam (closed book, closed notes, no calculators)

Second Examination Solution

Algorithms. Deleting from Red-Black Trees B-Trees

Advanced Java Concepts Unit 5: Trees. Notes and Exercises

CS 2505 Computer Organization I

CS 2505 Computer Organization I Test 1. Do not start the test until instructed to do so! printed

Chapter 5 Hashing. Introduction. Hashing. Hashing Functions. hashing performs basic operations, such as insertion,

CS 2505 Computer Organization I Test 1. Do not start the test until instructed to do so! printed

Trees can be used to store entire records from a database, serving as an in-memory representation of the collection of records in a file.

COSC 2007 Data Structures II Final Exam. Part 1: multiple choice (1 mark each, total 30 marks, circle the correct answer)

CS 2505 Computer Organization I Test 1. Do not start the test until instructed to do so!

CS 2505 Computer Organization I Test 2. Do not start the test until instructed to do so! printed

Data Structures in Java

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far

Section 1: True / False (1 point each, 15 pts total)

CS 106X Sample Final Exam #2

COMP : Trees. COMP20012 Trees 219

PR Quadtree Implementation

CSE 332 Spring 2013: Midterm Exam (closed book, closed notes, no calculators)

CSCE 2014 Final Exam Spring Version A

void insert( Type const & ) void push_front( Type const & )

Computer Science Foundation Exam

CS 2506 Computer Organization II

INSTITUTE OF AERONAUTICAL ENGINEERING

CS251-SE1. Midterm 2. Tuesday 11/1 8:00pm 9:00pm. There are 16 multiple-choice questions and 6 essay questions.

CS3114 (Fall 2013) PROGRAMMING ASSIGNMENT #2 Due Tuesday, October 11:00 PM for 100 points Due Monday, October 11:00 PM for 10 point bonus

CS 2506 Computer Organization II Test 2. Do not start the test until instructed to do so! printed

In-Memory Searching. Linear Search. Binary Search. Binary Search Tree. k-d Tree. Hashing. Hash Collisions. Collision Strategies.

University of Illinois at Urbana-Champaign Department of Computer Science. Second Examination

CSE373 Fall 2013, Second Midterm Examination November 15, 2013

Largest Online Community of VU Students

FINALTERM EXAMINATION Fall 2009 CS301- Data Structures Question No: 1 ( Marks: 1 ) - Please choose one The data of the problem is of 2GB and the hard

Binary Trees. BSTs. For example: Jargon: Data Structures & Algorithms. root node. level: internal node. edge.

University of Waterloo Department of Electrical and Computer Engineering ECE250 Algorithms and Data Structures Fall 2017

CSE 332 Winter 2015: Midterm Exam (closed book, closed notes, no calculators)

ECE 2035 A Programming HW/SW Systems Spring problems, 5 pages Exam Three 13 April Your Name (please print clearly)

- 1 - Handout #22S May 24, 2013 Practice Second Midterm Exam Solutions. CS106B Spring 2013

CS ) PROGRAMMING ASSIGNMENT 11:00 PM 11:00 PM

COS 226 Midterm Exam, Spring 2009

CS 3214 Computer Systems. Do not start the test until instructed to do so! printed

Tree: non-recursive definition. Trees, Binary Search Trees, and Heaps. Tree: recursive definition. Tree: example.

Summer Final Exam Review Session August 5, 2009

Midterm II Exam Principles of Imperative Computation Frank Pfenning. March 31, 2011

CS 315 Data Structures mid-term 2

COS 226 Algorithms and Data Structures Spring Midterm

University of Illinois at Urbana-Champaign Department of Computer Science. Second Examination

You must include this cover sheet. Either type up the assignment using theory5.tex, or print out this PDF.

CS 2504 Intro Computer Organization Test 1

STUDENT LESSON AB30 Binary Search Trees

AP Programming - Chapter 20 Lecture page 1 of 17

Section 05: Solutions

Do not start the test until instructed to do so!

University of Illinois at Urbana-Champaign Department of Computer Science. Second Examination

Computer Science 136 Spring 2004 Professor Bruce. Final Examination May 19, 2004

CS134 Spring 2005 Final Exam Mon. June. 20, 2005 Signature: Question # Out Of Marks Marker Total

COMP 250 Midterm #2 March 11 th 2013

XC Total Max Score Grader

HO #13 Fall 2015 Gary Chan. Hashing (N:12)

DATA STRUCTURES/UNIT 3

University of Waterloo Department of Electrical and Computer Engineering ECE250 Algorithms and Data Structures Fall 2014

Week 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.

CMSC 341 Lecture 10 Binary Search Trees

CS301 All Current Final Term Paper Subjective 2013 Solved with refernces

University of Illinois at Urbana-Champaign Department of Computer Science. Second Examination

a graph is a data structure made up of nodes in graph theory the links are normally called edges

CS 2505 Computer Organization I Test 1. Do not start the test until instructed to do so! printed

CS 2505 Computer Organization I Test 1. Do not start the test until instructed to do so! printed

Quiz 1 Solutions. Asymptotic growth [10 points] For each pair of functions f(n) and g(n) given below:

Binary Trees. For example: Jargon: General Binary Trees. root node. level: internal node. edge. leaf node. Data Structures & File Management

Final Exam in Algorithms and Data Structures 1 (1DL210)

Instructions. Definitions. Name: CMSC 341 Fall Question Points I. /12 II. /30 III. /10 IV. /12 V. /12 VI. /12 VII.

8. Binary Search Tree

Midterm 2 Exam Principles of Imperative Computation. Tuesday 31 st March, This exam is closed-book with one sheet of notes permitted.

ECE242 Data Structures and Algorithms Fall 2008

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

Section 05: Solutions

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

9/24/ Hash functions

If you took your exam home last time, I will still regrade it if you want.

On my honor I affirm that I have neither given nor received inappropriate aid in the completion of this exercise.

Multiple choice questions. Answer on Scantron Form. 4 points each (100 points) Which is NOT a reasonable conclusion to this sentence:

Transcription:

READ THIS NOW! Print your name in the space provided below. There are 7 short-answer questions, priced as marked. The maximum score is 100. This examination is closed book and closed notes, aside from the permitted one-page fact sheet. Your fact sheet may contain definitions and examples, but it may not contain questions and/or answers taken from old tests or homework. You may include examples from the course notes. No calculators or other computing devices may be used. The use of any such device will be interpreted as an indication that you are finished with the test and your test form will be collected immediately. Until solutions are posted, you may not discuss this examination with any student who has not taken it. Failure to adhere to any of these restrictions is an Honor Code violation. When you have finished, sign the pledge at the bottom of this page and turn in the test and your signed fact sheet. Name (Last, First) Solution printed Pledge: On my honor, I have neither given nor received unauthorized aid on this examination. signed A 1

xkcd.com 2

1. [18 points] Complete the implementations of the public function and private helper function described below, which are intended to return the number of left child nodes in the tree; i.e., the number of nodes that are left children of their parent node. The only relevant declarations from the BST implementation are that it has a member named root and an inner BinaryNode class, with BinaryNode (reference) members left and right, and a member element of generic type. You may not invoke other tree methods. You may not modify the given function interfaces in any way! // Indicate any data members you'd add to the BST class here: private int nleftchildren; Since the helper function is void, the problem cannot be solved without adding a data member to use as the counter. Of course, the counter should be private. /** Reports the number of left child nodes in the BST. * * Returns: * the number of left child nodes in the BST */ public int numlefties( ) { // Write body of the public function here: nleftchildren = 0; The counter needs to be reset to 0 for each call. numleftieshelper( root ); return nleftchildren; private void numleftieshelper( BinaryNode sroot ) { if ( sroot == null ) return; if ( sroot.left!= null ) { // nothing of interest here, or below // if have a left child, nleftchildren++; // count it numleftieshelper( sroot.left ); // examine its subtree numleftieshelper( sroot.right ); // examine right subtree, if any In the helper function, there are a few points of interest: it's absolutely necessary to have a return in the null test (if it's at the beginning), or to otherwise skip the rest of the body since you have to check for a non-null left pointer, the recursive call to the left side should be avoided if the left pointer is null (without a repeated check for that) 3

2. The design discussion for a PR quadtree implemention resulted in the suggestion that making leaf nodes buckets, capable of holding more than one data object might improve the performance of some operations on the tree. a) [6 points] What would (allegedly) be gained by using a bucketed leaf nodes? Be complete and precise. Using buckets will reduce the length of some branches in the tree, since a leaf will not split until we find enough data points in its region to overflow the bucket. This will help with search cost, and therefore with insertion and deletion as well. I expected an answer to mention that some branches would be shorter, AND that therefore insert/search/delete would all be more efficient (on those branches). Haskell Hoo V decided to follow this suggestion. In doing so, he decided that he would have each leaf node contain its own PR quadtree, using bucket size 1. Those PR quadtrees would then be used as the buckets. b) [8 points] How should Haskell Hoo V have set the world boundaries for each of these PR quadtree buckets? Since the PR quadtree bucket represents the region corresponding to the leaf node that contains it, the "world" for these PR quadtrees will be determined by the region boundaries of the leaf nodes that contain them. c) [6 points] Describe the effect of Haskell Hoo V's approach on the performance of operations in his PR quadtree (versus simply having each leaf node store a single data object). Well, each of these PR quadtrees will mimic the subtree that we would have had if we'd simply used a bucket size of 1 in the original tree. There's no advantage whatsoever. In fact, it seems there's a slight disadvantage since we have to absorb a little more storage (root pointer for each of these trees), and a little more coding (to switch from our normal traversal pattern to accessing the root of the new quadtree). However... consider how the tree would be built by a sequence of insertions. When the first point is inserted, we would create a leaf node, containing a PR quadtree that contained the point... when a second point is inserted, it will go into that PR quadtree in the usual manner (for a PR quadtree with a bucket size of 1)... and from there onwards we will just be modifying that PR quadtree... so Haskell seems to have missed the point entirely. In short, this is a profoundly stupid idea... missing the point of having buckets entirely. I expected an answer to address the actual effect on the tree with some precision. 4

3. [12 points] According to Samet's paper on spatial data structures, the maximum number of levels in a PR quadtree can be as high as 2S log D where S is the length of the side of the "world" and D is the smallest distance between any pair of points stored in the quadtree. So, for example, if the world is 1024 x 1024 and there's a pair of points in the tree that are a distance of 4 apart, the number of levels in the tree could be as large as 9. Explain why, in the situation described above, it is possible that there could be much fewer than 9 levels in the tree. Be very precise. An answer like "well, that's just an upper bound" will not receive any credit. (A diagram might be very useful in answering this question.) Simply put, we get the worst case only if the two closest data points are not naturally separated by a region boundary that resulted from an earlier split. For example, if we have two points in the NE and SW quadrants of the world, they can be arbitrarily close to the world center and never require more than one splitting to separate them: +--------------------+ B ---------- --------- A +--------------------+ An answer needed to explain (or show) how it's possible to have a very small value for D (i.e., a pair of data points that are very close together) and still have a relatively shallow quadtree. There were some interesting, false assertions in the answers: there do not have to be a lot of insertions to (potentially) create a very stalky branch; two insertions will achieve that with the right coordinates it is, therefore, irrelevant how many leaf nodes are in the tree, or how many data points have been inserted (aside from needing 2) 5

4. [12 points] Suppose a hash table, using linear probing to resolve collisions, is initially empty, and records S1, S2,, and SM are inserted into the table, in that order, that all of those records have different key values, and that all of those records map to different home slots. After that, records R1, R2,, and RN are inserted into the table, in that order, all of those records have different key values, and all of those records map to the same home slot, which is not the home slot of any of the Sk inserted earlier. The size of the hash table is much larger than M + N. What is the maximum number of record comparisons that could be performed in a search for the record RN? Justify your answer. The S-records all map to different home slots, so they are inserted without any collisions, and wind up filling M slots in the table. When R1 is inserted, it goes into its home slot, without any collisions. When R2 is inserted, it collides with R1, so we probe. Now, this could result in R2 colliding with every one of the S-records, but we could not probe linearly and reach R1 again. So, the worst case for R2 is that we get M + 1 collisions when we insert it, and look at M + 2 slots. If that happens, then inserting R3 will require looking at M + 3 slots (R1, every S-record, and R2)...... and so forth... So, the worst case for RN would be colliding with R1, every S-record, and R2 through RN-1, which would amount to M + N 1 collisions, and looking at M + N slots. In that case, a search for RN would require M + N comparisons. A complete answer will explain HOW it's possible to achieve the alleged maximum, not just state what the maximum is. 5. [10 points] A developer is designing a generic hash table, planning to use chaining rather than probing to resolve collisions, and to implement her design in Java. Her supervisor suggests that she should use a BST in each table slot (instead of a simple linked list), because that would be likely to improve the cost of searches that reached a slot holding more than one data element. She considers the suggestion, and responds by saying that using BSTs in this manner may not be possible. Is she correct? Explain. Yes. To insert an object into the hash table, that object must supply a hash method. Since every class in Java has an inherited hash method (even if it may not be very good), every Java object can be inserted into a hash table. But, to insert an object into a BST, the object must implement compareto(), or something equivalent, and that is not something that every Java object will do. Pretty straightforward... the need for Comparable is the ONLY issue here. 6

6. Double hashing resolves collisions by probing, using a second hash function to choose the probe slots. Specifically, if the key K experiences a collision in its home slot h, we use the following formula to compute indices for probe slots: Slot( i) ( h i G( K))% N where G() is the second hash function and N is the size of the table. Assume that G() is actually 1-1 and nonzero on the set of possible key values. a) [8 points] Explain why double hashing, when performing a single insertion, can be viewed as a variation of linear probing. With double hashing, we take a step of fixed length, G(K), at each probing step. With linear probing, we do exactly that, except the step length is always 1. With double hashing, the step length is likely to be different for different keys. A lot of answers basically said that all probing strategies have certain things in common, which is true but not relevant to this specific question. b) [8 points] Linear probing is guaranteed to find an empty table slot, if there is one. Explain why, even given the assumptions above about G(), this scheme may not find an empty slot, even if there is one. Naturally, if G(K) == 0, we'd just hop up and down in the home slot... but the given assumptions rule this out. But, if we have a key such that G(K) > 1 and G(K) divides the size of the table, N, then we might cycle back to the home slot after examining only a few of the table slots. In fact, it's worse than that. G(K) could be larger than N, but G(K) % N could be a divisor of N. Or, G(K) could even be a multiple of N, which would yield the "hopping in place" behavior described above. So, it's important that the table size be prime, but even then we may have problems. A good answer would address one or more specific ways that this scheme could fail, not just assert it was possible. 7

7. [12 points] Consider the BST deletion code below. There is no requirement for a node pool in this implementation, and there are no duplicate values in the tree. public void delete(t elem) { if ( elem == null ) return; root = deletehelper(elem, root); private BinaryNode deletehelper(t x, BinaryNode sroot) { if ( sroot == null ) return null; int compare = x.compareto(sroot.element); if ( compare < 0 ) { sroot.left = deletehelper(x, sroot.left); else if ( compare > 0 ) { sroot.right = deletehelper(x, sroot.right); // found the node to delete else { // handle leaf case if ( sroot.left == null && sroot.right == null ) ) { return null; // handle single-child case else if (sroot.left == null) { return sroot.right; else if (sroot.right == null) { return sroot.left; // handle two-child case else { BinaryNode rmin = findsubtreemin(sroot.right); deletehelper( rmin.element, sroot); rmin.left = sroot.left; rmin.right = sroot.right; return rmin; return node; The deletion implementation works correctly, except for one logic error in the handling of the two-child case. The helper function findsubtreemin() is implemented correctly, so it is not the problem. Describe the logic error in the given code, and draw a tree that would trigger that error. This would fail when rmin points to the right child of sroot, if I had remembered to put sroot.right here. In that case, the right child pointer in rmin actually ends by pointing to rmin. 8