Data Structure and Algorithm Midterm Reference Solution


TA email: dsa1@csie.ntu.edu.tw

Problem 1. To prove log2 n! = Θ(n log n), it suffices to show that there exist N ∈ ℕ and c1, c2 > 0 such that

    c1 · n ln n ≤ ln n! ≤ c2 · n ln n                                  (1)

for all n ≥ N, since constant factors (including the change of logarithm base) can be ignored when considering asymptotic complexity.

For the second inequality: since

    ln n! = Σ_{k=1}^{n} ln k ≤ Σ_{k=1}^{n} ln n = n ln n               (2)

for n ≥ 1, we can take c2 = 1, and the constraint N ≥ 1 is trivial.

For the trickier first inequality, calculus comes in handy! The sum ln n! is an upper sum for the area under the curve y = ln x, i.e.

    ln n! = Σ_{k=1}^{n} ln k ≥ ∫_1^n ln x dx = (x ln x − x)|_1^n = n ln n − n + 1    (3)

for n ≥ 1, with the help of integration by parts. Now we want n ln n − n + 1 ≥ c1 · n ln n, or equivalently (1 − c1) n ln n ≥ n − 1, for all n ≥ N. If we take c1 = 0.5 (actually any 0 < c1 < 1 is fine; note that different c1 result in different N0 below), then what we want becomes n ln n ≥ 2n − 2 for n ≥ N. We already know (and are familiar with the fact) that n ln n = ω(n), so there must be some N0 ∈ ℕ such that n ln n ≥ 2n − 2 for all n ≥ N0. The formal argument is not shown here. To sum up, take c1 = 0.5, c2 = 1, N = N0; together with (2) and (3) we obtain (1), which completes the proof.

Remark 1. A simpler way to prove log n! = Ω(n log n) (the first inequality):

    log n! = log n + log(n−1) + ... + log(n/2) + ... + log 1
           ≥ log n + log(n−1) + ... + log(n/2)
           ≥ log(n/2) + log(n/2) + ... + log(n/2)        (n/2 terms)
           = (n/2) log(n/2) = Ω(n log n).

Remark 2. This problem can be solved easily with Stirling's formula, which this margin is too narrow to contain.

Problem 2.

(a) Code provided as follows:

    int is_linked_list_with_loop(struct list_node *head) {
        struct list_node *one_step = head;
        struct list_node *two_steps = head;
        while (one_step != NULL && two_steps != NULL) {
            one_step = one_step->next;
            two_steps = two_steps->next;
            if (two_steps == NULL) return 0;
            two_steps = two_steps->next;
            if (one_step != NULL && one_step == two_steps)
                return 1;
        }
        return 0;
    }

(b) Suppose the linked list has no loop. Then one_step and two_steps each traverse at most O(N) nodes before reaching a NULL pointer. Now suppose the list has a loop; let C be the size of the loop and N the total number of nodes. In each iteration the gap (in steps) between one_step and two_steps grows by one, so after K iterations the gap is exactly K. After at most N − C iterations both pointers are inside the loop, and once the gap is a multiple of C they point to the same node. At most C more iterations are needed for the gap to reach a multiple of C, so the total time complexity is O(N − C) + O(C) = O(N).

(c) We modify the code of (a) to return more information, namely the number of steps taken by one_step. For convenience we encode this count by multiplying it by two; the least significant bit then still tells us whether the linked list contains a loop.

    int is_linked_list_with_loop(struct list_node *head) {
        struct list_node *one_step = head;
        struct list_node *two_steps = head;
        int step = 0;
        while (one_step != NULL && two_steps != NULL) {
            step = step + 1;
            one_step = one_step->next;
            two_steps = two_steps->next;
            if (two_steps == NULL) {
                while (one_step != NULL) {
                    step = step + 1;
                    one_step = one_step->next;
                }
                return 0 + step * 2;
            }
            two_steps = two_steps->next;
            if (one_step != NULL && one_step == two_steps)
                return 1 + step * 2;
        }
        return 0 + step * 2;
    }

    int chain_length(struct list_node *head) {
        int result = is_linked_list_with_loop(head);
        if ((result & 1) == 0)
            return result / 2;
        int step_of_one = result / 2;
        struct list_node *one_step = head;
        struct list_node *two_step = head;
        for (int i = 0; i < step_of_one; i++)
            two_step = two_step->next;
        for (int i = 0; i < step_of_one; i++) {
            if (one_step == two_step)
                return i;
            one_step = one_step->next;
            two_step = two_step->next;
        }
        return -1; /* this should not happen */
    }

Problem 3. Space-efficient doubly linked list.

(a) (6%)

    first_node->xor_pointer  = (int)NULL ^ (int)(&next_node);
    middle_node->xor_pointer = (int)(&prev_node) ^ (int)(&next_node);
    last_node->xor_pointer   = (int)(&prev_node) ^ (int)NULL;

Note that XORing with (int)NULL can be omitted, since a bit remains the same after being XORed with 0. Each node's xor_pointer stores the XOR of its previous node's address and its next node's address. During traversal, in either direction, XORing xor_pointer with the address of the previously visited node yields the address of the next node.

(b) (9%)

    int number_of_nodes(struct db_list_node *head) {
        struct db_list_node *prev, *next;
        prev = NULL;
        int count = 0;
        while (head != NULL) {
            next = (struct db_list_node *)((int)head->xor_pointer ^ (int)prev);
            prev = head;
            head = next;
            count++;
        }
        return count;
    }

Note that int should be replaced by long long int if the computer architecture is 64-bit instead of 32-bit; better yet, use the uintptr_t type, which is guaranteed to be wide enough to hold a pointer, to avoid the problem altogether.

Problem 4.

(a) (5%) Because insertion sort runs in O(n^2) worst-case time for sorting n elements, sorting a sequence of k elements takes O(k^2) worst-case time. For sorting n/k sublists, the total running time is O((n/k) * k^2) = O(nk).

(b) (10%) By part (a), the insertion-sort phase (the bottom of the recursion tree) takes O(nk). Merging all elements from one level to the next higher level of the merge-sort tree takes O(n) (because each level contains n elements in total). Since there are n/k sublists, the tree has log2(n/k) levels, so the merge-sort phase takes O(n log(n/k)), and the time complexity of the hybrid algorithm is O(nk + n log(n/k)).

(c) (10%) Our goal is to find the largest k for which insertion sort still runs faster than merge sort. For small k, sorting k elements by insertion sort is faster than sorting them by merge sort; as k grows past some threshold, insertion sort becomes the slower of the two. In practice, we can choose a platform and compare the running times of merge sort and insertion sort for various values of k (e.g. k from 2 to n). The optimal value of k is the largest list length on which insertion sort is faster than merge sort.

Problem 5.

(a) (5%) (1), (2), (4): valid; (3), (5): invalid. For each list, split it (preserving order) into two sublists: one with the numbers smaller than 363 and one with the numbers bigger than 363. The smaller list must be sorted ascending and the bigger list descending. For example, in case (1) the bigger list is 401, 398, 397, which is sorted descending, and the smaller list is 2, 252, 330, 344, which is sorted ascending; therefore case (1) is valid. The same method shows that cases (2) and (4) are valid, too. In case (3), the bigger list is 925, 911, 912, which is not sorted descending, so it is invalid. In case (5), the smaller list is 278, 347, 299, 358, which is not sorted ascending, so it is invalid.

(b) (5%) Given that x has two children, x's successor must lie in x's right subtree. Since x has a right child, at least one node is greater than x. Any node outside x's right subtree that is greater than x is also greater than every node in x's right subtree, so no such node can lie between x and the smallest node of x's right subtree; hence the smallest node in x's right subtree is x's successor. If x's successor had a left child, that child would be greater than x (it is in x's right subtree) and smaller than x's successor, contradicting the fact that no node lies between x and x's successor; so x's successor has no left child. Similarly, the largest node in x's left subtree is x's predecessor, and if it had a right child the same contradiction would arise.

(c) (10%)

    struct bst_node *bst_successor(struct bst_node *x) {
        struct bst_node *successor = NULL;
        if (x->right != NULL) {
            successor = x->right;
            while (successor->left != NULL) {
                successor = successor->left;
            }
        } else {
            while (x->p != NULL) {
                if (x->p->data > x->data) {
                    successor = x->p;
                    break;
                }
                x = x->p;
            }
        }
        return successor;
    }

Problem 6. String compression. First of all, the string AB (the direct concatenation of A and B) is a valid choice of S, so the answer is no more than |A| + |B| (where |S| denotes the length of string S); let's focus on finding a shorter S. If |S| = x < |A| + |B|, then A and B must overlap inside S:

    A: A[1] ... A[x-|B|+1] ... A[|A|]
    S: S[1] ... S[x-|B|+1] ... S[|A|] ... S[x]
    B:          B[1]       ... B[|A|+|B|-x] ... B[|B|]

That is, some suffix of A must equal a prefix of B. Conversely, if the suffix of A with length y equals the prefix of B with length y, then one can construct a valid S with length |A| + |B| - y. The task thus becomes finding the longest suffix of A that is also a prefix of B.

One can get 30% of the score by enumerating every suffix of A and directly comparing it with the prefix of B of the same length. To get 100%, one possible way is to use the Rabin-Karp algorithm to compare each pair of A-suffix and B-prefix in constant time, which brings the total time complexity to O(|A| + |B|).

Another solution uses the prefix function from the Knuth-Morris-Pratt algorithm. For any string C, the prefix function value π[|C|] is the largest k < |C| such that the suffix of C with length k is also a prefix of C. This looks similar to our task, but we have two strings here; how do we handle that? Concatenate them! Let C = B$A, where $ is a character that is not a lowercase English letter. Because of the separator character $, π[|C|] cannot exceed min{|A|, |B|}, so it exactly matches a suffix of A against a prefix of B. This gives another solution with time complexity O(|A| + |B|).

Problem 7. Post-order Sequence Verification.

Overview. Basically, any sequence of numbers could be the post-order sequence of some binary tree. However, a sequence must possess certain characteristics to be the post-order sequence of a binary search tree. Post-order traversal outputs the content of a binary tree in the following order: (1) the left sub-tree, (2) the right sub-tree, and (3) the root. Now consider the characteristics of a binary search tree: the left sub-tree contains only nodes smaller than the root's value, and the right sub-tree contains only nodes larger than the root's value. Therefore, in a post-order sequence, if we take the last number as the root's value, the rest of the sequence must split into two parts: a first part in which every number is smaller than the root (the left sub-tree), followed by a second part in which every number is greater than the root (the right sub-tree).

Algorithm. Based on the observation and deduction above, the verification proceeds as follows: take the last number as the root's value; try to split the rest of the sequence into the two parts described; if successful, recursively check the two sub-sequences; otherwise, it is not a valid post-order sequence.

Sample Code. The following pseudocode implements the algorithm above. The function check() examines the sub-sequence of the given integer array sequence from start to end, both inclusive. It returns the height of the tree formed by the sub-sequence if one exists, and -1 otherwise. The height of the tree formed by the current sub-sequence is 1 plus the height of its higher sub-tree, either left or right.

    function check(sequence, start, end):
        # the terminal condition: empty sub-sequence, height 0
        if start > end:
            return 0
        # get the value of the root
        root_value = sequence[end]
        # find the position of separation: first element greater than the root
        separate_index = end
        for i = start to end:
            if sequence[i] > root_value:
                separate_index = i
                break
        # everything after the separation point must be greater than the root
        for i = separate_index to end - 1:
            if sequence[i] < root_value:
                return -1  # not separable
        # recursive checking
        left_sub = check(sequence, start, separate_index - 1)
        right_sub = check(sequence, separate_index, end - 1)
        if left_sub == -1 or right_sub == -1:
            return -1
        else:
            return max(left_sub, right_sub) + 1