EECS 3101, York University
Instructor: Andy Mirzaian

DYNAMIC PROGRAMMING: OPTIMAL STATIC BINARY SEARCH TREES

This lecture note describes an application of the dynamic programming paradigm to computing the optimal static binary search tree of $n$ keys, so as to minimize the expected search time for a given search probability distribution.

Suppose we are given a collection of $n$ keys $K_1 < K_2 < \dots < K_n$ which are to be stored in a binary search tree. After the tree has been constructed, only search operations will be performed, i.e. there will be no insertions or deletions. We are also given a probability density function $P$, where $P(i)$ is the probability of searching for key $K_i$. There are many different binary search trees in which the given keys can be stored. For a particular tree $T$ with these keys, the average number of comparisons to find a key, for the given probability density, is

\[ \sum_{i=1}^{n} P(i)\,\bigl(\mathrm{depth}_T(K_i) + 1\bigr), \]

where $\mathrm{depth}_T(K_i)$ denotes the depth of the node where $K_i$ is stored in $T$. The problem we would like to solve is to find, among all the possible binary search trees that contain the $n$ keys, one which minimizes this quantity. Such a tree is called an optimal (static) binary search tree. Note that there may be several optimal binary search trees for the given density function. This is why we speak of an optimal, rather than the optimum, binary search tree.

A simple way to accomplish this is to try out all possible binary trees with $n$ nodes, computing the average number of comparisons to find a key in each tree considered, and selecting a tree with the minimum average. Unfortunately, this simple strategy is ridiculously inefficient because there are too many trees to try out. In particular, there are $\binom{2n}{n}/(n+1)$ different binary trees with $n$ nodes (if interested in the derivation of this formula, see Knuth, vol. I, pp. 388-389). Thus, if there are 20 keys, we have to try out 6,564,120,420 different trees. Computing the average number of comparisons in each, at the rather astonishing speed of 1 µsec per tree, will still take almost two hours of computing to find an optimal binary search tree (for just 20 keys), and since the number of trees nearly quadruples with every additional key, 30 keys would already take more than a century!

Fortunately, there is a much more efficient, if less straightforward, way to find an optimal binary search tree. Let $T$ be a binary search tree that contains $K_i, K_{i+1}, \dots, K_j$ for some $1 \le i \le j \le n$. We shall see shortly why it is useful to consider trees that contain subsets of successive keys. We define the cost of $T$ as

\[ c(T) = \sum_{l=i}^{j} P(l)\,\bigl(\mathrm{depth}_T(K_l) + 1\bigr). \]

Hence, if $T$ contains all $n$ keys (i.e. $i = 1$ and $j = n$), the cost of $T$ is precisely the expected number of comparisons to find a key for the given density function. (This is not so if $T$ is missing some of the keys, because in that case the probabilities of the keys that are in $T$ do not sum to 1; that is, $P$ is not a proper density function relative to the set of keys in the tree.) Thus, we can rephrase our problem as follows: given a density function for the keys, find a minimum cost tree with $n$ nodes.

Before giving the algorithm to find an optimal binary search tree, we prove two key facts.
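To make the size of the brute-force search space concrete, the following minimal Python sketch evaluates the tree-count formula $\binom{2n}{n}/(n+1)$ quoted above and converts it to running time. The function names and the default of 1 µsec per tree are illustrative assumptions, not part of the notes.

```python
from math import comb


def count_binary_trees(n: int) -> int:
    """Number of distinct binary trees with n nodes: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def brute_force_seconds(n: int, usec_per_tree: float = 1.0) -> float:
    """Seconds needed to examine every tree at the assumed speed per tree."""
    return count_binary_trees(n) * usec_per_tree / 1_000_000


if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, count_binary_trees(n), brute_force_seconds(n))
```

Comparing these counts with $n(n+1)/2$, the number of subproblems the dynamic programming approach examines, shows why the inductive construction developed next is worthwhile.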

Lemma 1: Let $T$ be a binary search tree containing keys $K_i, K_{i+1}, \dots, K_j$, and let $T_L$ and $T_R$ be the left and right subtrees of $T$ respectively. Then

\[ c(T) = c(T_L) + c(T_R) + \sum_{l=i}^{j} P(l). \]

Proof: This is an easy consequence of the definition of the cost of a tree. You should prove it on your own.

Lemma 2: Let $T$ be a binary search tree that has minimum cost among all trees containing keys $K_i, K_{i+1}, \dots, K_j$, and let $K_m$ be the key at the root of $T$ (so $i \le m \le j$). Then $T_L$, the left subtree of $T$, is a binary search tree that has minimum cost among all trees containing keys $K_i, \dots, K_{m-1}$; and $T_R$, the right subtree of $T$, is a binary search tree that has minimum cost among all trees containing keys $K_{m+1}, \dots, K_j$.

Proof: We prove the contrapositive. That is, if either the left or the right subtree of $T$ fails to satisfy the minimality property asserted in the lemma, we show that $T$ does not really have the minimum possible cost among all trees that contain $K_i, \dots, K_j$. Let $T'_L$ and $T'_R$ be minimum cost binary search trees that contain $K_i, \dots, K_{m-1}$ and $K_{m+1}, \dots, K_j$ respectively. Thus,

\[ c(T'_L) \le c(T_L) \quad\text{and}\quad c(T'_R) \le c(T_R). \qquad (*) \]

Further, let $T'$ be the tree with key $K_m$ in the root, and left and right subtrees $T'_L$ and $T'_R$ respectively. Evidently, $T'$ is a binary search tree that contains keys $K_i, \dots, K_j$. If $T_L$ or $T_R$ does not have the minimality property asserted by the lemma, it must be that either $c(T_L) > c(T'_L)$ or $c(T_R) > c(T'_R)$. This, together with (*), implies that $c(T_L) + c(T_R) > c(T'_L) + c(T'_R)$. From this and Lemma 1 we have

\[ c(T) = c(T_L) + c(T_R) + \sum_{l=i}^{j} P(l) \;>\; c(T'_L) + c(T'_R) + \sum_{l=i}^{j} P(l) = c(T'). \]

Thus, $c(T) > c(T')$ and $T$ is not a minimum cost binary search tree among all trees that contain keys $K_i, \dots, K_j$.

Computing an Optimal Binary Search Tree

Lemma 2 is the basis of an efficient algorithm to find an optimal binary search tree. Let $T_{ij}$ denote an optimal binary search tree containing keys $K_i, K_{i+1}, \dots, K_j$. Then $T_{1n}$ is precisely the optimal binary search tree that we want to construct. Lemma 2 says that $T_{ij}$ must be of the following form:

              K_m
             /    \
     T_{i,m-1}    T_{m+1,j}

That is, its root has key $K_m$ for some $m$, $i \le m \le j$, and its subtrees are $T_{i,m-1}$ and $T_{m+1,j}$, i.e. minimum cost subtrees containing keys $K_i, \dots, K_{m-1}$ and $K_{m+1}, \dots, K_j$ respectively. But $T_{i,m-1}$ and $T_{m+1,j}$ are "smaller" trees than $T_{ij}$. This suggests proceeding inductively, starting with small minimum cost trees (each containing just one key) and progressively building larger and larger minimum cost trees, until we have a minimum cost tree with $n$ nodes, which is what we are looking for.

More specifically, we start the induction with $n$ minimum cost trees, each of which contains exactly one key, and proceed by constructing minimum cost trees with $2, 3, \dots, n$ successive keys. Note that there are exactly $n - d + 1$ groups of $d$ successive keys for each $d = 1, \dots, n$. Thus, instead of considering all possible trees with $n$ nodes we consider only $n$ (minimum cost) trees with 1 node each, $n - 1$ (minimum cost) trees with 2 nodes each, ..., 1 minimum cost tree with $n$ nodes, i.e. a total of $n(n+1)/2$ trees, much fewer than $\binom{2n}{n}/(n+1)$.

So now the question is how to compute these trees $T_{ij}$ inductively. The basis of the induction, i.e. when $j - i = 0$, is trivial. In this case we have $j = i$, and the minimum cost binary search tree that stores $K_i$ (in fact the only such tree!) is a single node containing $K_i$; its cost is $c(T_{ii}) = P(i)$.

For the inductive step, take $j - i > 0$ and assume that we have already computed all the $T_{uv}$'s and their costs, for $v - u < j - i$. Let $T_{imj}$ be the tree with $K_m$ in the root, and left and right subtrees $T_{i,m-1}$ and $T_{m+1,j}$ respectively. As we saw before, Lemma 2 implies that $T_{ij}$ is the minimum cost tree among the $T_{imj}$'s. Thus we can find $T_{ij}$ simply by trying out all the $T_{imj}$'s for $m = i, i+1, \dots, j$. In fact Lemma 1 tells us how to compute $c(T_{imj})$ efficiently, so that "trying out" each possible $m$ will not take too long: since $(m-1) - i$ and $j - (m+1)$ are both $< j - i$, we have already (inductively) computed $T_{i,m-1}$ and $T_{m+1,j}$ as well as their costs, $c(T_{i,m-1})$ and $c(T_{m+1,j})$. Lemma 1 then tells us how to get $c(T_{imj})$ in terms of these. Note that when $m = i$ the left subtree of $T_{imj}$ is $T_{i,i-1}$, and when $m = j$ the right subtree of $T_{imj}$ is $T_{j+1,j}$. We define $T_{ij}$ to be empty if $i > j$, and the cost of an empty tree to be 0.

Figure 1 shows this algorithm in pseudo-code. The algorithm takes as input an array Prob[1..n], which specifies the probability density (i.e. Prob[i] = $P(i)$). It computes two two-dimensional arrays, Root and Cost, where Root[i, j] is the root of $T_{ij}$, and Cost[i, j] = $c(T_{ij})$, for $1 \le i \le j \le n$.
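The inductive construction described above can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation: the function name `optimal_bst`, the 1-based table layout, and the use of prefix sums for the interval sums $\sum_{l=i}^{j} P(l)$ are choices made here for illustration. Ties between candidate roots are broken in favour of the smallest $m$.

```python
def optimal_bst(prob):
    """Return (cost, root) tables for keys K_1..K_n, where prob[i-1] = P(i).

    cost[i][j] is the minimum cost of a tree on K_i..K_j and root[i][j] is
    the index m of its root. Empty trees (j == i - 1) have cost 0.
    """
    n = len(prob)

    # prefix[i] = P(1) + ... + P(i), so that the sum of P(i..j) is
    # prefix[j] - prefix[i - 1].
    prefix = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + prob[i - 1]

    # Rows 1..n+1 and columns 0..n, so cost[i][i-1] and cost[j+1][j] exist.
    cost = [[0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 2)]

    # Basis: single-key trees.
    for i in range(1, n + 1):
        cost[i][i] = prob[i - 1]
        root[i][i] = i

    # Inductive step: trees with d + 1 consecutive keys.
    for d in range(1, n):
        for i in range(1, n - d + 1):
            j = i + d
            weight = prefix[j] - prefix[i - 1]
            best_cost, best_root = None, None
            for m in range(i, j + 1):
                c = cost[i][m - 1] + cost[m + 1][j] + weight  # Lemma 1
                if best_cost is None or c < best_cost:
                    best_cost, best_root = c, m
            cost[i][j] = best_cost
            root[i][j] = best_root
    return cost, root
```

For instance, with `prob = [0.1, 0.2, 0.7]` the most likely key $K_3$ ends up at the root, $K_2$ below it and $K_1$ at depth 2, for a cost of $0.7 \cdot 1 + 0.2 \cdot 2 + 0.1 \cdot 3 = 1.4$.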
To help compute Root and Cost the algorithm maintains a third array, SumOfProb, where SumOfProb[i] = $\sum_{l=1}^{i} P(l)$ for $1 \le i \le n$, and SumOfProb[0] = 0. Note that

\[ \sum_{l=i}^{j} P(l) = \mathrm{SumOfProb}[j] - \mathrm{SumOfProb}[i-1]. \]

(For technical reasons that will become apparent when you look at the algorithm carefully, we need to set Cost[i, i−1] = 0 for $1 \le i \le n+1$. Recall that $T_{i,i-1}$ is empty and thus has cost 0.)

    algorithm OptimalBST ( Prob[1..n] );
    begin
      (* initialization *)
      for i ← 1 to n+1 do Cost[i, i−1] ← 0;
      SumOfProb[0] ← 0;
      for i ← 1 to n do
      begin
        SumOfProb[i] ← Prob[i] + SumOfProb[i−1];
        Root[i, i] ← i;
        Cost[i, i] ← Prob[i]
      end;
      for d ← 1 to n−1 do   (* compute info about trees with d+1 consecutive keys *)
        for i ← 1 to n−d do
        begin   (* compute Root[i, j] and Cost[i, j] *)
          j ← i + d;
          MinCost ← +∞;
          for m ← i to j do
          begin   (* find m between i and j so that c(T_imj) is minimum *)
            c ← Cost[i, m−1] + Cost[m+1, j] + SumOfProb[j] − SumOfProb[i−1];
            if c < MinCost then
            begin
              MinCost ← c;
              r ← m
            end
          end;
          Root[i, j] ← r;
          Cost[i, j] ← MinCost
        end
    end

    Figure 1: Algorithm for optimal binary search tree.

The algorithm of Figure 1 does not explicitly construct an optimal binary search tree, but such a tree is implicit in the information in array Root. As an exercise you should write an algorithm which, given Root and an array Key[1..n], where Key[i] = $K_i$, constructs an optimal binary search tree.

It is not hard to show that this algorithm has worst case time complexity $\Theta(n^3)$. A slight modification of this algorithm leads to a $\Theta(n^2)$ complexity (if interested, see D.E. Knuth, "Optimum binary search trees," Acta Informatica, vol. 1 (1971), pp. 14-25).

Unsuccessful Searches

In this discussion we have only considered successful searches. However, if we take into account unsuccessful searches, the constructed tree may no longer be optimal. Fortunately, this problem can be taken care of in a straightforward manner. To find an optimal binary search tree in the case where both successful and unsuccessful searches are taken into account, we must know the probability density for both successful and unsuccessful searches. So, in addition to $P(i)$ we are also given $Q(i)$ for $0 \le i \le n$, where

    Q(0) = prob. of searching for keys < K_1;
    Q(i) = prob. of searching for keys x, K_i < x < K_{i+1}, for 1 ≤ i < n;
    Q(n) = prob. of searching for keys > K_n.

In each binary search tree containing keys $K_1, K_2, \dots, K_n$ we add $n+1$ external nodes $E_0, E_1, \dots, E_n$. This is illustrated below; the external nodes are drawn in boxes, as usual.

[Figure: a binary search tree with its external nodes $E_0, \dots, E_n$ drawn in boxes.]

The average number of comparisons for a successful or unsuccessful search in such a tree $T$ is

\[ \sum_{i=1}^{n} P(i)\,\bigl(\mathrm{depth}_T(K_i) + 1\bigr) + \sum_{i=0}^{n} Q(i)\,\mathrm{depth}_T(E_i). \]

The left term is the contribution to the average number of comparisons by the successful searches, and the right term is the contribution by the unsuccessful searches. Now we want to find a tree that minimizes this quantity. We can proceed exactly as before, except that the definition of the cost of a tree $T$ with consecutive keys $K_i, K_{i+1}, \dots, K_j$ is slightly modified to account for the unsuccessful searches. Namely, it becomes

\[ c'(T) = \sum_{l=i}^{j} P(l)\,\bigl(\mathrm{depth}_T(K_l) + 1\bigr) + \sum_{l=i-1}^{j} Q(l)\,\mathrm{depth}_T(E_l). \]

With this cost function Lemma 1 is slightly different:

Lemma 1′: Let $T$ be a binary search tree containing keys $K_i, K_{i+1}, \dots, K_j$, and let $T_L$ and $T_R$ be the left and right subtrees of $T$ respectively. Then

\[ c'(T) = c'(T_L) + c'(T_R) + Q(i-1) + \sum_{l=i}^{j} \bigl(P(l) + Q(l)\bigr). \]

Everything else works out exactly as before. In particular, Lemma 2 is still valid (check this!). As an exercise, show how to modify the algorithm in Figure 1 to account for these changes.

Example: Suppose we want to find an optimal binary search tree for the dictionary {begin, end, goto, repeat, until} for the following probabilities of searching for these keys (the $P(i)$'s) and for keys alphabetically in between (the $Q(i)$'s):

    K_1 = begin   K_2 = end    K_3 = goto   K_4 = repeat   K_5 = until

                  P(1) = .1    P(2) = .1    P(3) = .05     P(4) = .05    P(5) = .05
    Q(0) = .05    Q(1) = .05   Q(2) = .2    Q(3) = .2      Q(4) = .1     Q(5) = .05

The trees $T_{ij}$ as computed by the algorithm on this example are shown in the table below with their costs. The computation proceeds row by row (i.e. to compute the trees and costs in row $d$, all the trees up to row $d - 1$ must have been previously computed). The optimal binary search tree $T_{1,5}$ is shown on the last row. You are strongly encouraged to trace this example carefully. Even if you think you understand the algorithm from the previous abstract discussion, you may be surprised at how much better you'll understand it after working out an example.
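As a starting point for such a trace, here is how the first entries might be worked out by hand, assuming the modified cost of Lemma 1′ and cost 0 for every empty tree $T_{i,i-1}$ (which consists of the single external node $E_{i-1}$ at depth 0). For the one-key tree $T_{11}$ and the two candidate roots of $T_{12}$:

\[
\begin{aligned}
c'(T_{11}) &= c'(T_{10}) + c'(T_{21}) + Q(0) + P(1) + Q(1) = 0 + 0 + .05 + .1 + .05 = .2,\\
c'(T_{1,1,2}) &= c'(T_{10}) + c'(T_{22}) + Q(0) + \sum_{l=1}^{2}\bigl(P(l)+Q(l)\bigr) = 0 + .35 + .5 = .85,\\
c'(T_{1,2,2}) &= c'(T_{11}) + c'(T_{32}) + Q(0) + \sum_{l=1}^{2}\bigl(P(l)+Q(l)\bigr) = .2 + 0 + .5 = .7.
\end{aligned}
\]

Here $c'(T_{22}) = Q(1) + P(2) + Q(2) = .35$ is obtained in the same way as $c'(T_{11})$. The smaller of the two candidates is $.7$, so $T_{12}$ has $K_2$ at its root.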

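d = −1 (empty trees, each consisting of a single external node):

    T_{10} = E_0   T_{21} = E_1   T_{32} = E_2   T_{43} = E_3   T_{54} = E_4   T_{65} = E_5
    c = 0          c = 0          c = 0          c = 0          c = 0          c = 0

d ≥ 0 (root and cost of each tree $T_{ij}$ with $j = i + d$):

    d    tree      root   cost
    0    T_{11}    K_1     .2
    0    T_{22}    K_2     .35
    0    T_{33}    K_3     .45
    0    T_{44}    K_4     .35
    0    T_{55}    K_5     .2
    1    T_{12}    K_2     .7
    1    T_{23}    K_3     .95
    1    T_{34}    K_3     .95
    1    T_{45}    K_4     .65
    2    T_{13}    K_2    1.4
    2    T_{24}    K_3    1.45
    2    T_{35}    K_3    1.35
    3    T_{14}    K_3    1.95
    3    T_{25}    K_3    1.85
    4    T_{15}    K_3    2.35

The optimal tree $T_{15}$ has $K_3$ at the root; its left subtree is $T_{12}$ ($K_2$ with left child $K_1$) and its right subtree is $T_{45}$ ($K_4$ with right child $K_5$).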
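The table above can be reproduced with the following Python sketch, which is one possible answer to the two exercises (modifying Figure 1 for unsuccessful searches, and building the tree from Root). The names `optimal_bst_with_misses` and `build_tree`, the tuple representation of trees, and the use of exact fractions to avoid floating-point ties are assumptions made for this illustration.

```python
from fractions import Fraction


def optimal_bst_with_misses(p, q):
    """Cost/root tables under Lemma 1'; p[i-1] = P(i), q[i] = Q(i), 0 <= i <= n."""
    n = len(p)
    if len(q) != n + 1:
        raise ValueError("q must have one more entry than p")

    # s[l] = sum of P(t) + Q(t) for t = 1..l, so the added term of Lemma 1'
    # for keys i..j is Q(i-1) + s[j] - s[i-1].
    s = [0] * (n + 1)
    for l in range(1, n + 1):
        s[l] = s[l - 1] + p[l - 1] + q[l]

    # Empty trees T_{i,i-1} keep the initial cost 0.
    cost = [[0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 2)]

    for d in range(0, n):                    # trees with d + 1 keys
        for i in range(1, n - d + 1):
            j = i + d
            weight = q[i - 1] + s[j] - s[i - 1]
            best_cost, best_root = None, None
            for m in range(i, j + 1):
                c = cost[i][m - 1] + cost[m + 1][j] + weight
                if best_cost is None or c < best_cost:  # strict <, as in Figure 1
                    best_cost, best_root = c, m
            cost[i][j], root[i][j] = best_cost, best_root
    return cost, root


def build_tree(root, keys, i, j):
    """Nested (key, left, right) tuples for T_ij; None marks an external node."""
    if i > j:
        return None
    m = root[i][j]
    return (keys[m - 1],
            build_tree(root, keys, i, m - 1),
            build_tree(root, keys, m + 1, j))


if __name__ == "__main__":
    keys = ["begin", "end", "goto", "repeat", "until"]
    p = [Fraction(x, 100) for x in (10, 10, 5, 5, 5)]
    q = [Fraction(x, 100) for x in (5, 5, 20, 20, 10, 5)]
    cost, root = optimal_bst_with_misses(p, q)
    print(float(cost[1][5]), build_tree(root, keys, 1, 5))
```

On the example data this sketch yields cost 2.35 for $T_{15}$ with goto at the root, in agreement with the last row of the table.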