Evaluating Top-k Selection Queries

Surajit Chaudhuri
Microsoft Research

Luis Gravano
Columbia University

Abstract

In many applications, users specify target values for certain attributes, without requiring exact matches to these values in return. Instead, the result to such queries is typically a rank of the top k tuples that best match the given attribute values. In this paper, we study the advantages and limitations of processing a top-k query by translating it into a single range query that traditional relational DBMSs can process efficiently. In particular, we study how to determine a range query to evaluate a top-k query by exploiting the statistics available to a relational DBMS, and the impact of the quality of these statistics on the retrieval efficiency of the resulting scheme.

1 Introduction

Internet search engines rank the objects in the results of selection queries according to how well these objects match the original selection condition. For such engines, query results are not flat sets of objects that match a given condition. Instead, query results are ranked starting from the top object for the query at hand. Given a query consisting of a set of words, a search engine returns the matching documents sorted according to how well they match the query. For decades, the information retrieval field has studied how to rank text documents for a query both efficiently and effectively []. In contrast, much less attention has been devoted to supporting such top-k queries over relational databases. As the following example illustrates, top-k queries arise naturally in many applications where the data is exact, as in a traditional relational database, but where users are flexible and willing to accept non-exact matches that are close to their specification. The answer to such a query is a ranked set of the k tuples in the database that best match the selection condition.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment. Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999.

Example 1: Consider a real-estate database that maintains information like the Price and Number of Bedrooms of each house that is available for sale. Suppose that a potential customer is interested in houses with four bedrooms, and with a price tag of around $,. The database system should then rank the available houses according to how well they match the given user preference, and return the top houses for the user to inspect. If no houses match the query specification exactly, the system might return a house with, say, five bedrooms and a price tag close to $, as the top house for the query.

Unfortunately, despite the conceptual simplicity of top-k queries and the expected performance payoff, they are not yet supported by today's relational database systems. This support would free applications and end-users from having to add this functionality in their client code. To provide such support efficiently, we need processing techniques that do not involve full sequential scans of the underlying relations. The challenge in providing this functionality is that the database system needs to handle efficiently top-k queries for a wide variety of scoring functions. In effect, these scoring functions might change by user, and they might also vary by application, or by database. It is also important that we are able to process such top-k queries with as few extensions to existing query engines as possible, since today's relational systems are significantly complex and performance sensitive.

As in the case of processing traditional selection queries, one must consider the problem of execution as well as optimization of top-k queries.
We assume that the execution engine is a traditional relational engine that supports single-dimensional as well as possibly multidimensional indexes. Therefore, the key challenge is to augment the optimization phase such that top-k selection queries may be compiled into an execution plan that can leverage the existing data structures (i.e., indexes) and statistics (e.g., histograms) that a database system maintains. Simply put, we need to develop new techniques that make it possible to map a top-k query into a traditional selection query. It is also important that any such technique preserves the following two properties: (1) it handles a variety of scoring functions for computing the top-k tuples for a query, and (2) it guarantees that there are no false dismissals (i.e., we never miss any of the top-k tuples for the given query).

In this paper, we undertake a comprehensive study of the problem of mapping top-k queries into execution plans that use traditional selection queries. In particular, we use the database histograms to map a top-k query to a suitable range that encapsulates the k best matches for the query. We also study the sensitivity of the mapping algorithms to the following parameters: types of histograms available and their memory budgets, scoring functions, data distribution, and number of query attributes.

The rest of the paper is organized as follows. Section 2 formally defines the problem of querying for top-k matches. Section 3 discusses related work. Section 4 is the core of the paper, and outlines the techniques that form the basis of our approach. Finally, Section 6 presents an experimental evaluation of our approach, using the experimental setting of Section 5.

2 Query Model

In a traditional relational system, the answer to a selection query is a set of tuples. In contrast, the answer to a top-k query is an ordered set of tuples, where the ordering reflects how closely each tuple matches the given query. This section defines our query model precisely.

Consider a relation R with attributes A_1, ..., A_n. A top-k query over R simply specifies target values for the attributes in R. Thus, a query is an assignment of values v_1, ..., v_n to the attributes A_1, ..., A_n of R. In this paper, we will focus on top-k queries on continuous attributes (e.g., age, salary). Without loss of generality, we will also assume that the values of these attributes are normalized to be real numbers between 0 and 1.

Example 2: Consider a relation S with two attributes, A_1 and A_2. These attributes have real values that range between 0 and 1. An example of a top-k query over this relation is q = (0.4, 0.3).
Such a query asks for the tuples in S that are the closest to the (0.4, 0.3) point, for some definition of proximity, as we discuss below.

Given a top-k query q, the database system with relation R uses some scoring function Score to determine how closely each tuple in R matches the target values v_1, ..., v_n specified in query q. Given a tuple t and a query q, we assume that Score(q, t) is a real number that ranges between 0 and 1. In this paper, we focus on three important scoring functions, namely Min, Euclidean, and Sum.

Definition 1: Consider a relation R = (A_1, ..., A_n), where A_1, ..., A_n are real-valued attributes ranging between 0 and 1. Then, given a query q = (q_1, ..., q_n) and a tuple t = (t_1, ..., t_n) from R, we define the score of t for q using any of the following three scoring functions:

$$\mathit{Min}(q, t) = \min_{i=1}^{n} \{ 1 - |q_i - t_i| \}$$

$$\mathit{Euclidean}(q, t) = 1 - \sqrt{\frac{1}{n} \sum_{i=1}^{n} (q_i - t_i)^2}$$

$$\mathit{Sum}(q, t) = 1 - \frac{1}{n} \sum_{i=1}^{n} |q_i - t_i|$$

Example 3: Consider a tuple t = (0.3, 0.8) in our sample database S from Example 2, and query q = (0.4, 0.3). Then, t will have a score of Min(q, t) = min{1 - |0.4 - 0.3|, 1 - |0.3 - 0.8|} = 0.5 for the Min scoring function, a score of Euclidean(q, t) = 1 - sqrt(((0.4 - 0.3)^2 + (0.3 - 0.8)^2) / 2) = 0.64 for the Euclidean scoring function, and a score of Sum(q, t) = 1 - (|0.4 - 0.3| + |0.3 - 0.8|) / 2 = 0.7 for the Sum scoring function.

Figure 1(c) shows the distribution of scores for the Min scoring function and query q = (0.4, 0.3). The horizontal plane in the figure consists of the tuples with z = 0.8, so what emerges above this plane are those tuples with score 0.8 or higher. Note that the tuples with score 0.8 or higher for q are enclosed in a box around q. In contrast, the tuples with score 0.8 or higher for the Euclidean scoring function (Figure 1(b)) are enclosed in a circle around q. Finally, the top tuples according to the Sum scoring function lie within a rotated box around q (Figure 1(a)). This difference in the shape of the region enclosing the top tuples for the query will have crucial implications on query processing, as we will discuss in Section 4.

A simple variation of the definition of the scoring functions above results from letting the different attributes have different weights.
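The three scoring functions can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes the scores take the forms Min = min_i {1 - |q_i - t_i|}, Euclidean = 1 - sqrt((1/n) Σ (q_i - t_i)^2), and Sum = 1 - (1/n) Σ |q_i - t_i|, and it assumes the query and tuple values q = (0.4, 0.3) and t = (0.3, 0.8), both reconstructed from the scores 0.5, 0.64, and 0.7 reported in the worked example. The function names are hypothetical.

```python
import math


def min_score(q, t):
    """Min(q, t): the worst per-attribute closeness, 1 - |q_i - t_i|."""
    return min(1.0 - abs(qi - ti) for qi, ti in zip(q, t))


def euclidean_score(q, t):
    """Euclidean(q, t): 1 minus the root-mean-square attribute distance."""
    n = len(q)
    return 1.0 - math.sqrt(sum((qi - ti) ** 2 for qi, ti in zip(q, t)) / n)


def sum_score(q, t):
    """Sum(q, t): 1 minus the mean absolute attribute distance."""
    n = len(q)
    return 1.0 - sum(abs(qi - ti) for qi, ti in zip(q, t)) / n


q = (0.4, 0.3)  # query values (reconstructed from the example, an assumption)
t = (0.3, 0.8)  # tuple values (reconstructed from the example, an assumption)

scores = {
    "Min": min_score(q, t),
    "Euclidean": euclidean_score(q, t),
    "Sum": sum_score(q, t),
}
print({name: round(score, 2) for name, score in scores.items()})
```

Rounded to two decimals, the three scores come out as 0.5, 0.64, and 0.7, which agrees with the values given in the example. All three functions return 1 when the tuple coincides with the query.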
In general, the Min, Euclidean, and Sum functions that we use in this paper are just a few of many possible scoring functions. Our strategy for processing top-k queries can be adapted to handle a wide variety of such functions, as we will discuss. The key property that we ask from scoring functions is as follows:

Property 1 (Monotonicity of Scoring Functions): Consider a relation R and a scoring function Score defined over it. Let q = (q_1, ..., q_n) be a top-k query over R, and let t = (t_1, ..., t_n) and t' = (t'_1, ..., t'_n) be two tuples in R such that |t_i - q_i| ≤ |t'_i - q_i| for i = 1, ..., n. (In other words, t is at least as close to q as t' for all attributes.) Then, Score(q, t') ≤ Score(q, t).

Intuitively, this property of scoring functions implies that if a tuple t is closer, along each attribute, to the query values than some other tuple t' is, then the score that t gets for the query cannot be worse than that of t'. Fortunately, all interesting scoring functions that we could think of satisfy our monotonicity assumptions. In particular, the Euclidean, Min, and Sum scoring functions that we defined above satisfy this property.

Figure 1: The scores (z axis) for query q = (0.4, 0.3) for the different (x, y) pairs and scoring functions Sum (a), Euclidean (b), and Min (c).

A possible SQL-like notation for expressing top-k queries is as follows []:

    SELECT * FROM R WHERE A1=v1 AND ... AND An=vn ORDER k BY Score

The distinguishing feature of the query model is in the ORDER BY clause. This clause indicates that we are interested in only the k answers that best match the given WHERE clause, according to the Score function. Section 4 discusses how we will evaluate top-k queries for different definitions of the Score function.

3 Related Work

Motro [9] emphasized the need to support approximate and ranked matches in a database query language. He extended the language Quel to distinguish between exact and vague predicates. He also suggested a composite scoring function to rank each answer. Motro's work led to further development of the idea of query relaxation, which weakens a given user query to provide approximate matches using additional metadata (e.g., concept hierarchies). The querying model for top-k queries that we use in this paper is consistent with Motro's definitions. Our key focus is on exploring opportunities and limitations of efficiently mapping top-k queries into traditional relational queries.

Recently, Carey and Kossmann [, ] presented techniques to optimize queries that require only top-k matches. Their technique leverages the fact that when k is relatively small compared to the size of the relation, specialized sorting (or indexing) techniques that can produce the first few values efficiently should be used.
However, in order to apply their techniques when the scoring function is not based on column values themselves (e.g., as is the case for Min, Euclidean, and Sum as defined in Section 2), we need to first evaluate the scoring function for each database object. Thus, when a query requests the top-k values according to a scoring function like Min, their technique would need to first evaluate the Min score for every data object. Only after evaluating the score for each object are we able to use the techniques in [, ]. Hence, these strategies require a preprocessing step to compute the scoring function itself, involving one sequential scan of all the data. In contrast, in this paper we explore techniques that avoid accessing the entire data set.

In [4, 5], Fagin addresses the problem of finding top-k matches for a user query q involving several multimedia attributes. Each of these attributes (e.g., an image attribute) is assumed to have a native sub-system that answers top-k queries involving only the corresponding attribute. In the first phase of Fagin's A_0 algorithm, the query processing system obtains a stream L_i of top matches for condition c_i on attribute A_i from the corresponding sub-system. When there are at least k objects in the intersection of all the single-attribute streams L_i, the system is guaranteed to have already accessed k top objects for query q. (These top objects are not necessarily in the intersection of the streams.) The second phase of algorithm A_0 computes the score of each of the retrieved objects, and returns the best k objects. In Section 4.3, we present an adaptation of Fagin's strategy to the case when the top-k query is issued against a relational database system.

In [], we presented an algorithm for processing queries over a multimedia database. Our query model built on Fagin's to also include Boolean conditions in addition to the top-k component of the multimedia queries.

There is a large body of work on finding the nearest neighbors of a multidimensional data point.
Given an n-dimensional point p, these techniques retrieve the k objects that are nearest to p according to a given distance metric. The state-of-the-art algorithms (e.g., [7]) follow a multi-step approach. Their key step is identifying a set of points A such that p's k nearest neighbors are no further from p than a is, where a is the point in A that is furthest from p. (A more recent paper [4] further refines this idea.) This approach is conceptually similar to the approach that we follow in this paper (and also in []), where we first find a suitable score S, and then we use it to build a relational query that will return the top-k matches for the original query. Our focus in this paper is to study the practicality and limitations of using the information in the histograms kept by a relational system for query processing. In contrast, the nearest-neighbor algorithms mentioned above use the data values themselves to identify a cut-off score.

Finally, references [6, 8] study how to merge and reconcile top-k query results obtained from distributed databases when the databases use arbitrary, undisclosed scoring algorithms.

4 Mapping a Top-k Query into a Traditional Selection Query

This section shows how to map a top-k query q into a relational selection query C_q that any traditional RDBMS can execute. Our goal is to obtain k tuples from relation R that are the best tuples for q according to a scoring function Score. Our query processing strategy consists of the following steps:

1. Use statistics on relation R to find a search score S_q (Section 4.1).
2. Build a selection query C_q to retrieve all tuples in R with score S_q or higher for q (Section 4.2).
3. Evaluate C_q over R.
4. Compute Score(q, t) for every tuple t in the answer for C_q.
5. If there are at least k tuples t in the result for C_q with Score(q, t) ≥ S_q, then output k tuples with the highest scores. Otherwise, choose a lower value for S_q and restart the process.

Section 4.3 introduces a related mapping strategy that does not follow the five steps above, and is an adaptation of Fagin's A_0 algorithm (Section 3).

4.1 Choice of Search Score S_q

The key step for evaluating a top-k query q is determining score S_q: our algorithm retrieves all tuples t such that Score(q, t) ≥ S_q. If there are at least k such tuples, then our algorithm above succeeds in finding the top k matches for q. Otherwise, our choice of S_q is too high, and hence the query needs to be restarted with a lower value for S_q. Consequently, we should choose a value of S_q that is not too low, so that we do not retrieve too many candidate tuples from the database, but that is not too high either, so that we can obtain the top-k tuples without restarting the query.

Our choice of S_q will be guided by the statistics that the query processor keeps about relation R. In particular, we will assume that we have an n-dimensional histogram H that describes the distribution of values of R. We discuss this issue further in Section 5.2. Until then, we assume that H consists of a series of non-overlapping buckets. Each bucket has associated with it an n-rectangle [a_1, b_1] × ... × [a_n, b_n], and stores the number of tuples in R that lie within the n-rectangle, together with other information.

For efficiency, our choice of S_q will be based on histogram H, and not on the underlying relation R itself. More specifically, we choose S_q as follows:

a. Create (conceptually) a small, synthetic relation R', consistent with histogram H. R' has one distinct tuple for each bucket in H, with as many instances as the frequency of the corresponding bucket.
b. Compute Score(q, t) for every tuple t in R'.
c. Let T be the set of the top-k tuples in R' for q. Output S_q = min_{t ∈ T} Score(q, t).

We can conceptually build synthetic relation R' in many different ways. We will study two extreme query processing strategies that result from two possible definitions of R'. The first query processing strategy, NoRestarts, results in a search score S_q that is low enough to guarantee that no restarts are ever needed as long as histograms are kept up to date.
In other words, Step (5) above always finishes successfully, without ever having to reduce S_q and restart the process. For this, the NoRestarts strategy defines R' in a pessimistic way: given a histogram bucket b, the corresponding tuple t_b that represents b in R' will be as bad for query q as possible. More formally, t_b is a tuple in b's n-rectangle with the following property:

$$\mathit{Score}(q, t_b) = \min_{t \in T_b} \mathit{Score}(q, t)$$

where T_b is the set of all potential tuples in the n-rectangle associated with bucket b.

Example 4: Consider our example relation S, with two attributes A_1 and A_2, query q = (0.4, 0.3), and the 2-dimensional histogram H shown in Figure 2(a). Histogram H has three buckets, b_1, b_2, and b_3. Relation S has 4 tuples in bucket b_1, 5 tuples in bucket b_2, and 55 tuples in bucket b_3. As explained above, the NoRestarts strategy will build relation S' based on H by assuming that the tuple distribution in S is as bad as possible for query q. So, relation S' will consist of three tuples (one for each bucket in H) t_1, t_2, and t_3, which are as far from q as their corresponding bucket boundaries permit. Tuple t_1 will have a frequency of 4, t_2 will have a frequency of 5, and t_3 will have a frequency of 55. Assume that the user who issued query q wants to use the Min scoring function to find the top tuples for q. Since t_2 and t_3 are the two best representatives, with Min(q, t_2) = 0.6 and Min(q, t_3) = 0.4, to get enough tuple instances we need the top tuple, t_2 (frequency 5), and t_3 (frequency 55). Consequently, the search score S_q will be Min(q, t_3) = 0.4. From the way we built S', it follows that the original relation S is guaranteed to contain at least k tuples with score S_q = 0.4 or higher for query q. Then, if we retrieve all of the tuples with that score or higher, we will obtain a superset of the set of top-k tuples for q.

Figure 2: A 3-bucket histogram H and the choice of tuples representing each bucket that strategies NoRestarts (a) and Restarts (b) make for query q.

Figure 3: The four strategies for computing the search score S_q.

Lemma 1: Let q be a top-k query over a relation R. Let S_q be the search score computed by strategy NoRestarts for q. Then, there are at least k tuples t in R such that Score(q, t) ≥ S_q.

The second query processing strategy, Restarts, results in a search score S_q that is the highest among those search scores that might result in no restarts. This strategy defines R' in an optimistic way: given a histogram bucket b, the corresponding tuple t_b that represents b in R' will be as good for query q as possible. More formally, t_b is a tuple in b's n-rectangle with the following property:

$$\mathit{Score}(q, t_b) = \max_{t \in T_b} \mathit{Score}(q, t)$$

where T_b is the set of all potential tuples in the n-rectangle associated with bucket b.

Example 4 (cont.):
The Restarts strategy will now build relation S' based on H by assuming that the tuple distribution in S is as good as possible for query q (Figure 2(b)). So, relation S' will consist of three tuples (one per bucket in H) t_1, t_2, and t_3, which are as close to q as their corresponding bucket boundaries permit. In particular, tuple t_2 will be defined as q proper, with frequency 5, since its corresponding bucket (i.e., b_2) has 5 tuples in it. After defining the bucket representatives t_1, t_2, and t_3, we proceed as in the NoRestarts strategy to sort the tuples on their score for q. For Min, we pick the two top-scoring representatives and define S_q as the lower of their two Min scores. This time it is indeed possible for fewer than k tuples in the original table S to have a score of S_q or higher for q, so restarts are possible.

Figure 4: The circle around query q = (0.4, 0.3) contains all of the tuples with Euclidean score of 0.8 or higher for q.

The S_q score that Restarts computes is the highest score that might result in no restarts in Step (5) of the algorithm above. In other words, using a value for S_q that is higher than that of the Restarts strategy will always result in restarts. In practice, as we will see in Section 6, the Restarts strategy results in restarts in virtually all cases, hence its name.

Lemma 2: Let q be a top-k query over a relation R. Let S_q be the search score computed by strategy Restarts for q. Then, there are fewer than k tuples t in R such that Score(q, t) > S_q.

In addition to the two extreme score-selection strategies NoRestarts and Restarts, we will study two other intermediate strategies, Inter1 and Inter2 (Figure 3). Given a query q, let S_q^NR be the search score selected by NoRestarts for q, and let S_q^R be the corresponding score selected by Restarts. Then, the Inter1 strategy will choose a score between S_q^NR and S_q^R, while the Inter2 strategy will choose a higher score in the same interval. As our experiments will show, Inter1 and Inter2 are often the best strategies that we can follow in terms of the efficiency of the resulting techniques.
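The score-selection procedure of this section can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes the Min scoring function in the form min_i {1 - |q_i - t_i|}, it represents each bucket as a hypothetical (lower corner, upper corner, frequency) triple, and the three-bucket histogram is invented for illustration (it is not the histogram of Example 4). The pessimistic and optimistic representatives rely on the monotonicity property: the farthest (respectively closest) point per attribute yields the lowest (respectively highest) score in the bucket.

```python
def min_score(q, t):
    """Min scoring function (assumed form): min_i {1 - |q_i - t_i|}."""
    return min(1.0 - abs(qi - ti) for qi, ti in zip(q, t))


def worst_point(q, lo, hi):
    """Pessimistic representative (NoRestarts): bucket corner farthest from q."""
    return tuple(l if abs(qi - l) >= abs(qi - h) else h
                 for qi, l, h in zip(q, lo, hi))


def best_point(q, lo, hi):
    """Optimistic representative (Restarts): bucket point closest to q."""
    return tuple(min(max(qi, l), h) for qi, l, h in zip(q, lo, hi))


def search_score(q, k, buckets, score, representative):
    """Steps a-c: score one representative per bucket, then return the score
    of the representative at which the cumulative frequency reaches k."""
    if not buckets:
        raise ValueError("histogram has no buckets")
    reps = sorted(
        ((score(q, representative(q, lo, hi)), freq) for lo, hi, freq in buckets),
        reverse=True,
    )
    covered = 0
    for s, freq in reps:
        covered += freq
        if covered >= k:
            return s
    return reps[-1][0]  # fewer than k tuples overall: fall back to lowest score


# Hypothetical 3-bucket histogram over [0, 1] x [0, 1]: (lo, hi, frequency).
buckets = [
    ((0.0, 0.0), (0.5, 0.5), 40),
    ((0.5, 0.0), (1.0, 0.5), 5),
    ((0.0, 0.5), (1.0, 1.0), 55),
]
q, k = (0.4, 0.3), 45

s_no_restarts = search_score(q, k, buckets, min_score, worst_point)
s_restarts = search_score(q, k, buckets, min_score, best_point)
print(round(s_no_restarts, 2), round(s_restarts, 2))
```

For this made-up histogram the NoRestarts score is lower than the Restarts score, as expected: the pessimistic choice retrieves more candidates but never needs a restart, while the optimistic choice retrieves fewer candidates at the risk of restarting.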

4.2 Choice of Selection Query C_q

Once we have determined the search score S_q (Section 4.1), the algorithm in Section 4 uses a query C_q to retrieve all tuples t such that Score(q, t) ≥ S_q, where q is the original top-k query, and Score is the scoring function being used. In this section we describe how to define query C_q.

Ideally, we would like to ask our database system to return exactly those tuples t such that Score(q, t) ≥ S_q. Unfortunately, indexing structures in relational DBMSs do not natively support this kind of predicate, as discussed earlier. Our approach is to build C_q as a simple selection condition defining an n-rectangle. In other words, we define C_q as a query of the form:

    SELECT * FROM R WHERE (a1<=A1<=b1) AND ... AND (an<=An<=bn)

The n-rectangle [a_1, b_1] × ... × [a_n, b_n] in C_q should tightly enclose all tuples t in R with Score(q, t) ≥ S_q.

Example 5: Consider our example query q = (0.4, 0.3) over relation S, with Euclidean as the scoring function. Suppose that our search score S_q is 0.8, as computed by any of the strategies in Section 4.1. Each tuple t with Euclidean(q, t) ≥ 0.8 lies in the circle around q that is shown in Figure 4. Then, the tightest n-rectangle that encloses that circle is [0.12, 0.68] × [0.02, 0.58]. Hence, the final SQL query C_q is:

    SELECT * FROM S WHERE (0.12<=A1<=0.68) AND (0.02<=A2<=0.58)

Given a search score S_q, the n-rectangle [a_1, b_1] × ... × [a_n, b_n] that determines C_q follows directly from the scoring function used, the search score S_q, and the query q.

Example 5 (cont.): Let us assume that the search score for our query q = (0.4, 0.3) is S_q = 0.8, as above. We calculate the n-rectangle that encloses all tuples with score 0.8 or higher by focusing on one attribute at a time. First, consider a tuple r = (t_1, 0.3) that has the same attribute values as query q in all attributes except for maybe attribute A_1. We will compute the range of values that t_1 can have while Euclidean(q, r) ≥ 0.8. In effect, Euclidean(q, r) = Euclidean((0.4, 0.3), (t_1, 0.3)) = 1 - sqrt((t_1 - 0.4)^2 / 2). Consequently, Euclidean(q, r) ≥ 0.8 if and only if 0.12 ≤ t_1 ≤ 0.68. Hence, the range of values that attribute A_1 can take is [a_1, b_1] = [0.12, 0.68].
Analogously for attribute A_2, [a_2, b_2] = [0.02, 0.58]. Putting both pieces together, the final n-rectangle that encloses all tuples with score 0.8 or higher for q is [0.12, 0.68] × [0.02, 0.58] (Figure 4).

    Score       a'_i                      b'_i
    Min         q_i - (1 - S_q)           q_i + (1 - S_q)
    Sum         q_i - n (1 - S_q)         q_i + n (1 - S_q)
    Euclidean   q_i - sqrt(n) (1 - S_q)   q_i + sqrt(n) (1 - S_q)

Table 1: The n-rectangle [a_1, b_1] × ... × [a_n, b_n] for C_q's selection condition and search score S_q, for different scoring functions, where a_i = max{0, a'_i} and b_i = min{1, b'_i}.

Table 1 summarizes how to compute the n-rectangle [a_1, b_1] × ... × [a_n, b_n] for the three scoring functions from Section 2. The Min scoring function presents an interesting property: the region to be enclosed by the n-rectangle is already an n-rectangle. (See Figure 1(c).) Consequently, the query C_q that is generated for Min for query q and its associated search score S_q will retrieve only tuples with a score of S_q or higher. This property will result in efficient executions of top-k queries for Min, as we will see. Unfortunately, this property does not hold for the Sum and Euclidean scoring functions (Figures 1(a) and 1(b)).

4.3 An Alternative Mapping Strategy

This section adapts Fagin's A_0 algorithm (Section 3) to produce a new technique for mapping a top-k query into a traditional relational query. Unlike the strategies of the previous sections, the selection query resulting from this new mapping is a disjunction, not a conjunction. Our goal is, again, to build a one-shot relational query that avoids restarts whenever possible.

We proceed as in strategy NoRestarts (Section 4.1) to build a database with one tuple representing each bucket in the available n-dimensional histogram. We find the top tuples as in the NoRestarts strategy. We then compute an n-rectangle F = [a_1, b_1] × ... × [a_n, b_n] that encloses these top tuples tightly, and that has been extended so that it is symmetric with respect to the given query q. (In other words, a_i ≤ q_i ≤ b_i and b_i - q_i = q_i - a_i, for i = 1, ..., n.) The tuples matching range [a_i, b_i] are the top tuples for q along attribute A_i. The selection query consists of the disjunction of the a_i ≤ A_i ≤ b_i conditions.
By retrieving all tuples that match at least one of these conditions, we retrieve the top tuples for each of the individual attributes. Furthermore, from the way we constructed F, there will be at least k tuples matching all conditions. As with the original A_0 algorithm, we compute the score for all the one-dimensional matches. The k retrieved tuples having the highest score for q are the final answer to the original top-k query. The correctness of this algorithm follows from that of algorithm A_0 [4]. Due to space constraints, we do not discuss this algorithm any further in this paper.
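To make the range-query mapping of Section 4.2 concrete, the following Python sketch computes the enclosing n-rectangle for a search score and renders the conjunctive selection query. It is a sketch under assumptions rather than the authors' implementation: the half-widths (1 - S_q) for Min, n(1 - S_q) for Sum, and sqrt(n)(1 - S_q) for Euclidean are derived from the scoring-function forms assumed in this rewrite, the column names A1, A2, ... are hypothetical, and the function names are invented. Bounds are emitted at full precision, because rounding a lower bound up (or an upper bound down) could shrink the rectangle and cause false dismissals.

```python
import math


def enclosing_rectangle(q, s_q, scoring):
    """Tightest axis-aligned rectangle containing every t with Score(q, t) >= s_q,
    clipped to the [0, 1] attribute domain."""
    n = len(q)
    factors = {"min": 1.0, "sum": float(n), "euclidean": math.sqrt(n)}
    if scoring not in factors:
        raise ValueError(f"unknown scoring function: {scoring}")
    half_width = factors[scoring] * (1.0 - s_q)
    return [(max(0.0, qi - half_width), min(1.0, qi + half_width)) for qi in q]


def selection_query(table, rect):
    """Render the conjunctive range query C_q; column names A1..An are assumed."""
    conditions = [
        f"(A{i} BETWEEN {a!r} AND {b!r})"
        for i, (a, b) in enumerate(rect, start=1)
    ]
    return f"SELECT * FROM {table} WHERE " + " AND ".join(conditions)


# Query and search score of the Euclidean example (values assumed as above).
rect = enclosing_rectangle((0.4, 0.3), 0.8, "euclidean")
sql = selection_query("S", rect)
print([(round(a, 2), round(b, 2)) for a, b in rect])
print(sql)
```

Rounded to two decimals, the Euclidean rectangle is [0.12, 0.68] × [0.02, 0.58], the same rectangle reported in the example. For Sum with the same query and score, the lower bound on the second attribute falls below 0 and is clipped to the domain boundary.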

5 Experimental Setting

We now describe the data sets, histograms, and metrics for the experiments of Section 6.

5.1 Data Sets

Our experiments use a real-world data set as well as synthetic data. The real-world data set is a fragment of US Census Bureau data, and was obtained from the University of California, Irvine archive of machine-learning databases (ftp://ftp.ics.uci.edu/pub/machine-learning-databases). The data set has roughly 45,000 rows. Each row is a record for an individual, described by a number of attributes. We picked four continuous attributes that were especially well suited for our top-k query model: age, wage, education level, and hours of work per week. We also scaled down the attribute values so that the resulting values ranged between 0 and 1, to simplify our experimental setting. We refer to this database as the Census database.

In addition to the Census database, we generated a number of synthetic databases with different data distributions. For this, we wrote a seed program that is capable of generating a one-dimensional Zipfian distribution [5] with varying Z factors. When this factor is zero, it generates a uniform distribution. Higher values result in higher skew. For an n-dimensional data set, our generation program is parameterized by (1) a vector of Z values (one for each attribute), Z = <z_1, ..., z_n>; and (2) the number of tuples to be generated, N.

We created the data corresponding to a Z specification as follows. First, we generated a one-dimensional Zipfian distribution of N tuples for attribute A_1 using Z factor z_1. Let us say that for attribute A_1 the value v_1 occurred in N_1 out of the N tuples. We now fill in the value for attribute A_2 for each of these N_1 tuples by generating N_1 values w_1, ..., w_{N_1} using a Zipfian distribution with Z factor z_2. At the end of this step, the first two attributes of the original N_1 tuples are filled in with values (v_1, w_1), ..., (v_1, w_{N_1}). Let us say that this results in a group of tuples that have v_1 and w_1 as the values for attributes A_1 and A_2, respectively.
We then fill in the remaining attribute values A_3, ..., A_n for this group of tuples in an analogous way as above, using the Z values z_3 through z_n.

For our experiments, we generated databases with n = 2, 3, and 4 attributes. The domain of each attribute is the real numbers between 0 and 1, with a fixed spacing between consecutive attribute values. We varied the Zipfian vectors in the generation of the databases so we obtained databases with a spectrum of skews. More specifically, Section 6 reports experiments for three families of databases that differ in skew, each built using a different Zipfian vector. Table 2 summarizes the synthetic databases for which we report experiments in the next section.

Table 2: The number of distinct tuple values for different data skews and number of attributes.

5.2 Histograms

As outlined above, we map a top-k query over a table R into a relational selection query. To do this mapping, we exploit the statistics (e.g., histograms) kept by the relational DBMS where relation R resides. One of our goals in this paper is to study the effect on our mapping of the different n-dimensional histogram structures proposed in the literature. These structures rely on an underlying strategy for building one-dimensional histograms. In this paper we focus on the AVI, PHASED, and MHIST-p n-dimensional techniques, with MAXDIFF as the underlying one-dimensional strategy [, ]. Below we briefly describe these structures. We refer the reader to [, ] for a detailed discussion.

Constructing a MAXDIFF histogram on an attribute of a relation is logically a two-step process. First, the data values are sorted and, for each distinct value, its frequency of occurrence is calculated. Let the sorted values be v_1, v_2, ... with corresponding frequencies f_1, f_2, .... We can then define frequencygap(i) as the difference between f_i and the frequency of the value adjacent to v_i.
That is, this function records how much the frequency changes between adjacent attribute values. The bucket boundaries are placed at those attribute values that correspond to the highest values of the frequencygap function. The MAXDIFF histogram structure has been shown to have a good trade-off between accuracy and building cost [].

For the experiments that we report in the next section, we have implemented n-dimensional variants of MAXDIFF histograms using the AVI, PHASED, and MHIST-p techniques, as described in [].

The AVI technique for constructing an n-dimensional histogram is to simply assume statistical independence of the one-dimensional attributes. Thus, to determine the fraction of data in an n-dimensional bucket, we multiply the fraction of the data in each one-dimensional projection of the bucket.

The PHASED technique for constructing an n-dimensional histogram consists of n steps. In the first step, one of the dimensions is used to partition the dataset into k_1 buckets. In the j-th step, each of the buckets obtained at the end of the previous step is divided into k_j buckets along one of the unused dimensions. The order in which dimensions are chosen is determined prior to doing any of the partitioning.
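The one-dimensional MAXDIFF construction described earlier in this section can be sketched in Python. This is a hedged illustration, not the implementation used in the experiments: it assumes that frequencygap is the absolute difference between the frequencies of two adjacent distinct values, that a boundary is placed between the two values of each selected gap, and that ties are broken arbitrarily. The function name and the (low value, high value, tuple count) bucket layout are hypothetical.

```python
from collections import Counter


def maxdiff_buckets(values, num_buckets):
    """Split the sorted distinct values into at most num_buckets buckets by
    cutting at the largest frequency gaps between adjacent distinct values.
    Returns a list of (low value, high value, tuple count) triples."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    freq = sorted(Counter(values).items())  # [(value, frequency), ...]
    if not freq:
        return []
    # Gap i sits between distinct value i and distinct value i + 1.
    gaps = [(abs(freq[i + 1][1] - freq[i][1]), i) for i in range(len(freq) - 1)]
    # Keep the positions of the num_buckets - 1 largest gaps, in value order.
    cuts = sorted(i for _, i in sorted(gaps, reverse=True)[:num_buckets - 1])
    buckets, start = [], 0
    for i in cuts:
        buckets.append(freq[start:i + 1])
        start = i + 1
    buckets.append(freq[start:])
    return [(b[0][0], b[-1][0], sum(f for _, f in b)) for b in buckets]


# Toy attribute: value 1 occurs 10 times, 2 occurs 9 times, 3 twice, 4 once.
data = [1] * 10 + [2] * 9 + [3] * 2 + [4]
print(maxdiff_buckets(data, 2))
```

With two buckets, the single boundary falls between values 2 and 3, where the frequency drops from 9 to 2, so the frequent values and the rare values end up in separate buckets. This is the behavior that makes the structure attractive for skewed data.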

For each dimension (attribute), we compute the variance in the frequency of values on that dimension. We then choose the attributes for partitioning the buckets in descending order of their variance. This order reflects the criticality of separating the values in buckets. This technique for constructing n-dimensional histograms was first used in [] in the context of equi-depth histogram structures.

The MHIST-p technique for constructing an n-dimensional histogram is an adaptation of the PHASED approach. More specifically, during the j-th step (see the description of PHASED above), we determine the bucket in most need of partitioning, and we partition it along the attribute that exhibits the highest variance in frequency within the bucket. The factor p designates the number of buckets into which each bucket is split at every step.

The performance of our mapping techniques (Section 4) depends on the accuracy of the available histograms. The accuracy of a histogram depends in turn on the technique with which it was generated, and on the amount of memory that has been allocated for it. In our experiments, in addition to trying several histogram structures, we also study the effect of varying memory on the accuracy of histograms. We assume throughout that histograms are kept up to date with the data. If histograms are not up to date, then the performance of our techniques might decrease. However, the correctness of the answers produced will remain unaffected, at the expense of a potentially higher number of restarts (Section 4).

5.3 Measuring the Efficiency of the Query Execution Strategies

A top-k query q will typically involve several attributes. We might have indexes available for a number of combinations of the query attributes, and the efficiency of processing the query will be greatly affected by the particular index configuration available. We focus on two configurations: (a) a single-column index exists for every attribute mentioned in the query; or (b) a single n-column index exists, covering all attributes mentioned in the query.
Whenever an n-dimensional index is present, we retrieve exactly as many index entries as there are tuples in the n-rectangle defining query C_q, as described in Section 4., followed by the actual retrieval of the k top tuples for q. (The index entries provide all the information that we need to decide which k tuples are the ones with the highest score for q.) Alternatively, when only one-dimensional indexes are available, we can intersect one or more indexes to determine the data tuples to be retrieved. When all necessary single-column indexes are present, this strategy results in no redundant retrieval of data tuples, as in the case when an n-dimensional index is available. However, unlike the case with n-dimensional indexes, we must now pay the overhead of the index intersection. The cost of the index intersection can be traded off against the cost of retrieving redundant data tuples (i.e., data tuples that do not belong to the n-rectangle of Section 4.).

For each top-k query q, we measure the number of objects that match the associated n-dimensional selection query C_q (Section 4.). In Section 6, we report the average over all queries of the number of tuples retrieved as a fraction of the number of (not necessarily distinct) tuples in the database (% of tuples retrieved). This metric reveals the tightness of our mapping of a top-k query into a traditional selection query. A complementary metric is % of restarts, the percentage of queries in our workload for which the associated selection query failed to contain the k best tuples, hence leading to restarts. (See Step (5) of the algorithm of Section 4.)

It is important to distinguish between the tightness of the mapping of a top-k query to a traditional selection query, and the efficiency of execution of the latter. The tightness of the mapping depends on the mapping algorithms (Section 4) and on their interaction with the quality of the available histograms. The efficiency of execution of the selection query produced by our mapping algorithm depends in turn on the indexes available on the database and on the optimizer's choice of an execution plan.
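The two workload metrics defined above can be computed from a per-query log as in the following sketch. The record layout (a pair holding the number of tuples retrieved and a restart flag) and the name `workload_metrics` are assumptions made for illustration; the text does not prescribe a data format.

```python
def workload_metrics(runs, num_tuples):
    """Compute (% of tuples retrieved, % of restarts) for a query workload.

    runs: one (tuples_retrieved, needed_restart) pair per query
          (hypothetical record layout).
    num_tuples: number of tuples in the database.
    """
    if not runs:
        raise ValueError("the workload must contain at least one query")
    # Average, over all queries, of the fraction of the database retrieved.
    pct_retrieved = 100.0 * sum(r for r, _ in runs) / (len(runs) * num_tuples)
    # Share of queries whose selection query missed some of the k best tuples.
    pct_restarts = 100.0 * sum(1 for _, restarted in runs if restarted) / len(runs)
    return pct_retrieved, pct_restarts
```

A low first number with a high second number is not a success: it means the selection queries were tight only because they were too small to contain the answer.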
The cost estimator in an optimizer determines the best access path among the available choices. (These choices include performing a sequential scan of the data.) In this paper, we will not discuss further details of efficient execution of selection queries on databases but rather focus on the problem of mapping top-k queries to selection queries efficiently using histogram structures.

6 Experimental Results

This section presents experimental results for our techniques of Section 4 for evaluating top-k queries. In particular, we study the role of several factors on the efficiency of our strategies, including the size and type of n-dimensional histograms available, the scoring function used in the queries, and the dimensionality and skew of the data sets. Our experiments then involve a large number of parameters, and we tried many different value assignments. For conciseness, we report results on a default setting where appropriate. This default setting uses databases built with the Z (moderate) skew (Section 5.), the PHASED technique for building n-dimensional histograms (Section 5.), and allocates 5KB per histogram. For each experiment, we generated different queries. Each query was created by picking each attribute value randomly from the [, ] range. In the default setting, these queries ask for top tuples (i.e., k = ). We report results for other settings of the parameters as well.

Validity of our General Approach

Our general approach for processing a top-k query q (Section 4.) is to find an n-rectangle that contains all the top k tuples for q, and use this rectangle to build a traditional selection query. Our first experiment studies the intrinsic limitations of our approach, i.e., whether it is possible to build a good n-rectangle around query q that contains all top k tuples and little else. To answer this first question, independent of any available histograms or search-score selection strategies (Section 4), we first scanned the database to find the actual top k tuples for a given query q, and determined a tight n-rectangle T that encloses all of these tuples. We then computed what fraction of the database tuples lies within rectangle T. Table reports these figures. As we can see from the table, the fraction of tuples that lie in this ideal rectangle is extremely low, which validates our approach: if the database statistics (i.e., histograms) are accurate enough, then we should be able to find a tight n-rectangle that encloses all the best tuples for a given query, with few extra tuples.

[Table body: columns Data Distribution, Scoring, and one column per value of n; rows for the Min, Sum, and Euclidean scoring functions under each Z data distribution; cell values not recoverable.]

Table : The percentage of tuples in the database included in an n-rectangle enclosing the actual top-k tuples for a query (k = ; N = , tuples).

Effect of Multidimensional Histograms

For this experiment, we considered the AVI, PHASED, and MHIST- histogram structures (Section 5.). AVI proved to be significantly worse than MHIST and PHASED since it tended to require restarts in most cases, while retrieving only an extremely low fraction of the database tuples. In effect, the NoRestarts strategy of Section 4. guarantees no restarts only in the presence of an accurate n-dimensional histogram. AVI can only estimate the holdings of each n-dimensional bucket by assuming that attributes follow independent distributions. The results for AVI were so poor that we omit this histogram structure from the rest of the discussion.
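The validity experiment described earlier in this section (scan for the true top k tuples, enclose them in a tight n-rectangle T, and measure the fraction of the database inside T) can be sketched as follows. This is only an illustration: the negated Euclidean distance stands in for the scoring function, which is an assumption because the scoring functions are defined outside this section, and the name `tight_rectangle_fraction` is hypothetical.

```python
import heapq
import math


def tight_rectangle_fraction(tuples, query, k):
    """Return the tight n-rectangle around the true top-k tuples and the
    fraction of all tuples that fall inside it.

    Assumption: a tuple scores higher the closer it is to the query in
    Euclidean distance.
    """
    def score(t):
        return -math.dist(query, t)

    # Full scan to find the actual top-k tuples for the query.
    top_k = heapq.nlargest(k, tuples, key=score)
    n = len(query)
    # Tight rectangle T: per-dimension minimum and maximum over the top-k.
    lows = [min(t[i] for t in top_k) for i in range(n)]
    highs = [max(t[i] for t in top_k) for i in range(n)]
    inside = sum(
        1
        for t in tuples
        if all(lows[i] <= t[i] <= highs[i] for i in range(n))
    )
    return (lows, highs), inside / len(tuples)
```

Even this ideal rectangle may hold tuples outside the top k, because its corners lie farther from the query than the tuples that define its sides.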
For PHASED and MHIST, we varied the amount of storage that we allocated for the histograms. Figure 5 shows the effect of this variation for the Euclidean scoring function. (The results for Min and Sum are analogous.) In this figure, we report the results for the NoRestarts and the Inter policies of Section 4.. When we increase the histogram size from KB to 5KB, there is a sharp improvement in the efficiency of our technique, as evidenced by the drop in the percentage of tuples retrieved. PHASED performs (marginally) better than MHIST and therefore for the rest of this section we report results mainly using PHASED. Although higher memory allocation clearly increases accuracy, as shown by the figures, we decided to settle on a 5KB budget for each histogram in the rest of this paper.

[Figure 5 plot: % Tuples Retrieved versus Histogram Size (bytes); series: MHIST, NoRestarts; MHIST, Inter; PHASED, NoRestarts; PHASED, Inter.]

Figure 5: The percentage of tuples retrieved, as a function of the number of bytes dedicated to the n-dimensional histogram (Euclidean scoring function; n = ; Z data distribution).

Effect of Different Scoring Functions

The goal of this experiment is to measure the differences among scoring functions as the data skew and the number of dimensions are varied (Section 5.). Figure 6 shows that, as the data skew increases, the percentage of tuples retrieved decreases sharply and consistently across all scoring functions. On the other hand, as the number of attributes is increased (Figure 7), the performance of our techniques drops. Interestingly, the Min scoring function copes significantly better with the increase in n than the other scoring functions. As mentioned in Section 4., the shape of the region containing the top tuples for a query under Min matches an n-rectangle perfectly, unlike the case for Sum and Euclidean. The performance of Euclidean, though, is better than that of Sum. As can be observed from Table and Figures (a) and (b), the size of the n-rectangle enclosing the top tuples for Sum is much larger than that for Euclidean (Sections 4. and 4.).
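One way to see why the three scoring functions behave so differently is to compare how much of its bounding n-rectangle each score region fills. The sketch below assumes, as the names suggest but this section does not restate, that the set of tuples scoring at least some threshold is a cube around the query for Min, a diamond (L1 ball) for Sum, and a sphere (L2 ball) for Euclidean. It also ignores the edges of the attribute domain. The function name `region_to_box_fraction` is hypothetical.

```python
import math


def region_to_box_fraction(n, scoring):
    """Fraction of the bounding n-cube [-r, r]^n that the score region fills.

    Assumed region shapes (not taken from this section):
      "min"       -> L-infinity ball, i.e. the cube itself
      "sum"       -> L1 ball, volume (2r)^n / n!
      "euclidean" -> L2 ball, volume pi^(n/2) * r^n / Gamma(n/2 + 1)
    The radius r cancels out of every ratio.
    """
    if scoring == "min":
        return 1.0
    if scoring == "sum":
        return 1.0 / math.factorial(n)
    if scoring == "euclidean":
        return math.pi ** (n / 2) / (math.gamma(n / 2 + 1) * 2 ** n)
    raise ValueError(f"unknown scoring function: {scoring}")
```

Under these assumptions, for n = 3 the sphere fills about 52% of its bounding cube while the diamond fills under 17%. A rectangle built around a Sum region therefore drags in far more unwanted tuples than one built around a Euclidean region, and the gap widens as n grows. This matches the ordering Min, Euclidean, Sum reported above.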
Effect of the Number of Tuples Requested k

Figure 8 studies the effect of increasing k, the number of tuples requested in a top-k query. As k is increased from to , the performance drops. As in the previous experiment, the percentage of tuples retrieved for Min grows the slowest, followed by Euclidean. The combination of scoring function Sum and the NoRestarts strategy performs the worst.

[Figure 6 plots: (a) % Tuples Retrieved and (b) % Restarts versus Data Skew; series: Sum, Euclidean, and Min, each with NoRestarts and Inter.]

Figure 6: The percentage of tuples retrieved (a), and the percentage of queries that needed restarts (b), for increasing data skew (PHASED histogram of 5KB; n = ).

[Figure 7 plots: (a) % Tuples Retrieved and (b) % Restarts versus the number of attributes; series: Sum, Euclidean, and Min, each with NoRestarts and Inter.]

Figure 7: The percentage of tuples retrieved (a), and the percentage of queries that needed restarts (b), as a function of the number of attributes (PHASED histogram of 5KB; Z data distribution).

[Figure 8 plots: (a) % Tuples Retrieved and (b) % Restarts versus k; series: Sum, Euclidean, and Min, each with NoRestarts and Inter.]

Figure 8: The percentage of tuples retrieved (a), and the percentage of queries that needed restarts (b), for different values of k (PHASED histogram of 5KB; Z data distribution; n = ).

Comparing Query Processing Strategies

Figure 9 compares the relative merits of the query processing strategies of Section 4.. At low data skews, the NoRestarts strategy results in a relatively larger number of matching tuples. However, as skew increases, the performance of NoRestarts improves significantly and dominates that of the other strategies, since, by definition, it incurs no query restarts with up-to-date histograms. Strategy Inter proves to be a robust technique, since it maintains good performance for all data skews.

[Figure 9 plots: (a) % Tuples Retrieved and (b) % Restarts versus Data Skew; series: NoRestarts, two Inter series, and Restarts.]

Figure 9: The percentage of tuples retrieved (a), and the percentage of queries that needed restarts (b), for increasing data skew (Euclidean scoring function; PHASED histogram of 5KB; n = ).

Effect of Using n-Rectangle Queries

As explained in Section 4., we process a top-k query q by first finding a score S_q and then finding an n-rectangle that encloses all tuples with a Score of S_q or higher. Our goal is for the n-rectangle to have as few bad tuples as possible, i.e., as few tuples with Score lower than S_q as possible. Figure 10 examines this issue by computing the actual number of tuples t with Score(q, t) ≥ S_q. In other words, we take the score S_q computed by using a histogram and a query processing strategy (Section 4.), and we count the tuples in the database with that score or higher. We can then compare these numbers against those in Figure 9(a) to conclude that using n-rectangles for retrieving the database tuples does not result in a major source of inefficiency, since the percentage of tuples in both cases is quite comparable.

Results for the Census Database

Figure 11 shows how our query processing strategies perform on the Census data set (Section 5.).
While none of the strategies resulted in a significant number of restarts (hence we do not show the corresponding plot here), the robustness of strategy Inter for increasing histogram size can be seen clearly. The performance for the different scoring functions is consistent with the results obtained for the synthetic databases described above.

[Figure 10 plot: % Tuples versus Data Skew; series: NoRestarts, two Inter series, and Restarts.]

Figure 10: The average number of tuples (as a percentage of N) with score S_q or higher (Step () of the Section 4 algorithm) for increasing data skew (Euclidean scoring function; PHASED histogram of 5KB; n = ).

[Figure 11 plot: % Tuples Retrieved versus Histogram Size (bytes); series: Sum, Euclidean, and Min, each with NoRestarts and Inter.]

Figure 11: The percentage of tuples retrieved, as a function of the number of bytes dedicated to the histogram (Census database; PHASED histogram).

7 Conclusions and Future Work

In this paper, we studied the problem of mapping a top-k query on a relational database to a traditional selection query such that the mapping is tight, i.e., we retrieve as few tuples as possible. Our mapping algorithms exploit the histogram structures and are able to cope with a wide variety of scoring functions. Our experiments highlighted the effect of different scoring functions, data distributions, as well as histogram-building strategies on the performance of this mapping. Our focus in this paper has been primarily on queries over continuous attributes. In the future, we will extend our techniques to handle top-k queries over discrete attributes. Another direction for future work is to explore approaches to support top-k queries with scoring functions (e.g., Max) that cannot be mapped tightly to the family of traditional selection queries that we used in this paper (Figure 12).

[Figure 12 plot: Max scores over axes X and Y.]

Figure 12: The scores for query q = (.4, .) for scoring function Max.

Acknowledgments

We thank Eugene Agichtein and David Lomet for their useful comments.

References

[1] M. J. Carey and D. Kossmann. On saying "Enough Already!" in SQL. In Proceedings of the 1997 ACM International Conference on Management of Data (SIGMOD'97), May 1997.

[2] M. J. Carey and D. Kossmann. Reducing the braking distance of an SQL query engine. In Proceedings of the Twenty-fourth International Conference on Very Large Databases (VLDB'98), Aug. 1998.

[3] S. Chaudhuri and L. Gravano. Optimizing queries over multimedia repositories. In Proceedings of the 1996 ACM International Conference on Management of Data (SIGMOD'96), June 1996.

[4] R. Fagin. Combining fuzzy information from multiple systems. In Proceedings of the Fifteenth ACM Symposium on Principles of Database Systems (PODS'96), June 1996.

[5] R. Fagin. Fuzzy queries in multimedia database systems. In Proceedings of the Seventeenth ACM Symposium on Principles of Database Systems (PODS'98), June 1998.

[6] L. Gravano and H. García-Molina. Merging ranks from heterogeneous Internet sources. In Proceedings of the Twenty-third International Conference on Very Large Databases (VLDB'97), Aug. 1997.

[7] F. Korn, N. Sidiropoulos, C. Faloutsos, E. Siegel, and Z. Protopapas. Fast nearest neighbor search in medical image databases. In Proceedings of the Twenty-second International Conference on Very Large Databases (VLDB'96), Sept. 1996.

[8] W. Meng, K.-L. Liu, C. Yu, X. Wang, Y. Chang, and N. Rishe. Determining text databases to search in the Internet. In Proceedings of the Twenty-fourth International Conference on Very Large Databases (VLDB'98), Aug. 1998.

[9] A. Motro. VAGUE: A user interface to relational databases that permits vague queries. ACM Transactions on Office Information Systems, 6(3):187-214, July 1988.

[10] M. Muralikrishna and D. J. DeWitt. Equi-depth histograms for estimating selectivity factors for multidimensional queries. In Proceedings of the 1988 ACM International Conference on Management of Data (SIGMOD'88), June 1988.

[11] V. Poosala and Y. E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In Proceedings of the Twenty-third International Conference on Very Large Databases (VLDB'97), Aug. 1997.

[12] V. Poosala, Y. E. Ioannidis, P. J. Haas, and E. J. Shekita. Improved histograms for selectivity estimation of range predicates. In Proceedings of the 1996 ACM International Conference on Management of Data (SIGMOD'96), June 1996.

[13] G. Salton and M. J. McGill. Introduction to modern information retrieval. McGraw-Hill, 1983.

[14] T. Seidl and H.-P. Kriegel. Optimal multi-step k-nearest neighbor search. In Proceedings of the 1998 ACM International Conference on Management of Data (SIGMOD'98), June 1998.

[15] G. K. Zipf. Human behaviour and the principle of least effort. Addison-Wesley, 1949.


More information

SD vs. SD + One of the most important uses of sample statistics is to estimate the corresponding population parameters.

SD vs. SD + One of the most important uses of sample statistics is to estimate the corresponding population parameters. SD vs. SD + Oe of the most importat uses of sample statistics is to estimate the correspodig populatio parameters. The mea of a represetative sample is a good estimate of the mea of the populatio that

More information

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

Pruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c

Pruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c Advaces i Egieerig Research (AER), volume 131 3rd Aual Iteratioal Coferece o Electroics, Electrical Egieerig ad Iformatio Sciece (EEEIS 2017) Pruig ad Summarizig the Discovered Time Series Associatio Rules

More information
