Estimating Progress of Execution for SQL Queries


Surajit Chaudhuri, Microsoft Research
Vivek Narasayya, Microsoft Research
Ravishankar Ramamurthy, University of Wisconsin, Madison

ABSTRACT

Today's database systems provide little feedback to the user/DBA on how much of a SQL query's execution has been completed. For long running queries, such feedback can be very useful, for example, to help decide whether the query should be terminated or allowed to run to completion. Although the above requirement is easy to express, developing a robust indicator of progress for query execution is challenging. In this paper, we study the above problem and present techniques that can form the basis for effective progress estimation. The results of experimentally validating our techniques in Microsoft SQL Server are promising.

1. INTRODUCTION

Decision support applications typically include long-running queries. For such queries, the ability to estimate the progress of query execution could be very useful. Progress estimation could help DBAs as well as end users or applications decide whether to terminate the query or allow it to finish. Such feedback could qualitatively improve the experience for any database user. However, today's database systems only provide rudimentary feedback to users about progress of query execution. This feedback is limited to the query optimizer generated execution plan and its cost, as well as the number of tuples returned by the query during its execution. Beyond this, to the best of our knowledge, there is no prior published work on the problem of progress estimation for SQL query execution.

The most useful measure of progress would report to the user, at any point during the query's execution, the amount of time required for the query to complete execution. However, any method that provides such a measure would be subject to uncertainty arising from concurrent execution of other queries. Due to this difficulty, we focus on the problem of estimating the percentage remaining (or equivalently completed) of the query at any point during its execution, i.e., reporting a progress bar for query execution.
Such an estimator is simpler than estimating time remaining since it is independent of other queries. In effect, this measure estimates the time remaining on an isolated system where only the given query is executing. Effective progress estimation for query execution requires us to accurately estimate the total work required to execute the query. Queries in modern database systems are quite complex, involving joins, nested sub-queries and aggregation. Any measure of work for a query that is independent of the intermediate cardinalities of such operators is likely to be too simplistic.

For example, consider a metric that reports progress as the percentage of query results that have been returned thus far. Let us assume that we could accurately estimate the total number of rows that a query will return in its result. To see why such a metric for progress could be really inaccurate, consider an execution plan consisting of a very expensive join followed by an inexpensive Sort operation. Since Sort is a blocking operation, query results are not returned until the Sort starts outputting rows. Therefore, until such time, the above metric would report no progress irrespective of how much work was done in the join.

As another illustration of why the problem is difficult, consider a metric that reports the percentage of nodes (i.e., operators) in the execution plan that have completed. However, if a query is just a single pipeline of operators, for almost the entire duration of the execution of the query, all the operators in the plan are active, i.e., not yet completed. Thus the above metric will not report any progress until near the very end of query execution.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD 2004, June 13-18, 2004, Paris, France. Copyright 2004 ACM /04/06 $5.00.
We note that a query optimizer already uses a model of work done by a query (based on estimated CPU and I/O costs). While leveraging this model for progress estimation may be possible, in this paper, we ask whether an even simpler model would suffice for the purposes of progress estimation. The motivation for this simpler model is the ease of incorporation into existing query execution engines. We model work done by a query as a function of the number of rows output by each operator in the query execution plan. While this model does inherit the known difficulties of cardinality estimation faced by a query optimizer, we use two key ideas to help mitigate the impact of inaccurate cardinality estimation on progress estimation.

First, we observe that it is possible to estimate the cardinalities of certain operators, e.g., Table Scans or Index Scans, which we refer to as driver nodes (formally defined in Section 2), much more accurately than other intermediate nodes in a pipeline, e.g., a Filter or Hash Join. We show that in many cases estimating the overall query progress by only monitoring progress of these driver nodes can greatly improve accuracy. Second, during query execution we leverage runtime execution information to refine cardinality estimation. We take a conservative approach (based on maintaining and refining upper and lower bounds on cardinalities of operators in the plan) that is guaranteed not to introduce additional inaccuracies as a result of such refinement.

Our solution is applicable to arbitrary SQL queries and can be implemented at low overhead in existing database systems. We have implemented our techniques inside Microsoft SQL Server and the initial results of experimentally evaluating our estimator on the TPC-H benchmark [2] queries (10 GB version on uniform as well as skewed data distributions) are promising.

Work done while author was visiting Microsoft Research.

The rest of the paper is structured as follows. Section 2 describes the problem and presents our model of work done by a query. Given this model, we propose in Section 3 an estimator for the progress of a query whose execution consists of a single pipeline. Section 4 presents our solution for the general case of a query involving multiple pipelines. Experimental validation of our prototype on decision support queries is presented in Section 5. Section 6 discusses the desirable property of monotonicity of progress estimation and its relationship to accuracy of estimation. We present extensions to our model of work to be more robust to runtime conditions in Section 7, and discuss related work in Section 8. We conclude with a brief discussion on interesting areas of future work.

2. PROBLEM DESCRIPTION

2.1 Definitions

A progress estimator uses an execution plan that is chosen by the query optimizer for the given query. An execution plan is a tree where the nodes of the tree are physical operators. For example, Figure 1 shows the execution plan for a query.

[Figure 1. Example of an execution plan for a query. Nodes: Index Nested Loops Join, Hash Join, Index Seek C, Filter, Index Scan B, Table Scan A.]

A physical operator is referred to as a blocking operator if it does not produce any outputs until it has consumed at least one of its inputs completely. For example, suppose Table Scan A with Filter is the build relation of the Hash Join and Index Scan B is the probe relation. The Hash Join operator in Figure 1 is blocking since it must consume all rows from the build relation before it produces any output. Another example of a common blocking operator is Sort.

The overall execution of a query is staged into multiple pipelines. We now define the notion of pipelines for an execution plan consisting of common physical operators such as Table Scan, Index Scan, Index Seek, Filter, Hash Join, Merge-Join, Index Nested Loops (INL) Join, NL Join, Group-By (Hash-based) and Sort. The definition is procedural and proceeds inductively in a bottom up manner over the nodes of an execution plan. A leaf node of the plan (Table Scan, Index Scan, Index Seek) starts a pipeline. A Filter node is part of the pipeline that its child operator belongs to. For a Hash Join, the join operator is included in the pipeline of the probe child, and the build child is the root of another pipeline. For a Merge-Join, the pipelines containing its children and the Merge Join operator itself are unioned to create a single pipeline. For a Nested Loops or Index Nested Loops Join operator, the join operator and its entire inner subtree are part of the same pipeline as the outer child node. Both Sort and Group-By (hash-based) operators, which are blocking, start a new pipeline of their own. For the example in Figure 1, the pipelines are: $P_1$ = {Table Scan A, Filter}, $P_2$ = {Index Scan B, Hash Join, Index Nested Loops, Index Seek C}. In principle the above definition can be extended to other physical operators as well. Thus, intuitively, a pipeline can be thought of as a maximal subtree of concurrently executing operators.

Every pipeline has a set of driver nodes, i.e., operators that are the sources of tuples operated upon by remaining nodes in the pipeline. More precisely, we define the driver nodes of a pipeline as the set of all leaf nodes of the pipeline, except those that are in the inner subtree of a Nested Loops/Index Nested Loops join. For example, in Figure 1, the shaded nodes are driver nodes: Table Scan A is the driver node for the pipeline $P_1$ and Index Scan B is the driver node for pipeline $P_2$. Note that Index Seek C is not a driver node since it is a leaf node of the inner subtree of an Index Nested Loops Join. We observe that it is possible for a pipeline to contain more than one driver node, e.g., in a Merge-Join of two sorted relations, both the input relations to the Merge-Join are driver nodes. This is illustrated in Figure 2. The pipelines identified for this query would be $P_1$ = {Table Scan A}, $P_2$ = {Index Scan B}, $P_3$ = {Sort A, Sort B, Merge Join, Index Nested Loops, Index Seek C} and the driver nodes (the shaded nodes) would be respectively {Table Scan A}, {Index Scan B}, {Sort A, Sort B}. Thus there are two driver nodes for the last pipeline. We note that unlike a Hash Join, for a Sort-Merge Join, the scans of both inputs do not necessarily need to complete for the Sort-Merge Join to complete.

[Figure 2. Execution plan with Sort-Merge Join. Nodes: Index Nested Loops Join, Sort-Merge Join, Index Seek C, Sort A, Sort B, Table Scan A, Index Scan B.]

An execution plan can be viewed as a partial order of pipelines since, in general, for certain pipelines to start executing, one or more other pipelines need to complete. For example in Figure 1, execution of $P_1$ must precede $P_2$. Similarly in Figure 2, execution of $P_1$ and $P_2$ must precede $P_3$, but the order between $P_1$ and $P_2$ is arbitrary.

2.2 Desirable Properties of a Progress Estimator

Accuracy: The estimated percentage of work completed by the query at any point during its execution should be close to the actual percentage of work completed by the query at that point.

Fine granularity: It follows from the above accuracy requirement that the estimator should be able to provide estimates at sufficiently fine granularity over the duration of the query's execution. Thus, for example, an estimator that only provides accurate estimates at 0% and 100% completion would not be useful.

Low overhead: An essential requirement for a progress estimator to be practical is that it should impose low overhead on the actual execution of the query.

Leveraging feedback from execution: As query execution progresses, more information based on (intermediate) results of execution can become available. Ideally, an estimator should be able to take full advantage of such information.

Monotonicity: Since the actual execution of the query progresses monotonically, ideally, the estimated progress should also be monotonically increasing from the start of query execution to its finish.

We observe that in today's database systems feedback on query progress during execution does not satisfy one or more of the above requirements. While the optimizer estimated cost of a query can be obtained at low overhead and progress estimation based on this cost is trivially monotonic (since the estimated cost does not change over the lifetime of the query's execution), it can potentially be inaccurate and it does not leverage any feedback from execution. Similarly, the number of tuples returned by a query during its execution (while low overhead and monotonic) has the major drawback that it can be inaccurate and lacking in granularity as illustrated in the introduction. Moreover, it only takes limited advantage of execution feedback. Finally, we note that in general, there is a trade-off between guaranteeing monotonicity and achieving accuracy of progress estimation (we discuss this further in Section 6).

2.3 The GetNext() Model of Work

As described in the introduction, our goal is to estimate progress of a query on an isolated system, i.e., on a system where there is no other activity besides the execution of this query. Any progress estimator requires a model of work done by a query as the basis of its estimation.
In this section we present such a model of work. One approach for modeling the work done by a query could have been to use the cost model used by query optimizers for comparing different execution plans for a query. Query optimizers typically model the work done by the query as a function of CPU, random I/O and sequential I/O costs. Thus, to use such a model for progress estimation, we would need to measure the CPU, random and sequential I/Os performed by the query during its execution. In this paper we investigate whether an even simpler model of work would be adequate for the purposes of progress estimation. The main motivation for a simpler model is the ease with which it can be incorporated into today's database systems. The reason we expect that a simpler model may be adequate for progress estimation is that unlike the query optimizer, which needs to distinguish between multiple plans for a given query using its cost model, we only need to be able to estimate the percentage of work done for a given query execution plan.

We note that operators in a query execution plan are typically implemented using a demand driven iterator model [5], where each physical operator in the execution plan exports a standard interface for query processing (including Open(), Close() and GetNext()). We propose to model the work done by a query as the total number of GetNext() calls issued throughout the duration of the query's execution over all operators in the execution plan. In essence, we are counting each GetNext() call as a primitive operation of query processing and modeling the total work done by the query by the total number of GetNext() calls. Note that all CPU instructions, I/Os etc. performed by the query occur as a result of GetNext() calls. Thus, this model assumes that the total time required to execute the query is amortized across multiple GetNext() calls, and therefore the percentage of GetNext() calls done thus far is a good estimator of the time taken by the query (on an isolated system). It should be noted that the GetNext() model of work is inadequate for the purposes of query optimization.
As a simple example of why this is the case, consider two plans for the same query: one involving a non-clustered index Index Seek and another involving a Table Scan. With the above GetNext() model of work, the Index Seek would always be considered cheaper (i.e., less work) by the query optimizer since the number of rows it returns can never exceed that of the Table Scan.

Progress Estimation Based on GetNext() model

We now define progress estimation based on the GetNext() model of work. Suppose the execution plan has a total of $m$ operators. Let the total number of tuples that flow out of operator $Op_i$ (i.e., number of GetNext() calls invoked on that operator) at the end of query execution be $N_i$ ($i = 1..m$). At any point during query execution, let the number of tuples that have flowed out of every operator thus far be $K_i$ ($i = 1..m$). Thus, the ideal estimator under the GetNext() model of work (we call it gnm) would estimate progress at that point during the query's execution as:

$$\mathrm{gnm} = \frac{\sum_{i=1}^{m} K_i}{\sum_{i=1}^{m} N_i}$$

Note that while accurate $K_i$ values can be obtained as the query is executing, the exact $N_i$ values are available only at the end of query execution. Thus, the estimator gnm is not directly implementable as stated above since the $N_i$'s are not known exactly while the query is executing. The key challenge for any progress estimator E that uses the above model of work is therefore to estimate $N_i$ as accurately as possible while the query is executing. Note that the problem of estimating the number of GetNext() calls for an operator in the query execution plan is the cardinality estimation problem faced by query optimizers. The only difference is that unlike a query optimizer, which can only use pre-computed database statistics (e.g., histograms), the estimator E can potentially also observe feedback from query execution for use in its estimation.

We observe that the fine granularity requirement (see Section 2.2) should typically be satisfied by an estimator using the GetNext() model since for a long running query, a large number of GetNext() calls are made during its execution. Another desirable property of an estimator is small runtime overhead.
For example, an estimator that actually executes the query in order to obtain the total number of GetNext() calls ($N_i$) would be unacceptable. Thus, we require that the information used by any estimator be limited to a small amount of aggregated information either in the form of pre-computed database statistics or statistics computed on observed feedback from query execution. Although this restriction by itself is not sufficient to guarantee low overhead, it appears to be necessary for an estimator to be practical.

The estimator that we present in this paper uses feedback from query execution (see Section 4) to refine estimates of $N_i$. Observe that since $K_i$ is monotonically increasing as the query executes, the monotonicity of the estimator depends on how the estimates of $N_i$ are changed by the estimator as the query executes. We comment on the monotonicity property of our estimator based on the GetNext() model in Section 6. We note a couple of additional properties of the GetNext() model of work: (1) It can be applied to modern database systems since they typically employ a demand driven iterator model for query execution. (2) It has the property that it is invariant across multiple runs of the same query.

3. DRIVER NODE ESTIMATOR: SINGLE PIPELINE QUERIES

In this section, we outline our solution for the progress estimation problem for the class of queries that consist of a single execution pipeline. We show how our solution extends to the general class of arbitrary query execution plans (consisting of multiple pipelines) in Section 4. For simplicity, we consider a query whose execution plan is a single pipeline consisting of a chain of $m$ (non-blocking) operators, $Op_1 \to Op_2 \to \dots \to Op_m$, and having a single operator $Op_1$ as its driver node (see Section 2.1 for the definition of a driver node). Typically, such a pipeline consists of a single driver node (e.g., Table Scan or Index Scan) followed by a sequence of non-blocking operators such as Filter and Index Nested Loops (INL) join.

As described earlier, the key challenge for any estimator using the GetNext() model (i.e., trying to estimate gnm) is to accurately estimate $\sum_i N_i$, the total number of GetNext() calls that will be performed over all nodes in the query. In an ideal world, the optimizer's estimates of $N_i$ (and hence the progress estimator, which can use such estimates) would be accurate.
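To make the GetNext() model concrete for such a single pipeline, the following is a minimal Python sketch, not the SQL Server implementation. The class names `Operator`, `TableScan` and `Filter`, the `get_next` method and the toy data are all hypothetical. Each operator counts the rows it has returned so far (the per-operator count written $K_i$ in the paper), and the ideal estimator gnm is computed against final totals ($N_i$) that are known here only because the pipeline is first run to completion.

```python
class Operator:
    """Demand-driven iterator that counts successful GetNext() calls."""

    def __init__(self, name):
        self.name = name
        self.k = 0  # rows returned so far (K_i in the paper's notation)

    def get_next(self):
        row = self._produce()
        if row is not None:  # assumption: None marks end of stream
            self.k += 1
        return row

    def _produce(self):
        raise NotImplementedError


class TableScan(Operator):
    def __init__(self, name, rows):
        super().__init__(name)
        self._rows = iter(rows)

    def _produce(self):
        return next(self._rows, None)


class Filter(Operator):
    def __init__(self, name, child, predicate):
        super().__init__(name)
        self.child = child
        self.predicate = predicate

    def _produce(self):
        # Pull from the child until a row qualifies or the input ends.
        while True:
            row = self.child.get_next()
            if row is None or self.predicate(row):
                return row


def gnm(operators, totals):
    """Ideal estimator: GetNext() calls so far over final GetNext() calls."""
    done = sum(op.k for op in operators)
    total = sum(totals[op.name] for op in operators)
    return done / total


def build_pipeline():
    scan = TableScan("scan", range(100))
    flt = Filter("filter", scan, lambda r: r % 4 == 0)
    return [scan, flt], flt


# First run: execute to completion only to learn the true totals N_i.
ops, root = build_pipeline()
while root.get_next() is not None:
    pass
totals = {op.name: op.k for op in ops}

# Second run: stop after the root has returned 10 rows and measure progress.
ops, root = build_pipeline()
for _ in range(10):
    root.get_next()
progress = gnm(ops, totals)
```

In this toy run the totals are available only because the query was executed once beforehand, which is exactly what a practical estimator cannot afford to do.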
But cardinality estimation usually involves simplifying assumptions (particularly on the correlation between data values) and consequently is prone to estimation errors. For example, it is known that estimation errors propagate exponentially as a function of the number of joins in the query [8]. Our focus in this paper is not on developing techniques for better cardinality estimation for the purpose of query optimization. Rather, we develop additional techniques that could mitigate the impact of errors in cardinality estimation on progress estimation.

Our estimator (called the Driver Node Estimator, dne for short) for single pipeline queries having exactly one driver node is defined as:

$$\mathrm{dne} = \frac{K_1}{N_1}$$

where $K_1$ is the number of GetNext() calls done on the driver node of the pipeline, $Op_1$, thus far, and $N_1$ is the estimated total number of GetNext() calls for $Op_1$. Therefore, underlying dne is the hypothesis (we refer to it as the driver node hypothesis) that overall query progress can be estimated by the progress of only the driver node of the pipeline, i.e.:

$$\frac{K_1}{N_1} \approx \frac{\sum_{i=1}^{m} K_i}{\sum_{i=1}^{m} N_i}$$

There are a few important reasons why the estimator dne can work well in practice. First, note that inaccuracies in gnm arise due to inaccurate $N_i$ estimates. Since a driver node in a pipeline is the source of tuples that are operated upon by other nodes in the pipeline, prior to the start of execution of that pipeline, the cardinality of the driver node is typically known accurately. For example, for many pipelines, driver nodes are typically Table Scans or Index Scans, and the estimates of $N_1$ for such driver nodes can be obtained (almost exactly) from the database system catalogs. While the estimates may not be as accurate in the case of the driver node being an Index Seek operator, any histograms on the predicate columns can be leveraged. In such cases, the estimate of $N_1$ for the driver node can still be quite accurate. On the other hand, estimates of $N_i$ for a Filter node that references a UDF, or for a Nested Loops Join node, are usually more inaccurate due to the inherent difficulties in selectivity estimation and errors in propagation to intermediate nodes [8].
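A small numeric sketch of this first point follows. The function names and all numbers are hypothetical and chosen only for illustration: a scan whose cardinality is known from the catalog feeds a filter whose cardinality the optimizer overestimates. The gnm value computed from the optimizer's totals is distorted by the bad filter estimate, while dne depends only on the driver node.

```python
def dne(k_driver, n_driver):
    """Driver node estimator: progress of the driver node alone."""
    return k_driver / n_driver


def gnm(k, n):
    """GetNext() model estimator over all operators of the pipeline."""
    return sum(k) / sum(n)


# Pipeline: Table Scan -> Filter, observed halfway through the scan.
k_so_far = [500, 50]            # rows output so far by scan and filter
true_totals = [1000, 100]       # actual final cardinalities
optimizer_totals = [1000, 400]  # scan from catalog, filter overestimated

ideal = gnm(k_so_far, true_totals)                   # uses unknowable totals
with_estimates = gnm(k_so_far, optimizer_totals)     # uses optimizer totals
driver_only = dne(k_so_far[0], optimizer_totals[0])  # uses scan only
```

Here `driver_only` coincides with `ideal`, whereas `with_estimates` under-reports progress because of the inflated filter estimate.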
Thus, using only driver nodes for progress estimation can often result in better accuracy. Second, when the cardinality of the driver node dominates the $N_i$'s of other operators in the pipeline we can expect the estimator dne to be close to gnm. This is not uncommon in decision support queries such as TPC-H [2] where the driver node cardinalities are large (e.g., large Table/Index Scans), and where operators such as Filter and Group-By can greatly reduce the cardinality of non-driver nodes. Third, observe that the driver node hypothesis implies:

$$\frac{\sum_{i=1}^{m} K_i}{K_1} \approx \frac{\sum_{i=1}^{m} N_i}{N_1}$$

where $\sum_i K_i / K_1$ can be thought of as the work done per tuple output by the driver node. Therefore, when the number of GetNext() calls made over all nodes in the pipeline per driver-node tuple does not vary significantly over the lifetime of the pipeline, monitoring progress of only the driver node is sufficient. Although this condition does not hold for arbitrary pipelines, we show below an important class of pipelines (in which the output cardinality of each operator is no larger than its input cardinality) for which dne still yields progress estimates that are within a constant factor of gnm. Finally, since a driver node is the source of tuples processed by other operators in the pipeline, it typically provides sufficiently fine granularity of progress estimation. This property may not hold in general for other operators (such as Filter or NL Join) in a pipeline, since their cardinalities may be arbitrarily small.

Guarantee of dne for monotonically decreasing pipelines: We discuss an important class of single pipeline queries where dne is guaranteed to be accurate within a constant factor of gnm (the constant factor is the number of operators in the pipeline). Consider pipelines having the logical property that no operator in the pipeline can increase its incoming cardinality. Thus, at any point during the query's execution, $K_{i+1} \le K_i$ and $N_{i+1} \le N_i$. We refer to such a pipeline as a monotonically decreasing pipeline. Some of the common physical operators that could be part of a monotonically decreasing pipeline are Table Scan, Filter and streaming aggregate operators. INL Join would also satisfy the above property when the join looks up a key value (i.e., a foreign key-key join).

Claim: For a monotonically decreasing pipeline with $m$ operators, the estimator dne is guaranteed to be accurate within a constant factor of the ideal estimator gnm, i.e.

$$\frac{\mathrm{gnm}}{m} \le \mathrm{dne} \le m \cdot \mathrm{gnm}$$

Proof: See Appendix A.

Note that for the case of a single pipeline consisting of a Table Scan, Filter and aggregation operator (similar to the class of queries studied in the online aggregation work, e.g., [7]), if the input tuples to such a pipeline $P$ are read in random order, then the driver node hypothesis will hold, i.e., the expected value of $K_1/N_1$ is $\sum_{P} K_i / \sum_{P} N_i$ for that pipeline. Finally, for the case of a single pipeline that is not monotonically decreasing, the above guarantee does not hold, and dne is a heuristic. Intuitively, if an intermediate operator (e.g., a non foreign-key Nested Loops Join) can increase its incoming cardinality arbitrarily, then the distribution of work done in the pipeline can be skewed so that progress at the driver node may not be indicative of overall progress of the pipeline.

4. SOLUTION FOR GENERAL CASE

In this section, we extend our solution to an arbitrary SQL query execution plan that consists of multiple pipelines. As described in Section 2.1, we model an arbitrary execution plan as a partial order of pipelines and extend the idea of Section 3 of using only driver nodes for each currently executing pipeline. Therefore, in our approach, the key issues are: (1) explaining how to use the Driver Node Estimator (dne) for a single pipeline to obtain an overall progress estimate for the entire query execution plan (Section 4.1), and (2) initializing and refining the cardinality estimates based on feedback from query execution (Section 4.2).

4.1 Estimator for Arbitrary Query Execution Plan

As per our definition of gnm (Section 2.3), for a query execution plan with $s$ pipelines $P_1, \dots, P_s$, our estimator for the entire query can be rewritten equivalently as follows:

$$\mathrm{gnm} = \frac{\sum_{P_1} K_i + \dots + \sum_{P_s} K_i}{\sum_{P_1} N_i + \dots + \sum_{P_s} N_i}$$

where each summation term denotes the sum over all nodes in the corresponding pipeline.
As discussed previously, the key challenge for a pipeline $P_j$ is estimating $\sum_{P_j} N_i$ for that pipeline. We note that in a query execution plan that involves multiple pipelines, each pipeline must be in one of the following states: (a) Completed. (b) Currently executing. (c) Not yet started executing. For any pipeline that has completed execution we have the exact values of the number of GetNext() calls done on all operators in that pipeline, and thus $\sum_{P_j} N_i = \sum_{P_j} K_i$ for such a pipeline. For a currently executing pipeline, we use dne to estimate $\sum_{P_j} N_i$. Specifically, it follows directly from the driver node hypothesis that $\sum_{P_j} N_i = \sum_{P_j} K_i / \mathrm{dne}$. A pipeline that has not yet started executing has $\sum_{P_j} K_i = 0$, and we use the optimizer's estimates for $\sum_{P_j} N_i$. In fact, it is for this case that we expect a significant opportunity to improve the estimate of $\sum_{P_j} N_i$ using feedback from query execution.

4.2 Exploiting Execution Feedback for Refining Estimates

A key challenge arises from estimating the cardinality of nodes of pipelines that start with intermediate blocking nodes, e.g., Sort nodes and hash based Group-By nodes. For nodes of such pipelines there is an opportunity to get better cardinality estimates by using feedback from query execution. Consider the query execution plan shown in Figure 3. Suppose A is the build relation and B the probe relation of the Hash Join. The pipelines for the query are $P_1$ = {Table Scan A, Filter, Hash Join}, $P_2$ = {Table Scan B}, $P_3$ = {Group-By} and $P_4$ = {Sort}, which are executed in the order $P_1$, $P_2$, $P_3$, $P_4$. The driver nodes for the query are shaded in the figure. To estimate the cardinality of the Sort operator (in pipeline $P_4$), we would need to have accurate estimates on the filter, join and group-by operators of the first two pipelines. This is in fact the traditional cardinality estimation problem and is error prone. Hence, our initial estimate of work done by the Sort could be inaccurate, potentially leading to overall incorrect progress estimation. Here, execution feedback can be leveraged to improve the estimate of the Sort node cardinality.
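The three-state rule of Section 4.1 (exact counts for completed pipelines, dne for the executing pipeline, optimizer estimates for pipelines not yet started) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's implementation: the names `State`, `Pipeline`, `estimated_total` and `query_progress` are hypothetical, each pipeline is assumed to have a single driver node, and falling back to the optimizer estimate before the driver has produced any row is an added assumption.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    COMPLETED = 1
    EXECUTING = 2
    NOT_STARTED = 3


@dataclass
class Pipeline:
    state: State
    k_sum: int         # GetNext() calls done so far over all nodes
    est_n_sum: float   # optimizer estimate of total GetNext() calls
    k_driver: int = 0  # rows output so far by the driver node
    n_driver: int = 1  # estimated total rows of the driver node


def estimated_total(p):
    """Estimated total GetNext() calls of one pipeline, by state."""
    if p.state is State.COMPLETED:
        return p.k_sum  # exact: all counts are final
    if p.state is State.EXECUTING and p.k_driver > 0:
        dne = p.k_driver / p.n_driver
        return p.k_sum / dne  # driver node hypothesis
    return p.est_n_sum  # not started (or no driver rows yet): optimizer


def query_progress(pipelines):
    done = sum(p.k_sum for p in pipelines)
    total = sum(estimated_total(p) for p in pipelines)
    return done / total


plan = [
    Pipeline(State.COMPLETED, k_sum=1200, est_n_sum=900),
    Pipeline(State.EXECUTING, k_sum=300, est_n_sum=2000,
             k_driver=250, n_driver=1000),
    Pipeline(State.NOT_STARTED, k_sum=0, est_n_sum=500),
]
progress = query_progress(plan)
```

In this sketch the pipeline that has not started contributes its optimizer estimate unchanged, which is where the feedback-based refinement discussed in this section has the most room to help.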
For example, when pipeline $P_2$ completes, we in fact have the exact value of the cardinality of the result of the join. Similarly, when the Group-By completes, we have the exact cardinality (no uncertainty) of the input to the Sort. In the rest of this section, we describe a general framework for refining cardinality estimates of a given execution plan based on execution feedback.

[Figure 3. Execution plan with multiple blocking operators. Nodes: Sort, Group-By (Hash), Hash Join, Filter, Table Scan A, Table Scan B.]

While several techniques are possible, in this paper we follow a conservative approach that ensures that we never introduce any additional inaccuracies due to the refinement process. Thus, we refine the current estimate of $N_i$ for any node only if we are certain that the refinement will make the estimate more accurate. We achieve this as follows. For each node $i$ in the execution plan, we track two additional values $UB_i$ and $LB_i$, which are, respectively, the upper and lower bounds on the cardinality of the rows that can be output from that node. These bounds are based solely on the algebraic properties of the operator and observed cardinalities from execution, and are guaranteed to be actual bounds on $N_i$. In particular, this means that $LB_i \le N_i \le UB_i$. We adjust these lower and upper bounds as we get more information from query execution using the techniques described below. The invariant that we maintain at all times is that $LB_i \le$ current estimate of $N_i \le UB_i$, i.e., if we find that the current estimate of $N_i$ lies outside the bounds, then we correct its value to the appropriate bound. The effectiveness of such refinement based on bounds depends on how much and how quickly these bounds can be refined based on execution feedback.

When an upper (or lower) bound for a particular node is refined, this could potentially help refine the upper (or lower) bound of other nodes above it in the execution tree. We propagate these bounds using algebraic properties of operators. For example, in Figure 3, suppose that at some point in time T during the query's execution, we were able to conclude that the upper bound for the Hash Join can be reduced from 1 million rows to 0.5 million rows. Suppose the upper bounds for the Group-By and Sort nodes were 0.8 million rows. Then, based on the algebraic properties of Group-By and Sort nodes, we can also conclude that each of their upper bounds cannot exceed 0.5 million rows. The lowering of the upper bound could help refine the estimates of $N_i$ at one or both of these nodes at time T. Note that although dne uses only the driver node cardinalities for the currently executing pipeline, it is necessary to refine cardinalities of all nodes in the pipeline, since this could influence the estimates for nodes in a pipeline that is yet to start executing. In our implementation, we propagate bounds a few times per second (at roughly the granularity at which feedback is necessary to the user/application).

Refining lower and upper bounds

The refinement of lower and upper bounds for an operator $Op_i$ at query execution time uses the following information:
(1) The observed input and output cardinalities of the operator (i.e., the $K_i$ of the operator as well as of its input operators).
(2) Algebraic properties of the operator. For example, for Filter and Group-By operators, we know that the cardinality cannot exceed the input cardinality.
(3) The current state of the operator.
This refers to the state of internal data structures used by the operator, for example, the current number of entries in the hash table of a Group-By operator.

For refining lower bounds, $K_i$ (the actual number of rows output from the operator thus far) is itself a correct lower bound for any operator. An example of where the algebraic property of the operator is useful for refining lower bounds is Sort. Since Sort has the property that it does not change its input cardinality, $K_{i-1}$ (i.e., the cardinality of the input operator to the Sort) is in fact a valid lower bound. Thus, the cardinality of the Sort operator (which is always the start of a new pipeline) can be refined while the previous pipeline is executing. As an example of where the current state of the operator is useful in refining lower bounds, consider the Group-By (hash based) operator. If we can count the number of distinct values observed during the operator's execution thus far (say $d$), then the lower bound can be refined to $d$ at that point in time. This could be done, for example, by tracking the internal hash table used by the operator.

As far as the upper bound is concerned, for operators such as Filter and NL Join (foreign-key join), we can leverage their algebraic properties (the fact that they can never increase their input cardinality) and the $K_i$'s to refine the upper bound to $(UB_{i-1} - K_{i-1}) + K_i$. Another example where algebraic properties help refine the upper bound is Sort, where $UB_{i-1}$ (i.e., the upper bound of the input to the Sort) is an upper bound for the Sort itself. An example of the use of current operator state for refining upper bounds is the Hash Join operator. Consider a Hash Join between two relations A (build side) and B (probe side). Assume A has already been hashed into buckets, and suppose S is the number of tuples in the largest bucket. We can exploit this information during the probe phase to obtain a tighter upper bound since we know that each row from B can produce at most S tuples after the join. We refer the reader to Appendix B for details of how upper and lower bounds can be refined for certain common physical operators.
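The refinement rules above for Filter and Sort, together with the invariant that the current estimate is clamped into its bounds, can be sketched as follows. This is an illustrative sketch only: the `Node` structure, the `refine` function and the numbers are hypothetical, each node is assumed to have a single input, and the rules for other operators (Group-By, Hash Join) are omitted.

```python
from dataclasses import dataclass


@dataclass
class Node:
    kind: str                # "scan", "filter" or "sort" in this sketch
    k: int                   # rows output so far
    est: float               # current estimate of total rows
    lb: float = 0.0          # lower bound on total rows
    ub: float = float("inf") # upper bound on total rows


def refine(node, child=None):
    """Tighten bounds from observed counts, then clamp the estimate."""
    # Rows already output are a valid lower bound for any operator.
    node.lb = max(node.lb, node.k)
    if child is not None:
        if node.kind == "filter":
            # At most: rows output so far plus every row still to arrive.
            node.ub = min(node.ub, (child.ub - child.k) + node.k)
        elif node.kind == "sort":
            # Sort preserves cardinality: input rows seen bound it below,
            # and the input's upper bound bounds it above.
            node.lb = max(node.lb, child.k)
            node.ub = min(node.ub, child.ub)
    # Invariant: lb <= current estimate <= ub.
    node.est = min(max(node.est, node.lb), node.ub)


# Bottom-up pass over Scan -> Filter -> Sort at some point during execution.
scan = Node("scan", k=600, est=1000, lb=1000, ub=1000)  # catalog cardinality
flt = Node("filter", k=30, est=700)   # optimizer overestimate
sort = Node("sort", k=0, est=20)      # optimizer underestimate

refine(scan)
refine(flt, scan)
refine(sort, flt)
```

After this pass the Filter estimate is pulled down to its new upper bound and the Sort estimate is pulled up to its new lower bound; neither change can move an estimate away from the true cardinality, which is the conservative property the text relies on.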
In the future, we intend to explore the applicability of other rules that can yield tighter bounds based on execution feedback. We observe that whenever an operator terminates, we know exactly the upper and lower bounds of that operator (which are identical at that point). Thus, e.g., for the query plan in Figure 3, when the final pipeline ($P_4$) starts executing, we know exactly the cardinality of its driver node (the Sort node). In general, when a pipeline starts executing, we know exactly the cardinality of its driver nodes.

In our experiments on TPC-H queries, we have found that both lower and upper bounds help refine the $N_i$'s of certain driver nodes significantly (e.g., by three orders of magnitude for Q2) for driver nodes of upper level pipelines (when the optimizer underestimates the cardinality, e.g., of a Sort node). Interestingly, the impact of these refinements on the overall estimation errors (see Section 5) is typically much smaller (a few percent). This is because in these queries, the $N_i$'s of driver nodes such as Table/Index Scan dominate the $N_i$'s of other nodes. A more thorough evaluation of the effectiveness of these bounding techniques on other data sets/queries is part of our ongoing work.

Finally, we note that other techniques to leverage information from query execution are possible, for example, online estimation techniques based on observing intermediate results as in [7], refining statistics based on observed query results [,], and re-invoking the optimizer for cardinality estimation based on observed cardinalities similar to [4]. Exploiting these ideas to augment our techniques is an important area of future work.

5. IMPLEMENTATION AND EXPERIMENTAL EVALUATION

In this section, we first describe the implementation of our solution for estimating progress of SQL queries inside Microsoft SQL Server. We follow this with the results of an experimental evaluation of our solution for long running decision support queries on both the TPC-H benchmark [2] and an internal customer database.

5.1 Implementation

Our implementation inside Microsoft SQL Server consists of the following simple extensions to the existing query execution engine.
We augment the data structure corresponding to a node in the query execution plan with counters for K_i (number of rows output by the node thus far), N_i (current estimate of the total number of rows that will be output by the node at completion), and UB_i and LB_i (upper and lower bounds, respectively, on the number of rows that can be output by the node). After the query is optimized and an execution plan tree P has been generated for the query, we identify pipelines in P and the driver node(s) for each pipeline. We initialize N_i for each node to the optimizer estimated cardinality (for leaf-level nodes such as Table/Index Scan this is the cardinality of the base table/index). We update and propagate the values of UB_i, LB_i and N_i for nodes using the K_i values and algebraic properties of operators as described in Section 4.2.

For convenience of collecting the progress information of an executing query, we implement a background thread that wakes up periodically (approximately 4 times a second), traverses P, computes the progress, and logs the progress estimate and a timestamp to a file. The overheads of gathering this information at runtime are negligible relative to the execution time of the queries we considered. In general, we would expect that database servers will extend interfaces (e.g., via system stored procedures or functions) to allow clients to programmatically access progress information for an executing query by polling the server.

5.2 Experiments

Goal: The goal of the experiments is to:
- Evaluate the accuracy of our estimator (which is based on the GetNext() model of work presented in Section 2) on a set of long running and complex decision support queries.
- Evaluate the robustness of our estimator when data skew is varied.
- Validate the driver node hypothesis for progress estimation of currently executing pipelines.

Setup: We conducted the experiments on a machine with a 2.8GHz CPU and 512 MB RAM.

Databases: We ran the experiments on the TPC-H 10GB database [2]. We chose the 10GB configuration because the queries are truly long running (typically 10s of minutes). For the evaluation with varying skew, we generated a TPC-H 10GB database with a Zipfian skew factor of 2 using the publicly available tool [3]. We also ran queries from a real data warehouse application used within the company to analyze sales (we refer to this as the SALES database, approx. 5GB in size).

Queries: For TPC-H we evaluate all the queries defined in the benchmark. We report numbers for all the long running queries in the benchmark (those that reference the lineitem table).
For TPC-H queries, the joins are typically foreign-key joins, and thus most pipelines exhibit the property of being monotonically decreasing (Section 3). For the SALES database, we picked a few queries for evaluation. The queries against the SALES database are aggregation queries that are joins of 7-10 tables and have 8-10 grouping columns. The joins are non foreign-key joins, thus the property of monotonically decreasing pipelines does not hold.

Evaluation Metric: Our experiments are conducted a single query at a time, on a machine on which only the database server is executing. In this setting, we expect the percentage of work completed reported by any scheme to be a good estimator of the percentage of time taken by the query. As described in Section 5.1 above, we record the fraction complete predicted by our solution at regular intervals throughout query execution. Assume the query starts executing at time t_0. Let f_i be the percentage of the query completed as reported by our estimator at time t_i (i > 0, t_i > t_{i-1}). Let t_n be the time at which the query completes. Then, at any point in time t_i, an estimator that has perfect knowledge of the future would report the actual percentage of the query completed as 100 · (t_i - t_0)/(t_n - t_0). Thus, we define the estimation error of an estimator at time t_i (denoted by e_i) as:

e_i = \left| f_i - \frac{100 \cdot (t_i - t_0)}{t_n - t_0} \right|

Note also that since we take the absolute value of the difference, we do not distinguish between underestimates and overestimates. We report the overall estimation error for a query using three aggregate measures over all the e_i values collected for the query: the average, the standard deviation and the max.

TPC-H Benchmark Queries

The goal of this experiment is to evaluate the accuracy of our progress estimator (see Section 4.1), which is based on the GetNext() model of work. We evaluate the estimator on complex decision support queries of the TPC-H benchmark [2] on the 10GB database.
Table 2 shows the mean and maximum error (as defined above) for several long running TPC-H queries for the uniform data distribution case (Z=0) as well as the skewed data distribution case (Z=2). As we see from the table, for the Z=0 case, the maximum error for any query does not exceed 10%, and the average error is small (typically below 5%). The standard deviation was also small (at or below 5% in all cases). One interesting observation is that expensive Sort nodes at the top of a query execution plan can potentially be problematic (as in Q5 for Z=0), particularly when the query optimizer overestimates the cardinality of the Sort node. In such cases it is difficult to rectify errors based on execution feedback until the lower pipeline (that feeds into the Sort node) is almost complete. Thus, the error induced by the optimizer's estimates persists for almost the entire duration of the query.

For the Z=2 case, the maximum and mean errors are higher for certain queries, e.g., Q8, Q18, and Q21. To understand the reasons for the errors better, refer to Figures 4 and 5, which show scatter plots of the actual percentage completed vs. estimated percentage completed for Q8 for Z=0 and Z=2 respectively. A perfect estimator would have all data points along the diagonal of the graph. For the Z=2 case (Figure 5), when the major pipeline in this query (involving a scan of the lineitem table followed by a Merge Join and a couple of Hash Joins (probes)) starts, the estimates of the cardinalities of the joins used by our estimator are significantly overestimated. However, shortly after the pipeline starts executing (as explained in Section 4.2), we estimate the cardinalities using dne, which is based on the progress of the driver node (Scan of lineitem). This quickly reduces the estimation error, and explains the discontinuity in progress estimation around 20% actual completion. In general, until a pipeline starts executing, our estimator is more susceptible to errors in cardinality estimation. For the case of Z=0, the cardinality estimates of this pipeline are quite accurate, and therefore we see lower errors. We observe similar behavior in queries Q18 and Q21. This experiment shows that our estimator (based on the GetNext() model of work) results in fairly robust progress estimation, even in the presence of skewed data distributions.
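The estimation error defined under Evaluation Metric can be computed from a progress log with a short script. The following is a minimal sketch, assuming the log has already been parsed into (timestamp, reported percentage) pairs; the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, pstdev


def estimation_errors(samples, t0, tn):
    """Per-sample error: |f - 100 * (t - t0) / (tn - t0)|.

    samples: list of (t, f) pairs, where f is the reported percentage
    completed at time t, t0 is the start time and tn the completion time.
    """
    total = tn - t0
    return [abs(f - 100.0 * (t - t0) / total) for t, f in samples]


def summarize(errors):
    # The three aggregate measures reported per query.
    return {"mean": mean(errors), "std": pstdev(errors), "max": max(errors)}
```

For instance, a log of `[(25, 20.0), (50, 50.0), (75, 80.0)]` for a query running from time 0 to time 100 gives per-sample errors of 5, 0 and 5 percentage points.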

Table 2. Estimation Errors: TPC-H Benchmark Queries (10 GB database), Uniform and Skewed Data Sets

         Estimation Error (Z=0)    Estimation Error (Z=2)
Query    Mean      Max             Mean      Max
Q1       0.9%      2.8%            0.2%      0.5%
Q3       .%        2.0%            3.4%      4.7%
Q4       0.5%      .0%             0.6%      .4%
Q5       7.3%      9.0%            3.7%      5.4%
Q6       .2%       2.9%            2.8%      4.6%
Q7       2.3%      4.0%            3.8%      7.6%
Q8       0.8%      .7%             5.2%      6.2%
Q9       2.7%      4.9%            2.9%      8.3%
Q10      0.4%      .4%             .6%       4.4%
Q12      .0%       .7%             0.9%      3.8%
Q14      0.5%      .8%             .5%       3.2%
Q15      0.6%      .3%             .6%       4.4%
Q17      .7%       2.6%            0.7%      2.0%
Q18      5.9%      6.8%            4.2%      25.5%
Q19      0.5%      .5%             .8%       2.7%
Q20      3.0%      9.8%            3.7%      5.9%
Q21      0.9%      2.5%            5.7%      38.8%

Figure 4. Scatter plot of actual vs. estimated percentage completed (TPC-H Q8), Uniform distribution. Plot title: TPCH-10GB Query 8 (Z=0); x-axis: Actual Pct. Completion; y-axis: Estimated Pct. Completion.

Figure 5. Scatter plot of actual vs. estimated percentage completed (TPC-H Q8), Skewed distribution. Plot title: TPCH-10GB Query 8 (Z=2); x-axis: Actual Pct. Completion; y-axis: Estimated Pct. Completion.

Validation of Driver Node Hypothesis

In this experiment, we demonstrate the importance of estimating overall progress based only on the progress of driver nodes within a currently executing pipeline. We do this by comparing with an estimator that also uses the GetNext() model but does not use dne (i.e., the driver node hypothesis) to estimate the cardinality of the nodes in the currently executing pipeline, relying instead only on the optimizer estimated cardinalities. We show the results for TPC-H query Q9 against the 10 GB database with Zipfian skewed data (Z=2). The results for our estimator and for the estimator that uses only optimizer estimates (OPT) for currently executing pipelines are shown in Figures 6 and 7 respectively. The mean and max errors for our estimator are 2.9% and 8.3% respectively, whereas the errors for OPT are 23% and 47% respectively. The reason is that OPT, due to the inclusion of several join nodes (whose cardinality estimates are inaccurate), ends up with a significant overestimate of the actual work, which gets refined only near the very end of query execution (when the estimated completion jumps from 48% to 89%).

Figure 6. Scatter plot of actual vs. estimated percentage completed (TPC-H Q9, Using Driver node hypothesis). Plot title: TPCH-10GB Query 9 (Z=2); x-axis: Actual Pct. Completion; y-axis: Estimated Pct. Completion.

Figure 7. Scatter plot of actual vs. estimated percentage completed (TPC-H Q9, Using only optimizer estimates). Plot title: TPCH-10GB Query 9 (Z=2); x-axis: Actual Pct. Completion; y-axis: Estimated Pct. Completion.

Queries on SALES Database

In this experiment, we evaluate our estimator on complex decision support style queries from a real database application. As in TPC-H, we see that the mean estimation errors are quite low (around 10%) and the max errors are around 20% (see Table 3). This experiment shows that the accuracy of our estimator does not degrade appreciably for this set of real world queries that contain non foreign-key joins and significant grouping and aggregation.

Table 3. Estimation Errors: SALES queries

         Estimation Error
Query    Mean      Max
Q1       7.%       7.3%
Q2       8.2%      6.9%
Q3       .6%       8.2%
Q4       9.3%      2.4%
Q5       7.0%      8.%

6. MONOTONICITY

As discussed in Section 2.2.1, monotonicity is a desirable property from a user's perspective. Consider a progress estimator that uses the GetNext() model of work (Section 2.3). Since the K_i values (observed cardinalities during execution) are monotonically increasing, the estimator will be monotonic provided any changes to the N_i values during query execution are monotonically decreasing. An estimator that has up-front knowledge of the exact number of GetNext() calls that will be made by each operator (i.e., the N_i values) can guarantee monotonicity, since it would never need to change N_i. However, for any other technique that can only estimate the value of the N_i, there is a trade-off between guaranteeing monotonicity and the accuracy of progress estimation.

One way to ensure monotonicity is to initially use a value for the estimated N_i that is much larger than the actual N_i, i.e., an upper bound. The problem with such an approach is that accuracy can suffer, since the actual N_i may be much smaller. For example, consider a query plan which performs a hash join of relations R1 and R2 and then sorts the result of the join. Note that obtaining a tight upper bound on the estimate of the Sort node cardinality can be problematic. If the join is a foreign-key join, then we know that an upper bound on the cardinality of the joined relation, and hence the Sort node, is the size of the table with the foreign key. However, for non foreign-key joins the upper bound can be a considerable overestimate of the actual N_i for the Sort node, and thus the accuracy of the estimator may be poor until most of the query has completed executing. Therefore, the real challenge is to find tight upper bounds so that the accuracy of the estimator is not significantly compromised.
Given the difficulty of guaranteeing a tight upper bound for intermediate driver nodes, a trade-off between monotonicity and the accuracy of progress estimation appears unavoidable. Thus, an interesting issue is whether users prefer more accurate estimates or estimates that are guaranteed to be monotonic. A possible approach for addressing this issue is to present both the estimated progress as well as the progress based on the upper bounds. Let the progress computed using upper bounds be p1% and the corresponding one computed using estimates be p2%. Then (p1, p2) as a pair of values would indicate to the user that the % done at any instant is not lower than p1 and that our current best estimate is the value p2. Note that p1 is monotonic, whereas p2 may not be.

We observe that for a single pipeline query, the estimator dne (Section 3) is monotonic, since N_i is known exactly and does not change during the execution of the pipeline (see Section 7 for runtime conditions that may cause monotonicity violations even for single pipeline queries). However, for the case of multi-pipeline queries, our estimator is not guaranteed to be monotonic. In particular, monotonicity violations can occur when a new pipeline starts executing and we replace the optimizer estimates of N_i with the estimate based on dne (as described in Section 4.2).

For the queries against the TPC-H 10GB (Skew Z=2) data in our experiments, we computed progress estimates at regular intervals (approximately 4 times a second), and we measured: (a) the number of monotonicity violations, i.e., the number of times a progress estimate was less than the previous estimate, (b) the average % by which the estimate decreased and (c) the maximum % by which the estimate decreased. We observed monotonicity violations in five queries (Q7, Q8, Q9, Q20, Q21). Moreover, except for Q8 and Q20, there was only 1 violation in the other three queries. The maximum decrease in estimated progress across all queries was 8.3% (for Q21) and the average decrease for each query respectively was .4%, 0.7%, 4.9%, 0.0%, and 8.3%.
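The three monotonicity measurements listed above, (a) through (c), are straightforward to derive from a sequence of logged progress estimates. The sketch below is an illustration only; the function name and the dictionary keys are hypothetical, and the input is assumed to be the list of percentage estimates in the order they were logged.

```python
def monotonicity_violations(estimates):
    """Count and size the drops in a sequence of progress estimates (in %).

    A violation is any estimate that is lower than the one logged
    immediately before it.
    """
    drops = [prev - cur
             for prev, cur in zip(estimates, estimates[1:])
             if cur < prev]
    if not drops:
        return {"count": 0, "avg_drop": 0.0, "max_drop": 0.0}
    return {
        "count": len(drops),                  # (a) number of violations
        "avg_drop": sum(drops) / len(drops),  # (b) average decrease
        "max_drop": max(drops),               # (c) maximum decrease
    }
```

For the sequence `[10, 20, 15, 30, 28, 40]` this reports two violations, with an average decrease of 3.5 and a maximum decrease of 5 percentage points.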
One reason for the relatively few and small monotonicity violations is that in these queries the N_i values are dominated by the leaf-level driver nodes (scans of the lineitem and orders tables). Due to the filtering and aggregations performed in these queries, the actual N_i values of the upper-level nodes in the plan are usually much smaller. Thus, even in cases when the N_i values for the non-leaf driver nodes are initially underestimated, the magnitude of the monotonicity violations is small.

7. RUNTIME CONDITIONS

The model of work done by a query (see Section 2) makes the simplifying assumption that the actual work done by a call to GetNext() is the same across all operators in the plan, i.e., GetNext() calls from all operators are weighted equally. In general, this assumption does not hold, e.g., due to an expensive operator like a UDF in a Filter node, or because one Table Scan reads from a fast disk whereas another Table Scan reads from a slow disk. A possible way to extend the basic model of work to account for the different cost of GetNext() for different operators is to model the work as a weighted sum of the number of GetNext() calls done by operators in the plan. The weighting factor C_j associated with operator j is a relative measure of the work done by a GetNext() on that operator. Of course, this introduces an additional parameter (besides driver node cardinality) that needs to be estimated and refined. A possible solution is to start with uniform relative rates (i.e., C_j = 1 for all j) or use cost estimates made by the query optimizer, and then adjust the C_j values based on execution feedback. Modeling and computing per-tuple work for every pipeline could be an important factor in general, and developing techniques to address this issue is part of our ongoing work.

In the rest of this section we discuss an important special case of a runtime condition, spills of tuples to disk due to insufficient memory, and show how our estimator can be adapted to handle spills without introducing an additional weighting factor, by treating spill processing as a runtime pipeline.
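The weighted extension of the GetNext() model described above can be illustrated as a ratio of weighted sums. This is a hedged sketch rather than a method the paper evaluates: the function name and the tuple layout are assumptions, and the weights would in practice have to be estimated and refined as the text notes.

```python
def weighted_progress(nodes):
    """Percentage done under a weighted GetNext() model of work.

    nodes: iterable of (rows_so_far, estimated_total, weight) tuples,
    one per operator; weight is the relative cost of one GetNext() call
    on that operator. With all weights equal to 1 this reduces to the
    unweighted model.
    """
    nodes = list(nodes)
    done = sum(weight * rows for rows, _, weight in nodes)
    total = sum(weight * est for _, est, weight in nodes)
    return 100.0 * done / total if total else 0.0
```

With two operators of 100 estimated rows each, where only the first has produced 50 rows, uniform weights give 25% done. If a GetNext() call on the second operator is assumed to be three times as expensive, the same state is reported as 12.5% done.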
Handling Spills: Spills of tuples to disk, which can occur as a result of insufficient memory, can result in more work that is not accounted for by our model of work, since it occurs within an operator. Consider a join between two relations A and B, where the optimizer picks a hybrid hash join operator. Hybrid hash proceeds by building a hash table of A in memory. During the scan of relation A, if the memory budget of the hash join is exhausted, then certain buckets will be spilled to disk. When the table B is used to probe the hash partitions, the tuples of B that hash to the buckets that are not memory resident are also written to disk. Bucket spilling is a runtime effect and hence it can be difficult to accurately estimate in advance the number of tuples that will be spilled to disk.

We observe that we can model the query execution as comprising two parts, one that processes the original relations and another that processes the spilled partitions. In other words, we can think of the original query as follows:

Q = (A join B) ∪ (A' join B')

where A' and B' denote the corresponding parts of relations A and B that have been spilled (0 ≤ |A'| ≤ |A|, 0 ≤ |B'| ≤ |B|). The driver nodes for query Q would include scans of A, B, A' and B'. Thus the total work for Q would be |A| + |B| + |A'| + |B'|. The main problem is that |A'| and |B'| cannot be predicted at optimization time.

The main idea behind our solution to the spill problem is as follows. Whenever a tuple is spilled to disk (either from relation A or B), the denominator value (which denotes the total work) is incremented by one (i.e., another GetNext() call). We are in essence adding more work to be done later, and the denominator value should reflect the estimated cardinality of the pipeline. Now, consider the point during execution when the first phase of hash processing is over and none of the spilled partitions have been processed. The modified estimator would have incremented the denominator counter for each tuple that had been spilled and would estimate the progress as (|A| + |B|) / (|A| + |B| + |A'| + |B'|), which is correct as it accounts for the remaining tuples to be processed. When the spilled partitions are re-read, the corresponding counts are added to the numerator, and only when all the partitions have been processed will the estimator report the progress as 100%. This correction to the estimator works because of the symmetry of spills, i.e., exactly the tuples that have been written to disk will be processed later.
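The spill correction just described amounts to two counters. The sketch below is a simplified illustration under stated assumptions: the class and method names are hypothetical, a single pipeline is tracked, and the initial total is taken to be the exact work |A| + |B| of the first phase.

```python
class SpillAwareProgress:
    """Tracks done/total GetNext() work, adding work for spilled tuples."""

    def __init__(self, estimated_total):
        self.done = 0                 # numerator: GetNext() calls so far
        self.total = estimated_total  # denominator: estimated total work

    def on_getnext(self):
        # A tuple was processed, either originally or when re-read from disk.
        self.done += 1

    def on_spill(self):
        # A spilled tuple must be processed again later: one more unit of work.
        self.total += 1

    def percent(self):
        return 100.0 * self.done / self.total
```

As a usage example, take |A| + |B| = 10 with 5 tuples spilled during the first phase: at the end of that phase the estimator reports 10/15 (about 66.7%), and it reaches 100% only after the 5 spilled tuples have been re-read.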
It is also easy to see that this modification to the original algorithm would work for multiple recursion levels in a hash join pipeline. Finally, we note that spills could occur in other operators like hash-based Group-By, or in the merge phase of a Sort-Merge join if there are too many duplicates of a particular value. Thus, in general, a query can be considered as Q ∪ Q', where Q' accounts for the work done by the current query in handling data that is spilled.

The following experiment on TPC-H Q8 highlights the importance of handling spills. From Figure 8 we see that the progress estimator remains stuck on 44% for a relatively long time (more than 5% of total query execution time). This is because during this interval the query is writing and reading the spilled partitions, and the estimator does not capture this effect. On the other hand, when we enable spill handling as discussed above (see Figure 9), the estimator is more accurate.

Figure 8. Scatter plot of actual vs. estimated percentage completed (TPC-H Q8, No Spill handling). Plot title: TPC-H 10 GB Query 8 (Z=0) Without Spill Handling; x-axis: Actual Percentage Completed; y-axis: Estimated Percentage Completed.

Figure 9. Scatter plot of actual vs. estimated percentage completed (TPC-H Q8, With Spill handling). Plot title: TPC-H 10GB Query 8 (Z=0) With Spill Handling; x-axis: Actual Percentage Completed; y-axis: Estimated Percentage Completed.

8. RELATED WORK

There are two broad areas that are related to our work. The first is the area of estimating the cardinality of query expressions. Selectivity estimation, e.g., [9], plays a key role in enabling query optimizers to pick a suitable query execution plan. Our work leverages the query optimizer to provide an initial estimate of the cardinality of nodes in an execution plan. The second broad area that relates to this paper is the use of information gathered during query execution. One body of work, e.g., [2,4], uses feedback of observed cardinalities at runtime to potentially re-optimize the same query, pick among competing query plans, or improve decisions on resource allocation for it. In contrast, we use the observed cardinality of operators in the execution tree to improve the estimate of the total work that needs to be done, while leaving the query execution plan unchanged. We note that in principle, the techniques in [4] that collect statistics such as cardinalities/histograms etc. of intermediate query results can be adapted in our context for obtaining better estimates of the N_i values by re-invoking the query optimizer's cardinality estimation module at runtime with more accurate statistics. While this does require nontrivial extensions to today's query processing engines, it represents an interesting avenue of future work for progress estimation. Another use of runtime feedback is to refine statistics, e.g., [,], that can be used for selectivity estimation for


More information

CHAPTER 2 DECOMPOSITION OF GRAPHS

CHAPTER 2 DECOMPOSITION OF GRAPHS CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Bran Curless Sprng 2008 Announcements (5/14/08) Homework due at begnnng of class on Frday. Secton tomorrow: Graded homeworks returned More dscusson

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

A Theory of Non-Deterministic Networks

A Theory of Non-Deterministic Networks A Theory of Non-Deternstc Networs Alan Mshcheno and Robert K rayton Departent of EECS, Unversty of Calforna at ereley {alan, brayton}@eecsbereleyedu Abstract oth non-deterns and ult-level networs copactly

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm Internatonal Journal of Advancements n Research & Technology, Volume, Issue, July- ISS - on-splt Restraned Domnatng Set of an Interval Graph Usng an Algorthm ABSTRACT Dr.A.Sudhakaraah *, E. Gnana Deepka,

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

ALM-FastReplica: Optimizing the Reliable Distribution of Large Files within CDNs

ALM-FastReplica: Optimizing the Reliable Distribution of Large Files within CDNs ALM-FastReplca: Optzng the Relable Dstrbuton of Large Fles wthn CDNs Ludla Cherkasova Internet Systes and Storage Laboratory HP Laboratores Palo Alto HPL-005-64 Aprl 4, 005* E-al: cherkasova@hpl.hp.co

More information

Low training strength high capacity classifiers for accurate ensembles using Walsh Coefficients

Low training strength high capacity classifiers for accurate ensembles using Walsh Coefficients Low tranng strength hgh capacty classfers for accurate ensebles usng Walsh Coeffcents Terry Wndeatt, Cere Zor Unv Surrey, Guldford, Surrey, Gu2 7H t.wndeatt surrey.ac.uk Abstract. If a bnary decson s taken

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

Agenda & Reading. Simple If. Decision-Making Statements. COMPSCI 280 S1C Applications Programming. Programming Fundamentals

Agenda & Reading. Simple If. Decision-Making Statements. COMPSCI 280 S1C Applications Programming. Programming Fundamentals Agenda & Readng COMPSCI 8 SC Applcatons Programmng Programmng Fundamentals Control Flow Agenda: Decsonmakng statements: Smple If, Ifelse, nested felse, Select Case s Whle, DoWhle/Untl, For, For Each, Nested

More information

Estimating Costs of Path Expression Evaluation in Distributed Object Databases

Estimating Costs of Path Expression Evaluation in Distributed Object Databases Estmatng Costs of Path Expresson Evaluaton n Dstrbuted Obect Databases Gabrela Ruberg, Fernanda Baão, and Marta Mattoso Department of Computer Scence COPPE/UFRJ P.O.Box 685, Ro de Janero, RJ, 2945-970

More information

Outline. Third Programming Project Two-Dimensional Arrays. Files You Can Download. Exercise 8 Linear Regression. General Regression

Outline. Third Programming Project Two-Dimensional Arrays. Files You Can Download. Exercise 8 Linear Regression. General Regression Project 3 Two-densonal arras Ma 9, 6 Thrd Prograng Project Two-Densonal Arras Larr Caretto Coputer Scence 6 Coputng n Engneerng and Scence Ma 9, 6 Outlne Quz three on Thursda for full lab perod See saple

More information

AP PHYSICS B 2008 SCORING GUIDELINES

AP PHYSICS B 2008 SCORING GUIDELINES AP PHYSICS B 2008 SCORING GUIDELINES General Notes About 2008 AP Physcs Scorng Gudelnes 1. The solutons contan the most common method of solvng the free-response questons and the allocaton of ponts for

More information

Array transposition in CUDA shared memory

Array transposition in CUDA shared memory Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

A new Fuzzy Noise-rejection Data Partitioning Algorithm with Revised Mahalanobis Distance

A new Fuzzy Noise-rejection Data Partitioning Algorithm with Revised Mahalanobis Distance A new Fuzzy ose-reecton Data Parttonng Algorth wth Revsed Mahalanobs Dstance M.H. Fazel Zarand, Mlad Avazbeg I.B. Tursen Departent of Industral Engneerng, Arabr Unversty of Technology Tehran, Iran Departent

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

CE 221 Data Structures and Algorithms

CE 221 Data Structures and Algorithms CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

Large Margin Nearest Neighbor Classifiers

Large Margin Nearest Neighbor Classifiers Large Margn earest eghbor Classfers Sergo Bereo and Joan Cabestany Departent of Electronc Engneerng, Unverstat Poltècnca de Catalunya (UPC, Gran Captà s/n, C4 buldng, 08034 Barcelona, Span e-al: sbereo@eel.upc.es

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Multicast Tree Rearrangement to Recover Node Failures. in Overlay Multicast Networks

Multicast Tree Rearrangement to Recover Node Failures. in Overlay Multicast Networks Multcast Tree Rearrangeent to Recover Node Falures n Overlay Multcast Networks Hee K. Cho and Chae Y. Lee Dept. of Industral Engneerng, KAIST, 373-1 Kusung Dong, Taejon, Korea Abstract Overlay ultcast

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

SUV Color Space & Filtering. Computer Vision I. CSE252A Lecture 9. Announcement. HW2 posted If microphone goes out, let me know

SUV Color Space & Filtering. Computer Vision I. CSE252A Lecture 9. Announcement. HW2 posted If microphone goes out, let me know SUV Color Space & Flterng CSE5A Lecture 9 Announceent HW posted f cropone goes out let e now Uncalbrated Potoetrc Stereo Taeaways For calbrated potoetrc stereo we estated te n by 3 atrx B of surface norals

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

Brave New World Pseudocode Reference

Brave New World Pseudocode Reference Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Handwritten English Character Recognition Using Logistic Regression and Neural Network

Handwritten English Character Recognition Using Logistic Regression and Neural Network Handwrtten Englsh Character Recognton Usng Logstc Regresson and Neural Network Tapan Kuar Hazra 1, Rajdeep Sarkar 2, Ankt Kuar 3 1 Departent of Inforaton Technology, Insttute of Engneerng and Manageent,

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

A fair buffer allocation scheme

A fair buffer allocation scheme A far buffer allocaton scheme Juha Henanen and Kalev Klkk Telecom Fnland P.O. Box 228, SF-330 Tampere, Fnland E-mal: juha.henanen@tele.f Abstract An approprate servce for data traffc n ATM networks requres

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Monte Carlo inference

Monte Carlo inference CS 3750 achne Learnng Lecture 0 onte Carlo nerence los Hauskrecht los@cs.ptt.edu 539 Sennott Square Iportance Saplng an approach or estatng the epectaton o a uncton relatve to soe dstrbuton target dstrbuton

More information

Relevance Feedback in Content-based 3D Object Retrieval A Comparative Study

Relevance Feedback in Content-based 3D Object Retrieval A Comparative Study 753 Coputer-Aded Desgn and Applcatons 008 CAD Solutons, LLC http://www.cadanda.co Relevance Feedback n Content-based 3D Object Retreval A Coparatve Study Panagots Papadaks,, Ioanns Pratkaks, Theodore Trafals

More information

Transaction-Consistent Global Checkpoints in a Distributed Database System

Transaction-Consistent Global Checkpoints in a Distributed Database System Proceedngs of the World Congress on Engneerng 2008 Vol I Transacton-Consstent Global Checkponts n a Dstrbuted Database System Jang Wu, D. Manvannan and Bhavan Thurasngham Abstract Checkpontng and rollback

More information

Performance Analysis of Coiflet Wavelet and Moment Invariant Feature Extraction for CT Image Classification using SVM

Performance Analysis of Coiflet Wavelet and Moment Invariant Feature Extraction for CT Image Classification using SVM Perforance Analyss of Coflet Wavelet and Moent Invarant Feature Extracton for CT Iage Classfcaton usng SVM N. T. Renukadev, Assstant Professor, Dept. of CT-UG, Kongu Engneerng College, Perundura Dr. P.

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Memory Modeling in ESL-RTL Equivalence Checking

Memory Modeling in ESL-RTL Equivalence Checking 11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com

More information

Scheduling Workflow Applications on the Heterogeneous Cloud Resources

Scheduling Workflow Applications on the Heterogeneous Cloud Resources Indan Journal of Scence and Technology, Vol 8(2, DOI: 0.7485/jst/205/v82/57984, June 205 ISSN (rnt : 0974-6846 ISSN (Onlne : 0974-5645 Schedulng Workflow Applcatons on the Heterogeneous Cloud Resources

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

STATIC MAPPING FOR OPENCL WORKLOADS IN HETEROGENEOUS COMPUTER SYSTEMS

STATIC MAPPING FOR OPENCL WORKLOADS IN HETEROGENEOUS COMPUTER SYSTEMS STATIC MAPPING FOR OPENCL WORKLOADS IN HETEROGENEOUS COMPUTER SYSTEMS 1 HENDRA RAHMAWAN, 2 KUSPRIYANTO, 3 YUDI SATRIA GONDOKARYONO School of Electrcal Engneerng and Inforatcs, Insttut Teknolog Bandung,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information