Optimal Workload-based Weighted Wavelet Synopses


Yossi Matias
School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel

Daniel Urieli
School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel

(2004, rev: July 2005)

A preliminary version of this paper appeared in Proc. of ICDT '05. Research partly supported by a grant from the Israel Science Foundation.

Abstract

In recent years wavelets were shown to be effective data synopses. We are concerned with the problem of efficiently finding wavelet synopses for massive data sets, in situations where information about query workload is available. We present linear time, I/O optimal algorithms for building optimal workload-based wavelet synopses for point queries. The synopses are based on a novel construction of weighted inner-products and use weighted wavelets that are adapted to those products. The synopses are optimal in the sense that the subset of retained coefficients is the best possible for the bases in use with respect to either the mean-squared absolute or relative errors. For the latter, this is the first optimal wavelet synopsis even for the regular, non-workload-based case. Experimental results demonstrate the advantage obtained by the new optimal wavelet synopses.

1 Introduction

In recent years there has been increasing attention to the development and study of data synopses, as effective means for addressing performance issues in massive data sets. Data synopses are concise representations of data sets, that are meant to effectively support approximate queries to the represented data sets [10]. A primary constraint of a data synopsis is its size. The effectiveness of a data synopsis is measured by the accuracy of the answers it provides, as well as by its response time and its construction time. Several different synopses were introduced and studied, including random samples, sketches, and different types of histograms. Recently, wavelet-based synopses were introduced and shown to be a powerful tool for building effective data synopses for various applications, including selectivity estimation for query optimization in DBMS, approximate query processing in OLAP applications and more (see [17, 23, 21, 22, 2, 6, 9, 8], and references therein). The general idea of wavelet-based approximations is to transform a given data vector of size N into a representation with respect to a wavelet basis (this is called a wavelet transform), and approximate it using only M wavelet basis vectors, by retaining only M coefficients from the linear combination that spans the data vector (coefficient thresholding).

The linear combination that uses only M coefficients (and assumes that all other coefficients are zero) defines a new vector that approximates the original vector, using less space. This is called an M-term approximation, which defines a wavelet synopsis of size M.

Wavelet synopses. Wavelets were traditionally used to compress data sets where the purpose is to reconstruct, at a later time, an approximation of the whole data using the set of retained coefficients. The situation is a little different when using wavelets for building synopses in database systems [17, 23]: in this case only portions of the data are reconstructed each time, in response to user queries, rather than the whole data at once. As a result, portions of the data that are used for answering frequent queries are reconstructed more frequently than portions of the data that correspond to rare queries. Therefore, the approximation error is measured over the multi-set of actual queries, rather than over the data itself. Another aspect of the use of wavelets in database systems is that, due to the large data sizes in databases (giga-, tera- and peta-bytes), the efficiency of building wavelet synopses is of primary importance. Disk I/Os should be minimized as much as possible, and non-linear-time algorithms may be unacceptable.

Optimal wavelet synopses. The main advantage of transforming the data into a representation with respect to a wavelet basis is that for data vectors containing similar values, many wavelet coefficients tend to have very small values. Thus, eliminating such small coefficients introduces only small errors when reconstructing the original data, resulting in a very effective form of lossy data compression. Generally speaking, we can characterize a wavelet approximation by three attributes: how the approximation error is measured, which wavelet basis is used, and how coefficient thresholding is done. Many bases were suggested and used in the traditional wavelets literature. Given a basis with respect to which the transform is done, the selection of coefficients that are retained in the wavelet synopsis may have a significant impact on the approximation error. The goal is therefore to select a subset of M coefficients that minimizes some approximation-error measure. This subset is called an optimal wavelet synopsis, with respect to the chosen error measure. While there has been considerable work on wavelet synopses and their applications [17, 23, 21, 22, 2, 6, 9, 8], so far there have been only a few optimality results. The first is a linear-time Parseval-based algorithm, used in the traditional wavelets literature (e.g. [12]), where the error is measured over the data. This algorithm minimizes the L2 norm of the error vector, and equivalently it minimizes the mean-squared absolute error over all possible point queries [17, 23]. No algorithm that minimizes the mean-squared relative error over all possible point queries was known. The second, introduced recently [9], is a polynomial-time (O(N^2 M log M)) algorithm that minimizes the maximum relative or maximum absolute error over all possible point queries. Another optimality result is a polynomial-time dynamic-programming algorithm that obtains an optimal wavelet synopsis over multiple measures [6]. That synopsis is optimal w.r.t. an error metric defined as a weighted combination of L2 norms over the multiple measures (this weighted combination has no relation to the notion of weighted wavelets in this paper).
Workload-based wavelet synopses. In recent years there has been increased interest in workload-based synopses, that is, synopses that are adapted to a given query workload, with the assumption that the workload represents (approximately) a probability distribution from which future queries will be taken. Chaudhuri et al. [4] argue that identifying an appropriate precomputed sample that avoids large errors on an arbitrary query is virtually impossible.

To minimize the effects of this problem, previous studies have proposed using the workload to guide the process of selecting samples [1, 3, 7]. By picking a sample that is tuned to the given workload, we can reduce the error over frequent (or otherwise "important") queries in the workload. In [4], the authors formulate the problem of pre-computing a sample as an optimization problem, whose goal is to pick a sample that minimizes the error for the given workload. Recently, workload-based wavelet synopses were proposed by Portman and Matias [14, 19]. Using an adaptive-greedy algorithm, the query-workload information is used during the thresholding process in order to build a wavelet synopsis that reduces the error w.r.t. the query workload. These workload-based wavelet synopses demonstrate significant improvement with respect to prior synopses. They are, however, not optimal w.r.t. the query workload. In this paper, we address the problem of efficiently finding optimal workload-based wavelet synopses.

1.1 Contributions

We introduce efficient algorithms for finding optimal workload-based wavelet synopses using weighted Haar (WH) wavelets, for workloads of point queries. Our main contributions are linear-time, I/O optimal algorithms that find optimal Workload-based Weighted Wavelet (WWW) synopses (1):

- An optimal synopsis w.r.t. the workload-based mean-squared absolute error (WB-MSE).
- An optimal synopsis w.r.t. the workload-based mean-squared relative error (WB-MRE).

Equivalently, the algorithms minimize the expected squared, absolute or relative, error over a point query taken from a given distribution. Our WB-MSE synopses generalize the standard Haar wavelet synopses in the sense that when the workload is uniform, our synopses are exactly the standard Haar wavelet synopses, and the Weighted-Haar basis becomes the standard Haar basis. The WB-MRE algorithm, used with a uniform workload, is also the first algorithm that minimizes the mean-squared relative error over the data values with respect to a wavelet basis. Both WWW synopses are also optimal with respect to enhanced wavelet synopses, which allow changing the values of the synopsis coefficients to arbitrary values. Experimental results show the advantage of our synopses with respect to existing synopses.

The above results were obtained using the following novel techniques. We define the problem of finding optimal workload-based wavelet synopses in terms of a weighted norm, a weighted inner product, and a weighted-inner-product space. This enables linear-time, I/O optimal algorithms for building optimal workload-based wavelet synopses. The approach of using a weighted inner product can also be applied in the general case in which each data point is given a different priority, representing its significance. This generalization is used to obtain the optimal synopses for mean relative error, where the weight of each point is normalized by its value. Using these weights, one can find a weighted-wavelet basis, and an optimal weighted wavelet synopsis, in linear time, with O(N/B) I/Os.

(1) No relation whatsoever to the world-wide-web.

We introduce the use of weighted wavelets for data synopses. Using weighted wavelets [5, 11] enables finding optimal workload-based wavelet synopses efficiently. In contrast, it is not known how to obtain optimal workload-based wavelet synopses with respect to the Haar basis efficiently. In the wavelets literature (e.g., [12]), wavelets are used to approximate a given signal, which is treated as a vector in an inner-product space. Since an inner product defines an L2 norm, the approximation error is measured as the L2 norm of the error vector, which is the difference between the actual (approximated) vector and the approximating vector. Many wavelet bases were used for approximation, as different bases are adequate for approximating different collections of data vectors. By using an orthonormal wavelet basis, an optimal coefficient thresholding can be achieved in linear time, based on Parseval's formula. When using a non-orthogonal wavelet basis, or measuring the error using other norms (e.g., L∞), it is not known whether an optimal coefficient thresholding can be found efficiently, so usually non-optimal greedy algorithms are used in practice.

A weighted Haar (WH) basis is a generalization of the standard Haar basis, which is typically used for wavelet synopses due to its simplicity. There are several attributes by which a wavelet basis is characterized, and which affect the quality of the approximations achieved using the basis (for a full discussion, see [12]). These attributes are: the set of nested spaces of increasing resolution which the basis spans, the number of vanishing moments of the basis, and its compact support (if one exists). Both the Haar basis and a WH basis span the same subsets of nested spaces, have one vanishing moment, and a compact support of size 1. The Haar basis is orthonormal for a uniform workload of point queries; hence it is optimal for the MSE error measure. The WH basis is orthonormal with respect to the weighted inner product defined by the problem of finding optimal workload-based wavelet synopses. As a result, an optimal workload-based synopsis with respect to a WH basis is obtained efficiently, based on Parseval's formula, while for the Haar basis no efficient optimal thresholding algorithm is known, in cases other than a uniform workload.

1.2 Paper outline

The rest of the paper is organized as follows. In Section 2 we describe the basics of wavelet-based synopses. In Section 3 we describe our basic approach, including the workload-based error metrics and optimal thresholding in orthonormal bases. In Section 4 we define the problem of finding optimal workload-based wavelet synopses in terms of a weighted inner product, and solve it using an orthonormal basis. In Section 5 we describe the optimal algorithm for minimizing the WB-MSE, which is based on the construction of Section 4. In Section 6 we extend the algorithm to work for the WB-MRE. In Section 7 we present experimental results, and in Section 8 we draw our conclusions.

2 Wavelets basics

In this section we start by presenting the Haar wavelets, and continue by presenting wavelet-based synopses, obtained by a thresholding process, described in Section 2.2. The error tree structure is presented next (Section 2.3), along with a description of the reconstruction of original data from the wavelet synopses in Section 2.4. Wavelets are a mathematical tool for the hierarchical decomposition of functions in a space-efficient manner. Wavelets represent a function in terms of a coarse overall shape, plus details that range from coarse to fine. Regardless of whether the function of interest is an image, a curve, or a surface, wavelets offer an elegant technique for representing the various levels of detail of the function in a space-efficient manner.

2.1 One-dimensional Haar wavelets

Haar wavelets are conceptually the simplest wavelet basis functions, and were thus used in previous works on wavelet synopses. They are the fastest to compute and the easiest to implement, and we focus on them for purposes of exposition in this paper. To illustrate how Haar wavelets work, we start with a simple example borrowed from [17, 23]. Suppose we have a one-dimensional signal of N = 8 data items: S = [2, 2, 0, 2, 3, 5, 4, 4]. We show how the Haar wavelet transform is done over S. We first average the signal values, pairwise, to get a new lower-resolution signal with values [2, 1, 4, 4]. That is, the first two values in the original signal (2 and 2) average to 2, the second two values 0 and 2 average to 1, and so on. We also store the pairwise differences of the original values (divided by 2) as detail coefficients. In the above example, the four detail coefficients are (2 − 2)/2 = 0, (0 − 2)/2 = −1, (3 − 5)/2 = −1, and (4 − 4)/2 = 0. It is easy to see that the original values can be recovered from the averages and differences. This was one phase of the Haar wavelet transform. By repeating this process recursively on the averages, we get the Haar wavelet transform (Table 1). We define the wavelet transform (also called wavelet decomposition) of the original eight-value signal to be the single coefficient representing the overall average of the original signal, followed by the detail coefficients in the order of increasing resolution. Thus, for the one-dimensional Haar basis, the wavelet transform of our signal is given by

Ŝ = [2 3/4, −1 1/4, 1/2, 0, 0, −1, −1, 0]

Resolution | Averages                 | Detail Coefficients
8          | [2, 2, 0, 2, 3, 5, 4, 4] | 
4          | [2, 1, 4, 4]             | [0, −1, −1, 0]
2          | [1.5, 4]                 | [0.5, 0]
1          | [2.75]                   | [−1.25]

Table 1: Haar Wavelet Decomposition

The individual entries are called the wavelet coefficients. The wavelet decomposition is very efficient computationally, requiring only O(N) CPU time and O(N/B) I/Os to compute for a signal of N values, where B is the disk-block size. No information has been gained or lost by this process. The original signal has eight values, and so does the transform. Given the transform, we can reconstruct the exact signal by recursively adding and subtracting the detail coefficients from the next-lower resolution. In fact, we have transformed the signal S into a representation with respect to another basis of R^8: the Haar wavelet basis. A detailed discussion can be found, for example, in [20].
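The averaging-and-differencing recursion above is short enough to sketch directly. The following minimal Python sketch (ours, not from the paper) reproduces the decomposition of S = [2, 2, 0, 2, 3, 5, 4, 4]:

def haar_decompose(signal):
    """Unnormalized Haar transform: overall average first, then detail
    coefficients in order of increasing resolution."""
    values = list(signal)
    details = []                      # detail coefficients, coarse..fine
    while len(values) > 1:
        averages = [(a + b) / 2 for a, b in zip(values[0::2], values[1::2])]
        diffs = [(a - b) / 2 for a, b in zip(values[0::2], values[1::2])]
        details = diffs + details     # coarser details go in front
        values = averages
    return values + details           # [overall average, details]

print(haar_decompose([2, 2, 0, 2, 3, 5, 4, 4]))
# [2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]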

2.2 Thresholding

Given a limited amount of storage for maintaining a wavelet synopsis of a data array A (or equivalently a vector S), we can only retain a certain number M of the coefficients stored in the wavelet decomposition of A. The remaining coefficients are implicitly set to 0. The goal of coefficient thresholding is to determine the best subset of M coefficients to retain, so that some overall error measure in the approximation is minimized. One advantage of the wavelet transform is that in many cases a large number of the detail coefficients turn out to be very small in magnitude. Truncating these small coefficients from the representation (i.e., replacing each one by 0) introduces only small errors in the reconstructed signal. We can approximate the original signal effectively by keeping only the most significant coefficients. For a given input sequence d_0, ..., d_{N−1}, we can measure the error of approximation in several ways. Let the i-th data value be d_i, let q_i be the i-th point query, whose value is d_i, and let d̂_i be the estimated result for d_i. We use the following error measure for the absolute error over the i-th data value:

e_i = e(q_i) = |d_i − d̂_i|

Once we have the error measure for the errors of individual data values, we would like to measure the norm of the vector of errors e = (e_0, ..., e_{N−1}). The standard way is to use the L2 norm of e, normalized by N, which is called the mean squared error:

MSE(e) = ‖e‖ = √( (1/N) Σ_{i=0}^{N−1} e_i² )

We use the terms MSE and L2 norm interchangeably during our development, since they are equivalent up to a positive multiplicative constant. The basic thresholding algorithm, based on Parseval's formula, is as follows: let α_0, ..., α_{N−1} be the wavelet coefficients, and for each α_i let level(α_i) be the resolution level of α_i. The detail coefficients are normalized by dividing each coefficient by √(2^{level(α_i)}) (with level 0 at the coarsest resolution), reflecting the fact that coefficients at the coarser resolutions are more significant than coefficients at the finer resolutions, since each of them affects the reconstruction of more data values. This process actually turns the wavelet coefficients into the coefficients of an orthonormal basis (and is thus called normalization). The M largest normalized coefficients are retained; the remaining N − M coefficients are implicitly replaced by zero. This deterministic process provably minimizes the L2 norm of the vector of errors defined above, based on Parseval's formula (see Section 3).
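A minimal sketch of this normalize-then-keep-the-largest rule (ours, not the paper's code). The level numbering here starts at 0 for the overall average and for the topmost detail coefficient, one concrete choice consistent with the description above:

import math

def keep_m_largest(haar_coeffs, M):
    """Parseval-based thresholding sketch: normalize each coefficient by
    sqrt(2^level), keep the M largest in normalized absolute value,
    and implicitly set the rest to zero."""
    N = len(haar_coeffs)

    def level(i):
        # coefficient order: [average, coarsest detail, ..., finest details]
        return 0 if i <= 1 else int(math.floor(math.log2(i)))

    ranked = sorted(range(N),
                    key=lambda i: abs(haar_coeffs[i]) / math.sqrt(2 ** level(i)),
                    reverse=True)
    kept = set(ranked[:M])
    return [c if i in kept else 0 for i, c in enumerate(haar_coeffs)]

print(keep_m_largest([2.75, -1.25, 0.5, 0, 0, -1, -1, 0], 3))
# [2.75, -1.25, 0, 0, 0, -1, 0, 0]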

2.3 Error tree

The wavelet decomposition procedure, followed by any thresholding, can be represented by an error tree [17, 23]. Fig. 1 presents the error tree for the above example. Each internal node of the error tree is associated with a wavelet coefficient, and each leaf is associated with an original signal value. Internal nodes and leaves are labelled separately by 0, 1, ..., N−1. For example, the root is an internal node with label 0 and its node value is 2.75 in Fig. 1. For convenience, we shall use 'node' and 'node value' interchangeably. The construction of the error tree exactly mirrors the wavelet transform procedure. It is a bottom-up process. First, leaves are assigned the original signal values from left to right. Then wavelet coefficients are computed, level by level, and assigned to internal nodes.

Figure 1: Error tree for N = 8

2.4 Reconstruction of original data

Given an error tree T and an internal node t of T, t ≠ α_0, we let leftleaves(t) (rightleaves(t)) denote the set of leaf (i.e., data) nodes in the subtree rooted at t's left (resp., right) child. Also, given any (internal or leaf) node u, we let path(u) be the set of all (internal) nodes in T that are proper ancestors of u (i.e., the nodes on the path from u to the root of T, including the root but not u) with nonzero coefficients. Finally, for any two leaf nodes d_l and d_h, we denote by d(l : h) the range sum Σ_{i=l}^{h} d_i. Using the error tree representation T, we can outline the following reconstruction property of the Haar wavelet decomposition [17, 23]: the reconstruction of any data value d_i depends only on the values of the nodes in path(d_i), namely

d_i = Σ_{α_j ∈ path(d_i)} δ_ij · α_j

where δ_ij = +1 if d_i ∈ leftleaves(α_j) or j = 0, and δ_ij = −1 otherwise.
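In code, this reconstruction property corresponds to a root-to-leaf descent over the coefficient array in the order produced by the transform above (overall average at index 0, internal node k with children 2k and 2k+1). A minimal sketch (ours, not from the paper):

def point_query(coeffs, i):
    """Reconstruct d_i from a (possibly thresholded) Haar coefficient array by
    summing the coefficients on path(d_i), with sign +1 when descending into the
    left child (d_i in leftleaves) and -1 when descending into the right child."""
    N = len(coeffs)
    value = coeffs[0]                 # the overall average (j = 0 term)
    k, lo, hi = 1, 0, N - 1
    while k < N:
        mid = (lo + hi) // 2
        if i <= mid:
            value += coeffs[k]
            k, hi = 2 * k, mid
        else:
            value -= coeffs[k]
            k, lo = 2 * k + 1, mid + 1
    return value

# point_query([2.75, -1.25, 0.5, 0, 0, -1, -1, 0], 5) returns 5.0, i.e., d_5.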

3 The basics of our development

In this section, we describe two basic ideas of our development. As we want to find a synopsis which is optimal with respect to a workload of queries, we need to define the metric by which to measure the quality of the approximation, that is, the value of the approximation error. The standard way is to measure the L2 norm of the error vector. Here we use a generalization of the L2 norm, which takes the workload of queries into account, resulting in a weighted L2 norm. Our synopses then minimize this weighted L2 norm of the error vector. Additionally, we use Parseval's formula and a known theorem that results from it, which deals with optimal approximations with respect to orthonormal bases. We also strengthen this theorem to show that the optimality is achieved over a much broader class of approximations.

3.1 Workload-based error metrics

Let D = (d_0, ..., d_{N−1}) be a sequence with N = 2^j values. Denote the set of point queries by Q = (q_0, ..., q_{N−1}), where q_i is a query whose answer is d_i. Let a workload W = (c_0, ..., c_{N−1}) be a vector of weights that represents the probability distribution from which future point queries are to be generated. Let (u_0, ..., u_{N−1}) be a basis of R^N; then D = Σ_i α_i u_i, and we can represent D by a vector of coefficients (α_0, ..., α_{N−1}). Suppose we want to approximate D using a subset of the coefficients S ⊆ {α_0, ..., α_{N−1}} where |S| = M. Then, for any subset S we can define a weighted norm WL2 with respect to S, that provides a measure for the errors expected for queries drawn from the probability distribution represented by W, when using S as a synopsis. S is then referred to as a workload-based wavelet synopsis. Denote by d̂_i the approximation of d_i using S. There are two standard ways to measure the error over the i-th data value (equivalently, point query): the absolute error, e_a(i) = e_a(q_i) = |d_i − d̂_i|; and the relative error, e_r(i) = e_r(q_i) = |d_i − d̂_i| / max{|d_i|, s}, where s is a positive bound that prevents small values from dominating the relative error. While the standard (non-workload-based) approach is to reduce the L2 norm of the vector of errors (e_1, ..., e_N) (where e_i = e_a(i) or e_i = e_r(i)), here we generalize the L2 norm to reflect the query workload. Let W be a given workload consisting of a vector of query probabilities c_1, ..., c_N, where c_i is the probability that q_i occurs; that is, 0 < c_i ≤ 1, and Σ_{i=1}^{N} c_i = 1. The weighted-L2 norm of the vector of (absolute or relative) errors e = (e_1, ..., e_N) is defined as:

WL2(e) = ‖e‖_w = √( Σ_{i=1}^{N} c_i e_i² )

where 0 < c_i ≤ 1 and Σ_{i=1}^{N} c_i = 1. Thus, each data value d_i, or equivalently each point query q_i, is given some weight c_i that represents its significance. Note that the WL2 norm is the square root of the mean squared error for a point query drawn from the given distribution. Thus, minimizing this norm of the error is equivalent to minimizing the mean squared error of an answer to a query. In general, the weights given to data values need not necessarily represent a probability distribution of point queries; they can encode any other significance measure. For example, in Section 6 we use weights to solve the problem of minimizing the mean-squared relative error measured over the data values (the non-workload-based case). Notice that this is a generalization of the MSE norm: by taking equal weights for each query, meaning c_i = 1/N for each i, and e_i = e_a(i), we get the standard MSE norm. We use the term workload-based error for the WL2 norm of the vector of errors e. When the e_i are absolute (resp. relative) errors, the workload-based error is called the WB-MSE (resp. WB-MRE).
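As a concrete reference for these definitions, here is a minimal sketch (ours, not from the paper) of the workload-based error of an approximation; the helper name and signature are illustrative only:

def wb_error(data, approx, weights, relative=False, s=1.0):
    """Weighted L2 norm sqrt(sum_i c_i * e_i^2) of the per-point absolute or
    relative errors; `weights` holds the query probabilities c_i."""
    total = 0.0
    for d, d_hat, c in zip(data, approx, weights):
        e = abs(d - d_hat)
        if relative:
            e /= max(abs(d), s)       # s keeps tiny values from dominating
        total += c * e * e
    return total ** 0.5

With equal weights c_i = 1/N and absolute errors, this is exactly the mean-squared-error measure of Section 2.2.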

3.2 Optimal thresholding in orthonormal bases

The construction is based on Parseval's formula and a known theorem that results from it (Theorem 1).

3.2.1 Parseval's formula

Let V be a vector space, where v ∈ V is a vector and {u_0, ..., u_{N−1}} is an orthonormal basis of V. We can express v as v = Σ_{i=0}^{N−1} α_i u_i. Then

‖v‖² = Σ_{i=0}^{N−1} α_i²    (1)

An M-term approximation is achieved by representing v using a subset of coefficients S ⊆ {α_0, ..., α_{N−1}} where |S| = M. The error vector is then e = Σ_{i ∉ S} α_i u_i. By Parseval's formula, ‖e‖² = Σ_{i ∉ S} α_i². This proves the following theorem.

Theorem 1 (Parseval-based optimal thresholding) Let V be a vector space, where v ∈ V is a vector and {u_0, ..., u_{N−1}} is an orthonormal basis of V. We can represent v by {α_0, ..., α_{N−1}} where v = Σ_{i=0}^{N−1} α_i u_i. Suppose we want to approximate v using a subset S ⊆ {α_0, ..., α_{N−1}} where |S| = M. Picking the M largest coefficients (in absolute value) for S minimizes the L2 norm of the error vector, over all possible subsets of M coefficients.

Given an inner product, based on this theorem one can easily find an optimal synopsis by choosing the M largest coefficients.

3.3 Optimality over enhanced wavelet synopses

Notice that in the previous section we limited ourselves to picking subsets of coefficients with their original values from the linear combination that spans v (as is usually done). In case {u_0, ..., u_{N−1}} is a wavelet basis, these are the coefficients that result from the wavelet transform. We next show that the optimal thresholding according to Theorem 1 is optimal even according to an enhanced definition of M-term approximation. We define enhanced wavelet synopses as wavelet synopses that allow arbitrary values for the retained wavelet coefficients, rather than the original values that resulted from the transform. The set of possible standard synopses is a subset of the set of possible enhanced synopses, and therefore an optimal synopsis according to the standard definition is not necessarily optimal according to the enhanced definition. An enhanced wavelet synopsis can be, for example, the synopsis described in [8]. In that work, the authors use probabilistic techniques to determine a coefficient's value, such that only its expected value is its original value, in order to probabilistically control the max-error and reduce it. There is a well-known theorem about orthonormal transformations and enhanced synopses, which can be found in several signal processing textbooks.

Theorem 2 (Optimal enhanced wavelet synopses) When using an orthonormal basis, choosing the largest M coefficients with their original values is an optimal enhanced wavelet synopsis.

Proof: The proof is based on the fact that the basis is orthonormal. It is enough to show that, given some synopsis of M coefficients with original values, any change to the values of some subset of coefficients in the synopsis would only make the approximation error larger. Let u_1, ..., u_N be an orthonormal basis and let v = α_1 u_1 + ... + α_N u_N be the vector we would like to approximate by keeping only M wavelet coefficients. Without loss of generality, suppose we choose the first M coefficients and have the following approximation for v: ṽ = Σ_{i=1}^{M} α_i u_i. According to Parseval's formula, ‖e‖² = Σ_{i=M+1}^{N} α_i², since the basis is orthonormal. Now suppose we change the values of some subset of j retained coefficients to new values. Let us see that, due to the orthonormality of the basis, this would only make the error larger.

Without loss of generality, we change the first j coefficients, meaning, we change α_1, ..., α_j to α'_1, ..., α'_j. In this case the approximation would be ṽ = Σ_{i=1}^{j} α'_i u_i + Σ_{i=j+1}^{M} α_i u_i. The approximation error would be v − ṽ = Σ_{i=1}^{j} (α_i − α'_i) u_i + Σ_{i=M+1}^{N} α_i u_i. It is easy to see that the error of the approximation would be:

‖e‖² = ⟨v − ṽ, v − ṽ⟩ = Σ_{i=1}^{j} (α_i − α'_i)² + Σ_{i=M+1}^{N} α_i² > Σ_{i=M+1}^{N} α_i²

4 The workload-based inner product

In this section, we define the problem of finding an optimal workload-based synopsis in terms of a weighted-inner-product space, and solve it relying on this construction. Here we deal with the case where the e_i are the absolute errors (the algorithm minimizes the WB-MSE). An extension to relative errors (WB-MRE) is introduced in Section 6. Our development is as follows:

1. Transforming the data vector D into an equivalent representation as a function f in a space of piecewise constant functions over [0, 1). (Sec. 4.1)
2. Defining the workload-based inner product. (Sec. 4.2)
3. Using the inner product to define an L2 norm, and showing that the newly defined norm is equivalent to the weighted L2 norm (WL2). (Sec. 4.3)
4. Defining a weighted Haar basis which is orthonormal with respect to the new inner product. (Sec. 4.4)

Based on Theorem 1 and Theorem 2, one can then easily find an optimal workload-based wavelet synopsis with respect to a weighted Haar wavelet basis.

4.1 Transforming the data vector into a piecewise constant function

We assume that our approximated data vector D is of size N = 2^j. As in [20], we treat sequences (vectors) of 2^j points as piecewise constant functions defined on the half-open interval [0, 1). In order to do so, we use the concept of a vector space from linear algebra. A sequence of one point is just a function that is constant over the entire interval [0, 1); we let V_0 be the space of all these functions. A sequence of two points is a function that has two constant parts over the intervals [0, 1/2) and [1/2, 1). We call the space containing all these functions V_1. If we continue in this manner, the space V_j will include all piecewise constant functions on the interval [0, 1), with the interval divided equally into 2^j different sub-intervals. We can now think of every one-dimensional sequence D of 2^j values as being an element, or vector f, in V_j.

4.2 Defining a workload-based inner product

The first step is to choose an inner product defined on the vector space V_j. Since we want to minimize a workload-based error (and not the regular L2 error), we start by defining a new workload-based inner product. The new inner product is a generalization of the standard inner product. It is a sum of N = 2^j weighted standard products, each of them defined over an interval of size 1/N:

⟨f, g⟩ = Σ_{i=0}^{N−1} ( c_i · N · ∫_{i/N}^{(i+1)/N} f(x) g(x) dx ),  where 0 < c_i ≤ 1 and Σ_{i=0}^{N−1} c_i = 1    (2)

Lemma 1 ⟨f, g⟩ is an inner product.

Proof: Let us check that it satisfies the conditions of an inner product ⟨·,·⟩ : V_j × V_j → R.

Symmetric:
⟨f, g⟩ = Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} f(x) g(x) dx = Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} g(x) f(x) dx = ⟨g, f⟩

Bilinear:
⟨a f_1 + b f_2, g⟩ = Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} (a f_1 + b f_2)(x) g(x) dx
= a Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} f_1(x) g(x) dx + b Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} f_2(x) g(x) dx = a ⟨f_1, g⟩ + b ⟨f_2, g⟩
and similarly ⟨f, a g_1 + b g_2⟩ = a ⟨f, g_1⟩ + b ⟨f, g_2⟩.

Positive definite:
⟨f, f⟩ = Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} f²(x) dx ≥ 0, and ⟨f, f⟩ = 0 iff f ≡ 0, since c_i > 0 for each i.

As mentioned before, a coefficient c_i represents the probability (or a weight) for the i-th point query q_i to appear. Notice that the answer to q_i is the i-th data value, which is the function's value on the i-th interval. When all coefficients c_i are equal to 1/N (a uniform distribution of queries), we get the standard inner product, and therefore this is a generalization of the standard inner product.
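For two functions of V_j given by their N interval values, each integral in Eq. (2) contributes f_i · g_i / N, so the workload-based inner product collapses to a weighted dot product of the value vectors. A small sketch (ours), assuming the weights are the query probabilities c_i:

def wb_inner(f_vals, g_vals, weights):
    """Workload-based inner product of two piecewise constant functions on [0, 1),
    given by their N interval values: sum_i c_i * f_i * g_i."""
    return sum(c * f * g for c, f, g in zip(weights, f_vals, g_vals))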

4.3 Defining a norm based on the inner product

Based on that inner product we define an inner-product-based (IPB) norm:

‖f‖_IPB = √⟨f, f⟩    (3)

Lemma 2 The norm ‖·‖_IPB measured over the vector of absolute errors is the weighted L2 norm of this vector, i.e., ‖e‖²_IPB = Σ_i c_i e_i² = ‖e‖²_w.

Proof: Let f ∈ V_j be a function and let f̂ ∈ V_j be a function that approximates f. Let the error function be e = f − f̂ ∈ V_j. Then the norm of the error function is

‖e‖²_IPB = ⟨e, e⟩ = Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} e²(x) dx = Σ_{i=0}^{N−1} c_i N ∫_{i/N}^{(i+1)/N} (f − f̂)²(x) dx = Σ_{i=0}^{N−1} c_i (f_i − f̂_i)² = Σ_i c_i e_i²

where f_i and f̂_i are the constant values of f and f̂ on the i-th interval, and e_i is the error on the i-th function value. This is exactly the square of the previously defined weighted L2 norm. Notice that when all coefficients are equal to 1/N we get the regular L2 norm, and therefore this is a generalization of the regular L2 norm (MSE). Our goal is to minimize the workload-based error, which is the WL2 norm of the vector of errors.

4.4 Defining an orthonormal basis

At this stage we would like to use Theorem 1. The next step would thus be finding an orthonormal (with respect to the workload-based inner product) wavelet basis for the space V_j. The basis is a Weighted Haar basis. For each workload-based inner product (defined by a given query workload) there is a corresponding orthonormal weighted Haar basis, and our algorithm finds this basis in linear time, given the workload of point queries. We describe the bases here, and show how to find a basis based on a given workload of point queries. We will later use this information in the algorithmic part. In order to build a weighted Haar basis, we take the Haar basis functions and, for the k-th basis function, multiply its positive (resp. negative) part by some x_k (resp. y_k). We would like to choose x_k and y_k so that we get an orthonormal basis with respect to our inner product. Thus, instead of using Haar basis functions (Fig. 2), we use functions of the kind illustrated in Fig. 3, where x_k and y_k are not necessarily (and probably not) equal, so our basis looks like the one in Fig. 4. One needs to show how to choose x_k and y_k. Let u_k be some Haar basis function as described above. Let [a_{k0}, a_{k1}) be the interval over which the basis function is positive and let [a_{k1}, a_{k2}) be the interval over which the function is negative. Recall that a_{k0}, a_{k1} and a_{k2} are all multiples of 1/N, and therefore each of these intervals precisely contains some number of contiguous intervals of the form [i/N, (i+1)/N) (also a_{k1} = (a_{k0} + a_{k2})/2). Moreover, the size of the interval over which the function is positive (resp. negative) is 1/2^i for some i ≤ j (recall that N = 2^j). Recall also that for the i-th interval of size 1/N, meaning [i/N, (i+1)/N), there is a corresponding weight coefficient c_i, which is the coefficient used in the inner product.

Notice that each Haar basis function is positive (negative) over some number of (whole) such intervals. We can therefore associate the sum of the coefficients of the intervals under the positive (negative) part of the function with the positive (negative) part of the function. Let us denote the sum of weight coefficients (c_i's) corresponding to intervals under the positive (resp. negative) part as l_k (resp. r_k).

Figure 2: An example of a Haar basis function

Lemma 3 Suppose that for each Haar basis function v_k we choose x_k and y_k such that

x_k = √( r_k / (l_k r_k + l_k²) ),    y_k = √( l_k / (l_k r_k + r_k²) )

and multiply the positive (resp. negative) part of v_k by x_k (resp. y_k); by doing that we get an orthonormal set of N = 2^j functions, meaning we get an orthonormal basis.

Proof: We first show that when taking x_k and y_k such that x_k l_k = y_k r_k the basis is orthogonal. It is enough to show that the inner product of any v_k and a constant function is 0. In order to see why that suffices: Let u and v be two Haar basis functions and let I_u and I_v be the intervals over which u and v are different from zero, respectively. If there is some point (interval) over which both functions are different from zero, then by the Haar basis definition we get either I_u ⊆ I_v or I_v ⊆ I_u. Suppose I_v ⊆ I_u; then I_v is contained only in the negative part of I_u or only in the positive part of I_u, again by the Haar basis definition. Consequently, when multiplying u and v within an inner product, there are two possible cases: either there is no point where both functions are different from zero, or the non-zero interval of one function is completely contained in a constant part of the other function. Obviously this holds for our Weighted Haar basis as well. Now, let us verify that the inner product of some v_k with a constant function f(x) = m is zero:

⟨v_k, f⟩ = Σ_i c_i N ∫_{i/N}^{(i+1)/N} v_k(x) f(x) dx = m Σ_i c_i N ∫_{i/N}^{(i+1)/N} v_k(x) dx
= m x_k Σ_{i: v_k > 0 on interval i} c_i − m y_k Σ_{i: v_k < 0 on interval i} c_i = m (x_k l_k − y_k r_k) = 0

Figure 3: An example of a Weighted Haar basis function

Figure 4: The Weighted Haar basis along with the workload coefficients, each coefficient under its corresponding interval. For each level, the functions of that level are different from zero over intervals of equal size.

Now, in order to get an orthonormal basis, all we have to do is to normalize those basis functions.

Let us compute the norm of some v_k whose positive part is set to x_k and whose negative part is set to y_k:

⟨v_k, v_k⟩ = Σ_i c_i N ∫_{i/N}^{(i+1)/N} v_k²(x) dx = x_k² Σ_{i: v_k > 0 on interval i} c_i + y_k² Σ_{i: v_k < 0 on interval i} c_i = x_k² l_k + y_k² r_k

From the orthogonality condition we take y_k = x_k l_k / r_k, and require a unit norm:

x_k² l_k + y_k² r_k = 1
x_k² l_k + (x_k l_k / r_k)² r_k = 1
x_k² (l_k + l_k² / r_k) = 1
x_k = 1 / √(l_k + l_k²/r_k) = √( r_k / (l_k r_k + l_k²) )

So we take

x_k = √( r_k / (l_k r_k + l_k²) ),    y_k = √( l_k / (l_k r_k + r_k²) )

There is a special case, which is the computation of the constant basis function (which represents the total weighted average), v_0(x) = const. We would like the norm of this function to be 1. We just have to put x_k = y_k in the equation x_k² l_k + y_k² r_k = 1 and get f(x) = x_k = y_k = 1/√(l_k + r_k) = const. Again, notice that had all the workload coefficients been equal (c_i = 1/N) we would get the standard Haar basis used to minimize the standard L2 norm. As we have seen, this is an orthonormal basis for our function space. In order to see that it is a wavelet basis, notice that for each k = 1, ..., j, the first 2^k functions are an orthonormal set belonging to V_k (whose dimension is 2^k), and therefore form a basis of V_k.
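These closed forms are easy to sanity-check numerically. A small sketch (ours), using the l_k, r_k notation above:

import math

def wh_parts(l, r):
    """x_k and y_k for a weighted Haar basis function whose positive part covers
    total weight l and whose negative part covers total weight r (Lemma 3)."""
    x = math.sqrt(r / (l * r + l * l))
    y = math.sqrt(l / (l * r + r * r))
    return x, y

l, r = 0.15, 0.35
x, y = wh_parts(l, r)
assert abs(x * l - y * r) < 1e-12              # orthogonality: x_k l_k = y_k r_k
assert abs(x * x * l + y * y * r - 1) < 1e-12  # unit norm: x_k^2 l_k + y_k^2 r_k = 1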

5 The Algorithm for the WWW Transform

In this section we describe the algorithmic part. Given a workload of point queries and a data vector to be approximated, we build workload-based wavelet synopses of the data vector using a weighted Haar basis. The algorithm has two parts:

1. Computing efficiently a Weighted Haar basis, given a workload of point queries. (Sec. 5.1)
2. Computing efficiently the Weighted Haar wavelet transform with respect to the chosen basis. (Sec. 5.2)

5.1 Computing efficiently a weighted Haar basis

Note that at this point we already have a method to find an orthonormal basis with respect to a given workload-based inner product. Recall that in order to know x_k and y_k for every basis function we need to know the corresponding l_k and r_k. We are going to compute all those partial sums in linear time. Suppose that the basis functions are arranged in an array, as in a binary tree representation. The highest resolution functions are at indexes N/2, ..., N−1, which form the lowest level of the tree. The next resolution level functions are at indexes N/4, ..., N/2 − 1, and so on, until the constant basis function at index 0. Notice that for the lowest level (highest resolution) functions (indexes N/2, ..., N−1) we already have their l_k's and r_k's: these are exactly the workload coefficients. This can easily be seen in Fig. 4 for the lower four functions. Notice that after computing the accumulated sums for the functions at resolution level i, we have all the information needed to compute the higher level functions: let u_k be a function at resolution level i and u_{2k}, u_{2k+1} be at level i + 1, with their supports included in u_k's support (u_k is their ancestor in the binary tree of functions). We can use the following formula for computing l_k and r_k:

l_k = l_{2k} + r_{2k},    r_k = l_{2k+1} + r_{2k+1}

See Fig. 4. Thus, we can compute in one pass only the lowest level, and build the upper levels bottom-up (in a way somewhat similar to the Haar wavelet transform). The algorithm consists of phases, where in each phase the functions of a specific level are computed. At the end of a phase, we keep a temporary array holding all the pairwise sums of the l_k's and r_k's from that phase and use them for computing the next phase's functions. Clearly, the running time is O(N). The number of I/Os is O(N/B) (where B is the disk block size), since the process is similar to the computation of the Haar wavelet transform. A pseudo-code of the computation can be found in Fig. 7. The createFunction() procedure takes two sums of weight coefficients, corresponding to the function's positive part and to its negative part, and builds a function whose positive (resp. negative) part's value is x_k (resp. y_k) using the following formulae:

x_k = √( r_k / (l_k r_k + l_k²) ),    y_k = √( l_k / (l_k r_k + r_k²) )

5.2 Computing a weighted Haar wavelet transform

Given the basis, we would like to efficiently perform the wavelet transform with respect to that basis. Let us look at the case of N = 2 (Fig. 5). Suppose we would like to represent the function in Fig. 6. It is easy to compute the following result (denote by α_i the coefficient of f_i):

α_0 = (y v_0 + x v_1) / (x + y),    α_1 = (v_0 − v_1) / (x + y)

(by solving a 2×2 system). Notice that the coefficients are weighted averages and differences, since the transform generalizes the standard Haar transform (when x = y it reduces to the standard Haar averaging and differencing, up to normalization). It is easy to reconstruct the original function from the coefficients:

v_0 = α_0 + x α_1,    v_1 = α_0 − y α_1

This implies a straightforward method to compute the wavelet transform (which is I/O efficient as well), analogous to the way we compute a regular wavelet transform with respect to the Haar basis.

We go over the data and compute the weighted differences, which are the coefficients of the bottom-level functions. We keep the weighted averages, which can be represented solely by the rest of the basis functions (the lower resolution functions, as in the regular Haar wavelet transform), in another array. We repeat the process over the averages again and again until we have the overall average, which is added to our array as the coefficient of the constant function (v_0(x) = const).

Figure 5: Weighted Haar transform with two functions

Figure 6: A simple function with 2 values over [0, 1)

While computing the transform, in addition to reading the values of the signal, we need to read the proper basis function that is relevant for the current stage (in order to use the x_k and y_k of the function that is employed in the above formula). This is easy to do, since all the functions are stored in an array F and the index of a function is determined by the iteration number and is identical to the index of the corresponding currently computed coefficient. A pseudo-code of the algorithm can be found in Fig. 8. The steps of our algorithm are identical to the steps of the Haar algorithm, with the addition of reading the data at F[i] (the x_k and y_k of the function) during the i-th iteration. Therefore the I/O complexity of this phase remains O(N/B) (B is the disk block size), with O(N) running time. After obtaining the coefficients of the orthonormal basis we keep the largest M coefficients, along with their corresponding M functions, and discard the smallest coefficients. This can be done efficiently using an M-approximate quantile algorithm [13]. Based on Theorem 1 we obtain an optimal synopsis.

6 Optimal synopsis for mean-squared relative error

We show how to minimize the weighted L2 norm of the vector of relative errors, weighted by the query workload, by using weighted wavelets. As a special case, this minimizes the mean-squared relative error measured over the data values. Recall that in order to minimize the weighted L2 norm of relative errors, we need to minimize Σ_{i=1}^{N} c_i ((d_i − d̂_i) / max{|d_i|, s})². For simplicity, we show instead how to minimize Σ_{i=1}^{N} c_i ((d_i − d̂_i) / d_i)²; the extension to the above is straightforward. Since D = d_1, ..., d_N is part of the input of the algorithm, it is fixed throughout the algorithm's execution. We can thus divide each c_i by d_i² and get a new vector of weights W' = (c_1/d_1², ..., c_N/d_N²). Relying on our previous results, and using the new vector of weights, we minimize Σ_{i=1}^{N} (c_i/d_i²)(d_i − d̂_i)² = Σ_{i=1}^{N} c_i ((d_i − d̂_i)/d_i)², which is the WL2 norm of the relative errors. Notice that in the case c_i = 1/N (the uniform case) the algorithm minimizes the mean-squared relative error over all data values. As far as we know, this is the first algorithm that minimizes the mean-squared relative error over the data values.
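In practice, the weight adjustment of Section 6 is a one-line transformation of the workload. A hedged sketch (the helper name is ours); it uses the max{|d_i|, s} guard of the general form, and renormalizing the new weights to sum to 1 only rescales the norm, so it does not change which coefficients are retained:

def wb_mre_weights(data, workload, s=1.0):
    """Divide each query probability c_i by max(|d_i|, s)^2, so that the WB-MSE
    machinery applied to the new weights minimizes the workload-based
    mean-squared relative error under the original ones."""
    raw = [c / max(abs(d), s) ** 2 for c, d in zip(workload, data)]
    total = sum(raw)
    return [w / total for w in raw]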

input: an array W of N weight coefficients
output: an array F of N basis functions

temp.length = N/2
for i = 0 to N/2 - 1
    F[N/2 + i] = createFunction(W[2*i], W[2*i + 1])
    temp[i] = W[2*i] + W[2*i + 1]
while temp.length > 1
    temp.length /= 2
    offset = temp.length
    for i = 0 to temp.length - 1
        F[offset + i] = createFunction(temp[2*i], temp[2*i + 1])
        temp[i] = temp[2*i] + temp[2*i + 1]
F[0] = createConstFunction(1/temp[0])

Figure 7: Construction of a WH basis

input: an array D of N data values, an array F of N basis functions
output: an array Res of N wavelet coefficients

for i = 0 to N/2 - 1
    Res[N/2 + i] = (D[2*i] - D[2*i + 1]) / (F[N/2 + i].pos + F[N/2 + i].neg)
    temp[i] = (D[2*i] * F[N/2 + i].neg + D[2*i + 1] * F[N/2 + i].pos) / (F[N/2 + i].pos + F[N/2 + i].neg)
while temp.length > 1
    temp.length /= 2
    offset = temp.length
    for i = 0 to temp.length - 1
        Res[offset + i] = (temp[2*i] - temp[2*i + 1]) / (F[offset + i].pos + F[offset + i].neg)
        temp[i] = (temp[2*i] * F[offset + i].neg + temp[2*i + 1] * F[offset + i].pos) / (F[offset + i].pos + F[offset + i].neg)
Res[0] = temp[0] / F[0].constValue

Figure 8: The wavelet transform
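For concreteness, here is a compact Python rendering of Figures 7 and 8 (a sketch under our own conventions, not the authors' code): each basis function is stored in error-tree order as an (x_k, y_k) pair, the constant function's value is kept at index 0, and a point-value reconstruction is included as a correctness check. Weights are assumed positive and to sum to 1, and N is assumed to be a power of two.

import math

def build_wh_basis(weights):
    """Figure 7 in Python: bottom-up computation of (x_k, y_k) per basis function
    from the partial weight sums l_k, r_k."""
    N = len(weights)
    F = [None] * N                    # F[k] = (x_k, y_k) for k >= 1
    sums = list(weights)              # l + r of the functions one level down
    size = N
    while size > 1:
        half = size // 2
        nxt = [0.0] * half
        for i in range(half):
            l, r = sums[2 * i], sums[2 * i + 1]
            F[half + i] = (math.sqrt(r / (l * r + l * l)),
                           math.sqrt(l / (l * r + r * r)))
            nxt[i] = l + r
        sums, size = nxt, half
    F[0] = 1.0 / math.sqrt(sums[0])   # constant basis function value
    return F

def wh_transform(data, F):
    """Figure 8 in Python: weighted differences become coefficients, weighted
    averages are propagated one level up."""
    N = len(data)
    res = [0.0] * N
    cur = list(data)
    size = N
    while size > 1:
        half = size // 2
        nxt = [0.0] * half
        for i in range(half):
            x, y = F[half + i]
            res[half + i] = (cur[2 * i] - cur[2 * i + 1]) / (x + y)
            nxt[i] = (cur[2 * i] * y + cur[2 * i + 1] * x) / (x + y)
        cur, size = nxt, half
    res[0] = cur[0] / F[0]
    return res

def wh_point_value(coeffs, F, i):
    """Reconstruct d_i: v0 = a0 + x*a1 on the positive side, v1 = a0 - y*a1 on
    the negative side, applied along the path from the root to leaf i."""
    N = len(coeffs)
    value = coeffs[0] * F[0]
    k, lo, hi = 1, 0, N - 1
    while k < N:
        x, y = F[k]
        mid = (lo + hi) // 2
        if i <= mid:
            value += x * coeffs[k]
            k, hi = 2 * k, mid
        else:
            value -= y * coeffs[k]
            k, lo = 2 * k + 1, mid + 1
    return value

data = [2, 2, 0, 2, 3, 5, 4, 4]
weights = [0.30, 0.20, 0.15, 0.10, 0.10, 0.05, 0.05, 0.05]
F = build_wh_basis(weights)
coeffs = wh_transform(data, F)
assert all(abs(wh_point_value(coeffs, F, i) - d) < 1e-9 for i, d in enumerate(data))

Keeping the M coefficients that are largest in absolute value then yields the optimal WB-MSE synopsis for this workload, by Theorem 1.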

7 Experiments

In this section we demonstrate the advantage obtained by our workload-based wavelet synopses. All our experiments were done using the τ-Synopses system [15]. For our experimental studies we used both synthetic and real-life data sets. The synthetic data-sets are taken from the TPC-H data, and the real-life data-sets are taken from the Forest CoverType data provided by the KDD data archive of the University of California. The data-sets are:

1. TPCH:
   - TPCH1 - Data attribute 1 from table ORDERS, filtered by attribute O_CUSTKEY, which contains about 150,000 distinct values.
2. KDD:
   - KDD2048 - Data attribute Aspect from table CovTypeAgr, filtered by Elevation from the KDD data, with a total of 2048 distinct values.
   - KDD512 - Data attribute Elevation from table CovTypeAgr, filtered by Aspect from the KDD data, with a total of 512 distinct values.
   - KDD128 - Data attribute Aspect from table CovTypeAgr, filtered by Slope from the KDD data, with a total of 128 distinct values.

The sets of queries were generated independently by a Zipf distribution generator. We used queries of different skews, distributed by several Zipf parameter values. We took Zipf parameters in the range between 0.2 and 0.8, in order to test the behavior of the synopses under different skews, ranging from close-to-uniform to highly skewed. A separate set of queries was generated over each data set.

7.1 Optimality with respect to the WB-MSE measure

In Fig. 9 we compare the standard wavelet synopsis from [17, 23] with our WB-MSE wavelet synopsis. The standard synopsis is depicted with a solid line. We measured the WB-MSE as a function of synopsis size, measured as the number of coefficients in the synopsis. For each M = 10, 20, ..., 100 we built synopses of size M using both methods and compared the WB-MSE error, measured with respect to a given workload of queries. The workload contained 5000 Zipf-distributed point queries, with a Zipf parameter of 0.5. The data-set was the TPCH1 data. As the synopsis size increases, the error of the workload-based algorithm becomes much smaller than the error of the standard algorithm. The reason for this is that synopses of sizes 10, ..., 100 are very small with respect to data of size 150,000. Since the standard algorithm does not take the query workload into account, the results are more or less the same for all synopsis sizes in the experiment.

However, the workload-based synopsis adapts itself to the query workload, which is of size 5000. All the data values which are not queried by the workload were given very small importance weights by the algorithm, so the synopsis actually had to be accurate over fewer than 5000 values. Thus, there is a sharp decrease in the error of the workload-based algorithm as the synopsis size increases.

Figure 9: Comparing the WB-MSE of the standard and the workload-based synopses, for different synopsis sizes. Data: TPCH1. Workload: 5000 queries distributed as Zipf(0.5). The error of the optimal synopsis sharply decreases as the synopsis size increases.

In Fig. 10 we report a similar experiment, this time with the KDD2048 data. The standard synopsis is again depicted with a solid line. As in the previous experiment, we measured the WB-MSE as a function of synopsis size. For each M = 20, 40, ..., 200 we built synopses of size M using both methods and compared the WB-MSE error, measured with respect to a given workload of queries. The workload contained 5000 Zipf-distributed point queries, with a Zipf parameter of 0.5. The data was the KDD2048 data, of size 2048. We see that for each synopsis size the error of the standard algorithm is approximately twice the error of the workload-based algorithm. The reason for this is that here the query workload is larger than the data-set, in contrast to the previous experiment. Thus, most of the data is queried by the workload, so the importance weights given to data values were more uniform than in the previous experiment. Therefore, the error difference is smaller than in the previous experiment, since the advantage of the workload-based algorithm becomes more significant as the workload gets more skewed. However, since the workload-based synopsis adapts itself to the workload, the error is still better than that of the standard synopsis, which assumes a uniform distribution.

Figure 10: Comparing the WB-MSE of the standard and the workload-based synopses, for different synopsis sizes. Data: KDD2048. Workload: 5000 queries distributed as Zipf(0.5). The error of the optimal synopsis is about half that of the standard synopsis, across the synopsis sizes.

Figure 11: Comparing the WB-MRE of the standard synopsis, the workload-based adaptive-greedy synopsis and our WB-MRE synopsis, for different synopsis sizes. Data: KDD2048. Workload: 5000 queries distributed as Zipf(0.5). Since the adaptive-greedy and the WB-MRE synopses are indistinguishable at this scale, an elaboration of that zone is given in Fig. 12. It can be clearly seen that the workload-based synopses achieve a smaller error than the standard synopsis.

7.2 Optimality with respect to the WB-MRE measure

In Fig. 11 we compare the standard wavelet synopsis from [17, 23] and the adaptive-greedy workload-based wavelet synopsis from [14, 19] with our WB-MRE wavelet synopsis. The standard synopsis is depicted with a dotted line marked with x's. Since it is hard to distinguish between the other two synopses at this resolution, we zoom into this figure in Fig. 12. We measured the WB-MRE as a function of synopsis size, measured as the number of coefficients in the synopsis. For each M = 20, 40, ..., 200 we built synopses of size M using the three methods and compared the WB-MRE error, measured with respect to a given workload of queries. The workload contained 3000 Zipf-distributed point queries, with a Zipf parameter of 0.5. The data-set was the KDD2048 data. Since the standard algorithm does not take the query workload into account and is not adapted for relative errors, its approximation error is many times larger than the approximation errors of the workload-based algorithms, for every synopsis size. In Fig. 12 we compare the adaptive-greedy workload-based synopsis from [14, 19] with our WB-MRE synopsis. The adaptive-greedy synopsis is depicted with a solid line. We measured the WB-MRE as a function of synopsis size, measured as the number of coefficients in the synopsis. For each M = 20, 40, ..., 200 we built synopses of size M using the two methods and compared the WB-MRE error, measured with respect to a given workload of queries. The workload contained 5000 Zipf-distributed point queries, with a Zipf parameter of 0.5. The data-set was the KDD2048 data. For each synopsis size, the approximation error of the adaptive-greedy synopsis is several times larger than the error of our WB-MRE algorithm.

Figure 12: Comparing the WB-MRE of the workload-based adaptive-greedy synopsis and our WB-MRE synopsis, for different synopsis sizes. Data: KDD2048. Workload: 5000 queries distributed as Zipf(0.5). The WB-MRE synopsis achieves a significantly better approximation than the adaptive-greedy algorithm.

7.3 WB error for different workload skews

In Fig. 13 we depict the WB-MRE as a function of synopsis size, for three given query workloads, distributed with Zipf parameters 0.2, 0.5 and 0.8. The data-set was the KDD2048 data-set, and the workloads consisted of 5000 queries. For each of the three given workloads we built synopses of size M = 50, 100, ..., 500 and depicted the WB-MRE as a function of synopsis size. It can be seen that many wavelet coefficients can be ignored before the error significantly increases. This is a desired feature for any synopsis. For example, for synopses of size 500 the WB-MRE is smaller than 0.05, and for synopses of size 250 the WB-MRE is smaller than 0.1. It can also be seen that the higher the skew, the more accurate the workload-based synopses. The reason is that when the skew gets higher, the synopsis needs to be accurate over a smaller number of data values. In Fig. 14 we compare the ratio between the approximation error of the standard algorithm [17, 23] and the approximation error of the WB-MSE algorithm, as a function of workload skew. The comparison is over different query workloads, distributed with Zipf parameters ranging from 0 (uniform) to 0.9 (highly skewed). The data-set was the KDD2048. For each given workload we measured the error ratio between the two synopses. It can be seen that the higher the skew of the workload, the higher the ratio between the approximation errors of the synopses. The reason is that as the workload gets farther from uniform, the advantage of the workload-based algorithm naturally becomes more significant over the standard synopsis, which assumes a uniform workload.

7.4 Robustness to workload skew deviations

In this section we discuss the problem of incorrect future workload estimation, and test the WWW synopses' behavior in such cases. Specifically, a synopsis can be built assuming a specific distribution, and be used later to answer queries taken from a different distribution. A synopsis is said to be robust to errors in the workload estimation if small errors in the estimation introduce small errors in the quality of the approximation. There are different ways to measure the robustness of a synopsis in such cases, as the actual workload can differ from the estimated workload in many different ways. We chose to focus on the Zipf distribution, as it is believed to be common in


More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optimal Workload-based Weighted Wavelet Synopses Yossi Matias and Daniel Urieli School of Computer Science Tel-Aviv University {matias,daniel1}@tau.ac.il Abstract. In recent years wavelets were shown to

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

5 The Primal-Dual Method

5 The Primal-Dual Method 5 The Prmal-Dual Method Orgnally desgned as a method for solvng lnear programs, where t reduces weghted optmzaton problems to smpler combnatoral ones, the prmal-dual method (PDM) has receved much attenton

More information

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices

High resolution 3D Tau-p transform by matching pursuit Weiping Cao* and Warren S. Ross, Shearwater GeoServices Hgh resoluton 3D Tau-p transform by matchng pursut Wepng Cao* and Warren S. Ross, Shearwater GeoServces Summary The 3D Tau-p transform s of vtal sgnfcance for processng sesmc data acqured wth modern wde

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Polyhedral Compilation Foundations

Polyhedral Compilation Foundations Polyhedral Complaton Foundatons Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty Feb 8, 200 888., Class # Introducton: Polyhedral Complaton Foundatons

More information

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss. Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

LECTURE : MANIFOLD LEARNING

LECTURE : MANIFOLD LEARNING LECTURE : MANIFOLD LEARNING Rta Osadchy Some sldes are due to L.Saul, V. C. Raykar, N. Verma Topcs PCA MDS IsoMap LLE EgenMaps Done! Dmensonalty Reducton Data representaton Inputs are real-valued vectors

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016) Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)

More information

On Some Entertaining Applications of the Concept of Set in Computer Science Course

On Some Entertaining Applications of the Concept of Set in Computer Science Course On Some Entertanng Applcatons of the Concept of Set n Computer Scence Course Krasmr Yordzhev *, Hrstna Kostadnova ** * Assocate Professor Krasmr Yordzhev, Ph.D., Faculty of Mathematcs and Natural Scences,

More information

Report on On-line Graph Coloring

Report on On-line Graph Coloring 2003 Fall Semester Comp 670K Onlne Algorthm Report on LO Yuet Me (00086365) cndylo@ust.hk Abstract Onlne algorthm deals wth data that has no future nformaton. Lots of examples demonstrate that onlne algorthm

More information

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Інформаційні технології в освіті ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Some aspects of programmng educaton

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm Internatonal Journal of Advancements n Research & Technology, Volume, Issue, July- ISS - on-splt Restraned Domnatng Set of an Interval Graph Usng an Algorthm ABSTRACT Dr.A.Sudhakaraah *, E. Gnana Deepka,

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Fast Computation of Shortest Path for Visiting Segments in the Plane

Fast Computation of Shortest Path for Visiting Segments in the Plane Send Orders for Reprnts to reprnts@benthamscence.ae 4 The Open Cybernetcs & Systemcs Journal, 04, 8, 4-9 Open Access Fast Computaton of Shortest Path for Vstng Segments n the Plane Ljuan Wang,, Bo Jang

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

CHAPTER 2 DECOMPOSITION OF GRAPHS

CHAPTER 2 DECOMPOSITION OF GRAPHS CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Lecture 5: Probability Distributions. Random Variables

Lecture 5: Probability Distributions. Random Variables Lecture 5: Probablty Dstrbutons Random Varables Probablty Dstrbutons Dscrete Random Varables Contnuous Random Varables and ther Dstrbutons Dscrete Jont Dstrbutons Contnuous Jont Dstrbutons Independent

More information

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements

Explicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Self-tuning Histograms: Building Histograms Without Looking at Data

Self-tuning Histograms: Building Histograms Without Looking at Data Self-tunng Hstograms: Buldng Hstograms Wthout Lookng at Data Ashraf Aboulnaga Computer Scences Department Unversty of Wsconsn - Madson ashraf@cs.wsc.edu Surajt Chaudhur Mcrosoft Research surajtc@mcrosoft.com

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Announcements. Supervised Learning

Announcements. Supervised Learning Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples

More information

Brushlet Features for Texture Image Retrieval

Brushlet Features for Texture Image Retrieval DICTA00: Dgtal Image Computng Technques and Applcatons, 1 January 00, Melbourne, Australa 1 Brushlet Features for Texture Image Retreval Chbao Chen and Kap Luk Chan Informaton System Research Lab, School

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array Inserton Sort Dvde and Conquer Sortng CSE 6 Data Structures Lecture 18 What f frst k elements of array are already sorted? 4, 7, 1, 5, 1, 16 We can shft the tal of the sorted elements lst down and then

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Bran Curless Sprng 2008 Announcements (5/14/08) Homework due at begnnng of class on Frday. Secton tomorrow: Graded homeworks returned More dscusson

More information