Speeding Up the Xbox Recommender System Using a Euclidean Transformation for Inner-Product Spaces


Ran Gilad-Bachrach (Microsoft Research), Yoram Bachrach (Microsoft Research), Nir Nice (Microsoft R&D), Liran Katzir (Computer Science, Technion), Yehuda Finkelstein (Microsoft R&D), Ulrich Paquet (Microsoft Research), Noam Koenigstein (Microsoft R&D)

ABSTRACT

A prominent approach in collaborative filtering based recommender systems is using dimensionality reduction (matrix factorization) techniques to map users and items into low-dimensional vectors. In such systems, a higher inner product between a user vector and an item vector indicates that the item better suits the user's preference. Traditionally, retrieving the most suitable items is done by scoring and sorting all items. Real-world online recommender systems must adhere to strict response-time constraints, so when the number of items is large, scoring all items is intractable. We propose a novel order-preserving transformation, mapping the maximum inner product search problem to a Euclidean space nearest neighbor search problem. Utilizing this transformation, we study the efficiency of several (approximate) nearest neighbor data structures. Our final solution is based on a novel use of the PCA-Tree data structure in which results are augmented using paths one Hamming distance away from the query (neighborhood boosting). The end result is a system which allows approximate matches (items with a relatively high inner product, but not necessarily the highest one). We evaluate our techniques on two large-scale recommendation datasets, Xbox Movies and Yahoo Music, and show that this technique allows trading off a slight degradation in the recommendation quality for a significant improvement in the retrieval time.
Categories and Subject Descriptors: H.5 [Information Systems]: Information Retrieval: retrieval models and ranking, retrieval tasks and goals

Keywords: Recommender systems, matrix factorization, inner product search, fast retrieval

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. RecSys'14, October 6-10, 2014, Foster City, Silicon Valley, CA, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM.

1. INTRODUCTION

The massive growth in online services data gives rise to the need for better information filtering techniques. In the context of recommender systems, the data consists of (1) the item catalog; (2) the users; and (3) the user feedback (ratings). The goal of a recommender system is to find, for every user, a limited set of items that have the highest chance to be consumed. Modern recommender systems have two major parts. In the first part, the learning phase, a model is learned (offline) based on user feedback¹. In the second part, the retrieval phase, recommendations are issued per user (online). This paper studies the scalability of the retrieval phase (the second part) in massive recommender systems based on matrix factorization. Specifically, we introduce a new approach which offers a trade-off between running time and the quality of the results presented to a user. Matrix Factorization (MF) is one of the most popular approaches for collaborative filtering. This method has repeatedly demonstrated better accuracy than other methods such as nearest neighbor models and restricted Boltzmann machines [2, 8].
In MF models, users and items are represented by latent feature vectors. A Bayesian MF model is also at the heart of the Xbox recommendation system [16], which serves games, movies, and music recommendations to millions of users daily. In this system, users and items are represented by (low-dimensional) vectors in R^50. The quality of the match between a user u represented by the vector x_u and the item i represented by the vector y_i is given by the inner product x_u . y_i between these two vectors. A higher inner product implies a higher chance of the user consuming the item.

The Retrieval Problem: Ideally, given a user u represented by a vector x_u, all the item vectors (y_1, ..., y_n) are examined. For each such item vector y_i, its match quality with the user (x_u . y_i) is computed, and the items are sorted according to their match quality. The items with the highest match quality in the list are then selected to form the final list of recommendations. However, the catalog of items is often too large to allow an exhaustive computation of all the inner products within a limited allowed retrieval time. The Xbox catalog consists of millions of items of various kinds. If a linear scan is used, millions of inner product computations are required for each single recommendation. The

¹ This phase cannot be done entirely offline when a context is used to issue the recommended items.

user vectors can take into account contextual information² that is only available during user engagement. Hence, the complete user vector is computed online (at runtime). As a result, the retrieval of the recommended items list can only be performed online, and cannot be pre-computed offline. This task constitutes the single most computationally intensive task imposed on the online servers; thereby, having a fast alternative for this process is highly desirable.

Our Contribution: This paper shows how to significantly speed up the recommendation retrieval process. The optimal item-user match retrieval is relaxed to an approximate search: retrieving items that have a high inner product with the user vector, but not necessarily the highest one. The approach combines several building blocks. First, we define a novel transformation from the inner product problem to a Euclidean nearest neighbor problem (Section 3). As a pre-processing step, this transformation is applied to the item vectors. During item retrieval, another transformation is applied to the user vector. The item with the smallest Euclidean distance in the transformed space is then retrieved. To expedite the nearest neighbor search, the PCA-Tree [21] data structure is used together with a novel neighborhood boosting scheme (Section 4). To demonstrate the effectiveness of the proposed approach, it is applied to an Xbox recommendations dataset and the publicly available Yahoo Music dataset [8]. Experiments show a trade-off curve of a slight degradation in the recommendation quality for a significant improvement in the retrieval time (Section 5). In addition, the achievable time-accuracy trade-offs are compared with two baseline approaches: an implementation based on Locality Sensitive Hashing [1], and the current state-of-the-art method for approximate recommendation in matrix-factorization based CF systems [13]. We show that for a given required recommendation quality (accuracy in picking the optimal items), our approach achieves a much higher speedup than these alternatives.

² The contextual information may include the time of day, recent search queries, etc.
Notation: We use lower-case fonts for scalars, bold lower-case fonts for vectors, and bold upper-case fonts for matrices. For example, x is a scalar, x is a vector, and X is a matrix. Given a vector x in R^d, let x_i be its i-th component, with x = (x_1, x_2, ..., x_d)^T in R^d. The norm is denoted by ||.||; in Euclidean space, ||x|| = sqrt(sum_{i=1}^d x_i^2). We denote by x . y the dot product (inner product) between x and y. Finally, we use (a, x^T)^T to denote the concatenation of a scalar a with a vector x.

2. BACKGROUND AND RELATED WORK

In this section we explain the problem of finding the best recommendations in MF models, and review possible approaches for efficient retrieval of recommendations.

2.1 Matrix Factorization Based Recommender Systems

In MF models, each user u is associated with a user-traits vector x_u in R^d, and each item i with an item-traits vector y_i in R^d. The predicted rating of a user u for an item i is denoted by r^hat_ui and obtained using the rule

    r^hat_ui = mu + b_u + b_i + x_u . y_i,    (1)

where mu is the overall mean rating value and b_i and b_u represent the item and user biases respectively. The above model is a simple baseline model similar to [14]. It can be readily extended to form the core of a variety of more complex MF models, and adapted to different kinds of user feedback. While mu and b_u are important components of the model, they do not affect the ranking of items for any given user, and the rule r_ui = b_i + x_u . y_i will produce the same set of recommendations as that of Equation 1. We can also concatenate the item bias b_i to the user vector and reduce the prediction rule to a simple dot product: r_ui = x_bar_u . y_bar_i, where x_bar_u = (1, x_u^T)^T and y_bar_i = (b_i, y_i^T)^T. Hence, computing recommendations in MF models amounts to a simple search in an inner product space: given a user vector x_bar_u, we wish to find items with vectors y_bar_i that maximize the inner product x_bar_u . y_bar_i. For the sake of readability, from this point onward we drop the bar and refer to x_bar_u and y_bar_i as x_u and y_i.
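The bias-folding step above is easy to check numerically. The following is a small self-contained sketch (array names and sizes are ours, for illustration only) showing that concatenating (1, x_u^T)^T and (b_i, y_i^T)^T reproduces b_i + x_u . y_i:

```python
import numpy as np

# Toy check of the bias-folding trick: with x_bar = (1, x_u) and
# y_bar = (b_i, y_i), the dot product x_bar . y_bar equals b_i + x_u . y_i.
rng = np.random.default_rng(0)
d, n = 50, 1000
X = rng.normal(size=(5, d))      # a few user-trait vectors
Y = rng.normal(size=(n, d))      # item-trait vectors
b = rng.normal(size=n)           # item biases

X_bar = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend the constant 1
Y_bar = np.hstack([b[:, None], Y])                 # prepend the item bias

scores_biased = b[None, :] + X @ Y.T   # b_i + x_u . y_i
scores_concat = X_bar @ Y_bar.T        # x_bar . y_bar
assert np.allclose(scores_biased, scores_concat)
```

Since the two score matrices are identical, ranking by the concatenated dot product yields exactly the recommendations of Equation 1.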
We therefore focus on the problem of finding maximal inner product matches as described above.

2.2 Retrieval of Recommendations in Inner-Product Spaces

The problem of efficient retrieval of recommendations in MF models is relatively new, but it has been discussed in the past [10, 11, 13]. In real-world large-scale systems such as the Xbox Recommender, this is a concrete problem, and we identified it as the main bottleneck that drains our online resources. Previous studies can be categorized into two basic approaches. The first approach is to propose new recommendation algorithms in which the prediction rule is not based on inner-product matches. This was the approach taken by Khoshneshin et al. [10], who were first to raise the problem of efficient retrieval of recommendations in MF models. In [10] a new model is proposed in which users and items are embedded based on their Euclidean similarity rather than their inner product. In a Euclidean space, the plethora of algorithms for nearest-neighbor search can be utilized for efficient retrieval of recommendations. A similar approach was taken by [11], where an item-oriented model was designed to alleviate retrieval of recommendations by embedding items in a Euclidean space. While these methods show significant improvements in retrieval times, they deviate from the well-familiar MF framework. These approaches, which are based on new algorithms, do not benefit the core of existing MF-based recommender systems in which the retrieval of recommendations is still based on inner products. The second approach to this problem is based on designing new algorithms to mitigate maximal inner-product search. These algorithms can be used in any existing MF-based system, and require only implementing a new data structure on top of the recommender to assist in the retrieval phase. For example, in [13] a new IP-Tree data structure was proposed that enables a branch-and-bound search scheme in inner-product spaces. In order to reach higher speedup values, the IP-Tree was combined with spherical user clustering that allows pre-computing and caching recommendations for similar users.
However, this approach requires prior knowledge of all the user vectors, which is not available in systems such as the Xbox recommender, where ad-hoc contextual information is used to update the user vectors. This work was later continued in [18] for the general problem of maximal inner-product search, but these extensions showed effectiveness in high-dimensional sparse datasets, which is not the case for vectors generated by an MF process. This paper builds upon a novel transformation that reduces the maximal inner-product problem to a simple nearest neighbor search in a Euclidean space. On the one hand, the proposed approach can be employed by any classical MF model; on the other hand, it enables using any of the existing algorithms for Euclidean spaces. Next, we review several alternatives for solving the problem in a Euclidean space.

2.3 Nearest Neighbor in Euclidean Spaces

Locality Sensitive Hashing (LSH) was recently popularized as an effective approximate retrieval algorithm. LSH was introduced by Broder et al. to find documents with high Jaccard similarity [4]. It was later extended to other metrics including the Euclidean distance [9], cosine similarity [5], and earth mover distance [5]. A different approach is based on space partitioning trees: a KD-tree [3] is a data structure that partitions R^d into hyper-rectangular (axis-parallel) cells. At construction time, nodes are split along one coordinate. At query time, one can efficiently search for all points in a rectangular box, and for nearest neighbors. Several augmented splits are used to improve the query time. For example, (1) Principal component axes trees (PCA-Trees) transform the original coordinates to the principal components [21]; (2) Principal Axis Trees (PAC-Trees) [15] use a principal component axis at every node; (3) Random Projection Trees (RPT) use a random axis at each node [6]; and (4) Maximum Margin Trees (MMT) use a maximum margin axis at every node [20]. A theoretical and empirical comparison of some variants can be found in [19]. Our approach makes use of PCA-Trees, and combines them with a novel neighborhood boosting scheme. In Section 5 we compare to alternatives such as LSH, KD-Trees, and PAC-Trees. We do not compare against MMT and RPT, as we do not see their advantage over the other methods for the particular problem at hand.

3. REDUCIBLE SEARCH PROBLEMS

A key contribution of this work is focused on the concept of efficient reductions between search problems.
In this section we formalize the concept of a search problem and show efficient reductions between known variants. We define a search problem as:

Definition 1. A search problem S(I, Q, s) consists of an instance set of n items I = {i_1, i_2, ..., i_n} in the instance domain I, a query q in Q, and a search function s : I x Q -> {1, 2, ..., n}. The function s retrieves the index of an item in I for a given query q.

The goal is to pre-process the items with g : I -> I* such that each query is answered efficiently. The preprocessing g can involve a transformation from one domain to another, so that a transformed search problem can operate on a different domain. The following definition formalizes the reduction concept between search problems:

Definition 2. A search problem S_1(I, Q, s_1) is reducible to a search problem S_2(I*, Q*, s_2), denoted by S_1 <= S_2, if there exist functions g : I -> I* and h : Q -> Q* such that j = s_1(I, q) if and only if j = s_2(g(I), h(q)).

This reduction does not place any constraints on the running time of g and h. Note that g runs only once as a pre-processing step, while h is applied at query time. This yields a requirement that h have an O(1) running time. We formalize this with the following notation:

Definition 3. We say that S_1 <=_{O(f(n))} S_2 if S_1 <= S_2 and the running times of g and h are O(f(n)) and O(1) respectively.

For a query vector in R^d, we consider three search problems in this paper: MIP, the maximum inner product from n vectors in R^d (MIP_{n,d}); NN, the nearest neighbor from n vectors in R^d (NN_{n,d}); and MCS, the maximum cosine similarity from n vectors in R^d (MCS_{n,d}). They are formally defined as follows:

Instance: A matrix of n vectors Y = [y_1, y_2, ..., y_n] such that y_i is in R^d; therefore I = R^{d x n}.
Query: A vector x in R^d; hence Q = R^d.
Objective: Retrieve an index according to

    s(Y, x) = argmax_i x . y_i                      (MIP_{n,d})
    s(Y, x) = argmin_i ||x - y_i||                  (NN_{n,d})
    s(Y, x) = argmax_i x . y_i / (||x|| ||y_i||)    (MCS_{n,d})

where i indicates a column of Y.
The following section shows how transformations between these three problems can be achieved, with MCS_{n,d} <=_{O(n)} MIP_{n,d} <=_{O(n)} NN_{n,d+1} and NN_{n,d} <=_{O(n)} MCS_{n,d+1} <=_{O(n)} MIP_{n,d+1}.

3.1 Order Preserving Transformations

The triangle inequality does not hold between vectors x, y_i, and y_j when an inner product compares them, as is the case in MIP. Many efficient search data structures rely on the triangle inequality, and if MIP can be transformed to NN with its Euclidean distance, these data structures would immediately become applicable. Our first theorem states that MIP can be reduced to NN by having a Euclidean metric in one more dimension than the original problem.

Theorem 1. MIP_{n,d} <=_{O(n)} NN_{n,d+1}

Proof: Let phi = max_i ||y_i|| and preprocess the input with

    y~_i = g(y_i) = (sqrt(phi^2 - ||y_i||^2), y_i^T)^T.

During query time, x~ = h(x) = (0, x^T)^T. We have

    ||x~||^2 = ||x||^2
    ||y~_i||^2 = phi^2 - ||y_i||^2 + ||y_i||^2 = phi^2
    x~ . y~_i = sqrt(phi^2 - ||y_i||^2) * 0 + x . y_i = x . y_i
    ||x~ - y~_i||^2 = ||x~||^2 + ||y~_i||^2 - 2 x~ . y~_i = ||x||^2 + phi^2 - 2 x . y_i.

Finally, as phi and ||x|| are independent of the index i,

    j = argmin_i ||x~ - y~_i||^2 = argmax_i x . y_i.
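Theorem 1 is easy to verify numerically. The following is a minimal sketch in NumPy (variable names and sizes are ours, not from the paper): after appending sqrt(phi^2 - ||y_i||^2) to every item and a 0 to the query, the Euclidean nearest neighbor is exactly the maximum-inner-product item.

```python
import numpy as np

# Sketch of the Theorem 1 reduction MIP -> NN in one extra dimension.
rng = np.random.default_rng(1)
n, d = 10000, 50
Y = rng.normal(size=(n, d))   # item vectors (rows)
x = rng.normal(size=d)        # query (user) vector

norms = np.linalg.norm(Y, axis=1)
phi = norms.max()
# g(y_i): prepend the auxiliary coordinate sqrt(phi^2 - ||y_i||^2)
Y_tilde = np.hstack([np.sqrt(phi**2 - norms**2)[:, None], Y])
# h(x): prepend a zero to the query
x_tilde = np.concatenate([[0.0], x])

mip = int(np.argmax(Y @ x))                                      # best inner product
nn = int(np.argmin(np.linalg.norm(Y_tilde - x_tilde, axis=1)))   # Euclidean NN
assert mip == nn
```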

Theorem 1 provides the main workhorse for our proposed approach (Section 4). In the remainder of this section, we present its properties as well as the related transformations. If it is known that the transformed Y~ = [y~_1, y~_2, ..., y~_n] lies in a manifold, as given above, we might expect to recover Y by reducing back with NN_{n,d} <=_{O(n)} MIP_{n,d-1}. However, in the general case the transformation is only possible by increasing the dimensionality by one again:

Theorem 2. NN_{n,d} <=_{O(n)} MIP_{n,d+1}

Proof: Preprocess the input with y~_i = g(y_i) = (||y_i||^2, y_i^T)^T. During query time, x~ = h(x) = (-1, 2x^T)^T. We have x~ . y~_i = -||y_i||^2 + 2 x . y_i. Finally,

    j = argmax_i x~ . y~_i = argmin_i ||x||^2 + ||y_i||^2 - 2 x . y_i = argmin_i ||x - y_i||^2.

MIP search can also be embedded in an MCS search by increasing the dimensionality by one:

Theorem 3. MIP_{n,d} <=_{O(n)} MCS_{n,d+1}

Proof: Preprocessing and query transformation are identical to Theorem 1: phi = max_i ||y_i|| and y~_i = g(y_i) = (sqrt(phi^2 - ||y_i||^2), y_i^T)^T; during query time, x~ = h(x) = (0, x^T)^T. Since ||y~_i|| = phi for every i,

    j = argmax_i x~ . y~_i / (||x~|| ||y~_i||) = argmax_i x . y_i / (||x~|| phi) = argmax_i x . y_i.

However, MCS is simply MIP searching over normalized vectors:

Theorem 4. MCS_{n,d} <=_{O(n)} MIP_{n,d}

Proof: Preprocess the input with y~_i = g(y_i) = y_i / ||y_i||; during query time, x~ = h(x) = x. Finally,

    j = argmax_i x~ . y~_i = argmax_i x . y_i / ||y_i||.

Our final result states that an NN search can be transformed to an MCS search by increasing the dimensionality by one:

Theorem 5. NN_{n,d} <=_{O(n)} MCS_{n,d+1}

Proof: The same reduction as in Theorem 1: phi = max_i ||y_i|| and y~_i = g(y_i) = (sqrt(phi^2 - ||y_i||^2), y_i^T)^T; during query time, x~ = h(x) = (0, x^T)^T. Thus, by Theorem 1,

    j = argmax_i x~ . y~_i / (||x~|| ||y~_i||) = argmax_i x . y_i = argmin_i ||x~ - y~_i||^2.

Next, we utilize Theorem 1 for speeding up retrieval of recommendations in Xbox and other MF-based recommender systems.

Algorithm 1 TransformAndIndex(Y, d')
input: item vectors Y, depth d' <= d + 1
output: tree t
  compute phi, mu, W
  S = empty set
  for i = 1 : n do
    y~_i = g(y_i); S = S union {y~_i}
  end for
  return t <- PCA-Tree(S, d')
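The reverse direction of Theorem 2 can be checked the same way. A minimal sketch (our own names), using the item transform (||y_i||^2, y_i^T)^T and the query transform (-1, 2x^T)^T from the proof:

```python
import numpy as np

# Sketch of the Theorem 2 reduction NN -> MIP in one extra dimension.
rng = np.random.default_rng(2)
n, d = 5000, 50
Y = rng.normal(size=(n, d))   # item vectors (rows)
x = rng.normal(size=d)        # query vector

# g(y_i): prepend the squared norm; h(x): prepend -1 and double the query
Y_tilde = np.hstack([(np.linalg.norm(Y, axis=1) ** 2)[:, None], Y])
x_tilde = np.concatenate([[-1.0], 2 * x])

nn = int(np.argmin(np.linalg.norm(Y - x, axis=1)))   # true nearest neighbor
mip = int(np.argmax(Y_tilde @ x_tilde))              # MIP in the lifted space
assert nn == mip
```

The key line matches the proof: Y_tilde @ x_tilde computes -||y_i||^2 + 2 x . y_i, whose argmax is the argmin of ||x - y_i||^2.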
4. AN OVERVIEW OF OUR APPROACH

Our solution is based on two components: a reduction to a Euclidean search problem, and a PCA-Tree to address it. The reduction is very similar to that defined in Theorem 1, but composed with an additional shift and rotation, so that the MIP search problem is reduced to NN search with all vectors aligned to their principal components.

4.1 Reduction

We begin by defining the first reduction function following Theorem 1. Let phi = max_i ||y_i||, and

    y'_i = g_1(y_i) = (sqrt(phi^2 - ||y_i||^2), y_i^T)^T
    x' = h_1(x) = (0, x^T)^T,    (2)

which, when applied to Y, gives elements y'_i in R^{d+1}. This reduces MIP to NN. As NN is invariant to shifts and rotations in the input space, we can compose the transformations with a PCA rotation and still keep an equivalent search problem. We mean-center and rotate the data: let mu = (1/n) sum_i y'_i be the mean after the first reduction, and M in R^{(d+1) x n} a matrix with mu replicated along its columns. The SVD of the centered data matrix is (Y' - M) = W Sigma U^T, where data items appear in the columns of Y'. The matrix W is a (d+1) by (d+1) matrix. Each of the columns of W = [w_1, ..., w_{d+1}] defines an orthogonal unit-length eigenvector, so that each w_j defines a hyperplane onto which each y'_i - mu is projected. W is a rotation matrix that aligns the vectors to their principal components.³ We define the centered rotation as our second transformation,

    y~_i = g_2(y'_i) = W^T (y'_i - mu)
    x~ = h_2(x') = W^T (x' - mu).    (3)

The composition

    g(y_i) = g_2(g_1(y_i)),  h(x) = h_2(h_1(x))    (4)

still defines a reduction from MIP to NN. Using y~_i = g(y_i) gives us a transformed set of input vectors Y~, over which a Euclidean search can be performed. Moreover, after this transformation, the points are rotated so that their components are in decreasing order of variance. Next, we index the transformed item vectors in Y~ using a PCA-Tree data structure. We summarize the above logic in Algorithm 1.

³ Notice that Sigma is not included, as the Euclidean metric is invariant under rotations of the space, but not shears.

Algorithm 2 PCA-Tree(S, delta)
input: item vector set S, depth delta
output: tree t
  if delta = 0 then
    return new leaf with S
  end if
  j = d' + 1 - delta  // principal component used at this depth
  m = median({y~_j for all y~ in S})
  S_le = {y~ in S where y~_j <= m}
  S_gt = {y~ in S where y~_j > m}
  t.leftChild = PCA-Tree(S_le, delta - 1)
  t.rightChild = PCA-Tree(S_gt, delta - 1)
  return t

4.2 Fast Retrieval with PCA-Trees

Building the PCA-Tree follows the KD-Tree construction algorithm on Y~. Since the axes are aligned with the d+1 principal components of Y, we can make use of a KD-Tree construction process to get a PCA-Tree data structure. The top d' <= d + 1 principal components are used, and each item vector is assigned to its representative leaf. Algorithm 2 defines this tree construction procedure. At retrieval time, the transformed user vector x~ = h(x) is used to traverse the tree to the appropriate leaf. The leaf contains the item vectors in the neighborhood of x~, hence vectors that are on the same side of all the splitting hyperplanes (the top principal components). The items in this leaf form an initial candidate set from which the top items or nearest neighbors are selected using a direct ranking by distance. The number of items in each leaf decays exponentially in the depth d' of the tree. By increasing the depth we are left with fewer candidates, hence trading better speedup values for lower accuracy. The process allows achieving different trade-offs between the quality of the recommendations and an allotted running time: with a larger d', a smaller proportion of candidates is examined, resulting in a larger speedup, but also a reduced accuracy. Our empirical analysis (Section 5) examines the trade-offs we can achieve using our PCA-Trees, and contrasts this with the trade-offs achievable using other methods.

4.2.1 Boosting Candidates with Hamming Distance Neighborhoods

While the initial candidate set includes many nearby items, it is possible that some of the optimal top K vectors are indexed in other leaves, most likely the adjacent leaves.
In our approach we propose boosting the candidate set with the item vectors in leaves that are on the wrong side of at most one of the median-shifted PCA hyperplanes compared to x~. These vectors are likely to have a small Euclidean distance from the user vector. Our PCA-Tree is a complete binary tree of height d', where each leaf corresponds to a binary vector of length d'. We supplement the initial candidate set from the leaf of the user vector with all the candidates of leaves at a Hamming distance of 1, and hence examine candidates from d' of the 2^d' leaves. In Section 5.1.1 we show that this approach is instrumental in achieving the best balance between speedup and accuracy.

5. EMPIRICAL ANALYSIS OF SPEEDUP-ACCURACY TRADEOFFS

We use two large-scale datasets to evaluate the speedup achieved by several methods:

1. Xbox Movies [12]: This is a Microsoft proprietary dataset consisting of 100 million binary {0, 1} ratings of more than 15K movies by 5.8 million users. We applied the method used in [12] to generate the vectors representing items and users.
2. Yahoo! Music [8]: This is a publicly available ratings dataset consisting of 252,800,275 ratings of 624,961 items by 1,000,990 users. The ratings are on a scale of 0-100. The user and item vectors were generated by the algorithm in [7].

From both datasets we created a set of item vectors and user vectors of dimensionality d = 50. The following evaluations are based on these vectors.

Measurements and Baselines: We quantify the improvement of an algorithm A over another (naive) algorithm A_0 by the speedup

    Speedup(A) = (time taken by algorithm A_0) / (time taken by algorithm A).    (5)

In all of our evaluations we measure the speedup with respect to the same algorithm: a naive search algorithm that iterates over all items to find the best recommendations for every user (i.e., computes the inner product between the user vector and each of the item vectors, keeping track of the item with the highest inner product found so far). Thus, denoting by T_naive the time taken by the naive algorithm, we have

    T_naive = Theta(#users * #items * d).
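Before turning to the results, the machinery of Sections 4.2 and 4.2.1 (median splits on the top principal components, plus Hamming-distance-1 neighborhood boosting) can be sketched as a toy flat implementation, with a bit-code lookup table standing in for the actual tree; all names and sizes are ours:

```python
import numpy as np

# Toy sketch: a depth-d' PCA-Tree over median splits is equivalent to a
# d'-bit code per item; boosting adds all leaves one bit-flip away.
rng = np.random.default_rng(4)
n, dim, depth = 4096, 50, 6
Y = rng.normal(size=(n, dim))          # assume already transformed and rotated

# Split on the first `depth` coordinates (the top principal components).
medians = np.median(Y[:, :depth], axis=0)
codes = (Y[:, :depth] > medians).astype(int)      # leaf code per item
leaf_of = {}
for i, c in enumerate(map(tuple, codes)):
    leaf_of.setdefault(c, []).append(i)

def retrieve(x, k=10):
    """Candidates from the query's leaf and all leaves at Hamming distance 1."""
    code = tuple((x[:depth] > medians).astype(int))
    cands = list(leaf_of.get(code, []))
    for j in range(depth):                         # flip one bit at a time
        flipped = code[:j] + (1 - code[j],) + code[j + 1:]
        cands.extend(leaf_of.get(flipped, []))
    cands = np.array(cands, dtype=int)
    # Direct ranking of the candidate set by Euclidean distance.
    order = np.argsort(np.linalg.norm(Y[cands] - x, axis=1))
    return cands[order[:k]]

top = retrieve(rng.normal(size=dim))
assert len(top) == 10
```

With depth 6 this examines 7 of the 64 leaves, roughly 11% of the items, rather than all of them; the real system tunes d' to trade candidates examined against accuracy.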
The state-of-the-art method for finding approximately optimal recommendations uses a combination of IP-Trees and user cones [13]. In the following evaluation we dub this method IP-Tree. The IP-Tree approach assumes all the user vectors (queries) are computed in advance and can be clustered into a structure of user cones. In many real-world systems like the Xbox recommender, the user vectors are computed or updated online, so this approach cannot be used. In contrast, our method does not require having all the user vectors in advance, and is thus applicable in these settings. The IP-Tree method relies on an adaptation of the branch-and-bound search in metric trees [17] to handle nearest neighbor search in inner-product spaces. However, the construction of the underlying metric-tree data structure, which is a space partitioning tree, is not adapted to inner-product spaces (it partitions vectors according to Euclidean proximity). By using the Euclidean transformation of Theorem 1, we can utilize the data structures and algorithms designed for Euclidean spaces in their original form, without adaptations that may curb their effectiveness. Next, we show that our approach achieves a superior computational speedup, despite having no access to any prior knowledge about the user vectors or their distribution.⁴

⁴ We focus on online processing time, i.e., the time to choose an item to recommend for a target user. We ignore the computation time required by offline preprocessing steps.

Theorem 1 allows using various approximate nearest-neighbor algorithms for Euclidean spaces, whose performance depends on the specific dataset used. We propose using PCA-Trees as explained in Section 4.2, and show that they have excellent performance for both the Xbox Movies and Yahoo! Music datasets, consisting of low-dimensionality dense vectors obtained by matrix factorization. A different and arguably more popular approach for finding approximate nearest neighbors in Euclidean spaces is Locality-Sensitive Hashing (LSH) [1]. In the evaluations below we also include a comparison against LSH. We emphasize that using both our PCA-Trees approach and LSH techniques is only enabled by our Euclidean transformation (Theorem 1). Our approximate retrieval algorithms introduce a trade-off between accuracy and speedup. We use two measures to quantify the quality of the top K recommendations. The first measure, Precision@K, denotes how similar the approximate recommendations are to the optimal top K recommendations (as retrieved by the naive approach):

    Precision@K = |L_rec intersect L_opt| / K,    (6)

where L_rec and L_opt are the lists of the top K approximate and the top K optimal recommendations respectively. Our evaluation metrics only consider the items at the top of the approximate and optimal lists.⁵ A high value of Precision implies that the approximate recommendations are very similar to the optimal recommendations. In many practical applications (especially for large item catalogs), it is possible to have low Precision rates but still recommend very relevant items (with a high inner product between the user and item vectors). This motivates our second measure, RMSE@K, which examines the preference for the approximate items compared to the optimal items:

    RMSE@K = sqrt( (1/K) sum_{k=1}^{K} (L_rec(k) - L_opt(k))^2 ),    (7)

where L_rec(k) and L_opt(k) are the scores (predicted ratings) of the k-th recommended item in the approximate list and the optimal list respectively. Namely, L_rec(k) and L_opt(k) are the values of the inner products between the user vector and the k-th recommended item vector and optimal item vector respectively.
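A minimal sketch of the two measures in Equations 6 and 7, with made-up lists purely for illustration:

```python
import numpy as np

def precision_at_k(l_rec, l_opt):
    """Equation 6: |L_rec intersect L_opt| / K for two top-K id lists."""
    return len(set(l_rec) & set(l_opt)) / len(l_opt)

def rmse_at_k(s_rec, s_opt):
    """Equation 7: root mean squared gap between the k-th approximate
    and k-th optimal scores (predicted ratings)."""
    s_rec, s_opt = np.asarray(s_rec), np.asarray(s_opt)
    return float(np.sqrt(np.mean((s_rec - s_opt) ** 2)))

# Two of the four approximate ids overlap with the optimal ids -> 0.5.
assert precision_at_k([1, 2, 3, 4], [2, 4, 5, 6]) == 0.5
# Identical score lists -> zero RMSE, even if the item ids differ.
assert rmse_at_k([3.0, 2.0], [3.0, 2.0]) == 0.0
```

The second assertion illustrates the point made above: RMSE@K can be near zero (the recommended items are nearly as good) even when Precision@K is low (the lists contain different items).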
Note that the quantity L_opt(k) - L_rec(k) is always non-negative, as the items in each list are ranked by their scores.

5.1 Results

Our initial evaluation considers three approximation algorithms: IP-Tree, LSH, and our approach (Section 4.2). Figure 1(a) depicts Precision@10 for the Xbox Movies dataset (higher values indicate better performance). The Precision values are plotted against the average speedup values they enable. At very low speedup values the LSH algorithm shows the best trade-off between precision and speedup, but when higher speedup values are considered, the LSH performance drops significantly and becomes the worst. One possible reason for this is that our Euclidean transformation results in transformed vectors with one dimension being very large compared with the other dimensions, which is a difficult input distribution for LSH approaches.⁶ In contrast, the tree-based approaches (IP-Tree and our approach) show a similar behavior of a slow and steady decrease in Precision values as the speedup increases. Our approach offers a better precision-vs-speedup trade-off than the IP-Tree approach, though their precision is almost the same for high speedup values. Figure 1(b) depicts RMSE@10 (lower values indicate better performance) vs. speedup for the three approaches. The trend shows significantly superior results for our PCA-Tree approach, for all speedup values.

Table 1: A summary of the different tree approaches. IP-Tree is the baseline from [13], which requires prior knowledge of the user vectors. All other approaches (as well as LSH) were not feasible before Theorem 1 was introduced in this paper.

Method    | Enabled by Theorem 1 | Prior knowledge | Neighborhood boosting
IP-Tree   | no                   | user vectors    | not allowed
KD-Tree   | yes                  | none            | allowed
PCA-Tree  | yes                  | none            | allowed
PAC-Tree  | yes                  | none            | not allowed

⁵ Note that for this evaluation the recall is completely determined by the precision.
Similarly to Figure 1(a), we see a sharp degradation of the LSH approach as the speedup increases, while the tree-based approaches show a trend of a slow increase in RMSE values as the speedup increases. We note that even for high speedup values, which yield low precision rates in Figure 1(a), the RMSE values remain very low, indicating that very high quality recommendations can be achieved at a fraction of the computational cost of the naive algorithm. In other words, the recommended items are still very relevant to the user, although the list of recommended items is quite different from the optimal list of items. Figure 2 depicts Precision@10 and RMSE@10 for the Yahoo! Music dataset. The general trends of all three algorithms seem to agree with those of Figure 1: LSH starts better but deteriorates quickly, and the tree-based approaches have similar trends. The scale of the RMSE errors in Figure 2(b) is different (larger) because the predicted scores are in the range 0-100, whereas in the Xbox Movies dataset the predictions are binary. The empirical analysis on both the Xbox and Yahoo! datasets shows that it is possible to achieve excellent recommendations at very low computational cost by employing our Euclidean transformation and using an approximate Euclidean nearest neighbor method. The results indicate that tree-based approaches are superior to an LSH-based approach (except when the required speedup is very small). Further, the results indicate that our method yields higher quality recommendations than the IP-Tree approach [13]. Note that we also compared Precision@K and RMSE@K for other K values. While the figures are not included in this paper, the trends are all similar to those presented above.

Figure 1: Performance against speedup values for the Xbox Movies dataset, top 10 recommendations. (a) Precision@10 vs. speedup; (b) RMSE@10 vs. speedup.

Figure 2: Performance against speedup values for the Yahoo! Music dataset, top 10 recommendations. (a) Precision@10 vs. speedup; (b) RMSE@10 vs. speedup.

5.1.1 Comparing Different Tree Approaches

A key building block in our approach is aligning the item vectors with their principal components (Equation 3) and using PCA-Trees rather than KD-Trees. Another essential ingredient in our approach is the neighborhood boosting of Section 4.2.1. One may question the vitality of PCA-Trees or the neighborhood boosting to our overall solution. We therefore present a detailed comparison of the different tree-based approaches. For the sake of completeness, we also include a comparison to PAC-Trees [15]. Table 1 summarizes the different data structures. Except for the IP-Tree approach, all of these approaches were not feasible before Theorem 1 was introduced in this paper. Note that neighborhood boosting is possible only when the tree splits are all based on a single consistent axis system. It is therefore prohibited in IP-Trees and PAC-Trees, where the splitting hyperplanes are ad-hoc at every node. We compare the approach proposed in this paper with simple KD-Trees, PAC-Trees, and with PCA-Trees without neighborhood boosting (our approach without neighborhood boosting). Figure 3 depicts Precision@10 and RMSE@10 on the Yahoo! Music dataset. As the speedup levels increase, we notice an evident advantage in favor of PCA-aligned trees over KD-Trees. When comparing PCA-Trees without neighborhood boosting to PAC-Trees we see a mixed picture: for low speedup values PCA-Trees perform better, but for higher speedup values we notice an eminent advantage in favor of PAC-Trees. To conclude, we note the overall advantage of the method proposed in this paper over any of the other tree-based alternatives, both in terms of Precision and RMSE.

⁶ The large dimension is the auxiliary dimension sqrt(phi^2 - ||y_i||^2) in Equation 2.
6. CONCLUSIONS

We presented a novel transformation mapping a maximal inner-product search to a Euclidean nearest-neighbor search, and showed how it can be used to speed up the recommendation process in matrix factorization based recommenders such as the Xbox recommender system. We proposed a method for approximately solving the Euclidean nearest-neighbor problem using PCA-Trees, and empirically evaluated it on the Xbox Movies and Yahoo! Music datasets. Our analysis shows that our approach achieves excellent quality recommendations at a fraction of the computational cost of a naive approach, and that it achieves superior quality-speedup tradeoffs compared with state-of-the-art methods.
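The order-preserving transformation summarized above can be illustrated with a short sketch. We assume the standard form of the lift (an auxiliary coordinate sqrt(φ² − ‖y‖²) prepended to each item vector y, with φ the largest item norm, and a 0 prepended to the user vector); the function names and data are illustrative, not the production code.

```python
import numpy as np

def lift_items(Y):
    """Lift item vectors into Euclidean space with one auxiliary dimension,
    so that every lifted item has the same norm phi."""
    norms = np.linalg.norm(Y, axis=1)
    phi = norms.max()
    aux = np.sqrt(phi**2 - norms**2)       # auxiliary first coordinate
    return np.column_stack([aux, Y])

def lift_query(x):
    """Prepend a zero so the auxiliary coordinate does not affect x.y."""
    return np.concatenate([[0.0], x])

# With this lift, ||x* - y*||^2 = ||x||^2 + phi^2 - 2 x.y. For a fixed
# query, ||x||^2 and phi^2 are constants, so the Euclidean nearest lifted
# item is exactly the item maximizing the inner product with x.
rng = np.random.default_rng(0)
Y = rng.normal(size=(1000, 8))             # stand-in item vectors
x = rng.normal(size=8)                     # stand-in user vector

Ystar, xstar = lift_items(Y), lift_query(x)
nn = np.argmin(((Ystar - xstar) ** 2).sum(axis=1))
best_ip = np.argmax(Y @ x)
assert nn == best_ip                       # the lift preserves the argmax
```

Once the items are lifted, any off-the-shelf (approximate) Euclidean nearest-neighbor structure can serve retrieval, which is what makes PCA-Tree search applicable to the inner-product setting.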

[Figure 3: Comparing tree-based methods for the Yahoo! Music dataset, top 10 recommendations. (a) Precision@10 vs. speedup; (b) RMSE@10 vs. speedup. Curves compare KD-Tree, PAC-Tree, No neighborhood boosting, and This Paper.]

7. REFERENCES

[1] Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, 2006.
[2] Robert M. Bell and Yehuda Koren. Lessons from the Netflix prize challenge. SIGKDD Explor. Newsl., 2007.
[3] Jon Louis Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509-517, September 1975.
[4] Andrei Broder. On the resemblance and containment of documents. In Proceedings of the Compression and Complexity of Sequences 1997, pages 21-29, 1997.
[5] Moses S. Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing, 2002.
[6] Sanjoy Dasgupta and Yoav Freund. Random projection trees and low dimensional manifolds. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, 2008.
[7] Gideon Dror, Noam Koenigstein, and Yehuda Koren. Yahoo! music recommendations: Modeling music ratings with temporal dynamics and item taxonomy. In Proc. 5th ACM Conference on Recommender Systems, 2011.
[8] Gideon Dror, Noam Koenigstein, Yehuda Koren, and Markus Weimer. The Yahoo! music dataset and KDD-Cup'11. Journal of Machine Learning Research, 17:1-12, 2011.
[9] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, 1998.
[10] Mohammad Khoshneshin and W. Nick Street. Collaborative filtering via Euclidean embedding. In Proceedings of the Fourth ACM Conference on Recommender Systems, 2010.
[11] Noam Koenigstein and Yehuda Koren. Towards scalable and accurate item-oriented recommendations. In Proc. 7th ACM Conference on Recommender Systems, 2013.
[12] Noam Koenigstein and Ulrich Paquet.
Xbox movies recommendations: Variational Bayes matrix factorization with embedded feature selection. In Proc. 7th ACM Conference on Recommender Systems, 2013.
[13] Noam Koenigstein, Parikshit Ram, and Yuval Shavitt. Efficient retrieval of recommendations in a matrix factorization framework. In CIKM, 2012.
[14] Yehuda Koren, Robert M. Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 2009.
[15] James McNames. A fast nearest-neighbor algorithm based on a principal axis search tree. IEEE Trans. Pattern Anal. Mach. Intell., 23(9), September 2001.
[16] Ulrich Paquet and Noam Koenigstein. One-class collaborative filtering with random graphs. In Proceedings of the 22nd International Conference on World Wide Web, WWW '13, 2013.
[17] Franco P. Preparata and Michael I. Shamos. Computational Geometry: An Introduction. Springer, 1985.
[18] Parikshit Ram and Alexander Gray. Maximum inner-product search using cone trees. In SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2012.
[19] Parikshit Ram and Alexander G. Gray. Which space partitioning tree to use for search? In Christopher J. C. Burges, Léon Bottou, Zoubin Ghahramani, and Kilian Q. Weinberger, editors, NIPS, 2013.
[20] Parikshit Ram, Dongryeol Lee, and Alexander G. Gray. Nearest-neighbor search on a time budget via max-margin trees. In SDM. SIAM / Omnipress, 2012.
[21] Robert F. Sproull. Refinements to nearest-neighbor searching in k-dimensional trees. Algorithmica, 6(4), 1991.


More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Supervsed vs. Unsupervsed Learnng Up to now we consdered supervsed learnng scenaro, where we are gven 1. samples 1,, n 2. class labels for all samples 1,, n Ths s also

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Abstract. 1 Introduction

Abstract. 1 Introduction Challenges and Solutons for Synthess of Knowledge Regardng Collaboratve Flterng Algorthms Danel Lowd, Olver Godde, Matthew McLaughln, Shuzhen Nong, Yun Wang, and Jonathan L. Herlocker. School of Electrcal

More information

On the Efficiency of Swap-Based Clustering

On the Efficiency of Swap-Based Clustering On the Effcency of Swap-Based Clusterng Pas Fränt and Oll Vrmaok Department of Computer Scence, Unversty of Joensuu, Fnland {frant, ovrma}@cs.oensuu.f Abstract. Random swap-based clusterng s very smple

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University CAN COMPUTERS LEARN FASTER? Seyda Ertekn Computer Scence & Engneerng The Pennsylvana State Unversty sertekn@cse.psu.edu ABSTRACT Ever snce computers were nvented, manknd wondered whether they mght be made

More information