Semi-Supervised Kernel Mean Shift Clustering


Saket Anand, Student Member, IEEE, Sushil Mittal, Member, IEEE, Oncel Tuzel, Member, IEEE, and Peter Meer, Fellow, IEEE

S. Anand and P. Meer are with the Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ. E-mail: {anands@eden, meer@cronos}.rutgers.edu. S. Mittal is with Scibler Technologies, Bangalore, India. E-mail: mittal@scibler.com. O. Tuzel is with Mitsubishi Electric Research Laboratories, Cambridge, MA. E-mail: oncel@merl.com.

Abstract — Mean shift clustering is a powerful nonparametric technique that does not require prior knowledge of the number of clusters and does not constrain the shape of the clusters. However, being completely unsupervised, its performance suffers when the original distance metric fails to capture the underlying cluster structure. Despite recent advances in semi-supervised clustering methods, there has been little effort towards incorporating supervision into mean shift. We propose a semi-supervised framework for kernel mean shift clustering (SKMS) that uses only pairwise constraints to guide the clustering procedure. The points are first mapped to a high-dimensional kernel space where the constraints are imposed by a linear transformation of the mapped points. This is achieved by modifying the initial kernel matrix by minimizing a log det divergence-based objective function. We show the advantages of SKMS by evaluating its performance on various synthetic and real datasets while comparing with state-of-the-art semi-supervised clustering algorithms.

Index Terms — semi-supervised kernel clustering; log det Bregman divergence; mean shift clustering

1 INTRODUCTION

Mean shift is a popular mode seeking algorithm, which iteratively locates the modes in the data by maximizing the kernel density estimate (KDE). Although the procedure was initially described decades ago [17], it was not popular in the vision community until its potential uses for feature space analysis and optimization were understood [8], [12]. The nonparametric nature of mean shift makes it a powerful tool to discover arbitrarily shaped clusters present in the data. Additionally, the number of clusters is automatically determined by the number of discovered modes. Due to these properties, mean shift has been applied to solve several computer vision problems, e.g., image smoothing and segmentation [11], [41], visual tracking [9], [19] and information fusion [7], [10]. Mean shift clustering was also extended to nonlinear spaces, for example, to perform clustering on analytic manifolds [36], [38] and kernel induced feature spaces [34], [39] under an unsupervised setting.

In many clustering problems, in addition to the unlabeled data, some additional information is often easily available. Depending on the particular problem, this additional information could be available in different forms. For example, the number of clusters or a few must-link (similarity) and cannot-link (dissimilarity) pairs could be known a priori. In the last decade, semi-supervised clustering methods that aim to incorporate prior information into the clustering algorithm as a guide have received considerable attention in machine learning and computer vision [2], [6], [18]. Increasing interest in this area has triggered the adaptation of several popular unsupervised clustering algorithms into a semi-supervised framework, e.g., background constrained k-means [40], constrained spectral clustering [23], [28] and kernel graph clustering [24]. It has been shown that unlabeled data, when used in conjunction with a small amount of labeled data, can produce significant improvement in clustering accuracy.
However, despite these advances, mean shift clustering has largely been ignored under the semi-supervised learning framework. To leverage all the useful properties of mean shift, we adapt the original unsupervised algorithm into a semi-supervised clustering technique. The work in [37] introduced weak supervision into the kernel mean shift algorithm where the additional information was provided through a few pairwise must-link constraints. In that framework, each pair of must-link points was collapsed to a single point through a linear projection operation, guaranteeing their association with the same cluster. In this paper, we extend that work by generalizing the linear projection operation to a linear transformation of the kernel space that permits us to scale the distance between the constraint points. Using this transformation, the must-link points are moved closer to each other, while the cannot-link points are moved farther apart. We show that this transformation can be achieved implicitly by modifying the kernel matrix. We also show that, given one constraint pair, the corresponding kernel update is equivalent to minimizing the log det divergence between the updated and the initial kernel matrix. For multiple constraints, this result proves to be very useful since we can learn the kernel matrix by solving a constrained log det minimization problem [22]. Fig. 1 illustrates the basic approach.

Fig. 1: Olympic Circles. (a) Original data in the input space with five different clusters. (b) Labeled points used to generate pairwise constraints. (c) Pairwise distance matrix (PDM) using the modes discovered by unsupervised mean shift clustering performed in the kernel space. The points are ordered according to class. (d) PDM using the modes found after supervision was added. Sample clustering results using (e) unsupervised and (f) semi-supervised kernel mean shift clustering.

The original data, with five circles each containing 300 points, is shown in Fig. 1a. We assume that 20 labeled points from each cluster are selected at random (only 6.7% of the data) to generate pairwise must-link and cannot-link constraints (Fig. 1b). We use all the $5\binom{20}{2}$ unique intra-class pairs as must-link constraints and an equal number of cannot-link constraints. Note that, although we generate pairwise constraints using a small fraction of labeled points, the algorithm does not require explicit knowledge of the class labels. The data is first mapped to a kernel space using a Gaussian kernel (σ = 0.5). In the first case, kernel mean shift is directly applied to the data points in the absence of any supervision. Fig. 1c shows the corresponding pairwise distance matrix (PDM) obtained using the modes recovered by mean shift. The lack of a block diagonal structure implies the absence of meaningful clusters discovered in the data. In the second case, the constraint pairs are used to transform the initial kernel space by learning an appropriate kernel matrix via log det divergence minimization. Fig. 1d shows the PDM between the modes obtained when kernel mean shift is applied to the transformed feature points. In this case, the five-cluster structure is clearly visible in the PDM. Finally, Fig. 1e and 1f show the corresponding clustering labels mapped back to the input space. Through this example, it becomes evident that a small amount of supervision can improve the clustering performance significantly.

In the following, we list our key contributions and present the overall organization of the paper. In Section 2, we present an overview of past work in semi-supervised clustering. In Section 3.1, we briefly review the Euclidean mean shift algorithm and in Section 3.2 we discuss its extension to high-dimensional kernel spaces. By employing the kernel trick, the explicit representation of the kernel mean shift algorithm in terms of the mapping function is transformed into an implicit representation described in terms of the kernel function inducing that mapping. In Section 4.1, we derive the expression for performing kernel updates using orthogonal projection of the must-link constraint points. With little manipulation of the distances between the constraint points, in Section 4.2, we extend this projection operation to the more general linear transformation, which can also utilize cannot-link constraints. By allowing relaxation of the constraints, we motivate the formulation of learning the kernel matrix as a Bregman divergence minimization problem. In Section 5, we review the optimization algorithm of [22], [25] for learning the kernel matrix using log det Bregman divergences. More details about this algorithm are furnished in the supplementary material. In Section 6, we describe the complete semi-supervised kernel mean shift (SKMS) algorithm. In addition, several practical enhancements, like automatically estimating various algorithmic parameters and low-rank kernel learning for significant computational gains, are also discussed in detail.
In Section 7, we show experimental results on several synthetic and real data sets that encompass a wide spectrum of challenges encountered in practical machine learning problems. We also present a detailed comparison with state-of-the-art semi-supervised clustering techniques. Finally, in Section 8, we conclude with a discussion of the key strengths and limitations of our work.

2 RELATED WORK

Semi-supervised clustering has received a lot of attention in the past few years due to its highly improved performance over traditional unsupervised clustering methods [3], [6]. As compared to fully-supervised learning algorithms, these methods require a weaker form of supervision in terms of both the amount and the form of labeled data used. In clustering, this is usually achieved by using only a few pairwise must-link and cannot-link constraints. Since generating pairwise constraints does not require knowledge of the class labels or the number of clusters, they can easily be generated using additional information.

For example, while clustering images of objects, two images with similar text annotations could be used as a must-link pair. We discuss some of the traditional unsupervised clustering methods and their semi-supervised variants below.

Partitioning based clustering: k-means is one of the oldest and most popular clustering algorithms. See [21] for an extensive review of the variants of the k-means algorithm. One of its popular semi-supervised extensions using pairwise constraints was proposed in [40]. The method partitions the data into k non-overlapping regions such that must-link pairs are enforced to lie in the same cluster while cannot-link pairs are enforced to lie in different clusters. However, since the method enforces all constraints rigidly, it is sensitive to labeling errors in the pairwise constraints. Basu et al. [2] proposed a more robust, probabilistic model by explicitly allowing relaxation of a few constraints in k-means clustering. Similarly, Bilenko et al. [4] proposed a metric learning based approach to incorporate constraints into the k-means clustering framework. Since the clustering in both these methods is performed in the input space, they fail to handle nonlinear cluster boundaries. More recently, Kulis et al. proposed the semi-supervised kernel k-means (SSKK) algorithm [24], which constructs a kernel matrix by adding the input similarity matrix to the constraint penalty matrix. This kernel matrix is then used to perform k-means clustering.

Graph based clustering: Spectral clustering [31] is another very popular technique that can also cluster data points with nonlinear cluster boundaries. In [23], this method was extended to incorporate weak supervision by updating the computed affinity matrix for the specified constraint points. Later, Lu et al. [28] further modified the algorithm by propagating the specified constraints to the remaining points using a Gaussian process. More recently, Lu and Ip [29] showed improved clustering performance by applying exhaustive constraint propagation and handling soft constraints (E2CP). One of the fundamental problems with this method is that it can be sensitive to labeling noise, since the effect of a mislabeled data point pair could easily get propagated throughout the affinity matrix. Moreover, in general, spectral clustering methods suffer when the clusters have very different scales and densities [30].

Density based clustering: Clustering methods in this category make use of the estimated local density of the data to discover clusters. Gaussian Mixture Models (GMM) are often used for soft clustering where each mixture represents a cluster distribution [27]. Mean shift [11], [12], [37] was employed in computer vision for clustering data in the feature space by locating the modes of the nonparametric estimate of the underlying density. There exist other density based clustering methods that are less known in the vision community, but have been applied to data mining applications. For example, DBSCAN [15] groups together points that are connected through a chain of high-density neighbors determined by two parameters: the neighborhood size and the minimum allowable density. SSDBSCAN [26] is a semi-supervised variant of DBSCAN that explicitly uses the labels of a few points to determine the neighborhood parameters. C-DBSCAN [33] performs a hierarchical density based clustering while enforcing the must-link and cannot-link constraints.
3 KERNEL MEAN SHIFT CLUSTERING

First, we briefly describe the Euclidean mean shift clustering algorithm in Section 3.1 and then derive the kernel mean shift algorithm in Section 3.2.

3.1 Euclidean Mean Shift Clustering

Given n data points $\mathbf{x}_i$ in a d-dimensional space $\mathbb{R}^d$ and the associated diagonal bandwidth matrices $h_i \mathbf{I}_{d \times d}$, $i = 1, \ldots, n$, the sample point density estimator obtained with the kernel profile $k(x)$ is given by

$f(\mathbf{x}) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{h_i^d}\, k\!\left(\left\|\frac{\mathbf{x}-\mathbf{x}_i}{h_i}\right\|^2\right).$  (1)

We utilize the multivariate normal profile

$k(x) = e^{-\frac{1}{2}x}, \qquad x \ge 0.$  (2)

Taking the gradient of (1), we observe that the stationary points of the density function satisfy

$\frac{2}{n}\sum_{i=1}^{n}\frac{\mathbf{x}_i-\mathbf{x}}{h_i^{d+2}}\, g\!\left(\left\|\frac{\mathbf{x}-\mathbf{x}_i}{h_i}\right\|^2\right) = \mathbf{0}$  (3)

where $g(x) = -k'(x)$. The solution is a local maximum of the density function which can be iteratively reached using the mean shift procedure

$\delta\mathbf{x} = \frac{\sum_{i=1}^{n}\frac{\mathbf{x}_i}{h_i^{d+2}}\, g\!\left(\left\|\frac{\mathbf{x}-\mathbf{x}_i}{h_i}\right\|^2\right)}{\sum_{i=1}^{n}\frac{1}{h_i^{d+2}}\, g\!\left(\left\|\frac{\mathbf{x}-\mathbf{x}_i}{h_i}\right\|^2\right)} - \mathbf{x}$  (4)

where $\mathbf{x}$ is the current mean and $\delta\mathbf{x}$ is the mean shift vector. To recover from saddle points, adding a small perturbation to the current mean vector is usually sufficient. Comaniciu and Meer [11] showed that convergence to a local mode of the distribution is guaranteed.

3.2 Mean Shift Clustering in Kernel Spaces

We extend the original mean shift algorithm from the Euclidean space to a general inner product space. This makes it possible to apply the algorithm to a larger class of nonlinear problems, such as clustering on manifolds [38]. We also note that a similar derivation for the fixed bandwidth mean shift algorithm was given in [39]. Let $\mathcal{X}$ be the input space such that the data points $\mathbf{x}_i \in \mathcal{X}$, $i = 1, \ldots, n$. Although, in general, $\mathcal{X}$ may not necessarily be a Euclidean space, for the sake of simplicity, we assume $\mathcal{X}$ corresponds to $\mathbb{R}^d$. Every point $\mathbf{x}$ is then mapped to a $d_\phi$-dimensional feature space $\mathcal{H}$ by applying the mapping functions $\phi_l$, $l = 1, \ldots, d_\phi$, where

$\phi(\mathbf{x}) = [\phi_1(\mathbf{x})\;\; \phi_2(\mathbf{x})\;\; \ldots\;\; \phi_{d_\phi}(\mathbf{x})]^\top.$  (5)
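To make the iteration concrete before continuing with the kernel formulation, the following is a minimal numerical sketch of the Euclidean procedure (1)-(4) with the normal profile (2). The toy data, bandwidths, tolerance and iteration cap are illustrative assumptions, not values used in the paper.

```python
import numpy as np

def mean_shift_mode(x, X, h, tol=1e-6, max_iter=500):
    """Iterate the mean shift update (4) from a starting point x.

    X : (n, d) data matrix, h : (n,) per-point bandwidths.
    For the normal profile (2), g(u) is proportional to exp(-u/2);
    the constant factor cancels between numerator and denominator."""
    for _ in range(max_iter):
        d2 = np.sum((x - X) ** 2, axis=1) / h ** 2       # ||(x - x_i)/h_i||^2
        w = np.exp(-0.5 * d2) / h ** (X.shape[1] + 2)    # g(.) / h_i^(d+2)
        x_new = (w[:, None] * X).sum(axis=0) / w.sum()   # x + delta x
        if np.linalg.norm(x_new - x) < tol:              # mean shift vector ~ 0
            return x_new
        x = x_new
    return x

# Toy usage: two well-separated blobs, a fixed bandwidth for every point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
h = np.full(len(X), 0.5)
modes = np.array([mean_shift_mode(x0.copy(), X, h) for x0 in X])
```

Points whose iterations converge to (numerically) the same mode are then grouped into one cluster.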

Fig. 2: Clustering with must-link constraints. (a) Input space. Red markers are the constraint pair $(\mathbf{x}_1, \mathbf{x}_2)$. (b) The input space is mapped to the feature space via the quadratic mapping $\phi(x) = [x\;\; x^2]^\top$. The black arrow is the constraint vector $\phi(\mathbf{x}_1) - \phi(\mathbf{x}_2)$, and the red dashed line is its null space. (c) The feature space is projected to the null space of the constraint vector. Constraint points collapse to a single point and are guaranteed to be clustered together. Two clusters can be easily identified.

Note that in many problems, this mapping is sufficient to achieve the desired separability between different clusters. We first derive the mean shift procedure on the feature space $\mathcal{H}$ in terms of the explicit representation of the mapping $\phi$. The point sample density estimator at $\mathbf{y} \in \mathcal{H}$, with the diagonal bandwidth matrices $h_i \mathbf{I}_{d_\phi \times d_\phi}$, is

$f_{\mathcal{H}}(\mathbf{y}) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{h_i^{d_\phi}}\, k\!\left(\left\|\frac{\mathbf{y}-\phi(\mathbf{x}_i)}{h_i}\right\|^2\right).$  (6)

Taking the gradient of (6) w.r.t. $\phi$, like (4), the solution can be found iteratively using the mean shift procedure

$\delta\mathbf{y} = \frac{\sum_{i=1}^{n}\frac{\phi(\mathbf{x}_i)}{h_i^{d_\phi+2}}\, g\!\left(\left\|\frac{\mathbf{y}-\phi(\mathbf{x}_i)}{h_i}\right\|^2\right)}{\sum_{i=1}^{n}\frac{1}{h_i^{d_\phi+2}}\, g\!\left(\left\|\frac{\mathbf{y}-\phi(\mathbf{x}_i)}{h_i}\right\|^2\right)} - \mathbf{y}.$  (7)

By employing the kernel trick, we now derive the implicit formulation of the kernel mean shift algorithm. We define $K : \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}$, a positive semidefinite, scalar kernel function satisfying, for all $\mathbf{x}, \mathbf{x}' \in \mathcal{X}$,

$K(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^\top \phi(\mathbf{x}').$  (8)

$K(\cdot)$ defines an inner product which makes it possible to map the data implicitly to a high-dimensional kernel space. Let

$\Phi = [\phi(\mathbf{x}_1)\;\; \phi(\mathbf{x}_2)\;\; \ldots\;\; \phi(\mathbf{x}_n)]$  (9)

be the $d_\phi \times n$ matrix of the mapped points and $\mathbf{K} = \Phi^\top\Phi$ be the $n \times n$ kernel (Gram) matrix. We observe that at each iteration of the mean shift procedure (7), the estimate $\bar{\mathbf{y}} = \mathbf{y} + \delta\mathbf{y}$ always lies in the column space of $\Phi$. Any such point $\bar{\mathbf{y}}$ can be written as

$\bar{\mathbf{y}} = \Phi\,\boldsymbol{\alpha}_{\bar{\mathbf{y}}}$  (10)

where $\boldsymbol{\alpha}_{\bar{\mathbf{y}}}$ is an n-dimensional weighting vector. The distance between two points $\mathbf{y}$ and $\mathbf{y}'$ in this space can be expressed in terms of their inner product and their respective weighting vectors

$\|\mathbf{y}-\mathbf{y}'\|^2 = \|\Phi\boldsymbol{\alpha}_{\mathbf{y}} - \Phi\boldsymbol{\alpha}_{\mathbf{y}'}\|^2 = \boldsymbol{\alpha}_{\mathbf{y}}^\top \mathbf{K}\boldsymbol{\alpha}_{\mathbf{y}} + \boldsymbol{\alpha}_{\mathbf{y}'}^\top \mathbf{K}\boldsymbol{\alpha}_{\mathbf{y}'} - 2\boldsymbol{\alpha}_{\mathbf{y}}^\top \mathbf{K}\boldsymbol{\alpha}_{\mathbf{y}'}.$  (11)

Let $\mathbf{e}_i$ denote the i-th canonical basis vector for $\mathbb{R}^n$. Applying (11) to compute distances in (7) by using the equivalence $\phi(\mathbf{x}_i) = \Phi\mathbf{e}_i$, the mean shift algorithm iteratively updates the weighting vector $\boldsymbol{\alpha}_{\mathbf{y}}$

$\boldsymbol{\alpha}_{\bar{\mathbf{y}}} = \frac{\sum_{i=1}^{n}\frac{\mathbf{e}_i}{h_i^{d_\phi+2}}\, g\!\left(\frac{\boldsymbol{\alpha}_{\mathbf{y}}^\top\mathbf{K}\boldsymbol{\alpha}_{\mathbf{y}} + \mathbf{e}_i^\top\mathbf{K}\mathbf{e}_i - 2\boldsymbol{\alpha}_{\mathbf{y}}^\top\mathbf{K}\mathbf{e}_i}{h_i^2}\right)}{\sum_{i=1}^{n}\frac{1}{h_i^{d_\phi+2}}\, g\!\left(\frac{\boldsymbol{\alpha}_{\mathbf{y}}^\top\mathbf{K}\boldsymbol{\alpha}_{\mathbf{y}} + \mathbf{e}_i^\top\mathbf{K}\mathbf{e}_i - 2\boldsymbol{\alpha}_{\mathbf{y}}^\top\mathbf{K}\mathbf{e}_i}{h_i^2}\right)}.$  (12)

The clustering starts on the data points on $\mathcal{H}$, therefore the initial weighting vectors are given by $\boldsymbol{\alpha}_{\mathbf{y}_i} = \mathbf{e}_i$, such that $\mathbf{y}_i = \Phi\boldsymbol{\alpha}_{\mathbf{y}_i} = \phi(\mathbf{x}_i)$, $i = 1, \ldots, n$. At convergence, the mode $\bar{\mathbf{y}}$ can be recovered using (10) as $\Phi\bar{\boldsymbol{\alpha}}_{\mathbf{y}}$. The points converging close to the same mode are clustered together, following the original proof [11]. Since any positive semidefinite matrix $\mathbf{K}$ is a kernel for some feature space [13], the derived method implicitly applies mean shift on the feature space induced by $\mathbf{K}$. Note that under this formulation, mean shift in the input space can be implemented by simply choosing the mapping function $\phi(\cdot)$ to be the identity, i.e., $\phi(\mathbf{x}) = \mathbf{x}$.

An important point is that the dimensionality of the feature space $d_\phi$ can be very large, for example it is infinite in the case of the Gaussian kernel function. In such cases it may not be possible to explicitly compute the point sample density estimator (6) and consequently the mean shift vectors (7). Since the dimensionality of the subspace spanned by the feature points is equal to rank $\mathbf{K} \le n$, it is sufficient to use the implicit form of the mean shift procedure using (12).

4 KERNEL LEARNING USING LINEAR TRANSFORMATIONS

A nonlinear mapping of the input data to a higher-dimensional kernel space often improves cluster separability.
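As an aside on the implicit procedure derived above, here is a minimal sketch of the update (12). For simplicity it assumes a single global bandwidth h, so that the $h_i^{d_\phi+2}$ factor cancels, and takes a precomputed kernel matrix; all names and the convergence tolerance are illustrative.

```python
import numpy as np

def kernel_mean_shift(K, h, tol=1e-8, max_iter=300):
    """Run the implicit mean shift update (12) for every point.

    K : (n, n) kernel (Gram) matrix, h : scalar bandwidth.
    Each point is represented by its weighting vector alpha (10)."""
    n = K.shape[0]
    diagK = np.diag(K)
    alphas = np.eye(n)                                 # alpha_i = e_i initially
    for i in range(n):
        a = alphas[i].copy()
        for _ in range(max_iter):
            Ka = K @ a
            d2 = (a @ Ka + diagK - 2.0 * Ka) / h ** 2  # squared distances via (11)
            w = np.exp(-0.5 * d2)                      # profile weights g(.)
            a_new = w / w.sum()                        # convex combination of the e_i
            if (a_new - a) @ K @ (a_new - a) < tol:    # ||delta y||^2 in feature space
                break
            a = a_new
        alphas[i] = a
    return alphas                                      # row i encodes the mode of point i
```

Points are subsequently grouped together when the distance (11) between their converged modes is (numerically) zero.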
By effectively enforcing the available constraints, it is possible to transform the entire space and guide the clustering to discover the desired structure in the data. To illustrate this intuition, we present a simple two class example from [37] in Fig. 2. The original data lies along the x-axis, with the blue points associated with one class and the black points with the second class (Fig. 2a).

This data appears to have originated from three clusters. Let $(\mathbf{x}_1, \mathbf{x}_2)$ be the pair of points marked in red which are constrained to be clustered together. We map the data explicitly to a two-dimensional feature space via a simple quadratic mapping $\phi(x) = [x\;\; x^2]^\top$ (Fig. 2b). Although the data is linearly separable, it still appears to form three clusters. Using the must-link constraint pair, we enforce the two red points to be clustered together. The black arrow denotes the constraint vector $\phi(\mathbf{x}_1) - \phi(\mathbf{x}_2)$ and the dashed red line is its null space. By projecting the feature points to the null space of the constraint vector, the points $\phi(\mathbf{x}_1)$ and $\phi(\mathbf{x}_2)$ are collapsed to the same point, guaranteeing their association with the same mode. From Fig. 2c, we can see that in the projected space, the data has the desired cluster structure which is consistent with its class association.

This approach, although simple, does not scale well with an increasing number of constraints for a simple nonlinear mapping like the one above. Given m linearly independent constraint vectors in a $d_\phi$-dimensional space, the dimensionality of the null space of the constraint matrix is $d_\phi - m$. This implies that if $d_\phi$ or more constraints are specified, all the points collapse to a single point and are therefore grouped together in one cluster. This problem can be alleviated if a mapping function $\phi(\cdot)$ is chosen such that $d_\phi$ is very large. Since explicitly designing the mapping function $\phi(\mathbf{x})$ is not always practical, we use a kernel function $K(\mathbf{x}, \mathbf{x}')$ to implicitly map the input data to a very high-dimensional kernel space. As we show in Section 4.1, the subsequent projection to the null space of the constraint vectors can also be achieved implicitly by appropriately updating the kernel matrix. In Section 4.2, we further generalize the projection operation to a linear transformation that also utilizes cannot-link constraints.

4.1 Kernel Updates Using Orthogonal Projection

Recall the matrix $\Phi$ in (9), obtained by mapping the input points to $\mathcal{H}$ via the nonlinear function $\phi$. Let $(j_1, j_2)$ be a must-link constraint pair such that $\phi(\mathbf{x}_{j_1}) = \Phi\mathbf{e}_{j_1}$ and $\phi(\mathbf{x}_{j_2}) = \Phi\mathbf{e}_{j_2}$ are to be clustered together. Given a set $\mathcal{M}$ of m such must-link constraint pairs, for every $(j_1, j_2) \in \mathcal{M}$, the $d_\phi$-dimensional constraint vector can be written as $\mathbf{a}_j = \Phi(\mathbf{e}_{j_1} - \mathbf{e}_{j_2}) = \Phi\mathbf{z}_j$. We refer to the n-dimensional vector $\mathbf{z}_j$ as the indicator vector for the j-th constraint. The $d_\phi \times m$ constraint matrix $\mathbf{A}$ can be obtained by column stacking all the m constraint vectors, i.e., $\mathbf{A} = \Phi\mathbf{Z}$, where $\mathbf{Z} = [\mathbf{z}_1, \ldots, \mathbf{z}_m]$ is the $n \times m$ matrix of indicator vectors. Similar to the example in Fig. 2, we impose the constraints by projecting the matrix $\Phi$ to the null space of $\mathbf{A}$ using the projection matrix

$\mathbf{P} = \mathbf{I}_{d_\phi} - \mathbf{A}\left(\mathbf{A}^\top\mathbf{A}\right)^{+}\mathbf{A}^\top$  (13)

where $\mathbf{I}_{d_\phi}$ is the $d_\phi$-dimensional identity matrix and $+$ denotes the matrix pseudoinverse operation. Let $\mathbf{S} = \mathbf{A}^\top\mathbf{A}$ be the $m \times m$ scaling matrix. The matrix $\mathbf{S}$ can be computed using the indicator vectors and the initial kernel matrix $\mathbf{K}$, without knowing the mapping $\phi$, as

$\mathbf{S} = \mathbf{A}^\top\mathbf{A} = \mathbf{Z}^\top\Phi^\top\Phi\mathbf{Z} = \mathbf{Z}^\top\mathbf{K}\mathbf{Z}.$  (14)

Given the constraint set, the new mapping function is computed as $\hat{\phi}(\mathbf{x}) = \mathbf{P}\phi(\mathbf{x})$. Since all the constraints in $\mathcal{M}$ are satisfied in the projected subspace, we have

$\|\mathbf{P}\Phi\mathbf{Z}\|_F^2 = 0$  (15)

where $\|\cdot\|_F$ denotes the Frobenius norm. The initial kernel function (8) correspondingly transforms to the projected kernel function $\hat{K}(\mathbf{x}, \mathbf{x}') = \hat{\phi}(\mathbf{x})^\top\hat{\phi}(\mathbf{x}')$. Using the projection matrix $\mathbf{P}$, it can be rewritten in terms of the initial kernel function $K(\mathbf{x}, \mathbf{x}')$ and the constraint matrix $\mathbf{A}$

$\hat{K}(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^\top\mathbf{P}^\top\mathbf{P}\,\phi(\mathbf{x}') = \phi(\mathbf{x})^\top\mathbf{P}\,\phi(\mathbf{x}') = \phi(\mathbf{x})^\top\left(\mathbf{I}_{d_\phi} - \mathbf{A}\mathbf{S}^{+}\mathbf{A}^\top\right)\phi(\mathbf{x}') = K(\mathbf{x}, \mathbf{x}') - \phi(\mathbf{x})^\top\mathbf{A}\mathbf{S}^{+}\mathbf{A}^\top\phi(\mathbf{x}').$  (16)
Note that the identity $\mathbf{P}^\top\mathbf{P} = \mathbf{P}$ follows from the fact that $\mathbf{P}$ is a symmetric projection matrix. The m-dimensional vector $\mathbf{A}^\top\phi(\mathbf{x})$ can also be written in terms of the initial kernel function as

$\mathbf{A}^\top\phi(\mathbf{x}) = \mathbf{Z}^\top\Phi^\top\phi(\mathbf{x}) = \mathbf{Z}^\top\left[K(\mathbf{x}_1, \mathbf{x}), \ldots, K(\mathbf{x}_n, \mathbf{x})\right]^\top.$  (17)

Let the vector $\mathbf{k}_{\mathbf{x}} = [K(\mathbf{x}_1, \mathbf{x}), \ldots, K(\mathbf{x}_n, \mathbf{x})]^\top$. From (16), the projected kernel function can be written as

$\hat{K}(\mathbf{x}, \mathbf{x}') = K(\mathbf{x}, \mathbf{x}') - \mathbf{k}_{\mathbf{x}}^\top\mathbf{Z}\mathbf{S}^{+}\mathbf{Z}^\top\mathbf{k}_{\mathbf{x}'}.$  (18)

The projected kernel matrix can be directly computed as

$\hat{\mathbf{K}} = \mathbf{K} - \mathbf{K}\mathbf{Z}\mathbf{S}^{+}\mathbf{Z}^\top\mathbf{K}$  (19)

where the symmetric $n \times n$ initial kernel matrix $\mathbf{K}$ has rank $r \le n$. The rank of the matrix $\mathbf{S}$ is equal to the number of linearly independent constraints, and the rank of the projected kernel matrix $\hat{\mathbf{K}}$ is $r - \mathrm{rank}\,\mathbf{S}$.

4.2 Kernel Updates Using Linear Transformation

By projecting the feature points to the null space of the constraint vector $\mathbf{a}$, their components along $\mathbf{a}$ are fully eliminated. This operation guarantees that the two points belong to the same cluster. As proposed earlier, by appropriately choosing the kernel function $K(\cdot)$ (such that $d_\phi$ is very large), we can make the initial kernel matrix $\mathbf{K}$ full rank. However, a sufficiently large number of such linearly independent constraints could still result in a projected space with dimensionality that is too small for meaningful clustering. From a clustering perspective, it might suffice to bring the two must-link constraint points sufficiently close to each other. This can be achieved through a linear transformation of the kernel space that only scales down the component along the constraint vector. Such a linear transformation preserves the rank of the kernel matrix and is potentially capable of handling a very large number of must-link constraints. A similar transformation that increases the distance between two points also enables us to incorporate cannot-link constraints.
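Before generalizing the projection to a scaled transformation, a minimal sketch of the implicit update (19) may help: building the indicator matrix Z from must-link pairs and computing the projected kernel matrix. The toy points, the Gaussian kernel bandwidth and the constraint indices are illustrative assumptions.

```python
import numpy as np

def project_kernel(K, must_links):
    """Projected kernel matrix (19): K_hat = K - K Z S^+ Z^T K."""
    n = K.shape[0]
    Z = np.zeros((n, len(must_links)))
    for j, (j1, j2) in enumerate(must_links):
        Z[j1, j] = 1.0                          # indicator vector z_j = e_j1 - e_j2
        Z[j2, j] = -1.0
    S = Z.T @ K @ Z                             # scaling matrix (14)
    return K - K @ Z @ np.linalg.pinv(S) @ Z.T @ K

# Toy usage with a Gaussian kernel on random 2-D points (sigma assumed).
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-sq / (2 * 0.5 ** 2))
K_hat = project_kernel(K, [(0, 1), (2, 3)])
# Distances (11) between the constrained points are now (numerically) zero.
```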

Given a constraint vector $\mathbf{a} = \Phi\mathbf{z}$, a symmetric transformation matrix of the form

$\mathbf{T} = \mathbf{I}_{d_\phi} - s\left(\mathbf{a}\mathbf{a}^\top\right)$  (20)

allows us to control the scaling along the vector $\mathbf{a}$ using the scaling factor s. When $s = (\mathbf{a}^\top\mathbf{a})^{-1}$, the transformation becomes a projection to the null space of $\mathbf{a}$. The transformation decreases distances along $\mathbf{a}$ for $0 < s < 2(\mathbf{a}^\top\mathbf{a})^{-1}$, while distances increase for $s < 0$ or $s > 2(\mathbf{a}^\top\mathbf{a})^{-1}$. We can set a target distance $d > 0$ for the constraint point pair by applying an appropriate transformation $\mathbf{T}$. The constraint equation can be written as

$\|\mathbf{T}\Phi\mathbf{z}\|_F^2 = \mathbf{z}^\top\hat{\mathbf{K}}\mathbf{z} = d$  (21)

where $\hat{\mathbf{K}} = \Phi^\top\mathbf{T}^\top\mathbf{T}\Phi = \Phi^\top\mathbf{T}^2\Phi$ is the corresponding transformed kernel matrix. To handle must-link constraints, d should be small, while it should be large for cannot-link constraint pairs. Using the specified d, we can compute s and therefore the transformed kernel matrix

$\hat{\mathbf{K}} = \Phi^\top\left(\mathbf{I}_{d_\phi} - s\,\mathbf{a}\mathbf{a}^\top\right)^2\Phi.$  (22)

Substituting $\mathbf{a} = \Phi\mathbf{z}$, the expression for $\hat{\mathbf{K}}$ is

$\hat{\mathbf{K}} = \mathbf{K} - 2s\,\mathbf{K}\mathbf{z}\mathbf{z}^\top\mathbf{K} + s^2\left(\mathbf{z}^\top\mathbf{K}\mathbf{z}\right)\mathbf{K}\mathbf{z}\mathbf{z}^\top\mathbf{K}.$  (23)

From (21), we can solve for s

$s = \frac{1}{p}\left(1 \pm \sqrt{\frac{d}{p}}\right) \qquad \text{where } p = \mathbf{z}^\top\mathbf{K}\mathbf{z} > 0$  (24)

and the choice of s does not affect the following kernel update, such that the constraint (21) is satisfied

$\hat{\mathbf{K}} = \mathbf{K} + \beta\,\mathbf{K}\mathbf{z}\mathbf{z}^\top\mathbf{K}, \qquad \beta = \frac{1}{p}\left(\frac{d}{p} - 1\right).$  (25)

The case $p = 0$ implies that the constraint vector is a zero vector, so $\mathbf{T} = \mathbf{I}_{d_\phi}$ and a value of $\beta = 0$ should be used. This update rule is equivalent to minimizing the log det divergence (29) for a single equality constraint using Bregman projections (31).

When multiple must-link and cannot-link constraints are available, the kernel matrix can be updated iteratively for each constraint by computing the update parameter $\beta_j$ using the corresponding $d_j$ and $p_j$. However, by enforcing the distance between constraint pairs to be exactly $d_j$, the linear transformation imposes hard constraints for learning the kernel matrix, which could potentially have two adverse consequences:

1) The hard constraints could make the algorithm difficult (or even impossible) to converge, since previously satisfied constraints could easily get violated during subsequent updates of the kernel matrix.

2) Even when the constraints are non-conflicting in nature, in the absence of any relaxation, the method becomes sensitive to labeling errors. This is illustrated through the following example.

The input data consists of points along five concentric circles, as shown in Fig. 3a. Each cluster comprises 150 noisy points along a circle. Four labeled points per class (shown by square markers) are used to generate a set of $\binom{4}{2} \cdot 5 = 30$ must-link constraints. The initial kernel matrix $\mathbf{K}$ is computed using a Gaussian kernel with σ = 5, and updated by imposing the provided constraints (19). The updated kernel matrix $\hat{\mathbf{K}}$ is used for kernel mean shift clustering (Section 3.2) and the corresponding results are shown in Fig. 3b. To test the performance of the method under labeling errors, we also add one mislabeled must-link constraint (shown as a black line). The clustering performance deteriorates drastically with just one mislabeled constraint, as shown in Fig. 3c.

In order to overcome these limitations, the learning algorithm must accommodate systematic relaxation of constraints. This can be achieved by observing that the kernel update in (25) is equivalent to minimizing the log det divergence between $\hat{\mathbf{K}}$ and $\mathbf{K}$ subject to constraint (21) [25]. In the following section, we formulate the kernel learning algorithm as a log det minimization problem with soft constraints.

5 KERNEL LEARNING USING BREGMAN DIVERGENCE

Bregman divergences have been studied in the context of clustering, matrix nearness and metric and kernel learning [22], [25], [1].
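As a concrete aside on the linear transformation of Section 4.2, here is a minimal sketch of the single-constraint update (25); the target distance and toy indices are illustrative.

```python
import numpy as np

def scale_constraint(K, j1, j2, d):
    """Kernel update (25): move points j1, j2 to squared distance d.

    A small d acts as a must-link, a large d as a cannot-link."""
    z = np.zeros(K.shape[0])
    z[j1], z[j2] = 1.0, -1.0
    p = z @ K @ z                        # current squared distance (11)
    if p == 0:                           # zero constraint vector: no update
        return K.copy()
    beta = (d / p - 1.0) / p
    Kz = K @ z
    return K + beta * np.outer(Kz, Kz)   # K + beta * K z z^T K

# After the update, (e_j1 - e_j2)^T K_hat (e_j1 - e_j2) equals d exactly.
```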
We briefly discuss the log det Bregman divergence and its properties in Section 5.1. We summarize the kernel learning problem using Bregman divergences, which was introduced in [25], [22], in Section 5.2.

5.1 The LogDet Bregman Divergence

The Bregman divergence [5] between real, symmetric $n \times n$ matrices $\mathbf{X}$ and $\mathbf{Y}$ is a scalar function given by

$D_{\varphi}(\mathbf{X}, \mathbf{Y}) = \varphi(\mathbf{X}) - \varphi(\mathbf{Y}) - \mathrm{tr}\!\left(\nabla\varphi(\mathbf{Y})^\top(\mathbf{X} - \mathbf{Y})\right)$  (26)

where $\varphi$ is a strictly convex function and $\nabla$ denotes the gradient operator. For $\varphi(\mathbf{X}) = -\log\det(\mathbf{X})$, the resulting divergence is called the log det divergence

$D_{ld}(\mathbf{X}, \mathbf{Y}) = \mathrm{tr}\left(\mathbf{X}\mathbf{Y}^{-1}\right) - \log\det\left(\mathbf{X}\mathbf{Y}^{-1}\right) - n$  (27)

and is defined when $\mathbf{X}$, $\mathbf{Y}$ are positive definite. In [25], this definition was extended to rank deficient (positive semidefinite) matrices by restricting the convex function to the range spaces of the matrices. For positive semidefinite matrices $\mathbf{X}$ and $\mathbf{Y}$, both having rank $r \le n$ and singular value decompositions $\mathbf{X} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^\top$ and $\mathbf{Y} = \mathbf{U}\boldsymbol{\Theta}\mathbf{U}^\top$, the log det divergence is defined as

$D_{ld}(\mathbf{X}, \mathbf{Y}) = \sum_{i, j \le r} \left(\mathbf{v}_i^\top\mathbf{u}_j\right)^2 \left(\frac{\lambda_i}{\theta_j} - \log\frac{\lambda_i}{\theta_j} - 1\right).$  (28)

Moreover, the log det divergence, like any Bregman divergence, is convex with respect to its first argument $\mathbf{X}$ [1]. This property is useful in formulating the kernel learning as a convex minimization problem in the presence of multiple constraints.
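A minimal sketch of the dense form (27); the example matrices are arbitrary positive definite matrices chosen only for illustration.

```python
import numpy as np

def logdet_div(X, Y):
    """LogDet divergence (27) for positive definite X, Y."""
    n = X.shape[0]
    XYinv = X @ np.linalg.inv(Y)
    logdet = np.linalg.slogdet(XYinv)[1]
    return np.trace(XYinv) - logdet - n

# The divergence is zero iff X == Y, and positive otherwise.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
print(logdet_div(A, A))            # ~0.0
print(logdet_div(A, np.eye(2)))    # > 0
```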

Fig. 3: Five concentric circles; kernel learning without relaxation. (a) Input data. The square markers indicate the data points used to generate pairwise constraints. The black line shows the only mislabeled similarity constraint used. (b) Clustering results when only the correctly labeled similarity constraints were used. (c) Clustering results when the mislabeled constraint was also used.

5.2 Kernel Learning with LogDet Divergences

Let $\mathcal{M}$ and $\mathcal{C}$ denote the sets of m must-link and c cannot-link pairs respectively, such that $m + c = n_c$. Let $d_m$ and $d_c$ be the target squared distance thresholds for must-link and cannot-link constraints respectively. The problem of learning a kernel matrix using linear transformations for multiple constraints, discussed in Section 4.2, can be equivalently formulated as the following constrained log det minimization problem

$\min_{\hat{\mathbf{K}}} \; D_{ld}\left(\hat{\mathbf{K}}, \mathbf{K}\right)$  (29)
s.t. $(\mathbf{e}_{j_1}-\mathbf{e}_{j_2})^\top\hat{\mathbf{K}}(\mathbf{e}_{j_1}-\mathbf{e}_{j_2}) = d_m \quad (j_1, j_2) \in \mathcal{M}$
$(\mathbf{e}_{j_1}-\mathbf{e}_{j_2})^\top\hat{\mathbf{K}}(\mathbf{e}_{j_1}-\mathbf{e}_{j_2}) = d_c \quad (j_1, j_2) \in \mathcal{C}.$

The final kernel matrix $\hat{\mathbf{K}}$ is obtained by iteratively updating the initial kernel matrix $\mathbf{K}$. In order to permit relaxation of constraints, we rewrite the learning problem using a soft margin formulation similar to [22], where each constraint pair $(j_1, j_2)$ is associated with a slack variable $\hat{\xi}_j$, $j = 1, \ldots, n_c$

$\min_{\hat{\mathbf{K}}, \hat{\boldsymbol{\xi}}} \; D_{ld}\left(\hat{\mathbf{K}}, \mathbf{K}\right) + \gamma\, D_{ld}\left(\mathrm{diag}(\hat{\boldsymbol{\xi}}), \mathrm{diag}(\boldsymbol{\xi})\right)$  (30)
s.t. $(\mathbf{e}_{j_1}-\mathbf{e}_{j_2})^\top\hat{\mathbf{K}}(\mathbf{e}_{j_1}-\mathbf{e}_{j_2}) \le \hat{\xi}_j \quad (j_1, j_2) \in \mathcal{M}$
$(\mathbf{e}_{j_1}-\mathbf{e}_{j_2})^\top\hat{\mathbf{K}}(\mathbf{e}_{j_1}-\mathbf{e}_{j_2}) \ge \hat{\xi}_j \quad (j_1, j_2) \in \mathcal{C}.$

The $n_c$-dimensional vector $\hat{\boldsymbol{\xi}}$ is the vector of slack variables and $\boldsymbol{\xi}$ is the vector of target distance thresholds $d_m$ and $d_c$. The regularization parameter γ controls the trade-off between fidelity to the original kernel and the training error. Note that by changing the equality constraints to inequality constraints, the algorithm allows must-link pairs to be closer and cannot-link pairs to be farther than their corresponding distance thresholds.

The optimization problem in (30) is solved using the method of Bregman projections [5]. A Bregman projection (not necessarily orthogonal) is performed to update the current matrix such that the updated matrix satisfies that constraint. In each iteration, it is possible that the current update violates a previously satisfied constraint. Since the problem in (30) is convex [25], the algorithm converges to the global minimum after repeatedly updating the kernel matrix for each constraint in the set $\mathcal{M} \cup \mathcal{C}$.

For the log det divergence, the Bregman projection that minimizes the objective function in (30) for a given constraint $(j_1, j_2) \in \mathcal{M} \cup \mathcal{C}$ can be written, as derived in [25],

$\hat{\mathbf{K}}_{t+1} = \hat{\mathbf{K}}_t + \beta_t\, \hat{\mathbf{K}}_t\,(\mathbf{e}_{j_1}-\mathbf{e}_{j_2})(\mathbf{e}_{j_1}-\mathbf{e}_{j_2})^\top\hat{\mathbf{K}}_t.$  (31)

For the t-th iteration the parameter $\beta_t$ is computed in closed form, as explained in the supplementary material. The algorithm converges when $\beta_t$ approaches zero for all $(j_1, j_2) \in \mathcal{M} \cup \mathcal{C}$, with the final learned kernel matrix $\hat{\mathbf{K}} = \hat{\Phi}^\top\hat{\Phi}$. Using $\hat{\mathbf{K}}$, the kernel function that defines the inner product in the corresponding transformed kernel space can be written as

$\hat{K}(\mathbf{x}, \mathbf{y}) = K(\mathbf{x}, \mathbf{y}) + \mathbf{k}_{\mathbf{x}}^\top\,\mathbf{K}^{+}\left(\hat{\mathbf{K}} - \mathbf{K}\right)\mathbf{K}^{+}\,\mathbf{k}_{\mathbf{y}}$  (32)

where $K(\cdot)$ is the scalar initial kernel function (8), and the vectors $\mathbf{k}_{\mathbf{x}} = [K(\mathbf{x}, \mathbf{x}_1), \ldots, K(\mathbf{x}, \mathbf{x}_n)]^\top$ and $\mathbf{k}_{\mathbf{y}} = [K(\mathbf{y}, \mathbf{x}_1), \ldots, K(\mathbf{y}, \mathbf{x}_n)]^\top$ [22]. The points $\mathbf{x}, \mathbf{y} \in \mathcal{X}$ could be out of sample points, i.e., points that are not in the sample set $\{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ used to learn $\hat{\mathbf{K}}$. Note that (18) also generalizes the inner product in the projected kernel space to out of sample points.

Fig. 4: Block diagram describing the semi-supervised kernel mean shift clustering algorithm. The bold boxes indicate the user input to the algorithm.
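To illustrate the out-of-sample extension (32), the following sketch assumes the initial kernel function is Gaussian and that the learned matrix is already available; the function name and parameters are illustrative assumptions.

```python
import numpy as np

def learned_kernel(x, y, X, K, K_hat, sigma):
    """Evaluate the learned kernel (32) at two (possibly unseen) points.

    X : (n, d) training points, K : initial kernel matrix on X,
    K_hat : learned kernel matrix, sigma : Gaussian scale parameter."""
    gauss = lambda a, b: np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))
    kx = np.array([gauss(x, xi) for xi in X])   # k_x = [K(x, x_1), ..., K(x, x_n)]
    ky = np.array([gauss(y, xi) for xi in X])
    Kp = np.linalg.pinv(K)
    return gauss(x, y) + kx @ Kp @ (K_hat - K) @ Kp @ ky
```

Evaluating this function on all pairs of new points yields the Gram matrix needed to run kernel mean shift on out of sample data.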
6 SEMI-SUPERVISED KERNEL MEAN SHIFT CLUSTERING ALGORITHM

We present the complete algorithm for semi-supervised kernel mean shift clustering (SKMS) in this section. Fig. 4 shows a block diagram for the overall algorithm. We explain each of the modules using two examples: Olympic circles, a synthetic data set, and a 1000 sample subset of USPS digits, a real data set with images of handwritten digits. In Section 6.1, we propose a method to select the scale parameter σ for the initial Gaussian kernel function using the sets $\mathcal{M}$, $\mathcal{C}$ and target distances $d_m$, $d_c$. We show the speed-up achieved by performing low-rank kernel matrix updates (as opposed to updating the full kernel matrix) in Section 6.2. For mean shift clustering, we present a strategy to automatically select the bandwidth parameter using the pairwise must-link constraints in Section 6.3. Finally, in Section 6.4, we discuss the selection of the trade-off parameter γ.

Fig. 5: Selecting the scale parameter σ̂ that minimizes $D_{ld}(\boldsymbol{\xi}, \boldsymbol{\xi}_\sigma)$ using grid search, for Olympic circles (σ̂ = 0.75) and USPS digits.

6.1 Initial Parameter Selection

Given two input sample points $\mathbf{x}_i, \mathbf{x}_j \in \mathbb{R}^d$, the Gaussian kernel function is given by

$K_\sigma(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2\sigma^2}\right) \in [0, 1]$  (33)

where σ is the scale parameter. From (33) and (11) it is evident that pairwise distances between sample points in the feature space induced by $K_\sigma(\cdot)$ lie in the interval [0, 2]. This provides us with an effective way of setting up the target distances $d_m = \min(d_1, 0.05)$ and $d_c = \max(d_{99}, 1.95)$, where $d_1$ and $d_{99}$ are the 1st and 99th percentiles of distances between all pairs of points in the kernel space. We select the scale parameter σ such that, in the initial kernel space, distances between must-link points are small while those between cannot-link points are large. This results in good regularization and faster convergence of the learning algorithm.

Recall that $\boldsymbol{\xi}$ is the $n_c$-dimensional vector of target squared distances between the constraint pairs

$\xi_j = \begin{cases} d_m & (j_1, j_2) \in \mathcal{M} \\ d_c & (j_1, j_2) \in \mathcal{C}. \end{cases}$  (34)

Let $\boldsymbol{\xi}_\sigma$ be the vector of distances computed for the $n_c$ constraint pairs using the kernel matrix $\mathbf{K}_\sigma$. The scale parameter that minimizes the log det divergence between the vectors $\boldsymbol{\xi}$ and $\boldsymbol{\xi}_\sigma$ is

$\hat\sigma = \arg\min_{\sigma \in S} D_{ld}\left(\mathrm{diag}(\boldsymbol{\xi}), \mathrm{diag}(\boldsymbol{\xi}_\sigma)\right)$  (35)

and the corresponding kernel matrix is $\mathbf{K}_{\hat\sigma}$. The kernel learning is insensitive to small variations in σ̂, thus it is sufficient to do a search over a discrete set S. The elements of the set S are roughly the centers of equal probability bins over the range of all pairwise distances between the input sample points $\mathbf{x}_i$, $i = 1, \ldots, n$. For Olympic circles, we used S = {0.025, 0.05, 0.1, 0.3, 0.5, 0.75, 1, 1.5, 2, 3, 5} and for USPS digits, S = {1, 2, ..., 10, 15, 20, 25}. A similar set can be automatically generated by mapping a uniform probability grid (over 0–1) via the inverse cdf of the pairwise distances. Fig. 5 shows the plot of the objective function in (35) against different values of σ for the Olympic circles data set and the USPS digits data set.

6.2 Low Rank Kernel Learning

When the initial kernel matrix has rank $r \le n$, the $n \times n$ matrix updates (31) can be modified to achieve a significant computational speed-up [25] (see the supplementary material for the low-rank kernel learning algorithm). We learn a kernel matrix for clustering the two example data sets: Olympic circles (5 classes, n = 1500) and USPS digits (10 classes, n = 1000). The must-link constraints are generated using 15 labeled points from each class: 525 for Olympic circles and 1050 for USPS digits, while an equal number of cannot-link constraints is used. The $n \times n$ initial kernel matrix $\mathbf{K}_{\hat\sigma}$ is computed as described in the previous section. Using singular value decomposition (SVD), we compute an $n \times n$ low-rank kernel matrix $\mathbf{K}$ such that rank $\mathbf{K} = r \le n$ and $\|\mathbf{K}\|_F \ge 0.99\,\|\mathbf{K}_{\hat\sigma}\|_F$. For Olympic circles, this results in a rank 58 approximation of the matrix $\mathbf{K}$, reducing the kernel learning time to 0.92 secs. In the case of USPS digits, the matrix $\mathbf{K}$ has rank 499 and the run time reduces to 68.2 secs. We also observed that in all our experiments in Section 7, the clustering performance did not deteriorate significantly when the low-rank approximation of the kernel matrix was used.

6.3 Setting the Mean Shift Parameters

For kernel mean shift clustering, we define the bandwidth for each point as the distance to its k-th nearest neighbor. In general, clustering is an interactive process where the bandwidth parameter for mean shift is provided by the user, or is estimated based on the desired number of clusters.
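As an aside, returning to the scale selection of Section 6.1, a minimal sketch of the grid search (35); for brevity it uses fixed targets $d_m = 0.05$ and $d_c = 1.95$, omitting the percentile adjustment described above, and all names are illustrative.

```python
import numpy as np

def select_sigma(X, must, cannot, sigmas, dm=0.05, dc=1.95):
    """Grid search (35): pick sigma minimizing D_ld(diag(xi), diag(xi_sigma)).

    For diagonal matrices the divergence (27) reduces to a sum over the
    ratios t = xi_j / (xi_sigma)_j of (t - log t - 1)."""
    pairs = list(must) + list(cannot)
    targets = np.array([dm] * len(must) + [dc] * len(cannot))
    best, best_obj = None, np.inf
    sq = ((X[:, None] - X[None]) ** 2).sum(-1)
    for s in sigmas:
        K = np.exp(-sq / (2 * s ** 2))                         # Gaussian kernel (33)
        d = np.array([K[i, i] + K[j, j] - 2 * K[i, j] for i, j in pairs])
        t = targets / np.maximum(d, 1e-12)
        obj = np.sum(t - np.log(t) - 1.0)
        if obj < best_obj:
            best, best_obj = s, obj
    return best
```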
In this section we propose an automatic recommendation method for the bandwidth parameter k using only the must-link constraint pairs. We build upon the intuition that, in the transformed kernel space, the neighborhood of a must-link pair comprises points similar to the constraint points. Given the j-th constraint pair $(j_1, j_2) \in \mathcal{M}$, we compute the pairwise distances in the transformed kernel space between the first constraint point $\hat\Phi\mathbf{e}_{j_1}$ and all other feature points as $d_i = (\mathbf{e}_{j_1} - \mathbf{e}_i)^\top\hat{\mathbf{K}}(\mathbf{e}_{j_1} - \mathbf{e}_i)$, $i = 1, \ldots, n$, $i \ne j_1$. These points are then sorted in increasing order of $d_i$. The bandwidth parameter $k_j$ for the j-th constraint corresponds to the index of $j_2$ in this sorted list. Therefore, the point $\hat\Phi\mathbf{e}_{j_2}$ is the $k_j$-th neighbor of $\hat\Phi\mathbf{e}_{j_1}$. Finally, the value of k is selected as the median over the set $\{k_j, j = 1, \ldots, m\}$.

For a correct choice of k, we expect the performance of mean shift to be insensitive to small changes in the value of k. In Fig. 6, we plot the number of clusters recovered by kernel mean shift as the bandwidth parameter is varied. For learning the kernel, we used 5 points from each class to generate must-link constraints: 50 for the Olympic circles and 100 for USPS digits. An equal number of cannot-link constraints is used. Fig. 6a shows the plot for Olympic circles (5 classes), where the median based value (k = 6) underestimates the bandwidth parameter.

Fig. 6: Selecting the mean shift bandwidth parameter k, given five labeled points per class. (a) Olympic circles. The number of recovered clusters is sensitive at the median based estimate k = 6, but not at k = 15 (see text). (b) USPS digits. The number of recovered clusters is not sensitive at the median based estimate k = 4. The asterisk indicates the median based estimate, while the square marker shows a good estimate of k.

A good choice is k = 15, which lies in the range (10–23) where the clustering output is insensitive to changes in k. As seen in Fig. 6b, in the case of USPS digits (10 classes), the number of clusters recovered by mean shift is insensitive to the median based estimate (k = 4). For all other data sets we used in our experiments, the median based estimates produced good clustering output. Therefore, with the exception of Olympic circles, we used the bandwidth parameter obtained using the median based approach. However, for completeness, the choice of k should be verified, and corrected if necessary, by analyzing the sensitivity of the clustering output to small perturbations in k. For computational efficiency, we run the mean shift algorithm in a lower dimensional subspace spanned by the singular vectors corresponding to at most the 25 largest singular values of $\hat{\mathbf{K}}$. For all the experiments in Section 7, a 25-dimensional representation of the kernel matrix was large enough to represent the data in the kernel space.

6.4 Selecting the Trade-off Parameter

The trade-off parameter γ is used to weight the objective function for the kernel learning. We select γ by performing a two-fold cross-validation over different values of γ, and the clustering performance is evaluated using the scalar measure Adjusted Rand (AR) index [20]. The kernel matrix is learned using half of the data, with 15 points used to generate the pairwise constraints: 525 must-link constraints for the Olympic circles and 1050 for the USPS digits. An equal number of cannot-link constraint pairs is also generated. Each cross-validation step involves learning the kernel matrix with the specified γ and clustering the testing subset using the kernel mean shift algorithm and the transformed kernel function (32). Fig. 7 shows the average AR index plotted against log γ for the two data sets. In both examples an optimum value of γ = 100 was obtained. In general, this value may not be optimum for other applications. However, in all our experiments in Section 7, we use γ = 100, since we obtained similar curves with small variations in the AR index in the vicinity of γ = 100. Fig. 8 shows a step-by-step summary of the semi-supervised kernel mean shift (SKMS) algorithm.

Fig. 8: Semi-supervised kernel mean shift algorithm (SKMS).
Input: X — n × d data matrix; M, C — similarity and dissimilarity constraint sets; S — set of possible σ values; γ — the trade-off parameter.
Procedure:
- Initialize slack variables ξ̂_j = ξ_j = d_m for (j_1, j_2) ∈ M and ξ̂_j = ξ_j = d_c for (j_1, j_2) ∈ C
- Select σ̂ ∈ S by minimizing (35) and compute the initial Gaussian kernel matrix K_σ̂
- Compute the n × n low-rank kernel matrix K such that rank K = r ≤ n and ‖K‖_F ≥ 0.99 ‖K_σ̂‖_F
- Learn K̂ using log det divergences (see supplementary material)
- Using K̂ and M, estimate the bandwidth parameter k
- Compute the bandwidths h_i as the k-th smallest distance from the point using K̂ and (11); d_φ = rank K̂
- For all data points i = 1, ..., n: initialize ᾱ_i = α_{y_i} = e_i and update ᾱ_i using K̂ in (12) until convergence to a local mode
- Group data points x_i and x_j together if ᾱ_i^⊤ K̂ ᾱ_i + ᾱ_j^⊤ K̂ ᾱ_j − 2 ᾱ_i^⊤ K̂ ᾱ_j = 0
- Return cluster labels

Fig. 7: Selecting the trade-off parameter γ. The AR index vs. log γ for 50 cross-validation runs; the asterisks mark the selected value of γ = 100.
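Returning to the bandwidth selection of Section 6.3, a minimal sketch of the median-rank estimate of k from must-link pairs, assuming the learned kernel matrix and the constraint list are available.

```python
import numpy as np

def estimate_k(K_hat, must_links):
    """Median-rank bandwidth estimate of Section 6.3.

    For each must-link pair (j1, j2), k_j is the rank of j2 among all
    other points sorted by squared distance (11) from j1."""
    n = K_hat.shape[0]
    diag = np.diag(K_hat)
    ks = []
    for j1, j2 in must_links:
        d = diag[j1] + diag - 2.0 * K_hat[j1]      # distances from j1 to all points
        idx = np.delete(np.arange(n), j1)          # exclude j1 itself
        order = np.argsort(np.delete(d, j1))
        ks.append(int(np.where(idx[order] == j2)[0][0]) + 1)
    return int(np.median(ks))
```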
7 EXPERIMENTS

We show the performance of the semi-supervised kernel mean shift (SKMS) algorithm on two synthetic examples and four real-world examples. We also compare our method with two state-of-the-art methods: the semi-supervised kernel k-means (SSKK) [24] and the constrained spectral clustering (E2CP) [29].

Fig. 9: Olympic circles. (a) Input data. (b) AR index as the number of pairwise constraints is varied. (c) AR index as the fraction of mislabeled constraints is varied.

In the past, the superior performance of E2CP over other recent methods has been demonstrated. Please see [29] for more details. In addition to this, we also compare SKMS with the kernelized k-means (Kkm) and kernelized spectral clustering (KSC) algorithms. The learned kernel matrix is used to compute distances in SKMS and Kkm, while KSC uses it as the affinity matrix. By providing the same learned kernel matrix to the three algorithms, we compare the clustering performance of mean shift with that of k-means and spectral clustering. Note that, unlike Kkm and KSC, the methods SSKK and E2CP do not use the learned kernel matrix, and supervision is supplied by alternative means. With the exception of SKMS, all the methods require the user to provide the correct number of clusters.

Comparison metric. The clustering performance of the different algorithms is compared using the Adjusted Rand (AR) index [20]. It is an adaptation of the Rand index that penalizes random cluster assignments, while measuring the agreement between the clustering output and the ground truth labels. The AR index is a scalar and takes values between zero and one, with perfect clustering yielding a value of one.

Experimental setup. To generate pairwise constraints, we randomly select b labeled points from each class. All possible must-link constraints are generated for each class, such that $m = \binom{b}{2}$ per class, and a subset of all cannot-link constraint pairs is selected at random such that $c = m = n_c/2$. For each experimental setting, we average the clustering results over 50 independent runs, each with randomly selected pairwise constraints. The initial kernel matrix is computed using a Gaussian kernel (33) for all the methods. We hand-picked the scale parameter σ for SSKK and E2CP from a wide range of values such that their final clustering performance on each data set was maximized. For Kkm, KSC and SKMS, the values of σ and the target distances $d_m$ and $d_c$ are estimated as described in Section 6.1. Finally, the mean shift bandwidth parameter k is estimated as described in Section 6.3. For each experiment, we specify the scale parameter σ used for SSKK and E2CP. For Kkm, KSC and SKMS, σ is chosen using (35) and the most frequent value is reported. We also list the range of k, the bandwidth parameter for SKMS, for each application. For the log det divergence based kernel learning, we cap the maximum number of iterations and set the trade-off parameter γ to 100.

7.1 Synthetic Examples

Olympic Circles. As shown in Fig. 9a, the data consists of noisy points along five intersecting circles, each comprising 300 points. For the SSKK and E2CP algorithms, the value of the initial kernel parameter σ was 0.5. For Kkm, KSC and SKMS the most frequently selected σ was 0.75. We performed two sets of experiments. In the first experiment, the performance of all the algorithms is compared by varying the total number of pairwise constraints. The number of labeled points per class varies as {5, 7, 10, 12, 15, 17, 20, 25} and is used to generate must-link and cannot-link constraints that vary between 100 and 3000. Fig. 9b demonstrates that SKMS performs better than E2CP and KSC while its performance is similar to SSKK and Kkm. For 100 constraints, SKMS recovered 5–8 clusters, making a mistake 22% of the time. For all other settings together, it recovered an incorrect number of clusters (4–6) only 6.3% of the time.
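A small sketch of this constraint-generation protocol (b labeled points per class, all intra-class must-link pairs, an equal number of random cannot-link pairs); the use of scikit-learn's adjusted_rand_score for the AR index is an assumption about tooling, not part of the paper.

```python
import itertools
import numpy as np
from sklearn.metrics import adjusted_rand_score

def make_constraints(labels, b, rng):
    """Sample b labeled points per class; return must-link and cannot-link pairs."""
    must, chosen = [], {}
    for c in np.unique(labels):
        idx = rng.choice(np.where(labels == c)[0], size=b, replace=False)
        chosen[c] = idx
        must += list(itertools.combinations(idx, 2))   # all C(b, 2) intra-class pairs
    classes = list(chosen)
    cannot = set()
    while len(cannot) < len(must):                     # c = m cannot-link pairs
        c1, c2 = rng.choice(classes, 2, replace=False)
        cannot.add((rng.choice(chosen[c1]), rng.choice(chosen[c2])))
    return must, list(cannot)

# Evaluation of a clustering result against ground truth:
# ari = adjusted_rand_score(ground_truth_labels, predicted_labels)
```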
In the second experiment, we introduced labeling errors by randomly swapping the labels of a fraction of the pairwise constraints. We use 20 labeled sample points per class to generate 1900 pairwise constraints and vary the fraction of mislabeled constraints between 0 and 0.6 in steps of 0.1. Fig. 9c shows the clustering performance of all the methods. The performance of SKMS degrades only slightly even when half the constraint points are labeled wrongly, but it starts deteriorating when 60% of the constraint pairs are mislabeled.

Concentric Circles. The data consists of ten concentric circles, each comprising 100 noisy points (Fig. 10a). For Kkm, KSC and SKMS the most frequently selected σ was 1. Both SSKK and E2CP used σ = 0.2. In the first experiment, we vary the number of labeled points per class between 5 and 25, as in the previous example, to generate between 200 and 6000 pairwise constraints. For 200 constraints, SKMS incorrectly recovered nine clusters 50% of the time, while for all the other settings it correctly detected 10 clusters every time. In the second experiment, we use 25 labeled points per class to generate 6000 pairwise constraints.

Fig. 10: Ten concentric circles. (a) Input data. (b) AR index as the number of pairwise constraints is varied. (c) AR index as the fraction of mislabeled constraints is varied.

A fraction of randomly selected constraints was then mislabeled, varied between 0 and 0.6 in steps of 0.1; the performance of all the algorithms in the two experiments is shown in Fig. 10b–c.

7.2 Real-World Applications

In this section, we demonstrate the performance of our method on two real applications having a small number of classes, USPS digits (10 classes) and MIT scene (8 classes), and two real applications with a large number of classes, PIE faces (68 classes) and a Caltech-101 subset (50 classes). We also show the performance of SKMS while clustering out of sample points using (32) on the USPS digits and the MIT scene data sets. Since the sample size per class is much smaller for PIE faces and the Caltech-101 subset, results for generalization are not shown. We observe that the superiority of SKMS over other competing algorithms is clearer when the number of clusters is large.

USPS Digits. The USPS digits data set is a collection of grayscale images of natural handwritten digits and is available from roweis/data.html. Each class contains 1100 images of one of the ten digits. Fig. 11 shows sample images from each class. Each image is represented with a 256-dimensional vector where the columns of the image are concatenated. We vary the number of labeled points per class between 5 and 25, as in the previous example, to generate from 200 to 6000 pairwise constraints. The maximum number of labeled points per class used comprises only 2.27% of the total data.

Fig. 11: Sample images from the USPS digits data set.

Since the whole data set has 11,000 points, we select 100 points from each class at random to generate a 1000 sample subset. The labeled points are selected at random from this subset for learning the kernel matrix. The value of σ used was 5 for SSKK and 2 for E2CP. For Kkm, KSC and SKMS the most frequently selected σ was 7. In the first experiment, we compare the performance of all the algorithms on this subset of 1000 points. Fig. 12a shows the clustering performance of all the methods. Once the number of constraints was increased beyond 500, SKMS outperformed the other algorithms. For 200 constraints, SKMS discovered 9–11 clusters, making a mistake 18% of the time, while it recovered exactly 10 clusters in all the other cases. In the second experiment, we evaluated the performance of SKMS for the entire data set using 25 labeled points per class. Compared to Fig. 12a, the AR index averaged over 50 runs showed only a marginal decrease in clustering accuracy. The pairwise distance matrix (PDM) after performing mean shift clustering is shown in Fig. 12b. For illustration purposes, the data is ordered such that the images belonging to the same class appear together. The block diagonal structure indicates good generalization of SKMS for out of sample points, with little confusion between classes. Neither SSKK nor E2CP could be generalized to out of sample points because these methods need to learn a new kernel or affinity matrix.

MIT Scene. The data set is available from MIT /spatialenvelope/ and contains 2688 labeled images. Each image belongs to one of the eight outdoor scene categories, four natural and four man-made. Fig. 13 shows one example image from each of the eight categories.
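To inspect results the way Fig. 12b does, the pairwise distance matrix between the converged modes can be computed with (11) and reordered by class; a minimal sketch, assuming the converged weighting vectors from the kernel mean shift sketch given earlier.

```python
import numpy as np

def mode_distance_matrix(alphas, K, labels):
    """Pairwise distances (11) between converged modes, ordered by class label."""
    order = np.argsort(labels)              # group same-class points together
    A = alphas[order]
    G = A @ K @ A.T                         # inner products between modes
    d = np.diag(G)
    return np.sqrt(np.maximum(d[:, None] + d[None] - 2 * G, 0.0))

# A block-diagonal pattern of small distances indicates well-separated clusters.
```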
Using the code provided with the data, we extracted the GIST descriptors [32], which were then used to test all the algorithms. We vary the number of labeled points per class as in the previous example, but only between 5 and 20, to generate from 160 to 3040 pairwise constraints. The maximum number of labeled points used comprises only 7.44% of the total data. We select 100 points at random from each of the eight classes to generate the initial kernel matrix. The pairwise constraints are obtained from the randomly chosen labeled points from each class. The value of σ used was 1 for SSKK and 0.5 for E2CP. For Kkm, KSC and SKMS the most frequently selected σ was 1.75. Fig. 14a shows the clustering performance of all the algorithms as the number of constraint points is varied.

Fig. 12: USPS digits. (a) AR index as the number of pairwise constraints is varied. (b) The pairwise distance matrix (PDM) after performing mean shift clustering.

Fig. 14: MIT Scene data set. (a) AR index on the kernel matrix as the number of pairwise constraints is varied. (b) AR index on the entire data as the number of pairwise constraints is varied.

For 160 constraint pairs, SKMS incorrectly discovered 7–10 clusters about 22% of the time, while it correctly recovered eight clusters in all other settings. In the second experiment, the whole data set was clustered using the learned kernel matrix and generalizing to the out of sample points (32). Both SSKK and E2CP were used to learn the full kernel and affinity matrix respectively. Fig. 14b shows the performance of all the algorithms as the number of pairwise constraints was varied. Note that in [29], for similar experiments, the superior performance of E2CP was probably due to the use of a spatial Markov kernel instead of the Gaussian kernel.

PIE Faces. From the CMU PIE face data set [35], we use only the frontal pose and neutral expression of all 68 subjects under 21 different lighting conditions. We coarsely aligned the images with respect to eye and mouth locations and resized them. In Fig. 15, we show eight illumination conditions for three different subjects. Due to significant illumination variation, intra-class variability is very large and some of the samples from different subjects appear to be much closer to each other than samples within the same class. We convert the images from color to grayscale and normalize the intensities between zero and one. Each image is then represented with a vector formed by concatenating the columns of the image. We vary the number of labeled points per class as {3, 4, 5, 6, 7} to generate between 408 and 2856 pairwise constraints. The maximum number of labeled points comprises 30% of the total data. We generate the initial kernel matrix using all the data. The value of σ used was 10 for SSKK and 22 for E2CP. For Kkm, KSC and SKMS the most frequently selected σ was 25 and the range of the bandwidth parameter k was 4–7. Fig. 16 shows the performance comparison, and it can be observed that SKMS outperforms all the other algorithms. Note that SKMS approaches near perfect clustering for more than 5 labeled points per class.

Fig. 13: Sample images from each of the eight categories of the MIT scene data set.

Fig. 15: PIE faces data set. Sample images showing eight different illuminations for three subjects.

Fig. 16: PIE Faces data set. AR index as the number of pairwise constraints is varied.

When three labeled points per class were used, SKMS made a mistake in the number of discovered clusters about 84% of the time. For all other settings, SKMS correctly recovered 68 clusters about 62% of the time, while it recovered 67 clusters about 32% of the time and a different number of clusters in the remaining runs. The Kkm and KSC methods perform poorly in spite of explicitly using the number of clusters and the same learned kernel matrix as SKMS.

Caltech-101 Objects. The Caltech-101 data set [16] is a collection of variable sized images across 101 object categories. This is a particularly hard data set with large intra-class variability. Fig. 17 shows sample images from eight different categories. We randomly sampled a subset of 50 categories, as listed in Table 1, with each class containing 31 to 40 samples. For each sample image, we extract GIST descriptors [32] and use them for the evaluation of all the competing clustering algorithms. We vary the number of labeled points per class as {5, 7, 10, 12, 15} to generate increasing numbers of pairwise constraints, starting from 500. The maximum number of labeled points comprises 38% of the total data. We use a larger number of constraints in order to overcome the large variability in the data set. We generate the initial kernel matrix using all the data. The value of σ used is 0.2 for E2CP and 0.3 for SSKK. For Kkm, KSC and SKMS the most frequently selected σ was 0.5.

Fig. 18 shows the comparison of SKMS with the other competing algorithms. It can be seen that SKMS outperforms all the other methods. For five labeled points per class, SKMS made mistakes in the number of detected clusters 75% of the time. For all the other settings together, SKMS recovered an incorrect number (48–51) of clusters only 9% of the time. A tabular summary of all comparisons across different data sets is provided in the supplementary material.

Fig. 17: Sample images from eight of the 50 classes used from the Caltech-101 data set.

Fig. 18: Caltech-101 data set. AR index as the number of pairwise constraints is varied.

8 DISCUSSION

We presented the semi-supervised kernel mean shift (SKMS) clustering algorithm, where the inherent structure of the data points is learned using a few user supplied pairwise constraints. The data is nonlinearly mapped to a higher-dimensional kernel space where the constraints are effectively imposed by applying a linear transformation. This transformation is learned by minimizing a log det Bregman divergence between the initial and the learned kernels. The method also estimates the parameters for the mean shift algorithm and recovers an unknown number of clusters automatically. We evaluate the performance of SKMS on challenging real and synthetic data sets and compare it with state-of-the-art methods.

We compared the SKMS algorithm with the kernelized k-means (Kkm) and kernelized spectral clustering (KSC) algorithms, which used the same learned kernel matrix. The linear transformation applied to the initial kernel space imposes soft distance (inequality) constraints, and may result in clusters that have very different densities in the transformed space. This explains the relatively poor performance of KSC in most experiments, because spectral clustering methods perform poorly when the clusters have significantly different densities [30]. The k-means method is less affected by cluster density, but it is sensitive to initialization, the shape of the clusters and outliers, i.e., sparse points far from the cluster center.
Unlike the other methods, mean shift clustering does not need the number of clusters as input and can identify clusters of different shapes, sizes and densities. Since locality is imposed by the bandwidth parameter, mean shift is more robust to outliers. As shown in our experiments, this advantage gets pronounced when the number of clusters in the data is large. Clustering, in general, becomes very challenging when the data has large intra-class variability. For example, Fig. 19 shows three images from the highway category of the MIT scene data set that were misclassified. The misclassification error rate on the Caltech-101 data set was even higher.

TABLE 1: Object classes used from the Caltech-101 data set: elephant, flamingo head, emu, faces, gerenuk, stegosaurus, accordion, ferry, cougar face, mayfly, chair, scissors, menorah, platypus, butterfly, tick, metronome, inline skate, bass, pyramid, leopards, sea horse, cougar body, stop sign, lotus, dalmatian, gramophone, camera, trilobite, dragonfly, grand piano, headphone, sunflower, ketch, wild cat, crayfish, nautilus, buddha, yin yang, dolphin, minaret, anchor, car side, rooster, wheelchair, octopus, joshua tree, ant, umbrella, crocodile.

Fig. 19: Three images from the class highway of the MIT Scene data set that were misclassified.

On large scale data sets with over 10,000 categories [14], all methods typically perform poorly, as an image may qualify for multiple categories. For example, the images in Fig. 19 were classified in the street category, which is semantically correct. To deal with a large number of categories, clustering (classification) algorithms should incorporate the ability to use higher level semantic features that connect an image to its possible categories.

The code for SKMS is written in MATLAB and C and is available for download at /code/skms/index.html

REFERENCES

[1] A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman divergences. Journal of Machine Learning Research, 6.
[2] S. Basu, M. Bilenko, and R. Mooney. A probabilistic framework for semi-supervised clustering. In Proc. 10th Intl. Conf. on Knowledge Discovery and Data Mining.
[3] S. Basu, I. Davidson, and K. Wagstaff. Constrained Clustering: Advances in Algorithms, Theory, and Applications. Chapman & Hall/CRC Press.
[4] M. Bilenko, S. Basu, and R. J. Mooney. Integrating constraints and metric learning in semi-supervised clustering. In Intl. Conf. Machine Learning, pages 81-88.
[5] L. M. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7.
[6] O. Chapelle, B. Schölkopf, and A. Zien. Semi-Supervised Learning. MIT Press.
[7] H. Chen and P. Meer. Robust fusion of uncertain information. IEEE Sys. Man Cyber. Part B, 35.
[8] Y. Cheng. Mean shift, mode seeking, and clustering. IEEE PAMI, 17.
[9] R. Collins. Mean shift blob tracking through scale space. In CVPR, volume 2.
[10] D. Comaniciu. Variable bandwidth density-based fusion. In CVPR, volume 1, pages 56-66.
[11] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE PAMI, 24.
[12] D. Comaniciu, V. Ramesh, and P. Meer. Real-time tracking of non-rigid objects using mean shift. In CVPR, volume 1.
[13] N. Cristianini and J. Shawe-Taylor. Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge U. Press.
[14] J. Deng, A. C. Berg, K. Li, and L. Fei-Fei. What does classifying more than 10,000 image categories tell us? In ECCV, pages 71-84.
[15] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Second Intl. Conf. on Knowledge Discovery and Data Mining.
[16] L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In IEEE CVPR Workshop on Generative Model Based Vision (WGMBV).
[17] K. Fukunaga and L. D. Hostetler. The estimation of the gradient of a density function with applications in pattern recognition. IEEE Information Theory, 21:32-40.
[18] R. Ghani. Combining labeled and unlabeled data for multiclass text categorization. In Proc. 19th Intl. Conf. on Machine Learning.
[19] G. Hager, M. Dewan, and C. Stewart. Multiple kernel tracking with SSD. In CVPR, volume 1.
[20] L. Hubert and P. Arabie. Comparing partitions. Journal of Classification, 2.
[21] A. K. Jain. Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31.
[22] P. Jain, B. Kulis, J. V. Davis, and I. S. Dhillon. Metric and kernel learning using a linear transformation. Journal of Machine Learning Research, 13.
[23] S. D. Kamvar, D. Klein, and C. D. Manning. Spectral learning. In Intl. Conf. on Artificial Intelligence.
[24] B. Kulis, S. Basu, I. S. Dhillon, and R. J. Mooney. Semi-supervised graph clustering: A kernel approach. Machine Learning, 74:1-22.
[25] B. Kulis, M. A. Sustik, and I. S. Dhillon. Low-rank kernel learning with Bregman matrix divergences. Journal of Machine Learning Research, 10.
[26] L. Lelis and J. Sander. Semi-supervised density-based clustering. In IEEE Intl. Conf. on Data Mining.
[27] M. Liu, B. Vemuri, S.-I. Amari, and F. Nielsen. Shape retrieval using hierarchical total Bregman soft clustering. IEEE PAMI, 34.
[28] Z. Lu and M. A. Carreira-Perpiñán. Constrained spectral clustering through affinity propagation. In CVPR.
[29] Z. Lu and H. H.-S. Ip. Constrained spectral clustering via exhaustive and efficient constraint propagation. In ECCV, pages 1-14.
[30] B. Nadler and M. Galun. Fundamental limitations of spectral clustering. In NIPS.
[31] A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS.
[32] A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42.
[33] C. Ruiz, M. Spiliopoulou, and E. M. Ruiz. Density-based semi-supervised clustering. Data Min. Knowl. Discov., 21.
[34] Y. Sheikh, E. Khan, and T. Kanade. Mode-seeking by medoidshifts. In ICCV, pages 1-8.
[35] T. Sim, S. Baker, and M. Bsat. The CMU Pose, Illumination, and Expression (PIE) database. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition.
[36] R. Subbarao and P. Meer. Nonlinear mean shift for clustering over analytic manifolds. In CVPR.
[37] O. Tuzel, F. Porikli, and P. Meer. Kernel methods for weakly supervised mean shift clustering. In ICCV, pages 48-55.
[38] O. Tuzel, R. Subbarao, and P. Meer. Simultaneous multiple 3D motion estimation via mode finding on Lie groups. In ICCV, pages 18-25.
[39] A. Vedaldi and S. Soatto. Quick shift and kernel methods for mode seeking. In ECCV.
[40] K. Wagstaff, C. Cardie, S. Rogers, and S. Schrödl. Constrained k-means clustering with background knowledge. In Intl. Conf. Machine Learning.
[41] J. Wang, B. Thiesson, Y. Xu, and M. Cohen. Image and video segmentation by anisotropic kernel mean shift. In ECCV, volume 2, 2004.

Saket Anand received his B.E. degree in Electronics Engineering from the University of Pune, India. He completed his MS in Electrical and Computer Engineering from Rutgers University, New Jersey. From 2007 to 2009, he worked on handwriting recognition as a Research Engineer at Read-Ink Technologies, Bangalore, India. He is currently pursuing a PhD degree in Electrical and Computer Engineering at Rutgers University, New Jersey. His research interests include computer vision, statistical pattern recognition and machine learning. He is a student member of the IEEE.

Sushil Mittal received his B.Tech degree in Electronics and Communication Engineering from the National Institute of Technology, Warangal, India. From 2004 to 2005, he was in the Department of Electrical Engineering at the Indian Institute of Science as a research associate. He completed his MS and PhD degrees in Electrical and Computer Engineering from Rutgers University, New Jersey, in 2008 and 2011, respectively. After his PhD, he worked in the Department of Statistics at Columbia University as a postdoctoral research scientist. He is currently a co-founder at Scibler Technologies, Bangalore. His research interests include computer vision, machine learning and applied statistics. He is a member of the IEEE.

Peter Meer received the Dipl. Engn. degree from the Bucharest Polytechnic Institute, Romania, in 1971, and the D.Sc. degree from the Technion, Israel Institute of Technology, Haifa, in 1986, both in electrical engineering. From 1971 to 1979 he was with the Computer Research Institute, Cluj, Romania, working on R&D of digital hardware. Between 1986 and 1990 he was Assistant Research Scientist at the Center for Automation Research, University of Maryland at College Park. In 1991 he joined the Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, and is currently a Professor. He has held visiting appointments in Japan, Korea, Sweden, Israel and France, and was on the organizing committees of numerous international workshops and conferences. He was an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence between 1998 and 2002, a member of the Editorial Board of Pattern Recognition between 1990 and 2005, and was a Guest Editor of Computer Vision and Image Understanding for a special issue on robustness in computer vision. He is coauthor of an award winning paper in Pattern Recognition in 1989, the best student paper in 1999, the best paper in 2000 and the runner-up paper in 2007 of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). With coauthors Dorin Comaniciu and Visvanathan Ramesh he received at the 2010 CVPR the Longuet-Higgins prize for fundamental contributions in computer vision in the past ten years. His research interest is in the application of modern statistical methods to image understanding problems. He is an IEEE Fellow.

Oncel Tuzel received the BS and MS degrees in computer engineering from the Middle East Technical University, Ankara, Turkey, in 1999 and 2002, and the PhD degree in computer science from Rutgers University, Piscataway, New Jersey. He is a principal member of research staff at Mitsubishi Electric Research Labs (MERL), Cambridge, Massachusetts. His research interests are in computer vision, machine learning, and robotics. He coauthored more than 30 technical publications, and has applied for more than 20 patents. He was on the program committee of several international conferences such as CVPR, ICCV, and ECCV.
He received the best paper runner-up award at the IEEE Computer Vision and Pattern Recognition Conference. He is a member of the IEEE.
