The Rate Adapting Poisson Model for Information Retrieval and Object Recognition


Peter V. Gehler, Max Planck Institute for Biological Cybernetics, Spemannstrasse 38, Tübingen, Germany
Alex D. Holub, Department of Electrical Engineering, California Institute of Technology, MC, Pasadena, CA, USA
Max Welling, Bren School of Information and Computer Science, University of California Irvine, CA, USA

Abstract

Probabilistic modelling of text data in the bag-of-words representation has been dominated by directed graphical models such as PLSI, LDA, NMF, and discrete PCA. Recently, state of the art performance on visual object recognition has also been reported using variants of these models. We introduce an alternative undirected graphical model suitable for modelling count data. This Rate Adapting Poisson (RAP) model is shown to generate superior dimensionally reduced representations for subsequent retrieval or classification. Models are trained using contrastive divergence, while inference of latent topical representations is efficiently achieved through a simple matrix multiplication.

1. Introduction and Context

The dominant paradigm for modelling histogram data is the extraction of latent semantic structure, often referred to as topics. Text data, for example, can be represented as word counts for a given dictionary, a representation referred to as bag of words. For image data there exists an analogue, the so-called bag of features representation, which can be thought of as count data of visual words. Latent variable models determine a mapping from such count data to a compressed latent representation. This representation can subsequently be used to improve document retrieval and classification performance.

The simplest models assign each document to a single cluster a priori. However, it has been recognized that distributed latent representations are superior. For instance, a simple singular value decomposition of the count matrix, known as latent semantic indexing (LSI), is quite successful in extracting semantic structure (Deerwester et al., 1990). A probabilistic extension of this idea was introduced by Hofmann (1999) as probabilistic latent semantic indexing (PLSI). By realizing that PLSI is not a proper generative model at the level of documents, a further extension, latent Dirichlet allocation (LDA), was introduced by Blei et al. (2003). As pointed out by Buntine and Jakulin (2004), the basic architecture of LDA is known under various names such as admixtures, grade of membership model, multiple aspect model and multinomial PCA. These authors also extend LDA to a still broader class of models known as discrete PCA.

These models can be characterized along a number of dimensions. Firstly, they represent a subset of the class of directed graphical models, or approximations thereof. Directed models share certain properties, such as the phenomenon of explaining away (given an observation on a child node, its parents become dependent) and easy ancestral sampling. As shown in (Buntine, 2002; Girolami & Kaban, 2003), the growth of the number of parameters with the number of training documents for PLSI can be understood as variational EM learning of an LDA model, where for each training document the true posterior is replaced with a point estimate. This insight also relates non-negative matrix factorization (Lee & Seung, 1999) to the Gamma-Poisson model (Buntine & Jakulin, 2004) in a similar manner.
More sophisticated approximations to the intractable inference problem have also been studied in the literature: a structured mean field approximation (Blei & Jordan, 2004), expectation propagation (Minka & Lafferty, 2002) and a collapsed Gibbs sampler (Griffiths & Steyvers, 2002).

There is also another property that characterizes these models. Models such as LDA, PLSI, Gamma-Poisson models and NMF (in fact all discrete PCA models and their variational approximations) combine topics in the probability domain. For instance, in LDA we generate a probability vector $\theta$ with $\sum_j \theta_j = 1$ from a Dirichlet distribution and linearly combine these probabilities using a stochastic matrix $M$. Each column of $M$ represents a discrete distribution over words for topic $j$, and a document is modelled as $N_{doc}$ samples from the linear combination $p_i = \sum_j M_{ij}\theta_j$.

However, we can also take linear combinations in the log-probability domain. Exponential family PCA (EPCA) represents an example of this class of models. In fact, we can think of EPCA exactly as a variational approximation (again using point estimates) of a model with conditional distributions in the exponential family and a flat (constant) prior. Special cases include PCA as the variational approximation of factor analysis (or probabilistic PCA) (Roweis, 1997), and the sparse coding algorithm of Olshausen and Field (1997) is a variational approximation of ICA.

Exponential family harmoniums, introduced by Welling et al. (2004), can be understood as undirected probabilistic models which linearly combine topics in the log-probability domain. The undirected semantics of this model has interesting consequences. Most importantly, the latent variables are conditionally independent given the data, and vice versa. This is in stark contrast to the marginal independence of the latent variables in directed models. The implication is that the mapping from input space to latent space is given by a single matrix multiplication, possibly followed by a componentwise nonlinearity. For applications such as information retrieval and object recognition, where speed is of the essence, this is a very useful property. We note that harmoniums also generate distributed latent representations.

An interesting explanation for the improved retrieval and classification results using harmoniums was given in Xing et al. (2005). These authors observe that harmoniums mix their topics using a very different mechanism than LDA. This has important consequences, in particular for low count values. If a word appears only once in a document, LDA assumes a priori that this word is generated by a single topic, an assumption not made by harmoniums. In some sense, the simpler inference in harmoniums is traded off against more difficult learning due to the presence of an intractable normalization constant which depends on the parameters of the model. However, harmoniums are designed to take advantage of contrastive divergence learning (Hinton, 2002), which has been shown to be an efficient algorithm that scales well to large problems.

Figure 1. Markov random field representation of the RAP model. Top-layer nodes represent binomial hidden variables h while bottom-layer nodes represent Poisson visible variables x.

2. The Rate Adapting Poisson Model

The Rate Adapting Poisson (RAP) model follows the general architecture of an exponential family harmonium (Welling et al., 2004). The RAP model is different from the undirected probabilistic latent semantic indexing (UPLSI) model presented in Welling et al. (2004), which uses a multinomial conditional distribution over the observed variables. That choice results in a large array $W_{ija}$ of coupling parameters between topics and observed variables, with a separate entry for every count-level $a$, and renders the model practical only for observations with very few states (e.g. binary). The RAP model is more economical in its use of parameters, coupling topics to counts using a conditional Poisson distribution involving a single matrix $W_{ij}$.
This change has made the experiments in sections 3 and 4 possible.

2.1. RAP: Generative Model

A harmonium can be specified by writing down two consistent conditional distributions in the exponential family. For RAP, we use conditional Poisson distributions for the observed count data and conditional binomial distributions for the latent topic variables,

$$p(x_i \mid h) = \mathrm{Poisson}_{x_i}\Big[\log(\lambda_i) + \sum_j W_{ij} h_j\Big] \qquad (1)$$

$$p(h_j \mid x) = \mathrm{Binomial}_{h_j}\Big[\sigma\Big(\log\Big(\tfrac{p_j}{1-p_j}\Big) + \sum_i W_{ij} x_i\Big);\, M_j\Big] \qquad (2)$$

where $\sigma(x) = 1/(1+e^{-x})$ is the sigmoid function, $\lambda_i$ is the mean rate of the conditional Poisson distribution for word $i$, $p_j$ is the probability of success and $M_j$ the total number of samples for the conditional binomial distribution for topic $j$, $x$ is the count vector, $h$ a discrete topic vector and $W$ the interaction between topics and counts. From these equations it can be seen that the values of the variables in the opposite layer shift the canonical parameters of the variables in the layer under consideration. It is due to this behavior that we named the model "rate adapting". Note also that all variables are conditionally independent given values for the variables in the opposite layer.
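To make the layer-wise structure concrete, here is a minimal NumPy sketch of the two conditionals (1) and (2): all units in one layer are sampled in parallel given the other layer. Names, shapes and parameter values (W, log_lam, log_odds, M_j) are illustrative assumptions, not the released Matlab code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_visible(h, log_lam, W):
    """Eq. (1): x_i ~ Poisson(exp(log(lambda_i) + sum_j W_ij h_j))."""
    return rng.poisson(np.exp(log_lam + W @ h))

def sample_hidden(x, log_odds, W, M_j):
    """Eq. (2): h_j ~ Binomial(M_j, sigma(log(p_j/(1-p_j)) + sum_i W_ij x_i))."""
    return rng.binomial(M_j, sigmoid(log_odds + W.T @ x))

V, K = 1000, 10                          # vocabulary size, number of topics
W = 0.01 * rng.standard_normal((V, K))   # topic-word interactions W_ij
log_lam = np.zeros(V)                    # log mean rates, one per word
log_odds = np.zeros(K)                   # log(p_j / (1 - p_j)), one per topic
M_j = 5                                  # binomial trials per topic

x = rng.poisson(1.0, size=V)             # a toy count vector
h = sample_hidden(x, log_odds, W, M_j)   # one parallel update of the top layer
x_new = sample_visible(h, log_lam, W)    # one parallel update of the bottom layer
```

Alternating these two calls is exactly the blockwise Gibbs sampler referred to below.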

These two conditional distributions are consistent with the joint distribution over $\{x, h\}$ defined through $p(x,h) = \exp[f(x,h)]/Z$ with

$$f(x,h) = \sum_i \Big(\log(\lambda_i)\, x_i - \log(x_i!)\Big) + \sum_j \Big(\log\Big(\tfrac{p_j}{1-p_j}\Big)\, h_j - \log(h_j!) - \log((M_j - h_j)!)\Big) + \sum_{ij} W_{ij}\, x_i\, h_j \qquad (3)$$

where we have not written any terms that do not explicitly depend on a random variable. The two-layer undirected architecture of this model is shown in figure 1. Samples from the model can be obtained efficiently by Gibbs sampling because all variables in a layer can be sampled in parallel given the values for the variables of the opposite layer, and vice versa. To find the most likely variable assignments one can locate modes of the distribution by iterating the equations,

$$x_i^{\mathrm{mode}} = \exp\Big(\log(\lambda_i) + \sum_j W_{ij}\, h_j^{\mathrm{mode}}\Big) \qquad (4)$$

$$h_j^{\mathrm{mode}} = (M_j + 1)\,\sigma\Big(\log\Big(\tfrac{p_j}{1-p_j}\Big) + \sum_i W_{ij}\, x_i^{\mathrm{mode}}\Big) \qquad (5)$$

The RAP model can also be represented as a factor graph (see figure 2) by marginalizing out the latent variables,

$$p(x) \propto \exp\Big[\sum_i \Big(\log(\lambda_i)\, x_i - \log(x_i!)\Big) + \sum_j M_j \log\Big(1 + \exp\Big(\sum_i W_{ij} x_i - \beta_j\Big)\Big)\Big] \qquad (6)$$

where we have abbreviated $\beta_j = -\log[p_j/(1-p_j)]$. We can read off the word-factors from this expression as $F_i(x_i) = \lambda_i^{x_i}/x_i!$ for each variable, and the topic-factors as $F_j(x) = \exp(M_j \log(1 + \exp(w_j^T x - \beta_j)))$. Note that the factors $F_j(x)$ are functions of all the variables $x$ jointly.

Figure 2. Factor graph representation of the marginalized RAP model. Square boxes indicate word and topic factors.

Figure 3. The hinge nonlinearity of a topic factor in the log domain.

The nonlinearity for a topic-factor in the log domain is precisely given by the hinge function (see figure 3). Hence, a topic factor does not contribute to the probability distribution (i.e. $F_j(x) = 1$) if the input count vector $x$ is not well aligned with the topic vector $w_j = \{W_{ij}\}_i$. A threshold $\beta_j$ determines what it means to be "well aligned": if $w_j^T x \ll \beta_j$ then the factor does not contribute. On the other hand, if $w_j^T x \gg \beta_j$ then the hinge function is linear, and hence those factors modulate the log Poisson rates as follows,

$$\log(\lambda_i) \rightarrow \log(\lambda_i) + \sum_j I(w_j^T x > \beta_j)\, M_j\, W_{ij} \qquad (7)$$

where $I(\cdot)$ is the indicator function. Clearly, this is an approximation, because there is in fact a soft transition between the two regimes of the hinge function¹. However, it clarifies the role of the weight matrix as a collection of topic vectors that form a new low dimensional basis for the latent representation. Count vectors get mapped into latent topic space by computing their coordinates in this basis as $h_j = w_j^T x$. The thresholds $\beta$ then decide on the necessary magnitude of these coordinates before they will have an impact on the Poisson rates. We note that there are in fact $2^K$ (with $K$ the total number of topics) different ways to modulate the log Poisson rates, because there are $2^K$ subsets of $\{1,\ldots,K\}$.

¹ This approximation is expected to be accurate when all parameter values {W, β} are large.

Empirically we have found that the best performance in terms of retrieval and classification is obtained when the angle between latent coordinates is used as a measure of similarity: $K(x_n, x_m) = \cos(h_n, h_m)$. This is not surprising, as we expect that the length of a document roughly scales the count vector linearly, assuming its topical content does not change.
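The fast inference claim above can be stated in a few lines: mapping a corpus into topic space is a single matrix product, and retrieval similarity is the cosine between latent coordinates. A minimal sketch, with illustrative names and shapes (documents as rows of a count matrix X):

```python
import numpy as np

def latent_map(X, W):
    """Map count vectors (rows of X, n_docs x V) to topic space, h_j = w_j^T x."""
    return X @ W                                      # n_docs x K coordinates

def cosine_kernel(H_query, H_corpus):
    """K(x_n, x_m) = cosine of the angle between latent coordinates h_n, h_m."""
    Hq = H_query / np.linalg.norm(H_query, axis=1, keepdims=True)
    Hc = H_corpus / np.linalg.norm(H_corpus, axis=1, keepdims=True)
    return Hq @ Hc.T                                  # n_query x n_corpus similarities
```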

2.2. RAP: Invariant Transformations

The marginal distribution $p(x)$ in equation (6) is given as a product of factors, where each factor follows the general form $\log(1+e^z)$. It is not hard to check that the identity $\log(1+e^z) = z + \log(1+e^{-z})$ holds, which has the consequence that we can change parameters without affecting the model. In other words, the parameters are not identifiable in the current parameterization. If we define an arbitrary subset $S$ of the integers $\{1,\ldots,K\}$, then the following transformations, when executed jointly, do not change the RAP model,

$$\log(\lambda_i) \rightarrow \log(\lambda_i) + \sum_{j \in S} M_j\, W_{ij} \qquad (8)$$

$$W_{ij} \rightarrow -W_{ij}, \qquad \beta_j \rightarrow -\beta_j \qquad \forall\, j \in S. \qquad (9)$$

Fortunately, it is easy to fix the spurious degrees of freedom by choosing for instance $w_j^T x > 0\ \forall j$, or alternatively, fixing the sign of $\beta_j\ \forall j$. Going one step further, we can apply the transformation to only half the hinge nonlinearity and obtain $\log(1+e^z) = \frac{1}{2} z + \frac{1}{2}\log(1+\cosh(z)) + \frac{1}{2}\log(2)$, where the constant term is absorbed in the normalization and the linear term is absorbed in the variable factors $F_i$. The new topic factors, $F_j = 1 + \cosh(z)$, are now symmetric around $z = 0$, implying that large inner products $w_j^T x$ of either sign (i.e. aligned or anti-aligned) result in a positive contribution of that factor to the probability of input $x$. We will call this type of basis vectors $\{w_j\}$ "prototypes", in contrast to "constraints", which contribute positive probability for an input $x$ when it is approximately orthogonal to them: $w_j^T x \approx 0$ (Welling et al., 2002).
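Since the invariance (8)-(9) is easy to get wrong by a sign, a small numerical check is worthwhile: the transformation changes the unnormalized log-probability of eq. (6) only by an x-independent constant, so log-probability differences between inputs are preserved. A sketch under illustrative assumptions (a shared scalar M_j = M for brevity):

```python
import numpy as np
from scipy.special import gammaln

def unnorm_logp(x, log_lam, W, beta, M):
    """Unnormalized log p(x) from eq. (6); gammaln(x+1) = log(x!)."""
    words = np.sum(log_lam * x - gammaln(x + 1))
    topics = np.sum(M * np.log1p(np.exp(W.T @ x - beta)))
    return words + topics

rng = np.random.default_rng(0)
V, K, M = 50, 5, 5
W = 0.1 * rng.standard_normal((V, K))
log_lam, beta = np.zeros(V), rng.standard_normal(K)
S = np.array([0, 2])                     # an arbitrary subset of topics

# Transformed parameters per eqs. (8)-(9).
log_lam2 = log_lam + (M * W[:, S]).sum(axis=1)
W2, beta2 = W.copy(), beta.copy()
W2[:, S], beta2[S] = -W2[:, S], -beta2[S]

x1, x2 = rng.poisson(1.0, V), rng.poisson(1.0, V)
d1 = unnorm_logp(x1, log_lam, W, beta, M) - unnorm_logp(x2, log_lam, W, beta, M)
d2 = unnorm_logp(x1, log_lam2, W2, beta2, M) - unnorm_logp(x2, log_lam2, W2, beta2, M)
assert np.allclose(d1, d2)               # the model is unchanged
```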
2.3. RAP: Learning

Parameter learning for the RAP model is performed by stochastic gradient ascent on the log-likelihood of the data. For large redundant datasets it is more efficient to estimate the required gradients on small batches rather than on the entire dataset. This is true in particular in the initial phase of learning, where there is consensus among the data on how to change the parameters. Towards convergence it is useful to either increase the batch-size or decrease the step-size in order to reduce the variance of the stochastic optimization. We have also included a momentum term to help speed up convergence. The derivatives of the log-likelihood of the RAP model are easy to write down (but hard to calculate in practice due to the intractable normalization constant),

$$\delta \log \lambda_i \propto \langle x_i \rangle_{\tilde{p}} - \langle x_i \rangle_{p_T}$$
$$\delta \beta_j \propto -M_j \big[\, \langle \sigma(w_j^T x - \beta_j) \rangle_{\tilde{p}} - \langle \sigma(w_j^T x - \beta_j) \rangle_{p_T} \,\big] \qquad (10)$$
$$\delta W_{ij} \propto M_j \big[\, \langle x_i\, \sigma(w_j^T x - \beta_j) \rangle_{\tilde{p}} - \langle x_i\, \sigma(w_j^T x - \beta_j) \rangle_{p_T} \,\big]$$

where $\tilde{p}$ denotes the empirical distribution² and $p_T$ the model distribution at the current values of the parameters. Note that our estimate of the gradients of $\beta$ and $W$ involves Rao-Blackwellisation over the latent variables: we replace a sample average $\frac{1}{N}\sum_{n=1}^N f(h_n)$ with $\frac{1}{N}\sum_{n=1}^N \langle f(h) \rangle_{p(h|x_n)}$. This is guaranteed to reduce the variance of our estimates (Casella & Robert, 1996).

It is in particular the negative terms in these equations that are hard to estimate. One approach is to run the Gibbs sampler defined by equations (1) and (2); note, however, that at every iteration of learning we would have to run this sampler to equilibrium. Instead, we follow the contrastive divergence (CD) paradigm, where for every data-case in the batch we initialize a separate Gibbs sampler at that data-case and run it for only a few steps. With $p_1$ (i.e. T = 1 in equation (10)) we denote the Gibbs chain³ that samples $h_n^0 \sim p(h \mid \text{data-case } n)$, then $x_n^1 \sim p(x \mid h_n^0)$. CD-learning simply boils down to obtaining samples from $p_1$ through this one-step Gibbs chain and computing noisy estimates of the gradient through equation (10). The averages in the first terms are computed as sample estimates over the data-cases in the batch, while the averages in the second terms are computed as sample averages over the samples from $p_1$. Although truncating the Markov chain will introduce a bias in the estimates of the gradients, the final bias of the parameter estimates has been shown empirically to be small for a number of applications (Carreira-Perpinan & Hinton, 2005). Moreover, the variance of the gradient estimates, and hence the variance of the final parameter estimates, is greatly reduced (albeit at the expense of introducing a bias).

Below is a summary of the CD-learning algorithm as described in the preceding text.

Algorithm 1: Contrastive Divergence Learning for RAP
Repeat until convergence:
1. For each data-case $x_n$ do:
   1a. Sample the hidden units given the data-case clamped to the visible units from $h_n^0 \sim \prod_j p(h_j \mid x_n)$ using Eqn. (2).
   1b. Resample the data-case given the sampled values of the hidden units from $x_n^1 \sim \prod_i p(x_i \mid h_n^0)$ given in Eqn. (1).
2. Compute the data averages and sample averages in Eqn. (10) with T = 1.
3. Perform gradient updates according to Eqn. (10) with T = 1.

We have also implemented a mean field learning algorithm where Gibbs sampling updates are replaced by mean field updates (Welling & Hinton, 2001), but we found the results to be significantly inferior to the sampling based algorithm.

² The average over the empirical distribution is simply given by the sample average over the data-cases.
³ Note that sampling from the equilibrium distribution would be denoted as $p_\infty$, i.e. T = ∞.
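A compact NumPy sketch of one pass of Algorithm 1 with the gradients of eq. (10), using the Rao-Blackwellised averages $E[h_j|x] = M_j\,\sigma(w_j^T x - \beta_j)$ in place of sampled hidden units when computing the averages. Learning rate, initialization and shapes are illustrative assumptions, not the released Matlab code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(X, log_lam, W, beta, M, lr=1e-3):
    """X: minibatch of count vectors (n x V). Returns updated parameters."""
    # Step 1a: sample hidden units given the clamped data (positive phase).
    H0 = rng.binomial(M, sigmoid(X @ W - beta))
    # Step 1b: resample the visible units given the hidden samples.
    X1 = rng.poisson(np.exp(log_lam + H0 @ W.T))
    # Step 2: data averages vs one-step sample averages (T = 1 in eq. (10)),
    # using E[h|x] rather than sampled h (Rao-Blackwellisation).
    q0, q1 = sigmoid(X @ W - beta), sigmoid(X1 @ W - beta)
    g_lam = X.mean(0) - X1.mean(0)
    g_beta = -M * (q0.mean(0) - q1.mean(0))
    g_W = M * (X.T @ q0 - X1.T @ q1) / X.shape[0]
    # Step 3: gradient ascent on the log-likelihood.
    return log_lam + lr * g_lam, W + lr * g_W, beta + lr * g_beta

V, K, M = 100, 10, 5
X = rng.poisson(1.0, size=(20, V)).astype(float)
log_lam = np.log(X.mean(0) + 1e-3)       # initialize rates near the data means
W, beta = 0.01 * rng.standard_normal((V, K)), np.zeros(K)
log_lam, W, beta = cd1_step(X, log_lam, W, beta, M)
```

In the experiments below this step is repeated over mini-batches, with a momentum term added to the updates.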

3. Experiments: Document Retrieval

In this and the next section we describe how the latent structure of the RAP model can be used for two different tasks, namely document retrieval and object recognition⁴. We compare its performance against two other latent variable models: PLSI and LSI. The performance of LDA never significantly surpassed PLSI in our experiments (in fact we often found inferior results), which is the reason we left it out.

⁴ Matlab code for training the RAP model and the preprocessed text data can be obtained from pgehler.

In document retrieval the goal is to match a given query, represented by a word count vector, with a subset of a text corpus, where the retrieved subset should resemble the query as closely as possible. A latent variable model can be turned into a document retrieval algorithm through the following four steps: 1) estimate the parameters of the model on a training corpus, 2) map all training and query documents into the dimensionally reduced latent space, 3) compute similarities between queries and training documents based on the latent representation, 4) retrieve the k most similar training documents from the corpus for every query. We have used the cosine similarity measure in our experiments. One can also compute similarity in (tf-idf reweighted) word space directly, which we use in our experiments as a baseline.

3.1. Text Corpora

In these experiments we used three well known datasets: Reuters-21578, Ohsumed, and 20-Newsgroups⁵. We used the BOW package and its front-end RAINBOW to preprocess the data (McCallum, 1996). All documents were stemmed with the Porter stemmer, and a list of stop-words as well as all words with fewer than three characters were removed. Additionally, for the Reuters and Ohsumed datasets all words were removed which occur only once in the training data or in only a single training document. For the 20-Newsgroups dataset the words with the highest average mutual information with the class variable were extracted. This preprocessing left the Newsgroups dataset with 20 classes, and the Reuters dataset with 91 classes (we also used another split of the data with 115 classes but found very similar results). The Ohsumed dataset consists of 23 classes, where each data-point may belong to more than one class. Each corpus was split into a training set and a test set whose items are used as query documents during the performance evaluation. For the Reuters dataset the predefined ModApte split of the data into training and 4024 test documents was used. Ohsumed is split into 33% test and 67% training data, while in the newsgroup corpus we held out 10% for testing purposes.

⁵ These corpora are available online; the original sources and specifics concerning these sets can be found there and are omitted here for brevity.

Figure 4. RPC plot on a log-scale of the 20-Newsgroups dataset for various models (RAP, 100 dimensions; PLSI, 250 dimensions; LSI, 175 dimensions; TF-IDF). As a baseline the retrieval results with tf-idf reweighted word-counts are shown. The number of topics for each model was chosen by optimizing 1-NN classification performance on the test set, corresponding to the average precision for retrieving a single document (left-most marker).

3.2. Results

Learning of the RAP model was done with a small learning rate and a momentum term for 200k iterations, using mini-batches of 100 training samples per iteration. The latent representation of any document x is then computed by the matrix multiplication $W^T x$. LSI is computed by performing an SVD decomposition on the tf-idf⁶ reweighted word counts. PLSI models are trained using the tempered version of the EM algorithm (Hofmann, 1999): 10% of the training data was held out for validation purposes, the temperature parameter β was initialized at 1, and whenever the log-likelihood on the validation data decreased, β was decreased by 0.025 until no more improvement was observed. The latent representation is defined by the posterior distribution over the topics z: P(z|d).
For a query document q, P(z|q) was computed using 25 iterations of the folding-in heuristic (Hofmann, 1999). For comparison we also show the baseline results, which are obtained by computing the similarity of tf-idf reweighted documents in word space. As performance measure we use the recall precision curve (RPC), where

$$\mathrm{Recall} = \frac{\#(\text{correctly retrieved documents})}{\#(\text{relevant documents in the corpus})} \qquad (11)$$

$$\mathrm{Precision} = \frac{\#(\text{correctly retrieved documents})}{\#(\text{retrieved documents})} \qquad (12)$$

For a given test document, all training documents were ranked in terms of their cosine similarity. Then recall and precision values were computed for 1, 2, 4, 8, ... retrieved documents.

⁶ $\text{tf-idf}(d, w) = \frac{n(d,w)}{\sqrt{\sum_{w'} n(d,w')^2}} \log \frac{\#\text{docs in the corpus}}{\#\text{docs with word } w}$, where $n(d, w)$ is the number of occurrences of word $w$ in document $d$.
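For concreteness, one point of the RPC can be computed as follows: rank the training corpus by cosine similarity to the query's latent coordinates, then apply eqs. (11) and (12) at a cutoff k. The label handling is an illustrative assumption (a retrieved document counts as correct when it shares the query's class):

```python
import numpy as np

def precision_recall_at_k(h_query, H_train, labels_train, query_label, k):
    """One RPC point for a single query, eqs. (11)-(12)."""
    sims = H_train @ h_query / (
        np.linalg.norm(H_train, axis=1) * np.linalg.norm(h_query) + 1e-12)
    top_k = np.argsort(-sims)[:k]                     # k most similar documents
    n_correct = np.sum(labels_train[top_k] == query_label)
    n_relevant = np.sum(labels_train == query_label)  # relevant docs in corpus
    return n_correct / k, n_correct / n_relevant      # precision, recall
```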

Figure 5. Same as figure 4 for the Ohsumed dataset (RAP, 125 dimensions; PLSI, 225 dimensions; LSI, 125 dimensions; TF-IDF).

Figure 6. Same as figure 4 for the Reuters dataset (RAP, 200 dimensions; PLSI, 125 dimensions; LSI, 100 dimensions; TF-IDF).

Figure 7. Area under the RPC as a function of the latent dimensionality on the Reuters dataset (RAP, PLSI, LSI).

The RPC curves of all models are plotted in figures 4, 5 and 6, where the recall and precision values are averaged over the entire test-set. In figure 7 we show the area under the RPC (AUC) as a function of the number of topics. The leftmost point on an RPC, i.e. the average precision for retrieving a single document, corresponds to the 1-NN classification performance using the cosine distance. The latent dimensionalities of the models shown in the plots were selected to be the best according to this measure, where we scanned the number of topics from 25 to 250 in increments of 25. The RAP model yields the best retrieval performance on all datasets in terms of AUC, and scores only slightly worse than LSI on Ohsumed in terms of 1-NN classification performance. According to figure 7, the RAP model also seems to suffer less from overfitting as the number of topics increases.

4. Experiments: Object Recognition

Latent models have recently been applied to both object (Fergus et al., 2005) and scene (Li & Perona, 2005) recognition. In this section we compare the performance of the RAP model in the visual object recognition domain. We followed these steps in our visual experiments: (1) interest point detection and extraction, (2) vocabulary generation, (3) latent analysis, (4) kernel classification on latent representations. We briefly describe these steps below.

Images were initially normalized to be the same size. Interesting regions of images (interest points) were detected using three different feature detectors: multi-scale Harris, multi-scale Hessian, and entropy-based (Kadir & Brady, 2001). Grey-scale patches were extracted from images based on both the scale and location indicated by the different interest point detectors. All patches from all detectors were intensity normalized, resized to a fixed size and subsequently converted into vectors. We performed K-means clustering on the patches in order to discretize feature space and create a visual vocabulary of words; the number of clusters was left as a free parameter of the system and was varied across experiments. Each image contains a set of interest point detections. An interest point was assigned to the visual cluster (word) closest in a Euclidean sense to that feature, as sketched below. The cumulative counts over all clusters were used as feature vectors to represent each image, such that each image was represented by a vector of dimensionality equal to the size of the visual vocabulary. As in the document experiments described above, we do not utilize any spatial information between the extracted patches.
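The vector-quantization step just described amounts to a nearest-centroid assignment followed by a histogram. A minimal sketch, assuming patch descriptors have already been extracted and the K-means centroids already trained (names illustrative):

```python
import numpy as np

def bag_of_visual_words(patches, centroids):
    """patches: (n_patches x d) descriptors; centroids: (n_clusters x d)."""
    # Squared Euclidean distance from every patch to every centroid.
    d2 = ((patches[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                        # nearest visual word per patch
    counts = np.bincount(words, minlength=len(centroids))
    return counts                                    # one count vector per image
```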

We compared the three different latent algorithms described above: LSI, PLSI and RAP. The latent representations for each image were used to train SVM classifiers using the LIBSVM⁷ package with a linear kernel. Ten-fold cross-validation was used to find the optimal values of the SVM hyper-parameters. We used a one-vs-one training paradigm for the multi-class datasets. Feature dimensions were normalized to zero mean and unit standard deviation.

⁷ Available at: cjlin/libsvm/.

We conducted experiments on both the Caltech4 and the challenging Caltech101 datasets (figure 8 illustrates representative examples of some categories); both datasets are publicly available. The Caltech4 contains a total of 4 object categories and is regarded as relatively easy to classify due to stereotyped poses and drastic visual dissimilarity between classes. The Caltech101 contains a total of 101 object categories and is more challenging due to the sheer number of object categories. 15 training images and a maximum of 50 testing images were used for all experiments. For the Caltech101, the class Faces-Easy was removed. Performance results reported correspond to the average classification performance across all categories.

Figure 8. Example images used for the object recognition experiments. (Top row) Example images from the Caltech4. Classes: Airplanes, Motorcycles, Faces, Leopards. (Bottom two rows) Example images of four random classes from the Caltech101, two images per class, to give an indication of the within-class variance. Classes: Buddha, Chair, Watch, Brain. Note that the Caltech101 includes the Caltech4 classes.

Figure 9. Caltech4 performance comparison. All experiments averaged 35 times. Baseline (chance) performance is 25%. Plotted is the test performance as a function of the number of latent dimensions, with 125 clusters and using a linear kernel. Performance differences between RAP and PLSI/LSI were significant for all numbers of latent dimensions (p < 0.05).

Figures 9, 10 and 11 show comparisons between RAP and LSI/PLSI. Error bars are not shown because the variation from one split of the data to another was larger than the variation between models. Instead we used the two-sided paired sign test to determine whether the median difference in performance is significantly different from zero at a level of α = 0.05. We conclude that RAP almost always significantly outperforms LSI and PLSI.

5. Discussion

The experiments provide clear evidence for the claim that harmonium models, and in particular RAP, can be efficiently and successfully trained on relatively large datasets. Relative to popular existing methods such as LSI and PLSI, the latent representations generated by RAP are superior in two application domains: document retrieval and object classification. Moreover, mapping test-data into latent space is orders of magnitude faster for RAP (through a simple matrix multiplication) than for PLSI (through the iterative folding-in heuristic).

A natural next step is to train hierarchical models, where dependencies between topics are modelled with a new layer of meta-topics. Initial experiments in this direction have not shown improved retrieval or classification performance. However, recent work by Hinton et al. (2006) indicates that deep hierarchies can be a promising direction for improvement.

The choice of a conditional Poisson distribution may not be optimal, due to the effect that words which have been used already become more likely to be used than others, i.e. their frequency grows with document length. This calls for distributions with longer tails, such as the negative binomial distribution (Airoldi et al., 2005). The Poisson distribution in RAP can easily be interchanged with a negative binomial incorporating this effect. A Bayesian approach for harmonium models seems an important topic for future investigation.
Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No.

Figure 10. Same experiment as in figure 9, but plotting performance as a function of the size of the vocabulary using 35 latent dimensions. Performance differences between RAP and PLSI/LSI were significant for all vocabulary sizes (p < 0.05).

Figure 11. Caltech101 performance comparison using 250 clusters. All experiments averaged 7 times. Baseline (chance) performance is 1% for this task. Same plot as in figure 9. Performance differences between RAP and PLSI were significant for 75 and 125 latent dimensions (p < 0.05).

References

Airoldi, E., Cohen, W., & Fienberg, S. (2005). Bayesian methods for frequent terms in text. Proc. of the CSNA & INTERFACE Annual Meetings.

Blei, D. M., & Jordan, M. I. (2004). Variational inference for Dirichlet process mixtures. Bayesian Analysis, 1.

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3.

Buntine, W. (2002). Variational extensions to EM and multinomial PCA. Lecture Notes in Computer Science. Helsinki, Finland: Springer.

Buntine, W., & Jakulin, A. (2004). Applying discrete PCA in data analysis. Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. Banff, Canada.

Carreira-Perpinan, M., & Hinton, G. (2005). On contrastive divergence learning. Tenth International Workshop on Artificial Intelligence and Statistics. Barbados.

Casella, G., & Robert, C. (1996). Rao-Blackwellisation of sampling schemes. Biometrika, 83(1).

Deerwester, S., Dumais, S., Landauer, T., Furnas, G., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society of Information Science, 41.

Fergus, R., Fei-Fei, L., Perona, P., & Zisserman, A. (2005). Learning object categories from Google's image search. Proceedings of the International Conference on Computer Vision.

Girolami, M., & Kaban, A. (2003). On an equivalence between PLSI and LDA. Proceedings of SIGIR.

Griffiths, T., & Steyvers, M. (2002). A probabilistic approach to semantic representation. Proceedings of the 24th Annual Conference of the Cognitive Science Society.

Hinton, G. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14.

Hinton, G., Osindero, S., & Teh, Y. (2006). A fast learning algorithm for deep belief networks. Neural Computation, to appear.

Hofmann, T. (1999). Probabilistic latent semantic analysis. Proc. of Uncertainty in Artificial Intelligence, UAI'99. Stockholm.

Kadir, T., & Brady, M. (2001). Saliency, scale and image description. Int. J. Comput. Vision, 45.

Lee, D., & Seung, H. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401.

Li, F., & Perona, P. (2005). A Bayesian hierarchical model for learning natural scene categories. Proceedings of the Conference on Computer Vision and Pattern Recognition.

McCallum, A. (1996). Bow: A toolkit for statistical language modeling, text retrieval, classification and clustering. mccallum/bow.

Minka, T., & Lafferty, J. (2002). Expectation-propagation for the generative aspect model. Proc. of the 18th Annual Conference on Uncertainty in Artificial Intelligence.

Olshausen, B., & Field, D. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37.

Roweis, S. (1997). EM algorithms for PCA and SPCA. Neural Information Processing Systems.

Welling, M., & Hinton, G. (2001). A new learning algorithm for mean field Boltzmann machines. Proc. of the Int'l Conf. on Artificial Neural Networks. Madrid, Spain.
Welling, M., Hinton, G., & Osindero, S. (2002). Learning sparse topographic representations with products of Student-t distributions. Neural Information Processing Systems.

Welling, M., Rosen-Zvi, M., & Hinton, G. (2004). Exponential family harmoniums with an application to information retrieval. Neural Information Processing Systems.

Xing, E., Yan, R., & Hauptmann, A. (2005). Mining associated text and images with dual-wing harmoniums. Proc. of the Conf. on Uncertainty in Artificial Intelligence.


More information

A Robust Method for Estimating the Fundamental Matrix

A Robust Method for Estimating the Fundamental Matrix Proc. VIIth Dgtal Image Computng: Technques and Applcatons, Sun C., Talbot H., Ourseln S. and Adraansen T. (Eds.), 0- Dec. 003, Sydney A Robust Method for Estmatng the Fundamental Matrx C.L. Feng and Y.S.

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Learning a Class-Specific Dictionary for Facial Expression Recognition

Learning a Class-Specific Dictionary for Facial Expression Recognition BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 4 Sofa 016 Prnt ISSN: 1311-970; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-016-0067 Learnng a Class-Specfc Dctonary for

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap Int. Journal of Math. Analyss, Vol. 8, 4, no. 5, 7-7 HIKARI Ltd, www.m-hkar.com http://dx.do.org/.988/jma.4.494 Emprcal Dstrbutons of Parameter Estmates n Bnary Logstc Regresson Usng Bootstrap Anwar Ftranto*

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Non-Negative Matrix Factorization and Support Vector Data Description Based One Class Classification

Non-Negative Matrix Factorization and Support Vector Data Description Based One Class Classification IJCSI Internatonal Journal of Computer Scence Issues, Vol. 9, Issue 5, No, September 01 ISSN (Onlne): 1694-0814 www.ijcsi.org 36 Non-Negatve Matrx Factorzaton and Support Vector Data Descrpton Based One

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Modeling Waveform Shapes with Random Effects Segmental Hidden Markov Models

Modeling Waveform Shapes with Random Effects Segmental Hidden Markov Models Modelng Waveform Shapes wth Random Effects Segmental Hdden Markov Models Seyoung Km, Padhrac Smyth Department of Computer Scence Unversty of Calforna, Irvne CA 9697-345 {sykm,smyth}@cs.uc.edu Abstract

More information

Signature and Lexicon Pruning Techniques

Signature and Lexicon Pruning Techniques Sgnature and Lexcon Prunng Technques Srnvas Palla, Hansheng Le, Venu Govndaraju Centre for Unfed Bometrcs and Sensors Unversty at Buffalo {spalla2, hle, govnd}@cedar.buffalo.edu Abstract Handwrtten word

More information

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros. Fttng & Matchng Lecture 4 Prof. Bregler Sldes from: S. Lazebnk, S. Setz, M. Pollefeys, A. Effros. How do we buld panorama? We need to match (algn) mages Matchng wth Features Detect feature ponts n both

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Expert Systems with Applications

Expert Systems with Applications Expert Systems wth Applcatons 37 2010) 4403 4412 Contents lsts avalable at ScenceDrect Expert Systems wth Applcatons ournal homepage: www.elsever.com/locate/eswa A novel dual wng harmonum model aded by

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information