A Hidden Markov Model Variant for Sequence Classification
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence

A Hidden Markov Model Variant for Sequence Classification

Sam Blasiak and Huzefa Rangwala
Computer Science, George Mason University
sblasiak@gmu.edu, rangwala@cs.gmu.edu

Abstract

Sequence classification is central to many practical problems within machine learning. Distance metrics between arbitrary pairs of sequences can be hard to define because sequences can vary in length and the information contained in the order of sequence elements is lost when standard metrics such as Euclidean distance are applied. We present a scheme that employs a Hidden Markov Model variant to produce a set of fixed-length description vectors from a set of sequences. We then define three inference algorithms, a Baum-Welch variant, a Gibbs sampling algorithm, and a variational algorithm, to infer model parameters. Finally, we show experimentally that the fixed-length representation produced by these inference methods is useful for classifying sequences of amino acids into structural classes.

1 Introduction

The need to operate on sequence data is prevalent in a variety of real-world applications, ranging from protein/DNA classification, speech recognition, and intrusion detection to text classification. Sequence data can be distinguished from the more typical vector representation in that the length of sequences within a dataset can vary and the order of symbols within a sequence carries meaning. For sequence classification, a variety of strategies, depending on the problem type, can be used to map sequences to a representation that can be handled by traditional classifiers. A simple technique involves selecting a fixed number of elements from the sequence and then using those elements as a fixed-length vector in the classification engine. In another technique, a small subsequence length, $l$, is selected, and a size-$M^l$ vector is constructed containing the counts of all length-$l$ subsequences from the original sequence. This vector can then be used for classification [Leslie et al., 2002].
A third method for classifying sequence data requires only a positive definite mapping defined over pairs of sequences rather than any direct mapping of sequences to vectors. This strategy, known as the kernel trick, is often used in conjunction with support vector machines (SVMs) and allows a wide variety of sequence similarity measurements to be employed.

Hidden Markov Models (HMMs) [Rabiner and Juang, 1986; Eddy, 1998] have a rich history in sequence data modeling (in speech recognition and bioinformatics applications) for the purposes of classification, segmentation, and clustering. HMMs' success is based on the convenience of their simplifying assumptions. The space of probable sequences is constrained by assuming only pairwise dependencies over hidden states. Pairwise dependencies also allow for a class of efficient inference algorithms whose critical steps build on the Forward-Backward algorithm [Rabiner and Juang, 1986].

We present an HMM variant over a set of sequences, with one transition matrix per sequence, as a novel alternative for handling sequence data. After training, the per-sequence transition matrices of the HMM variant are used as fixed-length vector representations for each associated sequence. The HMM variant is also similar to a number of topic models, and we describe it in the context of Latent Dirichlet Allocation [Blei et al., 2003]. We then describe three methods to infer the parameters of our HMM variant, explore connections between these methods, and provide rationale for the classification behavior of the parameters derived through each. We perform a comprehensive set of experiments, evaluating the performance of our method in conjunction with support vector machines, to classify sequences of amino acids into structural classes (the fold recognition and remote homology detection problems [Rangwala and Karypis, 2006]). The combination of these methods, their interpretations, and their connections to prior work constitutes a new twist on classic ways of understanding sequence data that we believe is valuable to anyone approaching a sequence classification task.

Funding: NSF III.
2 Problem Statement

Given a set of $N$ sequences, we would like to find a set of fixed-length vectors, $A_{1 \ldots N}$, that, when used as input to a function $f(a)$, maximize the probability of reconstructing the original set of sequences. Under our scheme,
$f(a)$ is a Hidden Markov Model variant with one transition matrix, $A_n$, assigned to each sequence, and a single emissions matrix, $B$, and start probability vector, $a$, for the entire set of sequences. By maximizing the likelihood of the set of sequences under the HMM variant model, we will also find the set of transition matrices that best represent our set of sequences. We further postulate that this maximum likelihood representation will achieve good classification results if each sequence is later associated with a meaningful label.

2.1 Model Description

We define a Hidden Markov Model variant that represents a set of sequences. Each sequence is associated with a separate transition matrix, while the emission matrix and initial state transition vector are shared across all sequences. We use the value of each transition matrix as a fixed-length representation of the sequence. We define the parameters and notation for the model in Table 1.

N      - the number of sequences
T_n    - the length of sequence n
K      - the number of hidden symbols
M      - the number of observed symbols
a_i    - start state probabilities, where i is indexed by the value of the first hidden state
A_nij  - transition probabilities, where n is an index of a training sequence, i the originating hidden state, and j the destination hidden state
B_im   - emission probabilities, where i indicates the hidden state and m the observed symbol associated with the hidden state
z_nt   - the hidden state at position t in sequence n
x_nt   - the observed symbol at position t in sequence n

Table 1: HMM variant model parameters

The joint probability of the model is shown below:

(1) $p(x, z \mid a, A, B) = \prod_{n=1}^{N} a_{z_{n1}} \prod_{t=2}^{T_n} \mathbf{A}_{n\,z_{n,t-1}\,z_{nt}} \prod_{t=1}^{T_n} B_{z_{nt}\,x_{nt}}$

This differs from the standard hidden Markov model only in the addition of a transition matrix, $A_n$ (highlighted in bold in Equation 1), for each sequence, where the index $n$ indicates a sequence in the training set. Under the standard HMM, a single transition matrix, $A$, would be used for all sequences. To regularize the model, we further augment the basic HMM by placing Dirichlet priors on $a$, each row of $A$, and each row of $B$.
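As a concrete illustration of Equation 1, the joint probability with per-sequence transition matrices can be sketched in a few lines of Python. This is a minimal sketch using plain lists; the function name and data layout are our own, not from the paper:

```python
import math

def joint_log_prob(xs, zs, a, A, B):
    """Log joint probability of observed sequences xs with hidden paths zs
    under the HMM variant: one transition matrix A[n] per sequence n,
    shared start vector a and shared emission matrix B (Equation 1)."""
    lp = 0.0
    for n, (x, z) in enumerate(zip(xs, zs)):
        # start state and first emission
        lp += math.log(a[z[0]]) + math.log(B[z[0]][x[0]])
        for t in range(1, len(x)):
            lp += math.log(A[n][z[t - 1]][z[t]])  # per-sequence transition
            lp += math.log(B[z[t]][x[t]])         # shared emission
    return lp
```

Replacing `A[n]` with a single shared matrix recovers the standard HMM joint probability.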
The prior parameters are the uniform Dirichlet parameters $\gamma$, $\alpha$, and $\beta$ for $a$, $A$, and $B$ respectively. The probability of the model with priors is shown below, where the prior probabilities are the first three terms in the product and take the form $\mathrm{Dir}(x; a, K) = \frac{\Gamma(Ka)}{\Gamma(a)^K} \prod_{i=1}^{K} x_i^{a-1}$:

(2) $p(x, z, a, A, B \mid \alpha, \beta, \gamma) = \left(\frac{\Gamma(K\gamma)}{\Gamma(\gamma)^K} \prod_i a_i^{\gamma-1}\right) \left(\prod_{n,i} \frac{\Gamma(K\alpha)}{\Gamma(\alpha)^K} \prod_j A_{nij}^{\alpha-1}\right) \left(\prod_i \frac{\Gamma(M\beta)}{\Gamma(\beta)^M} \prod_m B_{im}^{\beta-1}\right) \prod_{n=1}^{N} a_{z_{n1}} \prod_{t=2}^{T_n} A_{n\,z_{n,t-1}\,z_{nt}} \prod_{t=1}^{T_n} B_{z_{nt}\,x_{nt}}$

One potential difficulty that could be expected in classifying simple HMMs by transition matrix is that the probability of a sequence under an HMM does not change under a permutation of the hidden states. This problem is avoided when we force each sequence to share an emissions matrix, which locks the meaning of each transition matrix row to a particular emission distribution. If the emission matrix were not shared, then two HMMs with permuted hidden states could have transition matrices with large Euclidean distances. For instance, two HMMs whose parameters $(A_1, B_1)$ and $(A_2, B_2)$ differ only by a permutation of the hidden states have different transition matrices, but the probability of an observed sequence is the same under each; nevertheless, the Euclidean distance between the two transition matrices, $A_1$ and $A_2$, is large.

3 Background

3.1 Mixtures of HMMs

Smyth introduces a mixture of HMMs in [Smyth, 1997] and presents an initialization technique that is similar to our model in that an individual HMM is learned for each sequence, but differs from our model in that the emission matrices are not shared between HMMs. In [Smyth, 1997], these initial $N$ models are used to compute the set of all pairwise distances between sequences, defined as the symmetrized log-likelihood of each element of the pair under the other's respective model. Clusters are then computed from this distance matrix and used to initialize a set of $K < N$ HMMs, where each sequence is associated with one of $K$ labels.
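To see the permutation issue concretely, consider a hypothetical two-state HMM and a copy of it with the hidden states swapped. Both assign identical probability to any observed sequence, yet their transition matrices are far apart in Euclidean distance. The numeric values below are illustrative, not the ones from the paper's example:

```python
def seq_prob(x, a, A, B):
    """Probability of an observed sequence via the forward recursion."""
    K = len(a)
    f = [a[i] * B[i][x[0]] for i in range(K)]
    for t in range(1, len(x)):
        f = [sum(f[j] * A[j][i] for j in range(K)) * B[i][x[t]]
             for i in range(K)]
    return sum(f)

# Hypothetical 2-state HMM and its state-permuted copy (states 0 and 1 swapped).
a1, A1, B1 = [0.6, 0.4], [[0.9, 0.1], [0.3, 0.7]], [[0.8, 0.2], [0.1, 0.9]]
a2, A2, B2 = [0.4, 0.6], [[0.7, 0.3], [0.1, 0.9]], [[0.1, 0.9], [0.8, 0.2]]

x = [0, 1, 1, 0]
p1, p2 = seq_prob(x, a1, A1, B1), seq_prob(x, a2, A2, B2)   # equal
dist = sum((A1[i][j] - A2[i][j]) ** 2
           for i in range(2) for j in range(2)) ** 0.5       # large (0.4)
```

With a shared emission matrix, as in the HMM variant, the permuted copy is no longer an equivalent parameterization, so each row of the transition matrix keeps a fixed meaning.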
Smyth notes that while the log probability of a sequence under an HMM is an intuitive distance measure between sequences, it is not intuitive how the parameters of the model are meaningful in terms of defining a distance between sequences. In this research, we demonstrate experimentally that the transition matrix of our model is useful for sequence classification when combined with standard distance metrics and tools.

3.2 Topic Models

Simpler precursors of LDA [Blei et al., 2003] and pLSI [Hofmann, 1999], which represent an entire corpus of documents with a single topic distribution vector, are very similar to the basic Hidden Markov Model, which assigns a single transition matrix to the entire set of sequences being modeled. To extend the HMM to a pLSI analogue, all that is needed is to split the single transition matrix into a per-sequence transition matrix. To extend this model to an LDA analogue, we must go a step further and attach Dirichlet priors to the transition matrices, as in our model.

Inference of the LDA model (Figure 1a) on a corpus of documents learns a matrix of document-topic probabilities. A row of this matrix, sometimes described as a mixed-membership vector, can be viewed as a measurement of how a given document is composed from the set of topics. In our HMM variant (Figure 1b), a single transition matrix, $A_n$, can be thought of as the analogue to a document-topic matrix row and can be viewed as a measurement of how a sequence is composed of pairs of adjacent symbols. The LDA model also includes a topic-word matrix, which indicates the probability of a word given a topic assignment. This matrix has the same meaning as the emissions matrix, $B$, in the HMM variant.

The Fisher kernel [Jaakkola and Haussler, 1999] and the Probability Product Kernel (PPK) [Jebara et al., 2004] are principled methods that allow probabilistic models to be incorporated into SVM kernels. The HMM variant is similar to these methods in that it uses latent information from a generative model as input to a discriminative classifier. It differs from these methods, however, both in which portions of the generative model are incorporated into the discriminative classifier and in the assumptions about how differences in generating distributions affect comparisons between training examples.

4 Learning the Model Parameters

4.1 Baum-Welch

A well-known method for learning HMM model parameters is the Baum-Welch algorithm. The Baum-Welch algorithm is an expectation maximization algorithm for the standard HMM model, and the basic algorithm is easily modified to learn the multiple transition matrices of our variant. The parameter updates shown below converge to a maximum a posteriori (MAP) estimate of $p(z, a, A, B \mid x, \gamma, \alpha, \beta)$ [Rabiner and Juang, 1986]:

(3) $a_i \propto \sum_n f_n(1)_i\, b_n(1)_i + \gamma - 1$

(4) $A^{(\mathrm{new})}_{nij} \propto \sum_{t=2}^{T_n} f_n(t-1)_i\, A_{nij}\, B_{j x_{nt}}\, b_n(t)_j + \alpha - 1$

(5) $B^{(\mathrm{new})}_{im} \propto \sum_n \sum_{t : x_{nt} = m} f_n(t)_i\, b_n(t)_i + \beta - 1$

where $f$ and $b$ are the forward and backward recursions defined below:

(6) $f_n(t)_i = \begin{cases} B_{i x_{nt}} \sum_j f_n(t-1)_j\, A_{nji}, & t > 1 \\ a_i\, B_{i x_{n1}}, & t = 1 \end{cases}$

Figure 1: Plate diagrams of (a) the LDA model, expanded to show each word separately, and (b) the HMM variant.
The model parameters in the LDA model are defined as follows: $K$, the number of topics; $\phi_k$, a vector of word probabilities given topic $k$; $\beta$, the parameters of the Dirichlet prior of $\phi_k$; $\theta_n$, a vector of topic probabilities in document $n$; and $\alpha$, the parameters of the Dirichlet prior of $\theta_n$. A row of the matrix $B$ in the HMM variant has exactly the same meaning as a topic-word vector, $\phi_k$, in the LDA model.

(7) $b_n(t)_i = \begin{cases} \sum_j A_{nij}\, B_{j x_{n,t+1}}\, b_n(t+1)_j, & t < T_n \\ \frac{1}{K}, & t = T_n \end{cases}$

The complexity of the Baum-Welch-like algorithm for our variant is identical to the complexity of Baum-Welch for the standard HMM. The update for $A_{ij}$ in the original HMM involves summing over $\sum_n T_n$ terms, while the update for a single $A_{nij}$ is a sum over $T_n$ terms, making the total number of terms over all the $A_n$'s in our variant, $\sum_n T_n$, the same as in the original algorithm.

4.2 Gibbs Sampling

Two Gibbs sampling schemes are commonly used to infer Hidden Markov Model parameters [Scott, 2002]. Unlike the Baum-Welch algorithm, which returns a MAP estimate of the parameters, these sampling schemes allow the expectation of the parameters to be computed over the posterior distribution $p(z, a, A, B \mid x, \gamma, \alpha, \beta)$. In the Direct Gibbs sampler (DG), hidden states and parameters are initially chosen at random, then new hidden states are sampled using the current set of parameters:

(8) $p(z_t^{(\mathrm{new})} = i \mid z_{t-1}, z_{t+1}) \propto A_{z_{t-1}\,i}\, B_{i x_t}\, A_{i\,z_{t+1}}$

In the Forward-Backward sampler (FB), the initial settings and parameter updates are the same as in the DG scheme, but the hidden states are sampled in order from $T_n$ down to 1 using values from the forward recursion.
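The forward and backward recursions of Equations 6 and 7 translate directly into code. The sketch below uses our own names and plain Python lists, and omits the rescaling that a practical implementation would need to avoid numerical underflow on long sequences:

```python
def forward(x, a, A, B):
    """f[t][i] = p(x_1..x_t, z_t = i) for one sequence with its own
    transition matrix A (Equation 6)."""
    K = len(a)
    f = [[a[i] * B[i][x[0]] for i in range(K)]]
    for t in range(1, len(x)):
        f.append([B[i][x[t]] * sum(f[-1][j] * A[j][i] for j in range(K))
                  for i in range(K)])
    return f

def backward(x, A, B):
    """b[t][i] per Equation 7, with the 1/K base case at t = T."""
    K = len(A)
    b = [[1.0 / K] * K]
    for t in range(len(x) - 2, -1, -1):
        b.insert(0, [sum(A[i][j] * B[j][x[t + 1]] * b[0][j] for j in range(K))
                     for i in range(K)])
    return b
```

A useful sanity check is that $\sum_i f_n(t)_i\, b_n(t)_i$ is the same constant for every position $t$ of a sequence.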
Specifically, each hidden state $z_{nt}$ is sampled given $z_{n,t+1} = j$ from a multinomial with parameters

(9) $p(z_{n T_n}^{(\mathrm{new})} = i \mid x_{n,1:T_n}) \propto f_n(T_n)_i$

(10) $p(z_{nt}^{(\mathrm{new})} = i \mid x_{n,1:T_n}, z_{n,t+1}^{(\mathrm{new})} = j) = p(z_{nt}^{(\mathrm{new})} = i \mid x_{n,1:t}, z_{n,t+1}^{(\mathrm{new})} = j) \propto f_n(t)_i\, A_{nij}, \quad t < T_n$

In both algorithms, after the hidden states are sampled, parameters are sampled from Dirichlet conditional distributions, shown for $A$ below, where $I(\omega) = 1$ if $\omega$ is true and 0 otherwise:

(11) $p(A_{ni\cdot} \mid z_n, \alpha) = \mathrm{Dir}\!\left(\sum_{t=2}^{T_n} I(z_{n,t-1} = i)\, I(z_{nt} = j) + \alpha\right)$

The FB sampler has been shown to mix more quickly than the DG sampler, especially in cases where adjacent hidden states are highly correlated [Scott, 2002]. We therefore use the FB sampler in our implementation.
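One pass of the FB sampler's state-sampling step (Equations 9 and 10) can be sketched as forward filtering followed by backward sampling. This is a minimal illustration under our own naming conventions, not the authors' implementation:

```python
import random

def fb_sample_path(x, a, A, B, rng=None):
    """Sample one hidden-state path: run the forward recursion, then draw
    states from t = T down to 1 (Equations 9-10)."""
    rng = rng or random.Random(0)
    K = len(a)
    # forward recursion (unscaled; fine for short sequences)
    f = [[a[i] * B[i][x[0]] for i in range(K)]]
    for t in range(1, len(x)):
        f.append([B[i][x[t]] * sum(f[-1][j] * A[j][i] for j in range(K))
                  for i in range(K)])
    z = [None] * len(x)
    z[-1] = rng.choices(range(K), weights=f[-1])[0]        # p(z_T | x) ∝ f(T)
    for t in range(len(x) - 2, -1, -1):
        w = [f[t][i] * A[i][z[t + 1]] for i in range(K)]   # f(t)_i * A_{i,z_{t+1}}
        z[t] = rng.choices(range(K), weights=w)[0]
    return z
```

After sampling paths for every sequence, the Dirichlet conditionals of Equation 11 would be sampled from the resulting transition counts.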
4.3 Variational Algorithm

Another approach for inference of the HMM variant parameters is through variational techniques. We employ a mean field variational algorithm that follows a similar pattern to EM. When the variational update steps are run until convergence, the Kullback-Leibler divergence between the variational distribution, $q(z, a, A, B)$, and the model's conditional probability distribution, $p(z, a, A, B \mid x, \gamma, \alpha, \beta)$, is minimized. The transition matrices returned by the variational algorithm are the expectations of those matrices under the variational distribution. Thus, like the Gibbs sampling algorithm, the parameters returned by the variational algorithm approximate the expectations of the parameters under the conditional distribution. Our mean field variational approximation is shown below:

(12) $q(z, a, A, B) = q(a) \prod_{n=1}^{N} \prod_{i=1}^{K} q(A_{ni}) \prod_{i=1}^{K} q(B_i) \prod_{n,t} q(z_{nt})$
$= \left(\frac{\Gamma(\sum_i \tilde\gamma_i)}{\prod_i \Gamma(\tilde\gamma_i)} \prod_i a_i^{\tilde\gamma_i - 1}\right) \prod_{n,i} \left(\frac{\Gamma(\sum_j \tilde\alpha_{nij})}{\prod_j \Gamma(\tilde\alpha_{nij})} \prod_j A_{nij}^{\tilde\alpha_{nij} - 1}\right) \prod_i \left(\frac{\Gamma(\sum_m \tilde\beta_{im})}{\prod_m \Gamma(\tilde\beta_{im})} \prod_m B_{im}^{\tilde\beta_{im} - 1}\right) \prod_{n,t,i} h_{nti}^{z_{nti}}$

with variational parameters $h_{nt}$, which approximate each $z_{nt}$, and $\tilde\alpha_{nij}$, $\tilde\beta_{im}$, and $\tilde\gamma_i$, which can be thought of as Dirichlet parameters approximating $\alpha$, $\beta$, and $\gamma$. When we maximize the variational free energy with respect to the variational parameters, we obtain the following update equations, where $\Psi(x) = \frac{d \log \Gamma(x)}{dx}$:

(13) $\tilde\alpha_{nij} = \sum_t h_{n,t-1,i}\, h_{ntj} + \alpha$

(14) $\tilde\beta_{im} = \sum_n \sum_{t : x_{nt} = m} h_{nti} + \beta$

(15) $\tilde\gamma_i = \sum_n h_{n1i} + \gamma$

(16) $h_{nti} \propto \exp\Big( \sum_j h_{n,t-1,j}\big(\Psi(\tilde\alpha_{nji}) - \Psi(\textstyle\sum_{i'} \tilde\alpha_{nji'})\big) + \sum_j h_{n,t+1,j}\big(\Psi(\tilde\alpha_{nij}) - \Psi(\textstyle\sum_{j'} \tilde\alpha_{nij'})\big) + \Psi(\tilde\beta_{i x_{nt}}) - \Psi(\textstyle\sum_m \tilde\beta_{im}) \Big)$

Notice that the update for $h_{nt}$ depends only on the adjacent $h$'s, $h_{n,t-1}$ and $h_{n,t+1}$, as well as the expectations of the transition probabilities from the adjacent $h$'s and the expectation of the emission probabilities for the current $h_{nt}$. This mean field algorithm can therefore be understood as an equivalent of the Direct Gibbs sampling method, except that interactions between subsequent time steps occur through variational parameters rather than through the sampled values of $z$.
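The Dirichlet parameter updates in Equations 13-15 amount to accumulating expected start, transition, and emission counts under the per-position state responsibilities $h$ and adding the prior. A sketch of that accumulation step follows (hypothetical function and variable names; the digamma-based update for $h$ in Equation 16 is omitted):

```python
def dirichlet_updates(xs, h, K, M, alpha, beta, gamma):
    """Mean-field updates for the variational Dirichlet parameters
    (counterparts of Equations 13-15). h[n][t][i] is the responsibility
    of hidden state i at position t of sequence n."""
    N = len(xs)
    a_til = [[[alpha] * K for _ in range(K)] for _ in range(N)]  # alpha~_nij
    b_til = [[beta] * M for _ in range(K)]                       # beta~_im
    g_til = [gamma] * K                                          # gamma~_i
    for n, x in enumerate(xs):
        for i in range(K):
            g_til[i] += h[n][0][i]                 # expected start counts
        for t, m in enumerate(x):
            for i in range(K):
                b_til[i][m] += h[n][t][i]          # expected emission counts
                if t > 0:
                    for j in range(K):             # expected transition counts
                        a_til[n][i][j] += h[n][t - 1][i] * h[n][t][j]
    return a_til, b_til, g_til
```

The returned `a_til[n]`, suitably normalized, gives the expected per-sequence transition matrix used as the fixed-length representation.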
A complete derivation of the variational algorithm is included on the authors' website.

Table 2: AUC results from all of the multi-class SVM experiments, for the Baum-Welch, Gibbs Sampling, and Variational algorithms on the class (SCOP 1.67, 25%), fold (SCOP 1.67, 25% and 40%), and superfamily (SCOP 1.67, 40%) category datasets. The best performing algorithm, the best performing setting of K, and the best combination of K and algorithm are marked in bold. The Gibbs-Sampling-derived representation returned the best AUC score on the majority of the datasets.

5 Experimental Setup

5.1 Protocol

To evaluate our fixed-length representation scheme, for each dataset (described in Section 5.2), we created three sets of fixed-length representations per trial over ten trials by running each of the three inference algorithms, (i) Baum-Welch, (ii) Gibbs Sampling, and (iii) the mean field variational algorithm, on the entire set of input data. We varied the number of hidden states, $K$, from 5 to 20 in increments of 5. This procedure created a total of 120 (3 × 10 × 4) fixed-length representations for each dataset. The fixed-length vector data was then used as input to a support vector machine (SVM) classifier. We used the SVM to perform either multiway classification on the dataset under the Crammer-Singer [Crammer and Singer, 2002] construction or the one-versus-rest approach, where a binary classifier was trained for each of the classes.

We compare classification results from our model with results from the Spectrum(2) kernel for all experiments. The Spectrum(l) kernel is a string kernel whose vector representation is the set of counts of substrings of observed symbols of length $l$ in a given string [Leslie et al., 2002]. For the one-versus-rest experiments, we compare our results to more biologically sensitive kernels for protein classification, described in Rangwala and Karypis [Rangwala and Karypis, 2005].
5.2 Protein Datasets

The Structural Classification of Proteins (SCOP) [Murzin et al., 1995] database categorizes proteins into a multilevel hierarchy that captures commonalities between protein structures at different levels of detail. To evaluate our representation, we ran sets of protein classification experiments on the three top levels of the SCOP taxonomy: class, fold, and superfamily. (We used SVM-light and SVM-struct for classification [Joachims, 1999].) Our datasets, which were obtained from previous studies [Rangwala and Karypis, 2006; Kuang et al., 2004], were derived from either the SCOP 1.67 or the SCOP 1.53 versions and filtered at 25% and 40% pairwise sequence identities. A protein sequence dataset filtered at 25% identity will have no two sequences with more than 25% sequence identity.

We partitioned the data into a single test and training set for each category. At the class level, the original dataset was split randomly into training and test sets. To eliminate high levels of similarity between sequences that could lead to trivially good classification results, we imposed constraints on the training/test set partitioning for classification in the fold and superfamily experiments. For the fold-level classification problem, the sets were partitioned so that no examples that shared both the fold and superfamily labels were included in both the training and test sets. Similarly, for the superfamily-level classification problem (referred to as the remote homology detection problem [Leslie et al., 2002; Rangwala and Karypis, 2005]), no examples that shared both the superfamily and family labels were included in both the training and test sets.

5.3 Evaluation Metrics

We evaluated each classification experiment by computing the area under the ROC curve (AUC), where the ROC curve is a plot of the true positive rate against the false positive rate, constructed by adjusting the SVM's intercept parameter. We also computed the AUC50 value, which is a normalized computation of the area under the ROC curve until the first 50 false positives have been detected. We were concerned about variance over different Baum-Welch runs due to convergence of the algorithm to different local optima. To mitigate this concern, we ran both the Baum-Welch algorithm and, for consistency, the other inference algorithms 10 separate times on each dataset. The results presented for each inference method are averages over the individual results of the 10 trials across the different classes.
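The AUC described above can be computed without constructing the ROC curve explicitly, using the equivalent rank statistic: the probability that a randomly chosen positive example outscores a randomly chosen negative one. A small sketch of that computation (our own, not the authors' evaluation code):

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank statistic: the fraction of
    (positive, negative) pairs where the positive scores higher
    (ties count half). Labels are 1 for positive, 0 for negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect ranking yields 1.0 and a random ranking about 0.5; this quadratic-time form is fine for the dataset sizes considered here.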
6 Results and Discussion

6.1 Protein Sequence Classification

Table 2 shows a comparison of results (average AUC scores) across the inference algorithms in three taxonomic categories (class, fold, and superfamily) using the multiclass SVM. Although the AUC scores are close for each algorithm, in most cases the Gibbs sampling algorithm outperforms the other algorithms.

Table 3 shows a comparison of results over the inference algorithms, but only for the one-versus-rest superfamily classification experiment on the SCOP 1.53 dataset. Similar to the multiclass experiments using the linear kernel, the Gibbs sampling algorithm outperforms the other inference methods in the one-versus-rest experiments. Although the values of the best performing algorithm's AUC and AUC50 scores do not significantly change from the linear to the Gaussian kernel, the variational algorithm shows a large improvement, ranging from 6% to 30%.

Table 3: AUC and AUC50 results for protein superfamily classification on SCOP 1.53 with 25% Astral filtering over a selected set of 23 superfamilies, using Gaussian and linear kernels in one-versus-rest SVM classification (rows: Baum-Welch, Gibbs Sampling, Variational; metrics: AUC, AUC50).

6.2 Analysis of Inference Algorithms

The differences in AUC values resulting from the different training algorithms (Tables 2 and 3) can be explained, at least in part, by a high-level overview of how each algorithm operates. While the Baum-Welch algorithm returns MAP parameters of the model, both the Gibbs sampling method and the variational algorithm return expectations of the parameters under an approximation of the posterior distribution. The MAP solution from the Baum-Welch algorithm is likely to reach a local maximum of the posterior, while the other algorithms should tend to average over posterior parameters.
The Gibbs sampling algorithm and the variational algorithm each compute expectations of the parameters under an approximate posterior distribution, but each uses a different method to construct this approximation. The variational algorithm will be less likely to converge to a good approximation of the marginal distribution because the mean field variational approximation necessarily does away with the direct coupling between adjacent hidden states characteristic of the HMM.

6.3 Comparative Performance

Tables 4 and 5 show a comparison between the HMM variant and common classification methods for the multiclass and one-versus-rest experiments, respectively. The AUC and AUC50 scores indicate that our scheme produces a representation that is roughly equivalent in power to the Spectrum kernel for protein classification. In defense of the HMM variant, the size of the vector representation produced by the Spectrum kernel is significantly larger than the typical representations produced by our HMM variant. The Mismatch(5,1) kernel, used for SCOP 1.53 superfamily classification (Table 5), is similar to the Spectrum(5) kernel but also counts substrings of length 5 that differ by one amino acid residue from those found in an observed sequence. The size of the vector representation associated with this kernel can be larger still; this value is large compared to the largest vector representation in our experiments, which is 400 for the HMM variant with 20 hidden states. Nearly all of these high-performing kernel methods, unlike the HMM variant, employ domain-specific knowledge, such as carefully tuned position-specific scoring matrices, to aid classification. In contrast, the only parameter that needs to be adjusted in the HMM variant is the number of hidden states.

Table 4: A comparison of results between the Spectrum kernel and the HMM variant under experiments using the multiclass SVM formulation (datasets: Class, Fold (25 categories), Fold (27 categories), Superfamily). The HMM variant scores are the best performing from Table 2.

Table 5: A selection of AUC and AUC50 scores for the remote homology detection problem using a variety of SVM kernels (HMM Variant (best), Spectrum(2) [Leslie et al., 2002], Mismatch(5,1) [Leslie et al., 2003], Fisher [Jaakkola et al., 2000], and SW-PSSM [Rangwala and Karypis, 2005]) on the SCOP 1.53, 25% dataset using one-versus-rest classification. The HMM variant scores are the best performing from Table 3.

7 Conclusions and Future Work

Our HMM variant is an extension of the standard HMM that assigns individual transition matrices to each sequence in a dataset but keeps a single emissions matrix for the entire dataset. We describe three inference algorithms, two of which, a Baum-Welch-like algorithm and a Gibbs sampling algorithm, are similar to standard methods used to infer HMM parameters. The third, the variational inference algorithm, is related to algorithms used for inference on topic models and more complex HMM extensions. We demonstrate, by comparing results on protein sequence classification using our method in conjunction with SVMs, that each of these algorithms infers transition matrices that capture useful characteristics of individual sequences. Because our model fits within a large existing body of work on generative models, we are especially interested in related models that perform classification directly.

References

[Blei et al., 2003] D.M. Blei, A.Y. Ng, and M.I. Jordan. Latent Dirichlet allocation. The Journal of Machine Learning Research, 3, 2003.
[Crammer and Singer, 2002] K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector machines. The Journal of Machine Learning Research, 2, 2002.
[Eddy, 1998] S. Eddy. Profile hidden Markov models. Bioinformatics, 14(9), 1998.
[Hofmann, 1999] T. Hofmann. Probabilistic latent semantic indexing. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1999.
[Jaakkola and Haussler, 1999] T.S. Jaakkola and D. Haussler. Exploiting generative models in discriminative classifiers. Advances in Neural Information Processing Systems, 1999.
[Jaakkola et al., 2000] T. Jaakkola, M. Diekhans, and D. Haussler. A discriminative framework for detecting remote protein homologies. Journal of Computational Biology, 7(1-2):95-114, 2000.
[Jebara et al., 2004] T. Jebara, R. Kondor, and A. Howard. Probability product kernels. The Journal of Machine Learning Research, 5, 2004.
[Joachims, 1999] T. Joachims. SVM-Light Support Vector Machine, joachims.org/, University of Dortmund, 1999.
[Kuang et al., 2004] R. Kuang, E. Ie, K. Wang, K. Wang, M. Siddiqi, Y. Freund, and C. Leslie. Profile-based string kernels for remote homology detection and motif extraction. Computational Systems Bioinformatics, 2004.
[Leslie et al., 2002] C. Leslie, E. Eskin, and W.S. Noble. The spectrum kernel: A string kernel for SVM protein classification. Proceedings of the Pacific Symposium on Biocomputing, 2002.
[Leslie et al., 2003] C. Leslie, E. Eskin, W.S. Noble, and J. Weston. Mismatch string kernels for SVM protein classification. Advances in Neural Information Processing Systems, 20(4), 2003.
[Murzin et al., 1995] A.G. Murzin, S.E. Brenner, T. Hubbard, and C. Chothia. SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology, 247(4), 1995.
[Rabiner and Juang, 1986] L. Rabiner and B. Juang. An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1):4-16, 1986.
[Rangwala and Karypis, 2005] H. Rangwala and G. Karypis. Profile-based direct kernels for remote homology detection and fold recognition. Bioinformatics, 21(23):4239, 2005.
[Rangwala and Karypis, 2006] Huzefa Rangwala and George Karypis. Building multiclass classifiers for remote homology detection and fold recognition. BMC Bioinformatics, 7:455, 2006.
[Scott, 2002] S.L. Scott. Bayesian methods for hidden Markov models: Recursive computing in the 21st century. Journal of the American Statistical Association, 97(457), 2002.
[Smyth, 1997] P. Smyth. Clustering sequences with hidden Markov models. Advances in Neural Information Processing Systems, 1997.
More informationSubspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;
Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features
More informationLecture 4: Principal components
/3/6 Lecture 4: Prncpal components 3..6 Multvarate lnear regresson MLR s optmal for the estmaton data...but poor for handlng collnear data Covarance matrx s not nvertble (large condton number) Robustness
More informationA Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,
More informationBiostatistics 615/815
The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationEdge Detection in Noisy Images Using the Support Vector Machines
Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona
More informationDetermining the Optimal Bandwidth Based on Multi-criterion Fusion
Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn
More informationOutline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1
4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:
More informationOptimizing Document Scoring for Query Retrieval
Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng
More informationLearning the Kernel Parameters in Kernel Minimum Distance Classifier
Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department
More informationCSCI 5417 Information Retrieval Systems Jim Martin!
CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne
More informationParallelism for Nested Loops with Non-uniform and Flow Dependences
Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr
More informationClassifier Selection Based on Data Complexity Measures *
Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.
More informationUsing Neural Networks and Support Vector Machines in Data Mining
Usng eural etworks and Support Vector Machnes n Data Mnng RICHARD A. WASIOWSKI Computer Scence Department Calforna State Unversty Domnguez Hlls Carson, CA 90747 USA Abstract: - Multvarate data analyss
More informationS1 Note. Basis functions.
S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type
More informationHierarchical clustering for gene expression data analysis
Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally
More informationApplication of Maximum Entropy Markov Models on the Protein Secondary Structure Predictions
Applcaton of Maxmum Entropy Markov Models on the Proten Secondary Structure Predctons Yohan Km Department of Chemstry and Bochemstry Unversty of Calforna, San Dego La Jolla, CA 92093 ykm@ucsd.edu Abstract
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationSmoothing Spline ANOVA for variable screening
Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory
More informationMULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and
More informationAn Evaluation of Divide-and-Combine Strategies for Image Categorization by Multi-Class Support Vector Machines
An Evaluaton of Dvde-and-Combne Strateges for Image Categorzaton by Mult-Class Support Vector Machnes C. Demrkesen¹ and H. Cherf¹, ² 1: Insttue of Scence and Engneerng 2: Faculté des Scences Mrande Galatasaray
More informationDetection of an Object by using Principal Component Analysis
Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,
More informationBAYESIAN MULTI-SOURCE DOMAIN ADAPTATION
BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,
More informationBOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET
1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School
More informationEXTENDED BIC CRITERION FOR MODEL SELECTION
IDIAP RESEARCH REPORT EXTEDED BIC CRITERIO FOR ODEL SELECTIO Itshak Lapdot Andrew orrs IDIAP-RR-0-4 Dalle olle Insttute for Perceptual Artfcal Intellgence P.O.Box 59 artgny Valas Swtzerland phone +4 7
More informationIntelligent Information Acquisition for Improved Clustering
Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center
More informationLearning-Based Top-N Selection Query Evaluation over Relational Databases
Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **
More informationUnsupervised Learning and Clustering
Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned
More informationFeature Reduction and Selection
Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components
More informationApplying Continuous Action Reinforcement Learning Automata(CARLA) to Global Training of Hidden Markov Models
Applyng Contnuous Acton Renforcement Learnng Automata(CARLA to Global Tranng of Hdden Markov Models Jahanshah Kabudan, Mohammad Reza Meybod, and Mohammad Mehd Homayounpour Department of Computer Engneerng
More informationProblem Set 3 Solutions
Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,
More informationThe Rate Adapting Poisson Model for Information Retrieval and Object Recognition
for Informaton Retreval and Object Recognton Peter V. Gehler PGEHLER@TUEBINGEN.MPG.DE Max Planck Insttute for Bologcal Cybernetcs, Spemannstrasse 38, 72076 Tübngen, Germany Alex D. Holub HOLUB@VISION.CALTECH.EDU
More informationJournal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article
Avalable onlne www.jocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2512-2520 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 Communty detecton model based on ncremental EM clusterng
More informationA Binarization Algorithm specialized on Document Images and Photos
A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a
More informationFast Sparse Gaussian Processes Learning for Man-Made Structure Classification
Fast Sparse Gaussan Processes Learnng for Man-Made Structure Classfcaton Hang Zhou Insttute for Vson Systems Engneerng, Dept Elec. & Comp. Syst. Eng. PO Box 35, Monash Unversty, Clayton, VIC 3800, Australa
More informationFrom Comparing Clusterings to Combining Clusterings
Proceedngs of the Twenty-Thrd AAAI Conference on Artfcal Intellgence (008 From Comparng Clusterngs to Combnng Clusterngs Zhwu Lu and Yuxn Peng and Janguo Xao Insttute of Computer Scence and Technology,
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationImprovement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration
Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,
More informationClassifying Acoustic Transient Signals Using Artificial Intelligence
Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)
More informationAn Optimal Algorithm for Prufer Codes *
J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,
More informationSorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions
Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place
More informationCollaboratively Regularized Nearest Points for Set Based Recognition
Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,
More informationLocal Quaternary Patterns and Feature Local Quaternary Patterns
Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents
More informationSteps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices
Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between
More informationUB at GeoCLEF Department of Geography Abstract
UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department
More informationColour Image Segmentation using Texems
XIE AND MIRMEHDI: COLOUR IMAGE SEGMENTATION USING TEXEMS 1 Colour Image Segmentaton usng Texems Xanghua Xe and Majd Mrmehd Department of Computer Scence, Unversty of Brstol, Brstol BS8 1UB, England {xe,majd}@cs.brs.ac.u
More informationA PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION
1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute
More informationSRBIR: Semantic Region Based Image Retrieval by Extracting the Dominant Region and Semantic Learning
Journal of Computer Scence 7 (3): 400-408, 2011 ISSN 1549-3636 2011 Scence Publcatons SRBIR: Semantc Regon Based Image Retreval by Extractng the Domnant Regon and Semantc Learnng 1 I. Felc Raam and 2 S.
More informationPositive Semi-definite Programming Localization in Wireless Sensor Networks
Postve Sem-defnte Programmng Localzaton n Wreless Sensor etworks Shengdong Xe 1,, Jn Wang, Aqun Hu 1, Yunl Gu, Jang Xu, 1 School of Informaton Scence and Engneerng, Southeast Unversty, 10096, anjng Computer
More information12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification
Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero
More informationLearning to Detect Information Outbreaks in Social Networks
Learnng to Detect Informaton Outbreaks n Socal Networks Jayuan Ma jayuanm@stanford.edu Stanford Unversty Xncheng Zhang xnchen2@stanford.edu Stanford Unversty 1. INTRODUCTION Ths s the nformaton age. Everyday
More informationAssociative Based Classification Algorithm For Diabetes Disease Prediction
Internatonal Journal of Engneerng Trends and Technology (IJETT) Volume-41 Number-3 - November 016 Assocatve Based Classfcaton Algorthm For Dabetes Dsease Predcton 1 N. Gnana Deepka, Y.surekha, 3 G.Laltha
More informationBayesian Classifier Combination
Bayesan Classfer Combnaton Zoubn Ghahraman and Hyun-Chul Km Gatsby Computatonal Neuroscence Unt Unversty College London London WC1N 3AR, UK http://www.gatsby.ucl.ac.uk {zoubn,hckm}@gatsby.ucl.ac.uk September
More informationOptimal Workload-based Weighted Wavelet Synopses
Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,
More informationQuery classification using topic models and support vector machine
Query classfcaton usng topc models and support vector machne Deu-Thu Le Unversty of Trento, Italy deuthu.le@ds.untn.t Raffaella Bernard Unversty of Trento, Italy bernard@ds.untn.t Abstract Ths paper descrbes
More informationDeep Classification in Large-scale Text Hierarchies
Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong
More informationAn Ensemble Learning algorithm for Blind Signal Separation Problem
An Ensemble Learnng algorthm for Blnd Sgnal Separaton Problem Yan L 1 and Peng Wen 1 Department of Mathematcs and Computng, Faculty of Engneerng and Surveyng The Unversty of Southern Queensland, Queensland,
More informationAn Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem
An Effcent Genetc Algorthm wth Fuzzy c-means Clusterng for Travelng Salesman Problem Jong-Won Yoon and Sung-Bae Cho Dept. of Computer Scence Yonse Unversty Seoul, Korea jwyoon@sclab.yonse.ac.r, sbcho@cs.yonse.ac.r
More informationImage Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline
mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and
More informationA fast algorithm for color image segmentation
Unersty of Wollongong Research Onlne Faculty of Informatcs - Papers (Arche) Faculty of Engneerng and Informaton Scences 006 A fast algorthm for color mage segmentaton L. Dong Unersty of Wollongong, lju@uow.edu.au
More informationPrivate Information Retrieval (PIR)
2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market
More informationFace Recognition University at Buffalo CSE666 Lecture Slides Resources:
Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural
More informationSkew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach
Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research
More informationA mathematical programming approach to the analysis, design and scheduling of offshore oilfields
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and
More informationAn Image Fusion Approach Based on Segmentation Region
Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua
More informationFactor Graphs for Region-based Whole-scene Classification
Factor Graphs for Regon-based Whole-scene Classfcaton Matthew R. Boutell Jebo Luo Chrstopher M. Brown CSSE Dept. Res. and Dev. Labs Dept. of Computer Scence Rose-Hulman Inst. of Techn. Eastman Kodak Company
More informationHybridization of Expectation-Maximization and K-Means Algorithms for Better Clustering Performance
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 2 Sofa 2016 Prnt ISSN: 1311-9702; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-2016-0017 Hybrdzaton of Expectaton-Maxmzaton
More informationA Novel Term_Class Relevance Measure for Text Categorization
A Novel Term_Class Relevance Measure for Text Categorzaton D S Guru, Mahamad Suhl Department of Studes n Computer Scence, Unversty of Mysore, Mysore, Inda Abstract: In ths paper, we ntroduce a new measure
More informationRelated-Mode Attacks on CTR Encryption Mode
Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory
More informationRECOGNIZING GENDER THROUGH FACIAL IMAGE USING SUPPORT VECTOR MACHINE
Journal of Theoretcal and Appled Informaton Technology 30 th June 06. Vol.88. No.3 005-06 JATIT & LLS. All rghts reserved. ISSN: 99-8645 www.jatt.org E-ISSN: 87-395 RECOGNIZING GENDER THROUGH FACIAL IMAGE
More informationCMPS 10 Introduction to Computer Science Lecture Notes
CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not
More informationAn Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed
More informationAnnouncements. Supervised Learning
Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples
More informationLearning to Classify Documents with Only a Small Positive Training Set
Learnng to Classfy Documents wth Only a Small Postve Tranng Set Xao-L L 1, Bng Lu 2, and See-Kong Ng 1 1 Insttute for Infocomm Research, Heng Mu Keng Terrace, 119613, Sngapore 2 Department of Computer
More informationy and the total sum of
Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton
More information