Clustering Sequence Data using Hidden Markov Model Representation

Cen Li and Gautam Biswas
Box 1679 Station B, Department of Computer Science, Vanderbilt University, Nashville, TN, USA

ABSTRACT

This paper proposes a clustering methodology for sequence data using the hidden Markov model (HMM) representation. The proposed methodology improves upon existing HMM-based clustering methods in two ways: (i) it enables HMMs to dynamically change their model structure to obtain a better fit to the data during the clustering process, and (ii) it provides an objective criterion function to select the optimal clustering partition. The algorithm is presented in terms of four nested levels of search: (i) the search for the optimal number of clusters in a partition, (ii) the search for the optimal structure for a given partition, (iii) the search for the optimal HMM structure for each cluster, and (iv) the search for the optimal HMM parameters for each HMM. Preliminary results are given to support the proposed methodology.

Keywords: clustering, hidden Markov model, model selection, Bayesian Information Criterion (BIC), mutual information

1. INTRODUCTION

Clustering assumes data is not labeled with class information. The goal is to create structure for the data by objectively partitioning it into homogeneous groups, where the within-group object similarity and the between-group object dissimilarity are optimized. The technique has been used extensively and successfully by data mining researchers in discovering structure from databases where domain knowledge is unavailable or incomplete. 1, 2

In the past, the focus of cluster analysis has been on data described with static features, 1-3 i.e., values of the features do not change during the observation period. Examples of static features include an employee's educational level and salary, or a patient's age, gender, and weight. In the real world, most systems are dynamic and can often best be described by temporal features, whose values change significantly during the observation period. Examples of temporal features include the monthly ATM transactions and account balances of bank customers, and the blood pressure, temperature, and respiratory rate of patients under intensive care. This paper addresses the problem of clustering data described by temporal features. Clustering temporal data is inherently more complex than clustering static data because (i) the dimensionality of the data is significantly larger in the dynamic case, and (ii) the complexity of cluster definition (modeling) and interpretation increases by orders of magnitude with dynamic data. 5

We choose the hidden Markov model representation for our temporal data clustering problem. The HMM representation has a number of advantages for our problem:

- There are direct links between the HMM states and real world situations for the problem under consideration. The hidden states of an HMM can effectively model the set of potentially valid states of a dynamic process. While the exact sequence of stages a dynamic system goes through may not be observed, it can be estimated from the observable behavior of the system.
- HMMs represent a well-defined probabilistic model. The parameters of an HMM can be determined in a precise, well-defined manner, using methods such as maximum likelihood estimation or the maximum mutual information criterion.
- HMMs are graphical models of the underlying dynamic processes that govern system behavior. Graphical models may aid the interpretation task.
Clustering using HMMs was first mentioned by Rabiner et al. 6 for speech recognition problems. The idea has been further explored by other researchers, including Lee, 7 Dermatas and Kokkinakis, 8 Kosaka et al., 9 and Smyth. 10 Two main problems identified in these works are: (i) no objective criterion measure is used for determining the optimal size of the clustering partition, and (ii) a uniform, pre-specified HMM structure is assumed for the different clusters of each partition. This paper describes an HMM clustering methodology that tries to remedy these two problems by developing an objective partition criterion measure based on model mutual information, and by developing an explicit HMM model refinement procedure that dynamically modifies HMM structures during the clustering process.

2. PROPOSED HMM CLUSTERING METHODOLOGY

The proposed HMM clustering method can be summarized in terms of four levels of nested searches. From the outermost to the innermost level, the four searches are: the search for

1. the optimal number of clusters in a partition,
2. the optimal structure for a given partition,
3. the optimal HMM structure for each cluster, and
4. the optimal HMM parameters for each cluster.

Starting from the innermost level, each of these four search steps is described in more detail next.

2.1. Search Level 4: HMM Parameter Reestimation

This step finds the maximum likelihood parameters for an HMM of fixed size. The well-known Baum-Welch parameter reestimation procedure 11 is used for this purpose. The Baum-Welch procedure is a variation of the more general EM algorithm, 12 which iterates between two steps: (i) the expectation step (E-step), and (ii) the maximization step (M-step). The E-step assumes the current parameters of the model and computes the expected values of the necessary statistics. The M-step uses these statistics to update the model parameters so as to maximize the expected likelihood of the parameters. 13 The procedure is implemented using the forward-backward computations.
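To make the E-step/M-step structure concrete, the following is a minimal sketch of one Baum-Welch iteration for a discrete-observation HMM. It is not the authors' implementation: the paper's experiments use continuous Gaussian emissions, but the EM skeleton is the same, and all function and variable names here are ours.

```python
import numpy as np

def forward_backward(A, B, pi, obs):
    """E-step for one sequence: scaled forward-backward recursions.
    A: (S,S) transitions, B: (S,K) emission probs, pi: (S,) initial probs."""
    obs = np.asarray(obs)
    T, S = len(obs), len(pi)
    alpha, beta, scale = np.zeros((T, S)), np.zeros((T, S)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)       # P(state at t | O, model)
    return alpha, beta, gamma, np.log(scale).sum()  # last term: log P(O | model)

def baum_welch_step(A, B, pi, obs):
    """One M-step update from the expected counts of a single sequence."""
    obs = np.asarray(obs)
    alpha, beta, gamma, loglik = forward_backward(A, B, pi, obs)
    S = len(pi)
    xi = np.zeros((S, S))
    for t in range(len(obs) - 1):                   # expected transition counts
        x = np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A
        xi += x / x.sum()
    A_new = xi / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.vstack([gamma[obs == k].sum(axis=0)  # expected emission counts
                       for k in range(B.shape[1])]).T
    B_new /= gamma.sum(axis=0)[:, None]
    return A_new, B_new, gamma[0], loglik           # gamma[0] is the new pi
```

Iterating `baum_welch_step` until the log-likelihood stops improving gives the Level 4 reestimation; in the continuous-density case the emission update is replaced by gamma-weighted means and variances.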
2.2. Search Level 3: The Optimal HMM Structure

This step attempts to replace the existing model for a group of objects by a more accurate and more refined HMM. Stolcke and Omohundro 14 described a technique for inducing the structure of HMMs from data based on a general "model merging" strategy. Takami and Sagayama 16 proposed the Successive State Splitting (SSS) algorithm to model context-dependent phonetic variations. Ostendorf and Singer 17 further extended the basic SSS algorithm by choosing the node and the candidate split at the same time based on the likelihood gains. Casacuberta et al. 18 proposed to derive the structure of the HMM through error-correcting grammatical inference techniques. Our HMM refinement procedure combines ideas from these past works. We start with an initial model configuration and incrementally grow or shrink the model through HMM state splitting and merging operations to choose the right size model. The goal is to obtain a model that better accounts for the data, i.e., one with a higher model posterior probability. For both merge and split operations, we assume the Viterbi path does not change after each operation; that is, for the split operation, the observations that were in state s will reside in one of the two new states, q0 or q1, and similarly for the merge operation. This assumption greatly simplifies the parameter estimation for the new states.

The choice of state(s) to which the split (merge) operation is applied depends on the state emission probabilities: for the split operation, the state with the largest variance is split; for the merge operation, the two states with the closest mean vectors are considered for merging, as sketched below.
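A small sketch of these selection heuristics and of a split under the fixed-Viterbi-path assumption, for an HMM with per-state Gaussian emissions (diagonal covariance). The names and the perturbation size `eps` are ours, chosen for illustration.

```python
import numpy as np

def choose_split_state(sigmas):
    """Split candidate: the state whose emission density is broadest
    (largest total variance), suggesting it covers two regimes."""
    return int(np.argmax((sigmas ** 2).sum(axis=1)))

def choose_merge_pair(means):
    """Merge candidates: the two states with the closest mean vectors."""
    S = means.shape[0]
    pairs = [(i, j) for i in range(S) for j in range(i + 1, S)]
    dists = [np.linalg.norm(means[i] - means[j]) for i, j in pairs]
    return pairs[int(np.argmin(dists))]

def split_state(means, sigmas, s, eps=0.1):
    """Replace state s with two states q0, q1 whose means are perturbed
    copies of s's mean; under the fixed-Viterbi-path assumption, each
    observation previously in s is then assigned to the closer of q0/q1."""
    m0, m1 = means[s] - eps * sigmas[s], means[s] + eps * sigmas[s]
    means = np.vstack([np.delete(means, s, axis=0), m0, m1])
    sigmas = np.vstack([np.delete(sigmas, s, axis=0), sigmas[s], sigmas[s]])
    return means, sigmas
```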
Next we describe the two criterion measures we propose for HMM model selection: (i) the Posterior Probability Measure (PPM), and (ii) the Bayesian Information Criterion (BIC).

2.2.1. Posterior Probabilities for HMMs

The computation of the posterior probability of an HMM (PPM) is based on the Bayesian model merging criterion of Stolcke and Omohundro. 14 The Bayesian model merging criterion trades the model likelihood against a bias towards simpler models. Assume the prior probability of a fully parameterized model $\lambda = (\lambda_G, \theta)$, where $\lambda_G$ is the model structure and $\theta$ the model parameters, is uniformly distributed. Given some data $X$, using Bayes' rule, the posterior probability of the model, $P(\lambda|X)$, can be expressed as

$$P(\lambda|X) = \frac{P(\lambda)P(X|\lambda)}{P(X)} \propto P(\lambda)P(X|\lambda),$$

where $P(X|\lambda)$ is the likelihood function. We propose to extend Stolcke and Omohundro's $P(\lambda|X)$ computation for Discrete Density HMMs (DDHMMs) to our Continuous Density HMM (CDHMM) model. We decompose the model $\lambda$ into three independent components: its global structure, $\lambda_G$; the transitions from each state $q$, $\theta^{(q)}_{trans}$; and the emissions within each state, $(\mu^{(q)}, \sigma^{(q)})$. Assuming the parameters associated with one state are independent of those in another state, the model prior can be written as

$$P(\lambda) = P(\lambda_G) \prod_{q \in Q} P(\theta^{(q)}_{trans} \mid \lambda_G) \prod_{q \in Q} P((\mu^{(q)}, \sigma^{(q)}) \mid \lambda_G).$$

The structure of the model is modeled with an exponential distribution that explicitly biases towards smaller models: $P(\lambda_G) \propto C^{-N}$, where $C$ is a constant and $C > 1$. Since the transitions represent discrete, finite probabilistic choices of the next state, a Dirichlet distribution is used for calculating the probability of the transitions from each state: 14

$$P(\theta^{(q)}_{trans} \mid \lambda_G) = \frac{1}{B(\alpha_t, \ldots, \alpha_t)} \prod_{i=1}^{n} \theta_{qi}^{\alpha_t - 1},$$

where the $\theta_{qi}$ are the transition probabilities at state $q$, with $i$ ranging over the states that can follow $q$, and $\alpha_t$ is the prior weight, which can be chosen to introduce more or less bias towards a uniform assignment of the parameters. This prior has the desirable characteristic that it favors state configurations with fewer yet more significant outgoing transitions. For our single-component CDHMM case, we propose to use Jeffreys' prior for the location-scale parameters, i.e., the mean vector and variance matrix associated with each state: 19

$$P((\mu^{(q)}, \sigma^{(q)}) \mid \lambda_G) = (\sigma^{(q)})^{-1}.$$

This location-scale prior reflects the fact that data with a smaller $\sigma$ lead to a more accurate determination of the parameters. In the case of CDHMM state configurations, this prior rewards CDHMMs with clearly defined states, i.e., states whose associated variances $\sigma$ are small.
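The three prior terms can be combined into an unnormalized log-prior score, sketched below for a diagonal-covariance CDHMM. This is our illustration, not the paper's code: the values of `C` and `alpha` are placeholders, restricting the Dirichlet to nonzero transitions is a sketch simplification, and since Jeffreys' prior is improper the result is only meaningful for comparing candidate models.

```python
import numpy as np
from math import lgamma

def log_dirichlet(theta, alpha):
    """Log density of a symmetric Dirichlet(alpha, ..., alpha) at theta."""
    k = len(theta)
    log_B = k * lgamma(alpha) - lgamma(k * alpha)
    return float(np.sum((alpha - 1.0) * np.log(theta)) - log_B)

def log_model_prior(trans, sigmas, C=2.0, alpha=0.5):
    """Unnormalized log P(model): exponential structure prior, Dirichlet
    transition priors, and Jeffreys emission priors.
    trans: (S,S) row-stochastic matrix; sigmas: (S,D) std. deviations."""
    S = trans.shape[0]
    lp = -S * np.log(C)                    # structure prior, P ∝ C^(-N)
    for q in range(S):
        out = trans[q][trans[q] > 0]       # outgoing transitions of state q
        lp += log_dirichlet(out, alpha)    # Dirichlet prior on transitions
        lp += -np.log(sigmas[q]).sum()     # Jeffreys prior, P ∝ 1/sigma
    return lp
```

Note that the score still depends on the base value $C$, which is exactly the weakness addressed next.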
2.2.2. Bayesian Information Criterion for HMMs

One problem with the PPM criterion is that it depends heavily on the base value, $C$, of the exponential distribution for the global model structure probability. Currently, we do not have a strategy for selecting the exponential base value for different problems, and the model selection performance deteriorates if the right base value is not used. An alternative scheme is the Bayesian model selection approach. A criterion often used in Bayesian model selection is the relative model posterior probability, $P(\lambda, X)$, given by $P(\lambda, X) = P(\lambda)P(X|\lambda)$. By assuming a uniform prior probability for the different models, $P(\lambda, X) \propto P(X|\lambda)$, where $P(X|\lambda)$ is the marginal likelihood. The goal of this approach is to select the model that gives the highest marginal likelihood. Computing the marginal likelihood for complex models has been an active research area. Established approaches include Monte Carlo methods, e.g., Gibbs sampling, 23, 24 and various approximation methods, e.g., the Laplace approximation and the approximation based on the Bayesian information criterion. 21 It has been well documented that the Monte Carlo methods are very accurate, but computationally inefficient, especially for large databases. It has also been shown that under certain regularity conditions the Laplace approximation can be quite accurate, but its computation can be expensive, especially the computation of its Hessian matrix component.
A widely used and very efficient approximation method for the marginal likelihood is the Bayesian Information Criterion (BIC), where, in log form, the marginal likelihood of a model given the data is computed as

$$\log P(\lambda|X) = \log P(X|\hat{\lambda}) - \frac{d}{2} \log N,$$

where $\hat{\lambda}$ is the maximum likelihood (ML) configuration of the model, $d$ is the dimensionality of the model parameter space, and $N$ is the number of cases in the data. We choose BIC as our alternative HMM model selection criterion.

2.3. Search Level 2: The Optimal Partition Structure

The two most commonly used distance measures in the context of the HMM representation are the sequence-to-model likelihood measure and the symmetrized distance measure between pairwise models. 26 We choose the sequence-to-model likelihood distance measure for our HMM clustering algorithm. The sequence-to-HMM likelihood, $P(O|\lambda)$, measures the probability that a sequence, $O$, is generated by a given model, $\lambda$. When the sequence-to-HMM likelihood distance measure is used for object-to-cluster assignments, it automatically enforces the criterion of maximizing within-group similarity. A K-means style clustering control structure and a depth-first binary divisive clustering control structure are proposed to generate partitions having different numbers of clusters. For each partition, the initial object-to-cluster memberships are determined by the sequence-to-HMM likelihood distance measure (see Section 2.2.1). The objects are subsequently redistributed after HMM parameter reestimation and HMM model refinement have been applied to the intermediate clusters. For the K-means algorithm, the redistribution is global across all clusters. For binary hierarchical clustering, the redistribution is carried out between the child clusters of the current cluster. Thus the algorithm is not guaranteed to produce the maximally probable partition of the data set. If the goal is to have a single partition of the data, the K-means style control structure may be used. If one wants to look at partitions at various levels of detail, binary divisive clustering may be more suitable.
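As an illustration (names are ours), the BIC score is a one-liner, and one redistribution pass assigns each object to the cluster whose HMM gives it the highest sequence-to-model log-likelihood; `seq_loglik` stands for any such routine, e.g., the forward pass from the earlier sketch.

```python
import numpy as np

def bic_score(loglik, d, N):
    """BIC approximation to the log marginal likelihood:
    log P(X|model) ~= log P(X|ML params) - (d/2) * log N."""
    return loglik - 0.5 * d * np.log(N)

def reassign_objects(objects, models, seq_loglik):
    """One K-means-style redistribution pass: each object moves to the
    cluster whose HMM assigns it the highest log P(O | model)."""
    return [int(np.argmax([seq_loglik(m, obs) for m in models]))
            for obs in objects]
```

In the K-means style control structure this pass alternates with reestimation over all clusters until memberships stabilize; in the binary divisive variant it runs only between the two child clusters of the node being split.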
2.4. Search Level 1: The Optimal Number of Clusters in the Partition

The quality of a clustering is measured in terms of its within-cluster similarity and between-cluster dissimilarity. A common criterion measure used by a number of HMM clustering schemes is the overall likelihood of the data given the models of the set of clusters. Since our distance measure does well at maximizing the homogeneity of objects within each cluster, we want a criterion measure that is good at comparing partitions in terms of their between-cluster distances. We use the Partition Mutual Information (PMI) measure 27 for this task. From Bayes' rule, the posterior probability of a model, $\lambda_i$, trained on data, $O_i$, is given by

$$P(\lambda_i|O_i) = \frac{P(O_i|\lambda_i)P(\lambda_i)}{P(O_i)} = \frac{P(O_i|\lambda_i)P(\lambda_i)}{\sum_{j=1}^{J} P(O_i|\lambda_j)P(\lambda_j)},$$

where $P(\lambda_i)$ is the prior probability of the data coming from cluster $i$ before the feature values are inspected, and $P(O_i|\lambda_i)$ is the conditional probability of displaying the features $O_i$ given that the data comes from cluster $i$. Let $MI_i$ represent the average mutual information between the observation sequence $O_i$ and the complete set of models $\Lambda = (\lambda_1, \ldots, \lambda_J)$:

$$MI_i = \log P(\lambda_i|O_i) = \log\big(P(O_i|\lambda_i)P(\lambda_i)\big) - \log \sum_{j=1}^{J} P(O_i|\lambda_j)P(\lambda_j).$$

Maximizing this value is equivalent to separating the correct model $\lambda_i$ from all other models on the training sequence $O_i$. The overall information of the partition with $J$ models is then computed by summing the mutual information over all training sequences:

$$PMI = \sum_{j=1}^{J} \sum_{i=1}^{n_j} MI_i,$$

where $n_j$ is the number of objects in cluster $j$, and $J$ is the total number of clusters in the partition. PMI is maximized when the $J$ models are the most separated set of models, without fragmentation.
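The PMI formula can be computed directly from a matrix of per-object log-likelihoods; the sketch below is ours, with the evidence term computed in the log domain for numerical stability.

```python
import numpy as np

def partition_mutual_information(loglik, log_prior, labels):
    """PMI = sum over objects of MI_i, where
    MI_i = log(P(O_i|model_i)P(model_i)) - log sum_j P(O_i|model_j)P(model_j).
    loglik:    (n, J) log P(O_i | model_j) for every object/model pair
    log_prior: (J,)   log P(model_j), e.g. log of cluster proportions
    labels:    (n,)   cluster index of each object"""
    labels = np.asarray(labels)
    joint = loglik + log_prior                        # log P(O_i|m_j)P(m_j)
    log_evidence = np.logaddexp.reduce(joint, axis=1) # log-domain denominator
    mi = joint[np.arange(len(labels)), labels] - log_evidence
    return mi.sum()
```

In the search over partitions, a candidate split is accepted only when it increases PMI; this is the branch-termination rule used in Experiment One below.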
3. PRELIMINARY RESULTS

We have conducted preliminary experiments with HMM clustering on artificially generated data. Since we had not finished implementing the HMM refinement procedure, in the following experiments we assume the correct model structure is known and fixed throughout the clustering process. Therefore, uniform prior distributions are assumed for all HMMs in the computations. For these experiments, the objective of HMM clustering is to derive a good partition with the optimal number of clusters and object-cluster memberships. To generate data with $K$ clusters, we first manually create $K$ HMMs. From each of these $K$ HMMs, we generate $N_k$ objects, each described by $M$ temporal sequences of length $L$. The total number of data points in such a data set is $K \cdot N_k \cdot M \cdot L$. In these experiments we choose $K = 4$ and $M = 2$, and the HMM for each cluster has 5 states. Figure 1 shows four example data objects from these models. From the feature values alone it is quite difficult to tell which objects were generated from the same model; in fact, objects 1 and 3 are generated from the same model, and objects 2 and 4 are generated from a different model.

Figure 1. Objects generated from different HMMs (four example objects, each plotted over Feature 1 and Feature 2).
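A sketch of this data generation, assuming one-dimensional Gaussian emissions per state and, for simplicity, independently sampled feature sequences per object (the coupling between the M features is not specified in the text, so treat this as illustrative):

```python
import numpy as np

def sample_sequence(A, means, sigmas, pi, L, rng):
    """Sample one length-L observation sequence from a Gaussian-emission HMM."""
    obs = np.zeros(L)
    s = rng.choice(len(pi), p=pi)          # draw the initial hidden state
    for t in range(L):
        obs[t] = rng.normal(means[s], sigmas[s])
        s = rng.choice(len(pi), p=A[s])    # step the hidden Markov chain
    return obs

def generate_dataset(hmms, n_per_cluster, M, L, seed=0):
    """K hand-built HMMs -> N_k objects each; one object = M sequences."""
    rng = np.random.default_rng(seed)
    objects, labels = [], []
    for k, (A, means, sigmas, pi) in enumerate(hmms):
        for _ in range(n_per_cluster):
            objects.append([sample_sequence(A, means, sigmas, pi, L, rng)
                            for _ in range(M)])
            labels.append(k)
    return objects, labels
```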
3.1. Experiment One

In this experiment, we illustrate the binary HMM clustering process and the effect of the PMI criterion measure. In the first part of the experiment, the PMI criterion measure was not incorporated into the binary clustering tree building process. A branch of the tree terminates either because there are too few objects in the node, or because the object redistribution process in a node ends with a one-cluster partition. The full binary clustering tree, as well as the PMI scores for the intermediate and final partitions, are shown in Figure 2(a). The PMI scores to the right of the tree indicate the quality of the current partition, which includes all nodes at the frontier of the current tree. For example, the PMI score for the partition having clusters $C_4$ and $C_{123}$ is 0.0, and the PMI score for the partition having clusters $C_4$, $^{4}C_2$, $^{26}C_2$, and $C_{13}$ is $-1.752$. The result of this clustering process is a 7-cluster partition with six fragmented clusters, i.e., cluster $C_2$ is fragmented into $^{4}C_2$ and $^{26}C_2$, cluster $C_3$ is fragmented into $^{1}C_3$ and $^{29}C_3$, and cluster $C_1$ is fragmented into $^{4}C_1$ and $^{26}C_1$. Figure 2(b) shows the binary HMM clustering tree where the PMI criterion measure is used for determining branch terminations. The dotted lines cut off branches of the search tree where the split of the parent cluster results in a decrease in the PMI score. This clustering process rediscovers the correct 4-cluster partition.

Figure 2. The binary HMM clustering tree: (a) the full tree, annotated with PMI scores before and after each split; (b) the tree with PMI-based branch termination.

Figure 3. HMM clustering results for the K-means and binary divisive control structures: (a) misclassification counts for data having different levels of noise (vs. S/N ratio); (b) misclassification counts when clustering starts with different size HMMs (vs. number of states).
3.2. Experiment Two

In this experiment, we study the performance of the HMM clustering system when the data is corrupted by different levels of noise. White Gaussian noise was added to the data, computed at different signal-to-noise ratios. 28 More noise is successively added to the original 4-cluster data, i.e., the signal-to-noise ratio is successively decreased from 35 to 1. Figure 3(a) shows the clustering results in terms of misclassification counts vs. the signal-to-noise ratio. We observe that noise does not have much effect on the clustering results until it becomes very large, i.e., $S/N_{dB} < 5$. Beyond that point, the clustering process fails to separate out the objects from three of the HMMs.

3.3. Experiment Three

In this experiment, we study the effect of different initial HMM structures on clustering performance. The four original HMMs all have 5 states; the initial HMMs in this experiment have numbers of states ranging from 2 to 8. Figure 3(b) shows the results in terms of misclassification counts versus the number of states in the initial HMMs. The clustering results remain the same for initial HMMs having 3, 4, and 5 states. For initial HMMs having 2 states, the misclassification is high: the algorithm fails to separate objects from HMM models 2 and 3. For initial HMMs having 6, 7, and 8 states, the clustering partition generated is close to optimal. This result agrees with the intuition that initial HMMs having too few states will result in a worse clustering partition than initial HMMs having too many states. The reason is that when a model is too small, multiple state definitions have to be squeezed into one state, which makes the state definitions less specific and the model less accurate. On the other hand, when there are extra states in the model, the model can be made more accurate by dividing a single state definition into multiple state definitions. At the very least, the original model can be retained by setting the transitions into the extra states very small, effectively ignoring those states.

REFERENCES

1. P. Cheeseman and J. Stutz, "Bayesian classification (AutoClass): Theory and results," in Advances in Knowledge Discovery and Data Mining, U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., ch. 6, pp. 153-180, AAAI/MIT Press, 1996.
2. G. Biswas, J. Weinberg, and C. Li, "ITERATE: A conceptual clustering method for knowledge discovery in databases," in Artificial Intelligence in the Petroleum Industry: Symbolic and Computational Applications, B. Braunschweig and R. Day, eds., Editions Technip, 1995.
3. D. Fisher, "Knowledge acquisition via incremental conceptual clustering," Machine Learning 2, pp. 139-172, 1987.
4. C. S. Wallace and D. L. Dowe, "Intrinsic classification by MML - the Snob program," in Proceedings of the Seventh Australian Joint Conference on Artificial Intelligence, pp. 37-44, World Scientific, 1994.
5. C. Li, "Unsupervised classification on temporal data," survey paper, Department of Computer Science, Vanderbilt University, Apr. 1998.
6. L. R. Rabiner, C. H. Lee, B. H. Juang, and J. G. Wilpon, "HMM clustering for connected word recognition," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1989.
7. K. F. Lee, "Context-dependent phonetic hidden Markov models for speaker-independent continuous speech recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing 38(4), pp. 599-609, 1990.
8. E. Dermatas and G. Kokkinakis, "Algorithm for clustering continuous density HMM by recognition error," IEEE Transactions on Speech and Audio Processing, pp. 231-234, May 1996.
9. T. Kosaka, S. Masunaga, and M. Kuraoka, "Speaker-independent phone modeling based on speaker-dependent HMMs' composition and clustering," in Proceedings of ICASSP '95, pp. 441-444, 1995.
10. P. Smyth, "Clustering sequences with hidden Markov models," in Advances in Neural Information Processing Systems, 1997.
11. L. E. Baum, T. Petrie, G. Soules, and N. Weiss, "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains," The Annals of Mathematical Statistics 41(1), pp. 164-171, 1970.
12. A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B (Methodological) 39, pp. 1-38, 1977.
13. Z. Ghahramani and M. I. Jordan, "Factorial hidden Markov models," Tech. Rep. 9502, MIT Computational Cognitive Science, Aug. 1995.
14. A. Stolcke and S. M. Omohundro, "Best-first model merging for hidden Markov model induction," Tech. Rep. TR-94-003, International Computer Science Institute, 1947 Center St., Suite 600, Berkeley, CA 94704, Jan. 1994.
15. S. M. Omohundro, "Best-first model merging for dynamic learning and recognition," in Advances in Neural Information Processing Systems 4, pp. 958-965, 1992.
16. J. Takami and S. Sagayama, "A successive state splitting algorithm for efficient allophone modeling," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1, pp. 573-576, 1992.
17. M. Ostendorf and H. Singer, "HMM topology design using maximum likelihood successive state splitting," Computer Speech and Language 11, pp. 17-41, 1997.
18. F. Casacuberta, E. Vidal, and B. Mas, "Learning the structure of HMM's through grammatical inference techniques," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 717-720, 1990.
19. G. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley Publishing Co., 1973.
20. R. E. Kass and A. E. Raftery, "Bayes factors," Journal of the American Statistical Association 90, pp. 773-795, June 1995.
21. D. Heckerman, "A tutorial on learning with Bayesian networks," Tech. Rep. MSR-TR-95-06, Microsoft Research, Advanced Technology Division, One Microsoft Way, Redmond, WA 98052, 1995.
22. G. F. Cooper and E. Herskovits, "A Bayesian method for the induction of probabilistic networks from data," Machine Learning 9, pp. 309-347, 1992.
23. S. Chib, "Marginal likelihood from the Gibbs output," Journal of the American Statistical Association 90, pp. 1313-1321, Dec. 1995.
24. G. Casella and E. I. George, "Explaining the Gibbs sampler," The American Statistician 46, pp. 167-174, Aug. 1992.
25. L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE 77, pp. 257-286, Feb. 1989.
26. B. H. Juang and L. R. Rabiner, "A probabilistic distance measure for hidden Markov models," AT&T Technical Journal 64, pp. 391-408, Feb. 1985.
27. L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer, "Maximum mutual information estimation of hidden Markov model parameters," in Proceedings of the IEEE-IECEJ-ASJ International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 49-52, 1986.
28. D. J. Mashao, Computations and Evaluations of an Optimal Feature-set for an HMM-based Recognizer, PhD thesis, Brown University, May 1996.