Latent Variable Models for the Analysis, Visualization and Prediction of Network and Nodal Attribute Data. Isabella Gollini.
Page 1

Latent Variable Models for the Analysis, Visualization and Prediction of Network and Nodal Attribute Data
School of Engineering, University of Bristol
isabella.gollini@bristol.ac.uk
January 2014
Joint work with Prof. Brendan Murphy (University College Dublin)

About Me
- With Jonty: probabilistic methods for uncertainty assessment and quantification in natural hazards (floods, volcanoes, earthquakes, etc.).
- Models to cluster binary data with complex dependence structure: Gollini, I., and Murphy, T.B. (2014), Mixture of Latent Trait Analyzers for Model-Based Clustering of Categorical Data, Statistics and Computing.
- Models for network data.
- Use of variational methods for fast approximate inference.

Outline
- Notation
- Network Data
- Latent Space Models for Networks
- Variational Inference
- Factor Analysis for Nodal Attributes
- Joint Model for Network and Nodal Attributes
- Latent Variable Models for Multiple Networks

Notation
- N: number of nodes of the observed network.
- Y (N x N): adjacency matrix, with y_{ij} = 1 if there is an edge between nodes i and j, and y_{ij} = 0 otherwise.
- X (N x M): matrix of M nodal attributes.
- z_n ~ N(0, \sigma^2 I): D-dimensional continuous latent variable.

Latent Space Model (LSM) for Networks
Hoff et al. (2002) introduced a model that assumes that each node n has an unknown position z_n in a D-dimensional Euclidean latent space:

p(Y | Z, \alpha) = \prod_{i \neq j} p(y_{ij} | z_i, z_j, \alpha) = \prod_{i \neq j} \frac{\exp(\alpha - d_{ij})^{y_{ij}}}{1 + \exp(\alpha - d_{ij})},

with p(\alpha) = N(\xi, \psi^2) and p(z_n) iid N(0, \sigma^2 I_D), where \sigma, \xi, \psi are fixed parameters. The posterior distribution cannot be calculated analytically.

NOTE: We propose to use the squared Euclidean distance, d_{ij} = ||z_i - z_j||^2, in place of the Euclidean distance d_{ij} = ||z_i - z_j||.

Why the squared Euclidean distance?
- It requires fewer approximations in the estimation procedure.
- It makes potential clusters easier to see, giving a higher probability of a link between two nodes that are close in the latent space and a lower probability between two nodes lying far from each other.

[Figure: the link probability p(y_{ij} = 1) = exp(\alpha - d_{ij}) / (1 + exp(\alpha - d_{ij})) plotted against d_{ij} = ||z_i - z_j|| and against d_{ij} = ||z_i - z_j||^2 for several values of \alpha.]
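As a concrete illustration of the link function above, here is a minimal sketch in base R of p(y_{ij} = 1 | z_i, z_j, \alpha) under the squared Euclidean distance; the function name link_prob and the toy positions are hypothetical, not part of the talk.

```r
# Minimal sketch: p(y_ij = 1 | z_i, z_j, alpha) under the squared Euclidean
# distance; link_prob and the example values are hypothetical.
link_prob <- function(z_i, z_j, alpha) {
  d_ij <- sum((z_i - z_j)^2)   # squared Euclidean distance in the latent space
  plogis(alpha - d_ij)         # exp(alpha - d_ij) / (1 + exp(alpha - d_ij))
}

link_prob(c(0, 0), c(0.5, 0), alpha = 1)  # nearby pair: high link probability
link_prob(c(0, 0), c(3, 0),   alpha = 1)  # distant pair: low link probability
```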
Page 2

LSM for Networks: Variational Approach
We fit the model using a variational inference approach that is considerably quicker, but less accurate, than MCMC. The posterior distribution of the unknowns (Z, \alpha) is

p(Z, \alpha | Y) = p(Y | Z, \alpha) \, p(\alpha) \prod_{n=1}^{N} p(z_n) / C,

where C is the unknown normalising constant. We propose a variational posterior q(Z, \alpha | Y), introducing variational parameters \tilde\xi, \tilde\psi, \tilde z_n, \tilde\Sigma:

q(Z, \alpha | Y) = q(\alpha) \prod_{n=1}^{N} q(z_n),

where q(\alpha) = N(\tilde\xi, \tilde\psi^2) and q(z_n) = N(\tilde z_n, \tilde\Sigma).

Variational Approach
The basic idea behind the variational approach is to find a lower bound on the log marginal likelihood \log p(Y) by introducing the variational posterior distribution q(Z, \alpha | Y). This amounts to minimizing the Kullback-Leibler divergence between the variational posterior q(Z, \alpha | Y) and the true posterior p(Z, \alpha | Y):

KL[q(Z, \alpha | Y) || p(Z, \alpha | Y)]
= \int q(Z, \alpha | Y) \log \frac{q(Z, \alpha | Y)}{p(Z, \alpha | Y)} \, d(Z, \alpha)
= \int q(Z, \alpha | Y) \log \frac{p(Y) \, q(Z, \alpha | Y)}{p(Y, Z, \alpha)} \, d(Z, \alpha)
= - \int q(Z, \alpha | Y) \log \frac{p(Y, Z, \alpha)}{q(Z, \alpha | Y)} \, d(Z, \alpha) + \log p(Y).

The KL divergence can be written (up to the additive constant \log p(Y)) as

KL[q(Z, \alpha | Y) || p(Z, \alpha | Y)] = KL[q(\alpha) || p(\alpha)] + \sum_{i=1}^{N} KL[q(z_i) || p(z_i)] - E_{q(Z, \alpha | Y)}[\log p(Y | Z, \alpha)].

The term E_{q(Z, \alpha | Y)}[\log p(Y | Z, \alpha)] is approximated using Jensen's inequality:

E_q[\log p(Y | Z, \alpha)] = \sum_{i \neq j} \{ y_{ij} E_q[\alpha - ||z_i - z_j||^2] - E_q[\log(1 + \exp(\alpha - ||z_i - z_j||^2))] \}
\geq \sum_{i \neq j} \{ y_{ij} E_q[\alpha - ||z_i - z_j||^2] - \log(1 + E_q[\exp(\alpha - ||z_i - z_j||^2)]) \}.

EM Algorithm
At the (i+1)-th iteration:
- E-Step: estimate \tilde z_n^{(i+1)} and \tilde\Sigma^{(i+1)} by minimizing KL[q(Z, \alpha | Y) || p(Z, \alpha | Y)] with \tilde\Theta^{(i)} = (\tilde\xi^{(i)}, \tilde\psi^{(i)}) held fixed.
- M-Step: estimate \tilde\xi and \tilde\psi: \tilde\Theta^{(i+1)} = argmax_{\tilde\Theta} Q(\tilde\Theta; \tilde\Theta^{(i)}).

Monks Network: Comparison of Estimation Methods and Distance Metrics
Sampson (1969) recorded the social interactions among a group of N = 18 monks while he was resident in a New England monastery. The directed links of the network represent liking relationships.

[Figure: latent positions for the Sampson data estimated by MCMC with the Euclidean distance (latentnet), by the variational approach with the Euclidean distance (VBLPCM), and by the variational approach with the squared Euclidean distance.]
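One piece of the KL decomposition above has a standard closed form: the per-node term KL[q(z_n) || p(z_n)] between the Gaussian variational posterior and the Gaussian prior. A minimal base-R sketch, assuming q(z_n) = N(m, S) and p(z_n) = N(0, \sigma^2 I_D); kl_to_prior is a hypothetical name, not from the talk.

```r
# Minimal sketch: closed-form KL[ N(m, S) || N(0, sigma2 * I_D) ], the per-node
# prior term in the KL decomposition; kl_to_prior is a hypothetical name.
kl_to_prior <- function(m, S, sigma2) {
  D <- length(m)
  0.5 * (sum(diag(S)) / sigma2 +      # tr(S) / sigma^2
         sum(m^2) / sigma2 -          # ||m||^2 / sigma^2
         D +                          # - D
         D * log(sigma2) -            # + log det(sigma^2 I)
         as.numeric(determinant(S, logarithm = TRUE)$modulus))  # - log det(S)
}

kl_to_prior(m = c(0.3, -0.2), S = diag(0.1, 2), sigma2 = 1)
```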
Page 3

Variational Inference vs MCMC
Pros:
- Closed-form posteriors.
- Far faster than MCMC-based methods.
- In the absence of posterior dependence, the lower bound would match the log likelihood. As long as the posterior dependence is weak, the variational approach may be useful for larger networks, as a starting point for MCMC algorithms, and to explore the model space.
Cons:
- Underestimates variances.
- Difficult to assess how tight the lower bound is.
- Sensitive to starting values (local minima).

Network and Nodal Attributes
The classical approach to incorporating covariates in the LSM is

p(Y | Z, \alpha, X, \beta) = \prod_{i \neq j} \frac{\exp(\alpha + \beta^T x_{ij} - ||z_i - z_j||^2)^{y_{ij}}}{1 + \exp(\alpha + \beta^T x_{ij} - ||z_i - z_j||^2)},

where \beta and x_{ij} are vectors of length M. This contains only link covariate information x_{ij}, so it is not designed to deal with nodal attributes directly. This model assumes that the probability of a link depends on the nodal attributes (social selection); sometimes, instead, the nodal attributes depend on the network links (social influence). We present a model in which the network and the nodal attributes mutually depend on each other.

Factor Analysis (FA) for Nodal Attributes
Factor analysis (Spearman, 1904) is a useful technique to visualize continuous data, reducing the data dimensionality from M to D (where D << M) in order to explain the variability expressed by the correlation within the data. FA assumes that a continuous latent variable z_n underlies the behaviour of the continuous response variables in an observation x_n:

x_n = \mu + \Lambda z_n + \varepsilon_n,

with z_n iid N(0, \sigma^2 I) and \varepsilon_n iid N(0, \Psi), where \Psi = diag(\psi_1, ..., \psi_M), so that marginally

p(x_n) = N(\mu, (\sigma\Lambda)(\sigma\Lambda)^T + \Psi).

The EM algorithm is used to find the maximum likelihood estimates, and the posterior p(z_n | x_n) = N(\hat z_n, \hat\Sigma) can be calculated analytically in closed form.

The Joint Model for Network and Nodal Attributes
The probability of a node being connected with other nodes and the behaviour of its nodal attributes are explained by the same latent variable: a continuous latent variable z_n ~ N(0, \sigma^2 I_D) summarizes the information given by both the network and the nodal attributes. The network Y and the nodal attributes x_n are independent given the latent variable z_n.

[Graphical model: the latent positions Z generate the nodal attributes X through the FA parameters (\Lambda, \Psi) and the network Y through the LSM parameter \alpha.]

Fitting the Model
We assume that p(z_n) = N(0, \sigma^2 I). The network data are modeled via the LSM: p(z_n | Y) \approx N(\check z_n, \check\Sigma). The nodal attributes are modeled via FA: p(z_n | x_n) = N(\hat z_n, \hat\Sigma). The joint model is

p(z_n | Y, x_n) \propto \frac{p(z_n | Y) \, p(z_n | x_n)}{p(z_n)} \approx N(\tilde z_n, \tilde\Sigma),

where

\tilde\Sigma = ( \check\Sigma^{-1} + \hat\Sigma^{-1} - \sigma^{-2} I_D )^{-1}   and   \tilde z_n = \tilde\Sigma ( \check\Sigma^{-1} \check z_n + \hat\Sigma^{-1} \hat z_n ).
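The FA posterior p(z_n | x_n) = N(\hat z_n, \hat\Sigma) quoted above has the standard closed form; here is a minimal base-R sketch under the prior z_n ~ N(0, \sigma^2 I). The name fa_posterior and its argument names are hypothetical.

```r
# Minimal sketch: p(z_n | x_n) = N(z_hat, S_hat) for the FA model
# x_n = mu + Lambda z_n + eps_n, eps_n ~ N(0, diag(psi)), prior z_n ~ N(0, sigma2 * I).
fa_posterior <- function(x_n, mu, Lambda, psi, sigma2 = 1) {
  D     <- ncol(Lambda)
  prec  <- diag(D) / sigma2 + t(Lambda) %*% (Lambda / psi)  # sigma^-2 I + Lambda' Psi^-1 Lambda
  S_hat <- solve(prec)
  z_hat <- S_hat %*% t(Lambda) %*% ((x_n - mu) / psi)       # S_hat Lambda' Psi^-1 (x_n - mu)
  list(mean = z_hat, cov = S_hat)
}
```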
Page 4

The Joint Model for Network and Nodal Attributes: EM Algorithm
- E-Step: estimate \tilde\Sigma^{(i+1)} and \tilde z_n^{(i+1)}:

Q(\Theta_{LSM}, \Theta_{FA}; \Theta_{LSM}^{(i)}, \Theta_{FA}^{(i)}) = E_{p(Z | Y, X; \Theta_{LSM}^{(i)}, \Theta_{FA}^{(i)})}[\log p(Y, Z | \Theta_{LSM})] + E_{p(Z | Y, X; \Theta_{LSM}^{(i)}, \Theta_{FA}^{(i)})}[\log p(X, Z | \Theta_{FA})],

therefore

\tilde\Sigma^{(i+1)} = ( [\check\Sigma^{(i+1)}]^{-1} + [\hat\Sigma^{(i+1)}]^{-1} - \sigma^{-2} I_D )^{-1},
\tilde z_n^{(i+1)} = \tilde\Sigma^{(i+1)} ( [\check\Sigma^{(i+1)}]^{-1} \check z_n^{(i+1)} + [\hat\Sigma^{(i+1)}]^{-1} \hat z_n^{(i+1)} ).

The fixed parameters: the prior p(z_n) = N(0, I), with \xi and \psi held at fixed values.
- M-Step: update (\Theta_{LSM}^{(i+1)}, \Theta_{FA}^{(i+1)}) = argmax Q(\Theta_{LSM}, \Theta_{FA}; \Theta_{LSM}^{(i)}, \Theta_{FA}^{(i)}).

Yeast Proteins
The network is composed of nodes representing the interactions between Saccharomyces cerevisiae (yeast) proteins. The nodal attributes consist of expression levels during yeast sporulation from M = 8 experiments with Saccharomyces cerevisiae proteins. Factor analysis is an appropriate tool to visualize the M-dimensional expression data in a low-dimensional latent space.

Results and Performance
[Figure: ROC curves with AUC values for the joint model and the single models; boxplots of the estimated probabilities of a link for the true negatives and true positives; LSM positions (left) and LSJM positions (right).]

Multiple Network Views
In many applications the behaviour of the nodes is strongly shaped by the complex interplay of many interactions:
- Longitudinal networks: the links represent the same relation at different time points.
- Multiplex networks: the links come from different kinds of relations (e.g. genetic and physical).

Joint Modelling of Multiple Network Views
We have K networks on the same nodes, and we propose a model that merges the information given by all these networks. A continuous latent variable z_n ~ N(0, \sigma^2 I_D) identifies the position of node n in a D-dimensional latent space.

[Graphical model: a single latent Z generates the K network views Y_1, ..., Y_K.]
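The E-step update above combines the two Gaussian posteriors by adding precisions and subtracting the shared prior precision; here is a minimal base-R sketch of that combination. The name combine_posteriors and its arguments are hypothetical.

```r
# Minimal sketch: combine the LSM posterior N(z_lsm, S_lsm) and the FA posterior
# N(z_fa, S_fa), dividing out the shared prior N(0, sigma2 * I_D), to obtain the
# joint posterior N(z_joint, S_joint); all names are hypothetical.
combine_posteriors <- function(z_lsm, S_lsm, z_fa, S_fa, sigma2 = 1) {
  D       <- length(z_lsm)
  prec    <- solve(S_lsm) + solve(S_fa) - diag(D) / sigma2   # joint precision
  S_joint <- solve(prec)
  z_joint <- S_joint %*% (solve(S_lsm) %*% z_lsm + solve(S_fa) %*% z_fa)
  list(mean = z_joint, cov = S_joint)
}
```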
Page 5

Joint Modelling of Multiple Network Views: Model
The probability of a link depends on the distance between two nodes in the latent space:

p(Y_1, ..., Y_K | Z, \alpha) = \prod_{k=1}^{K} \prod_{i \neq j} \frac{\exp(\alpha_k - ||z_i - z_j||^2)^{y_{ijk}}}{1 + \exp(\alpha_k - ||z_i - z_j||^2)}.

Variational Approach
For k = 1, ..., K: p(z_n | Y_k) \approx N(\tilde z_{nk}, \tilde\Sigma_k). Joining the K models:

p(z_n | Y_1, ..., Y_K; \Theta_1, ..., \Theta_K) \propto \frac{\prod_{k=1}^{K} p(z_n | Y_k; \Theta_k)}{p(z_n)^{K-1}} \approx N(\tilde z_n, \tilde\Sigma),

where

\tilde\Sigma = ( \sum_{k=1}^{K} \tilde\Sigma_k^{-1} - (K - 1) \sigma^{-2} I_D )^{-1}   and   \tilde z_n = \tilde\Sigma \sum_{k=1}^{K} \tilde\Sigma_k^{-1} \tilde z_{nk}.

EM Algorithm
- E-Step: estimate the parameters of the joint posterior distribution, \tilde\Sigma^{(i+1)} and \tilde z_n^{(i+1)}:

Q(\Theta_1, ..., \Theta_K; \Theta_1^{(i)}, ..., \Theta_K^{(i)}) = \sum_{k=1}^{K} E_{p(Z | Y_1, ..., Y_K; \Theta_1^{(i)}, ..., \Theta_K^{(i)})}[\log p(Y_k, Z | \Theta_k)].

We estimate the parameters \tilde z_{nk}, \tilde\Sigma_k of the posterior distribution p(z_n | Y_k; \Theta_k) given each network k separately, and then merge these estimates to find the joint posterior distribution of the latent positions, N(\tilde z_n, \tilde\Sigma).
- M-Step: update the variational model parameters \tilde\xi_1, ..., \tilde\xi_K, \tilde\psi_1, ..., \tilde\psi_K: (\Theta_1^{(i+1)}, ..., \Theta_K^{(i+1)}) = argmax Q(\Theta_1, ..., \Theta_K; \Theta_1^{(i)}, ..., \Theta_K^{(i)}).

LSJM: Monks Network
We analyze the networks of liking relationships at K = 3 time points, fitting the LSM to each network separately.

[Figure: LSM latent positions for the Sampson data at each of the three time points, and the LSJM joint positions.]

LSJM: Protein-Protein Interactions
K = 2 undirected networks formed by genetic and physical protein-protein interactions between N = 67 Saccharomyces cerevisiae proteins. The complex relational structure of this dataset has led to the implementation of models aiming at describing the functional relationships between the observations. The data were downloaded from the Biological General Repository for Interaction Datasets (BioGRID) database.

[Figure: latent posterior positions from fitting the LSM to the genetic and the physical interaction networks separately.]
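The merge step above is again a product of Gaussians, now over K views, dividing out K - 1 copies of the prior. A minimal base-R sketch; merge_views, z_list and S_list are hypothetical names.

```r
# Minimal sketch: merge per-view posteriors N(z_k, S_k), k = 1..K, into the joint
# posterior N(z_joint, S_joint); merge_views, z_list, S_list are hypothetical.
merge_views <- function(z_list, S_list, sigma2 = 1) {
  K       <- length(z_list)
  D       <- length(z_list[[1]])
  prec    <- Reduce(`+`, lapply(S_list, solve)) - (K - 1) * diag(D) / sigma2
  S_joint <- solve(prec)
  z_joint <- S_joint %*% Reduce(`+`, Map(function(S, z) solve(S) %*% z, S_list, z_list))
  list(mean = z_joint, cov = S_joint)
}
```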
Page 6

LSJM Protein-Protein Interactions: ROC and Box Plots
[Figure: ROC curves (AUC = 95% for the genetic interactions and 95% for the physical interactions) and boxplots of the estimated probabilities of a link for the true negatives and true positives.]

LSJM Protein-Protein Interactions: LSJM Positions
[Figure: left, the joint posterior p(z_n | Y_1, ..., Y_K; \Theta_1, ..., \Theta_K) \approx N(\tilde z_n, \tilde\Sigma) from fitting the LSJM; right, the dots represent the overall positions \tilde z_n and the arrows connect them to the estimated positions under each single model p(z_n | Y_k; \Theta_k) = N(\tilde z_{nk}, \tilde\Sigma_k).]

LSJM: Missing Links and Missing Nodes
Missing (unobserved) links can be easily handled by the LSJM using the information given by all the network views. To estimate the probability of the presence or absence of an edge we employ the posterior means of the \alpha_k and of the latent positions, so that

p(y_{ijk} = 1 | \tilde z_i, \tilde z_j, \tilde\xi_k) = \frac{\exp(\tilde\xi_k - ||\tilde z_i - \tilde z_j||^2)}{1 + \exp(\tilde\xi_k - ||\tilde z_i - \tilde z_j||^2)}.

If we want to infer whether to assign y_{ijk} = 1 or not, we need to introduce a threshold \tau_k, and let y_{ijk} = 1 if p(y_{ijk} = 1 | \tilde z_i, \tilde z_j, \tilde\xi_k) > \tau_k.

LSJM Protein-Protein Interactions: Missing Data
To evaluate the link prediction we applied cross-validation, setting a percentage of the links to be missing at each time point.
- Missing links (cross-validation): the LSJM gives a misclassification rate of 9% for the genetic interaction network and 6% for the physical interaction network; the single LSM gives 8% for the genetic interaction network and 7% for the physical interaction network.
- Missing nodes (cross-validation): the LSJM gives a misclassification rate of 4% for the genetic interaction network and % for the physical interaction network; the single LSM is useless here, since it would locate the missing nodes relying only on the prior information.

[Figure: ROC curves for the missing-data experiments, fitting an LSJM (left) and a single LSM (right); the reported AUC values include 66% for the genetic interactions and 88% for the physical interactions.]

Future direction: try to improve the predictions by using a higher dimension for the latent variables.
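A minimal base-R sketch of the plug-in prediction rule above; predict_link is a hypothetical name, and the default threshold of 0.5 is an assumption, since the talk does not fix a value for \tau_k.

```r
# Minimal sketch: plug-in prediction of a missing edge in view k from the
# posterior means, thresholded at tau_k; predict_link is a hypothetical name.
predict_link <- function(z_i, z_j, xi_k, tau_k = 0.5) {
  p <- plogis(xi_k - sum((z_i - z_j)^2))  # p(y_ijk = 1 | z_i, z_j, xi_k)
  list(prob = p, edge = as.integer(p > tau_k))
}
```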
Page 7

Conclusions
The joint models are particularly useful to:
- Locate unconnected nodes/subgraphs in the latent space.
- Estimate missing links.
They have a wide range of applications, and variational Bayes makes it possible to deal with networks of thousands of nodes.

Possible extensions:
- Joint models for directed networks, using the inner product instead of the Euclidean distance.
- Joint models for categorical nodal attributes (LTA instead of FA).
- Joint models with clusters (LPCM, MFA, MLTA).
- Beyond binary networks: rank and count data.

References
- Hoff, P. D., Raftery, A. E., and Handcock, M. S. (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460), 1090-1098.
- Salter-Townshend, M., White, A., Gollini, I., and Murphy, T.B. (2012). Review of Statistical Network Analysis: Models, Algorithms and Software. Statistical Analysis and Data Mining, 5(4).
- Gollini, I., and Murphy, T.B. (2013). Joint Modelling of Multiple Network Views. arXiv:1301.3759.