Case Study IV: Bayesian clustering of Alzheimer patients
Mike Wiper and Conchi Ausín
Department of Statistics, Universidad Carlos III de Madrid

Advanced Statistics and Data Mining Summer School, 2nd - 6th July 2018
Objective

We illustrate how to use the EM algorithm, Gibbs sampling and the Variational Bayes approximation for clustering Alzheimer patients. We would like to divide the patients into subgroups according to the symptoms they present.
Alzheimer data

This data set can be downloaded from the BayesLCA R package.

rm(list = ls())
library(BayesLCA)
data("alzheimer")

This data set contains information about the presence or absence of six symptoms displayed by 240 patients diagnosed with early onset Alzheimer's disease, recorded in the Mercer Institute of St. James's Hospital in Dublin.

attach(alzheimer)
par(mfrow = c(2, 3))
plot(hallucination)
plot(activity)
plot(aggression)
plot(agitation)
plot(diurnal)
plot(affective)
par(mfrow = c(1, 1))
Latent class analysis

We wish to obtain K groups of patients according to their symptoms. Thus, for each observation, x = (x_1, ..., x_M), we may assume a K-component mixture of multivariate binary variables with probability distribution:

\Pr(x \mid K, w, \theta) = \sum_{k=1}^{K} w_k \prod_{m=1}^{M} \theta_{km}^{x_m} (1 - \theta_{km})^{1 - x_m}

where w = \{w_k\} and \theta = \{\theta_{km}\}, for k = 1, ..., K and m = 1, ..., M. We assume the following prior distributions:

w \sim \text{Dirichlet}(\delta_1, ..., \delta_K)
\theta_{km} \sim \text{Beta}(\alpha_{km}, \beta_{km})
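The slides work in R with BayesLCA; as a language-neutral illustration of the mixture just defined, the following Python/NumPy sketch draws synthetic symptom vectors from a K-component multivariate Bernoulli mixture. All parameter values and function names here are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values for K = 3 classes and M = 6 symptoms
# (illustration only; not fitted to the Alzheimer data).
w = np.array([0.5, 0.3, 0.2])               # mixture weights w_k, summing to 1
theta = np.array([[0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
                  [0.2, 0.3, 0.1, 0.8, 0.7, 0.6],
                  [0.1, 0.1, 0.2, 0.2, 0.3, 0.3]])  # theta_km

def sample_patients(n, w, theta, rng):
    """Draw n binary symptom vectors x from the K-component mixture."""
    z = rng.choice(len(w), size=n, p=w)     # latent class of each patient
    x = (rng.random((n, theta.shape[1])) < theta[z]).astype(int)
    return x, z

x, z = sample_patients(240, w, theta, rng)
```

Sampling first a class label z_i with probabilities w and then each symptom independently given theta_{z_i} is exactly the two-stage reading of the mixture density above.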
Latent class analysis

Let x = \{x_1, ..., x_N\} be the sample of N = 240 patients. Each observation x_i = (x_{i1}, ..., x_{iM}) is a vector of M = 6 binary variables representing the presence or absence of the m-th symptom. We assume that there are K = 3 groups of patients and that the prior probability of belonging to group k is w_k. We also assume that, within each group, the presence of each symptom follows an independent Bernoulli distribution:

\Pr(x_{im} \mid \theta_{km}) = \theta_{km}^{x_{im}} (1 - \theta_{km})^{1 - x_{im}}, for m = 1, ..., M
Latent class analysis

We may define a set of latent variables, z = \{z_1, ..., z_N\}, indicating the group of each patient. The prior probability that the i-th patient belongs to group k is:

\Pr(z_i = k \mid w, \theta) = w_k

and, given that the patient is in group k,

\Pr(x_i \mid z_i = k, w, \theta) = \prod_{m=1}^{M} \theta_{km}^{x_{im}} (1 - \theta_{km})^{1 - x_{im}}

Then, the complete-data likelihood function is

f(x, z \mid w, \theta) = \prod_{i=1}^{N} \prod_{k=1}^{K} \left[ w_k \prod_{m=1}^{M} \theta_{km}^{x_{im}} (1 - \theta_{km})^{1 - x_{im}} \right]^{I(z_i = k)}
Latent class analysis

Since we are assuming the prior distributions

f(w \mid \delta_1, ..., \delta_K) \propto \prod_{k=1}^{K} w_k^{\delta_k - 1}
f(\theta_{km} \mid \alpha_{km}, \beta_{km}) \propto \theta_{km}^{\alpha_{km} - 1} (1 - \theta_{km})^{\beta_{km} - 1}

we can obtain the log-posterior (up to an additive constant):

\log f(w, \theta \mid x, z) = \sum_{k=1}^{K} \sum_{i=1}^{N} I(z_i = k) \left[ \log w_k + \sum_{m=1}^{M} \left\{ x_{im} \log \theta_{km} + (1 - x_{im}) \log(1 - \theta_{km}) \right\} \right]
  + \sum_{k=1}^{K} (\delta_k - 1) \log w_k + \sum_{k=1}^{K} \sum_{m=1}^{M} \left\{ (\alpha_{km} - 1) \log \theta_{km} + (\beta_{km} - 1) \log(1 - \theta_{km}) \right\}
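The log-posterior just written can be evaluated term by term. A minimal Python/NumPy sketch (not the authors' R code; the function name and argument layout are hypothetical):

```python
import numpy as np

def log_posterior(x, z, w, theta, delta, alpha, beta):
    """log f(w, theta | x, z) up to an additive constant, term by term.

    x: (N, M) binary data; z: (N,) labels in 0..K-1; w: (K,);
    theta: (K, M); delta: (K,); alpha, beta: scalars or (K, M) arrays.
    """
    onehot = np.eye(len(w))[z]          # I(z_i = k) as an (N, K) matrix
    # complete-data log-likelihood: sum over i, k of the bracketed term
    loglik = np.sum(onehot * (np.log(w)
                    + x @ np.log(theta).T + (1 - x) @ np.log(1 - theta).T))
    # log-prior: Dirichlet on w plus independent Betas on each theta_km
    logprior = (np.sum((delta - 1) * np.log(w))
                + np.sum((alpha - 1) * np.log(theta)
                         + (beta - 1) * np.log(1 - theta)))
    return loglik + logprior
```

With a uniform prior (delta = 1, alpha = beta = 1) the prior terms vanish and the function reduces to the complete-data log-likelihood.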
EM algorithm

E Step: Calculate E_{z \mid x, w^{(t)}, \theta^{(t)}} [\log f(w, \theta \mid x, z)], which depends on:

E[I(z_i = k) \mid x_i, w^{(t)}, \theta^{(t)}] = \Pr(z_i = k \mid x_i, w^{(t)}, \theta^{(t)})

M Step: Maximize the previous expectation:

\arg\max_{w, \theta} E_{z \mid x, w^{(t)}, \theta^{(t)}} [\log f(w, \theta \mid x, z)]
EM algorithm

Repeat for t = 0, 1, 2, ... until convergence:

E Step:

z_{ik}^{(t+1)} = \frac{w_k^{(t)} \prod_{m=1}^{M} \Pr(x_{im} \mid \theta_{km}^{(t)})}{\sum_{s=1}^{K} w_s^{(t)} \prod_{m=1}^{M} \Pr(x_{im} \mid \theta_{sm}^{(t)})}

M Step:

w_k^{(t+1)} = \frac{\delta_k + \sum_{i=1}^{N} z_{ik}^{(t+1)} - 1}{\sum_{s=1}^{K} \delta_s + N - K}

\theta_{km}^{(t+1)} = \frac{\alpha_{km} + \sum_{i=1}^{N} z_{ik}^{(t+1)} x_{im} - 1}{\alpha_{km} + \beta_{km} + \sum_{i=1}^{N} z_{ik}^{(t+1)} - 2}
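The E and M updates above have a direct vectorized form. A minimal Python/NumPy sketch of one iteration (the function name is hypothetical; for simplicity alpha and beta are taken as common scalar hyperparameters standing in for alpha_km and beta_km):

```python
import numpy as np

def em_step(x, w, theta, delta, alpha, beta):
    """One EM iteration implementing the MAP updates above.

    x: (N, M) binary data; w: (K,); theta: (K, M); delta: (K,) array;
    alpha, beta: scalar hyperparameters (assumed common across k, m).
    """
    N, M = x.shape
    K = len(w)
    # E step: responsibilities z_ik, computed stably on the log scale
    logp = np.log(w) + x @ np.log(theta).T + (1 - x) @ np.log(1 - theta).T
    logp -= logp.max(axis=1, keepdims=True)
    zik = np.exp(logp)
    zik /= zik.sum(axis=1, keepdims=True)
    # M step: closed-form MAP maximizers of the expected log-posterior
    nk = zik.sum(axis=0)                       # expected class sizes
    w_new = (delta + nk - 1) / (delta.sum() + N - K)
    theta_new = (alpha + zik.T @ x - 1) / (alpha + beta + nk[:, None] - 2)
    return w_new, theta_new, zik
```

Iterating em_step until the log-posterior stabilizes gives the MAP estimates that blca.em reports.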
EM algorithm for Alzheimer data

We apply the EM algorithm to our Alzheimer data, assuming three groups of patients.

fit.em = blca(alzheimer, 3, method = "em")

An important difficulty with the EM algorithm is that it may converge to a local maximum or a saddle point. Thus, the algorithm is run from a number of different starting values (5, by default), and the parameter estimates from the run that achieved the highest log-posterior are returned. From only five starts, the algorithm obtains three distinct local maxima of the log-posterior, so it seems sensible to run the algorithm more times.

fit.em = blca.em(alzheimer, 3, restarts = 20)

The algorithm provides MAP estimates of the model parameters:

print(fit.em)
EM algorithm for Alzheimer data

Obtain complete information on the prior specification, EM performance, log-posterior, AIC and BIC:

summary(fit.em)

Note that AIC and BIC can be used to select the number of patient groups. The MAP estimates of the class probabilities are:

fit.em$classprob

and the MAP estimates of the item probabilities, conditional on class membership, are:

fit.em$itemprob

These estimates can be visualized with the following plot:

par(mfrow = c(1, 1))
plot(fit.em)
EM algorithm for Alzheimer data

We may try different prior assumptions:

fit.em = blca.em(alzheimer, 3, restarts = 20, alpha = 2, beta = 2)
print(fit.em)
plot(fit.em)

fit.em = blca.em(alzheimer, 3, restarts = 20, alpha = 0.001, beta = 0.001)
print(fit.em)
plot(fit.em)

We may wish to approximate the whole posterior distribution of the model parameters rather than obtain only their MAP values. One possibility is to use Gibbs sampling.
Gibbs sampling

Gibbs sampling is a particular MCMC method for the case where the conditional posterior distributions are known. In order to obtain a sample from the joint distribution, f(w, \theta, z \mid x), we sample iteratively from the conditional posterior distributions:

1. Sample \theta^{(t+1)} \sim f(\theta \mid x, w^{(t)}, z^{(t)})
2. Sample w^{(t+1)} \sim f(w \mid x, \theta^{(t+1)}, z^{(t)})
3. Sample z^{(t+1)} \sim \Pr(z \mid x, \theta^{(t+1)}, w^{(t+1)})
Gibbs sampling

The conditional posterior distributions are given by:

1. For m = 1, ..., M and k = 1, ..., K,

f(\theta_{km} \mid x, w, z) \propto \theta_{km}^{\sum_{i=1}^{N} I(z_i = k) x_{im} + \alpha_{km} - 1} (1 - \theta_{km})^{\sum_{i=1}^{N} I(z_i = k)(1 - x_{im}) + \beta_{km} - 1}

which is a Beta distribution.

2. And

f(w \mid x, \theta, z) \propto \prod_{k=1}^{K} w_k^{\sum_{i=1}^{N} I(z_i = k) + \delta_k - 1}

which is a Dirichlet distribution.

3. Finally, for i = 1, ..., N,

\Pr(z_i = k \mid x_i, w, \theta) = \frac{w_k \prod_{m=1}^{M} \Pr(x_{im} \mid \theta_{km})}{\sum_{s=1}^{K} w_s \prod_{m=1}^{M} \Pr(x_{im} \mid \theta_{sm})}
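The three conditional draws above make up one sweep of the sampler. A minimal Python/NumPy sketch, not BayesLCA's implementation (function name hypothetical; alpha and beta taken as common scalars standing in for alpha_km, beta_km):

```python
import numpy as np

def gibbs_sweep(x, z, delta, alpha, beta, rng):
    """One sweep of the Gibbs sampler: theta | x,z, then w | z, then z | x,w,theta."""
    N, M = x.shape
    K = len(delta)
    onehot = np.eye(K, dtype=int)[z]        # I(z_i = k), shape (N, K)
    nk = onehot.sum(axis=0)                 # class counts
    s = onehot.T @ x                        # per-class symptom counts, (K, M)
    # 1. theta_km | x, z ~ Beta(alpha + s_km, beta + n_k - s_km)
    theta = rng.beta(alpha + s, beta + nk[:, None] - s)
    # 2. w | z ~ Dirichlet(delta + n_k)
    w = rng.dirichlet(delta + nk)
    # 3. z_i | x_i, w, theta: categorical with the probabilities on this slide
    logp = np.log(w) + x @ np.log(theta).T + (1 - x) @ np.log(1 - theta).T
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    z_new = np.array([rng.choice(K, p=pi) for pi in p])
    return w, theta, z_new
```

Repeating the sweep, discarding a burn-in and thinning the remainder yields the posterior sample that blca(..., method = "gibbs") summarizes.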
Gibbs sampling algorithm for Alzheimer data

We now apply the Gibbs sampling algorithm to the Alzheimer data, initially using three groups:

out = blca(alzheimer, 3, method = "gibbs")
print(out)
plot(out)

We may also inspect the plots of density estimates for the model parameters. For the item probabilities, conditional on class membership:

par(mfrow = c(3, 2))
plot(out, which = 3)

And for the class probabilities:

par(mfrow = c(1, 1))
plot(out, which = 4)
Prior sensitivity

We may try different prior assumptions:

out.prior2 = blca(alzheimer, 3, method = "gibbs", alpha = 2, beta = 2)
print(out.prior2)
plot(out.prior2)

out.prior3 = blca(alzheimer, 3, method = "gibbs", alpha = 0.001, beta = 0.001)
print(out.prior3)
plot(out.prior3)
Model selection

We may also try different values for the number of patient groups:

out.size1 = blca(alzheimer, 1, method = "gibbs")
print(out.size1)
plot(out.size1)

out.size2 = blca(alzheimer, 2, method = "gibbs")
print(out.size2)
plot(out.size2)

We can use the DIC criterion (which will be studied in Chapter 5) to select the mixture size:

-out.size1$dic
-out.size2$dic
-out$dic
Gibbs sampling algorithm for Alzheimer data

In all cases, we have run the Gibbs sampler with its default settings: a burn-in of 100 iterations and a thinning rate of 1. A convergence diagnosis should always be carried out.

par(mfrow = c(4, 2))
plot(out, which = 5)

We may observe that the MCMC performance is not very good: the chain appears to have converged, but it does not mix well. This can also be checked with convergence diagnostic methods such as raftery.diag, available in the coda package, which is loaded automatically with BayesLCA.

raftery.diag(as.mcmc(out))
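The distinction between convergence and mixing can be made concrete with the lag-1 autocorrelation of a trace: a converged but sticky chain has autocorrelation near 1, while a well-mixing chain is near 0. A small Python/NumPy sketch on simulated traces (a simpler stand-in for raftery.diag's dependence factor, not the coda implementation):

```python
import numpy as np

def lag1_autocorr(chain):
    """Lag-1 autocorrelation of a one-dimensional trace.

    Values near 0 indicate good mixing; values near 1 indicate a sticky
    chain that needs more iterations and/or thinning.
    """
    c = np.asarray(chain, dtype=float) - np.mean(chain)
    return float(c[:-1] @ c[1:] / (c @ c))

rng = np.random.default_rng(3)
well_mixed = rng.normal(size=5000)                 # i.i.d. draws: ideal mixing
sticky = np.cumsum(rng.normal(size=5000)) / 70.0   # random walk: poor mixing
```

High autocorrelation inflates the "dependence factor" that raftery.diag reports, which is why the next slide increases the number of iterations and the thinning.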
Gibbs sampling algorithm for Alzheimer data

The output of the convergence diagnostic suggests that the sampler converges quickly (the burn-in values are low) but is not mixing satisfactorily (note the high dependence factor of many parameters). A Gibbs sampler with better tuned settings can then be run:

out2 = blca(alzheimer, 3, method = "gibbs", burn.in = 150, thin = 1/10, iter = 50000)
plot(out2, which = 5)
Gibbs sampling algorithm for Alzheimer data

One point worth mentioning is that the blca.gibbs function includes, by default, a relabelling method to mitigate the label switching problem. This is a well-known problem in mixture models that arises from the lack of identifiability of the component labels. Without relabelling, we can observe the label switching problem in the trace plots:

fit.gs = blca(alzheimer, 3, method = "gibbs", relabel = FALSE)
plot(fit.gs, which = 5)
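To illustrate what relabelling does, here is a deliberately crude Python/NumPy sketch that imposes an identifiability constraint by sorting the components of each posterior draw by descending weight. This is only one simple device, hypothetical and not the (more sophisticated) relabelling algorithm that BayesLCA uses:

```python
import numpy as np

def relabel_by_weight(w_draws, theta_draws):
    """Relabel each posterior draw by sorting components by descending weight.

    w_draws: (T, K) sampled weights; theta_draws: (T, K, M) sampled item
    probabilities. The same permutation is applied to both arrays so that
    component k means the same thing in every retained draw.
    """
    order = np.argsort(-w_draws, axis=1)                      # per-draw permutation
    w_out = np.take_along_axis(w_draws, order, axis=1)
    theta_out = np.take_along_axis(theta_draws, order[:, :, None], axis=1)
    return w_out, theta_out
```

Weight-ordering fails when two components have similar weights, which is one reason production relabelling methods match draws to a reference labelling instead.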
Variational Bayes

The idea is to approximate the posterior distribution f(w, \theta, z \mid x) with a variational distribution q(w, \theta, z) that assumes independence among blocks of parameters:

q(w, \theta, z) = q_1(w \mid \gamma) q_2(\theta \mid \zeta) q_3(z \mid \phi)

where (\gamma, \zeta, \phi) are the variational parameters. The VB approach looks for the distributions q_j that minimize the Kullback-Leibler divergence between the variational approximation and the posterior. In mixture models, it can be shown that the form of each q_j is the same as that of the corresponding conditional posterior distribution. Then,

w \mid \gamma \sim \text{Dirichlet}(\gamma_1, ..., \gamma_K)
\theta_{km} \mid \zeta \sim \text{Beta}(\zeta_{km1}, \zeta_{km2})
z_i \mid \phi \sim \text{Multinomial}(\phi_{i1}, ..., \phi_{iK})

The variational parameters are updated iteratively until the KL divergence is minimized.
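For this conjugate model the iterative updates have a standard coordinate-ascent (CAVI) form: gamma and zeta collect responsibility-weighted counts, and phi is refreshed from the expected log-parameters. A hedged Python/NumPy sketch of one update cycle (function names hypothetical, not BayesLCA's code; a hand-rolled digamma is included to stay dependency-free, and alpha, beta are common scalars):

```python
import numpy as np

def _digamma(x):
    """Digamma via recurrence plus an asymptotic series (assumes x > 0)."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + np.log(x) - 0.5 / x - f * (1/12 - f * (1/120 - f / 252))

digamma = np.vectorize(_digamma)

def vb_step(x, phi, delta, alpha, beta):
    """One coordinate-ascent update of (gamma, zeta, phi).

    q1(w) = Dirichlet(gamma), q2(theta_km) = Beta(zeta_km1, zeta_km2),
    q3(z_i) = Multinomial(phi_i1, ..., phi_iK).
    """
    gamma = delta + phi.sum(axis=0)                      # update q1
    zeta1 = alpha + phi.T @ x                            # update q2
    zeta2 = beta + phi.T @ (1 - x)
    # update q3 using expected log-parameters under q1 and q2
    Elogw = digamma(gamma) - digamma(gamma.sum())
    Elogt = digamma(zeta1) - digamma(zeta1 + zeta2)
    Elog1t = digamma(zeta2) - digamma(zeta1 + zeta2)
    logphi = Elogw + x @ Elogt.T + (1 - x) @ Elog1t.T
    phi_new = np.exp(logphi - logphi.max(axis=1, keepdims=True))
    phi_new /= phi_new.sum(axis=1, keepdims=True)
    return gamma, zeta1, zeta2, phi_new
```

Each cycle is guaranteed not to decrease the evidence lower bound, so iterating vb_step until the bound stabilizes is the whole algorithm.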
Variational Bayes

We now apply the VB algorithm to the Alzheimer data:

fit.vb = blca(alzheimer, 3, method = "vb")
print(fit.vb)

Observe that the Variational Bayes method is much faster than Gibbs sampling, and VB also provides posterior standard deviation estimates.

fit.vb$itemprob
fit.vb$classprob
fit.vb$itemprob.sd
fit.vb$classprob.sd

The MAP estimates are close to those obtained with the Gibbs sampling algorithm:

fit.gs$itemprob
fit.vb$itemprob
fit.gs$classprob
fit.vb$classprob
Variational Bayes

However, Gibbs sampling provides a better approximation of the posterior distributions. Observe that its posterior standard deviation estimates are larger than those obtained with the VB method:

fit.gs$itemprob.sd
fit.gs$classprob.sd
fit.vb$itemprob.sd
fit.vb$classprob.sd
Variational Bayes

We may also observe these differences in the plots of density estimates for the model parameters. For the item probabilities, conditional on class membership:

par(mfrow = c(3, 2))
plot(fit.gs, which = 3)
plot(fit.vb, which = 3)

And for the class probabilities:

par(mfrow = c(1, 1))
plot(fit.gs, which = 4)
plot(fit.vb, which = 4)
Variational Bayes

One method for determining an appropriate number of classes to fit to the Alzheimer data is to deliberately over-fit the model and then consider only the classes whose posterior mean weight w_k is non-negligible.

fit.vb = blca(alzheimer, 10, method = "vb")
fit.vb$classprob

This suggests that a 2-class fit is best suited to the variational Bayes approximation.

plot(fit.vb, which = 5)

The multiple jumps in the lower bound indicate where components have emptied out.
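The pruning step amounts to counting the components whose estimated weight survives above a small tolerance. A tiny Python sketch with made-up class probabilities (illustrative only, not output from the Alzheimer fit):

```python
import numpy as np

# Hypothetical posterior mean class probabilities from an over-fitted
# K = 10 VB run: unsupported components shrink to (near) zero weight.
classprob = np.array([0.56, 0.44, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

def effective_classes(classprob, tol=1e-3):
    """Number of components whose weight exceeds a small tolerance."""
    return int(np.sum(np.asarray(classprob) > tol))
```

With these illustrative values, effective_classes reports two supported components, mirroring the 2-class conclusion on this slide.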
Summary

We have implemented a Bayesian approach for clustering Alzheimer patients according to their symptoms. A mixture of K multivariate binary distributions has been considered for modelling the observed data. We have implemented three different computational Bayesian methods to estimate finite mixture models: EM, MCMC and VB. The VB approximation provides a very fast procedure to estimate the posterior distribution of the model parameters. However, it is well known that the independence between parameters enforced by VB approximations results in diminished variance estimates. A better (although usually more time-consuming) approximation is provided by MCMC methods and, in particular, by Gibbs sampling. Standard MCMC methods can be extremely time consuming for big data sets.
Mesh segmentation Florent Lafarge Inria Sophia Antipolis - Mediterranee Outline What is mesh segmentation? M = {V,E,F} is a mesh S is either V, E or F (usually F) A Segmentation is a set of sub-meshes
More informationSummary: A Tutorial on Learning With Bayesian Networks
Summary: A Tutorial on Learning With Bayesian Networks Markus Kalisch May 5, 2006 We primarily summarize [4]. When we think that it is appropriate, we comment on additional facts and more recent developments.
More informationTree of Latent Mixtures for Bayesian Modelling and Classification of High Dimensional Data
Technical Report No. 2005-06, Department of Computer Science and Engineering, University at Buffalo, SUNY Tree of Latent Mixtures for Bayesian Modelling and Classification of High Dimensional Data Hagai
More informationPackage beast. March 16, 2018
Type Package Package beast March 16, 2018 Title Bayesian Estimation of Change-Points in the Slope of Multivariate Time-Series Version 1.1 Date 2018-03-16 Author Maintainer Assume that
More informationStatistical Matching using Fractional Imputation
Statistical Matching using Fractional Imputation Jae-Kwang Kim 1 Iowa State University 1 Joint work with Emily Berg and Taesung Park 1 Introduction 2 Classical Approaches 3 Proposed method 4 Application:
More informationPackage binomlogit. February 19, 2015
Type Package Title Efficient MCMC for Binomial Logit Models Version 1.2 Date 2014-03-12 Author Agnes Fussl Maintainer Agnes Fussl Package binomlogit February 19, 2015 Description The R package
More informationScalable Multidimensional Hierarchical Bayesian Modeling on Spark
Scalable Multidimensional Hierarchical Bayesian Modeling on Spark Robert Ormandi, Hongxia Yang and Quan Lu Yahoo! Sunnyvale, CA 2015 Click-Through-Rate (CTR) Prediction Estimating the probability of click
More informationParameterization Issues and Diagnostics in MCMC
Parameterization Issues and Diagnostics in MCMC Gill Chapter 10 & 12 November 10, 2008 Convergence to Posterior Distribution Theory tells us that if we run the Gibbs sampler long enough the samples we
More informationAppendix A: An Alternative Estimation Procedure Dual Penalized Expansion
Supplemental Materials for Functional Linear Models for Zero-Inflated Count Data with Application to Modeling Hospitalizations in Patients on Dialysis by Şentürk, D., Dalrymple, L. S. and Nguyen, D. V.
More informationJournal of Statistical Software
JSS Journal of Statistical Software December 2007, Volume 23, Issue 9. http://www.jstatsoft.org/ WinBUGSio: A SAS Macro for the Remote Execution of WinBUGS Michael K. Smith Pfizer Global Research and Development
More informationCluster Analysis. Jia Li Department of Statistics Penn State University. Summer School in Statistics for Astronomers IV June 9-14, 2008
Cluster Analysis Jia Li Department of Statistics Penn State University Summer School in Statistics for Astronomers IV June 9-1, 8 1 Clustering A basic tool in data mining/pattern recognition: Divide a
More informationINTRO TO THE METROPOLIS ALGORITHM
INTRO TO THE METROPOLIS ALGORITHM A famous reliability experiment was performed where n = 23 ball bearings were tested and the number of revolutions were recorded. The observations in ballbearing2.dat
More informationA Short History of Markov Chain Monte Carlo
A Short History of Markov Chain Monte Carlo Christian Robert and George Casella 2010 Introduction Lack of computing machinery, or background on Markov chains, or hesitation to trust in the practicality
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University April 1, 2019 Today: Inference in graphical models Learning graphical models Readings: Bishop chapter 8 Bayesian
More informationBayesian Mixture Labelling by Highest Posterior Density
Bayesian Mixture Labelling by Highest Posterior Density Weixin Yao and Bruce G. Lindsay Abstract A fundamental problem for Bayesian mixture model analysis is label switching, which occurs due to the non-identifiability
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Lecture 17 EM CS/CNS/EE 155 Andreas Krause Announcements Project poster session on Thursday Dec 3, 4-6pm in Annenberg 2 nd floor atrium! Easels, poster boards and cookies
More informationPerformance of Sequential Imputation Method in Multilevel Applications
Section on Survey Research Methods JSM 9 Performance of Sequential Imputation Method in Multilevel Applications Enxu Zhao, Recai M. Yucel New York State Department of Health, 8 N. Pearl St., Albany, NY
More informationCS Introduction to Data Mining Instructor: Abdullah Mueen
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 8: ADVANCED CLUSTERING (FUZZY AND CO -CLUSTERING) Review: Basic Cluster Analysis Methods (Chap. 10) Cluster Analysis: Basic Concepts
More informationPackage TBSSurvival. January 5, 2017
Version 1.3 Date 2017-01-05 Package TBSSurvival January 5, 2017 Title Survival Analysis using a Transform-Both-Sides Model Author Adriano Polpo , Cassio de Campos , D.
More informationDiscussion on Bayesian Model Selection and Parameter Estimation in Extragalactic Astronomy by Martin Weinberg
Discussion on Bayesian Model Selection and Parameter Estimation in Extragalactic Astronomy by Martin Weinberg Phil Gregory Physics and Astronomy Univ. of British Columbia Introduction Martin Weinberg reported
More informationSupplementary Material sppmix: Poisson point process modeling using normal mixture models
Supplementary Material sppmix: Poisson point process modeling using normal mixture models Athanasios C. Micheas and Jiaxun Chen Department of Statistics University of Missouri April 19, 2018 1 The sppmix
More informationPackage sparsereg. R topics documented: March 10, Type Package
Type Package Package sparsereg March 10, 2016 Title Sparse Bayesian Models for Regression, Subgroup Analysis, and Panel Data Version 1.2 Date 2016-03-01 Author Marc Ratkovic and Dustin Tingley Maintainer
More informationAssessing the Quality of the Natural Cubic Spline Approximation
Assessing the Quality of the Natural Cubic Spline Approximation AHMET SEZER ANADOLU UNIVERSITY Department of Statisticss Yunus Emre Kampusu Eskisehir TURKEY ahsst12@yahoo.com Abstract: In large samples,
More informationPackage mcemglm. November 29, 2015
Type Package Package mcemglm November 29, 2015 Title Maximum Likelihood Estimation for Generalized Linear Mixed Models Version 1.1 Date 2015-11-28 Author Felipe Acosta Archila Maintainer Maximum likelihood
More informationLDA for Big Data - Outline
LDA FOR BIG DATA 1 LDA for Big Data - Outline Quick review of LDA model clustering words-in-context Parallel LDA ~= IPM Fast sampling tricks for LDA Sparsified sampler Alias table Fenwick trees LDA for
More informationAn imputation approach for analyzing mixed-mode surveys
An imputation approach for analyzing mixed-mode surveys Jae-kwang Kim 1 Iowa State University June 4, 2013 1 Joint work with S. Park and S. Kim Ouline Introduction Proposed Methodology Application to Private
More informationCLUSTERING. JELENA JOVANOVIĆ Web:
CLUSTERING JELENA JOVANOVIĆ Email: jeljov@gmail.com Web: http://jelenajovanovic.net OUTLINE What is clustering? Application domains K-Means clustering Understanding it through an example The K-Means algorithm
More informationInfectious Disease Models. Angie Dobbs. A Thesis. Presented to. The University of Guelph. In partial fulfilment of requirements.
Issues of Computational Efficiency and Model Approximation for Individual-Level Infectious Disease Models by Angie Dobbs A Thesis Presented to The University of Guelph In partial fulfilment of requirements
More informationMultiplicative Mixture Models for Overlapping Clustering
Multiplicative Mixture Models for Overlapping Clustering Qiang Fu Dept of Computer Science & Engineering University of Minnesota, Twin Cities qifu@cs.umn.edu Arindam Banerjee Dept of Computer Science &
More informationGraphical Models, Bayesian Method, Sampling, and Variational Inference
Graphical Models, Bayesian Method, Sampling, and Variational Inference With Application in Function MRI Analysis and Other Imaging Problems Wei Liu Scientific Computing and Imaging Institute University
More informationMachine Learning. Unsupervised Learning. Manfred Huber
Machine Learning Unsupervised Learning Manfred Huber 2015 1 Unsupervised Learning In supervised learning the training data provides desired target output for learning In unsupervised learning the training
More information