Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process
1 Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process
Changyou Chen, Lan Du, Wray Buntine
ANU College of Engineering and Computer Science, The Australian National University
National ICT Australia
September 7
2 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
3 The Poisson-Dirichlet Process
What can We Do with the Poisson-Dirichlet Process?
Applications of the Poisson-Dirichlet process:
- Topic modeling: finding meaningful topics discussed in large-scale document collections; beneficial to automatic document analysis and understanding.
- Computational linguistics: e.g., the n-gram model, adaptor grammars.
- Computer vision: using the PDP/HPDP for image annotation, image segmentation, scene learning, etc.
- Others: data compression, relational modeling, etc.
[Image with annotations: cows, grass, hill, sky, tree...]
4 The Poisson-Dirichlet Process
What is the Poisson-Dirichlet Process?
The Poisson-Dirichlet process takes as input a base distribution and yields as output a discrete distribution that is somewhat similar to the base distribution:
input $y = f(x)$, output $y = \sum_{i=1}^{K} p_i \, \delta_{x_i}$
5 The Poisson-Dirichlet Process
What is the Poisson-Dirichlet Process?
The Poisson-Dirichlet process (PDP) is a random probability measure, or a distribution over distributions. The basic form of the PDP is
$$\sum_{k=1}^{\infty} p_k \, \delta_{X_k}(\cdot)$$
where $p = (p_1, p_2, \ldots)$ is a probability vector, so $p_k \ge 0$ and $\sum_{k=1}^{\infty} p_k = 1$. Also, $\delta_{X_k}(\cdot)$ is a discrete measure concentrated at $X_k$. The values $X_k \in \mathcal{X}$ are drawn i.i.d. from the base measure $H(\cdot)$. The one-parameter version of the PDP is the Dirichlet process (DP), and the two-parameter version is the Pitman-Yor process.
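The basic form above can be sampled directly by stick-breaking. A minimal sketch, assuming the standard two-parameter (Pitman-Yor) stick-breaking construction with discount a and strength b, truncated at K atoms; the function and parameter names are illustrative, not from the talk:

```python
import random

def sample_pdp(a, b, base_sampler, K=1000, rng=None):
    """Truncated stick-breaking approximation of one PDP draw.

    a: discount (0 <= a < 1), b: strength/concentration (b > -a).
    base_sampler: draws one atom X_k from the base measure H.
    Returns (weight, atom) pairs approximating sum_k p_k delta_{X_k}.
    """
    rng = rng or random.Random(0)
    weights, remaining = [], 1.0
    for k in range(1, K + 1):
        # V_k ~ Beta(1 - a, b + k * a); p_k = V_k * prod_{i<k} (1 - V_i)
        v = rng.betavariate(1.0 - a, b + k * a)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return [(w, base_sampler()) for w in weights]

draws = sample_pdp(a=0.5, b=1.0, base_sampler=lambda: random.gauss(0, 1))
total = sum(w for w, _ in draws)  # total mass of the truncation, near 1
```

The truncation level K controls how much of the unit mass the finite approximation captures; the leftover mass `1 - total` belongs to the infinitely many remaining atoms.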
6 The Poisson-Dirichlet Process
Why Poisson-Dirichlet Processes?
It allows us to extend finite mixture models to infinite mixture models, by putting a PDP prior on the mixture components:
$\gamma \sim P(\gamma \mid \lambda)$, $c \sim \gamma$, $x \sim P(x \mid \theta_c)$; equivalently $\theta \sim G$, $x \sim P(x \mid \theta)$, repeated for each of the $N$ data points.
$P(\gamma \mid \lambda)$ is an infinite discrete distribution with parameter $\lambda$, or a Poisson-Dirichlet distribution. $G$ is the distribution over mixture components $\theta$, or a Poisson-Dirichlet process.
7 The Poisson-Dirichlet Process
Why Poisson-Dirichlet Processes?
It easily extends to model hierarchical discrete distributions, e.g., hierarchical Dirichlet processes (HDP) or the hierarchical Poisson-Dirichlet process (HPDP).
[Figure: (a) the HDP graphical model; (b) the corresponding probability vector hierarchy]
8 The Poisson-Dirichlet Process
The Chinese Restaurant Process
[Figure: (a) the PDP, (b) the PDP marginalized, (c) the CRP]
The Chinese restaurant process (CRP) is the marginalized version of the PDP. Sampling the PDP can be done following the CRP's metaphor.
9 The Poisson-Dirichlet Process
The Chinese Restaurant Process
The CRP assumes a Chinese restaurant with an infinite number of tables, each with infinite capacity. Customers come in and take seats as follows:
- The first customer sits at the first unoccupied table with probability 1.
- Each subsequent (n + 1)-th customer either:
  - sits at an occupied table, sharing the dish with the customers already seated there, with probability proportional to the number of customers that have already sat at that table, or
  - sits at an unoccupied table with probability proportional to the sum of the strength parameter and the total table count.
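The seating rules above can be simulated in a few lines. A minimal sketch, assuming the standard two-parameter CRP probabilities (an occupied table with n_t customers is chosen with weight n_t - a, a new table with weight b + a*T, for discount a and strength b); the function name is illustrative:

```python
import random

def crp_seating(num_customers, a, b, rng=None):
    """Simulate the two-parameter Chinese restaurant process.

    Returns the list of per-table occupancy counts after seating
    num_customers customers one by one.
    """
    rng = rng or random.Random(42)
    tables = []  # tables[t] = number of customers seated at table t
    for n in range(num_customers):
        # weight n_t - a for each occupied table, b + a*T for a new one
        weights = [n_t - a for n_t in tables] + [b + a * len(tables)]
        r = rng.random() * sum(weights)
        for t, w in enumerate(weights):
            r -= w
            if r < 0:
                break
        if t == len(tables):
            tables.append(1)  # open a new table (always happens for n == 0)
        else:
            tables[t] += 1
    return tables

counts = crp_seating(100, a=0.5, b=1.0)
assert sum(counts) == 100  # every customer is seated exactly once
```

With a positive discount a, the new-table weight grows with the table count T, which is what gives the PDP its power-law behaviour compared to the DP (a = 0).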
10 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
11 HPDP in the Chinese Restaurant Process Metaphor
This shows a simple linear hierarchy of 3 Chinese restaurants. We'll bring customers in and watch the seating.
z: dish; u: table indicator
12-28 HPDP in the Chinese Restaurant Process Metaphor
[Animation frames: customers arrive one by one in the three-restaurant hierarchy; each frame shows the sampled dish z, the customer's table indicator u, and the per-restaurant table counts t. A customer who joins an existing table without opening a new one is marked u = NA.]
29 HPDP in the Chinese Restaurant Process Metaphor
We can also add customers at any restaurant.
[Animation frame: table counts t after customers are added at interior restaurants.]
30 HPDP Statistics used in Sampling
[Figure: the same seating written in two ways. Seating arrangements representation: for each restaurant, the per-table customer counts m (m = counts at each table) and the table identities. Table indicators representation: for each restaurant, the per-dish customer counts n_{jk} and table counts t_{jk} (t = number of tables), together with each customer's dish z and table indicator u.]
31 Table Indicator Representation of the HPDP
The above u is called the table indicator, defined as:
Definition (Table indicator u_l): the table indicator u_l for each customer l is an auxiliary latent variable which indicates up to which level in the restaurant hierarchy l has contributed a table count (i.e., activated a new table).
This variable is enough to represent the HPDP: the other statistics (#customers and #tables in each restaurant) can easily be reconstructed from the table indicators. A blocked Gibbs sampler can be derived using this representation.
32 Table Counts Representation of the HPDP
Theorem (Teh, 2006 [1]; Buntine and Hutter, 2010 [2]): posterior of the HPDP with the table count representation:
$$P(z_{1:J}, t_{1:J} \mid H) = \prod_j \frac{(b_j \mid a_j)_{T_j}}{(b_j)_{N_j}} \prod_k S^{n_{jk}}_{t_{jk},\, a_j}$$
- $n_{jk}$: #customers in the j-th restaurant eating dish k
- $t_{jk}$: #tables in the j-th restaurant serving dish k
- $N_j := \sum_k n_{jk}$, $T_j := \sum_k t_{jk}$
- $S^{N}_{M,a}$: the generalized Stirling number
- $(x \mid y)_N$: the Pochhammer symbol with increment y
- $H$: base distribution
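The generalized Stirling numbers $S^{N}_{M,a}$ in the posterior can be tabulated with the usual recursion $S^{n}_{m,a} = S^{n-1}_{m-1,a} + (n - 1 - m a)\,S^{n-1}_{m,a}$, $S^{0}_{0,a} = 1$. A sketch (plain floats for clarity; these numbers grow quickly, so practical samplers keep the table in log space):

```python
def generalized_stirling(N, a):
    """Table S[n][m] of generalized Stirling numbers S^n_{m,a}
    for 0 <= m <= n <= N, built bottom-up from the recursion
    S[n][m] = S[n-1][m-1] + (n - 1 - m*a) * S[n-1][m]."""
    S = [[0.0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = 1.0
    for n in range(1, N + 1):
        for m in range(1, n + 1):
            S[n][m] = S[n - 1][m - 1] + (n - 1 - m * a) * S[n - 1][m]
    return S

S = generalized_stirling(5, 0.0)
# With a = 0 these reduce to the unsigned Stirling numbers of the
# first kind, e.g. S[4][2] == 11.
```

As a sanity check, $S^{n}_{1,a} = \prod_{i=1}^{n-1}(i - a)$, the number of ways (up to the discount) of seating n customers at a single table.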
33 Table Indicator Representation of the HPDP
Theorem: posterior of the HPDP in our representation:
$$P(z_{1:J}, u_{1:J} \mid H) = \prod_j \frac{(b_j \mid a_j)_{T_j}}{(b_j)_{N_j}} \prod_k \frac{t_{jk}!\,(n_{jk} - t_{jk})!}{n_{jk}!}\, S^{n_{jk}}_{t_{jk},\, a_j}$$
The symbols are the same as before, except that $t_{jk}$ can now be constructed from the table indicators:
$$t_{jk} = \sum_{j' \in T(j)} \sum_{l \in D(j')} \delta_{z_l = k}\, \delta_{u_l \le d(j)}$$
where $T(j)$ is the subtree rooted at restaurant j, $D(j')$ the customers of restaurant j', and $d(j)$ the level of j in the hierarchy.
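The reconstruction of $t_{jk}$ can be coded directly from this formula. A sketch for a linear three-restaurant hierarchy like the one in the earlier slides (the data layout and names are illustrative; a customer who opened no table, shown as u = NA on the slides, is encoded here as u = math.inf so it never passes the $u_l \le d(j)$ test):

```python
import math
from collections import defaultdict

# illustrative linear hierarchy: restaurant 0 (root, level 1) -> 1 -> 2
parent = {0: None, 1: 0, 2: 1}
depth = {0: 1, 1: 2, 2: 3}

def table_counts(customers):
    """t_{jk} = number of customers l in the subtree of restaurant j
    with dish z_l = k and u_l <= d(j), i.e. customers whose table
    contributions reach j's level or above.
    Each customer is a triple (restaurant, dish z, indicator u)."""
    t = defaultdict(int)
    for j_prime, z, u in customers:
        # walking up the ancestors of j' enumerates exactly the
        # restaurants j with j' in T(j)
        j = j_prime
        while j is not None:
            if u <= depth[j]:
                t[(j, z)] += 1
            j = parent[j]
    return dict(t)

# one leaf customer opened tables all the way up to the root (u = 1),
# one only at the leaf (u = 3), one opened no table at all (u = NA)
counts = table_counts([(2, "a", 1), (2, "a", 3), (2, "a", math.inf)])
# counts == {(2, "a"): 2, (1, "a"): 1, (0, "a"): 1}
```

This is why the representation needs no separately stored table counts: a single integer per customer recovers every $t_{jk}$ in the hierarchy.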
34 How does our Sampler Work?
Sampling a DP with dim = 5, data = 5 points:
[Figure, comparing sampling with seating arrangements, the table indicator sampler, and the collapsed table sampler over time (ms): (a) estimate of total table counts, (b) std. dev. of total table counts, (c) estimate of the concentration parameter b, (d) std. dev. of the concentration parameter b.]
35 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
36 Experiments: Topic Modeling using the HPDP
Datasets, Compared Algorithms and Evaluations
We use five datasets (from Blogs, Reuters, NIPS, UCI): Health, Person, Obama, NIPS, and Enron.
[Table: # words, # documents, and vocabulary size for each of the five datasets.]
Compared algorithms:
- Teh et al.'s sampling by direct assignment (SDA)
- Buntine and Hutter's collapsed table sampler (CTS) [2]
- Our proposed blocked Gibbs sampler (STC)
- STC initialized with SDA, denoted SDA+STC
We used the unbiased left-to-right algorithm [3] to calculate the testing perplexities for the topic models, which is a standard evaluation.
37 Experiments: Topic Modeling using the HPDP
Perplexities
[Table: test log(perplexities) on the five datasets (Health, Person, Obama, Enron, NIPS) for SDA, CTS, SDA+STC and STC, at two settings of I per dataset.]
38 Experiments: Topic Modeling using the HPDP
Convergence Speed
[Figure: testing perplexity versus training time (s) for SDA, STC, CTS and SDA+STC: (a), (b) Health, (c), (d) Person, (e), (f) NIPS, each at two settings of I.]
39 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
40 Conclusion
- Proposed a new representation for the HPDP based on the CRP metaphor.
- All useful statistics of the CRP can be reconstructed from the table indicators.
- No dynamic memory allocation is needed for table counts.
- A blocked Gibbs sampler can easily be derived; e.g., we do not have to sample the table counts separately.
- Experimental results on topic modeling indicate fast mixing of the proposed algorithm.
- All other PDP-related applications can be adapted to this representation.
41 References
[1] Teh, Y.W.: A Bayesian interpretation of interpolated Kneser-Ney. Technical Report TRA2/06, School of Computing, National University of Singapore (2006)
[2] Buntine, W., Hutter, M.: A Bayesian review of the Poisson-Dirichlet process. Technical Report arXiv:1007.0296, NICTA and ANU, Australia (2010)
[3] Buntine, W.: Estimating likelihoods for topic models. In: ACML 2009 (2009)
42 Thanks for your attention!!!
More informationThis chapter explains two techniques which are frequently used throughout
Chapter 2 Basic Techniques This chapter explains two techniques which are frequently used throughout this thesis. First, we will introduce the concept of particle filters. A particle filter is a recursive
More informationNote Set 4: Finite Mixture Models and the EM Algorithm
Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for
More informationNew Results of Fully Bayesian
of Fully Bayesian UCI February 7, 2012 of Fully Bayesian Calibration Samples Principle Component Analysis Model Building Three source parameter sampling schemes Simulation Quasar data sets New data sets
More informationHe (Ethan) Zhao PhD Student Monash University
Topic Structure Learning for Interpretable Text Understanding He (Ethan) Zhao PhD Student Monash University 1 Topic modelling / # 0 $ "#! "# - {1,,. # } Prior on 1 and 2: LDA: Dirichlet on 1 and 2 HDPLDA:
More informationSemantic Segmentation. Zhongang Qi
Semantic Segmentation Zhongang Qi qiz@oregonstate.edu Semantic Segmentation "Two men riding on a bike in front of a building on the road. And there is a car." Idea: recognizing, understanding what's in
More informationRandomized Algorithms for Fast Bayesian Hierarchical Clustering
Randomized Algorithms for Fast Bayesian Hierarchical Clustering Katherine A. Heller and Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College ondon, ondon, WC1N 3AR, UK {heller,zoubin}@gatsby.ucl.ac.uk
More informationEfficient Data Structures for Massive N-Gram Datasets
Efficient ata Structures for Massive N-Gram atasets Giulio Ermanno Pibiri University of Pisa and ISTI-CNR Pisa, Italy giulio.pibiri@di.unipi.it Rossano Venturini University of Pisa and ISTI-CNR Pisa, Italy
More informationAssignment 2. Unsupervised & Probabilistic Learning. Maneesh Sahani Due: Monday Nov 5, 2018
Assignment 2 Unsupervised & Probabilistic Learning Maneesh Sahani Due: Monday Nov 5, 2018 Note: Assignments are due at 11:00 AM (the start of lecture) on the date above. he usual College late assignments
More informationStatistical techniques for data analysis in Cosmology
Statistical techniques for data analysis in Cosmology arxiv:0712.3028; arxiv:0911.3105 Numerical recipes (the bible ) Licia Verde ICREA & ICC UB-IEEC http://icc.ub.edu/~liciaverde outline Lecture 1: Introduction
More informationPartitioning Algorithms for Improving Efficiency of Topic Modeling Parallelization
Partitioning Algorithms for Improving Efficiency of Topic Modeling Parallelization Hung Nghiep Tran University of Information Technology VNU-HCMC Vietnam Email: nghiepth@uit.edu.vn Atsuhiro Takasu National
More informationA GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM
A GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM Jayawant Mandrekar, Daniel J. Sargent, Paul J. Novotny, Jeff A. Sloan Mayo Clinic, Rochester, MN 55905 ABSTRACT A general
More informationCS5220: Assignment Topic Modeling via Matrix Computation in Julia
CS5220: Assignment Topic Modeling via Matrix Computation in Julia April 7, 2014 1 Introduction Topic modeling is widely used to summarize major themes in large document collections. Topic modeling methods
More informationBehavioral Data Mining. Lecture 18 Clustering
Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i
More informationConditional Random Fields as Recurrent Neural Networks
BIL722 - Deep Learning for Computer Vision Conditional Random Fields as Recurrent Neural Networks S. Zheng, S. Jayasumana, B. Romera-Paredes V. Vineet, Z. Su, D. Du, C. Huang, P.H.S. Torr Introduction
More informationSplitting Algorithms
Splitting Algorithms We have seen that slotted Aloha has maximal throughput 1/e Now we will look at more sophisticated collision resolution techniques which have higher achievable throughput These techniques
More informationMachine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.1 Cluster Analysis Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim,
More informationMore on Learning. Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization
More on Learning Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization Neural Net Learning Motivated by studies of the brain. A network of artificial
More informationMonte Carlo Methods and Statistical Computing: My Personal E
Monte Carlo Methods and Statistical Computing: My Personal Experience Department of Mathematics & Statistics Indian Institute of Technology Kanpur November 29, 2014 Outline Preface 1 Preface 2 3 4 5 6
More informationMachine Learning at the Limit
Machine Learning at the Limit John Canny*^ * Computer Science Division University of California, Berkeley ^ Yahoo Research Labs @GTC, March, 2015 My Other Job(s) Yahoo [Chen, Pavlov, Canny, KDD 2009]*
More informationThe Open University s repository of research publications and other research outputs. Search Personalization with Embeddings
Open Research Online The Open University s repository of research publications and other research outputs Search Personalization with Embeddings Conference Item How to cite: Vu, Thanh; Nguyen, Dat Quoc;
More informationData Association for Semantic World Modeling from Partial Views
Data Association for Semantic World Modeling from Partial Views Lawson L.S. Wong, Leslie Pack Kaelbling, Tomás Lozano-Pérez CSAIL, MIT, Cambridge, MA 02139 { lsw, lpk, tlp }@csail.mit.edu Abstract Autonomous
More informationImage-Based Rendering and Light Fields
CS194-13: Advanced Computer Graphics Lecture #9 Image-Based Rendering University of California Berkeley Image-Based Rendering and Light Fields Lecture #9: Wednesday, September 30th 2009 Lecturer: Ravi
More informationA Co-Clustering approach for Sum-Product Network Structure Learning
Università degli Studi di Bari Dipartimento di Informatica LACAM Machine Learning Group A Co-Clustering approach for Sum-Product Network Antonio Vergari Nicola Di Mauro Floriana Esposito December 8, 2014
More informationClustering: Classic Methods and Modern Views
Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering
More informationIntroduction to Machine Learning. Xiaojin Zhu
Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006
More informationProbabilistic Graphical Models
Overview of Part One Probabilistic Graphical Models Part One: Graphs and Markov Properties Christopher M. Bishop Graphs and probabilities Directed graphs Markov properties Undirected graphs Examples Microsoft
More informationScalable Approximate Bayesian Inference for Outlier. Detection under Informative Sampling
Journal of Machine Learning Research 17 (2016) 1-49 Submitted 3/15; Revised 1/16; Published 12/16 Scalable Approximate Bayesian Inference for Outlier Detection under Informative Sampling Terrance D. Savitsky
More informationA Topography-Preserving Latent Variable Model with Learning Metrics
A Topography-Preserving Latent Variable Model with Learning Metrics Samuel Kaski and Janne Sinkkonen Helsinki University of Technology Neural Networks Research Centre P.O. Box 5400, FIN-02015 HUT, Finland
More informationBayesian model ensembling using meta-trained recurrent neural networks
Bayesian model ensembling using meta-trained recurrent neural networks Luca Ambrogioni l.ambrogioni@donders.ru.nl Umut Güçlü u.guclu@donders.ru.nl Yağmur Güçlütürk y.gucluturk@donders.ru.nl Julia Berezutskaya
More informationTSIN01 Information Networks Lecture 8
TSIN01 Information Networks Lecture 8 Danyo Danev Division of Communication Systems Department of Electrical Engineering Linköping University, Sweden September 24 th, 2018 Danyo Danev TSIN01 Information
More informationK-means and Hierarchical Clustering
K-means and Hierarchical Clustering Xiaohui Xie University of California, Irvine K-means and Hierarchical Clustering p.1/18 Clustering Given n data points X = {x 1, x 2,, x n }. Clustering is the partitioning
More informationOutlines. Perspective Hierarchical Dirichlet Process (phdp) User-Tagged Image Modeling
Perspective Hierarchical Dirichlet Process for User-Tagged Image Modeling Slide 1: Thanks for your attendance, in the following I would like to present our paper Perspective Hierarchical Dirichlet Process
More informationSearch Engines. Information Retrieval in Practice
Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Classification and Clustering Classification and clustering are classical pattern recognition / machine learning problems
More information