Probabilistic multi-resolution scanning for cross-sample differences
1 Probabilistic multi-resolution scanning for cross-sample differences
Li Ma (joint work with Jacopo Soriano)
Duke University
10th Conference on Bayesian Nonparametrics
2 Example: Flow cytometry
[Figure: pairwise scatterplots of flow cytometry channels (FSC-A, FSC-H, SSC-A, Aqua, Dext, CD8) for two cell populations]
Is there any difference between the two cell populations? What is the difference?
Often only a small subset (<1%) of the cells is involved.
3 Challenges: Identifying highly local differences in large data sets
- Rich distributional features: multi-modality, skewness, and other tail behaviors.
- Potential differences can be of a variety of shapes or forms.
- Differences are often highly local, involving small portions of the data.
- Fitting nonparametric models requires large amounts of computation.
4 What is needed?
Detectors (not just tests!) for cross-sample differences that are
- Flexible (i.e., nonparametric).
- Sensitive to highly local structures.
- Computationally efficient.
Bonus: allow principled uncertainty quantification and decision making.
5 A simple inference strategy: multi-resolution scanning
- Scan over the sample space using moving windows.
- Use windows of a variety of sizes (i.e., resolutions).
- Carry out a base inference task (e.g., a two-sample test) on each window.
- Combine evidence across windows to summarize the inferred structure.
6 Divide-and-conquer: multi-resolution scanning
Multi-resolution scanning (MRS) transforms
- a large data set into small information packets,
- a complex nonparametric problem into simple parametric problems.
This brings computational efficiency and lends itself to parallelized computing. The strategy is simple and flexible.
Caveat: the information packets may be too small for identifying highly local structures.
7 Two-sample comparison
Observe two data samples
$x_{11}, x_{12}, \ldots, x_{1n_1} \sim Q_1$
$x_{21}, x_{22}, \ldots, x_{2n_2} \sim Q_2.$
How are $Q_1$ and $Q_2$ different, if at all?
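To make the later window-level computations concrete, here is a minimal simulated instance of this setup (a sketch, not from the talk: the baseline Beta distribution, the bump location, and the 0.1% contamination rate are all made up):

```python
import numpy as np

rng = np.random.default_rng(42)

# Sample 1 ~ Q_1: a smooth baseline distribution on [0, 1).
n1 = 50_000
x1 = rng.beta(2, 5, size=n1)

# Sample 2 ~ Q_2: same baseline, except ~0.1% of points are moved into a
# narrow interval near 0.9: the kind of highly local difference MRS targets.
n2 = 50_000
bump = rng.random(n2) < 0.001
x2 = np.where(bump, rng.uniform(0.895, 0.905, size=n2), rng.beta(2, 5, size=n2))
```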
8 A multi-resolution windowing scheme
Construct windows of various sizes through nested dyadic partitioning:
[Figure: dyadic partition tree of the sample space Ω over levels k = 0, 1, 2, 3, with a window A split into children A_l and A_r]
We shall refer to the partitioning also as a windowing scheme (tree).
9 Multi-resolution decomposition of probability distribution
[Figure: a window A at scale k splits into A_l and A_r at scale k+1, carrying proportions θ(A) and 1 − θ(A)]
On each window A, θ(A) gives the proportion of probability in A_l. Q can be fully represented by the θ(A) on all windows:
$Q \longleftrightarrow \{\theta(A) : A \in \mathcal{T}\}$
where $\mathcal{T}$ is a windowing tree that generates the Borel σ-algebra. We shall call the θ(A)'s probability assignment coefficients (PACs).
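As a concrete one-dimensional illustration (a sketch assuming the data live in [0, 1) and the windows are the canonical dyadic intervals; the helper name is ours), the empirical PACs are simply left-child proportions:

```python
import numpy as np

def empirical_pacs(x, K):
    """Empirical PACs theta_hat(A) = n(A_l) / n(A) on the dyadic windows
    of [0, 1) up to resolution K, keyed by (level, window index)."""
    pacs = {}
    for k in range(K):
        width = 2.0 ** -k
        for j in range(2 ** k):
            a = j * width
            in_A = x[(x >= a) & (x < a + width)]
            if len(in_A) > 0:
                # proportion of A's points that fall in the left child A_l
                pacs[(k, j)] = np.mean(in_A < a + width / 2)
    return pacs
```

For example, `empirical_pacs(x1, K=8)` summarizes sample 1 over eight resolutions.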
10 Induced decomposition of the statistical experiment
The transform of a distribution into PACs induces a decomposition of the experiment of observing an i.i.d. sample into a collection of sequential Binomial experiments:
[Figure: the observations (red ticks) in window A, n(A) in total, split into n(A_l) and n(A_r) across the children, governed by θ(A)]
The experiment on window A is Binomial:
$n(A_l) \mid n(A) \sim \mathrm{Binomial}(n(A), \theta(A)).$
11 Induced decomposition of the statistical experiment
The likelihood of the i.i.d. sample can be written as
$L(Q) = \prod_{i=1}^{n} q(x_i) = \prod_{A \in \mathcal{T}} L_A(Q)$, where $L_A(Q) = C_A(x)\, \theta(A)^{n(A_l)} (1-\theta(A))^{n(A_r)}.$
So each window contributes to the empirical evidence in an orthogonal manner:
$x_1, x_2, \ldots, x_n \longleftrightarrow \{(n(A_l), n(A_r)) : A \in \mathcal{T}\}.$
The sufficient statistic $(n(A_l), n(A_r))$ forms the information packet for window A. The likelihood principle implies that, in the Bayesian paradigm, one can model the Binomial experiments while ignoring their sequential nature.
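One way to see the decomposition at work (continuing the sketch above, with our own helper name): multiplying the PACs along the path from Ω to a level-K leaf telescopes to that leaf's probability, so the window-level packets jointly carry exactly the information in the level-K histogram of the sample:

```python
def path_probability(pacs, leaf, K):
    """Multiply theta_hat or (1 - theta_hat) along the root-to-leaf path;
    with empirical PACs this telescopes to n(leaf) / n."""
    prob = 1.0
    for k in range(K):
        j = leaf >> (K - k)                       # ancestor window at level k
        theta = pacs.get((k, j))
        if theta is None:                         # empty window: no mass below
            return 0.0
        went_left = ((leaf >> (K - k - 1)) & 1) == 0
        prob *= theta if went_left else 1.0 - theta
    return prob
```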
12 Two-sample comparison
[Figure: a window A splitting into A_l and A_r, with proportions θ_1(A), 1 − θ_1(A) under Q_1 and θ_2(A), 1 − θ_2(A) under Q_2]
$Q_1$ and $Q_2$ are different if and only if $\theta_1(A) \neq \theta_2(A)$ on some windows.
Base inference task: on each window A, test a simple two-sample hypothesis
$H_0(A) : \theta_1(A) = \theta_2(A)$
under the corresponding Binomial experiment.
13 MRS for two-sample difference
- Scan over the windows (up to some maximum resolution).
- Carry out a hypothesis test for $H_0(A) : \theta_1(A) = \theta_2(A)$ on each window under the model
  $n_1(A_l) \sim \mathrm{Binomial}(n_1(A), \theta_1(A))$
  $n_2(A_l) \sim \mathrm{Binomial}(n_2(A), \theta_2(A)).$
- Combine the evidence, and report the significant windows.
14 Base inference task
A variety of testing strategies can be used, such as a classical LR test or χ² test that gives a p-value. Look for windows A with very small p-values.
How small is small? We need to adjust for multiple testing in a resolution-specific way.
We shall take a fully probabilistic Bayesian approach to complete the base inference task. Main advantage: it facilitates borrowing strength across windows.
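For the classical option above, each window's information packet is just a 2×2 table of left/right counts from the two samples, so the base test is one contingency-table test per window. A sketch (our helper, using scipy's standard test):

```python
from scipy.stats import chi2_contingency

def window_pvalue(n1l, n1r, n2l, n2r):
    """Chi-square test of H_0(A): theta_1(A) = theta_2(A) from the 2x2
    table of child counts in the two samples."""
    table = [[n1l, n1r], [n2l, n2r]]
    if min(n1l + n1r, n2l + n2r, n1l + n2l, n1r + n2r) == 0:
        return 1.0  # a degenerate table carries no cross-sample evidence
    return chi2_contingency(table)[1]
```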
15 Bayesian hypothesis testing on each window A
Introduce a latent state variable S(A) such that S(A) = 0 if $H_0(A)$ is true and S(A) = 1 if $H_0(A)$ is false. Hypothesis testing then amounts to inferring the state S(A).
16 This requires specifying priors on S(A) and on $(\theta_1(A), \theta_2(A))$ given S(A). Adopt conjugate Beta priors on $(\theta_1(A), \theta_2(A))$:
$\theta_1(A) = \theta_2(A) \mid S(A) = 0 \sim \mathrm{Beta}(\alpha_l(A), \alpha_r(A))$
$\theta_1(A), \theta_2(A) \mid S(A) = 1 \overset{\text{ind}}{\sim} \mathrm{Beta}(\alpha_l(A), \alpha_r(A)).$
We shall write this as
$\theta_1(A), \theta_2(A) \mid S(A) \sim \text{paired-Beta}(\alpha(A), S(A)),$
and place a Bernoulli prior on S(A): $S(A) \sim \mathrm{Bernoulli}(\rho(A)).$
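The paired-Beta prior is easy to transcribe as a sampler (a sketch; the function name is ours):

```python
def sample_paired_beta(S, al, ar, rng):
    """Draw (theta_1(A), theta_2(A)) given the state S(A): one shared
    Beta(al, ar) draw under the null, two independent draws otherwise."""
    if S == 0:
        theta = rng.beta(al, ar)
        return theta, theta
    return rng.beta(al, ar), rng.beta(al, ar)
```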
17 Hierarchical model representation
The MRS for two-sample comparison can be expressed as a hierarchical model:
$S(A) \sim \mathrm{Bernoulli}(\rho(A))$
$\theta_1(A), \theta_2(A) \mid S(A) \sim \text{paired-Beta}(\alpha(A), S(A))$
independently for all windows A, with
$n_1(A_l) \mid \theta_1(A) \sim \mathrm{Binomial}(n_1(A), \theta_1(A))$
$n_2(A_l) \mid \theta_2(A) \sim \mathrm{Binomial}(n_2(A), \theta_2(A)).$
The model is fully conjugate, and the evidence on each window is summarized in the posterior probability of S(A) = 1:
$P(S(A) = 1 \mid x) = \left(1 + \frac{1-\rho(A)}{\rho(A)} \cdot \frac{1}{\mathrm{BF}(A)}\right)^{-1},$
where BF(A) is the Bayes factor in favor of the alternative on A. We call this the posterior marginal alternative probability (PMAP) on A.
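Conjugacy makes the window-level evidence available in closed form via Beta-Binomial marginal likelihoods. A sketch (the function name, the symmetric pseudo-counts, and the rho = 0.5 default are ours, not the talk's choices):

```python
import math
from scipy.special import betaln

def window_pmap(n1l, n1r, n2l, n2r, al=0.5, ar=0.5, rho=0.5):
    """PMAP P(S(A) = 1 | x) on one window under the conjugate model."""
    def log_marginal(l, r):
        # log of the Binomial kernel integrated against Beta(al, ar)
        return betaln(al + l, ar + r) - betaln(al, ar)
    log_m0 = log_marginal(n1l + n2l, n1r + n2r)      # pooled: theta_1 = theta_2
    log_m1 = log_marginal(n1l, n1r) + log_marginal(n2l, n2r)
    log_post_odds = math.log(rho / (1 - rho)) + (log_m1 - log_m0)
    if log_post_odds > 0:                            # numerically safe logistic
        return 1.0 / (1.0 + math.exp(-log_post_odds))
    odds = math.exp(log_post_odds)
    return odds / (1.0 + odds)
```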
18 Summarizing evidence and reporting significant windows
For proper multiple testing adjustment, we need to make it harder to call a window significant at high resolutions. Why? The number of windows doubles with each resolution level. Specifically, we need to let
$\rho(A) = P(S(A) = 1) \propto 2^{-k}$
where k is the resolution level of A, so that the prior expected number of alternative windows stays bounded at each resolution.
This is really bad news for identifying local differences, because small windows tend to contain limited data in the first place!
19 Borrowing strength across scanning windows
The MRS strategy treats each window as an independent inferential unit: the PMAP on A does not take into account empirical evidence from other windows.
Do nearby windows provide useful information? Yes! Differential structures tend to cluster spatially, so we essentially need smoothing on the cross-sample variability.
How to borrow strength across windows? Incorporate dependency into the hypotheses across windows through a graphical model on the latent variables $\{S(A) : A \in \mathcal{T}\}$.
20 Markov tree (Crouse et al. 1998)
Model the state variables S(A) for all A using a Markov process on the multi-resolution tree.
[Figure: a scale-k window with S(A) = g transitioning to children at scale k+1 with S(A_l) = h and S(A_r) = s, with probabilities ρ_{g,h}(A_l) and ρ_{g,s}(A_r)]
This lets the null/alternative states on the windows attain spatial-scale dependency.
21 Hierarchical model representation
The new hierarchical model:
$\{S(A) : A \in \mathcal{T}\} \sim \mathrm{MT}(\rho)$
$\theta_1(A), \theta_2(A) \mid S(A) \sim \text{paired-Beta}(\alpha(A), S(A))$
[Figure: graphical model in which S(A) generates (θ_1(A), θ_2(A)) and transitions, each with matrix ρ, to the children states S(A_l) and S(A_r), which generate (θ_1(A_l), θ_2(A_l)) and (θ_1(A_r), θ_2(A_r))]
22 Specifying the Markov transition matrix
If A's parent is in state g, then A is in state h with probability ρ_{g,h}(A):
$P(S(A) = h \mid S(\mathrm{parent}(A)) = g) = \rho_{g,h}(A).$
A parsimonious two-parameter specification:
$\rho(A) = \begin{pmatrix} \rho_{0,0}(A) & \rho_{0,1}(A) \\ \rho_{1,0}(A) & \rho_{1,1}(A) \end{pmatrix} = \begin{pmatrix} 1 - \gamma 2^{-k} & \gamma 2^{-k} \\ 1 - \gamma \beta^{-k} & \gamma \beta^{-k} \end{pmatrix}$
where k is the level of A. The $2^{-k}$ factor is included to control the prior expected number of rejections at each resolution; γ controls the prior expected number of significant windows; β controls the level of spatial dependency.
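In code, the specification reads as follows (a sketch: the slide's superscripts are reconstructed here, so the exact decay in the second row, and the γ and β defaults, should be treated as assumptions):

```python
import numpy as np

def transition_matrix(k, gamma=0.07, beta=1.5):
    """2x2 Markov-tree transition matrix at resolution k; row g gives
    P(S(A) = . | S(parent(A)) = g)."""
    p01 = gamma * 2.0 ** -k   # null parent -> alternative child
    p11 = gamma * beta ** -k  # alternative parent -> alternative child
    return np.array([[1.0 - p01, p01],
                     [1.0 - p11, p11]])
```

With β < 2 the alternative state persists down the tree more readily than it arises afresh, which is what encodes the spatial clustering of differences.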
23 Full posterior
Given i.i.d. samples from $Q_1$ and $Q_2$, the joint posterior is still a Markov-MRS:
$\{S(A) : A \in \mathcal{T}\} \mid \rho, x \sim \mathrm{MT}(\tilde{\rho})$
$\theta_1(A), \theta_2(A) \mid S(A), \alpha, x \sim \text{paired-Beta}(\tilde{\alpha}(A), S(A))$
with $\tilde{\rho} = \{\tilde{\rho}(A) : A \in \mathcal{T}\}$ given by
$\tilde{\rho}(A) = \mathrm{diag}(\phi(A))^{-1}\, \rho(A)\, \mathrm{diag}(m(A))\, \mathrm{diag}(\phi(A_l) \circ \phi(A_r)),$
where $m(A) = (m_0(A), m_1(A))$ collects the marginal likelihoods of the information packet on A under the two states, $\circ$ represents the Hadamard product, and $\phi : \mathcal{T} \to \mathbb{R}^2$ is given by
$\phi(A) = \begin{cases} \rho(A)\, \mathrm{diag}(\phi(A_l) \circ \phi(A_r))\, m(A) & \text{if } n_1(A) + n_2(A) > 1 \\ (1/\mu(A),\, 1/\mu(A)) & \text{if } n_1(A) + n_2(A) = 1 \\ (1, 1) & \text{otherwise.} \end{cases}$
Computing φ involves a bottom-up recursive information passing.
24 Computing PMAPs
For each $A \in \mathcal{T}$, let $\varphi(A) = (P(S(A) = 0 \mid x),\, P(S(A) = 1 \mid x))$. Then $\varphi(\Omega) = (\tilde{\rho}_{0,0}(\Omega), \tilde{\rho}_{0,1}(\Omega))$.
Now suppose $\varphi(A)$ has been computed for all windows up to resolution $k \geq 0$; then for any resolution-(k+1) window A,
$\varphi(A) = \tilde{\rho}(A)^{\top} \varphi(A_p)$
where $A_p$ is the parent of A at resolution k. This involves a top-down recursive information passing.
In essence this is a forward-backward belief propagation algorithm.
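A sketch of the top-down pass (our helper; windows indexed heap-style on a complete binary tree with root 1, and `rho_tilde` assumed to hold the 2×2 posterior transition matrices from the bottom-up pass):

```python
import numpy as np

def downward_pmaps(rho_tilde, K):
    """Posterior state marginals (P(S=0|x), P(S=1|x)) for every window
    of a depth-K tree, given posterior transition matrices rho_tilde."""
    marg = {1: rho_tilde[1][0]}        # root marginal: row 0, as on the slide
    for i in range(1, 2 ** K):         # all internal nodes, top-down
        for child in (2 * i, 2 * i + 1):
            # P(S(child) = t | x) = sum_g P(S(i) = g | x) * rho_tilde[child][g, t]
            marg[child] = rho_tilde[child].T @ marg[i]
    return marg
```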
25 Marginal posterior consistency
Theorem. Suppose we observe two independent i.i.d. samples $x_1 = (x_{1,1}, \ldots, x_{1,n_1})$ and $x_2 = (x_{2,1}, \ldots, x_{2,n_2})$, respectively from two distributions $P_1$ and $P_2$ supported on the entire sample space. Under weak conditions on the prior parameters, as $n_1, n_2 \to \infty$ with $n_1/(n_1+n_2) \to \zeta \in (0,1)$, for any window A,
$P(S(A) = 1 \mid x) \overset{p}{\to} \begin{cases} 1 & \text{if } P_1(A_l \mid A) \neq P_2(A_l \mid A) \\ 0 & \text{if } P_1(A_l \mid A) = P_2(A_l \mid A). \end{cases}$
26 Joint posterior consistency
Theorem. Suppose we observe two independent i.i.d. samples $x_1 = (x_{1,1}, \ldots, x_{1,n_1})$ and $x_2 = (x_{2,1}, \ldots, x_{2,n_2})$, respectively from two distributions $P_1$ and $P_2$ supported on the entire sample space. Under weak conditions on the prior parameters, as $n_1, n_2 \to \infty$ with $n_1/(n_1+n_2) \to \zeta \in (0,1)$,
$P\big(S(A) = 1\{P_1(A_l \mid A) \neq P_2(A_l \mid A)\} \text{ for all } A \text{ up to resolution } K \mid x\big) \overset{p}{\to} 1.$
27 Example: Testing performance
[Figure: four scenarios (local shift, local dispersion, global shift, global dispersion); panels show the densities and sensitivity vs. specificity curves comparing MRS, KNN, Cramér, OPT, PT, and CH]
28 Example: Visualizing the empirical evidence
[Figure: two window-by-resolution-level maps. Left: the PMAPs. Right: the posterior expected effect size.]
Effect size is measured by the absolute log odds ratio
$\mathrm{eff}(A) = \left| \log \frac{\theta_1(A)/(1-\theta_1(A))}{\theta_2(A)/(1-\theta_2(A))} \right|.$
Note the clustering of differential structures!
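A Monte Carlo sketch of the posterior expected effect size on one window (our helper; since eff(A) = 0 under the null, the alternative-branch average is weighted by the window's PMAP):

```python
import numpy as np

def expected_effect_size(n1l, n1r, n2l, n2r, pmap, al=0.5, ar=0.5,
                         draws=20_000, seed=0):
    """Approximate E(eff(A) | x) by sampling theta_1, theta_2 from their
    posterior Betas under the alternative."""
    rng = np.random.default_rng(seed)
    t1 = rng.beta(al + n1l, ar + n1r, size=draws)
    t2 = rng.beta(al + n2l, ar + n2r, size=draws)
    eff = np.abs(np.log(t1 / (1 - t1)) - np.log(t2 / (1 - t2)))
    return pmap * eff.mean()
```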
29 Reporting significant windows
We take a fully decision theoretic approach. Candidate loss functions (Müller et al. 2007):
$L(\delta, c) = c \sum_A 1\{S(A)=0\}\, \delta(A) + (1-c) \sum_A 1\{S(A)=1\}\, (1-\delta(A))$
Optimal rule: $\delta(A) = 1\{P(S(A)=1 \mid x) > c\}$.
$L(\delta, c) = -\sum_A \mathrm{eff}(A)\, \delta(A) + \sum_A \mathrm{eff}(A)\, (1-\delta(A)) + 2c \sum_A \delta(A)$
Optimal rule: $\delta(A) = 1\{E(\mathrm{eff}(A) \mid x) > c\}$.
30 Multiple testing control
One can control the posterior expected number of false rejections (NFR),
$\mathrm{NFR}(c) = E(\#\text{ of falsely rejected } H_0(A)\text{'s} \mid x) = \sum_{A : \delta(A) = 1} P(S(A) = 0 \mid x),$
which is computable using the PMAPs. Or control the posterior expected false discovery rate (FDR),
$\mathrm{FDR}(c) = E\!\left(\frac{\#\text{ of falsely rejected } H_0(A)\text{'s}}{\#\text{ of rejected } H_0(A)\text{'s}} \,\middle|\, x\right) = \frac{\mathrm{NFR}(c)}{|\{A : \delta(A) = 1\}|}.$
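Both quantities are direct functions of the PMAPs, so the cutoff c can be calibrated numerically. A sketch (our helper) that finds the most generous PMAP cutoff whose posterior expected FDR stays at or below a target:

```python
import numpy as np

def fdr_cutoff(pmaps, target=0.05):
    """Return the PMAP cutoff for the largest rejection set whose
    posterior expected FDR is <= target (None if no set qualifies)."""
    p = np.sort(np.asarray(pmaps))[::-1]             # most significant first
    # expected FDR after rejecting the top j windows: mean of their (1 - PMAP)
    efdr = np.cumsum(1.0 - p) / np.arange(1, len(p) + 1)
    passing = np.nonzero(efdr <= target)[0]
    return p[passing[-1]] if len(passing) else None
```

Rejecting every window with PMAP at or above the returned cutoff keeps the posterior expected FDR within the target; the same cumulative sums give NFR(c) directly.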
31 Two extensions through hierarchical modeling
Optional pruning: Bayesian model averaging over the maximum resolution for scanning. Achieved through another layer of hyperprior on $\{S(A) : A \in \mathcal{T}\}$:
$\{S(A) : A \in \mathcal{T}\} \mid \rho, \eta \sim \mathrm{pMT}(\rho, \eta).$
32 Two extensions through hierarchical modeling
Adaptive partitioning: inferring a good windowing scheme in multivariate problems. Achieved through a hyperprior on the windowing tree:
$\mathcal{T} \mid \lambda \sim \mathrm{RP}(\lambda)$
$\{S(A) : A \in \mathcal{T}\} \mid \rho, \eta, \mathcal{T} \sim \mathrm{pMT}(\rho, \eta, \mathcal{T})$
$\theta_1(A), \theta_2(A) \mid S(A), \alpha, \mathcal{T} \sim \text{paired-Beta}(\alpha(A), S(A))$
The full posterior is available analytically, and PMAPs are computable through forward-backward recursion. Significant windows are reported and visualized on the MAP tree $\hat{\mathcal{T}}$ (empirical Bayes), which is computable through a forward-backward-forward algorithm.
33 Back to the flow cytometry data
[Figure: pairwise scatterplots of the flow cytometry channels (FSC-A, FSC-H, SSC-A, Aqua, Dext, CD8)]
One sample is transfected with a small number of cells that are high in Dext and CD8.
34 Visualizing the difference on the MAP windowing scheme
[Figure: posterior expected effect size on the windows of the MAP tree]
For such big data: adopt a loss that takes the effect size into account. Optimal decision rule: $\delta(A) = 1\{E(\mathrm{eff}(A) \mid x) > c\}$ with 1% expected FDR.
35 Example: Identifying differential cells in flow cytometry
[Figure: scatterplots over the channels (FSC-A, FSC-H, SSC-A, Aqua, Dext, CD8) highlighting the identified cells]
The identified differential region involves 0.004% and 0.213% of the two cell populations, and is indeed high in Dext and CD8.
36 Summary
- A hierarchical model formulation for MRS.
- Utilizing graphical modeling to borrow strength across windows, thereby enhancing the ability to identify local structures.
- A fully principled decision theoretic approach to hypothesis testing and multiplicity adjustment.
- Applicable to the k-sample setting and to other distributional families with a corresponding multi-resolution decomposition.
- R package MRS to become available.
Thank you!