Multi-Modal Metropolis Nested Sampling For Inspiralling Binaries

Multi-Modal Metropolis Nested Sampling For Inspiralling Binaries
Ed Porter (AEI) & Jon Gair (IOA)
2W@AEI Workshop, AEI, September 2008
(We acknowledge useful conversations with F. Feroz and M. Hobson (Cavendish Laboratory, Cambridge) regarding the MultiNest algorithm (arXiv:0704.3704, arXiv:0810.0781).)

Outline
1. Motivation: SMBHBs, multiple sources, multiple-mode solutions...
2. Evolutionary Algorithms: concepts, variants...
3. The Algorithm: nested sampling, clustering, x-means, k-means...
4. Current Results: yes, we have results
5. Future Plans: EMRIs, spinning SMBHBs, smarter algorithms...

1 Motivation

Motivation & Terminology
If we have two events A and B, we define:
- P(A) = prior or marginal probability of A
- P(A|B) = conditional or posterior probability of A given B
Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B), where P(B) acts as a normalising constant.
We can also write Bayes' theorem as
Posterior = (Likelihood x Prior) / Evidence
The problem is that the evidence P(B) is always very difficult to calculate.
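
In parameter-estimation language, the evidence is the likelihood marginalised over the prior; writing it for a parameter set theta and data d (standard notation, not taken from the slide) makes clear why it is a hard, high-dimensional integral:

    \[ P(d) \;=\; \int L(\theta)\,\pi(\theta)\,\mathrm{d}\theta \]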

SMBHBs
- 9-parameter set
- Can use the F-statistic to maximize over some of the parameters
- Non-spinning
- No extra harmonics
- We already have an algorithm based on Metropolis-Hastings MCMC (MHMC) which works very efficiently

LISA Response To A Source

Finding the Modes of the Solution
- Sky degeneracy
- With previous Metropolis-Hastings algorithms we can find either the primary or the antipodal solution, but not both at the same time.
Cornish & Porter, CQG 24, 5729 (2007)

System Knowledge
- The main acceleration can come from knowledge of the system in question: e.g. degeneracies, symmetries, etc.
Cornish & Porter, CQG 24, 5729 (2007)

2 Evolutionary Algorithms

What are they?
- Origins in artificial intelligence
- Population-based optimization algorithms
- Inspired by biological evolution: reproduction, death, mutation, etc.
- Candidate solutions play the role of organisms
- Fitness criteria determine the environment

Metropolis-Hastings Ratio
The ratio is built from three ingredients (assembled below):
- Priors
- Likelihoods
- Transition probability
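
For reference, the standard Metropolis-Hastings ratio assembled from these three ingredients, for a move from the current point x to a proposed point y with prior pi, likelihood L and proposal density q (standard notation, not copied from the slides), is:

    \[ H \;=\; \frac{\pi(y)\,L(y)\,q(x|y)}{\pi(x)\,L(x)\,q(y|x)} \]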

Should I stay or should I go?
First calculate H, then generate u ~ U(0,1) and move if u < alpha, where alpha = min{1, H}, i.e. accept the move with probability alpha.
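
A minimal Python sketch of this accept/reject step, assuming user-supplied log-likelihood and log-prior functions and a symmetric Gaussian proposal (all names here are illustrative, not from the talk):

    import numpy as np

    def mh_step(x, loglike, logprior, step=0.1, rng=None):
        """One Metropolis-Hastings step with a symmetric Gaussian proposal,
        so q(x|y) = q(y|x) and H reduces to the posterior ratio."""
        rng = rng or np.random.default_rng()
        y = x + step * rng.standard_normal(np.shape(x))   # propose a new point
        log_H = (loglike(y) + logprior(y)) - (loglike(x) + logprior(x))
        # Accept with probability alpha = min{1, H}, done in log space
        if np.log(rng.uniform()) < min(0.0, log_H):
            return y
        return x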

Metropolis Sampling
- Developed in 1953 for Boltzmann distributions
- Requires symmetric proposals, i.e. q(x|y) = q(y|x)
- Then only the likelihood ratio is important
- Can also preserve detailed balance, i.e. p(s|x) q(y|x) = p(s|y) q(x|y)

Metropolis-Hastings Sampling
Updated in 1970 to include non-symmetric proposal distributions.
Pros:
1) Faster mixing of chains.
2) Can use multiple proposal distributions which depend only on the current state.
3) Only requires that a function proportional to the density can be evaluated at a point in parameter space. This allows us to generate samples without knowing the normalising constant, i.e. the evidence.
Cons:
1) Multiple proposal distributions require some knowledge of the system.
2) Can still get stuck on secondary solutions.

Simulated Annealing
Suppose we want to sample from a distribution of the form P(x) = exp(-L(x)/2). It is usually easier to sample instead from the tempered distribution P*(x) = exp(-L(x)/(kT)), where k = 2, gradually cooling from a high initial temperature down to T = 1.

Simulated Annealing
Pros:
1) Can prevent chains from getting stuck on secondaries.
2) Shortens and fattens the high-probability regions (peaks become lower and broader).
3) Can greatly speed up burn-in.
Cons:
1) No a priori information on the initial temperature (this can be circumvented with thermostated annealing; see Cornish & Porter, Class. Quant. Grav. 24, 5729 (2007)).
2) The cool-down needs to be slow.
3) If the cool-down is too fast, the chain WILL get stuck.
4) Makes the chain non-Markovian (not really a con for practical purposes).
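
A minimal sketch of a geometrically cooled Metropolis chain in Python; the schedule constants and function names are illustrative assumptions, and the 1/T tempering of the log-likelihood is equivalent to the exp(-L/(kT)) form above with ln L = -L/2:

    import numpy as np

    def annealed_chain(x0, loglike, n_steps, T0=100.0, t_cool=5000, step=0.1):
        """Metropolis chain with simulated annealing: likelihood tempered by 1/T."""
        rng = np.random.default_rng()
        x = np.asarray(x0, dtype=float)
        for i in range(n_steps):
            # Geometric cool-down from T0 to 1 over t_cool steps, then hold T = 1
            T = max(T0 ** (1.0 - i / t_cool), 1.0)
            y = x + step * rng.standard_normal(x.shape)
            log_H = (loglike(y) - loglike(x)) / T   # tempered likelihood ratio
            if np.log(rng.uniform()) < min(0.0, log_H):
                x = y
        return x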

Nested Sampling (Skilling 2004)
- A method to evaluate the evidence
- Climb the likelihood surface by passing through nested equi-likelihood contours
- Shrink the prior volume by rejecting the lowest-likelihood point
- Always search for a higher-likelihood point within the lowest equi-likelihood contour
- Can also return PDFs
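
A minimal sketch of the evidence accumulation in Python, assuming a unit-hypercube prior and replacing the worst live point by brute-force prior sampling (real codes such as MultiNest sample within the likelihood constraint far more cleverly):

    import numpy as np

    def nested_sampling(loglike, ndim, n_live=100, n_iter=1000, rng=None):
        """Minimal nested sampling: returns an estimate of ln(evidence)."""
        rng = rng or np.random.default_rng()
        live = rng.uniform(size=(n_live, ndim))
        logL = np.array([loglike(p) for p in live])
        logZ, logX = -np.inf, 0.0                  # ln(evidence), ln(prior volume)
        for i in range(n_iter):
            worst = np.argmin(logL)
            logX_new = -(i + 1) / n_live           # expected shrinkage per iteration
            logw = np.log(np.exp(logX) - np.exp(logX_new))   # weight of the shell
            logZ = np.logaddexp(logZ, logw + logL[worst])
            # Replace the worst point by a prior draw inside the likelihood contour
            while True:                            # brute force; fine for a sketch
                trial = rng.uniform(size=ndim)
                logL_trial = loglike(trial)
                if logL_trial > logL[worst]:
                    live[worst], logL[worst] = trial, logL_trial
                    break
            logX = logX_new
        return logZ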

BIC Scoring
- A statistical criterion for model selection
- Also called the Schwarz criterion or SIC
- If n = sample size, k = number of free parameters, and L = maximized likelihood of the estimated model, then
  BIC = -2 ln L + k ln(n)
Properties:
(a) Independent of priors
(b) Can measure the efficiency of the parameterized model
(c) Penalizes complexity
(d) Good for clustering
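
The score itself is a one-liner, shown here with the inputs exactly as defined on the slide. Note that with this sign convention lower BIC is better; some authors flip the sign, which is the convention behind "highest BIC scores survive" later in the talk:

    import numpy as np

    def bic(max_log_likelihood, k_free_params, n_samples):
        """Schwarz / Bayesian information criterion: -2 ln L + k ln(n)."""
        return -2.0 * max_log_likelihood + k_free_params * np.log(n_samples)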

k-means Clustering (Duda & Hart 1973, Bishop 1995)
- k stands for the number of clusters
- Goal: given n points, separate them into k clusters
- Starting centroids are chosen at random
- Converges to a local minimum of a distortion measure
- Can be quite slow
- Choosing the value of k is an issue
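
A minimal sketch of the standard Lloyd iteration in Python (random initialisation and Euclidean distortion, as described above; the function name is mine):

    import numpy as np

    def kmeans(points, k, n_iter=100, rng=None):
        """Lloyd's algorithm: alternate nearest-centroid assignment and updates."""
        rng = rng or np.random.default_rng()
        centroids = points[rng.choice(len(points), k, replace=False)]
        for _ in range(n_iter):
            # Assign each point to its nearest centroid
            d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
            labels = np.argmin(d, axis=1)
            # Move each centroid to the mean of its assigned points
            new = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                            else centroids[j] for j in range(k)])
            if np.allclose(new, centroids):        # converged to a local minimum
                break
            centroids = new
        return centroids, labels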

x-means Clustering (Pelleg & Moore 1999)
- Define a k(min) and a k(max)
- Step 1: improve parameters (i.e. run k-means for each cluster)
- Step 2: improve structure (decide, via BIC, if and where new centroids are needed)
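
A sketch of the "improve structure" step, reusing the kmeans and bic sketches above: each cluster is tentatively split in two, and the split is kept only if BIC prefers the two-cluster model. The spherical-Gaussian log-likelihood is the model assumed in the x-means paper; all helper names are mine:

    import numpy as np

    def _gauss_lnL(members, centroid):
        """Maximized log-likelihood of points under one spherical Gaussian."""
        n, d = members.shape
        if n == 0:
            return 0.0
        var = max(np.sum((members - centroid) ** 2) / (n * d), 1e-12)
        return -0.5 * n * d * (np.log(2 * np.pi * var) + 1.0)

    def improve_structure(points, labels, centroids, k_max):
        """Split a cluster in two with k-means; keep the split if BIC improves."""
        d = points.shape[1]
        out = []
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) < 4:                   # too few points to test a split
                out.append(centroids[j][None, :])
                continue
            children, child_labels = kmeans(members, 2)
            bic_one = bic(_gauss_lnL(members, centroids[j]), d + 1, len(members))
            lnL_two = sum(_gauss_lnL(members[child_labels == i], children[i])
                          for i in range(2))
            bic_two = bic(lnL_two, 2 * (d + 1), len(members))
            out.append(children if bic_two < bic_one else centroids[j][None, :])
        return np.concatenate(out, axis=0)[:k_max]   # crude cap at k(max)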

3 The Algorithm

Initial Population Selection
- Choose a fitness threshold, e.g. SNR = 15
- Generate organisms until N = 100
- Allow organisms to improve their fitness using an uphill climber, i.e. a deterministic search where higher-likelihood points are accepted without question (a minimal sketch follows this list)
- Generate k centroids according to x-means clustering and associate each of the organisms with a centroid
- Centroids with the highest BIC scores survive, while unfit centroids are killed off
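
A minimal sketch of such an uphill climber in Python, assuming a log-likelihood function and a Gaussian perturbation whose step size is illustrative:

    import numpy as np

    def uphill_climb(x, loglike, n_steps=200, step=0.05, rng=None):
        """Deterministic uphill climber: accept a move only if ln L improves."""
        rng = rng or np.random.default_rng()
        best, best_lnL = np.asarray(x, dtype=float), loglike(x)
        for _ in range(n_steps):
            trial = best + step * rng.standard_normal(best.shape)
            trial_lnL = loglike(trial)
            if trial_lnL > best_lnL:               # accept without question if better
                best, best_lnL = trial, trial_lnL
        return best, best_lnL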

Population Evolution
- The population evolves according to a swarm-intelligence model, i.e. organisms in a swarm evolve on their own, but we track the centre of mass of the swarm
- Every 20 iterations, recalculate the positions and number of centroids, as swarms can break up
- Re-associate the organisms with centroids, so that the centroids and the centroid number evolve along with the children
- The goal is that the organisms evolve towards the different modes of the solution
- While the organisms do all the work, it is the centroids we are most interested in, as they define the global fitness of a cluster

Moving The Clusters
- Nested sampling: try to replace ln L by jumping within a 1-sigma range (a sketch of this move follows)
- Uphill climber: try the proposed point and accept if ln L is better
- Metropolis: uniform proposal distribution
- Metropolis-Hastings: non-uniform proposal distributions
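
A minimal sketch of the first move type in Python: propose uniformly within a 1-sigma box and accept only points above the current likelihood floor. The per-parameter sigma estimate and the retry budget are assumptions for illustration:

    import numpy as np

    def nested_move(x, lnL_min, loglike, sigma, n_tries=100, rng=None):
        """Nested-sampling-style move: jump within a 1-sigma range and accept
        only points whose ln L exceeds the current floor lnL_min."""
        rng = rng or np.random.default_rng()
        for _ in range(n_tries):
            trial = x + sigma * rng.uniform(-1.0, 1.0, size=np.shape(x))
            if loglike(trial) > lnL_min:
                return trial
        return x   # no acceptable point found; keep the current one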

4 Current Results

Binary Parameters
Two sources:
1) m1 = 1e7, m2 = 1e6, z = 1, tc = 0.9 yrs, SNR ~ 200
2) m1 = 4e6, m2 = 1e6, z = 1.5, tc = 1.02 yrs, SNR ~ 50
- Assume an observation time of 1 year
- Low-frequency approximation for the LISA response

Single Source SMBHB
- Initial SA heat = 100, TA threshold set to SNR = 10
- Used prior information to cluster in sky position
- Worked very well: clusters found both the real and the antipodal solutions
- Used the F-statistic
- Fitness threshold for the initial population set at SNR = 5
- No time-of-coalescence maximization
- Only a small number of organisms needed, ~20-30
- The code took 3 hours to run on a dual-Xeon desktop


Double Source SMBHB
- Same initial SA and TA conditions
- Initial fitness threshold set to SNR = 15
- While sky clustering alone worked quite well, clustering in both time of coalescence and sky position worked better
- More organisms needed (~100) to account for the multiple modes
- The code takes ~24 hours to run
- Set k(min) = 2 and k(max) = 10


5 For the future...

Things To Try...
- Birth and death of weak organisms/clusters
- Growing/pruning of clusters to a constant population
- Ant colony optimization
- More efficient clustering
- Using the cluster to approximate the covariance matrix, to obtain the size and direction of moves
- A Fisher-matrix-less algorithm?
- Cross-cluster learning

Sources To Try...
1) Spinning black hole binaries / EMRIs
2) Will be slower, will need more organisms
3) Can handle degenerate parameters very well
4) May be successful, as we are not relying on a single chain which can get stuck
5) The method is designed to find not only the primary modes but also the secondaries

Conclusion
- An entirely new algorithm for non-spinning SMBHBs
- Based on an evolutionary algorithm
- Not only finds multiple-mode solutions but also maps the PDFs
- Can also return the evidence
- Works very well for multiple sources
- Run time is comparable to that of the MHMC algorithm
- Next: application to EMRIs and spinning black holes