Simulate and Reject Monte Carlo Exact Conditional Tests for Quasi-independence

Peter W. F. Smith and John W. McDonald
Department of Social Statistics, University of Southampton, Southampton, SO17 1BJ, United Kingdom

1 Introduction

In a two-way contingency table, the null hypothesis of quasi-independence (QI) usually arises for two main reasons: 1) some cells involve structural zeros or 2) interest is focused on part of the table, e.g., the off-diagonal cells. Consider Table 1, analyzed by Becker (1990), which cross-classifies two independent interpretations of sputum cytology slides for lung cancer. Since the two interpretations tend to agree, most of the observations lie on the main diagonal and the hypothesis of independence is rejected. The hypothesis of QI for the off-diagonal cells, i.e., that the interpretations are independent given that they differ, is considered. However, the sparseness of the off-diagonal cells causes concern about the validity of using asymptotic tests, and an exact conditional test is used.

Table 1: Cross-classification of first and second independent interpretations of sputum cytology slides for lung cancer (Source: Archer et al., 1966). Categories for both interpretations: N = Negative, A = Ambiguous cells, S = Suspect, P = Positive, T = Technically unsatisfactory. [...]

In order to perform an exact test of quasi-independence, the null distribution of an appropriate test statistic must be calculated or simulated. For both independence and quasi-independence, calculating the required distribution is often computationally infeasible, so simulation is used and a Monte Carlo exact conditional test is performed.

A Monte Carlo exact conditional test for independence is described by Agresti, Wackerly and Boyett (1979), Kreiner (1987) and Whittaker (1990). Briefly, one generates a random sample of tables according to the conditional distribution of the table counts given the marginal totals. For each generated table, an appropriate test statistic is calculated, and the exact conditional p-value is estimated by the proportion of generated tables which are at least as discrepant from the null as the observed table. The accuracy of this unbiased estimate may be evaluated using binomial confidence intervals.

The problem, when using this approach to test for quasi-independence, is how to generate a random sample of tables from the null distribution since, as shown by Smith and McDonald (1993), the null distribution has a normalizing constant which is very difficult to evaluate. In the next section, we introduce a simulate and reject procedure based on simulating tables under independence. We then suggest some modifications which dramatically reduce the rejection rate and so make the procedure viable.

2 Simulate and Reject Procedure

Let $X = \{x_{ij} : ij \in \mathcal{I} = (1, \ldots, r) \times (1, \ldots, c)\}$ be an $r \times c$ contingency table, and let $I$ be a proper subset of the index set $\mathcal{I}$. We call the cells in $I$ the cells of interest and the cells not in $I$ fixed. For the $5 \times 5$ Table 1, $I$ refers to the off-diagonal cells. The saturated log-linear model for $m_{ij} = E(X_{ij})$ has the form

$$\log m_{ij} = \lambda + \lambda^{1}_{i} + \lambda^{2}_{j} + \lambda^{12}_{ij}.$$

The hypothesis of quasi-independence over $I$ corresponds to $\lambda^{12}_{ij} = 0$ for $ij \in I$. Now $\lambda^{12}_{ij}$ for $ij \notin I$ are nuisance parameters with sufficient statistics $x_{ij}$, $ij \notin I$. Therefore, an exact conditional test for QI is constructed using the conditional distribution of the table counts, given the margins and the observed counts in the fixed cells. Hence, tables under QI can be generated by simulating tables under independence and only retaining those where the counts in the fixed cells match the observed values.
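The naive procedure and the p-value estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3 x 3 table is invented so that matches are frequent, the function names are hypothetical, tables under independence are drawn by shuffling column labels rather than by any published algorithm, the test statistic is a placeholder (a single cell count, not the likelihood ratio statistic), and the interval uses the normal approximation to the binomial with z = 2.576 for 99% coverage.

```python
import math
import random


def random_table(row_totals, col_totals, rng):
    """Draw one table with the given margins under independence.

    Shuffling the column labels of the individual observations against
    fixed row labels gives the multivariate hypergeometric distribution.
    """
    rows = [i for i, n in enumerate(row_totals) for _ in range(n)]
    cols = [j for j, n in enumerate(col_totals) for _ in range(n)]
    rng.shuffle(cols)
    table = [[0] * len(col_totals) for _ in row_totals]
    for i, j in zip(rows, cols):
        table[i][j] += 1
    return table


def simulate_and_reject(row_totals, col_totals, fixed, n_keep, rng):
    """Keep only tables whose fixed cells equal the required counts."""
    kept, attempts = [], 0
    while len(kept) < n_keep:
        attempts += 1
        table = random_table(row_totals, col_totals, rng)
        if all(table[i][j] == v for (i, j), v in fixed.items()):
            kept.append(table)
    return kept, attempts


def p_value_with_interval(null_stats, observed_stat, z=2.576):
    """Proportion at least as extreme, with a normal-approximation interval."""
    n = len(null_stats)
    p_hat = sum(s >= observed_stat for s in null_stats) / n
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half), min(1.0, p_hat + half)


def placeholder_stat(table):
    """Stand-in statistic for illustration only: the count in cell (1,2)."""
    return table[0][1]


# Invented toy data (not from Table 1); the diagonal cells are the fixed cells.
observed = [[1, 2, 1], [1, 1, 1], [1, 1, 1]]
row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
fixed = {(k, k): observed[k][k] for k in range(3)}

rng = random.Random(1)
kept, attempts = simulate_and_reject(row_totals, col_totals, fixed, 200, rng)
p_hat, lower, upper = p_value_with_interval(
    [placeholder_stat(t) for t in kept], placeholder_stat(observed)
)
print(f"{attempts} tables simulated; p = {p_hat:.3f} ({lower:.3f}, {upper:.3f})")
```

The ratio of `attempts` to retained tables is the quantity that makes this naive version impractical for sparse, strongly diagonal tables.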
For Table 1, we simulate tables from a multivariate hypergeometric distribution, thus maintaining the margins, and reject all tables which do not match the diagonal (26,11,6,4,2). Methods for simulating from a multivariate hypergeometric distribution are given by Agresti, Wackerly and Boyett (1979) and Patefield (1981). Alas, this naive simulate and reject procedure is not computationally viable, since we failed to simulate under independence a table with a matching diagonal in over one billion attempts!

All is not lost. Smith and McDonald (1993) show that the distribution of the cells of interest under quasi-independence does not depend on the observed values of the fixed cells. By replacing the values in the fixed cells with any counts and adjusting the margins accordingly, the simulate and reject procedure yields the correct null distribution. Therefore, the rejection rate can be significantly reduced by replacing the counts in the fixed cells by those closest to independence, based on the adjusted margins. For Table 1 we replace the diagonal with (3,13,1,0,1). Note that the row and column margins for this adjusted table are $(x_{i+}) = (30, 27, 8, 1, 3)$ and $(x_{+i}) = (6, 34, 7, 9, 13)$, respectively, and that now $x_{ii}$ equals the nearest integer to $x_{i+} x_{+i} / x_{++}$, where $+$ denotes summation over a subscript. Using this adjusted table, in order to obtain 2000 tables with matching diagonal, 234,595 tables were simulated under independence. The rejection rate of 99.15% is very large, but this adjusted-margins simulate and reject procedure is now computationally feasible.

Patefield (1981) simulates the required multivariate hypergeometric distribution by simulating cell by cell and row by row from univariate hypergeometrics, based on a factorization of the multiple hypergeometric mass function. Note that each $r \times c$ table requires $(r - 1)(c - 1)$ simulated counts (the others are obtained by subtraction). For Table 1, $234{,}595 \times 16 = 3{,}753{,}520$ simulations were required to obtain 2000 tables with matching diagonal, i.e., an average of 1877 simulations per retained table. We now propose various ways of reducing the average number of simulated cell counts required per retained table, by modifying Patefield's algorithm.

2.1 Rejecting Partly Simulated Tables

Patefield's algorithm simulates tables cell by cell, so a mismatch can be identified immediately after the count for a fixed cell has been simulated, thus eliminating unnecessary simulation of the remaining cell counts in the table. For Table 1, after adjusting the diagonal and margins, we would repeatedly simulate the (1,1) cell count until a match of 3 occurs, then simulate the (1,2) to (1,4) cell counts and obtain the (1,5) cell count by subtraction.
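The cell-by-cell, row-by-row scheme can be sketched as follows. This is a simplified illustration in the spirit of the factorization into univariate hypergeometrics, not a transcription of Algorithm AS 159: the function names are hypothetical, and the univariate hypergeometric draw is a plain urn loop chosen for clarity rather than speed. The margins are the adjusted margins quoted above.

```python
import random


def hypergeom_draw(pop, successes, draws, rng):
    """Successes in `draws` draws without replacement from an urn of size
    `pop` containing `successes` marked items (simple urn simulation)."""
    hits = 0
    for _ in range(draws):
        if rng.random() * pop < successes:
            hits += 1
            successes -= 1
        pop -= 1
    return hits


def sequential_table(row_totals, col_totals, rng):
    """Simulate a table with fixed margins cell by cell, row by row.

    Returns the table and the number of simulated counts; the last column
    and the last row are obtained by subtraction.
    """
    col_left = list(col_totals)          # column totals not yet allocated
    table, n_sim = [], 0
    for r in row_totals[:-1]:
        row, row_left, pool = [], r, sum(col_left)
        for c in col_left[:-1]:
            x = hypergeom_draw(pool, c, row_left, rng)
            n_sim += 1
            row.append(x)
            row_left -= x                # still to place in this row
            pool -= c                    # columns still open to this row
        row.append(row_left)             # last cell of the row by subtraction
        col_left = [c - x for c, x in zip(col_left, row)]
        table.append(row)
    table.append(col_left)               # last row by subtraction
    return table, n_sim


row_totals = [30, 27, 8, 1, 3]           # adjusted margins for Table 1
col_totals = [6, 34, 7, 9, 13]
table, n_sim = sequential_table(row_totals, col_totals, random.Random(0))
print(n_sim)                             # (r - 1)(c - 1) simulated counts
```

Because each count is drawn conditionally on the counts already simulated, a fixed cell can be checked as soon as it is produced, which is what the rejection of partly simulated tables exploits.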
However, since the number of rejections does not affect the distribution of the tables retained, the (1,1) cell count can be set at its observed value and the rest of the row obtained as described. Next the (2,1) and (2,2) cell counts are simulated. If the simulated (2,2) cell count matches the observed value of 13, the rest of the row can be simulated. If not, the whole table must be rejected and a new table started. Once we have a successful match for the (2,2) cell count, we can continue simulating the table until the count in the next fixed cell is simulated, the (3,3) cell here. Again, if we have a match, we continue simulating the table; a mismatch means that we must reject the table and start again. We continue in this manner until we have simulated a table with the required matching counts for all fixed cells, remembering to check for matches where the count is obtained by subtraction.

Partly simulated tables are now rejected, so efficiency is measured by the number of cell counts simulated per retained table. By fixing the first cell count and rejecting partly simulated tables, 481,605 simulations were required to obtain 2000 tables, an average of 241 per retained table (versus 1877 without these improvements).
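A minimal sketch of this early-rejection scheme is given below, assuming the adjusted margins and adjusted diagonal (3,13,1,0,1) for Table 1. The function names are hypothetical, the hypergeometric draw is a simple urn loop, and the number of retained tables (20) is kept small so the example runs quickly; the counts it prints will vary with the random seed and are not the figures reported above.

```python
import random


def hypergeom_draw(pop, successes, draws, rng):
    """Univariate hypergeometric draw by simple urn simulation."""
    hits = 0
    for _ in range(draws):
        if rng.random() * pop < successes:
            hits += 1
            successes -= 1
        pop -= 1
    return hits


def partial_reject_table(row_totals, col_totals, fixed, rng):
    """Simulate row by row, rejecting at the first mismatched fixed cell.

    Returns (table, n_sim), with table set to None if the attempt was
    rejected; n_sim counts only the cell counts that were simulated.
    """
    n_rows, n_cols = len(row_totals), len(col_totals)
    col_left = list(col_totals)
    table, n_sim = [], 0
    for i in range(n_rows):
        row, row_left, pool = [], row_totals[i], sum(col_left)
        for j in range(n_cols):
            c = col_left[j]
            if (i, j) == (0, 0) and (0, 0) in fixed:
                x = fixed[(0, 0)]        # first fixed cell is set, not simulated
            elif i == n_rows - 1:
                x = c                    # last row by subtraction
            elif j == n_cols - 1:
                x = row_left             # last column by subtraction
            else:
                x = hypergeom_draw(pool, c, row_left, rng)
                n_sim += 1
            if fixed.get((i, j), x) != x:
                return None, n_sim       # mismatch: abandon the partial table
            row.append(x)
            row_left -= x
            pool -= c
        col_left = [c - x for c, x in zip(col_left, row)]
        table.append(row)
    return table, n_sim


row_totals = [30, 27, 8, 1, 3]
col_totals = [6, 34, 7, 9, 13]
fixed = {(k, k): d for k, d in enumerate([3, 13, 1, 0, 1])}

rng = random.Random(0)
kept, total_sim = [], 0
while len(kept) < 20:
    table, n_sim = partial_reject_table(row_totals, col_totals, fixed, rng)
    total_sim += n_sim                   # rejected attempts still cost simulations
    if table is not None:
        kept.append(table)
print(total_sim / len(kept))             # simulated counts per retained table
```

Counting simulations over rejected as well as retained attempts mirrors the efficiency measure used in the text.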

2.2 Changing the Order of Cell Count Simulation

A further improvement is to permute the rows and columns of the table in order to attempt to match the counts in the fixed cells as early as possible, hence reducing, on average, the number of wasted simulations. For example, if the only fixed cell in an $r \times c$ table is the $(r, c)$ cell, we must simulate the whole table before checking for a match for the last cell. By permuting the rows and columns so the fixed cell becomes the (1,1) cell, we can set the count in the (1,1) cell at its observed value and simulate the rest of the table. Therefore, no rejection is required. McDonald and Smith (1994) extend this idea to triangular tables and propose an algorithm where no rejection is necessary. However, when simulating cell counts row by row, no such permutation is possible for tables with only diagonal fixed cells.

We now discuss the important and common situation of testing for off-diagonal QI in an $r \times r$ table. Recall that Patefield's algorithm simulates cell by cell, row by row. However, one can show that in order to simulate the $(i, j)$ cell only cells above and to the left need to have been simulated, i.e., the cells $(k, l)$, $k = 1, \ldots, i$, $l = 1, \ldots, j$, $k \neq l$. Note that these cells plus the cell whose count is being simulated form a rectangle. Therefore, we can change the order in which the cells are simulated so as to attempt to match the counts in the fixed cells as early as possible. When matching on the diagonal, we can set the (1,1) cell count to the (adjusted) observed value. The next fixed count to match on is in the (2,2) cell, so we need only simulate cell counts above and to the left before checking that the simulated count for the (2,2) cell equals the (adjusted) observed value. Here we have only simulated 3 cell counts before checking for a match. If we have a mismatch, we have saved $r - 3$ unnecessary simulations for the first row.
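The reordering can be made concrete by listing the cells in the order in which they would be visited. The sketch below only generates that order for an $r \times r$ table with a fixed diagonal; it does not simulate any counts. The function name is hypothetical, indices are 1-based to match the text, and the order chosen within each step (column above the diagonal first, then row to its left) is an assumption, since any order that completes the rectangle before the diagonal cell would serve.

```python
def rectangle_order(r):
    """Visiting order for an r x r table with fixed diagonal cells.

    The (1,1) cell is set to its (adjusted) observed value and is not
    listed; row r and column r are obtained by subtraction and are not
    listed either. Each diagonal cell (i, i) is a match check.
    """
    order = []
    for i in range(2, r):
        order += [(k, i) for k in range(1, i)]   # column i, above the diagonal
        order += [(i, l) for l in range(1, i)]   # row i, left of the diagonal
        order.append((i, i))                     # fixed cell: check for a match
    return order


order = rectangle_order(5)
print(order[:3])    # the three counts simulated before the (2,2) check
print(len(order))   # (r - 1)^2 - 1 simulated counts in total
```

For $r = 5$ the first diagonal check comes after three simulated counts, compared with five when simulating row by row with the (1,1) cell set, which is the saving of $r - 3$ noted above.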
If we have a match, we continue by simulating the counts of the cells above and to the left of the (3,3) cell, which reduces the number of simulations required before the second match is attempted. After each successful match we continue through the table in this manner. We call this the expanding-rectangle algorithm. For Table 1, this algorithm reduced the average number of simulations per retained table to 172 (from 241 when simulating row by row).

For an $r \times r$ table with fixed diagonal, the counts in the fixed cells can be reordered by permuting the rows and columns using the same permutation. For the $r!$ possible reorderings, the average number of simulations per retained table varies. In our experience, attempting the "hardest" matches first reduces the average number of simulations per retained table. For example, if the $(r, r)$ cell count is the "hardest", we would simulate the whole table only to have to reject the table frequently because the final match is the "hardest". On the other hand, if the "hardest" match is the (1,1) cell count, this count is set to the (adjusted) observed value and the "hardest" match is never attempted. Our measure of hardness of match for the $(i, i)$ cell count is the conditional probability of a match, given that we have matched on the $(k, k)$, $k = 1, \ldots, i - 1$, cell counts.
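As a rough illustration of ranking matches by hardness, the sketch below uses a simpler quantity as a stand-in for the conditional probability: the hypergeometric probability that a single diagonal cell equals its adjusted value when the whole table is simulated under independence with the adjusted margins. The function name is hypothetical, and using this marginal probability in place of the conditional one is an approximation adopted here for illustration.

```python
from math import comb


def match_probability(total, row_total, col_total, value):
    """P(x_ii = value) when x_ii is hypergeometric given the margins."""
    return (
        comb(col_total, value)
        * comb(total - col_total, row_total - value)
        / comb(total, row_total)
    )


row_totals = [30, 27, 8, 1, 3]      # adjusted margins for Table 1
col_totals = [6, 34, 7, 9, 13]
diagonal = [3, 13, 1, 0, 1]         # adjusted diagonal
total = sum(row_totals)

probs = [
    match_probability(total, r, c, d)
    for r, c, d in zip(row_totals, col_totals, diagonal)
]

# Hardest match first: sort the 1-based diagonal indices by probability.
hardest_first = sorted(range(1, len(probs) + 1), key=lambda i: probs[i - 1])
print(hardest_first)
```

Applying the resulting ordering to both rows and columns places the least likely match in the (1,1) position, where it is set rather than simulated.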

When trying to determine the optimum permutation of the diagonal, the problem is how to calculate the conditional probability of a match, i.e., the hardness of a match. However, our experience suggests that the conditional probability of a match is approximately equal to the "marginal" probability of a match, i.e., the probability of a match if we were simulating the whole table before checking for matches. This is easily calculated for each diagonal cell since, as shown by Patefield's factorization and used in his algorithm, the marginal probability of a match is hypergeometric. For Table 1 with diagonal (3,13,1,0,1), the marginal probabilities of a match are [...], [...], [...], [...], [...], respectively. We permute the rows (and columns) using the permutation (2,1,5,3,4) so that these probabilities are in increasing order for the rearranged table. Now using the expanding-rectangle algorithm on the permuted table, the average number of simulations per retained table is reduced to 104 (from 172 before permuting).

2.3 Estimated P-values

The likelihood ratio test statistic for quasi-independence for Table 1 is [...] with estimated exact p-value of [...] and associated 99% confidence interval of ([...], [...]), based on 20,000 tables generated under QI. While the observed test statistic and associated p-value are extreme, note that the rejection rate does not depend on their values.

3 Discussion

In this paper, we propose improvements to a naive simulate and reject procedure for generating $r \times c$ tables under quasi-independence for an arbitrary pattern of fixed cells. Although some of the algorithmic improvements are described for generating under QI for the off-diagonal cells of a square table, the ideas are applicable to other patterns of fixed cells. Apart from complete enumeration, which is only viable for small tables, the simulate and reject procedure is currently the only method for generating independent tables from the exact null distribution under QI.
Our improvements to the naive procedure greatly increase its efficiency. Smith, McDonald and Forster (1994) discuss another method for generating tables under QI using a Gibbs sampling approach, based on theoretical results in Forster, McDonald and Smith (1994). However, the generated tables are not necessarily independent and are only realizations from an approximation to the exact null distribution. When using a single Markov chain, the observed table is the obvious starting value. For multiple chains, obtaining other starting values with the same sufficient statistics for the nuisance parameters as the observed data is problematic. A possible solution is to generate a small number of independent starting values using the simulate and reject algorithms proposed.

Acknowledgements

This work was supported by Economic and Social Research Council award H[...] as part of the Analysis of Large and Complex Datasets Programme.

References

Agresti, A., Wackerly, D. and Boyett, J. M. (1979). Exact conditional tests for cross-classifications: approximation of attained significance levels. Psychometrika, 44, 75–83.

Archer, P. G., Koprowska, I., McDonald, J. R., Naylor, B., Papanicolaou, G. N. and Umiker, W. O. (1966). A study of variability in the interpretation of sputum cytology slides. Cancer Res., 26, 2122–2144.

Becker, M. P. (1990). Quasisymmetric models for the analysis of square contingency tables. J. R. Statist. Soc. B, 52, 369–378.

Forster, J. J., McDonald, J. W. and Smith, P. W. F. (1994). Monte Carlo exact conditional tests for log-linear and logistic models. Working Paper, University of Southampton.

Kreiner, S. (1987). Analysis of multi-dimensional contingency tables by exact conditional tests: techniques and strategies. Scand. J. Statist., 14, 97–112.

McDonald, J. W. and Smith, P. W. F. (1994). Exact conditional tests of quasi-independence for triangular contingency tables: estimating attained significance levels. Appl. Statist., (to appear).

Patefield, W. M. (1981). Algorithm AS 159: An efficient method of generating random R × C tables with given row and column totals. Appl. Statist., 30, 91–97.

Smith, P. W. F. and McDonald, J. W. (1993). Exact conditional tests for incomplete contingency tables: estimating attained significance levels. Working Paper, University of Southampton.

Smith, P. W. F., McDonald, J. W. and Forster, J. J. (1994). Monte Carlo exact conditional tests for quasi-independence using Gibbs sampling. Working Paper, University of Southampton.

Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Chichester: Wiley.


An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework

An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework IEEE SIGNAL PROCESSING LETTERS, VOL. XX, NO. XX, XXX 23 An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework Ji Won Yoon arxiv:37.99v [cs.lg] 3 Jul 23 Abstract In order to cluster

More information

Overview. H. R. Alvarez A., Ph. D.

Overview. H. R. Alvarez A., Ph. D. Network Modeling Overview Networks arise in numerous settings: transportation, electrical, and communication networks, for example. Network representations also are widely used for problems in such diverse

More information

1 Introduction Complex decision problems related to economy, environment, business and engineering are multidimensional and have multiple and conictin

1 Introduction Complex decision problems related to economy, environment, business and engineering are multidimensional and have multiple and conictin A Scalable Parallel Algorithm for Multiple Objective Linear Programs Malgorzata M. Wiecek Hong Zhang y Abstract This paper presents an ADBASE-based parallel algorithm for solving multiple objective linear

More information

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016

Resampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016 Resampling Methods Levi Waldron, CUNY School of Public Health July 13, 2016 Outline and introduction Objectives: prediction or inference? Cross-validation Bootstrap Permutation Test Monte Carlo Simulation

More information

Networks for Control. California Institute of Technology. Pasadena, CA Abstract

Networks for Control. California Institute of Technology. Pasadena, CA Abstract Learning Fuzzy Rule-Based Neural Networks for Control Charles M. Higgins and Rodney M. Goodman Department of Electrical Engineering, 116-81 California Institute of Technology Pasadena, CA 91125 Abstract

More information

Computer vision: models, learning and inference. Chapter 10 Graphical Models

Computer vision: models, learning and inference. Chapter 10 Graphical Models Computer vision: models, learning and inference Chapter 10 Graphical Models Independence Two variables x 1 and x 2 are independent if their joint probability distribution factorizes as Pr(x 1, x 2 )=Pr(x

More information

Feature Selection Using Modified-MCA Based Scoring Metric for Classification

Feature Selection Using Modified-MCA Based Scoring Metric for Classification 2011 International Conference on Information Communication and Management IPCSIT vol.16 (2011) (2011) IACSIT Press, Singapore Feature Selection Using Modified-MCA Based Scoring Metric for Classification

More information
