Optimization and Simulation
|
|
- Ralf Rafe Bennett
- 6 years ago
- Views:
Transcription
1 Optimization and Simulation Statistical analysis and bootstrapping Michel Bierlaire Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique Fédérale de Lausanne M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 1 / 17
2 Introduction The outputs of the simulator are random variables. Running the simulator provides one realization of these r.v. We have no access to the pdf or CDF of these r.v. Well... this is actually why we rely on simulation. How to derive statistics about a r.v. when only instances are known? How to measure the quality of this statistic? M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 2 / 17
3 Sample mean and variance Consider X 1,..., X n i.i.d. r.v. E[X i ] = µ, Var(X i ) = σ 2. The sample mean X = 1 n is an unbiased estimate of the population mean µ, as E[ X] = µ. The sample variance S 2 = 1 n 1 n i=1 X i n (X i X) 2 is an unbiased estimator of the population variance σ 2, as E[S 2 ] = σ 2. (see proof: Ross, chapter 7) i=1 M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 3 / 17
4 Sample mean and variance Recursive computation 1 Initialize X 0 = 0, S 2 1 = 0. 2 Update the mean X k+1 = X k + X k+1 X k k +1 3 Update the variance ( Sk+1 2 = 1 1 ) Sk 2 k +(k +1)( X k+1 X k ) 2. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 4 / 17
5 Mean Square Error Consider X 1,..., X n i.i.d. r.v. with CDF F. Consider a parameter θ(f) of the distribution (mean, quantile, mode, etc.) Consider θ(x 1,...,X n ) an estimator of θ(f). The Mean Square Error of the estimator is defined as [ ) ] 2 MSE(F) = E F ( θ(x1,...,x n ) θ(f), where E F emphasizes that the expectation is taken under the assumption that the r.v. all have distribution F. If F is unknown, it is not immediate to find an estimator of MSE. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 5 / 17
6 How many draws must be used? Let X a r.v. with mean θ and variance σ 2. We want to estimate the mean θ of the simulated distribution. The estimator used is the sample mean: X. The mean square error is E[( X θ) 2 ] = σ2 n The sample mean X is normally distributed with mean θ and variance σ 2 /n. So we can stop generating data when σ/ n is small. σ is approximated by the sample variance S. Law of large numbers: at least 100 draws (say) should be used. See Ross p. 121 for details. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 6 / 17
7 Mean Square Error Other indicators than the mean are desired. Theoretical results about the MSE cannot always be derived. Solution: rely on simulation. Method: bootstrapping. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 7 / 17
8 Empirical distribution function Consider X 1,..., X n i.i.d. r.v. with CDF F. Consider a realization x 1,...,x n of these r.v. The empirical distribution function is defined as F e (x) = 1 n n I{x i x}, i=1 where I{x i x} = { 1 if xi x, 0 otherwise. CDF of a r.v. that can take any x i with equal probability. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 8 / 17
9 Empirical CDF F e (x), n = 10 F(x) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 9 / 17
10 Empirical CDF F e (x), n = 100 F(x) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 10 / 17
11 Empirical CDF F e (x), n = 1000 F(x) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 11 / 17
12 Mean Square Error We use the empirical distribution function F e We can approximate [ ) ] 2 MSE(F) = E F ( θ(x1,...,x n ) θ(f), by [ ) ] 2 MSE(F e ) = E Fe ( θ(x1,...,x n ) θ(f e ), θ(f e ) can be computed directly from the data (mean, variance, etc.) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 12 / 17
13 Mean Square Error We want to compute [ ) ] 2 MSE(F e ) = E Fe ( θ(x1,...,x n ) θ(f e ), F e is the CDF of a r.v. that can take any x i with equal probability. Therefore, MSE(F e ) = 1 n n n i 1 =1 n i n=1 Clearly impossible to compute when n is large. Solution: simulation. [ ( θ(xi1,...,x in) θ(f e )) 2 ], M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 13 / 17
14 Bootstrapping For r = 1,...,R Draw x r 1,...,xr n from F e, that is draw from the data: 1 Let s be a draw from U[0,1] 2 Set j = floor(ns). 3 Return x j. Compute M r = ( θ(x r 1,...,x r n) θ(f e )) 2, Estimate of MSE(F e ) and, therefore, of MSE(F): Typical value for R: R R r=1 M r M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 14 / 17
15 Bootstrap: simple example Data: 0.636, , 0.183, -1.67, Mean= MSE= E[( X θ) 2 ] = S 2 /n= r ˆθ θ(f e) MSE e M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 15 / 17
16 Summary The number of draws is determined by the required precision. In some cases, the precision is derived from theoretical results. If not, rely on bootstrapping. Idea: use simulation to estimate the Mean Square Error. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 16 / 17
17 Appendix: MSE for the mean Consider X 1,..., X n i.i.d. r.v. Denote θ = E[X i ] and σ 2 = Var(X i ). Consider X = n i=1 X i/n. E[ X] = n i=1 E[X i]/n = θ. MSE: E[( X θ) 2 ] = Var X ( n ) = Var X i /n i=1 = n Var(X i )/n 2 i=1 = σ 2 /n. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 17 / 17
Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea
Chapter 3 Bootstrap 3.1 Introduction The estimation of parameters in probability distributions is a basic problem in statistics that one tends to encounter already during the very first course on the subject.
More informationToday. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time
Today Lecture 4: We examine clustering in a little more detail; we went over it a somewhat quickly last time The CAD data will return and give us an opportunity to work with curves (!) We then examine
More informationBehavior of the sample mean. varx i = σ 2
Behavior of the sample mean We observe n independent and identically distributed (iid) draws from a random variable X. Denote the observed values by X 1, X 2,..., X n. Assume the X i come from a population
More informationModelling competition in demand-based optimization models
Modelling competition in demand-based optimization models Stefano Bortolomiol Virginie Lurkin Michel Bierlaire Transport and Mobility Laboratory (TRANSP-OR) École Polytechnique Fédérale de Lausanne Workshop
More informationSolving singularity issues in the estimation of econometric models
Solving singularity issues in the estimation of econometric models Michel Bierlaire Michaël Thémans Transport and Mobility Laboratory École Polytechnique Fédérale de Lausanne Solving singularity issues
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 13: The bootstrap (v3) Ramesh Johari ramesh.johari@stanford.edu 1 / 30 Resampling 2 / 30 Sampling distribution of a statistic For this lecture: There is a population model
More informationSTA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 27
STA258H5 Al Nosedal and Alison Weir Winter 2017 Al Nosedal and Alison Weir STA258H5 Winter 2017 1 / 27 BOOTSTRAP CONFIDENCE INTERVALS Al Nosedal and Alison Weir STA258H5 Winter 2017 2 / 27 Distribution
More informationExploiting mobility data from Nokia smartphones
Exploiting mobility data from Nokia smartphones Michel Bierlaire, Jeffrey Newman Transport and Mobility Laboratory Ecole Polytechnique Fédérale de Lausanne Nokia @ EPFL Nokia Research Centers research.nokia.com/locations
More informationThree challenges in route choice modeling
Three challenges in route choice modeling Michel Bierlaire and Emma Frejinger transp-or.epfl.ch Transport and Mobility Laboratory, EPFL Three challenges in route choice modeling p.1/61 Route choice modeling
More informationISyE 6416: Computational Statistics Spring Lecture 13: Monte Carlo Methods
ISyE 6416: Computational Statistics Spring 2017 Lecture 13: Monte Carlo Methods Prof. Yao Xie H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Determine area
More informationProbability Model for 2 RV s
Probability Model for 2 RV s The joint probability mass function of X and Y is P X,Y (x, y) = P [X = x, Y= y] Joint PMF is a rule that for any x and y, gives the probability that X = x and Y= y. 3 Example:
More informationPerformance Evaluation
Performance Evaluation Dan Lizotte 7-9-5 Evaluating Performance..5..5..5..5 Which do ou prefer and wh? Evaluating Performance..5..5 Which do ou prefer and wh?..5..5 Evaluating Performance..5..5..5..5 Performance
More informationUse of Extreme Value Statistics in Modeling Biometric Systems
Use of Extreme Value Statistics in Modeling Biometric Systems Similarity Scores Two types of matching: Genuine sample Imposter sample Matching scores Enrolled sample 0.95 0.32 Probability Density Decision
More informationRANDOM SAMPLING OF ALTERNATIVES IN A ROUTE CHOICE CONTEXT
RANDOM SAMPLING OF ALTERNATIVES IN A ROUTE CHOICE CONTEXT Emma Frejinger Transport and Mobility Laboratory (TRANSP-OR), EPFL Abstract In this paper we present a new point of view on choice set generation
More informationStatistics 406 Exam November 17, 2005
Statistics 406 Exam November 17, 2005 1. For each of the following, what do you expect the value of A to be after executing the program? Briefly state your reasoning for each part. (a) X
More informationGAMES Webinar: Rendering Tutorial 2. Monte Carlo Methods. Shuang Zhao
GAMES Webinar: Rendering Tutorial 2 Monte Carlo Methods Shuang Zhao Assistant Professor Computer Science Department University of California, Irvine GAMES Webinar Shuang Zhao 1 Outline 1. Monte Carlo integration
More informationLab 5 Monte Carlo integration
Lab 5 Monte Carlo integration Edvin Listo Zec 9065-976 edvinli@student.chalmers.se October 0, 014 Co-worker: Jessica Fredby Introduction In this computer assignment we will discuss a technique for solving
More information4.5 The smoothed bootstrap
4.5. THE SMOOTHED BOOTSTRAP 47 F X i X Figure 4.1: Smoothing the empirical distribution function. 4.5 The smoothed bootstrap In the simple nonparametric bootstrap we have assumed that the empirical distribution
More informationSTAT 725 Notes Monte Carlo Integration
STAT 725 Notes Monte Carlo Integration Two major classes of numerical problems arise in statistical inference: optimization and integration. We have already spent some time discussing different optimization
More informationStatistics I 2011/2012 Notes about the third Computer Class: Simulation of samples and goodness of fit; Central Limit Theorem; Confidence intervals.
Statistics I 2011/2012 Notes about the third Computer Class: Simulation of samples and goodness of fit; Central Limit Theorem; Confidence intervals. In this Computer Class we are going to use Statgraphics
More informationPARAMETRIC MODEL SELECTION TECHNIQUES
PARAMETRIC MODEL SELECTION TECHNIQUES GARY L. BECK, STEVE FROM, PH.D. Abstract. Many parametric statistical models are available for modelling lifetime data. Given a data set of lifetimes, which may or
More informationTHE RECURSIVE HESSIAN SKETCH FOR ADAPTIVE FILTERING. Robin Scheibler and Martin Vetterli
THE RECURSIVE HESSIAN SKETCH FOR ADAPTIVE FILTERING Robin Scheibler and Martin Vetterli School of Computer and Communication Sciences École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne,
More informationThe Bootstrap and Jackknife
The Bootstrap and Jackknife Summer 2017 Summer Institutes 249 Bootstrap & Jackknife Motivation In scientific research Interest often focuses upon the estimation of some unknown parameter, θ. The parameter
More informationTraining in Mapping Changes on an Archaeological Site
Training in Mapping Changes on an Archaeological Site Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey Pierre-Yves Gilliéron Bertrand Merminod Jérôme Zufferey EPFL Presentation 04.2015
More informationStatistical Matching using Fractional Imputation
Statistical Matching using Fractional Imputation Jae-Kwang Kim 1 Iowa State University 1 Joint work with Emily Berg and Taesung Park 1 Introduction 2 Classical Approaches 3 Proposed method 4 Application:
More informationNote Set 4: Finite Mixture Models and the EM Algorithm
Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for
More informationLecture 12. August 23, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
Lecture 12 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University August 23, 2007 1 2 3 4 5 1 2 Introduce the bootstrap 3 the bootstrap algorithm 4 Example
More informationRecursive Estimation
Recursive Estimation Raffaello D Andrea Spring 28 Problem Set : Probability Review Last updated: March 6, 28 Notes: Notation: Unless otherwise noted, x, y, and z denote random variables, p x denotes the
More informationWill Monroe July 21, with materials by Mehran Sahami and Chris Piech. Joint Distributions
Will Monroe July 1, 017 with materials by Mehran Sahami and Chris Piech Joint Distributions Review: Normal random variable An normal (= Gaussian) random variable is a good approximation to many other distributions.
More informationHeteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors
Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors (Section 5.4) What? Consequences of homoskedasticity Implication for computing standard errors What do these two terms
More informationResampling Methods for Dependent Data
S.N. Lahiri Resampling Methods for Dependent Data With 25 Illustrations Springer Contents 1 Scope of Resampling Methods for Dependent Data 1 1.1 The Bootstrap Principle 1 1.2 Examples 7 1.3 Concluding
More informationOptimal Denoising of Natural Images and their Multiscale Geometry and Density
Optimal Denoising of Natural Images and their Multiscale Geometry and Density Department of Computer Science and Applied Mathematics Weizmann Institute of Science, Israel. Joint work with Anat Levin (WIS),
More informationWeek 4: Simple Linear Regression III
Week 4: Simple Linear Regression III Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Goodness of
More informationSummer School in Statistics for Astronomers & Physicists June 15-17, Cluster Analysis
Summer School in Statistics for Astronomers & Physicists June 15-17, 2005 Session on Computational Algorithms for Astrostatistics Cluster Analysis Max Buot Department of Statistics Carnegie-Mellon University
More informationBiogeme & Binary Logit Model Estimation
Biogeme & Binary Logit Model Estimation Anna Fernández Antolín Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering École Polytechnique Fédérale de Lausanne April
More informationNonparametric Error Estimation Methods for Evaluating and Validating Artificial Neural Network Prediction Models
Nonparametric Error Estimation Methods for Evaluating and Validating Artificial Neural Network Prediction Models Janet M. Twomey and Alice E. Smith Department of Industrial Engineering University of Pittsburgh
More informationDATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R
DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Lee, Rönnegård & Noh LRN@du.se Lee, Rönnegård & Noh HGLM book 1 / 24 Overview 1 Background to the book 2 Crack growth example 3 Contents
More informationLogit Mixture Models
Intelligent Transportation System Laboratory Department of Mathematics Master Thesis Sampling of Alternatives for Logit Mixture Models by Inès Azaiez Under the supervision of Professors Michel Bierlaire
More informationSnell s Law. Introduction
Snell s Law Introduction According to Snell s Law [1] when light is incident on an interface separating two media, as depicted in Figure 1, the angles of incidence and refraction, θ 1 and θ 2, are related,
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationData Analysis & Probability
Unit 5 Probability Distributions Name: Date: Hour: Section 7.2: The Standard Normal Distribution (Area under the curve) Notes By the end of this lesson, you will be able to Find the area under the standard
More informationNonparametric Density Estimation
Nonparametric Estimation Data: X 1,..., X n iid P where P is a distribution with density f(x). Aim: Estimation of density f(x) Parametric density estimation: Fit parametric model {f(x θ) θ Θ} to data parameter
More informationLocating Mobile Nodes with EASE: Learning Efficient Routes from Encounter Histories Alone
1 Locating Mobile Nodes with EASE: Learning Efficient Routes from Encounter Histories Alone Matthias Grossglauser, Member, IEEE, and Martin Vetterli, Fellow, IEEE Abstract Routing in large-scale mobile
More informationSTATISTICAL LABORATORY, April 30th, 2010 BIVARIATE PROBABILITY DISTRIBUTIONS
STATISTICAL LABORATORY, April 3th, 21 BIVARIATE PROBABILITY DISTRIBUTIONS Mario Romanazzi 1 MULTINOMIAL DISTRIBUTION Ex1 Three players play 1 independent rounds of a game, and each player has probability
More informationLaplace Transform of a Lognormal Random Variable
Approximations of the Laplace Transform of a Lognormal Random Variable Joint work with Søren Asmussen & Jens Ledet Jensen The University of Queensland School of Mathematics and Physics August 1, 2011 Conference
More informationTest Oracles and Randomness
Test Oracles and Randomness UNIVERSITÄ T ULM DOCENDO CURANDO SCIENDO Ralph Guderlei and Johannes Mayer rjg@mathematik.uni-ulm.de, jmayer@mathematik.uni-ulm.de University of Ulm, Department of Stochastics
More information(X 1:n η) 1 θ e 1. i=1. Using the traditional MLE derivation technique, the penalized MLEs for η and θ are: = n. (X i η) = 0. i=1 = 1.
EXAMINING THE PERFORMANCE OF A CONTROL CHART FOR THE SHIFTED EXPONENTIAL DISTRIBUTION USING PENALIZED MAXIMUM LIKELIHOOD ESTIMATORS: A SIMULATION STUDY USING SAS Austin Brown, M.S., University of Northern
More informationIntroduction. Advanced Econometrics - HEC Lausanne. Christophe Hurlin. University of Orléans. October 2013
Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans October 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics - HEC Lausanne October 2013 1 / 27 Instructor Contact
More informationScienceDirect. Analytical formulation of the trip travel time distribution
Available online at www.sciencedirect.com ScienceDirect Transportation Research Procedia ( ) 7 7th Meeting of the EURO Working Group on Transportation, EWGT, - July, Sevilla, Spain Analytical formulation
More informationMONTEREY, CALIFORNIA
NPS-MAE-04-005 MONTEREY, CALIFORNIA DETERMINING THE NUMBER OF ITERATIONS FOR MONTE CARLO SIMULATIONS OF WEAPON EFFECTIVENESS by Morris R. Driels Young S. Shin April 2004 Approved for public release; distribution
More informationR Programming: Worksheet 6
R Programming: Worksheet 6 Today we ll study a few useful functions we haven t come across yet: all(), any(), `%in%`, match(), pmax(), pmin(), unique() We ll also apply our knowledge to the bootstrap.
More informationComparison of Means: The Analysis of Variance: ANOVA
Comparison of Means: The Analysis of Variance: ANOVA The Analysis of Variance (ANOVA) is one of the most widely used basic statistical techniques in experimental design and data analysis. In contrast to
More informationMonte Carlo for Spatial Models
Monte Carlo for Spatial Models Murali Haran Department of Statistics Penn State University Penn State Computational Science Lectures April 2007 Spatial Models Lots of scientific questions involve analyzing
More informationProbability and Statistics for Final Year Engineering Students
Probability and Statistics for Final Year Engineering Students By Yoni Nazarathy, Last Updated: April 11, 2011. Lecture 1: Introduction and Basic Terms Welcome to the course, time table, assessment, etc..
More informationApplied Statistics and Econometrics Lecture 6
Applied Statistics and Econometrics Lecture 6 Giuseppe Ragusa Luiss University gragusa@luiss.it http://gragusa.org/ March 6, 2017 Luiss University Empirical application. Data Italian Labour Force Survey,
More informationExam 2 is Tue Nov 21. Bring a pencil and a calculator. Discuss similarity to exam1. HW3 is due Tue Dec 5.
Stat 100a: Introduction to Probability. Outline for the day 1. Bivariate and marginal density. 2. CLT. 3. CIs. 4. Sample size calculations. 5. Review for exam 2. Exam 2 is Tue Nov 21. Bring a pencil and
More informationMonte Carlo Simula/on and Copula Func/on. by Gerardo Ferrara
Monte Carlo Simula/on and Copula Func/on by Gerardo Ferrara Introduc)on A Monte Carlo method is a computational algorithm that relies on repeated random sampling to compute its results. In a nutshell,
More informationINFO Wednesday, September 14, 2016
INFO 1301 Wednesday, September 14, 2016 Topics to be covered Last class we spoke about the middle (central tendency) of a data set. Today we talk about variability in a data set. Variability of Data Variance
More information* Hyun Suk Park. Korea Institute of Civil Engineering and Building, 283 Goyangdae-Ro Goyang-Si, Korea. Corresponding Author: Hyun Suk Park
International Journal Of Engineering Research And Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 13, Issue 11 (November 2017), PP.47-59 Determination of The optimal Aggregation
More informationBootstrap confidence intervals Class 24, Jeremy Orloff and Jonathan Bloom
1 Learning Goals Bootstrap confidence intervals Class 24, 18.05 Jeremy Orloff and Jonathan Bloom 1. Be able to construct and sample from the empirical distribution of data. 2. Be able to explain the bootstrap
More informationChapter 6. THE NORMAL DISTRIBUTION
Chapter 6. THE NORMAL DISTRIBUTION Introducing Normally Distributed Variables The distributions of some variables like thickness of the eggshell, serum cholesterol concentration in blood, white blood cells
More informationPredictive Distributed Visual Analysis for Video in Wireless Sensor Networks
.9/TMC.25.246539, IEEE Transactions on Mobile Computing Predictive Distributed Visual Analysis for Video in Wireless Sensor Networks Emil Eriksson, György Dán, Viktoria Fodor School of Electrical Engineering,
More informationSpeeding up R code using Rcpp and foreach packages.
Speeding up R code using Rcpp and foreach packages. Pedro Albuquerque Universidade de Brasília February, 8th 2017 Speeding up R code using Rcpp and foreach packages. 1 1 Speeding up R. 2 foreach package.
More informationDATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R
DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Lee, Rönnegård & Noh LRN@du.se Lee, Rönnegård & Noh HGLM book 1 / 25 Overview 1 Background to the book 2 A motivating example from my own
More informationLecture 25: Review I
Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,
More informationChapter 4 Convex Optimization Problems
Chapter 4 Convex Optimization Problems Shupeng Gui Computer Science, UR October 16, 2015 hupeng Gui (Computer Science, UR) Convex Optimization Problems October 16, 2015 1 / 58 Outline 1 Optimization problems
More informationCHAPTER 2 Modeling Distributions of Data
CHAPTER 2 Modeling Distributions of Data 2.2 Density Curves and Normal Distributions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Density Curves
More informationMATH : EXAM 3 INFO/LOGISTICS/ADVICE
MATH 3342-004: EXAM 3 INFO/LOGISTICS/ADVICE INFO: WHEN: Friday (04/22) at 10:00am DURATION: 50 mins PROBLEM COUNT: Appropriate for a 50-min exam BONUS COUNT: At least one TOPICS CANDIDATE FOR THE EXAM:
More informationDiscrete Mathematics Course Review 3
21-228 Discrete Mathematics Course Review 3 This document contains a list of the important definitions and theorems that have been covered thus far in the course. It is not a complete listing of what has
More informationBIOINF 585: Machine Learning for Systems Biology & Clinical Informatics
BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics Lecture 12: Ensemble Learning I Jie Wang Department of Computational Medicine & Bioinformatics University of Michigan 1 Outline Bias
More informationMULTI-DIMENSIONAL MONTE CARLO INTEGRATION
CS580: Computer Graphics KAIST School of Computing Chapter 3 MULTI-DIMENSIONAL MONTE CARLO INTEGRATION 2 1 Monte Carlo Integration This describes a simple technique for the numerical evaluation of integrals
More informationUVA CS 4501: Machine Learning. Lecture 10: K-nearest-neighbor Classifier / Bias-Variance Tradeoff. Dr. Yanjun Qi. University of Virginia
UVA CS 4501: Machine Learning Lecture 10: K-nearest-neighbor Classifier / Bias-Variance Tradeoff Dr. Yanjun Qi University of Virginia Department of Computer Science 1 Where are we? è Five major secfons
More informationChapter 6. THE NORMAL DISTRIBUTION
Chapter 6. THE NORMAL DISTRIBUTION Introducing Normally Distributed Variables The distributions of some variables like thickness of the eggshell, serum cholesterol concentration in blood, white blood cells
More informationBlending of Probability and Convenience Samples:
Blending of Probability and Convenience Samples: Applications to a Survey of Military Caregivers Michael Robbins RAND Corporation Collaborators: Bonnie Ghosh-Dastidar, Rajeev Ramchand September 25, 2017
More informationSection 6.2: Generating Discrete Random Variates
Section 6.2: Generating Discrete Random Variates Discrete-Event Simulation: A First Course c 2006 Pearson Ed., Inc. 0-13-142917-5 Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete
More informationToday s outline: pp
Chapter 3 sections We will SKIP a number of sections Random variables and discrete distributions Continuous distributions The cumulative distribution function Bivariate distributions Marginal distributions
More informationOn Choosing An Optimally Trimmed Mean
Syracuse University SURFACE Electrical Engineering and Computer Science Technical Reports College of Engineering and Computer Science 5-1990 On Choosing An Optimally Trimmed Mean Kishan Mehrotra Syracuse
More informationApplication of Characteristic Function Method in Target Detection
Application of Characteristic Function Method in Target Detection Mohammad H Marhaban and Josef Kittler Centre for Vision, Speech and Signal Processing University of Surrey Surrey, GU2 7XH, UK eep5mm@ee.surrey.ac.uk
More informationMarkov chain Monte Carlo methods
Markov chain Monte Carlo methods (supplementary material) see also the applet http://www.lbreyer.com/classic.html February 9 6 Independent Hastings Metropolis Sampler Outline Independent Hastings Metropolis
More informationLAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT
NAVAL POSTGRADUATE SCHOOL LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT Statistics (OA3102) Lab #2: Sampling, Sampling Distributions, and the Central Limit Theorem Goal: Use R to demonstrate sampling
More informationConfidence Intervals. Dennis Sun Data 301
Dennis Sun Data 301 Statistical Inference probability Population / Box Sample / Data statistics The goal of statistics is to infer the unknown population from the sample. We ve already seen one mode of
More informationWhat We ll Do... Random
What We ll Do... Random- number generation Random Number Generation Generating random variates Nonstationary Poisson processes Variance reduction Sequential sampling Designing and executing simulation
More informationConfidence sharing: an economic strategy for efficient information flows in animal groups 1
1 / 35 Confidence sharing: an economic strategy for efficient information flows in animal groups 1 Amos Korman 2 CNRS and University Paris Diderot 1 Appears in PLoS Computational Biology, Oct. 2014 2 Joint
More informationExtended Hopfield Network for Sequence Learning: Application to Gesture Recognition
Extended Hopfield Network for Sequence Learning: Application to Gesture Recognition André Maurer, Micha Hersch and Aude G. Billard Ecole Polytechnique Fédérale de Lausanne (EPFL) Swiss Federal Institute
More informationPandasBiogeme: a short introduction
PandasBiogeme: a short introduction Michel Bierlaire December 19, 2018 Report TRANSP-OR 181219 Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique
More informationSo..to be able to make comparisons possible, we need to compare them with their respective distributions.
Unit 3 ~ Modeling Distributions of Data 1 ***Section 2.1*** Measures of Relative Standing and Density Curves (ex) Suppose that a professional soccer team has the money to sign one additional player and
More informationClustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford
Department of Engineering Science University of Oxford January 27, 2017 Many datasets consist of multiple heterogeneous subsets. Cluster analysis: Given an unlabelled data, want algorithms that automatically
More informationCS CS 5623 Simulation Techniques
CS 4633 - CS 5623 Simulation Techniques How to model data using matlab Instructor Dr. Turgay Korkmaz This tutorial along with matlab s statistical toolbox manual will be useful for HW 5. Data collection
More informationGenerating random samples from user-defined distributions
The Stata Journal (2011) 11, Number 2, pp. 299 304 Generating random samples from user-defined distributions Katarína Lukácsy Central European University Budapest, Hungary lukacsy katarina@phd.ceu.hu Abstract.
More informationDigital Image Processing Laboratory: MAP Image Restoration
Purdue University: Digital Image Processing Laboratories 1 Digital Image Processing Laboratory: MAP Image Restoration October, 015 1 Introduction This laboratory explores the use of maximum a posteriori
More informationProblem Set 3. MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009
Problem Set 3 MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009 1. Show directly that P 1 (s) P 1 (t) for all t s. Proof. Given G, let H s be a subgraph of G on s vertices such that
More informationIntegrated Algebra 2 and Trigonometry. Quarter 1
Quarter 1 I: Functions: Composition I.1 (A.42) Composition of linear functions f(g(x)). f(x) + g(x). I.2 (A.42) Composition of linear and quadratic functions II: Functions: Quadratic II.1 Parabola The
More informationScale effects in physical hydraulic engineering models
Discussion of Scale effects in physical hydraulic engineering models by Valentin Heller, Journal of Hydraulic Research Vol. 49, No. (2011), pp. 29-06 Discussers Michael Pfister (IAHR Member), Laboratory
More informationWeek 4: Simple Linear Regression II
Week 4: Simple Linear Regression II Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Algebraic properties
More informationA noninformative Bayesian approach to small area estimation
A noninformative Bayesian approach to small area estimation Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 glen@stat.umn.edu September 2001 Revised May 2002 Research supported
More informationAdvanced Digital Signal Processing Adaptive Linear Prediction Filter (Using The RLS Algorithm)
Advanced Digital Signal Processing Adaptive Linear Prediction Filter (Using The RLS Algorithm) Erick L. Oberstar 2001 Adaptive Linear Prediction Filter Using the RLS Algorithm A complete analysis/discussion
More informationThis is a good time to refresh your memory on double-integration. We will be using this skill in the upcoming lectures.
Chapter 5: JOINT PROBABILITY DISTRIBUTIONS Part 1: Sections 5-1.1 to 5-1.4 For both discrete and continuous random variables we will discuss the following... Joint Distributions (for two or more r.v. s)
More informationScalable Bayes Clustering for Outlier Detection Under Informative Sampling
Scalable Bayes Clustering for Outlier Detection Under Informative Sampling Based on JMLR paper of T. D. Savitsky Terrance D. Savitsky Office of Survey Methods Research FCSM - 2018 March 7-9, 2018 1 / 21
More informationChapter 6: Simulation Using Spread-Sheets (Excel)
Chapter 6: Simulation Using Spread-Sheets (Excel) Refer to Reading Assignments 1 Simulation Using Spread-Sheets (Excel) OBJECTIVES To be able to Generate random numbers within a spreadsheet environment.
More information5 Bootstrapping. 4.7 Extensions. 4.8 References. 5.1 Bootstrapping for random samples (the i.i.d. case) ST697F: Topics in Regression.
ST697F: Topics in Regression. Spring 2007 c 21 4.7 Extensions The approach above can be readily extended to the case where we are interested in inverse prediction or regulation about one predictor given
More information