Optimization and Simulation

Size: px
Start display at page:

Download "Optimization and Simulation"

Transcription

1 Optimization and Simulation Statistical analysis and bootstrapping Michel Bierlaire Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique Fédérale de Lausanne M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 1 / 17

2 Introduction The outputs of the simulator are random variables. Running the simulator provides one realization of these r.v. We have no access to the pdf or CDF of these r.v. Well... this is actually why we rely on simulation. How to derive statistics about a r.v. when only instances are known? How to measure the quality of this statistic? M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 2 / 17

3 Sample mean and variance Consider X 1,..., X n i.i.d. r.v. E[X i ] = µ, Var(X i ) = σ 2. The sample mean X = 1 n is an unbiased estimate of the population mean µ, as E[ X] = µ. The sample variance S 2 = 1 n 1 n i=1 X i n (X i X) 2 is an unbiased estimator of the population variance σ 2, as E[S 2 ] = σ 2. (see proof: Ross, chapter 7) i=1 M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 3 / 17

4 Sample mean and variance Recursive computation 1 Initialize X 0 = 0, S 2 1 = 0. 2 Update the mean X k+1 = X k + X k+1 X k k +1 3 Update the variance ( Sk+1 2 = 1 1 ) Sk 2 k +(k +1)( X k+1 X k ) 2. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 4 / 17

5 Mean Square Error Consider X 1,..., X n i.i.d. r.v. with CDF F. Consider a parameter θ(f) of the distribution (mean, quantile, mode, etc.) Consider θ(x 1,...,X n ) an estimator of θ(f). The Mean Square Error of the estimator is defined as [ ) ] 2 MSE(F) = E F ( θ(x1,...,x n ) θ(f), where E F emphasizes that the expectation is taken under the assumption that the r.v. all have distribution F. If F is unknown, it is not immediate to find an estimator of MSE. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 5 / 17

6 How many draws must be used? Let X a r.v. with mean θ and variance σ 2. We want to estimate the mean θ of the simulated distribution. The estimator used is the sample mean: X. The mean square error is E[( X θ) 2 ] = σ2 n The sample mean X is normally distributed with mean θ and variance σ 2 /n. So we can stop generating data when σ/ n is small. σ is approximated by the sample variance S. Law of large numbers: at least 100 draws (say) should be used. See Ross p. 121 for details. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 6 / 17

7 Mean Square Error Other indicators than the mean are desired. Theoretical results about the MSE cannot always be derived. Solution: rely on simulation. Method: bootstrapping. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 7 / 17

8 Empirical distribution function Consider X 1,..., X n i.i.d. r.v. with CDF F. Consider a realization x 1,...,x n of these r.v. The empirical distribution function is defined as F e (x) = 1 n n I{x i x}, i=1 where I{x i x} = { 1 if xi x, 0 otherwise. CDF of a r.v. that can take any x i with equal probability. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 8 / 17

9 Empirical CDF F e (x), n = 10 F(x) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 9 / 17

10 Empirical CDF F e (x), n = 100 F(x) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 10 / 17

11 Empirical CDF F e (x), n = 1000 F(x) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 11 / 17

12 Mean Square Error We use the empirical distribution function F e We can approximate [ ) ] 2 MSE(F) = E F ( θ(x1,...,x n ) θ(f), by [ ) ] 2 MSE(F e ) = E Fe ( θ(x1,...,x n ) θ(f e ), θ(f e ) can be computed directly from the data (mean, variance, etc.) M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 12 / 17

13 Mean Square Error We want to compute [ ) ] 2 MSE(F e ) = E Fe ( θ(x1,...,x n ) θ(f e ), F e is the CDF of a r.v. that can take any x i with equal probability. Therefore, MSE(F e ) = 1 n n n i 1 =1 n i n=1 Clearly impossible to compute when n is large. Solution: simulation. [ ( θ(xi1,...,x in) θ(f e )) 2 ], M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 13 / 17

14 Bootstrapping For r = 1,...,R Draw x r 1,...,xr n from F e, that is draw from the data: 1 Let s be a draw from U[0,1] 2 Set j = floor(ns). 3 Return x j. Compute M r = ( θ(x r 1,...,x r n) θ(f e )) 2, Estimate of MSE(F e ) and, therefore, of MSE(F): Typical value for R: R R r=1 M r M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 14 / 17

15 Bootstrap: simple example Data: 0.636, , 0.183, -1.67, Mean= MSE= E[( X θ) 2 ] = S 2 /n= r ˆθ θ(f e) MSE e M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 15 / 17

16 Summary The number of draws is determined by the required precision. In some cases, the precision is derived from theoretical results. If not, rely on bootstrapping. Idea: use simulation to estimate the Mean Square Error. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 16 / 17

17 Appendix: MSE for the mean Consider X 1,..., X n i.i.d. r.v. Denote θ = E[X i ] and σ 2 = Var(X i ). Consider X = n i=1 X i/n. E[ X] = n i=1 E[X i]/n = θ. MSE: E[( X θ) 2 ] = Var X ( n ) = Var X i /n i=1 = n Var(X i )/n 2 i=1 = σ 2 /n. M. Bierlaire (TRANSP-OR ENAC EPFL) Optimization and Simulation 17 / 17

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea Chapter 3 Bootstrap 3.1 Introduction The estimation of parameters in probability distributions is a basic problem in statistics that one tends to encounter already during the very first course on the subject.

More information

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time Today Lecture 4: We examine clustering in a little more detail; we went over it a somewhat quickly last time The CAD data will return and give us an opportunity to work with curves (!) We then examine

More information

Behavior of the sample mean. varx i = σ 2

Behavior of the sample mean. varx i = σ 2 Behavior of the sample mean We observe n independent and identically distributed (iid) draws from a random variable X. Denote the observed values by X 1, X 2,..., X n. Assume the X i come from a population

More information

Modelling competition in demand-based optimization models

Modelling competition in demand-based optimization models Modelling competition in demand-based optimization models Stefano Bortolomiol Virginie Lurkin Michel Bierlaire Transport and Mobility Laboratory (TRANSP-OR) École Polytechnique Fédérale de Lausanne Workshop

More information

Solving singularity issues in the estimation of econometric models

Solving singularity issues in the estimation of econometric models Solving singularity issues in the estimation of econometric models Michel Bierlaire Michaël Thémans Transport and Mobility Laboratory École Polytechnique Fédérale de Lausanne Solving singularity issues

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 13: The bootstrap (v3) Ramesh Johari ramesh.johari@stanford.edu 1 / 30 Resampling 2 / 30 Sampling distribution of a statistic For this lecture: There is a population model

More information

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 27

STA258H5. Al Nosedal and Alison Weir. Winter Al Nosedal and Alison Weir STA258H5 Winter / 27 STA258H5 Al Nosedal and Alison Weir Winter 2017 Al Nosedal and Alison Weir STA258H5 Winter 2017 1 / 27 BOOTSTRAP CONFIDENCE INTERVALS Al Nosedal and Alison Weir STA258H5 Winter 2017 2 / 27 Distribution

More information

Exploiting mobility data from Nokia smartphones

Exploiting mobility data from Nokia smartphones Exploiting mobility data from Nokia smartphones Michel Bierlaire, Jeffrey Newman Transport and Mobility Laboratory Ecole Polytechnique Fédérale de Lausanne Nokia @ EPFL Nokia Research Centers research.nokia.com/locations

More information

Three challenges in route choice modeling

Three challenges in route choice modeling Three challenges in route choice modeling Michel Bierlaire and Emma Frejinger transp-or.epfl.ch Transport and Mobility Laboratory, EPFL Three challenges in route choice modeling p.1/61 Route choice modeling

More information

ISyE 6416: Computational Statistics Spring Lecture 13: Monte Carlo Methods

ISyE 6416: Computational Statistics Spring Lecture 13: Monte Carlo Methods ISyE 6416: Computational Statistics Spring 2017 Lecture 13: Monte Carlo Methods Prof. Yao Xie H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Determine area

More information

Probability Model for 2 RV s

Probability Model for 2 RV s Probability Model for 2 RV s The joint probability mass function of X and Y is P X,Y (x, y) = P [X = x, Y= y] Joint PMF is a rule that for any x and y, gives the probability that X = x and Y= y. 3 Example:

More information

Performance Evaluation

Performance Evaluation Performance Evaluation Dan Lizotte 7-9-5 Evaluating Performance..5..5..5..5 Which do ou prefer and wh? Evaluating Performance..5..5 Which do ou prefer and wh?..5..5 Evaluating Performance..5..5..5..5 Performance

More information

Use of Extreme Value Statistics in Modeling Biometric Systems

Use of Extreme Value Statistics in Modeling Biometric Systems Use of Extreme Value Statistics in Modeling Biometric Systems Similarity Scores Two types of matching: Genuine sample Imposter sample Matching scores Enrolled sample 0.95 0.32 Probability Density Decision

More information

RANDOM SAMPLING OF ALTERNATIVES IN A ROUTE CHOICE CONTEXT

RANDOM SAMPLING OF ALTERNATIVES IN A ROUTE CHOICE CONTEXT RANDOM SAMPLING OF ALTERNATIVES IN A ROUTE CHOICE CONTEXT Emma Frejinger Transport and Mobility Laboratory (TRANSP-OR), EPFL Abstract In this paper we present a new point of view on choice set generation

More information

Statistics 406 Exam November 17, 2005

Statistics 406 Exam November 17, 2005 Statistics 406 Exam November 17, 2005 1. For each of the following, what do you expect the value of A to be after executing the program? Briefly state your reasoning for each part. (a) X

More information

GAMES Webinar: Rendering Tutorial 2. Monte Carlo Methods. Shuang Zhao

GAMES Webinar: Rendering Tutorial 2. Monte Carlo Methods. Shuang Zhao GAMES Webinar: Rendering Tutorial 2 Monte Carlo Methods Shuang Zhao Assistant Professor Computer Science Department University of California, Irvine GAMES Webinar Shuang Zhao 1 Outline 1. Monte Carlo integration

More information

Lab 5 Monte Carlo integration

Lab 5 Monte Carlo integration Lab 5 Monte Carlo integration Edvin Listo Zec 9065-976 edvinli@student.chalmers.se October 0, 014 Co-worker: Jessica Fredby Introduction In this computer assignment we will discuss a technique for solving

More information

4.5 The smoothed bootstrap

4.5 The smoothed bootstrap 4.5. THE SMOOTHED BOOTSTRAP 47 F X i X Figure 4.1: Smoothing the empirical distribution function. 4.5 The smoothed bootstrap In the simple nonparametric bootstrap we have assumed that the empirical distribution

More information

STAT 725 Notes Monte Carlo Integration

STAT 725 Notes Monte Carlo Integration STAT 725 Notes Monte Carlo Integration Two major classes of numerical problems arise in statistical inference: optimization and integration. We have already spent some time discussing different optimization

More information

Statistics I 2011/2012 Notes about the third Computer Class: Simulation of samples and goodness of fit; Central Limit Theorem; Confidence intervals.

Statistics I 2011/2012 Notes about the third Computer Class: Simulation of samples and goodness of fit; Central Limit Theorem; Confidence intervals. Statistics I 2011/2012 Notes about the third Computer Class: Simulation of samples and goodness of fit; Central Limit Theorem; Confidence intervals. In this Computer Class we are going to use Statgraphics

More information

PARAMETRIC MODEL SELECTION TECHNIQUES

PARAMETRIC MODEL SELECTION TECHNIQUES PARAMETRIC MODEL SELECTION TECHNIQUES GARY L. BECK, STEVE FROM, PH.D. Abstract. Many parametric statistical models are available for modelling lifetime data. Given a data set of lifetimes, which may or

More information

THE RECURSIVE HESSIAN SKETCH FOR ADAPTIVE FILTERING. Robin Scheibler and Martin Vetterli

THE RECURSIVE HESSIAN SKETCH FOR ADAPTIVE FILTERING. Robin Scheibler and Martin Vetterli THE RECURSIVE HESSIAN SKETCH FOR ADAPTIVE FILTERING Robin Scheibler and Martin Vetterli School of Computer and Communication Sciences École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne,

More information

The Bootstrap and Jackknife

The Bootstrap and Jackknife The Bootstrap and Jackknife Summer 2017 Summer Institutes 249 Bootstrap & Jackknife Motivation In scientific research Interest often focuses upon the estimation of some unknown parameter, θ. The parameter

More information

Training in Mapping Changes on an Archaeological Site

Training in Mapping Changes on an Archaeological Site Training in Mapping Changes on an Archaeological Site Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey Pierre-Yves Gilliéron Bertrand Merminod Jérôme Zufferey EPFL Presentation 04.2015

More information

Statistical Matching using Fractional Imputation

Statistical Matching using Fractional Imputation Statistical Matching using Fractional Imputation Jae-Kwang Kim 1 Iowa State University 1 Joint work with Emily Berg and Taesung Park 1 Introduction 2 Classical Approaches 3 Proposed method 4 Application:

More information

Note Set 4: Finite Mixture Models and the EM Algorithm

Note Set 4: Finite Mixture Models and the EM Algorithm Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for

More information

Lecture 12. August 23, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 12. August 23, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. Lecture 12 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University August 23, 2007 1 2 3 4 5 1 2 Introduce the bootstrap 3 the bootstrap algorithm 4 Example

More information

Recursive Estimation

Recursive Estimation Recursive Estimation Raffaello D Andrea Spring 28 Problem Set : Probability Review Last updated: March 6, 28 Notes: Notation: Unless otherwise noted, x, y, and z denote random variables, p x denotes the

More information

Will Monroe July 21, with materials by Mehran Sahami and Chris Piech. Joint Distributions

Will Monroe July 21, with materials by Mehran Sahami and Chris Piech. Joint Distributions Will Monroe July 1, 017 with materials by Mehran Sahami and Chris Piech Joint Distributions Review: Normal random variable An normal (= Gaussian) random variable is a good approximation to many other distributions.

More information

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors (Section 5.4) What? Consequences of homoskedasticity Implication for computing standard errors What do these two terms

More information

Resampling Methods for Dependent Data

Resampling Methods for Dependent Data S.N. Lahiri Resampling Methods for Dependent Data With 25 Illustrations Springer Contents 1 Scope of Resampling Methods for Dependent Data 1 1.1 The Bootstrap Principle 1 1.2 Examples 7 1.3 Concluding

More information

Optimal Denoising of Natural Images and their Multiscale Geometry and Density

Optimal Denoising of Natural Images and their Multiscale Geometry and Density Optimal Denoising of Natural Images and their Multiscale Geometry and Density Department of Computer Science and Applied Mathematics Weizmann Institute of Science, Israel. Joint work with Anat Levin (WIS),

More information

Week 4: Simple Linear Regression III

Week 4: Simple Linear Regression III Week 4: Simple Linear Regression III Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Goodness of

More information

Summer School in Statistics for Astronomers & Physicists June 15-17, Cluster Analysis

Summer School in Statistics for Astronomers & Physicists June 15-17, Cluster Analysis Summer School in Statistics for Astronomers & Physicists June 15-17, 2005 Session on Computational Algorithms for Astrostatistics Cluster Analysis Max Buot Department of Statistics Carnegie-Mellon University

More information

Biogeme & Binary Logit Model Estimation

Biogeme & Binary Logit Model Estimation Biogeme & Binary Logit Model Estimation Anna Fernández Antolín Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering École Polytechnique Fédérale de Lausanne April

More information

Nonparametric Error Estimation Methods for Evaluating and Validating Artificial Neural Network Prediction Models

Nonparametric Error Estimation Methods for Evaluating and Validating Artificial Neural Network Prediction Models Nonparametric Error Estimation Methods for Evaluating and Validating Artificial Neural Network Prediction Models Janet M. Twomey and Alice E. Smith Department of Industrial Engineering University of Pittsburgh

More information

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Lee, Rönnegård & Noh LRN@du.se Lee, Rönnegård & Noh HGLM book 1 / 24 Overview 1 Background to the book 2 Crack growth example 3 Contents

More information

Logit Mixture Models

Logit Mixture Models Intelligent Transportation System Laboratory Department of Mathematics Master Thesis Sampling of Alternatives for Logit Mixture Models by Inès Azaiez Under the supervision of Professors Michel Bierlaire

More information

Snell s Law. Introduction

Snell s Law. Introduction Snell s Law Introduction According to Snell s Law [1] when light is incident on an interface separating two media, as depicted in Figure 1, the angles of incidence and refraction, θ 1 and θ 2, are related,

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Data Analysis & Probability

Data Analysis & Probability Unit 5 Probability Distributions Name: Date: Hour: Section 7.2: The Standard Normal Distribution (Area under the curve) Notes By the end of this lesson, you will be able to Find the area under the standard

More information

Nonparametric Density Estimation

Nonparametric Density Estimation Nonparametric Estimation Data: X 1,..., X n iid P where P is a distribution with density f(x). Aim: Estimation of density f(x) Parametric density estimation: Fit parametric model {f(x θ) θ Θ} to data parameter

More information

Locating Mobile Nodes with EASE: Learning Efficient Routes from Encounter Histories Alone

Locating Mobile Nodes with EASE: Learning Efficient Routes from Encounter Histories Alone 1 Locating Mobile Nodes with EASE: Learning Efficient Routes from Encounter Histories Alone Matthias Grossglauser, Member, IEEE, and Martin Vetterli, Fellow, IEEE Abstract Routing in large-scale mobile

More information

STATISTICAL LABORATORY, April 30th, 2010 BIVARIATE PROBABILITY DISTRIBUTIONS

STATISTICAL LABORATORY, April 30th, 2010 BIVARIATE PROBABILITY DISTRIBUTIONS STATISTICAL LABORATORY, April 3th, 21 BIVARIATE PROBABILITY DISTRIBUTIONS Mario Romanazzi 1 MULTINOMIAL DISTRIBUTION Ex1 Three players play 1 independent rounds of a game, and each player has probability

More information

Laplace Transform of a Lognormal Random Variable

Laplace Transform of a Lognormal Random Variable Approximations of the Laplace Transform of a Lognormal Random Variable Joint work with Søren Asmussen & Jens Ledet Jensen The University of Queensland School of Mathematics and Physics August 1, 2011 Conference

More information

Test Oracles and Randomness

Test Oracles and Randomness Test Oracles and Randomness UNIVERSITÄ T ULM DOCENDO CURANDO SCIENDO Ralph Guderlei and Johannes Mayer rjg@mathematik.uni-ulm.de, jmayer@mathematik.uni-ulm.de University of Ulm, Department of Stochastics

More information

(X 1:n η) 1 θ e 1. i=1. Using the traditional MLE derivation technique, the penalized MLEs for η and θ are: = n. (X i η) = 0. i=1 = 1.

(X 1:n η) 1 θ e 1. i=1. Using the traditional MLE derivation technique, the penalized MLEs for η and θ are: = n. (X i η) = 0. i=1 = 1. EXAMINING THE PERFORMANCE OF A CONTROL CHART FOR THE SHIFTED EXPONENTIAL DISTRIBUTION USING PENALIZED MAXIMUM LIKELIHOOD ESTIMATORS: A SIMULATION STUDY USING SAS Austin Brown, M.S., University of Northern

More information

Introduction. Advanced Econometrics - HEC Lausanne. Christophe Hurlin. University of Orléans. October 2013

Introduction. Advanced Econometrics - HEC Lausanne. Christophe Hurlin. University of Orléans. October 2013 Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans October 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics - HEC Lausanne October 2013 1 / 27 Instructor Contact

More information

ScienceDirect. Analytical formulation of the trip travel time distribution

ScienceDirect. Analytical formulation of the trip travel time distribution Available online at www.sciencedirect.com ScienceDirect Transportation Research Procedia ( ) 7 7th Meeting of the EURO Working Group on Transportation, EWGT, - July, Sevilla, Spain Analytical formulation

More information

MONTEREY, CALIFORNIA

MONTEREY, CALIFORNIA NPS-MAE-04-005 MONTEREY, CALIFORNIA DETERMINING THE NUMBER OF ITERATIONS FOR MONTE CARLO SIMULATIONS OF WEAPON EFFECTIVENESS by Morris R. Driels Young S. Shin April 2004 Approved for public release; distribution

More information

R Programming: Worksheet 6

R Programming: Worksheet 6 R Programming: Worksheet 6 Today we ll study a few useful functions we haven t come across yet: all(), any(), `%in%`, match(), pmax(), pmin(), unique() We ll also apply our knowledge to the bootstrap.

More information

Comparison of Means: The Analysis of Variance: ANOVA

Comparison of Means: The Analysis of Variance: ANOVA Comparison of Means: The Analysis of Variance: ANOVA The Analysis of Variance (ANOVA) is one of the most widely used basic statistical techniques in experimental design and data analysis. In contrast to

More information

Monte Carlo for Spatial Models

Monte Carlo for Spatial Models Monte Carlo for Spatial Models Murali Haran Department of Statistics Penn State University Penn State Computational Science Lectures April 2007 Spatial Models Lots of scientific questions involve analyzing

More information

Probability and Statistics for Final Year Engineering Students

Probability and Statistics for Final Year Engineering Students Probability and Statistics for Final Year Engineering Students By Yoni Nazarathy, Last Updated: April 11, 2011. Lecture 1: Introduction and Basic Terms Welcome to the course, time table, assessment, etc..

More information

Applied Statistics and Econometrics Lecture 6

Applied Statistics and Econometrics Lecture 6 Applied Statistics and Econometrics Lecture 6 Giuseppe Ragusa Luiss University gragusa@luiss.it http://gragusa.org/ March 6, 2017 Luiss University Empirical application. Data Italian Labour Force Survey,

More information

Exam 2 is Tue Nov 21. Bring a pencil and a calculator. Discuss similarity to exam1. HW3 is due Tue Dec 5.

Exam 2 is Tue Nov 21. Bring a pencil and a calculator. Discuss similarity to exam1. HW3 is due Tue Dec 5. Stat 100a: Introduction to Probability. Outline for the day 1. Bivariate and marginal density. 2. CLT. 3. CIs. 4. Sample size calculations. 5. Review for exam 2. Exam 2 is Tue Nov 21. Bring a pencil and

More information

Monte Carlo Simula/on and Copula Func/on. by Gerardo Ferrara

Monte Carlo Simula/on and Copula Func/on. by Gerardo Ferrara Monte Carlo Simula/on and Copula Func/on by Gerardo Ferrara Introduc)on A Monte Carlo method is a computational algorithm that relies on repeated random sampling to compute its results. In a nutshell,

More information

INFO Wednesday, September 14, 2016

INFO Wednesday, September 14, 2016 INFO 1301 Wednesday, September 14, 2016 Topics to be covered Last class we spoke about the middle (central tendency) of a data set. Today we talk about variability in a data set. Variability of Data Variance

More information

* Hyun Suk Park. Korea Institute of Civil Engineering and Building, 283 Goyangdae-Ro Goyang-Si, Korea. Corresponding Author: Hyun Suk Park

* Hyun Suk Park. Korea Institute of Civil Engineering and Building, 283 Goyangdae-Ro Goyang-Si, Korea. Corresponding Author: Hyun Suk Park International Journal Of Engineering Research And Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 13, Issue 11 (November 2017), PP.47-59 Determination of The optimal Aggregation

More information

Bootstrap confidence intervals Class 24, Jeremy Orloff and Jonathan Bloom

Bootstrap confidence intervals Class 24, Jeremy Orloff and Jonathan Bloom 1 Learning Goals Bootstrap confidence intervals Class 24, 18.05 Jeremy Orloff and Jonathan Bloom 1. Be able to construct and sample from the empirical distribution of data. 2. Be able to explain the bootstrap

More information

Chapter 6. THE NORMAL DISTRIBUTION

Chapter 6. THE NORMAL DISTRIBUTION Chapter 6. THE NORMAL DISTRIBUTION Introducing Normally Distributed Variables The distributions of some variables like thickness of the eggshell, serum cholesterol concentration in blood, white blood cells

More information

Predictive Distributed Visual Analysis for Video in Wireless Sensor Networks

Predictive Distributed Visual Analysis for Video in Wireless Sensor Networks .9/TMC.25.246539, IEEE Transactions on Mobile Computing Predictive Distributed Visual Analysis for Video in Wireless Sensor Networks Emil Eriksson, György Dán, Viktoria Fodor School of Electrical Engineering,

More information

Speeding up R code using Rcpp and foreach packages.

Speeding up R code using Rcpp and foreach packages. Speeding up R code using Rcpp and foreach packages. Pedro Albuquerque Universidade de Brasília February, 8th 2017 Speeding up R code using Rcpp and foreach packages. 1 1 Speeding up R. 2 foreach package.

More information

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R

DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Lee, Rönnegård & Noh LRN@du.se Lee, Rönnegård & Noh HGLM book 1 / 25 Overview 1 Background to the book 2 A motivating example from my own

More information

Lecture 25: Review I

Lecture 25: Review I Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,

More information

Chapter 4 Convex Optimization Problems

Chapter 4 Convex Optimization Problems Chapter 4 Convex Optimization Problems Shupeng Gui Computer Science, UR October 16, 2015 hupeng Gui (Computer Science, UR) Convex Optimization Problems October 16, 2015 1 / 58 Outline 1 Optimization problems

More information

CHAPTER 2 Modeling Distributions of Data

CHAPTER 2 Modeling Distributions of Data CHAPTER 2 Modeling Distributions of Data 2.2 Density Curves and Normal Distributions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Density Curves

More information

MATH : EXAM 3 INFO/LOGISTICS/ADVICE

MATH : EXAM 3 INFO/LOGISTICS/ADVICE MATH 3342-004: EXAM 3 INFO/LOGISTICS/ADVICE INFO: WHEN: Friday (04/22) at 10:00am DURATION: 50 mins PROBLEM COUNT: Appropriate for a 50-min exam BONUS COUNT: At least one TOPICS CANDIDATE FOR THE EXAM:

More information

Discrete Mathematics Course Review 3

Discrete Mathematics Course Review 3 21-228 Discrete Mathematics Course Review 3 This document contains a list of the important definitions and theorems that have been covered thus far in the course. It is not a complete listing of what has

More information

BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics

BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics Lecture 12: Ensemble Learning I Jie Wang Department of Computational Medicine & Bioinformatics University of Michigan 1 Outline Bias

More information

MULTI-DIMENSIONAL MONTE CARLO INTEGRATION

MULTI-DIMENSIONAL MONTE CARLO INTEGRATION CS580: Computer Graphics KAIST School of Computing Chapter 3 MULTI-DIMENSIONAL MONTE CARLO INTEGRATION 2 1 Monte Carlo Integration This describes a simple technique for the numerical evaluation of integrals

More information

UVA CS 4501: Machine Learning. Lecture 10: K-nearest-neighbor Classifier / Bias-Variance Tradeoff. Dr. Yanjun Qi. University of Virginia

UVA CS 4501: Machine Learning. Lecture 10: K-nearest-neighbor Classifier / Bias-Variance Tradeoff. Dr. Yanjun Qi. University of Virginia UVA CS 4501: Machine Learning Lecture 10: K-nearest-neighbor Classifier / Bias-Variance Tradeoff Dr. Yanjun Qi University of Virginia Department of Computer Science 1 Where are we? è Five major secfons

More information

Chapter 6. THE NORMAL DISTRIBUTION

Chapter 6. THE NORMAL DISTRIBUTION Chapter 6. THE NORMAL DISTRIBUTION Introducing Normally Distributed Variables The distributions of some variables like thickness of the eggshell, serum cholesterol concentration in blood, white blood cells

More information

Blending of Probability and Convenience Samples:

Blending of Probability and Convenience Samples: Blending of Probability and Convenience Samples: Applications to a Survey of Military Caregivers Michael Robbins RAND Corporation Collaborators: Bonnie Ghosh-Dastidar, Rajeev Ramchand September 25, 2017

More information

Section 6.2: Generating Discrete Random Variates

Section 6.2: Generating Discrete Random Variates Section 6.2: Generating Discrete Random Variates Discrete-Event Simulation: A First Course c 2006 Pearson Ed., Inc. 0-13-142917-5 Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete

More information

Today s outline: pp

Today s outline: pp Chapter 3 sections We will SKIP a number of sections Random variables and discrete distributions Continuous distributions The cumulative distribution function Bivariate distributions Marginal distributions

More information

On Choosing An Optimally Trimmed Mean

On Choosing An Optimally Trimmed Mean Syracuse University SURFACE Electrical Engineering and Computer Science Technical Reports College of Engineering and Computer Science 5-1990 On Choosing An Optimally Trimmed Mean Kishan Mehrotra Syracuse

More information

Application of Characteristic Function Method in Target Detection

Application of Characteristic Function Method in Target Detection Application of Characteristic Function Method in Target Detection Mohammad H Marhaban and Josef Kittler Centre for Vision, Speech and Signal Processing University of Surrey Surrey, GU2 7XH, UK eep5mm@ee.surrey.ac.uk

More information

Markov chain Monte Carlo methods

Markov chain Monte Carlo methods Markov chain Monte Carlo methods (supplementary material) see also the applet http://www.lbreyer.com/classic.html February 9 6 Independent Hastings Metropolis Sampler Outline Independent Hastings Metropolis

More information

LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT

LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT NAVAL POSTGRADUATE SCHOOL LAB #2: SAMPLING, SAMPLING DISTRIBUTIONS, AND THE CLT Statistics (OA3102) Lab #2: Sampling, Sampling Distributions, and the Central Limit Theorem Goal: Use R to demonstrate sampling

More information

Confidence Intervals. Dennis Sun Data 301

Confidence Intervals. Dennis Sun Data 301 Dennis Sun Data 301 Statistical Inference probability Population / Box Sample / Data statistics The goal of statistics is to infer the unknown population from the sample. We ve already seen one mode of

More information

What We ll Do... Random

What We ll Do... Random What We ll Do... Random- number generation Random Number Generation Generating random variates Nonstationary Poisson processes Variance reduction Sequential sampling Designing and executing simulation

More information

Confidence sharing: an economic strategy for efficient information flows in animal groups 1

Confidence sharing: an economic strategy for efficient information flows in animal groups 1 1 / 35 Confidence sharing: an economic strategy for efficient information flows in animal groups 1 Amos Korman 2 CNRS and University Paris Diderot 1 Appears in PLoS Computational Biology, Oct. 2014 2 Joint

More information

Extended Hopfield Network for Sequence Learning: Application to Gesture Recognition

Extended Hopfield Network for Sequence Learning: Application to Gesture Recognition Extended Hopfield Network for Sequence Learning: Application to Gesture Recognition André Maurer, Micha Hersch and Aude G. Billard Ecole Polytechnique Fédérale de Lausanne (EPFL) Swiss Federal Institute

More information

PandasBiogeme: a short introduction

PandasBiogeme: a short introduction PandasBiogeme: a short introduction Michel Bierlaire December 19, 2018 Report TRANSP-OR 181219 Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique

More information

So..to be able to make comparisons possible, we need to compare them with their respective distributions.

So..to be able to make comparisons possible, we need to compare them with their respective distributions. Unit 3 ~ Modeling Distributions of Data 1 ***Section 2.1*** Measures of Relative Standing and Density Curves (ex) Suppose that a professional soccer team has the money to sign one additional player and

More information

Clustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford

Clustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford Department of Engineering Science University of Oxford January 27, 2017 Many datasets consist of multiple heterogeneous subsets. Cluster analysis: Given an unlabelled data, want algorithms that automatically

More information

CS CS 5623 Simulation Techniques

CS CS 5623 Simulation Techniques CS 4633 - CS 5623 Simulation Techniques How to model data using matlab Instructor Dr. Turgay Korkmaz This tutorial along with matlab s statistical toolbox manual will be useful for HW 5. Data collection

More information

Generating random samples from user-defined distributions

Generating random samples from user-defined distributions The Stata Journal (2011) 11, Number 2, pp. 299 304 Generating random samples from user-defined distributions Katarína Lukácsy Central European University Budapest, Hungary lukacsy katarina@phd.ceu.hu Abstract.

More information

Digital Image Processing Laboratory: MAP Image Restoration

Digital Image Processing Laboratory: MAP Image Restoration Purdue University: Digital Image Processing Laboratories 1 Digital Image Processing Laboratory: MAP Image Restoration October, 015 1 Introduction This laboratory explores the use of maximum a posteriori

More information

Problem Set 3. MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009

Problem Set 3. MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009 Problem Set 3 MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009 1. Show directly that P 1 (s) P 1 (t) for all t s. Proof. Given G, let H s be a subgraph of G on s vertices such that

More information

Integrated Algebra 2 and Trigonometry. Quarter 1

Integrated Algebra 2 and Trigonometry. Quarter 1 Quarter 1 I: Functions: Composition I.1 (A.42) Composition of linear functions f(g(x)). f(x) + g(x). I.2 (A.42) Composition of linear and quadratic functions II: Functions: Quadratic II.1 Parabola The

More information

Scale effects in physical hydraulic engineering models

Scale effects in physical hydraulic engineering models Discussion of Scale effects in physical hydraulic engineering models by Valentin Heller, Journal of Hydraulic Research Vol. 49, No. (2011), pp. 29-06 Discussers Michael Pfister (IAHR Member), Laboratory

More information

Week 4: Simple Linear Regression II

Week 4: Simple Linear Regression II Week 4: Simple Linear Regression II Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Algebraic properties

More information

A noninformative Bayesian approach to small area estimation

A noninformative Bayesian approach to small area estimation A noninformative Bayesian approach to small area estimation Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 glen@stat.umn.edu September 2001 Revised May 2002 Research supported

More information

Advanced Digital Signal Processing Adaptive Linear Prediction Filter (Using The RLS Algorithm)

Advanced Digital Signal Processing Adaptive Linear Prediction Filter (Using The RLS Algorithm) Advanced Digital Signal Processing Adaptive Linear Prediction Filter (Using The RLS Algorithm) Erick L. Oberstar 2001 Adaptive Linear Prediction Filter Using the RLS Algorithm A complete analysis/discussion

More information

This is a good time to refresh your memory on double-integration. We will be using this skill in the upcoming lectures.

This is a good time to refresh your memory on double-integration. We will be using this skill in the upcoming lectures. Chapter 5: JOINT PROBABILITY DISTRIBUTIONS Part 1: Sections 5-1.1 to 5-1.4 For both discrete and continuous random variables we will discuss the following... Joint Distributions (for two or more r.v. s)

More information

Scalable Bayes Clustering for Outlier Detection Under Informative Sampling

Scalable Bayes Clustering for Outlier Detection Under Informative Sampling Scalable Bayes Clustering for Outlier Detection Under Informative Sampling Based on JMLR paper of T. D. Savitsky Terrance D. Savitsky Office of Survey Methods Research FCSM - 2018 March 7-9, 2018 1 / 21

More information

Chapter 6: Simulation Using Spread-Sheets (Excel)

Chapter 6: Simulation Using Spread-Sheets (Excel) Chapter 6: Simulation Using Spread-Sheets (Excel) Refer to Reading Assignments 1 Simulation Using Spread-Sheets (Excel) OBJECTIVES To be able to Generate random numbers within a spreadsheet environment.

More information

5 Bootstrapping. 4.7 Extensions. 4.8 References. 5.1 Bootstrapping for random samples (the i.i.d. case) ST697F: Topics in Regression.

5 Bootstrapping. 4.7 Extensions. 4.8 References. 5.1 Bootstrapping for random samples (the i.i.d. case) ST697F: Topics in Regression. ST697F: Topics in Regression. Spring 2007 c 21 4.7 Extensions The approach above can be readily extended to the case where we are interested in inverse prediction or regulation about one predictor given

More information