An Introduction to the Bootstrap


Bradley Efron
Department of Statistics, Stanford University

and

Robert J. Tibshirani
Department of Preventative Medicine and Biostatistics and Department of Statistics, University of Toronto

CHAPMAN & HALL/CRC
Boca Raton London New York Washington, D.C.

Contents

1 Introduction
  An overview of this book
  Information for instructors
  Some of the notation used in the book

2 The accuracy of a sample mean
  Problems

3 Random samples and probabilities
  Introduction
  Random samples
  Probability theory
  Problems

4 The empirical distribution function and the plug-in principle
  Introduction
  The empirical distribution function
  The plug-in principle
  Problems

5 Standard errors and estimated standard errors
  Introduction
  The standard error of a mean
  Estimating the standard error of the mean
  Problems

6 The bootstrap estimate of standard error
  Introduction
  The bootstrap estimate of standard error
  Example: the correlation coefficient
  The number of bootstrap replications B
  The parametric bootstrap
  Bibliographic notes
  Problems

7 Bootstrap standard errors: some examples
  Introduction
  Example 1: test score data
  Example 2: curve fitting
  An example of bootstrap failure
  Bibliographic notes
  Problems

8 More complicated data structures
  Introduction
  One-sample problems
  The two-sample problem
  More general data structures
  Example: lutenizing hormone
  The moving blocks bootstrap
  Bibliographic notes
  Problems

9 Regression models
  Introduction
  The linear regression model
  Example: the hormone data
  Application of the bootstrap
  Bootstrapping pairs vs bootstrapping residuals
  Example: the cell survival data
  Least median of squares
  Bibliographic notes
  Problems

10 Estimates of bias
  Introduction
  The bootstrap estimate of bias
  Example: the patch data
  An improved estimate of bias
  The jackknife estimate of bias
  Bias correction
  Bibliographic notes
  Problems

11 The jackknife
  Introduction
  Definition of the jackknife
  Example: test score data
  Pseudo-values
  Relationship between the jackknife and bootstrap
  Failure of the jackknife
  The delete-d jackknife
  Bibliographic notes
  Problems

12 Confidence intervals based on bootstrap "tables"
  Introduction
  Some background on confidence intervals
  Relation between confidence intervals and hypothesis tests
  Student's t interval
  The bootstrap-t interval
  Transformations and the bootstrap-t
  Bibliographic notes
  Problems

13 Confidence intervals based on bootstrap percentiles
  Introduction
  Standard normal intervals
  The percentile interval
  Is the percentile interval backwards?
  Coverage performance
  The transformation-respecting property
  The range-preserving property
  Discussion
  Bibliographic notes
  Problems

14 Better bootstrap confidence intervals
  Introduction
  Example: the spatial test data
  The BCa method
  The ABC method
  Example: the tooth data
  Bibliographic notes
  Problems

15 Permutation tests
  Introduction
  The two-sample problem
  Other test statistics
  Relationship of hypothesis tests to confidence intervals and the bootstrap
  Bibliographic notes
  Problems

16 Hypothesis testing with the bootstrap
  Introduction
  The two-sample problem
  Relationship between the permutation test and the bootstrap
  The one-sample problem
  Testing multimodality of a population
  Discussion
  Bibliographic notes
  Problems

17 Cross-validation and other estimates of prediction error
  Introduction
  Example: hormone data
  Cross-validation
  Cp and other estimates of prediction error
  Example: classification trees
  Bootstrap estimates of prediction error
  Overview
  Some details
  The .632 bootstrap estimator
  Discussion
  Bibliographic notes
  Problems

18 Adaptive estimation and calibration
  Introduction
  Example: smoothing parameter selection for curve fitting
  Example: calibration of a confidence point
  Some general considerations
  Bibliographic notes
  Problems

19 Assessing the error in bootstrap estimates
  Introduction
  Standard error estimation
  Percentile estimation
  The jackknife-after-bootstrap
  Derivations
  Bibliographic notes
  Problems

20 A geometrical representation for the bootstrap and jackknife
  Introduction
  Bootstrap sampling
  The jackknife as an approximation to the bootstrap
  Other jackknife approximations
  Estimates of bias
  An example
  Bibliographic notes
  Problems

21 An overview of nonparametric and parametric inference
  Introduction
  Distributions, densities and likelihood functions
  Functional statistics and influence functions
  Parametric maximum likelihood inference
  The parametric bootstrap
  Relation of parametric maximum likelihood, bootstrap and jackknife approaches
  Example: influence components for the mean
  The empirical cdf as a maximum likelihood estimate
  The sandwich estimator
  Example: Mouse data
  The delta method
  Example: delta method for the mean
  Example: delta method for the correlation coefficient
  Relationship between the delta method and infinitesimal jackknife
  Exponential families
  Bibliographic notes
  Problems

22 Further topics in bootstrap confidence intervals
  Introduction
  Correctness and accuracy
  Confidence points based on approximate pivots
  The BCa interval
  The underlying basis for the BCa interval
  The ABC approximation
  Least favorable families
  The ABCq method and transformations
  Discussion
  Bibliographic notes
  Problems

23 Efficient bootstrap computations
  Introduction
  Post-sampling adjustments
  Application to bootstrap bias estimation
  Application to bootstrap variance estimation
  Pre- and post-sampling adjustments
  Importance sampling for tail probabilities
  Application to bootstrap tail probabilities
  Bibliographic notes
  Problems

24 Approximate likelihoods
  Introduction
  Empirical likelihood
  Approximate pivot methods
  Bootstrap partial likelihood
  Implied likelihood
  Discussion
  Bibliographic notes
  Problems

25 Bootstrap bioequivalence
  Introduction
  A bioequivalence problem
  Bootstrap confidence intervals
  Bootstrap power calculations
  A more careful power calculation
  Fieller's intervals
  Bibliographic notes
  Problems

26 Discussion and further topics
  Discussion
  Some questions about the bootstrap
  References on further topics

Appendix: software for bootstrap computations
  Introduction
  Some available software
  S language functions

References
Author index
Subject index


More information

DM4U_B P 1 W EEK 1 T UNIT

DM4U_B P 1 W EEK 1 T UNIT MDM4U_B Per 1 WEEK 1 Tuesday Feb 3 2015 UNIT 1: Organizing Data for Analysis 1) THERE ARE DIFFERENT TYPES OF DATA THAT CAN BE SURVEYED. 2) DATA CAN BE EFFECTIVELY DISPLAYED IN APPROPRIATE TABLES AND GRAPHS.

More information

Lecture 27, April 24, Reading: See class website. Nonparametric regression and kernel smoothing. Structured sparse additive models (GroupSpAM)

Lecture 27, April 24, Reading: See class website. Nonparametric regression and kernel smoothing. Structured sparse additive models (GroupSpAM) School of Computer Science Probabilistic Graphical Models Structured Sparse Additive Models Junming Yin and Eric Xing Lecture 7, April 4, 013 Reading: See class website 1 Outline Nonparametric regression

More information

IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING

IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING SECOND EDITION IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING ith Algorithms for ENVI/IDL Morton J. Canty с*' Q\ CRC Press Taylor &. Francis Group Boca Raton London New York CRC

More information

Exam 4. In the above, label each of the following with the problem number. 1. The population Least Squares line. 2. The population distribution of x.

Exam 4. In the above, label each of the following with the problem number. 1. The population Least Squares line. 2. The population distribution of x. Exam 4 1-5. Normal Population. The scatter plot show below is a random sample from a 2D normal population. The bell curves and dark lines refer to the population. The sample Least Squares Line (shorter)

More information

Section 4 Matching Estimator

Section 4 Matching Estimator Section 4 Matching Estimator Matching Estimators Key Idea: The matching method compares the outcomes of program participants with those of matched nonparticipants, where matches are chosen on the basis

More information

Missing Data: What Are You Missing?

Missing Data: What Are You Missing? Missing Data: What Are You Missing? Craig D. Newgard, MD, MPH Jason S. Haukoos, MD, MS Roger J. Lewis, MD, PhD Society for Academic Emergency Medicine Annual Meeting San Francisco, CA May 006 INTRODUCTION

More information

WESTMORELAND COUNTY PUBLIC SCHOOLS Integrated Instructional Pacing Guide and Checklist Algebra, Functions & Data Analysis

WESTMORELAND COUNTY PUBLIC SCHOOLS Integrated Instructional Pacing Guide and Checklist Algebra, Functions & Data Analysis WESTMORELAND COUNTY PUBLIC SCHOOLS 2013 2014 Integrated Instructional Pacing Guide and Checklist Algebra, Functions & Data Analysis FIRST QUARTER and SECOND QUARTER (s) ESS Vocabulary A.4 A.5 Equations

More information

Lecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017

Lecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017 Lecture 27: Review Reading: All chapters in ISLR. STATS 202: Data mining and analysis December 6, 2017 1 / 16 Final exam: Announcements Tuesday, December 12, 8:30-11:30 am, in the following rooms: Last

More information

Random Number Generation and Monte Carlo Methods

Random Number Generation and Monte Carlo Methods James E. Gentle Random Number Generation and Monte Carlo Methods With 30 Illustrations Springer Contents Preface vii 1 Simulating Random Numbers from a Uniform Distribution 1 1.1 Linear Congruential Generators

More information

A noninformative Bayesian approach to small area estimation

A noninformative Bayesian approach to small area estimation A noninformative Bayesian approach to small area estimation Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 glen@stat.umn.edu September 2001 Revised May 2002 Research supported

More information

9.8 Rockin the Residuals

9.8 Rockin the Residuals 42 SECONDARY MATH 1 // MODULE 9 9.8 Rockin the Residuals A Solidify Understanding Task The correlation coefficient is not the only tool that statisticians use to analyze whether or not a line is a good

More information

Technical Report of ISO/IEC Test Program of the M-DISC Archival DVD Media June, 2013

Technical Report of ISO/IEC Test Program of the M-DISC Archival DVD Media June, 2013 Technical Report of ISO/IEC 10995 Test Program of the M-DISC Archival DVD Media June, 2013 With the introduction of the M-DISC family of inorganic optical media, Traxdata set the standard for permanent

More information

EVALUATION OF THE NORMAL APPROXIMATION FOR THE PAIRED TWO SAMPLE PROBLEM WITH MISSING DATA. Shang-Lin Yang. B.S., National Taiwan University, 1996

EVALUATION OF THE NORMAL APPROXIMATION FOR THE PAIRED TWO SAMPLE PROBLEM WITH MISSING DATA. Shang-Lin Yang. B.S., National Taiwan University, 1996 EVALUATION OF THE NORMAL APPROXIMATION FOR THE PAIRED TWO SAMPLE PROBLEM WITH MISSING DATA By Shang-Lin Yang B.S., National Taiwan University, 1996 M.S., University of Pittsburgh, 2005 Submitted to the

More information

GLM II. Basic Modeling Strategy CAS Ratemaking and Product Management Seminar by Paul Bailey. March 10, 2015

GLM II. Basic Modeling Strategy CAS Ratemaking and Product Management Seminar by Paul Bailey. March 10, 2015 GLM II Basic Modeling Strategy 2015 CAS Ratemaking and Product Management Seminar by Paul Bailey March 10, 2015 Building predictive models is a multi-step process Set project goals and review background

More information

Machine Learning / Jan 27, 2010

Machine Learning / Jan 27, 2010 Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,

More information

Package FWDselect. December 19, 2015

Package FWDselect. December 19, 2015 Title Selecting Variables in Regression Models Version 2.1.0 Date 2015-12-18 Author Marta Sestelo [aut, cre], Nora M. Villanueva [aut], Javier Roca-Pardinas [aut] Maintainer Marta Sestelo

More information

Introduction to machine learning, pattern recognition and statistical data modelling Coryn Bailer-Jones

Introduction to machine learning, pattern recognition and statistical data modelling Coryn Bailer-Jones Introduction to machine learning, pattern recognition and statistical data modelling Coryn Bailer-Jones What is machine learning? Data interpretation describing relationship between predictors and responses

More information

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1 Big Data Methods Chapter 5: Machine learning Big Data Methods, Chapter 5, Slide 1 5.1 Introduction to machine learning What is machine learning? Concerned with the study and development of algorithms that

More information

Generalized least squares (GLS) estimates of the level-2 coefficients,

Generalized least squares (GLS) estimates of the level-2 coefficients, Contents 1 Conceptual and Statistical Background for Two-Level Models...7 1.1 The general two-level model... 7 1.1.1 Level-1 model... 8 1.1.2 Level-2 model... 8 1.2 Parameter estimation... 9 1.3 Empirical

More information