Stat 401 B Lecture 26


Forward Selection
The Forward selection procedure looks to add variables to the model. Once added, those variables stay in the model even if they become insignificant at a later step.

Backward Selection
The Backward selection procedure looks to remove variables from the model. Once removed, those variables cannot reenter the model even if they would add significantly at a later step.

Mixed Selection
A combination of the Forward and Backward selection procedures. It starts out like Forward selection but checks whether an added variable can be removed at a later step.
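As a rough illustration of the Mixed procedure described above, the following Python sketch alternates a Forward step with a Backward check. It is not JMP's implementation: the function names (`sse`, `mixed_stepwise`), the toy data, and the use of fixed partial-F cutoffs (`f_enter`, `f_leave`) in place of p-value thresholds are all assumptions made to keep the example free of external libraries.

```python
def sse(columns, y):
    """Error sum of squares from a least-squares fit of y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    coef = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(coef, X[i]))) ** 2 for i in range(n))


def partial_f(sse_reduced, sse_full, dfe_full):
    """Partial F ratio for the one variable that separates the two models."""
    return (sse_reduced - sse_full) / (sse_full / dfe_full)


def mixed_stepwise(data, y, f_enter=4.0, f_leave=4.0, max_steps=50):
    """Mixed selection with hypothetical F cutoffs; returns (model, history)."""
    n, model, history = len(y), [], []
    for _ in range(max_steps):
        changed = False
        # Forward part: add the candidate with the largest partial F, if large enough.
        sse_now = sse([data[v] for v in model], y)
        scores = {}
        for v in data:
            if v not in model:
                sse_new = sse([data[u] for u in model + [v]], y)
                scores[v] = partial_f(sse_now, sse_new, n - len(model) - 2)
        if scores:
            best = max(scores, key=scores.get)
            if scores[best] > f_enter:
                model.append(best)
                history.append(("add", best))
                changed = True
        # Backward part: remove the weakest variable if it no longer adds significantly.
        if model:
            sse_now = sse([data[v] for v in model], y)
            scores = {v: partial_f(sse([data[u] for u in model if u != v], y),
                                   sse_now, n - len(model) - 1)
                      for v in model}
            worst = min(scores, key=scores.get)
            if scores[worst] < f_leave:
                model.remove(worst)
                history.append(("remove", worst))
                changed = True
        if not changed:
            break
    return model, history


# Made-up data: y depends on x1 only; x2 is unrelated to what x1 leaves unexplained.
data = {"x1": [1, 2, 3, 4, 5, 6, 7, 8],
        "x2": [1, -1, 1, -1, 1, -1, 1, -1]}
y = [5.1, 6.9, 8.9, 11.1, 13.1, 14.9, 16.9, 19.1]
model, history = mixed_stepwise(data, y)
print(model, history)
```

Keeping `f_enter` at least as large as `f_leave` means a variable that was just removed cannot immediately reenter, which is why the loop settles instead of cycling.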

Mixed Set up
[JMP output: Stepwise Regression Control, Response: MDBH, Direction: Mixed. Summary columns include SSE, DFE, MSE, adjusted R² and AIC. Current estimates columns: Lock, Parameter, Estimate, nDF, SS, "F Ratio", "Prob>F". Step History columns: Step, Parameter, Action, "Sig Prob", Seq SS.]

Stepwise Regression Control
Direction: Mixed
Prob to Enter controls what variables are added.
Prob to Leave controls what variables are removed.
Prob to Enter = Prob to Leave

The current estimates are exactly the same as with the Forward selection procedure. Clicking on Step will initiate the Mixed procedure, which starts like the Forward procedure.

[JMP output: Stepwise Regression Control for Response: MDBH, Direction: Mixed, showing the current estimates and Step History.]

Step
X is added to the model.
Predicted MDBH = *X
R² = .76
RMSE = √MSE = √.6958 = .

Step
By clicking on Step you will invoke the Backward part of the Mixed procedure. Because X is statistically significant and is the only variable in the model, clicking on Step will not do anything.

Step
Of the remaining variables not in the model, X will add the largest sum of squares if added to the model.
SS = .
F Ratio = 8.9
Prob>F = .

JMP Mixed Step
Because X will add the largest sum of squares and that addition is statistically significant, by clicking on Step, JMP will add X to the model with X.

[JMP output: Stepwise Regression Control for Response: MDBH, Direction: Mixed, showing the current estimates and Step History.]

Step
X is added to the model.
Predicted MDBH = . + .*X + .95*X
R² = .86
RMSE = √MSE = √.59 = . 7

Step
By clicking on Step you will invoke the Backward part of the Mixed procedure. Because X and X are statistically significant, clicking on Step will not do anything.

Step
Of the remaining variables not in the model, X will add the largest sum of squares if added to the model.
SS = .67
F Ratio = 7.78
Prob>F = .

JMP Mixed Step
Because X will add the largest sum of squares and that addition is statistically significant, by clicking on Step, JMP will add X to the model with X and X.

[JMP output: Stepwise Regression Control for Response: MDBH, Direction: Mixed, showing the current estimates and Step History.]

Step
X is added to the model.
Predicted MDBH = *X .69*X + .67*X
R² = .867
RMSE = √MSE = √.8699 = . 96

Step
By clicking on Step you will invoke the Backward part of the Mixed procedure. Note that variable X is no longer statistically significant, so it will be removed from the model when you click on Step.

[JMP output: Stepwise Regression Control for Response: MDBH, Direction: Mixed, showing the current estimates and a Step History entry with Action: Removed.]

Step
X is removed from the model.
Predicted MDBH = *X .898*X
R² = .8658
RMSE = √MSE = √.8997 = . 86

Step
Because X and X add significantly to the model, they cannot be removed. Because X will not add significantly to the model, it cannot be added. The Mixed procedure stops.

[JMP output: Response MDBH. Summary of Fit (adjusted R², Root Mean Square Error, Mean of Response, Observations (or Sum Wgts)); Analysis of Variance (Source: Model, Error, C. Total; DF, Sum of Squares, Mean Square, F Ratio, Prob > F); Parameter Estimates (Term, Estimate, Std Error, t Ratio, Prob>|t|); Effect Tests (Source, Nparm, DF, Sum of Squares, F Ratio, Prob > F).]

Finding the best model
For this example, the Forward selection procedure did not find the best model. The Backward and Mixed selection procedures came up with the best model.

Finding the best model
None of the automatic selection procedures is guaranteed to find the best model. The only way to be sure is to look at all possible models.

For k explanatory variables there are $2^k - 1$ possible models.
There are $k$ 1-variable models.
There are $\binom{k}{2}$ 2-variable models.
There are $\binom{k}{3}$ 3-variable models.

When confronted with all possible models, we often rely on summary statistics to describe features of the models: R², adjusted R², and RMSE.
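The count of possible models can be checked by listing them. The short Python sketch below enumerates every non-empty subset of a set of explanatory variables; the labels `X1`, `X2`, `X3` and the function name `all_possible_models` are placeholders chosen for illustration, not names taken from the lecture's data set.

```python
from itertools import combinations
from math import comb


def all_possible_models(variables):
    """Every non-empty subset of the explanatory variables, smallest models first."""
    return [subset
            for size in range(1, len(variables) + 1)
            for subset in combinations(variables, size)]


variables = ["X1", "X2", "X3"]   # hypothetical labels, k = 3
k = len(variables)
models = all_possible_models(variables)

print(len(models), 2 ** k - 1)   # total number of models, two ways
for size in range(1, k + 1):
    # Number of models that use exactly `size` variables.
    print(size, comb(k, size))
```

With three explanatory variables this gives 3 one-variable models, 3 two-variable models and 1 three-variable model, 7 in all. The count doubles with each extra variable, which is why looking at all possible models is only practical for modest k.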

Another summary statistic used to assess the fit of a model is Mallows $C_p$.

$$C_p = \frac{SSE_p}{MSE_{Full}} - (n - 2p), \qquad p = k + 1$$

The smaller $C_p$ is, the better the fit of the model. The full model will have $C_p = p$.

JMP: Fit Model, Personality Stepwise, red triangle pull down, All Possible Models, right click on table, Columns, check Cp.
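A small Python sketch of the Mallows $C_p$ calculation, using the usual form $C_p = SSE_p / MSE_{Full} - (n - 2p)$ with $p$ counting the intercept plus the explanatory variables in the model being assessed. The function name `mallows_cp` and all of the numbers are made up for illustration; they are not the MDBH results.

```python
def mallows_cp(sse_p, mse_full, n, p):
    """Mallows Cp for a model with p parameters (intercept included)."""
    return sse_p / mse_full - (n - 2 * p)


# Made-up values: n = 20 observations, full model with k = 3 variables (p = 4).
n = 20
sse_full = 8.0
mse_full = sse_full / (n - 4)          # MSE of the full model = SSE / (n - p)

cp_full = mallows_cp(sse_full, mse_full, n, p=4)
cp_two_var = mallows_cp(8.5, mse_full, n, p=3)   # hypothetical 2-variable model

print(cp_full, cp_two_var)
```

For the full model, $SSE_p / MSE_{Full} = n - p$, so $C_p$ always equals $p$ there. A smaller model earns a lower $C_p$ only if dropping variables raises SSE by less than the penalty it saves.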

[JMP output: All Possible Models for Response: MDBH. Table columns include Model, Number and RMSE; plot axes are RMSE and Number of Terms.]

Lists all 7 models.
3-variable (full) model first.
2-variable models listed in order of the R² value.
1-variable models listed in order of the R² value.

Model with X, X, X: highest R² value.
Model with X, X: lowest RMSE and lowest Cp.

Which is best? Can't tell until you look at the significance of the variables in the model.


More information

Stata Session 2. Tarjei Havnes. University of Oslo. Statistics Norway. ECON 4136, UiO, 2012

Stata Session 2. Tarjei Havnes. University of Oslo. Statistics Norway. ECON 4136, UiO, 2012 Stata Session 2 Tarjei Havnes 1 ESOP and Department of Economics University of Oslo 2 Research department Statistics Norway ECON 4136, UiO, 2012 Tarjei Havnes (University of Oslo) Stata Session 2 ECON

More information

Sharp EL-9900 Graphing Calculator

Sharp EL-9900 Graphing Calculator Sharp EL-9900 Graphing Calculator Basic Keyboard Activities General Mathematics Algebra Programming Advanced Keyboard Activities Algebra Calculus Statistics Trigonometry Programming Sharp EL-9900 Graphing

More information

Prob and Stats, Sep 4

Prob and Stats, Sep 4 Prob and Stats, Sep 4 Variations on the Frequency Histogram Book Sections: N/A Essential Questions: What are the methods for displaying data, and how can I build them? What are variations of the frequency

More information

One Factor Experiments

One Factor Experiments One Factor Experiments 20-1 Overview Computation of Effects Estimating Experimental Errors Allocation of Variation ANOVA Table and F-Test Visual Diagnostic Tests Confidence Intervals For Effects Unequal

More information

Page 1. Program Performance Metrics. Program Performance Metrics. Amdahl s Law. 1 seq seq 1

Page 1. Program Performance Metrics. Program Performance Metrics. Amdahl s Law. 1 seq seq 1 Program Performance Metrics The parallel run time (Tpar) is the time from the moment when computation starts to the moment when the last processor finished his execution The speedup (S) is defined as the

More information

Simulating Multivariate Normal Data

Simulating Multivariate Normal Data Simulating Multivariate Normal Data You have a population correlation matrix and wish to simulate a set of data randomly sampled from a population with that structure. I shall present here code and examples

More information

Nina Zumel and John Mount Win-Vector LLC

Nina Zumel and John Mount Win-Vector LLC SUPERVISED LEARNING IN R: REGRESSION Evaluating a model graphically Nina Zumel and John Mount Win-Vector LLC "line of perfect prediction" Systematic errors DataCamp Plotting Ground Truth vs. Predictions

More information

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB Keppel, G. Design and Analysis: Chapter 17: The Mixed Two-Factor Within-Subjects Design: The Overall Analysis and the Analysis of Main Effects and Simple Effects Keppel describes an Ax(BxS) design, which

More information

STAT:5201 Applied Statistic II

STAT:5201 Applied Statistic II STAT:5201 Applied Statistic II Two-Factor Experiment (one fixed blocking factor, one fixed factor of interest) Randomized complete block design (RCBD) Primary Factor: Day length (short or long) Blocking

More information

14.2 The Regression Equation

14.2 The Regression Equation 14.2 The Regression Equation Tom Lewis Fall Term 2009 Tom Lewis () 14.2 The Regression Equation Fall Term 2009 1 / 12 Outline 1 Exact and inexact linear relationships 2 Fitting lines to data 3 Formulas

More information

Advanced Data Analysis 1 Stat 427/527

Advanced Data Analysis 1 Stat 427/527 Advanced Data Analysis 1 Stat 427/527 Chapter 00-2 R building blocks Erik B. Erhardt Department of Mathematics and Statistics MSC01 1115 1 University of New Mexico Albuquerque, New Mexico, 87131-0001 Office:

More information

Tips on JMP ing into Mixture Experimentation

Tips on JMP ing into Mixture Experimentation Tips on JMP ing into Mixture Experimentation Daniell. Obermiller, The Dow Chemical Company, Midland, MI Abstract Mixture experimentation has unique challenges due to the fact that the proportion of the

More information

Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11}

Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11} Entering Data: Basic Commands Consider the data set: {15, 22, 32, 31, 52, 41, 11} Data is stored in Lists on the calculator. Locate and press the STAT button on the calculator. Choose EDIT. The calculator

More information

Graphics calculator instructions

Graphics calculator instructions Graphics calculator instructions Contents: A B C D E F G Basic calculations Basic functions Secondary function and alpha keys Memory Lists Statistical graphs Working with functions 10 GRAPHICS CALCULATOR

More information

Predicting Web Service Levels During VM Live Migrations

Predicting Web Service Levels During VM Live Migrations Predicting Web Service Levels During VM Live Migrations 5th International DMTF Academic Alliance Workshop on Systems and Virtualization Management: Standards and the Cloud Helmut Hlavacs, Thomas Treutner

More information

Reproducible Research: Weaving with Stata

Reproducible Research: Weaving with Stata StataCorp LP Italian Stata Users Group Meeting October, 2008 Outline I Introduction 1 Introduction Goals Reproducible Research and Weaving 2 3 What We ve Seen Goals Reproducible Research and Weaving Goals

More information

Factorial ANOVA. Skipping... Page 1 of 18

Factorial ANOVA. Skipping... Page 1 of 18 Factorial ANOVA The potato data: Batches of potatoes randomly assigned to to be stored at either cool or warm temperature, infected with one of three bacterial types. Then wait a set period. The dependent

More information

INTRODUCTION TO PANEL DATA ANALYSIS

INTRODUCTION TO PANEL DATA ANALYSIS INTRODUCTION TO PANEL DATA ANALYSIS USING EVIEWS FARIDAH NAJUNA MISMAN, PhD FINANCE DEPARTMENT FACULTY OF BUSINESS & MANAGEMENT UiTM JOHOR PANEL DATA WORKSHOP-23&24 MAY 2017 1 OUTLINE 1. Introduction 2.

More information

Graphics calculator instructions

Graphics calculator instructions Graphics calculator instructions Contents: A Basic calculations B Basic functions C Secondary function and alpha keys D Memory E Lists F Statistical graphs G Working with functions H Two variable analysis

More information