Instruction on JMP IN of Chapter 19


Example 19.2

(1). Download the dataset xm19-02.jmp from the website for this course and open it.
(2). Go to the Analyze menu and select Fit Model. Click on REVENUE and then click on the Y button. Then double click on the INCOME, AGE, INC sq, AGE sq, and INC X AGE variables. Then click Run Model.
(3). Sometimes you need to create a new column that is the square of another column. The steps for doing that are given later in this instruction.

Following is the output:

Response REVENUE

[Actual by Predicted Plot: REVENUE Actual vs. REVENUE Predicted; P<.0001, RSq=0.91, RMSE=44.695]

Summary of Fit
RSquare                       0.906535
RSquare Adj                   0.881939
Root Mean Square Error        44.69533
Mean of Response              1085.56
Observations (or Sum Wgts)    25

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model        5    368140.38         73628.1        36.8569
Error       19     37955.78          1997.7        Prob > F
C. Total    24    406096.16                        <.0001

Parameter Estimates
Term         Estimate      Std Error    t Ratio    Prob>|t|
Intercept    -1133.981     320.0193     -3.54      0.0022
INCOME       173.20317     28.20399      6.14      <.0001
AGE          23.549963     32.23447      0.73      0.4739
INC sq       -3.726129     0.542156     -6.87      <.0001
AGE sq       -3.868707     1.179054     -3.28      0.0039
INC X AGE    1.9672682     0.944082      2.08      0.0509

Effect Tests
Source       Nparm    DF    Sum of Squares    F Ratio    Prob > F
INCOME       1        1     75338.118         37.7129    <.0001
AGE          1        1      1066.261          0.5338    0.4739
INC sq       1        1     94360.833         47.2354    <.0001
AGE sq       1        1     21507.422         10.7662    0.0039
INC X AGE    1        1      8674.258          4.3422    0.0509

Scaled Estimates
Continuous factors centered by mean, scaled by range/2
Term         Scaled Estimate    Std Error    t Ratio    Prob>|t|
Intercept    1085.56            8.939067     121.44     <.0001
INCOME       1558.8285          253.836        6.14     <.0001
AGE          135.41229          185.3482       0.73     0.4739
INC sq       -1649.93           240.0666      -6.87     <.0001
AGE sq       -407.0847          124.066       -3.28     0.0039
INC X AGE    254.14154          121.9612       2.08     0.0509

Prediction Profiler (REVENUE axis from -1445 to 3269; predicted REVENUE 1085.56)
Factor       Low       Current    High
INCOME       15.6      24.2       33.6
AGE          3.4       8.392      14.9
INC sq       243.36    608.425    1128.96
AGE sq       11.56     78.2144    222.01
INC X AGE    53.04     203.354    311.41

To run the new model with the transformed data:

(1). Right click the mouse and select Add Multiple Columns. Enter 5 in the box after "How many columns to add". Click OK.
(2). Double left click on Column1 in the data set to change the column name to income. (Enter income in the box after Column Name.)
(3). Use the same method to change Column2 to age, Column3 to incomesq, Column4 to agesq, and Column5 to incomexage.
(4). Right click income and select Formula. Then click INCOME, click the minus button (-), and input 24.2. Then click OK.
(5). Do the same to get age. (Right click age and select Formula. Then click AGE, click the minus button (-), and input 8.392. Then click OK.)
(6). Right click incomesq and select Formula. Then click income and the x^y (power) button to square it. Then click OK.
(7). Right click agesq and select Formula. Then click age and the x^y (power) button to square it. Then click OK.
(8). Right click incomexage and select Formula. Then click income, click the multiplication button (x), and click age. Then click OK.
(9). Go to the Analyze menu and select Fit Model. Click on REVENUE and then click on the Y button. Then double click on the income, age, incomesq, agesq, and incomexage variables. Then click Run Model.

The following is the output:

Response REVENUE

[Actual by Predicted Plot: REVENUE Actual vs. REVENUE Predicted; P<.0001, RSq=0.91, RMSE=44.695]

Summary of Fit
RSquare                       0.906535
RSquare Adj                   0.881939
Root Mean Square Error        44.69533
Mean of Response              1085.56
Observations (or Sum Wgts)    25

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model        5    368140.38         73628.1        36.8569
Error       19     37955.78          1997.7        Prob > F
C. Total    24    406096.16                        <.0001

Parameter Estimates
Term          Estimate      Std Error    t Ratio    Prob>|t|
Intercept     1472.5221     88.47497     16.64      <.0001
income        9.3678491     2.743887      3.41      0.0029
age           71.157854     22.97214      3.10      0.0059
incomesq      -3.726129     0.542156     -6.87      <.0001
agesq         -3.868707     1.179054     -3.28      0.0039
incomexage    1.9672682     0.944082      2.08      0.0509

Effect Tests
Source        Nparm    DF    Sum of Squares    F Ratio    Prob > F
income        1        1     23284.764         11.6559    0.0029
age           1        1     19167.581          9.5950    0.0059
incomesq      1        1     94360.833         47.2354    <.0001
agesq         1        1     21507.422         10.7662    0.0039
incomexage    1        1      8674.258          4.3422    0.0509
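Steps (4) to (8) above can be sketched in Python as follows. This is a minimal sketch, not something JMP runs: the function name `centered_terms` and the two sample rows are hypothetical, and the centers 24.2 and 8.392 are the values entered in steps (4) and (5). The second half plugs in the estimates reported for the first model to show why centering changes the linear coefficients while the squared and interaction coefficients stay the same.

```python
def centered_terms(income, age, income_center=24.2, age_center=8.392):
    """Return the five centered predictor columns built in steps (4)-(8)."""
    inc = [x - income_center for x in income]            # step (4)
    ag = [z - age_center for z in age]                   # step (5)
    return {
        "income": inc,
        "age": ag,
        "incomesq": [x ** 2 for x in inc],               # step (6)
        "agesq": [z ** 2 for z in ag],                   # step (7)
        "incomexage": [x * z for x, z in zip(inc, ag)],  # step (8)
    }

# Hypothetical rows, for illustration only (not taken from xm19-02.jmp).
cols = centered_terms(income=[24.2, 25.2], age=[8.392, 10.392])

# Estimates reported for the uncentered model.
b_income, b_age = 173.20317, 23.549963
b_incsq, b_agesq, b_inter = -3.726129, -3.868707, 1.9672682
a, c = 24.2, 8.392  # centers subtracted from INCOME and AGE

# Substituting INCOME = income + a and AGE = age + c into the uncentered
# equation and collecting terms gives the centered linear coefficients.
income_centered = b_income + 2 * b_incsq * a + b_inter * c
age_centered = b_age + 2 * b_agesq * c + b_inter * a

# Variant: the same substitution when the squared age term is left as the
# square of the original AGE column (only age and the product are centered).
age_partly_centered = b_age + b_inter * a

print(round(income_centered, 4), round(age_centered, 4), round(age_partly_centered, 4))
```

With the reported estimates, the income value comes out at about 9.368, in line with the 9.3678491 in the second output. The age value comes out at about 6.225 when agesq is the square of the centered age column, whereas the reported 71.157854 matches the variant in which agesq is the square of the original AGE column (about 71.158). The agesq range shown in the second Prediction Profiler (11.56 to 222.01, the same as for AGE sq in the first model) suggests the same reading.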

Scaled Estimates (centered model)
Continuous factors centered by mean, scaled by range/2
Term          Scaled Estimate    Std Error    t Ratio    Prob>|t|
Intercept     1085.56            8.939067     121.44     <.0001
income        84.310642          24.69498       3.41     0.0029
age           409.15766          132.0898       3.10     0.0059
incomesq      -164.6017          23.94974      -6.87     <.0001
agesq         -407.0847          124.066       -3.28     0.0039
incomexage    66.194641          31.76646       2.08     0.0509

Prediction Profiler (REVENUE axis from 173.91 to 1862; predicted REVENUE 1085.56)
Factor        Low         Current    High
income        -8.6        9.9e-16    9.4
age           -4.992      5e-16      6.508
incomesq      0.01        22.7848    88.36
agesq         11.56       78.2144    222.01
incomexage    -24.3648    0.2672     42.9312

Example 17.1 with color variable

(1). Download the dataset xm17-01a.jmp from the website and open it.
(2). Click Fit Model, choose Price as Y, and choose Odometer and Color as Construct Model Effects. Then click OK. We get the following result. (To make this instruction shorter, I include only part of the output from JMP.)

Response Price

Summary of Fit
RSquare                       0.655219
RSquare Adj                   0.64811
Root Mean Square Error        151.2364
Mean of Response              5411.41
Observations (or Sum Wgts)    100

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model        2    4216263.2         2108132        92.1691
Error       97    2218627.0           22872        Prob > F
C. Total    99    6434890.2                        <.0001

Parameter Estimates
Term         Estimate     Std Error    t Ratio    Prob>|t|
Intercept    6580.1826    92.95884      70.79     <.0001
Odometer     -0.031278    0.002306     -13.56     <.0001
Color        -21.67052    18.11408      -1.20     0.2345

To use indicator variables for Color instead:

(1). Create one column and change its name to I1. Create another column and change its name to I2.
(2). Place the cursor (the cross) on I1, right click, and choose Formula.
(3). Click Conditional -> If, then click Comparison -> a==b. In the first box, click Color; in the second box, input 1.
(4). Choose the then clause and input 1. Choose the else clause and input 0. Then click OK.
(5). Place the cursor (the cross) on I2, right click, and choose Formula.
(6). Click Conditional -> If, then click Comparison -> a==b. In the first box, click Color; in the second box, input 2.
(7). Choose the then clause and input 1. Choose the else clause and input 0. Then click OK.
(8). Click Fit Model, choose Price as Y, and choose Odometer, I1, and I2 as Construct Model Effects. Then click OK.

Note: The formula for I1 has the following appearance.

[Figure: formula editor showing If Color == 1 then 1, else 0]

We get the following result:

Response Price

Summary of Fit
RSquare                       0.69803
RSquare Adj                   0.688594
Root Mean Square Error        142.271
Mean of Response              5411.41
Observations (or Sum Wgts)    100

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model        3    4491749.2         1497250        73.9709
Error       96    1943140.9           20241        Prob > F
C. Total    99    6434890.2                        <.0001

Parameter Estimates
Term         Estimate     Std Error    t Ratio    Prob>|t|
Intercept    6350.3231    92.16653      68.90     <.0001
Odometer     -0.02777     0.002369     -11.72     <.0001
I1           45.240979    34.08443       1.33     0.1876
I2           147.73801    38.18499       3.87     0.0002

Example 19.3

(1). Download the dataset xm19-03.jmp from the website for this course and open it.
(2). Go to the Analyze menu and select Fit Model. Click on Win_Pct and then click on the Y button. Then double click on the Rns_Scrd, Team_BA, ..., SO variables. Then click Run Model.

Following is the output:

Response Win_Pct

[Actual by Predicted Plot: Win_Pct Actual vs. Win_Pct Predicted; P=0.3058, RSq=0.99, RMSE=0.0252]

Summary of Fit
RSquare                       0.986665
RSquare Adj                   0.826649
Root Mean Square Error        0.025243
Mean of Response              0.500071
Observations (or Sum Wgts)    14

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model       12    0.04714773        0.003929       6.1660
Error        1    0.00063720        0.000637       Prob > F
C. Total    13    0.04778493                       0.3058

Parameter Estimates
Term         Estimate      Std Error    t Ratio    Prob>|t|
Intercept    -0.154661     0.904846     -0.17      0.8922
Rns_Scrd     0.0005584     0.000585      0.95      0.5148
Team_BA      0.863167      3.10199       0.28      0.8272
Team_Hmr     0.0002489     0.000588      0.42      0.7450
Team_SB      0.0003479     0.000582      0.60      0.6571
Team_Wlk     0.0001627     0.000284      0.57      0.6689
Team_SO      0.0000541     0.000137      0.40      0.7604
Rns_Alw      -0.001895     0.002003     -0.95      0.5177
Erns_Alw     0.001145      0.001789      0.64      0.6376
Hits_Alw     0.0001834     0.000328      0.56      0.6756
Team_Ers     -0.000117     0.000362     -0.32      0.8011
Wlk_Alw      0.0002292     0.000214      1.07      0.4778
SO           0.0001692     0.001195      0.14      0.9105

Effect Tests
Source       Nparm    DF    Sum of Squares    F Ratio    Prob > F
Rns_Scrd     1        1     0.00058047        0.9110     0.5148
Team_BA      1        1     0.00004934        0.0774     0.8272
Team_Hmr     1        1     0.00011424        0.1793     0.7450
Team_SB      1        1     0.00022750        0.3570     0.6571
Team_Wlk     1        1     0.00020894        0.3279     0.6689
Team_SO      1        1     0.00009948        0.1561     0.7604
Rns_Alw      1        1     0.00057007        0.8947     0.5177
Erns_Alw     1        1     0.00026089        0.4094     0.6376
Hits_Alw     1        1     0.00019901        0.3123     0.6756
Team_Ers     1        1     0.00006651        0.1044     0.8011
Wlk_Alw      1        1     0.00073255        1.1496     0.4778
SO           1        1     0.00001277        0.0200     0.9105

Scaled Estimates
Continuous factors centered by mean, scaled by range/2
Term         Scaled Estimate    Std Error    t Ratio    Prob>|t|
Intercept    0.5000714          0.006746     74.12      0.0086
Rns_Scrd     0.0706413          0.074013      0.95      0.5148
Team_BA      0.0142423          0.051183      0.28      0.8272
Team_Hmr     0.0161808          0.038215      0.42      0.7450
Team_SB      0.0175675          0.029401      0.60      0.6571
Team_Wlk     0.0204152          0.035652      0.57      0.6689
Team_SO      0.0118522          0.029997      0.40      0.7604
Rns_Alw      -0.181877          0.192287     -0.95      0.5177
Erns_Alw     0.0996162          0.155681      0.64      0.6376
Hits_Alw     0.0246623          0.04413       0.56      0.6756
Team_Ers     -0.012913          0.039971     -0.32      0.8011
Wlk_Alw      0.0324346          0.03025       1.07      0.4778
SO           0.0049054          0.034653      0.14      0.9105

To run stepwise regression, use the following steps:

(1). Download the dataset xm19-03.jmp from the website for this course and open it.
(2). Go to the Analyze menu and select Fit Model. Click on Win_Pct and then click on the Y button. Then double click on the Rns_Scrd, Team_BA, ..., SO variables. Then select Stepwise from the Fitting Personality popup menu in the top-right corner of the dialog, and click Run Model.
(3). Keep clicking Step until no further variable is entered. You will get the following output:

Stepwise Fit
Response: Win_Pct

Stepwise Regression Control
Prob to Enter    0.250
Prob to Leave    0.100
Direction:

Current Estimates
SSE          DFE    MSE         RSquare    RSquare Adj    Cp         AIC
0.0030252    11     0.000275    0.9367     0.9252         -3.2524    -112.158

Parameter    Estimate      nDF    SS          "F Ratio"    "Prob>F"
Intercept    0.46240196    1      0             0.000      1.0000
Rns_Scrd     0.0007405     1      0.032686    118.852      0.0000
Team_BA      .             1      0.000313      1.155      0.3077
Team_Hmr     .             1      0.000633      2.644      0.1350
Team_SB      .             1      0.00057       2.320      0.1587
Team_Wlk     .             1      0.000042      0.140      0.7157
Team_SO      .             1      0.000524      2.096      0.1783
Rns_Alw      -0.0006887    1      0.022716     82.598      0.0000
Erns_Alw     .             1      0.000368      1.387      0.2662
Hits_Alw     .             1      0.000539      2.168      0.1717
Team_Ers     .             1      0.00046       1.793      0.2101
Wlk_Alw      .             1      0.000355      1.331      0.2754
SO           .             1      0.000006      0.019      0.8929

Step History
Step    Parameter    Action     "Sig Prob"    Seq SS      RSquare    Cp        p
1       Rns_Scrd     Entered    0.0076        0.022044    0.4613     30.397    2
2       Rns_Alw      Entered    0.0000        0.022716    0.9367     -3.252    3
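The RSquare and RSquare Adj values in the two Example 19.3 fits can be reproduced from the reported sums of squares. The sketch below is plain arithmetic on those numbers, not JMP code, and the helper name `r_squared_stats` is hypothetical. It illustrates why the full model, with 12 predictors and only 14 observations (1 error degree of freedom), has a high RSquare but a much lower RSquare Adj, while the two-variable stepwise model keeps both high.

```python
def r_squared_stats(sse, sst, n, n_predictors):
    """Return (RSquare, RSquare Adj) from the error and total sums of squares."""
    dfe = n - n_predictors - 1   # error degrees of freedom
    dft = n - 1                  # total degrees of freedom
    r2 = 1 - sse / sst
    r2_adj = 1 - (sse / dfe) / (sst / dft)
    return r2, r2_adj

SST = 0.04778493   # C. Total sum of squares for Win_Pct
N = 14             # observations

# Full model: all 12 predictors, Error SS from the Fit Model output.
full = r_squared_stats(sse=0.00063720, sst=SST, n=N, n_predictors=12)

# Stepwise model: Rns_Scrd and Rns_Alw only, SSE from Current Estimates.
stepwise = r_squared_stats(sse=0.0030252, sst=SST, n=N, n_predictors=2)

print("full model:", full)
print("stepwise:  ", stepwise)
```

The adjustment divides each sum of squares by its degrees of freedom, so a model that spends nearly all of its degrees of freedom on predictors is penalized heavily.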

Example 19.4

(1). Download the dataset xm19-04.jmp from the website for this course and open it.
(2). Go to the Analyze menu and select Fit Model. Click on Salary and then click on the Y button. Then double click on the Education, Experience, and Gender variables. Then click Run Model.

Following is the output:

Response Salary
Whole Model

[Actual by Predicted Plot: Salary Actual vs. Salary Predicted; P<.0001, RSq=0.69, RMSE=16274]

Summary of Fit
RSquare                       0.693155
RSquare Adj                   0.683567
Root Mean Square Error        16273.96
Mean of Response              89607.7
Observations (or Sum Wgts)    100

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model        3    5.74341e10        1.9145e10      72.2873
Error       96    2.54248e10        264841613      Prob > F
C. Total    99    8.28589e10                       <.0001

Lack Of Fit
Source         DF    Sum of Squares    Mean Square    F Ratio
Lack Of Fit    80    2.41567e10        301959257      3.8100
Pure Error     16    1268054367         79253398      Prob > F
Total Error    96    2.54248e10                       0.0021
                                                      Max RSq
                                                      0.9847

Parameter Estimates
Term          Estimate     Std Error    t Ratio    Prob>|t|
Intercept     -5835.104    16082.8      -0.36      0.7175
Education     2118.8982    1018.486      2.08      0.0401
Experience    4099.3383    317.1936     12.92      <.0001
Gender        1850.9849    3703.07       0.50      0.6183

Effect Tests
Source        Nparm    DF    Sum of Squares    F Ratio     Prob > F
Education     1        1     1146295197          4.3282    0.0401
Experience    1        1     4.42349e10        167.0239    <.0001
Gender        1        1     66171062.5          0.2499    0.6183
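The Lack Of Fit entries above are tied together by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, the F ratio is the lack-of-fit mean square over the pure-error mean square, and Max RSq is one minus the pure-error share of the total sum of squares. The following sketch recomputes them from the rounded values in the output; the variable names are hypothetical, and small differences from the printed figures come from that rounding.

```python
# Values copied from the Lack Of Fit and Analysis of Variance tables.
ss_lack_of_fit, df_lack_of_fit = 2.41567e10, 80
ss_pure_error, df_pure_error = 1268054367, 16
ss_total = 8.28589e10   # C. Total sum of squares

# Mean squares are sums of squares divided by their degrees of freedom.
ms_lack_of_fit = ss_lack_of_fit / df_lack_of_fit
ms_pure_error = ss_pure_error / df_pure_error

# F ratio for lack of fit, and the largest RSquare any model using
# these predictors could reach given the replicate (pure) error.
f_ratio = ms_lack_of_fit / ms_pure_error
max_rsq = 1 - ss_pure_error / ss_total

print(round(f_ratio, 4), round(max_rsq, 4))
```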

[Residual by Predicted Plot: Salary Residual vs. Salary Predicted]

[Education Leverage Plot: Salary Leverage Residuals vs. Education Leverage, P=0.0401]

[Experience Leverage Plot: Salary Leverage Residuals vs. Experience Leverage, P<.0001]

[Gender Leverage Plot: Salary Leverage Residuals vs. Gender Leverage, P=0.6183]
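The Parameter Estimates of Example 19.4 define a prediction equation for Salary. As a minimal sketch of how it is used, the function below evaluates that equation in Python. The function name `predict_salary` and the input values are hypothetical, and the sketch assumes Gender is entered as the same numeric code used in the dataset (the source does not say which value stands for which group).

```python
# Estimates copied from the Example 19.4 Parameter Estimates table.
COEF = {
    "Intercept": -5835.104,
    "Education": 2118.8982,
    "Experience": 4099.3383,
    "Gender": 1850.9849,
}

def predict_salary(education, experience, gender):
    """Evaluate the fitted equation for one set of predictor values."""
    return (COEF["Intercept"]
            + COEF["Education"] * education
            + COEF["Experience"] * experience
            + COEF["Gender"] * gender)

# Hypothetical inputs: Education 16, Experience 10, Gender coded 0 or 1.
pred_gender_0 = predict_salary(16, 10, 0)
pred_gender_1 = predict_salary(16, 10, 1)
print(pred_gender_0, pred_gender_1)
```

The two predictions differ by exactly the Gender estimate, which the output reports as not significant (Prob>|t| = 0.6183).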