Psychology 282 Lecture #21 Outline Categorical IVs in MLR: Effects Coding and Contrast Coding
- Griffin Cory Bradley
- 5 years ago
Transcription
In the previous lecture we learned how to incorporate a categorical research factor into an MLR model by using dummy variables. Given a categorical factor with g levels, we construct (g-1) dummy variables as defined by the following coding table:

Category   C1   C2   C3   ...   C(g-1)
1          1    0    0    ...   0
2          0    1    0    ...   0
3          0    0    1    ...   0
...
(g-1)      0    0    0    ...   1
g          0    0    0    ...   0

The coding table is used to assign values of the dummy variables to each individual. The dummy variables are then used as IVs in a regression model, which produces a value of $R^2$ as well as a regression equation. The value of $R^2$ indicates the proportion of variance in Y accounted for by the categorical research factor, as represented by the dummy variables.
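The dummy coding table and its use to assign codes to individuals can be sketched in a few lines of NumPy. This is only an illustration: the function name `dummy_coding_table` and the seven category labels are hypothetical, not part of the lecture.

```python
import numpy as np

def dummy_coding_table(g):
    """Return a g x (g-1) table: category j gets 1 on C_j, category g gets all 0s."""
    return np.eye(g, g - 1, dtype=int)

table = dummy_coding_table(4)

# Hypothetical category labels (0-based) for seven individuals.
categories = np.array([0, 0, 1, 1, 2, 2, 3])

# Look up each individual's row of codes in the coding table.
dummy_ivs = table[categories]
print(dummy_ivs)
```

Each row of `dummy_ivs` holds one individual's values on the (g-1) dummy variables, ready to be used as IVs.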
The partial statistics associated with dummy variable $C_j$ are interpreted with reference to a comparison of category j to category g. Thus, category g plays a special role by serving as the category to which all others are compared via the partial statistics. When there is no basis for assigning a particular category to play this role, we may wish to use a different coding method.

Unweighted Effects Coding

Effects coded variables look very much like dummy variables, with one change: individuals in group g are assigned values of -1 on all effects coded variables. Thus, the coding table would have the following general form:

Category   C1   C2   C3   ...   C(g-1)
1          1    0    0    ...   0
2          0    1    0    ...   0
3          0    0    1    ...   0
...
(g-1)      0    0    0    ...   1
g          -1   -1   -1   ...   -1
In our example the coding table would look like this:

Category      C1   C2   C3
1 (Drug A)    1    0    0
2 (Drug B)    0    1    0
3 (Placebo)   0    0    1
4 (Control)   -1   -1   -1

Using this coding table we could then assign values of the effects coded IVs to each individual and produce a data matrix of the form

Participant   Treatment   C1   C2   C3   Y
1             A           1    0    0    ...
2             A           1    0    0    ...
3             B           0    1    0    ...
4             B           0    1    0    ...
5             Placebo     0    0    1    ...
6             Placebo     0    0    1    ...
7             Control     -1   -1   -1   ...
...
n

We could then use the effects coded variables as IVs in an MLR analysis with Y as the DV. The analysis would produce a value of $R^2$ along with a regression equation of the form

$$\hat{Y} = B_0 + B_1 C_1 + B_2 C_2 + \dots + B_{g-1} C_{g-1}$$
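As a minimal sketch of how the effects coding table turns treatment labels into coded IVs, the following builds the table for the four-treatment example and looks up each participant's codes. The seven participant assignments are hypothetical and serve only to show the mechanics.

```python
import numpy as np

treatments = ["Drug A", "Drug B", "Placebo", "Control"]
g = len(treatments)

# Effects coding table: identity rows for categories 1..g-1, all -1 for category g.
effects_table = np.vstack([np.eye(g - 1, dtype=int),
                           -np.ones((1, g - 1), dtype=int)])
codes_for = dict(zip(treatments, effects_table))

# Hypothetical assignment of seven participants to treatments.
assigned = ["Drug A", "Drug A", "Drug B", "Drug B", "Placebo", "Placebo", "Control"]

# One row of (C1, C2, C3) per participant.
effects_ivs = np.array([codes_for[t] for t in assigned])
print(effects_ivs)
```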
The value of $R^2$ and all inferential information about $R^2$ (significance test, confidence interval, correction for shrinkage, etc.) would be identical to results obtained from the MLR analysis with dummy variables. The regression coefficients and other partial statistics associated with the effects coded variables would be different from the corresponding information associated with dummy variables.

It can be shown that the intercept and coefficients in the regression equation would have the following interpretation. The intercept $B_0$ would be equal to the mean of the g group means on the Y variable. That is:

$$B_0 = \bar{\bar{Y}} = \frac{\bar{Y}_1 + \bar{Y}_2 + \bar{Y}_3 + \dots + \bar{Y}_g}{g}$$

This value is called the unweighted mean of the group means, meaning that the group means are not weighted by sample size. If group sample sizes are equal, then this value is equivalent to the grand mean of Y across all n observations. More on this later.
The regression coefficients for effects coded IVs also have a simple interpretation. For the first coded variable, it can be shown that

$$B_1 = \bar{Y}_1 - \bar{\bar{Y}}$$

where $\bar{\bar{Y}}$ is the unweighted mean of the group means. That is, the regression coefficient for $C_1$ will equal the difference between the mean for category 1 and the mean of all group means. In our example $B_1$ would equal the difference between the mean value of Y for individuals in the Drug A condition and the mean of all four group means. Such coefficients can be thought of as representing the effect of membership in a given category. For example, a large positive value of $B_1$ indicates a strong positive effect of being in the Drug A condition.

Each regression coefficient for effects coded IVs has a similar interpretation. The coefficient for $C_2$ would have the value

$$B_2 = \bar{Y}_2 - \bar{\bar{Y}}$$

and would reflect the effect of being in the Drug B condition. In general, for effects coded IV $C_j$,

$$B_j = \bar{Y}_j - \bar{\bar{Y}}$$

reflects the effect of membership in category j. For all such effects, category j is compared to the unweighted mean of all categories.
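These interpretations can be checked numerically. The sketch below fits an ordinary least squares regression on unweighted effects codes and compares the coefficients with the group means. The scores are hypothetical (not the lecture's data) and the group sizes are deliberately unequal, so that the unweighted mean of the group means differs from the grand mean.

```python
import numpy as np

# Hypothetical scores, with unequal group sizes on purpose.
scores = {
    "Drug A": [9.0, 7.0, 8.0],
    "Drug B": [8.0, 7.0],
    "Placebo": [5.0, 8.0, 5.0, 6.0],
    "Control": [7.0, 4.0, 4.0],
}
g = len(scores)

# Unweighted effects coding table (last category coded -1 throughout).
effects_table = np.vstack([np.eye(g - 1), -np.ones((1, g - 1))])

# Build the data matrix: one row of codes and one Y value per individual.
rows, y = [], []
for j, values in enumerate(scores.values()):
    for value in values:
        rows.append(effects_table[j])
        y.append(value)
y = np.array(y)
X = np.column_stack([np.ones(len(y)), np.array(rows)])  # intercept + C1..C3

coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

group_means = np.array([np.mean(v) for v in scores.values()])
unweighted_mean = group_means.mean()

print("B0:", coefs[0], "unweighted mean of group means:", unweighted_mean)
print("B1..B3:", coefs[1:], "deviations:", group_means[:g - 1] - unweighted_mean)
```

With these data the intercept matches the unweighted mean of the four group means rather than the grand mean of all twelve scores, which is the distinction the lecture draws.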
For each $B_j$ we can also conduct significance tests and obtain confidence intervals. Such information is interpreted with reference to a comparison of the mean for category j to the unweighted mean of all group means. Similarly, we can obtain a value of $sr_j^2$ associated with each effects coded variable. Such a value would be interpreted as the proportion of variance in Y accounted for by the effect of membership in group j; or more specifically, the proportion of variance in Y accounted for by the difference between the mean for group j and the unweighted mean of all group means.

In general, under this type of coding, all partial statistics are interpreted with reference to comparison of a given group to the unweighted mean of all group means. Note the difference between this interpretation and that for partial statistics associated with dummy variables, which are interpreted with reference to comparison of a given group to group g.

Note that the use of unweighted effects coding implies that each category counts equally. Differences in sample sizes for different groups are not considered relevant. This would normally be the case in experimental designs.
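One way to obtain $sr_j^2$ is as the drop in $R^2$ when $C_j$ is removed from the full model. The sketch below applies that definition to unweighted effects codes. The data and the helper name `r_squared` are hypothetical and used only for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on X (X must already include an intercept column)."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coefs
    return 1.0 - residuals @ residuals / np.sum((y - y.mean()) ** 2)

# Hypothetical data: group index (0..3) and score for twelve individuals.
group = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3])
y = np.array([9, 7, 8, 8, 7, 5, 8, 5, 6, 7, 4, 4], dtype=float)

effects_table = np.vstack([np.eye(3), -np.ones((1, 3))])
X = np.column_stack([np.ones(len(y)), effects_table[group]])

r2_full = r_squared(X, y)

# Squared semipartial correlation for C_j: R^2 lost when column j is dropped.
sr2 = [r2_full - r_squared(np.delete(X, j, axis=1), y) for j in (1, 2, 3)]
print("R^2:", r2_full)
print("sr^2 for C1..C3:", sr2)
```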
Weighted Effects Coding

In some situations differences in sample sizes among groups may be indicative of those groups representing different proportions of the full population. For example, if the research factor is ethnicity and we take a large sample from the full population, we will find different sample sizes for different ethnic groups. Those sample sizes reflect the fact that different ethnic groups make up different proportions of the full population.

If we wish for these differences to be represented in our coded variables and in our regression analyses, then effects codes must be adjusted by using the differential sample sizes. See details for these adjustments in Cohen, Cohen, West, & Aiken (2004).

The resulting coded variables can then be used as IVs in a regression analysis, producing a value of $R^2$ and a regression equation. The value of $R^2$ and associated inferential information will be identical to that obtained using dummy coding or unweighted effects coding.
The coefficients in the regression equation will be different and will be interpreted in terms of weighted means instead of unweighted means. The intercept $B_0$ will correspond to the weighted mean of all g group means. A regression coefficient $B_j$ will be interpreted as a deviation of a group mean from the weighted mean of all g group means.

The choice of weighted vs. unweighted effects coding depends primarily on whether differences in sample sizes for different categories of the research factor are reflective of those categories representing different proportions of the full population.

The choice of effects coding vs. dummy coding depends at least in part on whether there exists an appropriate choice for a comparison group under dummy coding.
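The lecture defers the details of the sample-size adjustment to the cited text. As a sketch only, the code below assumes one commonly described form of weighted effects coding, in which group g is coded $-n_j/n_g$ on $C_j$ instead of -1; that specific form is an assumption here, not something stated in the lecture. The data are hypothetical.

```python
import numpy as np

# Hypothetical data: group index (0..3) and score for twelve individuals.
group = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3])
y = np.array([9, 7, 8, 8, 7, 5, 8, 5, 6, 7, 4, 4], dtype=float)
g = 4
n = np.bincount(group, minlength=g)  # group sample sizes

# Assumed weighted effects codes: last group gets -n_j / n_g on C_j.
last_row = -(n[:g - 1] / n[g - 1])[None, :]
weighted_table = np.vstack([np.eye(g - 1), last_row])

X = np.column_stack([np.ones(len(y)), weighted_table[group]])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

group_means = np.array([y[group == j].mean() for j in range(g)])
weighted_mean = np.sum(n * group_means) / n.sum()

print("B0:", coefs[0], "weighted mean of group means:", weighted_mean)
print("B1..B3:", coefs[1:], "deviations:", group_means[:g - 1] - weighted_mean)
```

Under this assumed coding the intercept reproduces the weighted mean of the group means, which is the same number as the grand mean of all observations.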
Contrast Coding

A third type of coding can be used when there exist prior hypotheses about particular differences between categories. In our example, for instance, one specific issue of interest might be evaluation of the difference in effectiveness between Drug A and Drug B, ignoring the other two categories. Another might be evaluation of the difference in effectiveness between use of a real drug (Drug A and Drug B) vs. no real drug (Placebo and Control).

Such prior hypotheses are called contrasts, and we can design coded variables to represent and provide for the testing of contrasts of interest. Given g categories we would define (g-1) contrast coded variables.

The general procedure for defining a contrast coded variable is as follows. Given g categories, a contrast can be seen as defining three subsets of the g categories:

Subset U, containing u categories.
Subset V, containing v categories.
Subset W, containing w categories.

The contrast is designed to compare the groups in subset U to those in subset V, ignoring those in subset W.
For example, if we wish to compare Drug A to Drug B, ignoring Placebo and Control conditions, then:

Subset U is the Drug A condition, containing u=1 category.
Subset V is the Drug B condition, containing v=1 category.
Subset W contains the Placebo and Control conditions, thus w=2.

Contrast codes (defining a column in the coding table) are then defined as follows:

For categories in subset U, codes are set at -v/(u+v).
For categories in subset V, codes are set at +u/(u+v).
For categories in subset W, codes are set at 0.

To illustrate, let us define codes for contrast variable $C_1$ to represent a comparison of Drug A vs. Drug B.

The value of $C_1$ for the Drug A condition would be -1/2.
The value of $C_1$ for the Drug B condition would be +1/2.
The value of $C_1$ for the Placebo and Control conditions would be 0.
Thus the first column of the coding table would have the following form:

Category      C1     C2   C3
1 (Drug A)    -1/2
2 (Drug B)    +1/2
3 (Placebo)   0
4 (Control)   0

The contrast codes actually define a linear combination of group means:

$$C_1 = \left(-\tfrac{1}{2}\right)\bar{Y}_1 + \left(+\tfrac{1}{2}\right)\bar{Y}_2 + (0)\,\bar{Y}_3 + (0)\,\bar{Y}_4$$

Since we need (g-1) coded IVs to carry the information in the categorical research factor, we can (must) define two more contrasts in our example. Let $C_2$ be defined to represent a comparison of Placebo vs. Control, ignoring Drugs A and B. Let $C_3$ be defined to represent a comparison of Drugs A and B to Placebo and Control. The full coding table would then take this form:

Category      C1     C2     C3
1 (Drug A)    -1/2   0      +1/2
2 (Drug B)    +1/2   0      +1/2
3 (Placebo)   0      -1/2   -1/2
4 (Control)   0      +1/2   -1/2
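The U/V/W rule for contrast codes can be written as a small function. The sketch below uses exact fractions and rebuilds the three columns of the table above; the function name `contrast_codes` and the 0-based category indices are illustrative choices, not part of the lecture.

```python
from fractions import Fraction

def contrast_codes(subset_u, subset_v, g):
    """Codes for one contrast: -v/(u+v) for U, +u/(u+v) for V, 0 for all others (W)."""
    u, v = len(subset_u), len(subset_v)
    codes = []
    for category in range(g):
        if category in subset_u:
            codes.append(Fraction(-v, u + v))
        elif category in subset_v:
            codes.append(Fraction(u, u + v))
        else:
            codes.append(Fraction(0))
    return codes

# Categories indexed 0..3: Drug A, Drug B, Placebo, Control.
c1 = contrast_codes({0}, {1}, g=4)        # Drug A vs. Drug B
c2 = contrast_codes({2}, {3}, g=4)        # Placebo vs. Control
c3 = contrast_codes({2, 3}, {0, 1}, g=4)  # Placebo and Control vs. Drugs A and B

for name, column in zip(["Drug A", "Drug B", "Placebo", "Control"], zip(c1, c2, c3)):
    print(name, [str(code) for code in column])
```

Which subset is labeled U and which V only determines the sign of the codes; here the subsets are chosen so that the signs agree with the table.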
Note that the contrasts should be defined as independent, or orthogonal. Independence is achieved by defining the contrasts so that the sum of products of codes for a given pair of contrasts is zero. In our example, if we sum the products of the codes in any pair of columns, we get a value of zero.

Once contrast codes are defined we can then use the coding table to assign values of the coded variables to each individual. In our example the resulting data matrix would look like this:

Participant   Treatment   C1     C2     C3     Y
1             A           -1/2   0      1/2    9
2             A           -1/2   0      1/2    ...
3             B           1/2    0      1/2    8
4             B           1/2    0      1/2    7
5             Placebo     0      -1/2   -1/2   5
6             Placebo     0      -1/2   -1/2   8
7             Control     0      1/2    -1/2   7
...
n

Just as we did using other coding methods, we could then proceed with an MLR analysis regressing Y on the three coded IVs.
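The orthogonality condition is easy to verify mechanically: the sum of products of codes for every pair of columns in the coding table should be zero. A minimal check for the example table might look like this.

```python
import numpy as np

# Contrast coding table from the example (rows: Drug A, Drug B, Placebo, Control).
contrast_table = np.array([
    [-0.5,  0.0,  0.5],
    [ 0.5,  0.0,  0.5],
    [ 0.0, -0.5, -0.5],
    [ 0.0,  0.5, -0.5],
])

# Entry (i, j) is the sum of products of the codes in columns i and j.
cross_products = contrast_table.T @ contrast_table

# Keep only the off-diagonal entries, one per pair of different contrasts.
off_diagonal = cross_products[~np.eye(3, dtype=bool)]
print(cross_products)
print("orthogonal:", bool(np.allclose(off_diagonal, 0.0)))
```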
Results would include a value of $R^2$ and associated inferential information, which would exactly match corresponding results obtained under other coding methods. Results would also include a regression equation of the form

$$\hat{Y} = B_0 + B_1 C_1 + B_2 C_2 + \dots + B_{g-1} C_{g-1}$$

Our focus in these results is on the regression coefficients and associated inferential information and partial statistics. (It can be shown that the intercept will be equal to the unweighted mean of the g group means.)

In our example, $B_1$ would equal the value of the contrast defined by $C_1$; specifically, the difference between the unweighted means for the Drug A and Drug B conditions. The significance test for $B_1$ would be interpreted as a test of the significance of this contrast. The value of $sr_1^2$ would be interpreted as the proportion of variance in Y accounted for by this contrast.
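To make the interpretation of the contrast coefficients concrete, the sketch below regresses hypothetical scores (not the lecture's data) on the three contrast coded IVs and compares each coefficient with the corresponding difference between group means. With the signs used in the coding table, each difference is taken as the positively coded side minus the negatively coded side.

```python
import numpy as np

# Hypothetical data: group index (0=Drug A, 1=Drug B, 2=Placebo, 3=Control) and score.
group = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3])
y = np.array([9, 7, 8, 8, 7, 5, 8, 5, 6, 7, 4, 4], dtype=float)

contrast_table = np.array([
    [-0.5,  0.0,  0.5],
    [ 0.5,  0.0,  0.5],
    [ 0.0, -0.5, -0.5],
    [ 0.0,  0.5, -0.5],
])

X = np.column_stack([np.ones(len(y)), contrast_table[group]])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

m = np.array([y[group == j].mean() for j in range(4)])  # group means

print("B0:", coefs[0], "unweighted mean of group means:", m.mean())
print("B1:", coefs[1], "Drug B mean - Drug A mean:", m[1] - m[0])
print("B2:", coefs[2], "Control mean - Placebo mean:", m[3] - m[2])
print("B3:", coefs[3], "drug average - no-drug average:", (m[0] + m[1]) / 2 - (m[2] + m[3]) / 2)
```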
In a similar fashion the partial statistics associated with each contrast coded IV could be interpreted.

General Comments on Coding Methods

In the case of a single categorical research factor, regardless of which coding method is used, results of an MLR analysis will be equivalent to results of a one-way ANOVA.

When different coding methods are used, the value of $R^2$ and associated inferential information will not change. The values of regression coefficients and other partial statistics will change, as will their interpretation.

In general, when using coded variables we should always make use of unstandardized regression coefficients rather than standardized coefficients. Standardization of coded variables makes interpretation more difficult.
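Both general points can be illustrated with a short numerical check: $R^2$ is the same under dummy, effects, and contrast coding, and the F statistic computed from $R^2$ matches the F from a one-way ANOVA computed by hand. The data are hypothetical and the helper name `r_squared` is illustrative.

```python
import numpy as np

# Hypothetical data: group index (0..3) and score for twelve individuals.
group = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3])
y = np.array([9, 7, 8, 8, 7, 5, 8, 5, 6, 7, 4, 4], dtype=float)
g, n = 4, len(y)

def r_squared(coding_table):
    """R^2 from regressing y on the coded IVs defined by a g x (g-1) coding table."""
    X = np.column_stack([np.ones(n), coding_table[group]])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coefs
    return 1.0 - residuals @ residuals / np.sum((y - y.mean()) ** 2)

tables = {
    "dummy": np.eye(g, g - 1),
    "effects": np.vstack([np.eye(g - 1), -np.ones((1, g - 1))]),
    "contrast": np.array([[-0.5, 0.0, 0.5], [0.5, 0.0, 0.5],
                          [0.0, -0.5, -0.5], [0.0, 0.5, -0.5]]),
}
r2 = {name: r_squared(table) for name, table in tables.items()}

# F test of R^2 with (g-1) and (n-g) degrees of freedom.
f_regression = (r2["dummy"] / (g - 1)) / ((1 - r2["dummy"]) / (n - g))

# One-way ANOVA computed directly from sums of squares.
means = np.array([y[group == j].mean() for j in range(g)])
counts = np.bincount(group, minlength=g)
ss_between = np.sum(counts * (means - y.mean()) ** 2)
ss_within = np.sum((y - means[group]) ** 2)
f_anova = (ss_between / (g - 1)) / (ss_within / (n - g))

print(r2)
print("F from regression:", f_regression, "F from ANOVA:", f_anova)
```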
The choice of coding method can be based on the following principles:

Dummy coding: Use when there is one group that logically can serve as a reference group to which all others will be compared through the various partial statistics.

Effects coding: Use when there is no obvious choice for a reference group and no specific contrasts of interest. Use unweighted effects coding when differences in group sample sizes are irrelevant. Use weighted effects coding when differences in group sample sizes reflect differences in proportional representation in the population.

Contrast coding: Use when prior hypotheses lend themselves to the specification of (g-1) independent contrasts.
More informationSTAT STATISTICAL METHODS. Statistics: The science of using data to make decisions and draw conclusions
STAT 515 --- STATISTICAL METHODS Statistics: The science of using data to make decisions and draw conclusions Two branches: Descriptive Statistics: The collection and presentation (through graphical and
More informationRegression. Page 1. Notes. Output Created Comments Data. 26-Mar :31:18. Input. C:\Documents and Settings\BuroK\Desktop\Data Sets\Prestige.
GET FILE='C:\Documents and Settings\BuroK\Desktop\DataSets\Prestige.sav'. GET FILE='E:\MacEwan\Teaching\Stat252\Data\SPSS_data\MENTALID.sav'. DATASET ACTIVATE DataSet1. DATASET CLOSE DataSet2. GET FILE='E:\MacEwan\Teaching\Stat252\Data\SPSS_data\survey_part.sav'.
More informationQuantitative - One Population
Quantitative - One Population The Quantitative One Population VISA procedures allow the user to perform descriptive and inferential procedures for problems involving one population with quantitative (interval)
More informationExcel 2010 with XLSTAT
Excel 2010 with XLSTAT J E N N I F E R LE W I S PR I E S T L E Y, PH.D. Introduction to Excel 2010 with XLSTAT The layout for Excel 2010 is slightly different from the layout for Excel 2007. However, with
More informationMinitab detailed
Minitab 18.1 - detailed ------------------------------------- ADDITIVE contact sales: 06172-5905-30 or minitab@additive-net.de ADDITIVE contact Technik/ Support/ Installation: 06172-5905-20 or support@additive-net.de
More informationDesignDirector Version 1.0(E)
Statistical Design Support System DesignDirector Version 1.0(E) User s Guide NHK Spring Co.,Ltd. Copyright NHK Spring Co.,Ltd. 1999 All Rights Reserved. Copyright DesignDirector is registered trademarks
More informationChapter 17: INTERNATIONAL DATA PRODUCTS
Chapter 17: INTERNATIONAL DATA PRODUCTS After the data processing and data analysis, a series of data products were delivered to the OECD. These included public use data files and codebooks, compendia
More informationApplied Regression Modeling: A Business Approach
i Applied Regression Modeling: A Business Approach Computer software help: SPSS SPSS (originally Statistical Package for the Social Sciences ) is a commercial statistical software package with an easy-to-use
More informationUNIT 1: NUMBER LINES, INTERVALS, AND SETS
ALGEBRA II CURRICULUM OUTLINE 2011-2012 OVERVIEW: 1. Numbers, Lines, Intervals and Sets 2. Algebraic Manipulation: Rational Expressions and Exponents 3. Radicals and Radical Equations 4. Function Basics
More informationIndependent Variables
1 Stepwise Multiple Regression Olivia Cohen Com 631, Spring 2017 Data: Film & TV Usage 2015 I. MODEL Independent Variables Demographics Item: Age Item: Income Dummied Item: Gender (Female) Digital Media
More informationLast time... Coryn Bailer-Jones. check and if appropriate remove outliers, errors etc. linear regression
Machine learning, pattern recognition and statistical data modelling Lecture 3. Linear Methods (part 1) Coryn Bailer-Jones Last time... curse of dimensionality local methods quickly become nonlocal as
More informationSAS Structural Equation Modeling 1.3 for JMP
SAS Structural Equation Modeling 1.3 for JMP SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2012. SAS Structural Equation Modeling 1.3 for JMP. Cary,
More informationExample 1 of panel data : Data for 6 airlines (groups) over 15 years (time periods) Example 1
Panel data set Consists of n entities or subjects (e.g., firms and states), each of which includes T observations measured at 1 through t time period. total number of observations : nt Panel data have
More informationExample Using Missing Data 1
Ronald H. Heck and Lynn N. Tabata 1 Example Using Missing Data 1 Creating the Missing Data Variable (Miss) Here is a data set (achieve subset MANOVAmiss.sav) with the actual missing data on the outcomes.
More informationCurve fitting. Lab. Formulation. Truncation Error Round-off. Measurement. Good data. Not as good data. Least squares polynomials.
Formulating models We can use information from data to formulate mathematical models These models rely on assumptions about the data or data not collected Different assumptions will lead to different models.
More informationLearner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display
CURRICULUM MAP TEMPLATE Priority Standards = Approximately 70% Supporting Standards = Approximately 20% Additional Standards = Approximately 10% HONORS PROBABILITY AND STATISTICS Essential Questions &
More informationFrequencies, Unequal Variance Weights, and Sampling Weights: Similarities and Differences in SAS
ABSTRACT Paper 1938-2018 Frequencies, Unequal Variance Weights, and Sampling Weights: Similarities and Differences in SAS Robert M. Lucas, Robert M. Lucas Consulting, Fort Collins, CO, USA There is confusion
More informationData Analyst Nanodegree Syllabus
Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working
More information- 1 - Fig. A5.1 Missing value analysis dialog box
WEB APPENDIX Sarstedt, M. & Mooi, E. (2019). A concise guide to market research. The process, data, and methods using SPSS (3 rd ed.). Heidelberg: Springer. Missing Value Analysis and Multiple Imputation
More informationScreening Design Selection
Screening Design Selection Summary... 1 Data Input... 2 Analysis Summary... 5 Power Curve... 7 Calculations... 7 Summary The STATGRAPHICS experimental design section can create a wide variety of designs
More informationData Analysis Guidelines
Data Analysis Guidelines DESCRIPTIVE STATISTICS Standard Deviation Standard deviation is a calculated value that describes the variation (or spread) of values in a data set. It is calculated using a formula
More informationApplied Regression Modeling: A Business Approach
i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming
More informationMean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242
Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242 Creation & Description of a Data Set * 4 Levels of Measurement * Nominal, ordinal, interval, ratio * Variable Types
More informationAnalysis of Variance in R
nalysis of Variance in R Dale arr R Training: University of Glasgow Dale arr (R Training: University of Glasgow) nalysis of Variance in R 1 / 19 When is NOV applicable? When you wish to assess the independent/joint
More informationAnalysis of Two-Level Designs
Chapter 213 Analysis of Two-Level Designs Introduction Several analysis programs are provided for the analysis of designed experiments. The GLM-ANOVA and the Multiple Regression programs are often used.
More information