
ST512 Fall Quarter, 2005 Exam 1

Name:

Directions: Answer questions as directed. Please show work. For true/false questions, circle either true or false.

1. (42 points) A random sample of n = 30 NBA basketball players is drawn and heights and weights are recorded. The sample means, standard deviations, and corrected sums of squares for x and y are given in the table below:

   Variable      Mean        St. Dev.    Corr. sum of squares
   x : height    79 in       4.2 in      507 in²
   y : weight    224.6 lbs   32 lbs      29623 lbs²

The sample covariance is s_xy = 110.3 (lb-inches). In answering the questions below, make the usual assumption that errors about the mean are iid normal.

(a) Report the sample correlation coefficient for height and weight.

(b) Give the least squares regression line for a regression of weight on height.

(c) Report the MSE from the regression analysis.
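For reference, a minimal SAS data-step sketch of how the quantities in parts (a)-(c) can be computed from the summary statistics above; the data set name Q1 and all variable names are illustrative, not part of the exam:

   data q1;
      n = 30; xbar = 79; ybar = 224.6;          /* sample size and means               */
      ssxx = 507; ssyy = 29623; sxy = 110.3;    /* corrected SS and sample covariance  */
      ssxy = (n - 1)*sxy;                       /* corrected sum of cross products     */
      r    = ssxy / sqrt(ssxx*ssyy);            /* (a) sample correlation coefficient  */
      b1   = ssxy / ssxx;                       /* (b) least squares slope             */
      b0   = ybar - b1*xbar;                    /*     and intercept                   */
      mse  = (ssyy - b1*ssxy) / (n - 2);        /* (c) MSE on n - 2 df                 */
   run;
   proc print data=q1; run;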

(d) Use the model to estimate the mean weight among players who are x = 80 inches tall.

(e) Using the fact that sqrt{(1, 80)(X'X)^{-1}(1, 80)' · MSE} = 3.46, report a 95% confidence interval for this mean.

(f) The width of a 95% confidence interval for the mean among players of height x = 84 is (bigger / smaller / equal to) the width of the interval in part (e). (Circle one.)

(g) A player has a weight clause in his contract that says that if he is over the 97.5th percentile of league weight among players with similar height, he is subject to a fine. Danny Fortson (Seattle Supersonics) weighs y = 260 lbs and is x = 80 inches tall. The team wants to impose the fine on him, arguing that he is outside the 95% confidence interval computed in part (e). Do you agree? Support your argument briefly, with appropriate calculations if necessary.
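Similarly, a sketch for parts (d) and (e), treating 3.46 as the standard error given in part (e) and using TINV for the t quantile on n - 2 = 28 degrees of freedom (names again illustrative):

   data q1de;
      n = 30; xbar = 79; ybar = 224.6;
      ssxx = 507; sxy = 110.3;
      b1 = (n - 1)*sxy / ssxx;            /* slope and intercept as in part (b)      */
      b0 = ybar - b1*xbar;
      yhat80 = b0 + b1*80;                /* (d) estimated mean weight at x = 80     */
      tcrit  = tinv(0.975, n - 2);        /* 97.5th percentile of t with 28 df       */
      lower  = yhat80 - tcrit*3.46;       /* (e) 95% CI for the mean weight at x = 80 */
      upper  = yhat80 + tcrit*3.46;
   run;
   proc print data=q1de; run;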

2. (18 points) A simple linear regression model for a random sample of n measurements assumes that, given x_1, ..., x_n, the mean of the response y is a linear function of x:

   Y_i = β_0 + β_1 x_i + E_i,   i = 1, ..., n.

(a) Briefly list all of the additional assumptions about E_i that we make that are necessary for small-sample inference about the regression coefficients.

(b) Let ŷ_i denote the fitted value on the least squares regression line. Label the sums of squares below as corrected total (TOT), or as being due to Regression (R) or Error (E). (Fill in the blanks.)

   Σ_{i=1}^n (y_i − ŷ_i)²    SS[   ]
   Σ_{i=1}^n (y_i − ȳ)²      SS[   ]
   Σ_{i=1}^n (ŷ_i − ȳ)²      SS[   ]

Also, express one sum of squares as the sum of the other two.
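For reference, the three sums of squares in part (b) are the ones PROC REG reports in its Analysis of Variance table; a minimal call, with a hypothetical data set SAMPLE containing a response y and a predictor x:

   proc reg data=sample;
      model y = x;     /* ANOVA table lists Model, Error, and Corrected Total sums of squares */
   run;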

3. (40 points) Three measurements are made on each of n = 31 trees randomly sampled from a population of interest: y = volume (in cubic ft), x_1 = girth (in inches), x_2 = height (in feet). Let X denote the design matrix for a multiple linear regression model of y on x_1 and x_2:

   Y_i = β_0 + β_1 x_i1 + β_2 x_i2 + E_i

The partial output at the end of the exam was generated using the SAS code below. Use it to answer the given questions.

   proc reg;
      model volume = girth height / xpx i ss1 ss2;
   run;

(a) Fill in the blank space in the output labelled AAA.

(b) Fill in the blank space in the output labelled BBB.

(c) Fill in the blank space in the output labelled CCC.

(d) Specify the dimension of the design matrix X.

(e) Give the matrix product (X'X)^{-1}X'Y.

(f) Give the extra sum of squares for girth after controlling for height, R(β_1 | β_0, β_2).

(g) Give the regression sum of squares for a simple linear regression of volume on height only.
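An illustrative PROC IML sketch of the matrix quantities in parts (d) and (e), assuming a hypothetical data set TREES holding the variables girth, height, and volume (the exam supplies only the PROC REG output, not the raw data):

   proc iml;
      use trees;                            /* hypothetical data set with the 31 trees        */
      read all var {girth height} into X0;
      read all var {volume}       into Y;
      X = j(nrow(X0), 1, 1) || X0;          /* 31 x 3 design matrix: intercept, girth, height */
      bhat = inv(X`*X) * X`*Y;              /* (X'X)^{-1} X'Y, the least squares estimates    */
      print bhat;
   quit;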

(h) When the product girth × height is added to the model, the least squares regression equation becomes

   ŷ = 69.4 − 5.86 x_1 − 1.3 x_2 + 0.135 x_1 x_2

and the unexplained error is quantified by SS[E] = 198 on 27 df. Formulate a test comparing the additive model with the interactive model. Specify a null hypothesis H_0 and report an F-ratio, along with associated degrees of freedom. Using α = 0.05, is there evidence to select a model on the basis of this test?

(i) Use the complex model to estimate the mean volume among trees that are x_1 = 15 inches around and x_2 = 80 feet tall. Estimate the standard deviation of volumes among such a population of trees.

(j) Going back to the additive model, it may be shown from the output that the correlation between the slopes is corr(β̂_1, β̂_2) = −0.52. True or false: if another response, such as crown width, z, were measured on each of these trees and z were regressed on x_1 and x_2, then the correlation between the slopes β̂_1 and β̂_2 would still be −0.52. Explain briefly.
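A sketch of the F-ratio in part (h), using the error sum of squares 421.92136 on 28 df for the additive model (from the output below) and 198 on 27 df for the interaction model (as given in part (h)); FINV returns the α = 0.05 critical value:

   data q3h;
      sse_add = 421.92136; df_add = 28;     /* additive model, from the SAS output            */
      sse_int = 198;       df_int = 27;     /* model including the girth × height interaction */
      f     = ((sse_add - sse_int) / (df_add - df_int)) / (sse_int / df_int);
      fcrit = finv(0.95, df_add - df_int, df_int);
   run;
   proc print data=q3h; run;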

The SAS System

The REG Procedure

Model Crossproducts X'X X'Y Y'Y

Variable         Intercept            girth           height           volume
Intercept              AAA            410.7             2356            935.3
girth                410.7          5736.55          31524.7         13887.86
height                2356          31524.7           180274          72962.6
volume               935.3         13887.86          72962.6         36324.99

X'X Inverse, Parameter Estimates, and SSE

Variable         Intercept            girth           height           volume
Intercept     4.9519429276      0.028680223     -0.069732257     -57.98765892
girth                  BBB              CCC     -0.001185265      4.708160503
height        -0.069732257     -0.001185265     0.0011241461     0.3392512342
volume        -57.98765892      4.708160503     0.3392512342     421.92135922

Analysis of Variance

                                   Sum of           Mean
Source               DF          Squares         Square     F Value     Pr > F
Model                 2       7684.16251     3842.08126      254.97     <.0001
Error                28        421.92136       15.06862
Corrected Total      30       8106.08387

Root MSE             3.88183     R-Square     0.9480
Dependent Mean      30.17097     Adj R-Sq     0.9442

                     Parameter       Standard
Variable     DF       Estimate          Error     t Value     Pr > |t|       Type I SS
Intercept     1      -57.98766        8.63823       -6.71       <.0001           28219
girth         1        4.70816        0.26426       17.82       <.0001      7581.78133
height        1        0.33925        0.13015        2.61       0.0145       102.38118

Variable     DF      Type II SS
Intercept     1       679.04025
girth         1      4782.97364
height        1       102.38118