THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

Similar documents
THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533. Time: 50 minutes 40 Marks FRST Marks FRST 533 (extra questions)

Getting Correct Results from PROC REG

IQR = number. summary: largest. = 2. Upper half: Q3 =

ST512. Fall Quarter, Exam 1. Directions: Answer questions as directed. Please show work. For true/false questions, circle either true or false.

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem.

Outline. Topic 16 - Other Remedies. Ridge Regression. Ridge Regression. Ridge Regression. Robust Regression. Regression Trees. Piecewise Linear Model

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

STANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part II. 3 rd Nine Weeks,

STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I. 4 th Nine Weeks,

Centering and Interactions: The Training Data

CH5: CORR & SIMPLE LINEAR REFRESSION =======================================

Regression Analysis and Linear Regression Models

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Measures of Dispersion

STAT:5201 Applied Statistic II

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

5.5 Regression Estimation

Bivariate (Simple) Regression Analysis

Robust Linear Regression (Passing- Bablok Median-Slope)

Stat 5100 Handout #6 SAS: Linear Regression Remedial Measures

CSC 328/428 Summer Session I 2002 Data Analysis for the Experimenter FINAL EXAM

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015

CREATING THE ANALYSIS

Statistics Lab #7 ANOVA Part 2 & ANCOVA

Exam 4. In the above, label each of the following with the problem number. 1. The population Least Squares line. 2. The population distribution of x.

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Stat 5100 Handout #19 SAS: Influential Observations and Outliers

Linear Regression in two variables (2-3 students per group)

SAS data statements and data: /*Factor A: angle Factor B: geometry Factor C: speed*/

Week 4: Simple Linear Regression III

An introduction to SPSS

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors

MAT 110 WORKSHOP. Updated Fall 2018

Fathom Dynamic Data TM Version 2 Specifications

Chapter 6. THE NORMAL DISTRIBUTION

Stat 5303 (Oehlert): Unbalanced Factorial Examples 1

Sections 4.3 and 4.4

Linear Regression. Problem: There are many observations with the same x-value but different y-values... Can t predict one y-value from x. d j.

THE L.L. THURSTONE PSYCHOMETRIC LABORATORY UNIVERSITY OF NORTH CAROLINA. Forrest W. Young & Carla M. Bann

Name Per. a. Make a scatterplot to the right. d. Find the equation of the line if you used the data points from week 1 and 3.

STA 570 Spring Lecture 5 Tuesday, Feb 1

ST Lab 1 - The basics of SAS

Chapter 6. THE NORMAL DISTRIBUTION

Model Selection and Inference

Loki s Practice Sets for PUBP555: Math Camp Spring 2014

Brief Guide on Using SPSS 10.0

SPSS QM II. SPSS Manual Quantitative methods II (7.5hp) SHORT INSTRUCTIONS BE CAREFUL

Math B Regents Exam 0607 Page b, P.I. A.G.4 Which equation is best represented by the accompanying graph?

Multiple Linear Regression

Lab #9: ANOVA and TUKEY tests

Chapters 5-6: Statistical Inference Methods

Cell means coding and effect coding

STATS PAD USER MANUAL

2) In the formula for the Confidence Interval for the Mean, if the Confidence Coefficient, z(α/2) = 1.65, what is the Confidence Level?

Two-Stage Least Squares

Table Of Contents. Table Of Contents

1. How would you describe the relationship between the x- and y-values in the scatter plot?

How individual data points are positioned within a data set.

Chapter 5: The standard deviation as a ruler and the normal model p131

Conditional and Unconditional Regression with No Measurement Error

MATH 115: Review for Chapter 1

Generalized Least Squares (GLS) and Estimated Generalized Least Squares (EGLS)

Section 3.2: Multiple Linear Regression II. Jared S. Murray The University of Texas at Austin McCombs School of Business

Sec 6.3. Bluman, Chapter 6 1

G r a d e 1 0 I n t r o d u c t i o n t o A p p l i e d a n d P r e - C a l c u l u s M a t h e m a t i c s ( 2 0 S )

Analysis of variance - ANOVA

Simulating Multivariate Normal Data

PLS205 Lab 1 January 9, Laboratory Topics 1 & 2

Baruch College STA Senem Acet Coskun

We have seen that as n increases, the length of our confidence interval decreases, the confidence interval will be more narrow.

For our example, we will look at the following factors and factor levels.

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables:

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression

Chapter 6: DESCRIPTIVE STATISTICS

Distributions of Continuous Data

Unit 8 SUPPLEMENT Normal, T, Chi Square, F, and Sums of Normals

Introductory Guide to SAS:

Section 3.4: Diagnostics and Transformations. Jared S. Murray The University of Texas at Austin McCombs School of Business

Section E. Measuring the Strength of A Linear Association

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression

Quantitative - One Population

One Factor Experiments

Applied Regression Modeling: A Business Approach

Lecture 3 Questions that we should be able to answer by the end of this lecture:

STANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part I. 4 th Nine Weeks,

MINITAB 17 BASICS REFERENCE GUIDE

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY

Multiple Regression White paper

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Review of JDemetra+ revisions plug-in

8. MINITAB COMMANDS WEEK-BY-WEEK

Introduction to Statistical Analyses in SAS

Week 5: Multiple Linear Regression II

Math B Regents Exam 0606 Page 1

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010

ALGEBRA 2 W/ TRIGONOMETRY MIDTERM REVIEW

range: [1,20] units: 1 unique values: 20 missing.: 0/20 percentiles: 10% 25% 50% 75% 90%

SLStats.notebook. January 12, Statistics:

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

Multivariate Analysis Multivariate Calibration part 2

Transcription:

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL STOR 455 Midterm September 8, INSTRUCTIONS: BOTH THE EXAM AND THE BUBBLE SHEET WILL BE COLLECTED. YOU MUST PRINT YOUR NAME AND SIGN THE HONOR PLEDGE ON THE BUBBLE SHEET. YOU MUST BUBBLE-IN YOUR NAME & YOUR STUDENT IDENTIFICATION NUMBER. EACH QUESTION HAS ONLY ONE CORRECT CHOICE (decimals may need rounding). USE "NUMBER " PENCIL ONLY - DO NOT USE INK - FILL BUBBLE COMPLETELY. NO NOTES OR REMARKS ARE ACCEPTED - DO NOT TEAR OR FOLD THE BUBBLE SHEET. A GRADE OF ZERO WILL BE ASSIGNED FOR THE ENTIRE EXAM IF THE BUBBLE SHEET IS NOT FILLED OUT ACCORDING TO THE ABOVE INSTRUCTIONS. QUESTIONS are worth point each.. Using the standard normal distribution tables, what is the area under the standard normal curve corresponding to. < Z <.5? A).5764 B).385 C).85 D).8849 Use the following to answer questions -5: A friendly STOR455 instructor wants to find out if there is a difference between the final exam scores and the midterm exam scores. He takes a random sample of five students and finds the following scores (assume normal distribution). Student 3 4 5 Midterm 5 4 7 5 Final 4 8 9 9. The target population of numbers is A) All midterm scores in the class B) All final scores in the class C) All students in the class D) Bivariate midterm and final scores of all the students in the class E) None of the above

3. What is a 9% confidence interval for μ, the mean difference between final and midterm scores? A) (-3.95,.75) B) (-6.6, 5.6) C) (-4.95, 3.75) D) (-4.6, 3.4) E) Not within ±.5 of any of the above. 4. Based on the confidence interval previously calculated, we wish to test H: μ = versus Ha: μ at the 5% significance level. Determine which of the following statements is true: A) We cannot make a decision since the confidence level we used to calculate the confidence interval is 9%, and we would need a 95% confidence interval. B) We accept H, since the value falls in the 9% confidence interval and would therefore also fall in the wider 95% confidence interval. C) We reject H, since the value falls in the 9% confidence interval. D) We cannot make a decision since the confidence interval is too wide. E) None of the above 5. It is suspected that the material on the final is harder than on the midterm. Therefore you would like to prove that the final exam scores are lower than the midterm scores. The appropriate hypothesis for μ, the mean difference between second and first exam scores is: A) H μ= vs. Ha μ B) H μ= vs. Ha μ< C) H μ= vs. Ha μ> D) None of the above 6. In statistical tests of hypotheses, we say the data are significant at level α if A) α =. B) the p-value is less than or equal to α C) α is small D) the p-value is larger than α E) α has nothing to do with statistical significance. Use the following to answer questions 7-9: Do heavier cars use more gasoline? To answer this question, a researcher randomly selected 5 cars registered in North Carolina. He collected data about the weight (in hundreds of pounds) and the mileage (mpg) for each car. From a scatterplot made with the data, a linear model seems appropriate. 7. The study population of items in this study are A) all cars in NC B) mileage C) weight D) both B, C 8. The response variable in this study is A) weight B) mileage C) both D) neither E) not enough info

9. The equation of the least-squares regression line is y ˆ = 4.4.5 x Which of the following descriptions of the value of the slope is the correct description? A) The mileage is expected to increase by.5 when the weight of a car increases by pound. B) The mileage is expected to decrease by.5 when the weight of a car increases by pounds. C) We cannot interpret the slope because we cannot have a negative weight of a car. D) None of the above. Four different residual plots are shown below. Which plots indicate that the linear regression model is not appropriate? 4 4 Residual - Residual - -4 A) B) -4 4 4 Residual Residual - - -4 C) D) E) Both A and D -4. John s parents recorded his height at various ages between 36 and 66 months. Below is a record of the results: Age (months) 36 48 54 6 66 Height (inches) 34 38 4 43 45 Which of the following is the equation of the least-squares regression line of John s height on age? A) Height =.(Age) B) Height = Age/ C) Height = 6..(Age) D) Height =.46 +.37.(Age) E) None of the above

. What do we hope to capture within a confidence interval? A) The parameter estimate. B) The unknown confidence level. C) The unknown parameter value. D) The sample size. E) None of the above. A = Use the following to answer questions 3 4 3 3 3 B = B 5 3 C 3 3 5 @ A C = 5 3. Which of the following matrices can be multiplied together? A) C.D B) A.D C) B.A D) A.B E) None of the above 3 @ 5 3 A 3 D = 5 3 4. Compute D.C 4 6 A) @ 35 A B) 7 6 9 E) None of the above B is correct 8 4 C) 9 8 7 E) None of the above C is correct 5. Find the inverse of A) B) C) D) D) B @ 8 4 9 8 7 7 4 3 8 6 C A In questions 6 indicate what (if anything) will the SAS code produce. (Assume that a data set blah containing variables x and y exists in SAS memory.) 6. proc reg data=blah noprint; model y=x; plot y*x; run; A) nothing, the code is incorrect B) normal quantile (rankit) plot C) plot of standardized residuals vs predicted values D) plot of y vs x with the regression line overlaid

7. proc reg data=blah noprint; model y=x; plot student.*nqq.; run; A) nothing, the code is incorrect B) normal quantile (rankit) plot for standardized residuals C) plot of standardized residuals vs predicted values D) plot of y vs x with the regression line overlaid 8. proc reg data=blah noprint; model y=x; output out=blah student=sr p=yhat; run; A) nothing, the code is incorrect B) normal quantile (rankit) plot C) plot of standardized residuals vs predicted values D) plot of y vs x with the regression line overlaid (the code creates a data set blah with variables sr and yhat added to the old data set) 9. proc reg data=blah model y=x; plot r.*p; run; A) nothing, the code is incorrect B) normal quantile (rankit) plot C) plot of standardized residuals vs predicted values D) plot of y vs x with the regression line overlaid Use the following SAS output to answer questions - : The UNIVARIATE Procedure Variable: MPG Basic Confidence Limits Assuming Normality Parameter Estimate 95% Confidence Limits Mean 5.3 4.8447 5.7593 Std Deviation.6397.4397.675 Variance.4866.9335.36

. The 95% confidence interval for σ is A) 5.3 B) [.9,.36] C) [4.84, 5.76] D) [.44,.7] E) None of the above is close. The value of the sample mean is A) 5.3 B) [.9,.36] C) [4.84, 5.76] D) [.44,.7] E) None of the above is close. This output was produced using A) proc means B) proc reg C) proc print D) proc gplot (proc univariate) 3 Which of the following functions is linear in unknown parameters (symbols β)? p A) + x 4 B) + x C) x + x D) +sin( x) A is correct Use the following SAS output to answer questions 4-8: Model: MODEL Dependent Variable: Y Number of Observations Read 84 Number of Observations Used 84 Analysis of Variance Sum of Mean Source DF Squares Square F Value Pr > F Model 934694 934694 6.83 <. Error 8 4557365 555 Corrected Total 83 5487368 Root MSE 356.995 R-Square.73 Dependent Mean 7.38 Adj R-Sq.6 Coeff Var 33.3493 The REG Procedure Model: MODEL Dependent Variable: Y Parameter Estimates E) none Parameter Standard Variable DF Estimate Error t Value Pr > t 95% Confidence Limits Intercept 58 377.6469 6.6 <. 3997 738 X -7.5759 4.57433-4. <. -53.7977-87.876

4. The estimate of the standard deviation is A) 356.9 B) 58 C) 7.5759 D) 377.64 is close 5. The estimate of the intercept A) 356.9 B) 58 C) -7.5759 D) 377.64 is close 6. The estimate of the slope A) 356.9 B) 58 C) -7.5759 D) 377.64 is close 7. Is the slope significantly different from? A) yes B) no C) not enough info D) none of the above 8. Based on the normal quantile plot below is the assumption of normal distribution of the errors appropriate? A) yes because the plot follows roughly a line B) yes, since Rsq is big enough C) no, because Rsq<.5 D) no because there are outliers

9. What is wrong with the following regression plots? A) The variance is increasing B) The regression line does not fit C) There is an outlier D) nothing, the plot indicates a good fit E) None of the above 3. When computing a CI for standard deviation which table will I use? A) Normal Table B) T table C) Chi square table D)None of the above