Exam 4. In the above, label each of the following with the problem number. 1. The population Least Squares line. 2. The population distribution of x.

Similar documents
1. Determine the population mean of x denoted m x. Ans. 10 from bottom bell curve.

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

Relations and Functions 2.1

IQR = number. summary: largest. = 2. Upper half: Q3 =

Multiple Regression White paper

26, 2016 TODAY'S AGENDA QUIZ

Descriptive Statistics, Standard Deviation and Standard Error

Linear Regression. Problem: There are many observations with the same x-value but different y-values... Can t predict one y-value from x. d j.

Analog input to digital output correlation using piecewise regression on a Multi Chip Module

STA Module 4 The Normal Distribution

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

FUNCTIONS AND MODELS

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Section 2.3: Simple Linear Regression: Predictions and Inference

Algebra 2 Chapter Relations and Functions

An Introduction to Growth Curve Analysis using Structural Equation Modeling

Regression Analysis and Linear Regression Models

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

Chapter 2: Modeling Distributions of Data

Section E. Measuring the Strength of A Linear Association

Curve Fitting with Linear Models

8: Statistics. Populations and Samples. Histograms and Frequency Polygons. Page 1 of 10

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation

2.2 Graphs Of Functions. Copyright Cengage Learning. All rights reserved.

SLStats.notebook. January 12, Statistics:

LAB 2: DATA FILTERING AND NOISE REDUCTION

Chapter 2: The Normal Distribution

Statistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) *

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem.

appstats6.notebook September 27, 2016

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

3. Data Analysis and Statistics

BIOL Gradation of a histogram (a) into the normal curve (b)

Functions. Copyright Cengage Learning. All rights reserved.

Error Analysis, Statistics and Graphing

A straight line is the graph of a linear equation. These equations come in several forms, for example: change in x = y 1 y 0

Chapter 2: Linear Equations and Functions

Building Better Parametric Cost Models

Year 10 General Mathematics Unit 2

Chapter 1 Linear Equations

Confidence Intervals: Estimators

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

9.1. How do the nerve cells in your brain communicate with each other? Signals have. Like a Glove. Least Squares Regression

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015

Bivariate (Simple) Regression Analysis

2.1: Frequency Distributions and Their Graphs

Equating. Lecture #10 ICPSR Item Response Theory Workshop

Chapter 7: Linear regression

Continuous Improvement Toolkit. Normal Distribution. Continuous Improvement Toolkit.

E-Campus Inferential Statistics - Part 2

I-96 Case Study: Jointed Concrete Pavement Curling and Warp Presented in the Context of Pavement Asset Management

PSY 9556B (Feb 5) Latent Growth Modeling

Chapter 6. THE NORMAL DISTRIBUTION

Confidence Intervals. Dennis Sun Data 301

Table Of Contents. Table Of Contents

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

LAB 2: DATA FILTERING AND NOISE REDUCTION

Bivariate Linear Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Algebra II Notes Unit Two: Linear Equations and Functions

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS

An Introduction to the Bootstrap

Section 2.2: Covariance, Correlation, and Least Squares

IAT 355 Visual Analytics. Data and Statistical Models. Lyn Bartram

Section 1.1: Functions and Models

Chapter 6. THE NORMAL DISTRIBUTION

IT 403 Practice Problems (1-2) Answers

MATH 115: Review for Chapter 1

Stat 5100 Handout #11.a SAS: Variations on Ordinary Least Squares

Lab Activity #2- Statistics and Graphing

Practical OmicsFusion

Visualizing Data: Freq. Tables, Histograms

One Factor Experiments

Lecture 1: Statistical Reasoning 2. Lecture 1. Simple Regression, An Overview, and Simple Linear Regression

FITTING PIECEWISE LINEAR FUNCTIONS USING PARTICLE SWARM OPTIMIZATION

Pre-Calculus Mr. Davis

Normal Curves and Sampling Distributions

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha

Statistics 1 - Basic Commands. Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11}

3. parallel: (b) and (c); perpendicular (a) and (b), (a) and (c)

MDM 4UI: Unit 8 Day 2: Regression and Correlation

Chapter 5: The standard deviation as a ruler and the normal model p131

Section 2.1: Intro to Simple Linear Regression & Least Squares

GLM II. Basic Modeling Strategy CAS Ratemaking and Product Management Seminar by Paul Bailey. March 10, 2015

Correlation and Regression

Gelman-Hill Chapter 3

Name Per. a. Make a scatterplot to the right. d. Find the equation of the line if you used the data points from week 1 and 3.

Distributions of random variables

STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I. 4 th Nine Weeks,

Genotype x Environmental Analysis with R for Windows

CHAPTER 2 DESCRIPTIVE STATISTICS

Section 3.2: Multiple Linear Regression II. Jared S. Murray The University of Texas at Austin McCombs School of Business

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Your Name: Section: 2. To develop an understanding of the standard deviation as a measure of spread.

MODULE - 4. e-pg Pathshala

Solutions. Algebra II Journal. Module 2: Regression. Exploring Other Function Models

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

Transcription:

Exam 4 1-5. Normal Population. The scatter plot show below is a random sample from a 2D normal population. The bell curves and dark lines refer to the population. The sample Least Squares Line (shorter) is in red. 40 30 20 10 0 0 5 10 15 20 In the above, label each of the following with the problem number. 1. The population Least Squares line. 2. The population distribution of x. 3. The population sd of y (and indicate by a thick line segment). 4. The sd of all population y whose x-score is 19.

2 Exam4.nb 5. The population SD line. 6. A 68% y-interval for population points with x = 19. 7. The sample Least Squares Line. 8. Sketch in place the distribution of population y with x = 7. 9. The 95% prediction interval for population y with x = 7. 10. Use the sample Least Squares Line to read-off the regression-based estimator of m y if it is known that m x = 10. 11-20. Non-normal Population. A population need not be normal in order for Least Squares to be useful. The plot below shows a decidedly non-normal population of N = 7 points together with its population Least Squares Line. 12 10 8 6 4 2 2 3 4 5 6 7 8 9

Exam4.nb 3 x y x 2 y 2 xy 2 4 4 16 8 4 5 16 25 20 3 2 9 4 6 7 9 49 81 63 8 8 64 64 64 3 1 9 1 3 9 12 81 144 108 _ 5.14286 5.85714 33.1429 47.8571 38.8571 From the table determine the following. You should check the results using your calculator's built-in statistical routines. 11. m x 12. m y 13. s x 14. s y 15. r 16. Use these to re-plot the Least Squares Line in the above. 17. s 3 x - 2 (properties of population sd in relation to location or scale changes). 18. r 3 x - 2, 9 y + 4 (properties of population correlation in relation to location or scale changes). 19. The plot of the population is not at all normal, and the strength of regression is not great. Nor would the plot of any sample from this population look normal. However, the plot of 100 pairs (x, y ), each from a with-replacement sample of n = 30, is nearly normal. The importance of this is that various issues surrounding the joint variation of (x, y ) are subject to normal theory even though the population is not normal.

4 Exam4.nb 12 10 8 6 4 2 2 3 4 5 6 7 8 9 The sample least squares line for 100 pairs (x, y ) is shown in red (see enlargement). Moreover, almost any sample of n = 30 from the population can give us a good idea of the population Least Squares Line. Below, I've overlaid the above plot with one such sample of 30 (there are only 7 points in the population so the sample of 30 falls entirely on these 7 but with unequal numbers because of random sampling). Owing to the unequal representation of the 7 population points in the sample of 30 the sample Least Squares Line does not fall so perfectly on the population Least Squares Line as has the sample L.S. for 100 pairs (x, y ).

Exam4.nb 5 12 10 8 6 4 2 2 3 4 5 6 7 8 9 Identify (1) the sample L.S. line from n = 30 pairs (x, y) in plot and enlargement above. Identify (2) the sample L.S. line from n = 100 pairs (x, y ) in plot and enlargement above.

6 Exam4.nb 20. Which are correct? A random sample of large n from a population is likely to produce a least squares line close to the population least squares line. The variation of sample values (x, y) from a population is universally going to exhibit normal looking plots of the points (x, y) provided n is large. A random sample of large n of random sample pairs (x, y ) is likely to show plot consistent with normal variation even if the underlying population is not normal. 21-22. Sampling Distribution slope b 1 of sample regression line. This is keyed to #14 and #16 of page 748. s x = Hx - x L 2 ë Hn - 1L = n n-1 x 2 - Hx L 2 s e = Iy - ỳm 2 n í Hn - 2L = 1 - r 2 y 2 - Hy L 2 n-2 s e SE(b 1 ) = estimate of SD of b 1 n-1 s x = 1-r 2 s y n-2 s x This estimate of the standard deviation of b 1 is useful for CI and tests about the population slope b 1. If the population is 2D normal then we have CI for b 1 : b 1 ± t df = n-2 SE(b 1 ) (exact)

Exam4.nb 7 test statistic b 1-b 1 SEHb 1 L has exactly t df = n-2 distribution For large n, even if the population is not normal CI for b 1 : b 1 ± z SE(b 1 ) (approximate) test statistic b 1-b 1 SEHb 1 L As applied to #14 and #16 pg. 748 has ~ Z distribution. x y x 2 y 2 xy 1 13 990 1 195 720 100 13 990 1 13 495 1 182 115 025 13 495 3 12 999 9 168 974 001 38 997 4 9 16 90 250 000 38 000 4 10 495 16 110 145 025 41 980 5 8995 25 80 910 025 44 975 5 9494 25 90 136 036 47 470 6 6999 36 48 986 001 41 994 7 6950 49 48 302 48 650 7 7850 49 61 622 54 950 8 6999 64 48 986 001 55 992 8 5995 64 35 940 025 47 960 10 4950 100 24 502 49 10 4495 100 20 205 025 44 950 13 2850 169 8 122 37 050 _ 6.13333 8403.73 48.2667 8.09945 µ 10 7 41 330.2

8 Exam4.nb 14 000 12 000 10 000 8000 6000 4000 2000 2 4 6 8 10 12 x = age (yr), y = advertised price (dollars) n = 15 pairs (x, y) means (6.1333, 8403.73) s x = 3.3778 s y = 3333.56 r = -0.971767 t 13 = 2.16 for 95% b 1 = -959.039 SEHb 1 L = 64.5816 (applicable df = 15-2 = 13) 95%CI = -959.039 + {-1, 1} (2.16) (64.5816) = {-1098.54, -819.543} Plot of residuals vs. x = year: 0

b 1 = -959.039 SEHb 1 L = 64.5816 (applicable df = 15-2 = 13) 95%CI = -959.039 + {-1, 1} (2.16) (64.5816) = {-1098.54, -819.543} Exam4.nb 9 Plot of residuals vs. x = year: 0 - - -1-2000 2 4 6 8 10 12 Plot of residuals vs. predicted values as favored by the textbook: 1 0

10 Exam4.nb Plot of residuals vs. predicted values as favored by the textbook: 1 0 - - -1 2000 4000 6000 8000 10 000 12 000 Normal Probability Plot of residuals. orderstat 1

Exam4.nb 11 Normal Probability Plot of residuals. orderstat 1-1.5-1.0-0.5 0.5 1.0 1.5 zpercentile - - -1