Linear Regression in two variables (2-3 students per group)

Similar documents
Name Per. a. Make a scatterplot to the right. d. Find the equation of the line if you used the data points from week 1 and 3.

Graphical Analysis of Data using Microsoft Excel [2016 Version]

Why Use Graphs? Test Grade. Time Sleeping (Hrs) Time Sleeping (Hrs) Test Grade

1. How would you describe the relationship between the x- and y-values in the scatter plot?

Using Excel for Graphical Analysis of Data

Fall 2016 CS130 - Regression Analysis 1 7. REGRESSION. Fall 2016

SLStats.notebook. January 12, Statistics:

Microsoft Excel Using Excel in the Science Classroom

Year 10 General Mathematics Unit 2

Microsoft Excel 2007 Creating a XY Scatter Chart

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS

= 3 + (5*4) + (1/2)*(4/2)^2.

BIG IDEAS. A.REI.D.10: Interpret Graphs as Sets of Solutions. Lesson Plan

Using Excel for Graphical Analysis of Data

Note that ALL of these points are Intercepts(along an axis), something you should see often in later work.

Pre-Lab Excel Problem

Plotting Graphs. Error Bars

Chapter 3: Rate Laws Excel Tutorial on Fitting logarithmic data

Reading: Graphing Techniques Revised 1/12/11 GRAPHING TECHNIQUES

Graphing on Excel. Open Excel (2013). The first screen you will see looks like this (it varies slightly, depending on the version):

Linear Functions. College Algebra

1 Introduction to Using Excel Spreadsheets

Linear Topics Notes and Homework DUE ON EXAM DAY. Name: Class period:

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

0 Graphical Analysis Use of Excel

CS130 Regression. Winter Winter 2014 CS130 - Regression Analysis 1

Dealing with Data in Excel 2013/2016

1. What specialist uses information obtained from bones to help police solve crimes?

Excel Functions & Tables

Chemistry 1A Graphing Tutorial CSUS Department of Chemistry

252 APPENDIX D EXPERIMENT 1 Introduction to Computer Tools and Uncertainties

Make a new column in your table, zbeers, which will be the z-score of Beers To do that:

Reference and Style Guide for Microsoft Excel

Mathematics 9 Exploration Lab Scatter Plots and Lines of Best Fit. a line used to fit into data in order to make a prediction about the data.

Section : Modelling Data with a Line of Best Fit and a Curve of Best Fit

[ Dear student you may consult with other students for help, but under no circumstances take someone work and claim it to be yours]

Chapter 8: Regression. Self-test answers

WEEK 4 REVIEW. Graphing Systems of Linear Inequalities (3.1)

STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I. 4 th Nine Weeks,

Learning Packet THIS BOX FOR INSTRUCTOR GRADING USE ONLY. Mini-Lesson is complete and information presented is as found on media links (0 5 pts)

Revision Topic 11: Straight Line Graphs

/4 Directions: Graph the functions, then answer the following question.

Here is the data collected.

STANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part II. 3 rd Nine Weeks,

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

8.NS.1 8.NS.2. 8.EE.7.a 8.EE.4 8.EE.5 8.EE.6

Linear and Quadratic Least Squares

Chapter 5 Accumulating Change: Limits of Sums and the Definite Integral

Microsoft Word for Report-Writing (2016 Version)

[1] CURVE FITTING WITH EXCEL

Math 1: Solutions to Written Homework 1 Due Friday, October 3, 2008

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

Lesson 76. Linear Regression, Scatterplots. Review: Shormann Algebra 2, Lessons 12, 24; Shormann Algebra 1, Lesson 94

Homework 1 Excel Basics

Measuring the Stack Height of Nested Styrofoam Cups

Excel Spreadsheets and Graphs

Tips and Guidance for Analyzing Data. Executive Summary

Section 1.1 The Distance and Midpoint Formulas; Graphing Utilities; Introduction to Graphing Equations

Integrated Math 1 Module 7 Honors Connecting Algebra and Geometry Ready, Set, Go! Homework Solutions

Activity Graphical Analysis with Excel and Logger Pro

Where we are. HOW do we apply the concepts of quadratics to better understand higher order polynomials?

UNIT 4 DESCRIPTIVE STATISTICS Lesson 2: Working with Two Categorical and Quantitative Variables Instruction

Algebra 2 Chapter Relations and Functions

Department of Chemical Engineering ChE-101: Approaches to Chemical Engineering Problem Solving Excel Tutorial VIII

Using Excel This is only a brief overview that highlights some of the useful points in a spreadsheet program.

Graphical Analysis TEACHER NOTES SCIENCE NSPIRED. Science Objectives. Vocabulary. About the Lesson. TI-Nspire Navigator. Activity Materials

Data Management Project Using Software to Carry Out Data Analysis Tasks

PR3 & PR4 CBR Activities Using EasyData for CBL/CBR Apps

addition + =5+C2 adds 5 to the value in cell C2 multiplication * =F6*0.12 multiplies the value in cell F6 by 0.12

Graphing with Microsoft Excel

Excel Primer CH141 Fall, 2017

Welcome to class! Put your Create Your Own Survey into the inbox. Sign into Edgenuity. Begin to work on the NC-Math I material.

Rockefeller College MPA Excel Workshop: Clinton Impeachment Data Example

GRAPHS AND STATISTICS Residuals Common Core Standard

Introduction to Excel Workshop

Direct and Partial Variation. Lesson 12

Work Session 16a: Massaging data graphically

An introduction to plotting data

Algebra II Chapter 5

Intermediate Microsoft Excel (Demonstrated using Windows XP) Using Spreadsheets in the Classroom

Math 121 Project 4: Graphs

Scientific Graphing in Excel 2013

Lab1: Use of Word and Excel

Roger Ranger and Leo Lion

Select the Points You ll Use. Tech Assignment: Find a Quadratic Function for College Costs

Algebra 1 Semester 2 Final Review

Algebra 1 Notes Quarter

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015

Brenda Lynch TI Summit Algebra 1 October 20, 2012

ENV Laboratory 2: Graphing

Building Concepts: Moving from Proportional Relationships to Linear Equations

Appendix C. Vernier Tutorial

LINEAR TOPICS Notes and Homework: DUE ON EXAM

Wits Maths Connect Secondary Project Card-sorting Activities

W7 DATA ANALYSIS 2. Your graph should look something like that in Figure W7-2. It shows the expected bell shape of the Gaussian distribution.

Chemistry Excel. Microsoft 2007

FLC Ch 3. Ex 1 Plot the points Ex 2 Give the coordinates of each point shown. Sec 3.2: Solutions and Graphs of Linear Equations

MATH SPEAK - TO BE UNDERSTOOD AND MEMORIZED DETERMINING THE INTERSECTIONS USING THE GRAPHING CALCULATOR

Excel. Spreadsheet functions

A scatter plot Used to display values for typically for a set of

Transcription:

Linear Regression in two variables (2-3 students per group) 1. Choose a 2 or 3 person group. Turn in a list of group members to your instructor, in written form or through email, no later than February, 2015 2. Choose an applicable two-variable project for which you can collect at least 25 data sets. Choose a data set relevant to your declared major. Data should be collected carefully. Identify where you were when you collected the data. Topics should be appropriate in content (stay away from drug use and sexual activity). Identify your independent and dependent variables. Determine which variable is independent and which variable is dependent. 3. Collect data and organize it in a table. Directions for creating a table in Microsoft Word: Select the INSERT tab Table 2 columns and appropriate number of rows. 4. Create a graph from your data. Creating a scatter plot: In Microsoft Word select the INSERT tab X Y (Scatter). Enter data in the X and Y columns based on your independent and dependent variable designations. Close the excel data sheet and your graph should appear. Double click on the graph and chose from the upper left corner. From the pull down menu, highlight the following: Axes Axis Titles Chart Title Type chart and axis titles in the corresponding text boxes. Check your scatter plot for linearity. If the plot is not reasonably linear you must choose a new two-variable project. 5. Print out a copy of your graph and manually draw 3 different lines of best fit on your graph based on visual estimation. Make sure each line goes through 2 or more points so that you can determine the comparative slope of each line. A good qualitative estimation should be reasonably close to the actual line of best fit. 6. Interpret the slope for each line in the context of your data. Does your slope interpretation make sense in context?. Interpret your y-intercept in context. 8. Find a linear equation which models the line you feel fits the data best. 9. Define a reasonable domain for your data. 10. Write a contextual word problem: a. Write a contextual word problem to predict your dependent variable using a value within your domain but not included in your data. b. Write a contextual word problem to predict your dependent variable using a value outside your domain. Correctly solve both problems for the associated predictions. Discuss the reasonableness of your answer based in the context of the problem. The process of making predictions based on independent values outside of the data/domain is known as extrapolation. 11. Create a small poster displaying your table, graph and linear regression equation.

Sample Project Two Variable Linear Regression: TUTORING HOURS AND EXAM GRADES 2. Choose 2 variable relationship: Relationship between tutoring sessions and exam grades Determine Variables and dependency relationship Exam grade is affected (dependent) by tutoring sessions Independent variable X: Tutoring Session (T) Dependent variable Y: (G) 3. Collect data and organize in a table (create table in Microsoft Word or in Excel) Number of Tutoring Sessions 18 84 6 62 16 92 14 82 0 50 4 6 0 56 10 9 12 84 18 88 4. Create a graph from your data and check for linearity. Insert Chart below was produced from data. Is the data linear? The data appears to be reasonably linear. Axes Axis Titles Chart Title vs Hours Spent in Tutoring 100 90 80 0 60 50 40 30 20 10 0 0 2 4 6 8 10 12 14 16 18 20 Tutoring Hours

Below are some examples of non-linear scatter plots. Below are some examples of linear scatter plots. 5. Save your scatterplot for later use, and print out your graph and manually draw 3 different lines of best fit on your graph based on visual estimation. Create a key for each line by color or by type. vs Hours Spent in Tutoring 100 90 80 0 60 50 40 30 20 10 0 0 2 4 6 8 10 12 14 16 18 20 Tutoring Hours Line A: Line B: Line C: 6. SLOPE IN CONTEXT: (increase/decrease) Line A: 9 (approximately 1) Increasing the number of tutoring sessions by 8 corresponds to an exam 8 score increase of 9%. Exam scores will increase approximately 1% for every increase of one additional tutoring session. Line B: 26 = 13 (approximately 2 ) Increasing the number of tutoring sessions by (or 14) corresponds to 14 1 an exam score increase of 13% (or 26%). Exam scores will increase approximately 2% for every increase of one additional tutoring session.

Line C: 12 = 2 Increasing the number of tutoring sessions by 6 corresponds to an exam score increase of 6 1 12%. Exam scores will increase approximately 2% for every increase of one additional tutoring session. Check to make sure your interpretation of the slope is reasonable.. Estimate the y-interprets for your 3 lines and interpret each in context. There are three y-intercepts. The y-intercept represents the exam scores received when no tutoring has taken place. For example line C has a y-intercept of 50 which interpreted in context means an exam score of 50% would be predicted if a student does not attend any (0) tutoring sessions. 8. Find a linear equation which models the line you feel fits the data best. (calculations required) Line B was chosen and line B is modeled by the linear equation G = 13 T + 56 where G represents the exam grade and T represents the number of tutoring sessions. 9. Define a reasonable domain: A reasonable domain would be represented by a realistic/possible minimum/maximum for the independent and dependent variable. The domain (x axis) is often tied to realistic values in the range (y-axis). The domain in this example represents the number of tutoring sessions and the range represents possible exam scores. The domain (and range) has an obvious minimum of zero but no obvious maximum. The maximum exam score would be 100%. The maximum number of tutoring sessions corresponding to an exam score of 100% (use your equation). Domain: 0 T < 24. Check your graph for reasonableness. 10. Write a contextual word problem: If Sam went to tutoring 20 times, what grade would he expect on his exam based on the linear regression equation G = 13 T + 56? Answer: 93.14% If Sam went to tutoring every day for 4 weeks before an exam (28 times) what grade would he expect on his exam based on the linear regression equation G = 13 T + 56? Answer: 108% Important note Point out that association does not imply Causation Least Squares regression Assignment: Students will need to find a least squares regression line by randomly selecting 6 of their data points. This will keep the actual regression process from becoming cumbersome. For instance, from the example above, choose 6 points randomly. I used a graphing calculator from 1-10. They would randomly choose from 1-25. Number of Tutoring Sessions 18 84 6 62 16 92 14 82 0 50 4 6 0 56 10 9 12 84 18 88 Math PRBrandInt (1,25,6) (1-25, 6 numbers chosen ). Students would then follow the directions for finding the least squares regression line. The following example calculates the least squares regression line using all 10 points in the data set instead of just the 5 in red. Least Squares refers to the sum of all of the squared vertical distances between each point and the best fit line.

Finding a linear regression line and corresponding equation which best fits the data is done using the Least Squares Regression Method. This method is used to find the actual or true linear regression line/equation, and can be used to predict how y changes as x changes. The term Least Squares regression method calculates the best fit line by finding the least or smallest vertical distance between the line and all of the points. Explanation of terms and symbols: y = a + bx is the least squares regression equation, where a is the y-intercept and b is the slope. N is the number of data (x, y) points. In mathematics the Greek symbol Σ means sum so Σx means sum all the x values and Σxy means sum all the products of x times y. (Σx 2 ) is found by first squaring all x values and them summing all the square values, versus (Σx) 2 which involves summing all x values first and then squaring the total. The least squares regression formula: This formula is somewhat tedious but not computationally challenging (it just looks that way). a = (Σy)(Σx2 ) (Σx)(Σxy) N(Σx 2 ) (Σx) 2 b = N(Σxy) (Σx)(Σy) N(Σx 2 ) (Σx) 2 y = a + bx Using the data from the sample project, the different sums are as follows: (Use excel or the calculator table feature) Number of Tutoring Sessions x y 18 84 324 1512 6 62 36 32 16 92 256 142 14 82 196 1148 0 50 0 0 4 6 16 304 0 56 0 0 10 9 100 90 12 84 144 1008 18 88 324 1584 Σx = 98 Σy = 53 Σx 2 = 1396 Σxy = 8190 At the bottom of each column is the sum of each column. The missing calculations are N, (Σx)(Σy) and (Σx) 2 N = 10 (Σx)(Σy) = (98) (53) = 394 (Σx) 2 = (98) 2 = 9604 x 2 x y Plugging these values into the two formulas we get: a = (Σy)(Σx2 ) (Σx)(Σxy) = (53)(1396) (98)(8190) N(Σx 2 ) (Σx) 2 10(1396) 9604 = 105118 802620 13960 9604 = 248568 4356 = 5.06 (y-int) b = N(Σxy) (Σx)(Σy) N(Σx 2 ) (Σx) = 10(8190) (98)(53) 2 10(1396) 9604 = 81900 394 = 8106 13960 9604 4356 = 1.86 (slope) The linear regression equation is y = 5.06 + 1.86x (which can also be written as G = 1.86 T + 5.06) Notice our sample linear regression equation for line B, found by visual estimation, was very close. The previous sample equation was G = 13 T + 56 and if we write the slope as a decimal we get, G = 1.85 T + 56. Students should draw their calculated least squares regression line on a new copy of their original scatter plot (see step 5). Finally choose Insert to see the actual least squares regression line. Axes Axis Titles Chart Title Trend Line