THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533. Time: 50 minutes 40 Marks FRST Marks FRST 533 (extra questions)

Similar documents
EXST3201 Mousefeed01 Page 1

STAT:5201 Applied Statistic II

STAT 503 Fall Introduction to SAS

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

Land Cover Stratified Accuracy Assessment For Digital Elevation Model derived from Airborne LIDAR Dade County, Florida

Baruch College STA Senem Acet Coskun

STAT:5400 Computing in Statistics

* Sample SAS program * Data set is from Dean and Voss (1999) Design and Analysis of * Experiments. Problem 3, page 129.

Laboratory Topics 1 & 2

PLS205 Lab 1 January 9, Laboratory Topics 1 & 2

Centering and Interactions: The Training Data

ST512. Fall Quarter, Exam 1. Directions: Answer questions as directed. Please show work. For true/false questions, circle either true or false.

CSC 328/428 Summer Session I 2002 Data Analysis for the Experimenter FINAL EXAM

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem.

CH5: CORR & SIMPLE LINEAR REFRESSION =======================================

STA 570 Spring Lecture 5 Tuesday, Feb 1

Factorial ANOVA with SAS

Bivariate (Simple) Regression Analysis

Stat 5100 Handout #6 SAS: Linear Regression Remedial Measures

Generalized Least Squares (GLS) and Estimated Generalized Least Squares (EGLS)

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

Repeated Measures Part 4: Blood Flow data

Stat 5100 Handout #19 SAS: Influential Observations and Outliers

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors

PubHlth 640 Intermediate Biostatistics Unit 2 - Regression and Correlation. Simple Linear Regression Software: Stata v 10.1

Regression on the trees data with R

Sales Price of Laptops Based on Their Specifications. Hyunwoo Cho Jay Jung Gun Hee Lee Chan Hong Park Seoyul Um Mario Wijaya Team #10

Factorial ANOVA. Skipping... Page 1 of 18

Cell means coding and effect coding

SPSS. (Statistical Packages for the Social Sciences)

Getting Correct Results from PROC REG

SLStats.notebook. January 12, Statistics:

Outline. Topic 16 - Other Remedies. Ridge Regression. Ridge Regression. Ridge Regression. Robust Regression. Regression Trees. Piecewise Linear Model

Subset Selection in Multiple Regression

CREATING THE ANALYSIS

Analysis of variance - ANOVA

Introduction to Statistical Analyses in SAS

Multiple Regression White paper

Regression Analysis and Linear Regression Models

SAS/STAT 14.2 User s Guide. The SIMNORMAL Procedure

Salary 9 mo : 9 month salary for faculty member for 2004

Normal Plot of the Effects (response is Mean free height, Alpha = 0.05)

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

IQR = number. summary: largest. = 2. Upper half: Q3 =

Table Of Contents. Table Of Contents

5.5 Regression Estimation

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

Within-Cases: Multivariate approach part one

Week 4: Simple Linear Regression III

A. Incorrect! This would be the negative of the range. B. Correct! The range is the maximum data value minus the minimum data value.

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

15 Wyner Statistics Fall 2013

The SIMNORMAL Procedure (Chapter)

Algebra 1, 4th 4.5 weeks

TABEL DISTRIBUSI DAN HUBUNGAN LENGKUNG RAHANG DAN INDEKS FASIAL N MIN MAX MEAN SD

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables:

1 Downloading files and accessing SAS. 2 Sorting, scatterplots, correlation and regression

Fathom Dynamic Data TM Version 2 Specifications

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things.

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

36-402/608 HW #1 Solutions 1/21/2010

050 0 N 03 BECABCDDDBDBCDBDBCDADDBACACBCCBAACEDEDBACBECCDDCEA

Statistics Lab #7 ANOVA Part 2 & ANCOVA

CDAA No. 4 - Part Two - Multiple Regression - Initial Data Screening

ITSx: Policy Analysis Using Interrupted Time Series

Robust Linear Regression (Passing- Bablok Median-Slope)

An introduction to SPSS

Week 3/4 [06+ Sept.] Class Activities. File: week sep07.doc Directory: \\Muserver2\USERS\B\\baileraj\Classes\sta402\handouts

CREATING THE DISTRIBUTION ANALYSIS

. predict mod1. graph mod1 ed, connect(l) xlabel ylabel l1(model1 predicted income) b1(years of education)

Week 5: Multiple Linear Regression II

Lab Session 1. Introduction to Eviews

Example 1 of panel data : Data for 6 airlines (groups) over 15 years (time periods) Example 1

5b. Descriptive Statistics - Part II

Conditional and Unconditional Regression with No Measurement Error

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques

Introduction to Hierarchical Linear Model. Hsueh-Sheng Wu CFDR Workshop Series January 30, 2017

Stat 401 B Lecture 26

Eksamen ERN4110, 6/ VEDLEGG SPSS utskrifter til oppgavene (Av plasshensyn kan utskriftene være noe redigert)

Exploratory Data Analysis

Compare Linear Regression Lines for the HP-67

SOCY7706: Longitudinal Data Analysis Instructor: Natasha Sarkisian. Panel Data Analysis: Fixed Effects Models

Section 2.3: Simple Linear Regression: Predictions and Inference

Gelman-Hill Chapter 3

Cluster Randomization Create Cluster Means Dataset

Minitab 17 commands Prepared by Jeffrey S. Simonoff

Interval Estimation. The data set belongs to the MASS package, which has to be pre-loaded into the R workspace prior to use.

Applied Regression Modeling: A Business Approach

Measures of Dispersion

Two-Stage Least Squares

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Instruction on JMP IN of Chapter 19

FREQUENCIES VARIABLES=CAT_MSDS /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN MEDIAN MODE /ORDER=ANALYSIS.

An Econometric Study: The Cost of Mobile Broadband

THE L.L. THURSTONE PSYCHOMETRIC LABORATORY UNIVERSITY OF NORTH CAROLINA. Forrest W. Young & Carla M. Bann

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

Chapter 3. Descriptive Measures. Slide 3-2. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Paper ODS, YES! Odious, NO! An Introduction to the SAS Output Delivery System

Transcription:

THE UNIVERSITY OF BRITISH COLUMBIA FORESTRY 430 and 533 MIDTERM EXAMINATION: October 14, 2005 Instructor: Val LeMay Time: 50 minutes 40 Marks FRST 430 50 Marks FRST 533 (extra questions) This examination consists of 11 pages (2 questions). part-questions for FRST 533 students only. There are two extra (15) 1. As part of your short-term research job in South Africa, you collect data on biomass (kg) of Protea shrubs, and measure the shrub heights. Since biomass requires destructive sampling of the plants, you would like to develop an equation to estimate biomass from shrub heights. You have a very limited budget and collect data for only 10 plants. Also, you do not have money to buy a statistical package, and there is limited power in the area where you are staying. You manage to get the data into EXCEL, do some preliminary calculations and a graph, and then your laptop has no more power. Using the EXCEL values, calculate: a. the estimated slope and intercept b. the sum of squares Y, sum of squares regression, and sum of squares error c. the r 2 value d. the standard error of the estimate (SEE) e. Do you think you might need transformations of the variables, or is this a line? Is the first assumption of simple linear regression met? f. FRST 533 only: Is this a good equation in your opinion? How might you improve this equation? Give three possible ways. 1

plant Biomass (y) Height (x) biomass-mean (biomass-mean) sq. height-mean (height-mean) sq. (biomass-mean) X (height-mean) 1 20 1 3.4 11.56 0.06 0.0036 0.204 2 25 1.5 8.4 70.56 0.56 0.3136 4.704 3 13 0.9-3.6 12.96-0.04 0.0016 0.144 4 10 0.8-6.6 43.56-0.14 0.0196 0.924 5 22 1.2 5.4 29.16 0.26 0.0676 1.404 6 21 1.3 4.4 19.36 0.36 0.1296 1.584 7 15 0.4-1.6 2.56-0.54 0.2916 0.864 8 8 0.6-8.6 73.96-0.34 0.1156 2.924 9 23 1.4 6.4 40.96 0.46 0.2116 2.944 10 9 0.3-7.6 57.76-0.64 0.4096 4.864 mean 16.60 0.94 Sums: 0.00 362.40 0.00 1.56 20.56 biomass versus height biomass in kg 30 25 20 15 10 5 0 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 height in m etres 2

(25) 2. Regression was used to fit the following equation to predict crown ratio for birch trees: where: ALL : CR ˆ b + b ccf + b = 0 1 2 dbhsq b 0 to b 2 are the estimated coefficients; CR is the crown ratio (length of live crown relative to total tree height); Predictor variables are dbhsq (the measured tree diameter at 1.3 m above ground, squared), and ccf (crown competition factor) to represent competition (see SAS outputs starting on page 4). NOTE: Take values from the outputs where-ever possible ALSO indicate what alpha level you used for all tests. (a) Do the residuals meet the assumptions of regression using the residual plot and the normality plot? (b) What are the R 2 and SEE values? (c) Test whether the regression is significant. Show the hypothesis, test statistic, p-value or critical value from a table, and the decision. (d) Test whether each of the variables is significant. Show a general hypothesis for all variables to be tested. Then for each variable, give the test statistic, the p-value (or critical values from a table), and the decision. (e) Is this a good model? Give evidence to support your statement. (f) FRST 533 only: (a)how many equations would you need to do in order to get all possible regressions if there were three variables to use as predictors? (ii) How would you add a class variable, region, with three regions, to the equation? 3

all variables 1 The REG Procedure Model: ALL Dependent Variable: CR CR Number of Observations Read 93 Number of Observations Used 93 Analysis of Variance Sum of Mean Source DF Squares Square F Value Pr > F Model 2 1.04026 0.52013 25.22 <.0001 Error 90 1.85636 0.02063 Corrected Total 92 2.89661 Root MSE 0.14362 R-Square 0.3591 Dependent Mean 0.47097 Adj R-Sq 0.3449 Coeff Var 30.49428 Parameter Estimates Parameter Standard Variable Label DF Estimate Error t Value Pr > t Intercept Intercept 1 0.78709 0.08281 9.50 <.0001 CCF CCF 1-0.00217 0.00044287-4.89 <.0001 dbhsq 1 0.00039252 0.00009161 4.28 <.0001 4

all variables 2 Plot of resid1*pred1. Symbol used is '*'. 0.3 ˆ * * * 0.2 ˆ * ** * * * * * * * ** * * * * * 0.1 ˆ * * * * * * * * ** R * * e * * s 0.0 ˆ * * * i ** * * d ** * u * * a * * l -0.1 ˆ * ** * * *** * * * * ** -0.2 ˆ * * * * * ** * -0.3 ˆ * -0.4 ˆ Šˆƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒˆƒ 0.3 0.4 0.5 0.6 0.7 0.8 NOTE: 22 obs hidden. Predicted Value of CR 5

all variables 3 The UNIVARIATE Procedure Variable: resid1 (Residual) Moments N 93 Sum Weights 93 Mean 0 Sum Observations 0 Std Deviation 0.14204856 Variance 0.02017779 Skewness -0.2184176 Kurtosis -0.8541527 Uncorrected SS 1.85635697 Corrected SS 1.85635697 Coeff Variation. Std Error Mean 0.01472975 Basic Statistical Measures Location Variability Mean 0.000000 Std Deviation 0.14205 Median 0.005888 Variance 0.02018 Mode. Range 0.60953 Interquartile Range 0.24190 Tests for Location: Mu0=0 Test -Statistic- -----p Value------ Student's t t 0 Pr > t 1.0000 Sign M 1.5 Pr >= M 0.8358 Signed Rank S 36.5 Pr >= S 0.8897 Tests for Normality Test --Statistic--- -----p Value------ Shapiro-Wilk W 0.974452 Pr < W 0.0648 Kolmogorov-Smirnov D 0.085903 Pr > D 0.0894 Cramer-von Mises W-Sq 0.121594 Pr > W-Sq 0.0585 Anderson-Darling A-Sq 0.748436 Pr > A-Sq 0.0494 6

all variables 4 Quantiles (Definition 5) Quantile Estimate 100% Max 0.27880851 99% 0.27880851 95% 0.20440556 90% 0.18174849 75% Q3 0.12044012 50% Median 0.00588825 25% Q1-0.12145754 10% -0.19199049 5% -0.23280473 1% -0.33072324 0% Min -0.33072324 Extreme Observations ------Lowest------ ------Highest----- Value Obs Value Obs -0.330723 83 0.204406 54-0.266861 13 0.205416 31-0.262322 2 0.210140 36-0.259450 17 0.258583 25-0.232805 8 0.278809 7 Stem Leaf # Boxplot 2 68 2 2 0000011 7 1 55556678 8 1 1112233334 10 +-----+ 0 56667777788899 14 0 0111244 7 *--+--* -0 444432222220 12-0 999655 6-1 44433322211 11 +-----+ -1 9988755 7-2 33321 5-2 766 3-3 3 1 ----+----+----+----+ Multiply Stem.Leaf by 10**-1 7

all variables 5 The UNIVARIATE Procedure Variable: resid1 (Residual) Normal Probability Plot 0.275+ +++* * **+** **** ****** ****++ **++ -0.025+ **** ++** ***** +**** +**** *+** -0.325+*+++ +----+----+----+----+----+----+----+----+----+----+ -2-1 0 +1 +2 8