Performance Evaluation

Dan Lizotte

Evaluating Performance

Which do you prefer and why?

Performance of a Fixed Hypothesis (HTF , JWHT., 5)

Define the loss (error) of the hypothesis on an example $(x, y)$ as $L(h(x), y)$.

Suppose $(X, Y)$ is a vector-valued random variable. Then what is $L(h(X), Y)$?

Performance of a Fixed Hypothesis

Given a model $h$ (which could have come from anywhere), its generalization error is:

$$E[L(h(X), Y)]$$

Given a set of data points $(x_i, y_i)$ that are realizations of $(X, Y)$, we can compute the empirical error

$$l_{h,n} = \frac{1}{n} \sum_{i=1}^{n} L(h(x_i), y_i)$$

What is $l_{h,n}$?

Generalization error of hypotheses from last day

[Figure: log(generalization error) versus degree of polynomial.]
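The empirical error defined above is just an average of per-example losses, which a short sketch can make concrete. The squared-error loss, the hypothesis `h`, and the toy data points below are illustrative assumptions, not taken from the slides.

```python
from typing import Callable, Sequence, Tuple


def squared_loss(prediction: float, target: float) -> float:
    """L(h(x), y) for squared error; one possible choice of loss."""
    return (prediction - target) ** 2


def empirical_error(
    h: Callable[[float], float],
    data: Sequence[Tuple[float, float]],
    loss: Callable[[float, float], float] = squared_loss,
) -> float:
    """Average loss of the fixed hypothesis h over realizations (x_i, y_i)."""
    if not data:
        raise ValueError("need at least one data point")
    return sum(loss(h(x), y) for x, y in data) / len(data)


def h(x: float) -> float:
    """Hypothetical fixed hypothesis, for illustration only."""
    return 2.0 * x


# Toy realizations (x_i, y_i); assumed values.
data = [(0.0, 1.0), (1.0, 2.0), (2.0, 5.0)]
error = empirical_error(h, data)
print(error)
```

Because `h` is fixed before the data are seen, this average is a sample mean of the random variable $L(h(X), Y)$.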

Estimates of generalization error using points

[Figure: log(generalization error) versus degree of polynomial.]

Sample Mean

Given a dataset (collection of realizations) $x_1, x_2, \ldots, x_n$ of $X$, the sample mean is:

$$\bar{x}_n = \frac{1}{n} \sum_{i} x_i$$

Given a dataset, $\bar{x}_n$ is a fixed number.

We use $\bar{X}_n$ to denote the random variable corresponding to the sample mean computed from a randomly drawn dataset of size $n$.

Datasets and sample means

[Figure: datasets of size n = 5, sample means plotted in red.]

Statistics, Parameters, and Estimation

A statistic is any summary of a dataset. (E.g. $\bar{x}_n$, sample median.) A statistic is the result of a function applied to a dataset.

A parameter is any summary of the distribution of a random variable. (E.g. $\mu_X$, median.) A parameter is the result of a function applied to a distribution.

Estimation uses a statistic (e.g. $\bar{x}_n$) to estimate a parameter (e.g. $\mu_X$) of the distribution of a random variable.

Estimate: value obtained from a specific dataset
Estimator: function (e.g. sum, divide by n) used to compute the estimate
Estimand: parameter of interest

Sampling Distributions (AoS, p.6, q.9)

Given an estimate, how good is it?

The distribution of an estimator is called its sampling distribution.

Bias (AoS, p.9)

The expected difference between estimator and parameter. For example,

$$E[\bar{X}_n - \mu_X]$$

If 0, estimator is unbiased.

Sometimes $\bar{x}_n > \mu_X$, sometimes $\bar{x}_n < \mu_X$, but the long-run average of these differences will be zero.

Variance

The expected squared difference between estimator and its mean:

$$E[(\bar{X}_n - E[\bar{X}_n])^2]$$

Positive for all interesting estimators. For an unbiased estimator, this is

$$E[(\bar{X}_n - \mu_X)^2]$$

Sometimes $\bar{x}_n > \mu_X$, sometimes $\bar{x}_n < \mu_X$, but the squared differences are all positive and do not cancel out.
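Bias and variance of the sample mean can be checked by simulation: draw many datasets, compute the sample mean of each, and summarize the resulting sampling distribution. This is a minimal sketch that assumes $X$ is Gaussian with made-up values of $\mu_X$ and $\sigma_X$; the function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_sample_means(n, num_datasets, mu, sigma, seed=0):
    """Draw num_datasets datasets of size n; return each dataset's sample mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_datasets)
    ]


# Assumed distribution parameters and dataset size, for illustration only.
mu, sigma, n = 10.0, 2.0, 25
means = simulate_sample_means(n, num_datasets=4000, mu=mu, sigma=sigma)

# Bias: average difference between the estimator and the parameter.
estimated_bias = statistics.fmean(means) - mu
# Variance: average squared difference between the estimator and its own mean.
estimated_variance = statistics.pvariance(means)

print(estimated_bias, estimated_variance, sigma ** 2 / n)
```

Under these assumptions the estimated bias should be close to zero, and the estimated variance close to $\sigma_X^2 / n$.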

Normal (Gaussian) Distribution (AoS, p.8)

$$f_X(x) = \frac{1}{\sigma_X \sqrt{2\pi}} e^{-\frac{(x - \mu_X)^2}{2\sigma_X^2}}$$

Normal distribution is defined by two parameters: $\mu_X$, $\sigma_X^2$.

The normal distribution is special (among other reasons) because many estimators have approximately normal sampling distributions or have sampling distributions that are closely related to the normal.

For an estimator like $\bar{X}_n$, if we know $\mu_{\bar{X}_n}$ and $\sigma_{\bar{X}_n}^2$, then we can say a lot about how good it is.

Central Limit Theorem (AoS, p.77)

Informally: The sampling distribution of $\bar{X}_n$ is approximately normal if $n$ is big enough.

More formally, for $X$ with finite variance:

$$F_{\bar{X}_n}(\bar{x}) \approx \int_{-\infty}^{\bar{x}} \frac{1}{\sigma_n \sqrt{2\pi}} e^{-\frac{(x - \mu_X)^2}{2\sigma_n^2}} \, dx$$

where $\sigma_n = \frac{\sigma}{\sqrt{n}}$ is called the standard error and $\sigma^2$ is the variance of $X$.

Who cares?

Eruptions dataset has $n = 272$ observations. Our estimate of the mean of eruption times is $\bar{x}_{272}$. What is the probability of observing an $\bar{x}_{272}$ that is within 0.17 minutes (about 10 seconds) of the true mean?

By the C.L.T.,

$$\Pr(-0.17 \le \bar{X}_{272} - \mu_X \le 0.17) = \int_{\mu_X - 0.17}^{\mu_X + 0.17} \frac{1}{\sigma_n \sqrt{2\pi}} e^{-\frac{(x - \mu_X)^2}{2\sigma_n^2}} \, dx = 0.986$$

Note! I estimated $\sigma_X$ here. (Look up t-test for details.)
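The C.L.T. probability above can be reproduced with the normal CDF. This sketch assumes a half-width of 0.17 minutes, an estimated standard deviation of about 1.141 minutes for the eruption times, and n = 272; the standard deviation is not stated in the slides and is an assumption here, as are the function names.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_mean_within(c: float, sigma: float, n: int) -> float:
    """Pr(-c <= Xbar_n - mu <= c) under the normal approximation."""
    standard_error = sigma / math.sqrt(n)
    z = c / standard_error
    return normal_cdf(z) - normal_cdf(-z)


# Assumed inputs: half-width in minutes, estimated sd, number of observations.
p = prob_mean_within(c=0.17, sigma=1.141, n=272)
print(round(p, 3))
```

With these inputs the result is close to the 0.986 quoted on the slide. Because $\sigma_X$ is estimated rather than known, a t distribution would be the more careful choice, as the slide's note suggests.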

[Figure: Density versus z.]

Confidence Intervals (AoS, p.9)

Typically, we specify confidence given by $1 - \alpha$.

Use the sampling distribution to get an interval that traps the parameter (estimand) with probability $1 - \alpha$.

95% C.I. for eruption mean is (3.35, 3.62)
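A normal-approximation confidence interval for a mean is the estimate plus or minus a quantile of the standard normal times the standard error. This sketch assumes a sample mean of 3.488 minutes and a standard deviation of 1.141 minutes for the 272 eruption times; both numbers are assumptions made for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided (1 - alpha) interval for the mean, normal approximation."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Assumed summary statistics for the eruptions data.
lo, hi = mean_confidence_interval(xbar=3.488, sigma=1.141, n=272)
print(round(lo, 2), round(hi, 2))
```

Shrinking `alpha` widens the interval, and increasing `n` narrows it at the rate $1/\sqrt{n}$.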

[Figure: 95% Confidence Region; Density versus z.]

What a Confidence Interval Means

Effect of n on width

Performance Evaluation - Test Sets

Training error underestimates generalization error. It is a biased estimator.

If you really want a good estimate of generalization error, you need to hold out a separate test set of data not used for training.

Possibly of size

$$n = \frac{(1.96)^2 \sigma_L^2}{d^2}$$

where $\sigma_L^2$ is the variance of the loss (which has to be guessed or estimated from training) and $d$ is half-width of a 95% confidence interval. This follows from setting $d = 1.96 \, \sigma_L / \sqrt{n}$ and solving for $n$.

Could report the test error, but then deploy whatever you train on the whole data. (Probably won't be worse.)
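The test-set size formula above is simple enough to compute directly. In this sketch the loss variance and the target half-width are made-up values, and `required_test_size` is a hypothetical helper name.

```python
import math


def required_test_size(loss_variance: float, half_width: float, z: float = 1.96) -> int:
    """Smallest n with z * sqrt(loss_variance / n) <= half_width."""
    if loss_variance < 0 or half_width <= 0:
        raise ValueError("loss_variance must be >= 0 and half_width > 0")
    return math.ceil(z ** 2 * loss_variance / half_width ** 2)


# Assumed: loss variance guessed as 4.0, desired 95% CI half-width of 0.5.
n_test = required_test_size(loss_variance=4.0, half_width=0.5)
print(n_test)
```

Halving the half-width quadruples the required test-set size, which is why tight intervals on generalization error are expensive.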

Example - linear model

[Figure: Training Data]

## [1] "Estimated variance of errors: "
## [1] "Sample required for CI width of. (+-.): 65"

Example - linear model

[Figure: Training Data, Testing Data]

## TestMSE VarOfErrors StdOfSquaredErrors n StandardError CI_left
##
## CI_right

## .885

Choosing Performance Measures for Regression: Mean Errors

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i|$$

I find MAE easier to interpret. (How far am I from the correct value, on average?) RMSE is at least in the same units as the $y$.

Choosing Performance Measures for Regression: Mean Relative Error

$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|\hat{y}_i - y_i|}{|y_i|}$$

Scales error according to magnitude of true $y$. E.g., if MRE = $r$, then regression is wrong by $100r$% of the value of $y$, on average.

If this is appropriate for your problem then linear regression, which assumes additive error, may not be appropriate. Options include using a different model or regression on $\log y$ rather than on $y$.

Extra slides - The Bootstrap

The Bootstrap (AoS, p.)

CLT gives theoretical approximate sampling distribution of $\bar{X}_n$.

We could also estimate the sampling distribution of $\bar{X}_n$ by drawing many datasets of size $n$, computing $\bar{x}_n$ on each, constructing histogram.

This is impossible. But we can use the data we have as a surrogate.

The Bootstrap

Call our dataset D.
Draw B new datasets by sampling observations with replacement from D. (B is often at least )
Compute $\bar{x}_n^{(b)}$ for each of the datasets.
Use the histogram/empirical distribution of these pretend $\bar{x}$ to determine confidence limits.
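The bootstrap steps above can be applied to one of the regression error measures from earlier in this section. The sketch below computes MSE, RMSE, MAE and MRE on toy predictions, then builds a simple percentile interval for MAE by resampling the absolute errors with replacement. The data, the choice of B = 2000, the index-based quantile rule, and all function names are assumptions made for illustration.

```python
import math
import random
import statistics


def mse(y_hat, y):
    return sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(y)


def rmse(y_hat, y):
    return math.sqrt(mse(y_hat, y))


def mae(y_hat, y):
    return sum(abs(p - t) for p, t in zip(y_hat, y)) / len(y)


def mre(y_hat, y):
    # Undefined when a true value is zero.
    return sum(abs(p - t) / abs(t) for p, t in zip(y_hat, y)) / len(y)


def bootstrap_percentile_interval(values, statistic, B=2000, alpha=0.05, seed=0):
    """Resample values with replacement B times; return percentile limits."""
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)]) for _ in range(B)
    )
    return stats[int((alpha / 2) * (B - 1))], stats[int((1 - alpha / 2) * (B - 1))]


# Toy true values and predictions; assumed data.
y = [2.0, 4.0, 5.0, 10.0]
y_hat = [1.0, 5.0, 5.0, 8.0]

abs_errors = [abs(p - t) for p, t in zip(y_hat, y)]
lo, hi = bootstrap_percentile_interval(abs_errors, statistics.fmean)
print(mse(y_hat, y), rmse(y_hat, y), mae(y_hat, y), mre(y_hat, y), (lo, hi))
```

Resampling the per-example errors, rather than refitting the model, treats the hypothesis as fixed, which matches the setting used throughout this section.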

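Bootstrap example

    library(boot)
    library(ggplot2)  # needed for the plot
    # R, height and size lost digits in transcription; 5000, 0.25 and 2 are assumed.
    bootstraps <- boot(faithful$eruptions, function(d, i) { mean(d[i]) }, R = 5000)
    bootdata = data.frame(xbars = bootstraps$t)
    # 2.5% and 97.5% quantiles give the 95% limits.
    limits = quantile(bootdata$xbars, c(.025, .975))
    ggplot(bootdata, aes(x = xbars)) + labs(y = "prop.") +
      geom_histogram(aes(y = ..density..)) +
      geom_errorbarh(aes(xmin = limits[[1]], xmax = limits[[2]], y = c(0)),
                     height = 0.25, colour = "red", size = 2)

[Figure: histogram of bootstrap sample means; axes: xbars, prop.]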

Reality Check

[Figure: axes: eruptions, prop.]


More information

Model Complexity and Generalization

Model Complexity and Generalization HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Generalization Learning Curves Underfit Generalization

More information

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years.

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years. 3: Summary Statistics Notation Consider these 10 ages (in years): 1 4 5 11 30 50 8 7 4 5 The symbol n represents the sample size (n = 10). The capital letter X denotes the variable. x i represents the

More information

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung POLYTECHNIC UNIVERSITY Department of Computer and Information Science VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung Abstract: Techniques for reducing the variance in Monte Carlo

More information

Machine Learning (CSE 446): Concepts & the i.i.d. Supervised Learning Paradigm

Machine Learning (CSE 446): Concepts & the i.i.d. Supervised Learning Paradigm Machine Learning (CSE 446): Concepts & the i.i.d. Supervised Learning Paradigm Sham M Kakade c 2018 University of Washington cse446-staff@cs.washington.edu 1 / 17 Review 1 / 17 Decision Tree: Making a

More information

Unit 2 Functions Analyzing Graphs of Functions (Unit 2.2)

Unit 2 Functions Analyzing Graphs of Functions (Unit 2.2) Unit 2 Functions Analzing Graphs of Functions (Unit 2.2) William (Bill) Finch Mathematics Department Denton High School Introduction Domain/Range Vert Line Zeros Incr/Decr Min/Ma Avg Rate Change Odd/Even

More information

Topics in Machine Learning-EE 5359 Model Assessment and Selection

Topics in Machine Learning-EE 5359 Model Assessment and Selection Topics in Machine Learning-EE 5359 Model Assessment and Selection Ioannis D. Schizas Electrical Engineering Department University of Texas at Arlington 1 Training and Generalization Training stage: Utilizing

More information

Pre-Algebra Notes Unit 8: Graphs and Functions

Pre-Algebra Notes Unit 8: Graphs and Functions Pre-Algebra Notes Unit 8: Graphs and Functions The Coordinate Plane A coordinate plane is formed b the intersection of a horizontal number line called the -ais and a vertical number line called the -ais.

More information

Chapter 3 - Displaying and Summarizing Quantitative Data

Chapter 3 - Displaying and Summarizing Quantitative Data Chapter 3 - Displaying and Summarizing Quantitative Data 3.1 Graphs for Quantitative Data (LABEL GRAPHS) August 25, 2014 Histogram (p. 44) - Graph that uses bars to represent different frequencies or relative

More information

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors

Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors Heteroskedasticity and Homoskedasticity, and Homoskedasticity-Only Standard Errors (Section 5.4) What? Consequences of homoskedasticity Implication for computing standard errors What do these two terms

More information

Pair-Wise Multiple Comparisons (Simulation)

Pair-Wise Multiple Comparisons (Simulation) Chapter 580 Pair-Wise Multiple Comparisons (Simulation) Introduction This procedure uses simulation analyze the power and significance level of three pair-wise multiple-comparison procedures: Tukey-Kramer,

More information

Section 1.4 Limits involving infinity

Section 1.4 Limits involving infinity Section. Limits involving infinit (/3/08) Overview: In later chapters we will need notation and terminolog to describe the behavior of functions in cases where the variable or the value of the function

More information

ACTIVITY: Representing Data by a Linear Equation

ACTIVITY: Representing Data by a Linear Equation 9.2 Lines of Fit How can ou use data to predict an event? ACTIVITY: Representing Data b a Linear Equation Work with a partner. You have been working on a science project for 8 months. Each month, ou measured

More information

The Bootstrap and Jackknife

The Bootstrap and Jackknife The Bootstrap and Jackknife Summer 2017 Summer Institutes 249 Bootstrap & Jackknife Motivation In scientific research Interest often focuses upon the estimation of some unknown parameter, θ. The parameter

More information

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem.

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem. STAT 2607 REVIEW PROBLEMS 1 REMINDER: On the final exam 1. Word problems must be answered in words of the problem. 2. "Test" means that you must carry out a formal hypothesis testing procedure with H0,

More information

Chapter 1 Notes, Calculus I with Precalculus 3e Larson/Edwards

Chapter 1 Notes, Calculus I with Precalculus 3e Larson/Edwards Contents 1.1 Functions.............................................. 2 1.2 Analzing Graphs of Functions.................................. 5 1.3 Shifting and Reflecting Graphs..................................

More information

Multiple Comparisons of Treatments vs. a Control (Simulation)

Multiple Comparisons of Treatments vs. a Control (Simulation) Chapter 585 Multiple Comparisons of Treatments vs. a Control (Simulation) Introduction This procedure uses simulation to analyze the power and significance level of two multiple-comparison procedures that

More information

1.2 Visualizing and Graphing Data

1.2 Visualizing and Graphing Data 6360_ch01pp001-075.qd 10/16/08 4:8 PM Page 1 1 CHAPTER 1 Introduction to Functions and Graphs 9. Volume of a Cone The volume V of a cone is given b V = 1 3 pr h, where r is its radius and h is its height.

More information

Online Homework Hints and Help Extra Practice

Online Homework Hints and Help Extra Practice Evaluate: Homework and Practice Use a graphing calculator to graph the polnomial function. Then use the graph to determine the function s domain, range, and end behavior. (Use interval notation for the

More information

Performance Estimation and Regularization. Kasthuri Kannan, PhD. Machine Learning, Spring 2018

Performance Estimation and Regularization. Kasthuri Kannan, PhD. Machine Learning, Spring 2018 Performance Estimation and Regularization Kasthuri Kannan, PhD. Machine Learning, Spring 2018 Bias- Variance Tradeoff Fundamental to machine learning approaches Bias- Variance Tradeoff Error due to Bias:

More information

Lecture 12. August 23, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 12. August 23, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. Lecture 12 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University August 23, 2007 1 2 3 4 5 1 2 Introduce the bootstrap 3 the bootstrap algorithm 4 Example

More information

Learning Objectives. Continuous Random Variables & The Normal Probability Distribution. Continuous Random Variable

Learning Objectives. Continuous Random Variables & The Normal Probability Distribution. Continuous Random Variable Learning Objectives Continuous Random Variables & The Normal Probability Distribution 1. Understand characteristics about continuous random variables and probability distributions 2. Understand the uniform

More information