LECTURE 6: CROSS VALIDATION


1 LECTURE 6: CROSS VALIDATION CSCI 4352 Machine Learning Dongchul Kim, Ph.D. Department of Computer Science

2 A Regression Problem Given a data set, how can we evaluate our (linear) model?

3 Cross Validation Cross validation is a model evaluation method that is better than residuals. The problem with residual evaluations is that they do not give an indication of how well the learner will do when it is asked to make new predictions for data it has not already seen. One way to overcome this problem is to not use the entire data set when training a learner. Some of the data is removed before training begins. Then, when training is done, the data that was removed can be used to test the performance of the learned model on "new" data. This is the basic idea for a whole class of model evaluation methods called cross validation.

4 Which is best? Why not choose the method with the best fit to the data?

5 What do we really want? Why not choose the method with the best fit to the data? How well are you going to predict future data drawn from the same distribution?

6 The test set method 1. Randomly choose 30% of the data to be in a test set 2. The remainder is a training set

7 The test set method 1. Randomly choose 30% of the data to be in a test set 2. The remainder is a training set 3. Perform your regression on the training set (Linear regression example)

8 The test set method 1. Randomly choose 30% of the data to be in a test set 2. The remainder is a training set 3. Perform your regression on the training set 4. Estimate your future performance with the test set (Linear regression example) Mean Squared Error = 2.4
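
The four steps above map directly onto code. Below is a minimal sketch using scikit-learn; the dataset here is synthetic (the MSE of 2.4 on the slide comes from the lecture's own data, which we don't have), so the printed number will differ.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the lecture's 1-D regression dataset (assumption).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(40, 1))
y = 2.0 * x.ravel() + rng.normal(scale=1.5, size=40)

# Steps 1-2: randomly hold out 30% as a test set; the rest is for training.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0)

# Step 3: fit the model on the training set only.
model = LinearRegression().fit(x_train, y_train)

# Step 4: estimate future performance on the held-out test set.
mse = mean_squared_error(y_test, model.predict(x_test))
print(f"Test-set MSE: {mse:.2f}")
```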

9 The test set method 1. Randomly choose 30% of the data to be in a test set 2. The remainder is a training set 3. Perform your regression on the training set 4. Estimate your future performance with the test set (Quadratic regression example) Mean Squared Error = 0.9

10 The test set method 1. Randomly choose 30% of the data to be in a test set 2. The remainder is a training set 3. Perform your regression on the training set 4. Estimate your future performance with the test set (Join the dots example) Mean Squared Error = 2.2

11 The test set method Good news: Very very simple. Can then simply choose the method with the best test-set score. Bad news: Wastes data: we get an estimate of the best method to apply to 30% less data. If we don't have much data, our test set might just be lucky or unlucky. We say the test-set estimator of performance has high variance.
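
The "lucky or unlucky" point is easy to demonstrate. The sketch below (same synthetic data idea as above) repeats the random 70/30 split many times; the spread of the resulting MSE estimates is exactly the high variance the slide warns about.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(40, 1))
y = 2.0 * x.ravel() + rng.normal(scale=1.5, size=40)

# Repeat the 70/30 split with different seeds and record each estimate.
estimates = []
for seed in range(100):
    x_tr, x_te, y_tr, y_te = train_test_split(
        x, y, test_size=0.3, random_state=seed)
    model = LinearRegression().fit(x_tr, y_tr)
    estimates.append(mean_squared_error(y_te, model.predict(x_te)))

# A wide spread here is the "high variance" of the test-set estimator.
print(f"mean MSE = {np.mean(estimates):.2f}, std = {np.std(estimates):.2f}")
```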

12 LOOCV (Leave-one-out Cross Validation) For k=1 to R 1. Let (x_k, y_k) be the kth record

13 LOOCV (Leave-one-out Cross Validation) For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset

14 LOOCV (Leave-one-out Cross Validation) For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset 3. Train on the remaining R-1 datapoints

15 LOOCV (Leave-one-out Cross Validation) For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset 3. Train on the remaining R-1 datapoints 4. Note your error on (x_k, y_k)

16 LOOCV (Leave-one-out Cross Validation) For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset 3. Train on the remaining R-1 datapoints 4. Note your error on (x_k, y_k) When you've done all points, report the mean error.

17 LOOCV (Leave-one-out Cross Validation) For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset 3. Train on the remaining R-1 datapoints 4. Note your error on (x_k, y_k) When you've done all points, report the mean error. MSE_LOOCV = 2.12
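
The For k=1 to R loop translates almost line for line into code. A hand-rolled sketch on synthetic data (so the slide's 2.12 will not be reproduced):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(40, 1))
y = 2.0 * x.ravel() + rng.normal(scale=1.5, size=40)

R = len(x)
errors = []
for k in range(R):
    # Steps 1-2: temporarily remove the k-th record (x_k, y_k).
    mask = np.arange(R) != k
    # Step 3: train on the remaining R-1 datapoints.
    model = LinearRegression().fit(x[mask], y[mask])
    # Step 4: note the squared error on the held-out record.
    pred = model.predict(x[k:k + 1])[0]
    errors.append((y[k] - pred) ** 2)

# When all points are done, report the mean error.
print(f"MSE_LOOCV = {np.mean(errors):.2f}")
```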

18 LOOCV for Quadratic Regression For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset 3. Train on the remaining R-1 datapoints 4. Note your error on (x_k, y_k) When you've done all points, report the mean error. MSE_LOOCV = 0.962

19 LOOCV for Join The Dots For k=1 to R 1. Let (x_k, y_k) be the kth record 2. Temporarily remove (x_k, y_k) from the dataset 3. Train on the remaining R-1 datapoints 4. Note your error on (x_k, y_k) When you've done all points, report the mean error. MSE_LOOCV = 3.33
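
Slides 17 through 19 run the same LOOCV loop for three different learners. A compact sketch of that comparison, assuming "join the dots" means piecewise-linear interpolation between the remaining training points (implemented here with np.interp):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, size=40))
y = 2.0 * x + rng.normal(scale=1.5, size=40)

def loocv_mse(predict):
    """Leave each point out, fit on the rest, average the squared errors."""
    errors = []
    for k in range(len(x)):
        mask = np.arange(len(x)) != k
        errors.append((y[k] - predict(x[mask], y[mask], x[k])) ** 2)
    return np.mean(errors)

def poly_predict(degree):
    # Fit a degree-d polynomial to the remaining points, evaluate at x_k.
    return lambda xs, ys, xq: np.polyval(np.polyfit(xs, ys, degree), xq)

def join_the_dots(xs, ys, xq):
    # Piecewise-linear interpolation through the remaining (sorted) points.
    return np.interp(xq, xs, ys)

for name, pred in [("linear", poly_predict(1)),
                   ("quadratic", poly_predict(2)),
                   ("join-the-dots", join_the_dots)]:
    print(f"{name:>14}: MSE_LOOCV = {loocv_mse(pred):.2f}")
```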

20 Which kind of Cross Validation?
Test-set: Downside: variance (an unreliable estimate of future performance). Upside: cheap.
Leave-one-out: Downside: expensive. Upside: doesn't waste data.

21 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple)

22 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple) For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points.

23 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple) For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points. For the green partition: Train on all the points not in the green partition. Find the test-set sum of errors on the green points.

24 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple) For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points. For the green partition: Train on all the points not in the green partition. Find the test-set sum of errors on the green points. For the purple partition: Train on all the points not in the purple partition. Find the test-set sum of errors on the purple points.

25 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple) For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points. For the green partition: Train on all the points not in the green partition. Find the test-set sum of errors on the green points. For the purple partition: Train on all the points not in the purple partition. Find the test-set sum of errors on the purple points. Then report the mean error. Linear Regression: MSE_3FOLD = 2.05

26 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple) For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points. For the green partition: Train on all the points not in the green partition. Find the test-set sum of errors on the green points. For the purple partition: Train on all the points not in the purple partition. Find the test-set sum of errors on the purple points. Then report the mean error. Quadratic Regression: MSE_3FOLD = 1.11

27 k-fold Cross Validation Randomly break the dataset into k partitions (in our example we'll have k=3 partitions, colored blue, green, and purple) For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points. For the green partition: Train on all the points not in the green partition. Find the test-set sum of errors on the green points. For the purple partition: Train on all the points not in the purple partition. Find the test-set sum of errors on the purple points. Then report the mean error. Join-the-dots: MSE_3FOLD = 2.93
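
scikit-learn packages the whole k-fold procedure into a single call. A sketch with k=3 on synthetic data (the slide's MSE values again come from the lecture dataset, so the number here will differ):

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(40, 1))
y = 2.0 * x.ravel() + rng.normal(scale=1.5, size=40)

# Randomly break the dataset into k=3 partitions ("folds").
kfold = KFold(n_splits=3, shuffle=True, random_state=0)

# For each fold: train on the other two, score on the held-out fold,
# then report the mean error across folds.
scores = cross_val_score(LinearRegression(), x, y,
                         cv=kfold, scoring="neg_mean_squared_error")
print(f"MSE_3FOLD = {-scores.mean():.2f}")
```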

28 Which kind of Cross Validation?
Test-set: Downside: variance (an unreliable estimate of future performance). Upside: cheap.
Leave-one-out: Downside: expensive. Upside: doesn't waste data.
10-fold: Downside: wastes 10% of the data; 10 times more expensive than the test-set method. Upside: only wastes 10%; only 10 times more expensive instead of R times.
3-fold: Downside: more wasteful than 10-fold; more expensive than the test-set method. Upside: slightly better than test-set.
R-fold: Identical to leave-one-out.

29 CV is useful Preventing overfitting Comparing different learning algorithms Feature selection (see later)

30 Reference Dr. Andrew Moore's homepage
