Bias-Variance Trade-off (cont'd) + Image Representations


CS 2750: Machine Learning. Bias-Variance Trade-off (cont'd) + Image Representations. Prof. Adriana Kovashka, University of Pittsburgh, January 12, 2016

Announcement: Homework now due Feb.

Generalization: training set (labels known) vs. test set (labels unknown). How well does a learned model generalize from the data it was trained on to a new test set? Slide credit: L. Lazebnik

Generalization: components of expected loss. Noise in our observations: unavoidable. Bias: how much the average model over all training sets differs from the true model; error due to inaccurate assumptions/simplifications made by the model. Variance: how much models estimated from different training sets differ from each other. Underfitting: model is too simple to represent all the relevant class characteristics; high bias and low variance; high training error and high test error. Overfitting: model is too complex and fits irrelevant characteristics (noise) in the data; low bias and high variance; low training error and high test error. Adapted from L. Lazebnik

Bias-Variance Trade-off. Think about squinting. Red dots = training data (all that we see before we ship off our model!). Green curve = true underlying model. Blue curve = our predicted model/fit. Purple dots = possible test points. Models with too few parameters are inaccurate because of a large bias (not enough flexibility). Models with too many parameters are inaccurate because of a large variance (too much sensitivity to the sample). Adapted from D. Hoiem
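The decomposition above can be made concrete with a short simulation (an illustrative sketch, not from the slides): fit the same polynomial family to many noisy training sets drawn from a known true curve, then measure how far the average fit is from the truth (bias) and how much individual fits scatter around their average (variance).

```python
import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(2 * np.pi * x)   # known "true model"
x_test = np.linspace(0, 1, 100)

def fit_many(degree, n_sets=200, n_points=15, noise=0.3):
    """Fit degree-order polynomials to many noisy training sets;
    return each fit's predictions on x_test."""
    preds = []
    for _ in range(n_sets):
        x = rng.uniform(0, 1, n_points)
        t = true_f(x) + rng.normal(0, noise, n_points)
        w = np.polyfit(x, t, degree)
        preds.append(np.polyval(w, x_test))
    return np.array(preds)

for degree in (1, 9):
    preds = fit_many(degree)
    avg = preds.mean(axis=0)
    bias2 = np.mean((avg - true_f(x_test)) ** 2)   # (average fit - truth)^2
    var = np.mean(preds.var(axis=0))               # spread across training sets
    print(f"degree {degree}: bias^2 = {bias2:.3f}, variance = {var:.3f}")
```

The low-order model comes out with high bias and low variance, the high-order model with the opposite, matching the trade-off described above.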

Polynomial Curve Fitting Slide credit: Chris Bishop

Sum-of-Squares Error Function: E(w) = (1/2) * sum_{n=1..N} (y(x_n, w) - t_n)^2. Slide credit: Chris Bishop

0th Order Polynomial Slide credit: Chris Bishop

1st Order Polynomial Slide credit: Chris Bishop

3rd Order Polynomial Slide credit: Chris Bishop

9th Order Polynomial Slide credit: Chris Bishop

Over-fitting. Root-Mean-Square (RMS) Error: E_RMS = sqrt(2 E(w*) / N). Slide credit: Chris Bishop

Data Set Size: 9th Order Polynomial Slide credit: Chris Bishop

Data Set Size: 9th Order Polynomial Slide credit: Chris Bishop
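A minimal numpy sketch of the curve-fitting experiment above (illustrative; the data generator, noisy samples of sin(2*pi*x), follows Bishop's example): training error keeps falling as the polynomial order M grows, while test error eventually rises.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    x = np.linspace(0, 1, n)
    t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)  # noisy sin(2*pi*x)
    return x, t

x_train, t_train = make_data(10)
x_test, t_test = make_data(100)

def rms(w, x, t):
    """Root-mean-square error of polynomial w on data (x, t)."""
    return np.sqrt(np.mean((np.polyval(w, x) - t) ** 2))

for M in (0, 1, 3, 9):
    w = np.polyfit(x_train, t_train, M)
    print(f"M = {M}: train RMS = {rms(w, x_train, t_train):.3f}, "
          f"test RMS = {rms(w, x_test, t_test):.3f}")
```

With 10 training points, M = 9 interpolates them exactly (train RMS near zero) while test RMS blows up; with more training data the same M = 9 model behaves far better, as the data-set-size slides show.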

How to reduce over-fitting? Get more training data Slide credit: D. Hoiem

Regularization. Penalize large coefficient values by minimizing E~(w) = (1/2) * sum_{n=1..N} (y(x_n, w) - t_n)^2 + (lambda/2) * ||w||^2. (Remember: we want to minimize this expression.) Adapted from Chris Bishop

Regularization: ln lambda = -18. Slide credit: Chris Bishop

Regularization: ln lambda = 0. Slide credit: Chris Bishop

Polynomial Coefficients Slide credit: Chris Bishop

Polynomial Coefficients No regularization Huge regularization Adapted from Chris Bishop

Regularization: E_RMS vs. ln lambda. Slide credit: Chris Bishop
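The regularized fit can be sketched in a few lines (illustrative; the closed-form ridge solution below stands in for whatever optimizer one actually uses): penalizing ||w||^2 shrinks the wild coefficients of the 9th-order fit.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 10)

M = 9                                  # 9th-order polynomial
Phi = np.vander(x, M + 1)              # design matrix of powers of x

for lam in (0.0, np.exp(-18), 1.0):    # lambda values echoing Bishop's figure
    # ridge solution: w = (Phi^T Phi + lam * I)^(-1) Phi^T t
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)
    print(f"lambda = {lam:.2e}: max |w| = {np.abs(w).max():.1f}")
```

With no regularization the coefficients are huge (the "no regularization vs. huge regularization" table above); even a tiny lambda tames them.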

Bias-variance Figure from Chris Bishop

How to reduce over-fitting? Get more training data Regularize the parameters Slide credit: D. Hoiem

Bias-variance tradeoff: error vs. model complexity. Training error decreases steadily with complexity; test error is high at both extremes: underfitting (high bias, low variance) on the left, overfitting (low bias, high variance) on the right. Slide credit: D. Hoiem

Bias-variance tradeoff: test error vs. model complexity, for few vs. many training examples. With many training examples, the test-error minimum shifts toward higher complexity (lower bias, higher variance). Slide credit: D. Hoiem

Effect of training size, for a fixed prediction model: error vs. number of training examples. Training error rises and testing error falls as the training set grows, with both approaching the generalization error. Adapted from D. Hoiem

How to reduce over-fitting? Get more training data. Regularize the parameters. Use fewer features. Choose a simpler classifier. Use a validation set to find when overfitting occurs. Adapted from D. Hoiem

Remember Three kinds of error Inherent: unavoidable Bias: due to over-simplifications Variance: due to inability to perfectly estimate parameters from limited data Try simple classifiers first Use increasingly powerful classifiers with more training data (bias-variance trade-off) Adapted from D. Hoiem

Image Representations Keypoint-based image description Extraction / detection of keypoints Description (via gradient histograms) Texture-based Filter bank representations Filtering

An image is a set of pixels: what we see as a picture, a computer sees as a grid of intensity values. Adapted from S. Narasimhan

Problems with the pixel representation: not invariant to small changes (translation, illumination, etc.); some parts of an image are more important than others.

Human eye movements Yarbus eye tracking D. Hoiem

Choosing distinctive interest points. If you wanted to meet a friend, would you say: a) Let's meet on campus. b) Let's meet on Green street. c) Let's meet at Green and Wright. (Corner detection.) Or if you were in a secluded area: a) Let's meet in the Plains of Akbar. b) Let's meet on the side of Mt. Doom. c) Let's meet on top of Mt. Doom. (Blob (valley/peak) detection.) D. Hoiem

Interest points. Suppose you have to click on some points, go away and come back after I deform the image, and click on the same points again. Which points would you choose? (Shown: original and deformed images.) D. Hoiem

Corners as distinctive interest points. We should easily recognize the point by looking through a small window; shifting the window in any direction should give a large change in intensity. Flat region: no change in all directions. Edge: no change along the edge direction. Corner: significant change in all directions. A. Efros, D. Frolova, D. Simakov

Example of Harris application. K. Grauman
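As a rough sketch of the idea (not the exact algorithm from the slides; Harris details such as the window function and the constant k vary), the corner response can be computed from smoothed products of image gradients. Function names here are illustrative.

```python
import numpy as np
from scipy import ndimage

def harris_response(img, sigma=1.0, k=0.05):
    """Harris corner response: large positive at corners,
    negative along edges, near zero in flat regions."""
    Ix = ndimage.sobel(img, axis=1, output=float)   # horizontal gradient
    Iy = ndimage.sobel(img, axis=0, output=float)   # vertical gradient
    # entries of the second-moment matrix, smoothed over a window
    Sxx = ndimage.gaussian_filter(Ix * Ix, sigma)
    Syy = ndimage.gaussian_filter(Iy * Iy, sigma)
    Sxy = ndimage.gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# toy image: a bright square on a dark background -> corners at its 4 vertices
img = np.zeros((50, 50))
img[15:35, 15:35] = 1.0
R = harris_response(img)
ys, xs = np.nonzero(R > 0.5 * R.max())
print("strong corner responses near:", sorted(set(zip(ys, xs)))[:4])
```

The response is positive only where intensity changes in all directions, which is exactly the window-shifting criterion on the previous slide.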

Local features: desired properties. Repeatability: the same feature can be found in several images despite geometric and photometric transformations. Distinctiveness: each feature has a distinctive description. Compactness and efficiency: many fewer features than image pixels. Locality: a feature occupies a relatively small area of the image; robust to clutter and occlusion. Adapted from K. Grauman

Overview of Keypoint Description: 1. Find a set of distinctive keypoints. 2. Define a region around each keypoint. 3. Compute a local descriptor from the normalized region. Adapted from K. Grauman, B. Leibe

Gradients

SIFT Descriptor. Histogram of oriented gradients: captures important texture information; robust to small translations / affine deformations. [Lowe, ICCV 1999] K. Grauman, B. Leibe

HOG Descriptor. Computes histograms of gradients per region of the image and concatenates them. N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005. Image credit: N. Snavely
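A simplified sketch of the gradient-histogram idea behind SIFT and HOG (illustrative only: real HOG adds block normalization and interpolation, and SIFT adds orientation and scale normalization). The function name and parameters are assumptions for the example.

```python
import numpy as np

def cell_orientation_histograms(img, cell=8, nbins=9):
    """HOG-style building block: a histogram of gradient orientations,
    weighted by gradient magnitude, for each cell x cell patch."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180        # unsigned orientation
    H, W = img.shape
    feats = []
    for i in range(0, H - cell + 1, cell):
        for j in range(0, W - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=nbins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)   # concatenated per-cell histograms

img = np.tile(np.arange(32), (32, 1)).astype(float)   # horizontal intensity ramp
print(cell_orientation_histograms(img).shape)         # 16 cells * 9 bins = (144,)
```

Because each cell is summarized by a histogram rather than raw pixels, the descriptor tolerates small translations and deformations, which is the robustness claimed above.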

What is this? http://web.mit.edu/vondrick/ihog/

What is this? http://web.mit.edu/vondrick/ihog/

What is this? http://web.mit.edu/vondrick/ihog/

Image Representations Keypoint-based image description Extraction / detection of keypoints Description (via gradient histograms) Texture-based Filter bank representations Filtering [read the extra slides if interested]

Texture: marks and patterns, e.g. ones caused by grooves; can include regular or more random patterns.

Texture representation. Textures are made up of repeated local patterns, so: find the patterns, using filters that look like patterns (spots, bars, raw patches...); consider the magnitude of the response; describe the statistics of the responses within each image, e.g. a histogram of pattern occurrences. This results in a d-dimensional feature vector, where d is the number of patterns/filters. Adapted from Kristen Grauman

Filter banks (edges, bars, and spots at multiple orientations and scales). What filters to put in the bank? Typically we want a combination of scales and orientations, and different types of patterns. Matlab code available for these examples: http://www.robots.ox.ac.uk/~vgg/research/texclass/filters.html

Image from http://www.texasexplorer.com/austincap2.jpg Kristen Grauman

Showing magnitude of responses. Kristen Grauman

Kristen Grauman

[r1, r2, ..., r38] Patch description: a feature vector formed from the list of filter responses at each pixel. Adapted from Kristen Grauman
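A toy version of this pipeline (illustrative; it uses a 9-filter scipy stand-in, spot and bar filters at three scales, rather than the 38-filter bank shown on the slides):

```python
import numpy as np
from scipy import ndimage

def texture_descriptor(img, sigmas=(1, 2, 4)):
    """Toy filter-bank texture feature: mean absolute response to a spot
    (Laplacian of Gaussian) and two oriented derivative ("bar") filters
    at several scales -- a small stand-in for a 38-filter bank."""
    img = img.astype(float)
    responses = []
    for s in sigmas:
        responses.append(ndimage.gaussian_laplace(img, s))                # spot
        responses.append(ndimage.gaussian_filter(img, s, order=(0, 1)))   # vertical structure
        responses.append(ndimage.gaussian_filter(img, s, order=(1, 0)))   # horizontal structure
    # describe the image by the average magnitude of each response
    return np.array([np.abs(r).mean() for r in responses])

stripes = np.sin(np.linspace(0, 10 * np.pi, 64))[None, :] * np.ones((64, 1))
print(texture_descriptor(stripes))   # 9-dimensional texture feature
```

Averaging the response magnitudes over the image is the simplest of the "statistics within each image" mentioned above; a histogram of responses is a richer alternative.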

You try: can you match the texture filters (A, B, C) to the mean responses (1, 2, 3)? Answer: 1-B, 2-C, 3-A. Derek Hoiem

How do we compute these responses? The remaining slides are optional (i.e. view them if you're interested).

Next time Unsupervised learning: clustering

Image filtering: compute a function of the local neighborhood at each pixel in the image. The function is specified by a filter or mask saying how to combine values from neighbors. Uses of filtering: de-noise an image (expect pixels to be like their neighbors, and noise processes to be independent from pixel to pixel); extract information (texture, edges, etc.). Adapted from Derek Hoiem

Moving Average In 2D: slide a 3x3 window across the image, replacing each pixel with the uniform average of its 3x3 neighborhood; frame by frame, a bright square on a dark background turns into a larger, smoothly blurred square. Source: S. Seitz

Correlation filtering. Say the averaging window size is 2k+1 x 2k+1. Attribute uniform weight to each pixel and loop over all pixels in the neighborhood around image pixel F[i,j]:

G[i,j] = 1/(2k+1)^2 * sum_{u=-k..k} sum_{v=-k..k} F[i+u, j+v]

Now generalize to allow different weights depending on each neighboring pixel's relative position (non-uniform weights H[u,v]):

G[i,j] = sum_{u=-k..k} sum_{v=-k..k} H[u,v] * F[i+u, j+v]

Filtering an image = replace each pixel with a linear combination of its neighbors.

Worked example (animated on the slides): with F the image and H a 3x3 filter of weights, center H on pixel (i, j) and accumulate H[u,v] * F[i+u, j+v], starting at the neighbor (u = -1, v = -1) and visiting each of the nine positions in turn.
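The formula above translates directly into code; a naive sketch (zero padding at the borders is one assumption among several possible choices):

```python
import numpy as np

def correlate(F, H):
    """Cross-correlation: G[i,j] = sum_{u,v} H[u,v] * F[i+u, j+v],
    with the (2k+1)x(2k+1) filter H centered on pixel (i, j).
    Borders are zero-padded for simplicity."""
    k = H.shape[0] // 2
    Fp = np.pad(F.astype(float), k)            # zero padding
    G = np.zeros(F.shape, dtype=float)
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            G[i, j] = np.sum(H * Fp[i:i + 2 * k + 1, j:j + 2 * k + 1])
    return G

F = np.array([[0, 0, 0, 90, 90],
              [0, 0, 0, 90, 90],
              [0, 0, 0, 90, 90]])
H = np.ones((3, 3)) / 9.0                      # uniform weights = moving average
print(correlate(F, H))
```

With uniform weights this reproduces the moving-average example; swapping in other H matrices gives the practice filters that follow.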

Practice with linear filters: a 3x3 filter with a single 1 at the center (an impulse) and zeros elsewhere. What is the result? Original. Source: D. Lowe

Practice with linear filters: the result is the original image, filtered with no change (the impulse copies each pixel straight through). Source: D. Lowe

Practice with linear filters: a 3x3 filter with the single 1 moved one position off-center. What is the result? Original. Source: D. Lowe

Practice with linear filters: the image is shifted left by 1 pixel with correlation. Source: D. Lowe

Practice with linear filters: a uniform 3x3 box filter (all weights 1/9). What is the result? Original. Source: D. Lowe

Practice with linear filters Original Blur Source: D. Lowe

Practice with linear filters: twice the impulse minus the box filter, i.e. [0 0 0; 0 2 0; 0 0 0] - (1/9) * [1 1 1; 1 1 1; 1 1 1]. What is the result? Original. Source: D. Lowe

Practice with linear filters: the result is a sharpening filter, which accentuates differences with the local average. Source: D. Lowe

Filtering examples: sharpening

Gaussian filter. What if we want the nearest neighboring pixels to have the most influence on the output? Replace the uniform box with a kernel such as (1/16) * [1 2 1; 2 4 2; 1 2 1]. This kernel is an approximation of a 2D Gaussian function: h(u,v) = (1/(2*pi*sigma^2)) * exp(-(u^2 + v^2) / (2*sigma^2)). Source: S. Seitz
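A short sketch showing where such kernels come from (illustrative; the function name and the sigma value are assumptions): sample the Gaussian on an integer grid and normalize. With sigma around 0.85, the 3x3 result rounds to exactly the kernel above.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sampled 2D Gaussian h(u,v) ~ exp(-(u^2 + v^2) / (2 sigma^2)),
    normalized so the weights sum to 1."""
    r = radius if radius is not None else int(3 * sigma)
    u = np.arange(-r, r + 1)
    g1d = np.exp(-u ** 2 / (2 * sigma ** 2))
    g = np.outer(g1d, g1d)        # separable: outer product of 1D Gaussians
    return g / g.sum()

# 3x3 kernel with sigma ~ 0.85 matches (1/16) * [[1,2,1],[2,4,2],[1,2,1]]
print(np.round(16 * gaussian_kernel(0.85, radius=1)))
```

Because the 2D Gaussian is separable (an outer product of two 1D Gaussians), smoothing can be done as two cheap 1D passes instead of one 2D pass.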