Optimization and least squares. Prof. Noah Snavely CS1114

Optimization and least squares Prof. Noah Snavely CS1114 http://cs1114.cs.cornell.edu

Administrivia
- A5 Part 1 due tomorrow by 5pm (please sign up for a demo slot)
- Part 2 will be due in two weeks (4/17)
- Prelim 2 on Tuesday 4/7 (in class); covers everything since Prelim 1, including: polygons, convex hull, interpolation, image transformation, feature-based object recognition, solving for affine transformations
- Review session on Monday, 7pm, Upson 315

Image transformations: What happens when the image moves outside the image boundary? Previously, we tried to fix this by changing the transformation.

Image transformations, another approach: We can compute where the image moves to, in particular the minimum row and the minimum column, as well as the width and height of the output image (e.g., max_col − min_col + 1). Then we can shift the image so it fits inside the output image. This could be done with another transformation, but we could also just add/subtract min_col and min_row.

Shifting the image: we need to shift (row, col) by adding (min_row, min_col) before applying T⁻¹. [Figure: an h-by-w output image, with the transformed image's corners landing at (30, -10) and (-15, -50) relative to matrix entry (1,1).]
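As a rough sketch of this bookkeeping (in Python/NumPy rather than the course's MATLAB; the helper name, the 2x2 linear transform T, and the (row, col) corner convention are assumptions, not from the slides):

```python
import numpy as np

def transform_bounds(T, h, w):
    """Apply a 2x2 linear transform T to the four corners of an h-by-w
    image (1-based (row, col) coordinates) and return the output bounds."""
    corners = np.array([[1, 1], [1, w], [h, 1], [h, w]], dtype=float).T
    warped = T @ corners
    min_row, min_col = warped.min(axis=1)
    max_row, max_col = warped.max(axis=1)
    out_h = int(np.ceil(max_row - min_row)) + 1   # height = max_row - min_row + 1
    out_w = int(np.ceil(max_col - min_col)) + 1   # width  = max_col - min_col + 1
    return min_row, min_col, out_h, out_w
```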

Convex hulls: If I have 100 points, and 10 are on the convex hull, and I add 1 more point, then the maximum number of points on the convex hull is 11, and the minimum number of points on the convex hull is 3.

Object recognition
1. Detect features in two images
2. Match features between the two images
3. Select three matches at random
4. Solve for the affine transformation T
5. Count the number of inlier matches to T
6. If T has the highest number of inliers so far, save it
7. Repeat steps 3-6 for N rounds, return the best T
A sketch of this loop appears below.
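A minimal Python/NumPy sketch of steps 3-7 (the course uses MATLAB, so this is only illustrative; the function name, inlier threshold, and error measure are assumptions):

```python
import numpy as np

def ransac_affine(pts1, pts2, n_rounds=1000, inlier_thresh=5.0):
    """Fit an affine transform T mapping pts1 -> pts2 by repeated random sampling.
    pts1, pts2: N x 2 arrays of matched feature locations."""
    best_T, best_inliers = None, -1
    n = len(pts1)
    for _ in range(n_rounds):
        # 3. select three matches at random
        idx = np.random.choice(n, 3, replace=False)
        # 4. solve for the 6 affine parameters from the 3 correspondences
        A, b = [], []
        for (x, y), (xp, yp) in zip(pts1[idx], pts2[idx]):
            A.append([x, y, 1, 0, 0, 0]); b.append(xp)
            A.append([0, 0, 0, x, y, 1]); b.append(yp)
        try:
            params = np.linalg.solve(np.array(A), np.array(b))
        except np.linalg.LinAlgError:
            continue  # degenerate sample (e.g., collinear points)
        T = np.vstack([params.reshape(2, 3), [0, 0, 1]])
        # 5. count inliers: matches whose transformed location is close enough
        pred = (T @ np.hstack([pts1, np.ones((n, 1))]).T).T[:, :2]
        inliers = np.sum(np.linalg.norm(pred - pts2, axis=1) < inlier_thresh)
        # 6. keep the best transform seen so far
        if inliers > best_inliers:
            best_T, best_inliers = T, inliers
    return best_T, best_inliers
```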

Object recognition

When this won't work: If the percentage of inlier matches is small, then this may not work. In theory, < 50% inliers could break it, but all the outliers would have to fit a single transform; in practice it often works with fewer inliers.

A slightly trickier problem: What if we want to fit T to more than three points? For instance, all of the inliers we find? Say we found 100 inliers. Now we have 200 equations, but still only 6 unknowns: an overdetermined system of equations. This brings us back to linear regression.
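Concretely, stacking all 100 inlier matches gives a 200 x 6 system that we can hand to a least-squares solver (a Python/NumPy sketch; the function name and layout are assumptions):

```python
import numpy as np

def fit_affine_lstsq(pts1, pts2):
    """Fit an affine transform to all matched points at once by least squares.
    With 100 matches this gives 200 equations in the 6 unknowns a..f."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        rows.append([x, y, 1, 0, 0, 0]); rhs.append(xp)   # xp ≈ a*x + b*y + c
        rows.append([0, 0, 0, x, y, 1]); rhs.append(yp)   # yp ≈ d*x + e*y + f
    params, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return params.reshape(2, 3)   # [[a, b, c], [d, e, f]]
```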

Linear regression, > 2 points: the line won't necessarily pass through any data point. [Figure: data points (x_i, y_i) with a fitted line y = mx + b.]

Some new definitions: No line is perfect; we can only find the best line out of all the imperfect ones. We'll define a function Cost(m, b) that measures how far a line is from the data, then find the best line, i.e., the (m, b) that minimizes Cost(m, b). Such a function Cost(m, b) is called an objective function. (You learned about these in section yesterday.)

Line goodness: What makes a line good versus bad? This is actually a very subtle question.

Residual errors: The difference between what the model predicts and what we observe is called a residual error (i.e., a left-over). Consider the data point (x, y). The model (m, b) predicts (x, mx + b), so the residual is y − (mx + b). For 1D regressions, residuals can be easily visualized as the vertical distance to the line.

Least squares fitting: This is a reasonable cost function, but we usually use something slightly different.

Least squares fitting: We prefer to make this a squared distance. This is called least squares.
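In symbols (the slide's exact notation isn't transcribed, so this is the standard definition), the least-squares objective for the line fit is:

```latex
\mathrm{Cost}(m, b) = \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2
```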

Why least squares? There are lots of reasonable objective functions, so why do we want to use least squares? This is a very deep question. We will soon point out two things that are special about least squares; the full story probably needs to wait for graduate-level courses, or at least next semester.

Gradient descent (you learned about this in section). Basic strategy:
1. Start with some guess for the minimum
2. Find the direction of steepest descent (the gradient)
3. Take a step in that direction (making sure that you get lower; if not, adjust the step size)
4. Repeat until taking a step doesn't get you much lower
A small sketch of this loop for the line fit follows.
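A minimal Python/NumPy sketch of this strategy applied to Cost(m, b) above (it uses a fixed step size rather than the adjustment in step 3, and the step size and iteration count are made-up values that would need tuning):

```python
import numpy as np

def gradient_descent_line(x, y, step=0.001, n_iters=10000):
    """Gradient descent on Cost(m, b) = sum((y - (m*x + b))**2)."""
    m, b = 0.0, 0.0                    # 1. start with some guess
    for _ in range(n_iters):
        r = y - (m * x + b)            # residuals at the current guess
        grad_m = -2.0 * np.sum(r * x)  # dCost/dm
        grad_b = -2.0 * np.sum(r)      # dCost/db
        m -= step * grad_m             # 2-3. step in the direction of steepest descent
        b -= step * grad_b
    return m, b
```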

Gradient descent, 1D quadratic: There is some black magic in setting the step size.

Some error functions are easy: A (positive) quadratic is a convex function; the set of points above the curve forms an (infinite) convex set. The previous slide shows this in 1D, but it's true in any dimension. A sum of convex functions is convex; thus, the sum of squared errors is convex. Convex functions are nice: they have a single global minimum, and rolling downhill from anywhere gets you there.

Consequences: Our gradient descent method will always converge to the right answer, by slowly rolling downhill. It might take a long time, and it is hard to predict exactly how long (see CS3220 and beyond).

What else about LS? Least squares has an even more amazing property than convexity. Consider the linear regression problem: there is a magic formula for the optimal choice of (m, b). You don't need to roll downhill; you can simply compute the right answer.

Closed-form solution! This is a huge part of why everyone uses least squares. Other functions are convex, but have no closed-form solution, e.g. ...

Closed-form LS formula: The derivation requires linear algebra. Most books use calculus also, but it's not required (see the Links section on the course web page). There's a closed form for any linear least-squares problem.
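The formula itself isn't transcribed from the slide, but the standard closed form for the simple line fit, and the general normal-equation form for any linear least-squares problem with design matrix A, are:

```latex
m = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{\,n\sum_i x_i^2 - \bigl(\sum_i x_i\bigr)^2\,},
\qquad
b = \frac{\sum_i y_i - m\sum_i x_i}{n},
\qquad
\beta^{*} = (A^{\mathsf T} A)^{-1} A^{\mathsf T} y
```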

Linear least squares: Any formula where the residual is linear in the variables. Examples:
1. simple linear regression: [y − (mx + b)]²
2. finding an affine transformation: [x′ − (ax + by + c)]² + [y′ − (dx + ey + f)]²
Non-example: [x − a·b·c·x]² (variables: a, b, c), since the product of the variables makes the residual nonlinear in them.

Linear least squares: Surprisingly, fitting the coefficients of a quadratic is still linear least squares; the residual is still linear in the coefficients β_1, β_2, β_3. [Figure: quadratic fit to data, from Wikipedia, "Least squares fitting".]
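For instance (a Python/NumPy sketch with made-up sample data), the quadratic fit reduces to an ordinary linear least-squares solve on the design matrix with columns x², x, and 1:

```python
import numpy as np

# Fitting y ≈ b1*x**2 + b2*x + b3 is still *linear* least squares:
# the residual is linear in the coefficients b1, b2, b3.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 0.9, 2.8, 7.2, 13.0])          # made-up sample data

A = np.column_stack([x**2, x, np.ones_like(x)])   # design matrix
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)    # solves min ||A @ coeffs - y||^2
print(coeffs)                                     # [b1, b2, b3]
```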

Optimization: Least squares is an example of an optimization problem. Optimization: define a cost function and a set of possible solutions, and find the one with the minimum cost. Optimization is a huge field.

Optimization strategies, from worst to pretty good:
1. Guess an answer until you get it right
2. Guess an answer, improve it until you get it right
3. Magically compute the right answer
For most problems, 2 is the best we can do. Some problems are easy enough that we can do 3; we've seen several examples of this.

Sorting as optimization: Set of allowed answers: permutations of the input sequence. Cost(permutation) = number of out-of-order pairs. Algorithm 1: Snailsort. Algorithm 2: Bubble sort. Algorithm 3: ??? (A sketch of the cost function follows.)
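A direct Python sketch of this objective (just the cost function, written the slow O(n²) way for clarity; not an efficient inversion-counting algorithm):

```python
def sort_cost(perm):
    """Cost(permutation) = number of out-of-order pairs (inversions)."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])

print(sort_cost([3, 1, 2]))   # 2 out-of-order pairs: (3,1) and (3,2)
print(sort_cost([1, 2, 3]))   # 0: already sorted, the minimum-cost permutation
```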

Another example: structure from motion. Minimize f(R, T, P) over the camera poses (R_1, t_1), (R_2, t_2), (R_3, t_3) and the 3D point positions p_1, ..., p_7. [Figure: Cameras 1-3 observing a set of 3D points.]

Optimization is everywhere: How can you give someone the smallest number of coins back as change? How do you schedule your classes so that there are the fewest conflicts? How do you manage your time to maximize the grade / work tradeoff?

Next time: Prelim 2