Last update: May 4, 2010. Vision. CMSC 421: Chapter 24




Outline: Perception generally; Image formation; Early vision: 2D → 3D; Object recognition

Perception generally
Stimulus (percept) S, world W: S = g(W)
E.g., g = graphics. Can we do vision as inverse graphics? W = g⁻¹(S)
Problem: massive ambiguity!

Example [two figure slides illustrating the ambiguity]

Better approaches
Bayesian inference of world configurations:
P(W | S) = α P(S | W) P(W)
where P(S | W) is the graphics model and P(W) is prior knowledge.
Better still: no need to recover the exact scene! Just extract the information needed for navigation, manipulation, and recognition/identification.
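The posterior computation can be sketched on a tiny discrete example. The world hypotheses, priors, and likelihoods below are invented for illustration; a real vision system would get P(S | W) from a rendering model. The example also illustrates the ambiguity point: both worlds produce the same stimulus equally well, so the posterior just reproduces the prior.

```python
def posterior(prior, likelihood, stimulus):
    """P(W | S) = alpha * P(S | W) * P(W), normalized over worlds W."""
    unnorm = {w: likelihood[w].get(stimulus, 0.0) * p for w, p in prior.items()}
    alpha = 1.0 / sum(unnorm.values())
    return {w: alpha * u for w, u in unnorm.items()}

# Toy hypotheses: a real cube vs. a flat line drawing of a cube.
prior = {"cube": 0.7, "flat_drawing": 0.3}
likelihood = {  # P(stimulus | world): both worlds project to the same image
    "cube": {"hexagon_image": 0.9},
    "flat_drawing": {"hexagon_image": 0.9},
}
post = posterior(prior, likelihood, "hexagon_image")
```

Since the likelihoods are equal, only the prior breaks the tie, which is exactly why prior knowledge is essential.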

Vision subsystems [figure]: low-level vision (edge detection, filters) extracts features from the image sequence (edges, regions, disparity, depth maps, optical flow); shape-from-X modules (s.f. stereo, s.f. motion, s.f. shading, s.f. contour), segmentation, matching, tracking, and data association feed object recognition, which builds up objects and scenes for high-level vision and behaviors.
Vision requires combining multiple cues.

Image formation
[Figure: pinhole camera, with image plane at distance f from the pinhole]
P is a point in the scene, with coordinates (X, Y, Z).
P′ is its image on the image plane, with coordinates (x, y, z).
x = fX/Z, y = fY/Z, where f is a scaling factor.
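The projection equations are a one-liner; this sketch just evaluates them. The numeric values (a point 4 m away, f = 0.05) are made up for illustration.

```python
def project(X, Y, Z, f):
    """Pinhole projection of scene point (X, Y, Z) onto the image plane:
    x = f*X/Z, y = f*Y/Z, with f the scaling factor from the slide.
    Z must be nonzero (the point cannot sit in the pinhole plane)."""
    return f * X / Z, f * Y / Z

x, y = project(2.0, 1.0, 4.0, 0.05)  # -> (0.025, 0.0125)
```

Note that depth Z divides out: any scene point on the same ray through the pinhole projects to the same (x, y), which is one source of the "massive ambiguity" above.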

Images [figure]

Images, continued
[Figure: grid of pixel intensity values for a small image patch]
I(x, y, t) is the intensity at (x, y) at time t.
Typical digital camera: 5 to 10 million pixels.
Human eyes: 240 million pixels; about 0.25 terabits/sec.

Color vision
Intensity varies with frequency: an infinite-dimensional signal.
[Figure: intensity vs. frequency; sensitivity vs. frequency for the three receptor types]
The human eye has three types of color-sensitive cells; each integrates the signal, yielding a 3-element vector.
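The integration step can be sketched as a weighted sum of a sampled spectrum against each receptor's sensitivity curve. The spectrum and sensitivity curves below are invented toy data, not real cone responses; the point is only that an arbitrarily detailed spectrum collapses to three numbers.

```python
def receptor_responses(spectrum, sensitivities):
    """Each receptor integrates (here: sums) the spectrum weighted by
    its own sensitivity curve, producing one number per receptor."""
    return [sum(s * w for s, w in zip(spectrum, sens)) for sens in sensitivities]

spectrum = [0.0, 0.2, 0.8, 0.3]      # intensity at 4 sample frequencies
sensitivities = [
    [1.0, 0.5, 0.0, 0.0],            # "blue"-like receptor (toy curve)
    [0.0, 1.0, 0.5, 0.0],            # "green"-like receptor (toy curve)
    [0.0, 0.0, 0.5, 1.0],            # "red"-like receptor (toy curve)
]
rgbish = receptor_responses(spectrum, sensitivities)  # 3-element vector
```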

Primary colors
Did anyone ever tell you that the three primary colors are red, yellow, and blue? Not correct.
Did anyone ever tell you that red, green, and blue are the primary colors of light, and red, yellow, and blue are the primary pigments? Only partially correct.
What do we mean by primary colors? Ideally, a small set of colors from which we can generate every color the eye can see.
Some colors can't be generated by any combination of red, yellow, and blue pigment.

Primary colors
Use color frequencies that stimulate the three color receptors.
Primary colors of light (additive): frequencies where the three color receptors are most sensitive: red, green, blue.
Primary pigments (subtractive): white minus red = cyan; white minus green = magenta; white minus blue = yellow.
Four-color (CMYK) printing uses cyan, magenta, yellow, and black.
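The "white minus X" relationship is easy to make concrete with channels in [0, 1]: a pigment's CMY value is white (1, 1, 1) minus the corresponding light color. This is an idealized sketch; real printing inks are not perfect subtractive filters.

```python
def rgb_to_cmy(r, g, b):
    """Subtractive complement: pigment = white - light color."""
    return 1.0 - r, 1.0 - g, 1.0 - b

cyan = rgb_to_cmy(1.0, 0.0, 0.0)     # white minus red   -> (0, 1, 1)
magenta = rgb_to_cmy(0.0, 1.0, 0.0)  # white minus green -> (1, 0, 1)
yellow = rgb_to_cmy(0.0, 0.0, 1.0)   # white minus blue  -> (1, 1, 0)
```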

Edge detection
Edges in the image arise from discontinuities in the scene:
1) depth
2) surface orientation
3) reflectance (surface markings)
4) illumination (shadows, etc.)
Edges correspond to abrupt changes in brightness or color.

Edge detection
Edges correspond to abrupt changes in brightness or color. E.g., a single row or column of an image: [figure]
Ideally, we could detect abrupt changes by computing a derivative: [figure]
In practice, this has some problems...

Noise
Problem: the image is noisy [figure], so the derivative looks like this: [figure]
We first need to smooth out the noise.
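The problem can be sketched on a 1-D row of intensities: differencing a clean step edge gives one sharp spike, but differencing a noisy version of the same row produces spurious fluctuations as well, which is why smoothing comes first. The signal values and noise range below are invented for illustration.

```python
import random

random.seed(0)
clean = [10.0] * 10 + [200.0] * 10                  # one step edge at index 10
noisy = [v + random.uniform(-20, 20) for v in clean]

def diff(signal):
    """Finite-difference approximation of the derivative."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

d_clean = diff(clean)   # single spike of 190 at the edge, zeros elsewhere
d_noisy = diff(noisy)   # edge spike plus noise-induced fluctuations
```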

Smoothing out the noise
Convolution g = h * f: for each point f[i, j], compute a weighted average g[i, j] of the points near f[i, j]:
g[i, j] = Σ_{u=−k}^{k} Σ_{v=−k}^{k} h[u, v] f[i − u, j − v]
Use convolution to smooth the image, then differentiate: [figure]
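The convolution sum can be sketched in 1-D (the 2-D version just adds the second index). The 3-tap box kernel below is a stand-in for whatever smoothing kernel a real system would use, e.g. a sampled Gaussian; border pixels are simply skipped.

```python
def convolve1d(h, f):
    """g[i] = sum over u in [-k, k] of h[u] * f[i - u], for the interior
    points where the kernel fits entirely inside the signal."""
    k = len(h) // 2                      # kernel half-width
    out = []
    for i in range(k, len(f) - k):
        out.append(sum(h[u + k] * f[i - u] for u in range(-k, k + 1)))
    return out

kernel = [1 / 3, 1 / 3, 1 / 3]           # simple box smoothing
signal = [10.0, 10.0, 10.0, 200.0, 200.0, 200.0]
smoothed = convolve1d(kernel, signal)    # step edge becomes a gradual ramp
```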

Edge detection
Find all pixels at which the derivative is above some threshold: [figure]
Label above-threshold pixels with their edge orientation.
Infer clean line segments by combining edge pixels with the same orientation.
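The thresholding step can be sketched on one image row: mark positions where the absolute difference between neighboring pixels exceeds a threshold. The intensities and the threshold value are invented for illustration.

```python
def edge_pixels(row, threshold):
    """Indices i where |row[i+1] - row[i]| exceeds the threshold,
    i.e. an edge falls between pixel i and pixel i + 1."""
    edges = []
    for i in range(len(row) - 1):
        if abs(row[i + 1] - row[i]) > threshold:
            edges.append(i)
    return edges

row = [12.0, 11.0, 13.0, 180.0, 182.0, 181.0]
found = edge_pixels(row, threshold=50.0)  # -> [2]
```

Small fluctuations (±1–2 in intensity) stay below the threshold, so only the real jump is reported.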

Inferring shape
Can use cues from prior knowledge to help infer shape:
Shape from...   Assumes
motion          rigid bodies, continuous motion
stereo          solid, contiguous, non-repeating bodies
texture         uniform texture
shading         uniform reflectance
contour         minimum curvature

Inferring shape from motion
Is this a 3-d shape? If so, what shape is it?

Inferring shape from motion
Rotate the shape slightly [figure]

Inferring shape from motion
Match the corresponding pixels in the two images.
Use this to infer the 3-dimensional locations of the pixels.

Inferring shape using stereo vision
Can do something similar using stereo vision.
[Figure: a perceived object point P projected into the left and right images]

Inferring shape from texture
Idea: assume the actual texture is uniform, and compute the surface shape that would produce the observed distortion.
A similar idea works for shading: assume uniform reflectance, etc.
But interreflections make perceived intensity a nonlocal computation, so hollows seem shallower than they really are.

Inferring shape from edges
Sometimes we can infer shape from properties of the edges and vertices.
Let's consider only solid polyhedral objects with trihedral vertices.
Only certain types of labels are possible, e.g., [figure]

All possible combinations of vertex/edge labels [figure]
Given a line drawing, use constraint satisfaction to assign labels to edges.

Vertex/edge labeling example [figure sequence]
Use a constraint-satisfaction algorithm to assign labels to edges: variables = edges, constraints = possible node configurations.
Start by putting clockwise arrows on the boundary, then propagate labels vertex by vertex.
At one point no consistent label remains, so we need to backtrack; later we need to backtrack again.
Eventually a consistent labeling is found.
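The search described above can be sketched as a generic backtracking constraint solver. The edge names, the label set, and the allowed label combinations below are invented toy data, not the real trihedral-vertex junction catalog; only the structure (assign, check vertex constraints, backtrack on failure) matches the slides.

```python
LABELS = ["+", "-", "->"]
EDGES = ["e1", "e2", "e3"]
# Each constraint: (edges meeting at a vertex, set of allowed label tuples).
CONSTRAINTS = [
    (("e1", "e2"), {("+", "+"), ("->", "-")}),
    (("e2", "e3"), {("+", "->"), ("-", "-")}),
]

def consistent(assignment):
    """A partial assignment is consistent if every fully-assigned
    vertex constraint is satisfied."""
    for edges, allowed in CONSTRAINTS:
        if all(e in assignment for e in edges):
            if tuple(assignment[e] for e in edges) not in allowed:
                return False
    return True

def backtrack(assignment):
    if len(assignment) == len(EDGES):
        return dict(assignment)
    edge = EDGES[len(assignment)]
    for label in LABELS:
        assignment[edge] = label
        if consistent(assignment):
            result = backtrack(assignment)
            if result:
                return result
        del assignment[edge]          # backtrack, as in the slides
    return None

solution = backtrack({})
```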

Object recognition
Simple idea: extract 3-D shapes from the image, match against a shape library.
Problems:
extracting curved surfaces from the image
representing the shape of the extracted object
representing the shape and variability of library object classes
improper segmentation, occlusion
unknown illumination, shadows, markings, noise, complexity, etc.
Approaches:
index into the library by measuring invariant properties of objects
alignment of image features with projected library object features
match the image against multiple stored views (aspects) of each library object
machine learning methods based on image statistics

Handwritten digit recognition
3-nearest-neighbor = 2.4% error
400-300-10 unit MLP (multi-layer perceptron) = 1.6% error
LeNet: 768-192-30-10 unit MLP = 0.9% error
Current best (kernel machines, vision algorithms) = 0.6% error
A kernel machine is a statistical learning algorithm; the kernel is a particular kind of vector function (see Section 20.6).
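The nearest-neighbor baseline quoted above can be sketched in a few lines: classify a query image by the majority label of its k closest training images under Euclidean distance. The 2-element "images" and labels below are invented toy data, not NIST digits.

```python
from collections import Counter

def knn_classify(train, query, k):
    """train: list of (features, label) pairs; returns the majority
    label among the k nearest training points to the query under
    squared Euclidean distance."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0.0, 0.0), "0"), ((0.1, 0.2), "0"), ((1.0, 1.0), "1"),
         ((0.9, 1.1), "1"), ((1.2, 0.8), "1")]
label = knn_classify(train, (0.95, 0.9), k=3)  # -> "1"
```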

Shape-context matching
Do two shapes represent the same object?
Represent each shape as a set of sampled edge points p_1, ..., p_n.
For each point p_i, compute its shape context: a function that represents p_i's relation to the other points in the shape.
[Figure: shape P, shape Q, and the polar coordinates (log r, θ) used for the shape context]

Shape-context matching contd.
A point's shape context is a histogram: a grid of discrete values of log r and θ; the grid divides space into bins, one for each pair (log r, θ).
[Figure: three example log-polar histograms]
h_i(k) = the number of points in the k-th bin for point p_i.
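Building one point's histogram can be sketched directly: for each other point, compute (log r, θ) relative to p_i and increment the matching bin. The bin edges and point set below are invented and much smaller than the log-polar grids used in practice.

```python
import math

def shape_context(points, i, r_edges, n_theta):
    """Histogram h_i over (log r, theta) bins of all points j != i.
    r_edges gives the log-radius bin boundaries; theta is split into
    n_theta equal angular bins."""
    hist = [0] * (len(r_edges) - 1) * n_theta
    xi, yi = points[i]
    for j, (xj, yj) in enumerate(points):
        if j == i:
            continue
        log_r = math.log(math.hypot(xj - xi, yj - yi))
        theta = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        for r_bin in range(len(r_edges) - 1):
            if r_edges[r_bin] <= log_r < r_edges[r_bin + 1]:
                hist[r_bin * n_theta + t_bin] += 1
                break
    return hist

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-2.0, 0.0)]
h0 = shape_context(points, 0, r_edges=[-1.0, 0.5, 2.0], n_theta=4)
```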

Shape-context matching contd.
For p_i in shape P and q_j in shape Q, let
C_ij = (1/2) Σ_{k=1}^{K} [h_i(k) − h_j(k)]² / [h_i(k) + h_j(k)]
C_ij is the χ² distance between the histograms of p_i and q_j; the smaller it is, the more alike p_i and q_j are.
Find a one-to-one correspondence between the points in shape P and the points in shape Q that minimizes the total cost: a weighted bipartite matching problem, solvable in time O(n³).
Simple nearest-neighbor learning gives a 0.6% error rate on NIST digit data.
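The cost and matching steps can be sketched together. The χ² cost follows the formula above; the correspondence search below is a brute-force sweep over permutations, a stand-in for the O(n³) weighted bipartite matching algorithm a real system would use, and viable only for the tiny invented histograms shown.

```python
from itertools import permutations

def chi2_cost(h_i, h_j):
    """C_ij = 0.5 * sum_k (h_i(k) - h_j(k))^2 / (h_i(k) + h_j(k)),
    skipping empty bin pairs to avoid division by zero."""
    total = 0.0
    for a, b in zip(h_i, h_j):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def best_matching(P_hists, Q_hists):
    """Minimum-total-cost one-to-one assignment of P points to Q points
    (brute force; real systems use bipartite matching)."""
    n = len(P_hists)
    best = None
    for perm in permutations(range(n)):
        cost = sum(chi2_cost(P_hists[i], Q_hists[perm[i]]) for i in range(n))
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best

P_hists = [[3, 0, 1], [0, 4, 0]]
Q_hists = [[0, 4, 1], [3, 0, 1]]
cost, assignment = best_matching(P_hists, Q_hists)  # matches P0->Q1, P1->Q0
```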

Summary
Vision is hard: noise, ambiguity, complexity.
Prior knowledge is essential to constrain the problem.
Need to combine multiple cues: motion, contour, shading, texture, stereo.
Library object representation: shape vs. aspects.
Image/object matching: features, lines, regions, etc.