COMP 102: Computers and Computing

Similar documents
Lecture 14: Computer Vision

Computational Foundations of Cognitive Science

Finally: Motion and tracking. Motion 4/20/2011. CS 376 Lecture 24 Motion 1. Video. Uses of motion. Motion parallax. Motion field

Project 2 due today Project 3 out today. Readings Szeliski, Chapter 10 (through 10.5)

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision

Stereo Vision A simple system. Dr. Gerhard Roth Winter 2012

Multiple View Geometry

Multiple View Geometry

Last update: May 4, Vision. CMSC 421: Chapter 24. CMSC 421: Chapter 24 1

Face Recognition At-a-Distance Based on Sparse-Stereo Reconstruction

COMP 102: Computers and Computing

Dense 3D Reconstruction. Christiano Gava

Transactions on Information and Communications Technologies vol 16, 1996 WIT Press, ISSN

CPSC 425: Computer Vision

Colorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.

CS5670: Computer Vision

A Simple Vision System

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

3D Modeling of Objects Using Laser Scanning

Computer Vision Lecture 17

Computer Vision Lecture 17

Miniature faking. In close-up photo, the depth of field is limited.

12/3/2009. What is Computer Vision? Applications. Application: Assisted driving Pedestrian and car detection. Application: Improving online search

Dense 3D Reconstruction. Christiano Gava

Combinatorial optimization and its applications in image Processing. Filip Malmberg

Occlusion Detection of Real Objects using Contour Based Stereo Matching

Lecture 17: Recursive Ray Tracing. Where is the way where light dwelleth? Job 38:19

COS Lecture 10 Autonomous Robot Navigation

Topics to be Covered in the Rest of the Semester. CSci 4968 and 6270 Computational Vision Lecture 15 Overview of Remainder of the Semester

Thanks to Chris Bregler. COS 429: Computer Vision

Depth Camera for Mobile Devices

Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth

All human beings desire to know. [...] sight, more than any other senses, gives us knowledge of things and clarifies many differences among them.

Computational Medical Imaging Analysis Chapter 4: Image Visualization

Digital Image Processing Lectures 1 & 2

Geometric Reconstruction Dense reconstruction of scene geometry

CSCI 5980: Assignment #3 Homography

Multi-view stereo. Many slides adapted from S. Seitz

Recap from Previous Lecture

A virtual tour of free viewpoint rendering

PERFORMANCE CAPTURE FROM SPARSE MULTI-VIEW VIDEO

IDE-3D: Predicting Indoor Depth Utilizing Geometric and Monocular Cues

Computer and Machine Vision

Announcements. Stereo Vision Wrapup & Intro Recognition

Tecnologie per la ricostruzione di modelli 3D da immagini. Marco Callieri ISTI-CNR, Pisa, Italy

Lecture 19: Depth Cameras. Visual Computing Systems CMU , Fall 2013

Segmentation and Tracking of Partial Planar Templates

Practice Exam Sample Solutions

Computer and Machine Vision

Graphics and Games. Penny Rheingans University of Maryland Baltimore County

Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation

Chapter 7. Conclusions and Future Work

An introduction to 3D image reconstruction and understanding concepts and ideas

Mach band effect. The Mach band effect increases the visual unpleasant representation of curved surface using flat shading.

What is Computer Vision?

Visibility. Tom Funkhouser COS 526, Fall Slides mostly by Frédo Durand

Project 3 code & artifact due Tuesday Final project proposals due noon Wed (by ) Readings Szeliski, Chapter 10 (through 10.5)

Depth from Stereo. Dominic Cheng February 7, 2018

Complex Sensors: Cameras, Visual Sensing. The Robotics Primer (Ch. 9) ECE 497: Introduction to Mobile Robotics -Visual Sensors

Visual Learning and Recognition of 3D Objects from Appearance

Image Based Reconstruction II

Motion and Tracking. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation

What have we leaned so far?

Ceilbot vision and mapping system

arxiv: v1 [cs.cv] 28 Sep 2018

Stereo II CSE 576. Ali Farhadi. Several slides from Larry Zitnick and Steve Seitz

3D from Images - Assisted Modeling, Photogrammetry. Marco Callieri ISTI-CNR, Pisa, Italy

Visible Surface Ray-Tracing of Stereoscopic Images

Waleed Pervaiz CSE 352

Structured Light. Tobias Nöll Thanks to Marc Pollefeys, David Nister and David Lowe

Stereo Vision. MAN-522 Computer Vision

A Survey of Light Source Detection Methods

Advance Shadow Edge Detection and Removal (ASEDR)

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

TIED FACTOR ANALYSIS FOR FACE RECOGNITION ACROSS LARGE POSE DIFFERENCES

Recap: Features and filters. Recap: Grouping & fitting. Now: Multiple views 10/29/2008. Epipolar geometry & stereo vision. Why multiple views?

surface: reflectance transparency, opacity, translucency orientation illumination: location intensity wavelength point-source, diffuse source

Last week. Machiraju/Zhang/Möller

Computer Graphics. Lecture 9 Environment mapping, Mirroring

Recognition: Face Recognition. Linda Shapiro EE/CSE 576

Lecture 10 Multi-view Stereo (3D Dense Reconstruction) Davide Scaramuzza

Computer Vision I - Appearance-based Matching and Projective Geometry

Ulrik Söderström 17 Jan Image Processing. Introduction

CS 4758: Automated Semantic Mapping of Environment

CSE 4392/5369. Dr. Gian Luca Mariottini, Ph.D.

Shadows. COMP 575/770 Spring 2013

Stereo and Epipolar geometry

Data-driven Depth Inference from a Single Still Image

Interactive segmentation, Combinatorial optimization. Filip Malmberg

3D Computer Vision. Depth Cameras. Prof. Didier Stricker. Oliver Wasenmüller

Stereo CSE 576. Ali Farhadi. Several slides from Larry Zitnick and Steve Seitz

Light scattering is an important natural phenomenon, which arises when the light interacts with the particles distributed in the media.

Stereo and structured light

Today. Stereo (two view) reconstruction. Multiview geometry. Today. Multiview geometry. Computational Photography

Chaplin, Modern Times, 1936

Multi-View Stereo for Community Photo Collections Michael Goesele, et al, ICCV Venus de Milo

Ray Tracing. CSCI 420 Computer Graphics Lecture 15. Ray Casting Shadow Rays Reflection and Transmission [Ch ]

Lecture'9'&'10:'' Stereo'Vision'

Making Machines See. Roberto Cipolla Department of Engineering. Research team

Transcription:

COMP 102: Computers and Computing Lecture 23: Computer Vision Instructor: Kaleem Siddiqi (siddiqi@cim.mcgill.ca) Class web page: www.cim.mcgill.ca/~siddiqi/102.html

What is computer vision? Broadly speaking, it has to do with making a computer see. The complexity of this task is enormous! It is about understanding visual input and making inferences about the visual world. What you see is NOT necessarily what is there! COMP-102 2

Visual illusions COMP-102 3

Visual illusions An impossible object COMP-102 4

Visual illusions Mach bands COMP-102 5

What do you see? COMP-102 6

Is it clearer now? COMP-102 7

Bregman B s COMP-102 8

Why study computer vision? Scene reconstruction Object recognition and identification Object modeling Robot navigation Medical imaging Image-guided neurosurgery COMP-102 9

3 Problems we will look at Segmentation using deformable models (moving curves and surfaces) Face Recognition Stereo Vision COMP-102 10

Problem 1: Segmentation Segmentation has to do with processing visual images to extract the outlines of objects of interest. This is a difficult problem for a computer, because it is a challenge to come up with a precise definition of an outline that generalizes to many classes of objects. In an image changes in contrast often signal object boundaries, but they can also arise due to other cues (shadow boundaries, texture, surface discontinuities, etc.). COMP-102 11

A simpler scenario Assume that an object is on a homogeneous background Initialize a curve (a list of discrete points) and define a process by which these points move in the image, until they reach a place of high contrast, where they slow down. In the limit, this type of shrink wrap process can wrap around the outline of an object. This scenario can be modified to handle objects with a particular distinctive texture (e.g. a tissue class). In this case the stopping criterion takes into account the statistics of the enclosed region. COMP-102 12

An example COMP-102 13

Segmentation The previous process can also be carried out in 3D, where the evolving structure is no a surface. It deforms in 3D so as to wrap around the boundaries of 3D structures of interest, such as anatomical structures or tumours in an MRI scan. Initialize a curve (a list of discrete points) and define a process by which these points move in the image, until they reach a place of high contrast, where they slow down. In the limit, this type of shrink wrap process can wrap around the outline of an object. This scenario can be modified to handle objects with a particular distinctive texture (e.g. a tissue class). In this case the stopping criterion takes into account the statistics of the enclosed region. COMP-102 14

Problem 2: Face Recognition Imagine that one has a very large database of mugshots of individuals which one wishes to be able to recognize. Given a query image of a face, the goal is to identify the person by comparing the query against the images in the database and finding the closest match. A difficulty with this strategy is that a face can have may different appearances, dependng on pose, facial expression,etc. A second difficulty is that it is simply too computationally expensive to compare the query, pixel by pixel, with all the images in a large database. It just takes too long! COMP-102 15

A simpler scenario Assume that faces are seen from a particular viewpoint (e.g. frontal views) and that for each person one wishes to recognize there are many candidate pictures a system can use for training. Given such a large database, imagine that there is a special set of derived images, called principal components, PC1, PC2, PC3, with the following property: a candidate image is the average of all the images in the database, plus a weighted sum of the first few principal components. The representation is then just a vector of weights. COMP-102 16

An example Some pictures from a face database containing 40 pictures: 10 individuals, 4 pictures per individual. COMP-102 17

Face Recognition Average of the 40 pictures in the database COMP-102 18

Face Recognition First principal component COMP-102 19

Face Recognition Second principal component COMP-102 20

Face Recognition Third principal component COMP-102 21

Eigenspace representation Query image (not in database) Approximation using the first 10 principal components COMP-102 22

How does a face get recognized? The query image gets represented by a weighted combination of the first few principal components. Those weights are then compared against the weights that represent each image in the database. This is much cheaper because we are basicall comparing a small vector with another small vector (rather than two possibly mega-pixel sized images). The identity assigned to the query image is then that of the closest image found in the database. COMP-102 23

Problem 3: Stereo How does one perceived depth (3D) from a pair of stereo images, each of which is 2D? The human visual system, and that of most primates, exploits the notion of disparity. Given a fixed point in the 3D world, the location at which it is projected onto the left and right images in a stereo pair differs slightly. This difference is called disparity. The amount of disparity is inversely proportional to 3D depth. As a simple demonstration, hold your thumb out in front of your eyes and open one eye at a time (shutting the other eye when the first is open). The thumb will appear to jump from left to right. The amount of this jump decreases as you increase the distance of your thumb from your eye. COMP-102 24

Correspondence and Reconstruction Given a fixed camera system, with a left image and a right image, stereo breaks down into two problems. The first is the correspondence problem, which is the harder problem, and it has to do with associating points in one image with their corresponding points in the other. The notion of correspondence here is that the same physical point in the 3D world gave rise to these two projected points, one in each image. Often this problem is not well defined, e.g., imagine looking at a regular texture or at a homogeneous region, where a point in one image can correspond to many points in the other. Assuming though that there are distinctive features, once a correspondence has been found, the 3D point it came from can be constructed by just tracing a ray back from each camera center, through the left (or corresponding right) image point and then noting where these two rays intersect. This is called the reconstruction problem. COMP-102 25

Correspondence and Reconstruction Modern day stereo algorithms for rigid scenes are quite powerful. Given a fixed camera system, they can generate dense realistic 3D depth maps from left-right image pairs, such as the following One image (from a stereo pair) ground truth surface reconstructed surface COMP-102 26