Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception.

Common Classification Tasks

Recognition of individual objects/faces: Analyze object-specific features (e.g., key points). Train with images from different viewing angles.

Recognition of object classes: Analyze features that are consistent within a class and differ between classes as much as possible. Train with many exemplars from each class.

Recognition of scene types: Find and analyze common features, objects, or layouts within scene classes. Use a large variety of scene photos.

Example: AlexNet. Another Example: Inception.

Depth

Estimating the distance of a point from the observer is crucial for scene and object recognition. There are many monocular cues such as shading, texture gradient, and perspective. However, just like biological systems, computer vision systems can benefit greatly from stereo vision, i.e., using two visual inputs taken from slightly different angles. Computer vision systems also have the option of using completely different approaches such as radar, laser range finding, and structured lighting. Let us first talk about stereo vision.

In the simplest case, the two cameras used for stereo vision are identical and separated only along the x-axis by a baseline distance b. Then the image planes are coplanar. Depth information is obtained through the fact that the same feature point in the scene appears at slightly different positions in the two image planes. This displacement between the two images is called the disparity. The plane spanned by the two camera centers and the feature point is called the epipolar plane. Its intersection with the image plane is the epipolar line.

Geometry of binocular stereo vision

Ideally, every feature in one image will lie at the same vertical position in the other image. However, due to distortion in the camera images and imperfect geometry, there usually is also some vertical disparity. While many algorithms for binocular stereo do not account for such vertical disparity, it increases the robustness of our system if we allow at least a few pixels of deviation from the epipolar line.

Before we discuss the computation of depth from stereo imaging, let us look at two definitions: A conjugate pair is two points in different images that are the projections of the same point in the scene. Disparity is the distance between the points of a conjugate pair when the two images are superimposed.

Let us say that the scene point P is observed at points p_l and p_r in the left and right image planes, respectively. For convenience, let us assume that the origin of the coordinate system coincides with the left lens center. Comparing the similar triangles PMC_l and p_l LC_l, we get x/z = x_l/f. Similarly, for the triangles PNC_r and p_r RC_r, we get (x - b)/z = x_r/f. Combining the two equations gives us z = b*f/(x_l - x_r), i.e., depth is inversely proportional to the disparity x_l - x_r.

Due to the limited resolution of the images, increasing the baseline distance b gives us a more precise estimate of the depth z. However, the greater b is, the more different the two viewing angles are, and the more difficult it can become to determine the correspondence between the two images.
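With identical, parallel cameras, the relation z = b*f/(x_l - x_r) turns a measured disparity directly into depth. A minimal sketch of this computation (function and variable names are illustrative; units are up to the caller):

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Depth z = b*f / (x_l - x_r) for the ideal parallel-camera geometry.

    x_left, x_right: image x-coordinates of a conjugate pair (e.g., pixels)
    focal_length:    in the same units as the image coordinates
    baseline:        camera separation b (sets the unit of the result)
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    return baseline * focal_length / disparity

# A conjugate pair with 14 px disparity, f = 700 px, b = 0.5 m lies 25 m away.
print(depth_from_disparity(400, 386, 700, 0.5))  # prints 25.0
```

The formula also makes the resolution argument concrete: a one-pixel disparity error changes z more when the disparity is small (distant points), and a larger baseline b yields a larger disparity for the same depth.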
This brings us to the main problem in stereo vision: How can we find the conjugate pairs in our stereo images? This problem is called stereo matching.

In stereo matching, we have to solve a problem that is still under investigation by many researchers, called the correspondence problem. It can be phrased like this: For each point in the left image, find the corresponding point in the right image. The idea underlying all stereo matching algorithms is that these two points should be similar to each other. So we need a measure of similarity. Moreover, we need to find matchable features.
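A widely used similarity measure is the sum of squared differences (SSD) between small windows: for a window around a pixel in the left image, we search along the corresponding row of the right image and keep the disparity with the smallest SSD. A minimal NumPy sketch of this idea (window size, search range, and names are illustrative):

```python
import numpy as np

def match_along_row(left, right, row, x, half=2, max_disp=20):
    """Find the disparity of pixel (row, x) by minimizing windowed SSD.

    Compares the (2*half+1)^2 window around (row, x) in `left` against
    candidate positions (row, x - d) in `right` for d = 0..max_disp.
    Returns the disparity d with the smallest SSD.
    """
    patch = left[row-half:row+half+1, x-half:x+half+1].astype(float)
    best_d, best_ssd = 0, np.inf
    for d in range(max_disp + 1):
        xr = x - d
        if xr - half < 0:          # candidate window would leave the image
            break
        cand = right[row-half:row+half+1, xr-half:xr+half+1].astype(float)
        ssd = np.sum((patch - cand) ** 2)
        if ssd < best_ssd:
            best_d, best_ssd = d, ssd
    return best_d

# Synthetic check: shift an image 3 pixels left to fake the right view.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(30, 40))
right = np.roll(left, -3, axis=1)      # true disparity = 3
print(match_along_row(left, right, row=15, x=20))  # prints 3
```

Real systems add refinements (left-right consistency checks, subpixel interpolation), but the core search is this one-dimensional scan along the epipolar line.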

Generating Interesting Points

Using interpolation to determine the depth of points within large homogeneous areas may cause large errors. To overcome this problem, we can generate additional interesting points that can be matched between the two images. The idea is to use structured light, i.e., to project a pattern of light onto the visual scene. This creates additional variance in the brightness of pixels and increases the number of interesting points.
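The benefit of structured light can be demonstrated numerically: over a homogeneous surface every candidate window matches equally well, but a projected pseudo-random pattern makes the true match unique. A small sketch using windowed SSD matching (sizes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def best_matches(left, right, row, x, half=2, max_disp=10):
    """Return all disparities that achieve the minimum windowed SSD."""
    patch = left[row-half:row+half+1, x-half:x+half+1].astype(float)
    ssds = []
    for d in range(max_disp + 1):
        xr = x - d
        cand = right[row-half:row+half+1, xr-half:xr+half+1].astype(float)
        ssds.append(np.sum((patch - cand) ** 2))
    m = min(ssds)
    return [d for d, s in enumerate(ssds) if s == m]

flat = np.full((20, 40), 128.0)                       # textureless surface
pattern = rng.integers(0, 50, size=(20, 40)).astype(float)

# Without structured light, every disparity matches equally well...
print(best_matches(flat, np.roll(flat, -3, axis=1), 10, 25))
# ...with a projected pattern, the true disparity (3) is unique.
lit = flat + pattern
print(best_matches(lit, np.roll(lit, -3, axis=1), 10, 25))  # prints [3]
```

The first call returns the full list of candidate disparities (all tie at SSD 0); the second returns only the correct one.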

Shape from Shading

Besides binocular disparity, there are many different ways of estimating depth from monocular information. For example, if we know the reflective properties of the surfaces in our scene and the position of the light source, we can use shape from shading techniques: Since the amount of light reflected by a surface depends on its angle towards the light source, we can estimate the orientation of surfaces based on their intensity. More sophisticated methods also use the contours of shadows cast by objects to estimate the shape and orientation of those objects.

Photometric Stereo

To improve the coarse estimates of orientation derived from shape from shading methods, we can use photometric stereo. This technique uses three light sources located at different, known positions. Three images are taken, one for each light source, with the other two light sources turned off. This way we determine three different intensities for each surface in the scene. These three values put narrow constraints on the possible orientation of a surface and allow a rather precise estimate.

Shape from Texture

As we discussed before, the texture gradient gives us valuable monocular depth information. At any point in the image showing texture, the texture gradient is a two-dimensional vector pointing towards the steepest increase in the size of texture elements. The texture gradient across a surface allows a good estimate of the spatial orientation of that surface. Of course, it is important for this technique that the image has high resolution and that a precise method of measuring texture size (granularity) is used.

Shape from Motion

The shape from motion technique is similar to binocular stereo, but it uses only one camera. This camera is moved while it takes images of the visual scene.
This way two images with a specific baseline distance can be obtained, and depth can be computed just like for binocular stereo. We can even use more than two images in this computation to get more robust measurements of depth. The disadvantages of shape from motion techniques are the technical overhead of moving the camera and the reduced temporal resolution.

Range Imaging

If we can determine the depth of every pixel in an image, we can make this information available in the form of a range image. A range image has exactly the same size and number of pixels as the original image. However, each pixel does not specify color or intensity, but the depth of that pixel in the original image, encoded as grayscale. Usually, the brighter a pixel in a range image is, the closer the corresponding pixel is to the observer. By providing both the original image and the range image, we basically define a 3D image.
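The photometric stereo computation described above can be made concrete under a Lambertian model: the intensity measured under a distant light source with unit direction l is I = rho * (n . l), where n is the unit surface normal and rho the albedo. With three known, non-coplanar light directions stacked as rows of a matrix L, the three intensities satisfy I = L (rho * n), so a single linear solve recovers both orientation and albedo. A sketch under these assumptions (all names are illustrative):

```python
import numpy as np

def photometric_stereo(intensities, light_dirs):
    """Recover unit surface normal and albedo from three measurements.

    intensities: the three pixel intensities I_1..I_3 (one per light source)
    light_dirs:  3x3 matrix whose rows are the unit light directions
    Solves I = L @ (albedo * n) for a Lambertian surface.
    """
    g = np.linalg.solve(np.asarray(light_dirs, float),
                        np.asarray(intensities, float))
    albedo = np.linalg.norm(g)
    return g / albedo, albedo

# Synthetic check: a known normal and albedo should be recovered exactly.
L = np.array([[ 1.0, 0.0, 1.0],
              [ 0.0, 1.0, 1.0],
              [-1.0, 0.0, 1.0]]) / np.sqrt(2)   # three non-coplanar lights
true_n = np.array([0.0, 0.0, 1.0])
I = L @ (0.8 * true_n)                           # simulated intensities, albedo 0.8
n, rho = photometric_stereo(I, L)
print(np.allclose(n, true_n), np.isclose(rho, 0.8))  # prints True True
```

In practice the solve is done per pixel over whole images, and shadowed or specular measurements must be detected and excluded first.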

Sample Range Image of a Mug

Range Imaging Through Triangulation

How can we obtain precise depth information for every pixel in the image of a scene? One precise but slow method uses a laser that can rotate around its vertical axis and can also assume different vertical positions. This laser systematically and sequentially illuminates points in the scene. A camera determines the position of every single point in its picture. The trick is that this camera looks at the scene from a different direction than the laser pointer does. Therefore, the depth of every point can be easily and precisely determined through triangulation.
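The triangulation step reduces to intersecting two rays: the laser ray, whose angle is known from the scanner, and the camera's viewing ray of the illuminated spot. Placing the laser at the origin and the camera at distance b along the baseline, with both angles measured from the baseline, elementary trigonometry gives the point's position. A sketch of this geometry (the coordinate conventions and names are assumptions, not from the slides):

```python
import math

def triangulate(baseline, laser_angle, camera_angle):
    """Position of the illuminated point by active triangulation.

    The laser sits at the origin and the camera at (baseline, 0); both
    angles are measured from the baseline toward the scene, in radians.
    Intersecting z = x*tan(laser_angle) with z = (b - x)*tan(camera_angle)
    yields (x, z): position along the baseline and perpendicular depth.
    """
    ta, tb = math.tan(laser_angle), math.tan(camera_angle)
    x = baseline * tb / (ta + tb)
    z = baseline * ta * tb / (ta + tb)
    return x, z

# Symmetric case: both rays at 45 degrees meet midway, at depth b/2.
print(triangulate(2.0, math.pi / 4, math.pi / 4))  # both coordinates ~1.0
```

Sweeping the laser over the scene and applying this computation per spot yields one depth value per illuminated point, i.e., a range image.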