CITS 4402 Computer Vision


CITS 4402 Computer Vision
Prof Ajmal Mian
Lecture 12: 3D Shape Analysis & Matching

Overview of this lecture
- Revision of 3D shape acquisition techniques
- Representation of 3D data
- Applying 2D image techniques on 3D data
- ICP algorithm
- Keypoint detection in 3D data
- 3D feature extraction and matching

The University of Western Australia

Laser stripe scanner
- A laser stripe is moved over the object
- The stripe is orthogonal to the epipolar lines, which helps resolve correspondence
- Can use one camera and one laser stripe projector
[Figure: a real object and its scanned 3D model, www.laserdesign.com]

Minolta Vivid 3D scanner
- Uses a laser stripe; available in CSSE
- Fine scan: 640x480 points at 1 mm resolution
- Fast scan: 320x240 points, completed in less than a second

Structured light
- Multiple stripes or other 2D patterns are projected simultaneously
- Stripes are coded through:
  - Time coding
  - Colour coding
  - Other spatial patterns
  - Random patterns
- Kinect 1 projects a random spatial infra-red pattern

Sample 3D data from Kinect 1
[Figure: Kinect 1 scans compared with other scanners]

Full 3D reconstruction with Kinect 1
- Rotate the object
- Get frames at 30 fps
- Integrate the scans
- Kinect SDK is available, as are OpenCV and Matlab interfaces

Laser time-of-flight scanner
- Fire a laser and measure its time delay (phase shift) after reflection from an object
- This measures the distance in that direction
- Kinect 2 is an example:
  - Better spatial resolution (number of points) than Kinect 1
  - Better depth resolution than Kinect 1

Representing 3D data
- Raw 3D data is basically a collection of points called a pointcloud
- A pointcloud is a 3xN matrix of the X, Y, Z coordinates of the N scene points that were scanned
- A pointcloud can be visualized using rendering software based on OpenGL
- Pointclouds are converted to a mesh by connecting nearest points to form triangles (or polygons)
- Polygons are filled and shaded to render surfaces
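The 3xN representation can be manipulated directly with NumPy; a minimal sketch with toy data (not real scanner output):

```python
import numpy as np

# A pointcloud as a 3xN matrix: rows are X, Y, Z coordinates (toy data).
P = np.array([[0.0, 1.0, 0.0, 1.0],    # X coordinates
              [0.0, 0.0, 1.0, 1.0],    # Y coordinates
              [0.0, 0.1, 0.1, 0.2]])   # Z coordinates
N = P.shape[1]                          # number of scanned points
centroid = P.mean(axis=1)               # mean X, Y, Z of the points
extent = P.max(axis=1) - P.min(axis=1)  # bounding-box size per axis
```

Column i of P is the i-th scanned point, so per-axis statistics reduce along axis 1.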

Pointcloud of UWA Winthrop Hall, scanned with a laser time-of-flight scanner

Basic representation of 3D data

Pointcloud example (Matlab: plot3)

mesh = readplyfile('c:\chef.ply');
X = mesh.vertices(:,1);
Y = mesh.vertices(:,2);
Z = mesh.vertices(:,3);
plot3(X, Y, Z, '.');
axis equal; rotate3d;

Code: www.csse.uwa.edu.au/~ajmal/code.html

Mesh example

mesh = readplyfile('c:\chef.ply');
X = mesh.vertices(:,1);
Y = mesh.vertices(:,2);
Z = mesh.vertices(:,3);
trimesh(mesh.triangles, X, Y, Z);
axis equal; rotate3d;

Surface

mesh = readplyfile('c:\chef.ply');
X = mesh.vertices(:,1);
Y = mesh.vertices(:,2);
Z = mesh.vertices(:,3);
trisurf(mesh.triangles, X, Y, Z);
axis equal; rotate3d;

Rendering as a shaded surface

M = calcmeshnormals(mesh);
N = M.vertexNormals;
[~, p, ~] = cart2sph(N(:,1), N(:,2), N(:,3));
trisurf(mesh.triangles, X, Y, Z, p);
axis equal; shading interp; colormap gray; rotate3d;

Code: www.csse.uwa.edu.au/~ajmal/code.html

How to make full 3D models?
- Problem: a scanner can only see part of the object's surface
  - A single-view scan is also referred to as 2.5D
  - Strictly speaking, a single Kinect frame is 2.5D
- Solution: scan the object multiple times from different angles and integrate
- Before we can integrate the scans, we need to register/align them

Iterative Closest Point (ICP) algorithm
- Given two pointclouds P1 and P2, we want to rotate and translate P2 so that it aligns with P1
- For each point in P2, find its nearest point in P1 (correspondences)
- Remove incompatible points based on distance, colour, normals, curvatures, boundary points
- Let Y ⊆ P2 and X ⊆ P1 be the remaining sets of corresponding points
- Minimize ||RY + t − X||², where R is the rotation matrix and t is the translation vector

ICP algorithm (computationally demanding)
1. Find correspondences X and Y
2. Subtract the means m and n: Y ← Y − m, X ← X − n
3. Perform SVD: USVᵀ = YXᵀ
4. R = VUᵀ (ensure that det R = 1)
5. t = n − Rm
6. Repeat until ||RY + t − X||² < ε
- Need a minimum of 3 points to solve for R

Code: www.csse.uwa.edu.au/~ajmal/code.html
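The SVD-based alignment step above can be sketched in NumPy. This is a hedged illustration of one ICP iteration given known correspondences; `icp_step` is a name chosen here, not from the lecture code:

```python
import numpy as np

def icp_step(X, Y):
    """One ICP alignment step: the R, t minimizing ||R*Y + t - X||^2
    for 3xN matrices of already-corresponding points (Y from P2, X from P1)."""
    n = X.mean(axis=1, keepdims=True)    # centroid of X
    m = Y.mean(axis=1, keepdims=True)    # centroid of Y
    Xc, Yc = X - n, Y - m                # subtract the means
    U, S, Vt = np.linalg.svd(Yc @ Xc.T)  # U S V^T = (Y - m)(X - n)^T
    R = Vt.T @ U.T                       # R = V U^T
    if np.linalg.det(R) < 0:             # ensure det(R) = +1 (no reflection)
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = n - R @ m                        # t = n - R m
    return R, t
```

In a full ICP loop this step alternates with nearest-neighbour correspondence search until the residual drops below ε.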

ICP algorithm limitations
- The two pointclouds should be close, otherwise ICP will fail
- If they are not close, the first set of correspondences must be provided:
  - Manually, or by automatic feature matching
- Can still fail when:
  - The surface is rotationally symmetric
  - The surface has repetitive structure

ICP registration example
[Figure: three corresponding points marked to find an initial coarse registration, and the result after ICP registration]

Full 3D model construction
- Register multiple scans
- Integrate them to form a single smooth surface (outside the scope of this lecture)

ICP applications
1. Surface alignment/registration
2. Surface matching: the final value of the error ε is used to decide how well the two surfaces match
- Advantages: partial surface matching; inbuilt correction of alignment
- Drawback: computationally expensive

Mesh representation
- Many scanners output pointclouds as 4 matrices of the same size:
  - X matrix containing the x-coordinates at each pixel location
  - Y matrix containing the y-coordinates at each pixel location
  - Z matrix containing the z-coordinates at each pixel location
  - M matrix of binary values, 1 at valid and 0 at invalid pixels
- Kinect 2 gives an XYZ-RGB pointcloud
- Since the pointcloud is arranged on a planar grid, a simple 2D Delaunay triangulation of the XY coordinates will give you the mesh
- Delaunay triangulation connects neighbouring points into triangles

validPixels = find(M(:));
triangles = delaunay(X(validPixels), Y(validPixels));
vertices = [X(validPixels) Y(validPixels) Z(validPixels)];

- Remember to remove elongated triangles
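The same grid-based triangulation can be sketched in Python with SciPy's `Delaunay` (assuming NumPy and SciPy are available; the toy 2x2 grid is illustrative only):

```python
import numpy as np
from scipy.spatial import Delaunay

# Toy 2x2 grid: X, Y, Z and the validity mask M stored as same-size matrices.
X = np.array([[0.0, 1.0], [0.0, 1.0]])
Y = np.array([[0.0, 0.0], [1.0, 1.0]])
Z = np.array([[0.1, 0.2], [0.2, 0.4]])
M = np.ones_like(X, dtype=bool)        # all pixels valid in this toy example

valid = M.ravel()                      # keep only valid pixels
xy = np.column_stack([X.ravel()[valid], Y.ravel()[valid]])
tri = Delaunay(xy)                     # 2D triangulation of the XY grid
triangles = tri.simplices              # each row: indices of 3 vertices
vertices = np.column_stack([X.ravel()[valid], Y.ravel()[valid], Z.ravel()[valid]])
```

As in the Matlab version, the triangulation uses only the XY coordinates; the Z values ride along as the third column of the vertex array.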

Example: pointcloud to mesh conversion
- The triangles variable will contain sets of 3 points (their index numbers) that make a triangle

2D-grid based representations of pointclouds
- Depth image: depth of every point from a plane
  - The Z matrix can be regarded as the depth image
  - For an actual depth image, resample on a uniform XY grid
  - We can display/visualize it using image visualization (imshow, imagesc); scale the Z values to 0-255 first
- Range image: range of every point from the camera centre
  - The difference is small and the two terms are used interchangeably in the literature
- Kinect 2 can also give depth images

Example depth images: bright pixels mean close points

Principal curvatures
- Surface normal at a point
- Maximum curvature κ1 and minimum curvature κ2, along the principal directions
- Gaussian curvature: κ1·κ2
- Average (mean) curvature: (κ1 + κ2)/2

More 2D-grid based representations of pointclouds
- We can derive many other 2D-grid based representations from the X, Y, Z matrices
- The normal vector (nx, ny, nz)ᵀ gives 3 images; in spherical coordinates (θ, φ, r)ᵀ it gives another 2 images (r = 1 for unit normals)
- The minimum and maximum curvatures give another 2 images

Applying a 2D Gaussian on the Z-image
- 7x7 window and σ = 4
[Figure: elongated triangles after Delaunay triangulation]

- Need to remove spikes and fill holes before smoothing
- Smoothing with a median filter

Gradient images from the Z-image
[Figure: range image, X gradient, Y gradient]

Detecting features in 3D data
- Train a Haar classifier on gradient images
- Detect features in real time

Spherical feature
- Sum the number of points inside spheres of increasing radii
- Easily implemented by histogramming only the radii values in spherical coordinates
- A 1-D feature
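A minimal NumPy sketch of this histogramming idea (the function name and toy points are chosen here for illustration):

```python
import numpy as np

def spherical_feature(P, keypoint, radii):
    """Count the points inside spheres of increasing radii around a
    keypoint by histogramming radius values: a 1-D rotation-invariant
    feature (a sketch, not the lecture's implementation)."""
    d = np.linalg.norm(P - keypoint[:, None], axis=0)  # radius of each point
    counts, _ = np.histogram(d, bins=radii)            # points per radial shell
    return np.cumsum(counts)                           # points within each radius

# Toy pointcloud: four points along the x-axis.
P = np.array([[0.0, 0.5, 1.5, 2.5],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])
f = spherical_feature(P, np.array([0.0, 0.0, 0.0]), radii=[0.0, 1.0, 2.0, 3.0])
```

The cumulative sum turns per-shell counts into the number of points inside each sphere, which is why no coordinate basis is needed.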

Spin images (Johnson & Hebert, PAMI 1999)
Two steps:
1. Use a point and its normal to define an image plane
2. Spin the image plane and sum the points passing through each bin/pixel
- Easily implemented by histogramming in cylindrical coordinates

Code: www.csse.uwa.edu.au/~ajmal/code.html

Tensor feature
- Define a 3D grid over the mesh
- Needs three non-coplanar points, OR two points and their normals
- Find the area of intersection with each cubelet
- A 3D feature

Mian et al., "3D object recognition and segmentation in cluttered scenes", PAMI 2005

Feature comparison
- Spherical feature: needs no reference vector or coordinate basis; most robust, less descriptive
- Spin image: needs a reference normal vector; medium robustness, medium descriptiveness
- Tensor feature: needs a complete 3D reference frame; least robust, most descriptive

Shape index

SI = (2/π) · arctan( (κ1 + κ2) / (κ1 − κ2) )

- SI runs from −1 to +1 through the surface types: cup, trough, rut, saddle rut, saddle, saddle ridge, ridge, dome, cap

J. Koenderink and A. van Doorn, "The internal representation of solid shape with respect to vision", Biological Cybernetics, 1979.
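The formula can be written directly in code (a sketch; `shape_index` is a hypothetical helper, with κ1 ≥ κ2 assumed):

```python
import numpy as np

def shape_index(k1, k2):
    """Shape index SI = (2/pi) * arctan((k1 + k2) / (k1 - k2)), k1 >= k2.
    arctan2 handles the k1 == k2 case (umbilic points) gracefully."""
    return (2.0 / np.pi) * np.arctan2(k1 + k2, k1 - k2)
```

For example, equal positive curvatures give +1 (cap), equal negative curvatures give −1 (cup), and opposite curvatures give 0 (saddle).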

Shape index image
- Can organize SI values into an image and use standard image matching

P. Szeptycki et al., "A coarse-to-fine curvature analysis-based rotation invariant 3D face landmarking", BTAS 2009.

Keypoint detection in 3D images
- Similar to corner detection and LoG/DoG blob detection
- Need to detect interesting points to avoid feature extraction at every point:
  - Saves computation
  - Not all locations are discriminative
- Does this give you an idea for finding keypoints?

Keypoint detection in 3D images
- Can apply image-based techniques on the Z-image
- Problems:
  - Does not fully exploit the potential of 3D data
  - 3D surfaces are smooth (corners and edges are subtle)
  - Corners and edges are at boundaries or at noisy locations (e.g. spikes)
  - With rotation, pixel values change in depth images, whereas they only move in conventional 2D images

Keypoint Quality (KPQ): feature detection in 3D pointclouds
- Given a point p, crop out a spherical region with radius r around it
- Let P be the 3xN matrix of the N points inside the sphere
- Find its covariance matrix C = (P − m)(P − m)ᵀ
- Perform PCA: CV = VD
- Align the points on their principal axes: P' = Vᵀ(P − m)
- Use the X, Y coordinates of the aligned points to find the ratio δ = (max X − min X) / (max Y − min Y)
- If δ > threshold (or λ1/λ2 > threshold), accept the point as a keypoint
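The KPQ acceptance test above can be sketched in NumPy. This is an illustrative reading of the steps with a hypothetical threshold, not the authors' implementation:

```python
import numpy as np

def kpq_is_keypoint(P, threshold=1.5):
    """KPQ test sketch: PCA-align the points cropped around a candidate,
    then accept it if the spread along the two principal surface
    directions differs enough (threshold value is hypothetical)."""
    m = P.mean(axis=1, keepdims=True)
    C = (P - m) @ (P - m).T              # 3x3 covariance matrix
    evals, V = np.linalg.eigh(C)         # eigenvalues in ascending order
    Pa = V.T[::-1] @ (P - m)             # project onto axes, largest first
    dx = Pa[0].max() - Pa[0].min()       # extent along 1st principal axis
    dy = Pa[1].max() - Pa[1].min()       # extent along 2nd principal axis
    return (dx / dy) > threshold         # delta = dx/dy vs threshold
```

An elongated neighbourhood passes the test, while a symmetric one (equal spreads, δ ≈ 1) is rejected, matching the intuition on the next slide.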

KPQ intuition
- KPQ selects keypoints where the surface curvature is different in the two directions
- Unlike edge detection or Lucas-Kanade, where λ1 ≈ λ2 is preferred, in KPQ such points are not preferred as they lie on spherical or flat surfaces
- A unique coordinate basis cannot be defined at a point where λ1 ≈ λ2, because the principal directions are not unique, or are sensitive to noise

KPQ: number of keypoints vs threshold
[Figure: keypoint count as a function of the threshold]

KPQ feature extraction
- A smooth surface is fitted to the points in P
- Depth values w.r.t. the centre are vectorized and used as a feature
- We can align the corresponding texture to extract invariant features

KPQ: automatic scale selection
- What should the radius r, i.e. the scale, be for keypoint detection and feature extraction?
- Perform a scale-space search:
  - Try different values and select the scale/radius r at which λ1/λ2 is maximum
  - Discard points where a maximum cannot be found

KPQ face recognition
- Similar faces will have more, and spatially coherent, matches
- Different faces will have fewer, and spatially incoherent, matches
[Figure: a correct match and an incorrect match]

KPQ scale-invariant object recognition
- Match KPQ features to find objects in cluttered scenes at unknown scales
- Prune geometrically inconsistent matches

Scene completion and match verification
- Align the database model with the scene
- Verify the match using:
  - ICP error: should be small
  - Free space violation: the aligned model should not block visible surfaces
  - Occupied space violation: the aligned model should not cross over into other objects

Summarizing the 3D shape matching pipeline
1. Detect keypoints and their scale
2. Extract features at the keypoints (KPQ feature, spherical, spin image, tensor, SI, curvatures, ...)
3. Match features (L2 norm, dot product, ...)
4. Prune matches (remove geometrically impossible matches)
5. Verify matches (ICP, free/occupied space violation, ...)
6. Output the best match

Can we recognize actions?
- We know how to match objects in images and in 3D data, but what about video?
- Can we recognize an action, rather than the object/person performing that action?
- Space-time matching, or spatiotemporal matching
- Space-time SIFT has been proposed, but we will focus on 3D data

3D spatiotemporal matching
- Sometimes referred to as 4D in the literature
- Motivations:
  - 3D data is preferred for action recognition because it can capture all 6 degrees of freedom (3 rotations and 3 translations)
  - 3D avoids texture, which can cause errors in action recognition
  - Kinect 1 and Kinect 2 are available at a very low cost

Human action recognition from Kinect skeleton data
- Kinect gives a depth image and skeleton data at 30 frames/second
- Can be used for action recognition
- Microsoft Research dataset

Histogram of Depth Gradients
- Recall the Histogram of Oriented Gradients (HOG)
- HOG features can be extracted from the gradients of depth images: HOG of Zx, Zy, Zt
- Skeleton displacement feature
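A simplified sketch of one orientation histogram over depth gradients, the building block of a HOG-style depth descriptor (all names here are illustrative, and a full descriptor would tile the image into cells and blocks):

```python
import numpy as np

def depth_gradient_histogram(Z, nbins=8):
    """Orientation histogram of depth gradients for a single cell,
    weighted by gradient magnitude (a simplified HOG building block)."""
    Zy, Zx = np.gradient(Z)                          # depth gradients
    mag = np.hypot(Zx, Zy)                           # gradient magnitude
    ang = np.arctan2(Zy, Zx) % (2 * np.pi)           # orientation in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=nbins, range=(0, 2 * np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)                # L1-normalized histogram

Z = np.tile(np.arange(5.0), (5, 1))                  # depth ramp increasing in x
h = depth_gradient_histogram(Z)                      # mass concentrates in bin 0
```

On a pure x-ramp every gradient points the same way, so almost all histogram mass lands in the first orientation bin.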

Generating novel views from 3D models
- Can generate novel views by rotating the object and re-rendering it
- Can generate novel illuminations

Demos
See the recording of lecture 13 for the following demos:
- SIFT matching
- Single-image based height measurement
- Microsoft Kinect 2 sensor: 3D shape capture, body skeleton tracking, infra-red capture

Summary
- Revision of 3D shape acquisition techniques
- Representation of 3D data
- Applying 2D image techniques on 3D data
- ICP algorithm
- Keypoint detection in 3D data
- 3D feature extraction and matching