A New Representation for Video Inspection. Fabio Viola


Outline Brief introduction to the topic and definition of long term goal. Description of the proposed research project. Identification of a short term goal and work ongoing.

Project and Partners PhD funded by the EPSRC project M3Underground. Computer vision tools designed for monitoring ageing infrastructure such as tunnels and shafts. Co-supervised by Dr Andrew Fitzgibbon at Microsoft Research.

Data sources: Stills; Video.

Stills Robust image mosaicing: classic homography-based approach; registration on an inferred quadric surface (ongoing). Content-based browsing of image data sets (ongoing).

Video An easier and quicker way to collect data. But: How can the information content of a video be made more accessible to the user? Is there a way to summarize the content of a whole video in an easy-to-read format that can interact with the original sequence?

Input: video sequence No restrictions on: camera motion; scene geometry; motion in the scene causing occlusion, disocclusion and self-occlusion; presence of textured and textureless surfaces.

Images from Rav-Acha et al., "Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes".

Video courtesy of Microsoft Research Cambridge

Output: representation A mosaic-like still representation of the whole video; a tool for editing the mosaic and automatically propagating the appearance of the edited surfaces coherently through the whole length of the sequence.

Images from Rav-Acha et al., "Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes".

Photo from Feldman et al., "New View Synthesis with Non-Stationary Mosaicing".

Remarks: This is not about full 3D reconstruction; it sits between a 2D and a 3D framework: we pursue only the 3D knowledge needed to generate the views of the object whose appearance has been edited, throughout the given input sequence.

Why not just mosaicing? Mosaicing video frames into a map is equivalent to mosaicing stills, so the same restrictions apply to: scene geometry; camera motion. Rather than treating motion as a limitation, we want to use it to partially capture the geometry.

Is this representation flat? Not all surfaces can be developed onto the plane without distorting or cutting them: the constraint is on the curvature of the surface, which must have zero Gaussian curvature everywhere.

Example: the sphere.
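As a brief illustration of the zero-curvature condition (standard differential geometry, not taken from the slides): the Gaussian curvature K is the product of the two principal curvatures. The symbols K, \kappa_1, \kappa_2 and the radius R are notation assumed here.

```latex
K = \kappa_1 \kappa_2, \qquad
K_{\text{plane}} = 0 \cdot 0 = 0, \qquad
K_{\text{cylinder}} = \frac{1}{R} \cdot 0 = 0, \qquad
K_{\text{sphere}} = \frac{1}{R} \cdot \frac{1}{R} = \frac{1}{R^{2}} > 0 .
```

A cylinder of radius R can therefore be unrolled onto the plane without distortion, whereas a sphere cannot, which is why any flat map of a sphere must stretch or cut it.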

Other possible benefits Fast Access to Video; Rectification of the view of the inspected area (to investigate); Super-resolution (to investigate).

Video courtesy of Dr Fitzgibbon and Microsoft Research. Flow computed using the Black and Anandan algorithm.

Short term problem: Robust dense motion estimation over long sequences; extraction of long, dense, reliable tracks.

Summary Brief introduction to the topic and definition of long term goal. Description of the proposed research project. Identification of a short term goal and work ongoing.

Computer Vision for Infrastructure Assessment: Sparse 3D Reconstruction. Krisada Chaiyasarn

Outline Background; Use for tunnel inspection; Method outline; Progress; Plan for the future.

Background The video clip is based on "Photo Tourism: Exploring Photo Collections in 3D" (Snavely N., Seitz S.M. and Szeliski R., 2006).

Usefulness Output for the next step; Sparse visualisation of the scene; Better method of image browsing; New images can be registered; Enables better comparison.

Methods outline Feature extraction; Feature matching; Image matching; Recovery of 3D metric structure and camera parameters. Based on the work by Brown M. and Lowe D.G. (2005), "Unsupervised 3D Object Recognition and Reconstruction in Unordered Datasets".

Feature Extraction Detect features using the Scale Invariant Feature Transform (SIFT), Lowe, IJCV 2004; Other features are possible, e.g. Harris corners.

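Feature Matching Use nearest-neighbour search with a k-d tree; Remove incorrect matches by feature-space outlier rejection: a match is kept only if $e_{\text{match}} < 0.8\, e_{\text{outlier}}$, where $e_{\text{match}}$ is the match distance and $e_{\text{outlier}}$ the outlier distance.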
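The rejection test on this slide can be sketched in a few lines. This is a minimal illustration, not the implementation used in the project: it swaps the k-d tree for a brute-force search, and it assumes the outlier distance is the distance to the second-nearest neighbour (as in Lowe's ratio test). The function name `match_features` and the toy 2D descriptors are hypothetical.

```python
import math

def match_features(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] is matched to desc_b[j]."""
    matches = []
    for i, d in enumerate(desc_a):
        # Brute-force distances to every candidate; a k-d tree would speed this up.
        dists = sorted((math.dist(d, c), j) for j, c in enumerate(desc_b))
        if len(dists) < 2:
            continue  # no second neighbour to estimate the outlier distance
        (e_match, j), (e_outlier, _) = dists[0], dists[1]
        # Assumption: outlier distance = distance to the second-nearest neighbour.
        if e_match < ratio * e_outlier:
            matches.append((i, j))
    return matches

# Toy descriptors: the first query has one clear match, the second is ambiguous.
desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (4.0, 4.0), (10.0, 10.0), (6.0, 6.0)]
matches = match_features(desc_a, desc_b)
```

The ambiguous query is dropped because its two nearest candidates are equally distant, which is exactly the situation the threshold is meant to filter out.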

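2-View Geometry Matched points satisfy the equation $x'^{\top} F x = 0$, where $x$ and $x'$ are corresponding points in the two images (in homogeneous coordinates) and $F$ is the fundamental matrix. $F$ can be estimated robustly with the Random Sample Consensus (RANSAC) algorithm (other methods are possible, e.g. the normalised 8-point algorithm).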
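To make the two-view constraint concrete, the sketch below evaluates the epipolar residual for a synthetic pair of views. It assumes two cameras with identity intrinsics and no relative rotation, so that the fundamental matrix reduces to the skew-symmetric matrix of the translation; the helper names (`skew`, `project`, `epipolar_residual`) and the numbers are illustrative only. In a RANSAC loop, a residual of this kind (or a normalised variant) would typically be thresholded to count inliers.

```python
def skew(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v = t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def project(X, t):
    """Project 3D point X into a camera with K = I, R = I and translation t."""
    x, y, z = (X[i] + t[i] for i in range(3))
    return [x / z, y / z, 1.0]

def epipolar_residual(F, x1, x2):
    """Return x2^T F x1 for homogeneous image points x1 and x2."""
    Fx1 = [sum(F[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * Fx1[r] for r in range(3))

t = (1.0, 0.0, 0.0)              # second camera is a pure sideways shift
F = skew(t)                      # valid only under the K = I, R = I assumption
X = (0.5, -0.2, 4.0)             # a 3D point in front of both cameras
x1 = project(X, (0.0, 0.0, 0.0))
x2 = project(X, t)

good = epipolar_residual(F, x1, x2)                         # true correspondence
bad = epipolar_residual(F, x1, [x2[0], x2[1] + 0.1, 1.0])   # perturbed match
```

The true correspondence gives a residual of zero, while the perturbed point lies off the epipolar line and gives a clearly non-zero value.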

Image Matching Outliers are removed; only geometrically consistent matches (inliers) are retained.

Recovery of 3D Structure Triangulation. The figure on this slide was taken from the 4F12: Computer Vision lecture notes, Cambridge University Engineering Department.
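The slide does not say which triangulation method is used. As one simple possibility, the sketch below implements the midpoint method: each matched image point defines a ray from its camera centre, and the 3D point is taken as the midpoint of the shortest segment joining the two rays. The function name and the example geometry are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points on rays c1 + s*d1 and c2 + t*d2."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    s = (b * e - c * d) / denom   # parameter along the first ray
    t = (a * e - b * d) / denom   # parameter along the second ray
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]

# Two camera centres one unit apart, both looking at the same 3D point.
X_true = (0.5, -0.2, 4.0)
c1, c2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
d1 = [x - c for x, c in zip(X_true, c1)]
d2 = [x - c for x, c in zip(X_true, c2)]
X_est = triangulate_midpoint(c1, d1, c2, d2)
```

With noise-free rays the two lines intersect and the estimate coincides with the true point; with noisy matches the rays are skew and the midpoint is a reasonable starting value for later refinement.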

N-View Geometry The bundle adjustment algorithm solves directly for the metric structure and camera parameters, but requires a suitable initialisation.
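For reference, bundle adjustment is usually posed as a nonlinear least-squares problem over all cameras and points. The form below is the generic textbook objective, not necessarily the exact cost used in this work (robust error functions are common in practice); all symbols are notation assumed here.

```latex
\min_{\{\Theta_i\},\,\{X_j\}} \;
\sum_{i} \sum_{j \in \mathcal{V}(i)}
\left\| \, x_{ij} - \pi\!\left(\Theta_i, X_j\right) \right\|^{2}
```

Here $\Theta_i$ collects the parameters of camera $i$ (rotation, translation, focal length), $X_j$ is the $j$-th 3D point, $x_{ij}$ is its measured position in image $i$, $\pi$ is the projection function, and $\mathcal{V}(i)$ is the set of points observed in image $i$. Because the objective is non-convex, the quality of the result depends on the initial estimate.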

Algorithm
Input: n unordered images
I. Extract SIFT features from all n images
II. Find k nearest neighbours for each feature using a k-d tree
III. For each image:
(i) Select m candidate matching images (with the maximum number of feature matches to this image)
(ii) Find geometrically consistent feature matches using RANSAC to solve for the fundamental matrix between pairs of images
IV. Find connected components of image matches
V. For each connected component:
(i) Perform sparse bundle adjustment to solve for the rotation θ1, θ2, θ3, translation t1, t2, t3 and focal length f of all cameras, and pointwise 3D geometry
Output: 3D model of coordinates
The above algorithm is taken from Brown M. and Lowe D.G. (2005).
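Step IV of the algorithm above groups images into sets that are linked by verified matches. A minimal sketch of that step using a union-find structure follows; the function name and the example match list are hypothetical, and the source does not state which graph algorithm was actually used.

```python
def connected_components(n_images, matches):
    """Group image indices 0..n_images-1 linked by pairwise matches."""
    parent = list(range(n_images))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in matches:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two groups

    groups = {}
    for i in range(n_images):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Six images: 0-1-2 form a chain, 3-4 match each other, 5 matches nothing.
components = connected_components(6, [(0, 1), (1, 2), (3, 4)])
```

Each resulting group would then be passed to bundle adjustment separately, and an image that matches nothing ends up as a singleton group.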

Picture taken from Brown M (2005)

Progress Recovered the projective 3D structure between a pair of images; Simple run of the BA algorithm; Explored the MATLAB toolbox on structure and motion.

Plan Use the recovered structures for texture mapping (e.g. "Quadric Reconstruction from Dual-Space Geometry", Cross and Zisserman); Crack detection and localisation (e.g. "A Tunnel Crack Detection and Classification System Based on Image Processing", Liu Z., 2002).

Questions?