3D Visualization through Planar Pattern Based Augmented Reality


NATIONAL TECHNICAL UNIVERSITY OF ATHENS
SCHOOL OF RURAL AND SURVEYING ENGINEERS, DEPARTMENT OF TOPOGRAPHY, LABORATORY OF PHOTOGRAMMETRY

3D Visualization through Planar Pattern Based Augmented Reality
Dr. Charalabos Ioannidis, Professor of Photogrammetry
Styliani Verykokou, Surveying Engineer, PhD student

XXV INTERNATIONAL FIG CONGRESS: ENGAGING THE CHALLENGES, ENHANCING THE RELEVANCE
16-21 JUNE 2014, KUALA LUMPUR, MALAYSIA

AUGMENTED REALITY AR (1/4)
- Enriches reality with computer-generated information
- Enhances people's perception of reality
- Adds mainly visual information to the real world, which retains the dominant role
- Can potentially enhance all five senses
- Belongs to the technology of mixed reality (Milgram et al., 1994), according to which data belonging to both the real and the virtual world are presented as coexisting in the same place

Milgram's continuum: Real Environment -> Augmented Reality -> Augmented Virtuality -> Virtual Environment; everything between the two extremes constitutes mixed reality.

AUGMENTED REALITY AR (2/4)
AR systems (Azuma, 1997):
1. Combine real and virtual objects in a real environment
2. Allow real-time interaction
3. Register virtual objects in three-dimensional space

Milestones:
- 1968: 1st AR system (Sutherland)
- 1992: the term "augmented reality" was coined (Caudell & Mizell)
- TODAY: applications in various fields: medicine, education, navigation, architecture and interior design, commerce and advertising, art, culture and tourism, the army, entertainment and sports, task support

AUGMENTED REALITY AR (3/4)
AR applications in the field of the surveyor:
- Virtual reconstruction of half-ruined buildings, statues or archaeological sites: real-time integration of their 3D models into the real world
- Visualization of constructions during their design phase
- Navigation (combination of AR and location-based services)

AUGMENTED REALITY AR (4/4)
Augmented reality requires tracking of the position and the orientation of the camera. Main categories:
- Pattern based AR: Marker-based AR, Markerless AR
- Outline AR
- Location-based AR
- Surface AR
The registration of the virtual object in the real world is achieved via visual tracking.

APPLICATION DEVELOPED
Application: planar pattern based markerless AR application
Purpose: visualization of the 3D anaglyph of a region through the screen of a PC, without the need for a 3D print-out of its DTM
Methodology (a per-frame sketch of this loop follows below):
- Initial data: a pattern image, a 3D augmentation model in OBJ format, the interior orientation of the computer camera
- Calculation of the exterior orientation of every video frame
- Correct rendering of the 3D model on a computer window whose background is the video frame
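As a roadmap for the slides that follow, here is a minimal sketch of the per-frame loop of such an application; the numbered comments are stand-ins for the steps detailed on the later slides, and the camera index is an assumption.

```cpp
// Per-frame loop sketch of a planar-pattern markerless AR application (assumed camera index 0).
#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>

int main() {
    cv::VideoCapture cam(0);    // the computer camera
    cv::Mat frame;
    while (cam.read(frame)) {
        // 1. Extract SURF features and match them between the pattern image and the frame.
        // 2. Estimate the 2D homography with RANSAC and check that the pattern is visible.
        // 3. Compute the exterior orientation of the frame from the homography.
        // 4. Render the frame as background, then the OBJ model with the estimated pose.
    }
    return 0;
}
```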

CASE STUDY
- Pattern image: a photograph of a printed orthoimage that depicts an area (18 km x 12 km) around the artificial lake of the river Ladonas in the Peloponnese, Greece
- Augmentation model: DTM of the region depicted in the orthoimage
- Basic concept: depiction of the DTM instead of the printed orthoimage on a computer window, at the same position as the orthoimage, with the same orientation and size and with the proper perspective, each time the orthoimage is located in the field of view of the computer camera

CAMERA CALIBRATION
- Fully automated approach based on the methods of Zhang (2000) and Bouguet (2013), by taking pictures of a planar chessboard pattern shown at different orientations
- Calibration of the camera of a laptop computer
- Initial data: chessboard images; number of the internal chessboard corners in the two directions

METHODOLOGY OF CAMERA CALIBRATION
1. Initial processing of each image
2. Check whether the chessboard pattern can be recognized in each image; detection of the internal chessboard corners if that check is positive
3. Computation of the object coordinates of the internal corners of the chessboard
4. Estimation of the initial camera interior orientation
5. Computation of the approximate camera exterior orientation for each image in which the chessboard pattern is detected
6. Final computation of the camera interior parameters and the camera exterior parameters for each image, using the Levenberg-Marquardt optimization algorithm (Levenberg, 1944; Marquardt, 1963), for the minimization of the reprojection error

DEFINITION OF THE COORDINATES OF THE PATTERN OBJECT
- Origin: the center of the orthoimage
- X and Y coordinates are derived from the normalized width and height of the pattern image, within the range [-1, 1]; Z = 0
- Normalized dimensions: $W = \frac{width}{\max(width, height)}$ and $H = \frac{height}{\max(width, height)}$
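A minimal OpenCV sketch of the chessboard calibration workflow above; the image file names, the number of images, the board size and the unit square size are assumptions, and cv::calibrateCamera internally performs the Levenberg-Marquardt refinement that minimizes the reprojection error.

```cpp
// Chessboard-based calibration sketch (assumed file names and board size).
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <string>
#include <vector>

int main() {
    const cv::Size boardSize(9, 6);  // internal corners in the two directions (assumption)
    std::vector<std::vector<cv::Point2f>> imagePoints;
    std::vector<std::vector<cv::Point3f>> objectPoints;

    // Object coordinates of the internal corners (planar chessboard, Z = 0, unit square size).
    std::vector<cv::Point3f> corners3d;
    for (int i = 0; i < boardSize.height; ++i)
        for (int j = 0; j < boardSize.width; ++j)
            corners3d.emplace_back(static_cast<float>(j), static_cast<float>(i), 0.0f);

    cv::Size imageSize;
    for (int k = 0; k < 10; ++k) {   // hypothetical set of 10 chessboard images
        cv::Mat img = cv::imread("chessboard_" + std::to_string(k) + ".jpg",
                                 cv::IMREAD_GRAYSCALE);
        if (img.empty()) continue;
        imageSize = img.size();
        std::vector<cv::Point2f> corners;
        // Check whether the chessboard can be recognized; detect the internal corners if so.
        if (cv::findChessboardCorners(img, boardSize, corners)) {
            cv::cornerSubPix(img, corners, cv::Size(11, 11), cv::Size(-1, -1),
                             cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT,
                                              30, 0.01));
            imagePoints.push_back(corners);
            objectPoints.push_back(corners3d);
        }
    }
    if (imagePoints.empty()) return 1;  // no chessboard detected in any image

    // Interior orientation + per-image exterior orientation; LM minimizes the reprojection error.
    cv::Mat K, distCoeffs;
    std::vector<cv::Mat> rvecs, tvecs;
    double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                     K, distCoeffs, rvecs, tvecs);
    (void)rms;  // RMS reprojection error in pixels
    return 0;
}
```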

FEATURES EXTRACTION
- Features extraction in the pattern image and in every video frame: Speeded-Up Robust Features, SURF (Bay et al., 2006)
- Invariant to scale and rotation; skew, anisotropic scaling and perspective effects are not explicitly modelled
- 2 steps: 1. features detection, 2. features description
- Detection of interest points located in blob-like structures of the image

METHODOLOGY OF SURF (1/2)
1. Features Detection
Based on the determinant of an approximation of the Hessian matrix, which is computed for every pixel of the image at all scale levels of each octave into which scale space is divided:

$$H_{approx}(x, y, s) = \begin{bmatrix} D_{xx}(x, y, s) & D_{xy}(x, y, s) \\ D_{xy}(x, y, s) & D_{yy}(x, y, s) \end{bmatrix}$$

$$\det\left(H_{approx}\right) = D_{xx}(x, y, s)\, D_{yy}(x, y, s) - \left(w\, D_{xy}(x, y, s)\right)^2$$

where D_xx, D_xy and D_yy are the convolutions of box filters with the image at point (x, y), and w is a weight balancing the relative responses of the filters.
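A minimal detection-and-description sketch with OpenCV's SURF implementation (from the opencv_contrib xfeatures2d module); the file name and the Hessian threshold value are assumptions.

```cpp
// SURF keypoint detection sketch (requires the opencv_contrib xfeatures2d module).
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/xfeatures2d.hpp>
#include <vector>

int main() {
    cv::Mat pattern = cv::imread("orthoimage_pattern.jpg", cv::IMREAD_GRAYSCALE); // hypothetical file
    // The Hessian threshold controls how strong a blob response must be to become an interest point.
    cv::Ptr<cv::xfeatures2d::SURF> surf = cv::xfeatures2d::SURF::create(400.0);
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors;  // one 64-dimensional row per interest point
    surf->detectAndCompute(pattern, cv::noArray(), keypoints, descriptors);
    return 0;
}
```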

METHODOLOGY OF SURF (2/2)
2. Features Description
- Computation of a 64-dimensional vector (descriptor) for each interest point
- It describes the underlying intensity structure of a square region around the interest point, oriented along its dominant orientation

MATCHING (1/2)
- Matching criterion: Euclidean distance between the descriptors of the feature points
- Minimization of the outliers: cross-check test
- Two feature points i and j are matched if the nearest neighbor of the descriptor of point i in the pattern image is the descriptor of point j in the video frame AND the nearest neighbor of the descriptor of point j in the video frame is the descriptor of point i in the pattern image
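The cross-check test maps directly onto OpenCV's brute-force matcher; a minimal sketch, assuming the two descriptor matrices come from the SURF step above.

```cpp
// Cross-check matching sketch: Euclidean (L2) distance, symmetric nearest neighbors only.
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <vector>

std::vector<cv::DMatch> crossCheckMatch(const cv::Mat& patternDesc,
                                        const cv::Mat& frameDesc) {
    // crossCheck = true keeps a match (i, j) only if i and j are mutual nearest neighbors.
    cv::BFMatcher matcher(cv::NORM_L2, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(patternDesc, frameDesc, matches);
    return matches;
}
```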

MATCHING (2/2)
- Many outliers still remain!
- Definition of a maximum accepted Euclidean distance: a correspondence is rejected if the Euclidean distance between the descriptors of the matched feature points is above that threshold (a small filtering sketch follows below)
- A few incorrect matches are still not detected
- Final outliers rejection via RANSAC

RANSAC FOR THE REJECTION OF OUTLIERS
- RANSAC: RANdom SAmple Consensus (Fischler & Bolles, 1981)
- In general: computation of the parameters of a mathematical model using a data set which may contain many errors, relying on the use of the minimum number of data
- In the application: rejection of the outliers via the computation of the 2D homography between the pattern image and each video frame:

$$s \begin{bmatrix} x_{frame} \\ y_{frame} \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x_{pattern} \\ y_{pattern} \\ 1 \end{bmatrix}, \qquad h_{33} = 1$$

- RANSAC is applied if at least 5 matches are detected; otherwise the orthoimage is not recognized in the frame and the scene is not augmented
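Back on the MATCHING (2/2) slide, the distance test is a one-line filter over the cross-checked matches; the maximum accepted distance value below is an assumption.

```cpp
// Distance-threshold filtering sketch; the maximum accepted descriptor distance is assumed.
#include <opencv2/core.hpp>
#include <vector>

std::vector<cv::DMatch> filterByDistance(const std::vector<cv::DMatch>& matches,
                                         float maxDistance = 0.25f) {  // hypothetical threshold
    std::vector<cv::DMatch> kept;
    for (const cv::DMatch& m : matches)
        if (m.distance <= maxDistance)  // Euclidean distance between the two descriptors
            kept.push_back(m);
    return kept;
}
```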

ITERATIVE PROCEDURE FOLLOWED BY RANSAC
1. A sample of 4 matches is randomly chosen from all matches
2. The homography is estimated using the random sample
3. The number of valid matches (inliers) is calculated for the above solution

Three cases:
- inliers ≥ threshold: the model is accepted and the algorithm terminates with success
- inliers < threshold and number of iterations = maximum number of iterations: the algorithm terminates with failure
- inliers < threshold and number of iterations < maximum number of iterations: these steps are repeated

HOMOGRAPHY ESTIMATION
- Initial homography estimation via RANSAC, using the best set of four matches
- The homography is refined using the set of all inliers, via a nonlinear optimization with the Levenberg-Marquardt algorithm
- If inliers ≥ 5: the orthoimage is detected in the video frame; otherwise the scene is not augmented

RESULT
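In OpenCV, the RANSAC loop and the subsequent inlier-based Levenberg-Marquardt refinement are both performed inside cv::findHomography; a minimal sketch, assuming the point vectors come from the filtered matches and with an assumed reprojection threshold.

```cpp
// Homography estimation sketch: RANSAC rejects the outliers, then OpenCV refines the
// result on the inliers with Levenberg-Marquardt.
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <vector>

cv::Mat estimateHomography(const std::vector<cv::Point2f>& patternPts,
                           const std::vector<cv::Point2f>& framePts,
                           cv::Mat& inlierMask) {
    if (patternPts.size() < 5)
        return cv::Mat();  // orthoimage not recognized; the scene is not augmented
    const double ransacReprojThreshold = 3.0;  // pixels; assumed value
    return cv::findHomography(patternPts, framePts, cv::RANSAC,
                              ransacReprojThreshold, inlierMask);
}
```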

OUTLIERS REJECTION VIA RANSAC
[Figures: the matches before the rejection of the outliers by RANSAC, and the inliers returned by RANSAC]

PATTERN RECOGNITION
Computation of the pixel coordinates of the four corners of the pattern object in the video frame, using the homography matrix:

$$\begin{bmatrix} x'_{frame} \\ y'_{frame} \\ w'_{frame} \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} x_{pattern} \\ y_{pattern} \\ 1 \end{bmatrix}, \qquad x_{frame} = \frac{x'_{frame}}{w'_{frame}}, \quad y_{frame} = \frac{y'_{frame}}{w'_{frame}}$$

Corners of the pattern image: 1 (0, 0), 2 (width, 0), 3 (width, height), 4 (0, height)
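A minimal sketch of this corner projection; cv::perspectiveTransform applies the homography and performs the division by the third homogeneous coordinate shown above.

```cpp
// Projection of the four pattern corners into the video frame via the homography.
#include <opencv2/core.hpp>
#include <vector>

std::vector<cv::Point2f> projectCorners(const cv::Mat& H, float width, float height) {
    std::vector<cv::Point2f> corners = {
        {0.f, 0.f}, {width, 0.f}, {width, height}, {0.f, height}};
    std::vector<cv::Point2f> frameCorners;
    // Applies H and divides by w' to obtain pixel coordinates in the frame.
    cv::perspectiveTransform(corners, frameCorners, H);
    return frameCorners;
}
```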

CAMERA EXTERIOR ORIENTATION ESTIMATION (1/4)
Inputs:
- Camera interior orientation
- Pixel coordinates of the four corners of the pattern object in the video frame
- Object coordinates of the four corners of the pattern object
Output: the rotation and translation that relate the object coordinate system to the camera coordinate system

CAMERA EXTERIOR ORIENTATION ESTIMATION (2/4)
Mathematical model: projection transformation, in homogeneous coordinates:

$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} c_x & s & x_0 \\ 0 & c_y & y_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

where:
- the first matrix is the camera matrix K and the second is the joint rotation-translation matrix [R t]
- c_x = c / width_pixel, c_y = c / height_pixel (camera constant expressed in pixel units)
- x_0, y_0: pixel coordinates of the principal point; s: skewness
- r_ij: elements of the rotation matrix; t_i: elements of the translation vector
- x, y: pixel coordinates; X, Y, Z: ground coordinates

CAMERA EXTERIOR ORIENTATION ESTIMATION (3/4)
Linear computation of the approximate elements of the joint rotation-translation matrix using:
- the homography that relates the planar object coordinates and the pixel coordinates of the corners of the orthoimage, after their undistortion
- the camera interior orientation

With $H = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix}$ denoting the columns of the homography matrix:

$$r_1 = \lambda K^{-1} h_1, \qquad r_2 = \lambda K^{-1} h_2, \qquad r_3 = r_1 \times r_2, \qquad t = \lambda K^{-1} h_3, \qquad \lambda = \frac{1}{\left\lVert K^{-1} h_1 \right\rVert}$$

$$[R \; t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_X \\ r_{21} & r_{22} & r_{23} & t_Y \\ r_{31} & r_{32} & r_{33} & t_Z \end{bmatrix} = \begin{bmatrix} r_1 & r_2 & r_3 & t \end{bmatrix}$$

CAMERA EXTERIOR ORIENTATION ESTIMATION (4/4)
- Calculation of the Singular Value Decomposition (SVD) of the rotation matrix R and recalculation of R in order to satisfy the orthogonality condition $R R^T = I$:

$$R = U W V^T \;\Rightarrow\; R = U V^T$$

- Conversion of R into a 3D rotation vector via the Rodrigues formula: the vector is parallel to the rotation axis and its magnitude equals the magnitude of the rotation
- Levenberg-Marquardt optimization in order to refine the translation and rotation vectors
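A minimal OpenCV sketch of these two slides: scaling by lambda, assembling [r1 r2 r3 | t], orthogonalizing R via SVD, and converting to a rotation vector with cv::Rodrigues; it assumes H and K are 3x3 CV_64F matrices.

```cpp
// Pose-from-homography sketch following the steps above (H and K assumed CV_64F).
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>

void poseFromHomography(const cv::Mat& H, const cv::Mat& K,
                        cv::Mat& rvec, cv::Mat& tvec) {
    cv::Mat Kinv = K.inv();
    cv::Mat h1 = Kinv * H.col(0);
    cv::Mat h2 = Kinv * H.col(1);
    cv::Mat h3 = Kinv * H.col(2);
    double lambda = 1.0 / cv::norm(h1);  // lambda = 1 / ||K^-1 h1||

    cv::Mat r1 = lambda * h1;
    cv::Mat r2 = lambda * h2;
    cv::Mat r3 = r1.cross(r2);           // r3 = r1 x r2
    tvec = lambda * h3;

    cv::Mat R(3, 3, CV_64F);
    r1.copyTo(R.col(0));
    r2.copyTo(R.col(1));
    r3.copyTo(R.col(2));

    // Enforce orthogonality: R = U W V^T  ->  R = U V^T
    cv::Mat U, W, Vt;
    cv::SVD::compute(R, W, U, Vt);
    R = U * Vt;

    cv::Rodrigues(R, rvec);              // rotation matrix -> 3D rotation vector
}
```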

RENDERING OF THE AUGMENTED SCENE (1/4)
2 steps:
1. Rendering of the video frame on a computer window, so that it forms its background
2. Rendering of the DTM on that window

- The coordinates of the vertices of the DTM are defined in a local coordinate system; they are transformed into the object coordinate system by being normalized into the range [-1, 1]
- The normalized object coordinates are transformed into the camera system (viewing transformation):

$$\begin{bmatrix} X_{CAMERA} \\ Y_{CAMERA} \\ Z_{CAMERA} \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_{MODEL} \\ Y_{MODEL} \\ Z_{MODEL} \\ 1 \end{bmatrix}$$

RENDERING OF THE AUGMENTED SCENE (2/4)
Definition of the viewing volume:
- It determines how the DTM is projected into the final scene
- It determines which of its parts are clipped, so as not to be drawn in the final scene
Perspective projection: camera coordinates are transformed to clip coordinates:

$$\begin{bmatrix} X_{CLIP} \\ Y_{CLIP} \\ Z_{CLIP} \\ W_{CLIP} \end{bmatrix} = \begin{bmatrix} \frac{2 c_x}{width} & 0 & 1 - \frac{2 x_0}{width} & 0 \\ 0 & \frac{2 c_y}{height} & -1 + \frac{2 y_0}{height} & 0 \\ 0 & 0 & \frac{near + far}{near - far} & \frac{2 \cdot far \cdot near}{near - far} \\ 0 & 0 & -1 & 0 \end{bmatrix} \begin{bmatrix} X_{CAMERA} \\ Y_{CAMERA} \\ Z_{CAMERA} \\ 1 \end{bmatrix}$$
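A small sketch of building this OpenGL-style projection matrix from the camera interior orientation, per the clip-coordinate transformation above; the column-major layout is the convention OpenGL expects, and the parameter names are illustrative.

```cpp
// Sketch: OpenGL projection matrix from the camera interior orientation.
#include <array>

// cx, cy: camera constant in pixels; x0, y0: principal point; width, height: frame size.
std::array<float, 16> projectionFromIntrinsics(float cx, float cy, float x0, float y0,
                                               float width, float height,
                                               float nearZ, float farZ) {
    std::array<float, 16> P{};  // column-major; element (row r, col c) at index c*4 + r
    P[0]  = 2.0f * cx / width;
    P[5]  = 2.0f * cy / height;
    P[8]  = 1.0f - 2.0f * x0 / width;
    P[9]  = -1.0f + 2.0f * y0 / height;
    P[10] = (nearZ + farZ) / (nearZ - farZ);
    P[11] = -1.0f;                                // W_CLIP = -Z_CAMERA
    P[14] = 2.0f * farZ * nearZ / (nearZ - farZ);
    return P;
}
```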

RENDERING OF THE AUGMENTED SCENE (3/4)
- Clip coordinates (X_CLIP, Y_CLIP, Z_CLIP, W_CLIP): division by W_CLIP yields the normalized device coordinates (X_NDC, Y_NDC, Z_NDC)
- Viewport transformation and depth-range transformation: for a drawing surface with lower-left corner (x_1, y_1) and dimensions width x height within the available window surface, the normalized device coordinates are mapped to window coordinates in pixels (x, y) and to a depth coordinate (z):

$$x = \frac{width}{2} X_{NDC} + x_1 + \frac{width}{2}, \qquad y = \frac{height}{2} Y_{NDC} + y_1 + \frac{height}{2}, \qquad z = \frac{Z_{NDC} + 1}{2}$$

RENDERING OF THE AUGMENTED SCENE (4/4)
The orthoimage is draped on the DTM (texture mapping), for a realistic representation of the anaglyph.

DEVELOPMENT OF THE APPLICATION
- Programming language: C++
- OpenCV library (Open Source Computer Vision Library)
- OpenGL API (Open Graphics Library), with the libraries: OpenGL Core Library, GLU, Freeglut, GLEW, GLM (an Alias/Wavefront OBJ file library)
- The application is intended for computers running Microsoft Windows

Thank you for your attention!