Micro-scale Stereo Photogrammetry of Skin Lesions for Depth and Colour Classification

Tim Lukins
Institute of Perception, Action and Behaviour

1 Introduction

The classification of melanoma has traditionally relied on colour and intensity images as input to various rule- or checklist-based diagnoses. A variety of computer vision techniques are often applied to enhance and segment such data into more representative features, in order to then automate the detection of tumours via statistical and machine learning techniques. Such approaches, however, fail to take into account the information that could additionally be provided by depth, and the resulting description of the actual macro surface structure of the area in question. Only one other system attempts to utilise this modality - the DERMA system of Callieri et al. [?] - which is based on laser scanning of a subject to obtain 3D and aligned colour information. In this investigation we consider another approach, capturing and evaluating via dense stereo photogrammetry, with the benefits of instantaneous capture and perfect 1:1 alignment of colour information. We seek to test whether the inclusion of depth can indeed help distinguish between various dermatological types, showing that the actual surface structure may also yield a valuable source of features on which to base classification. This involves addressing issues in accurate 3D capture at a very fine scale, the conversion and processing of all channels of information to enhance features, and the analysis of which variations and distributions within the data can be used to differentiate.

2 Methodology

2.1 Data

Five datasets were collected using a stereo capture rig constructed from two Canon EOS 300D cameras, calibrated and using the maximum level of magnification supported by the standard EF-S lens (0.28 m closest focusing distance). The dense stereo data was recovered from the two simultaneous images via stereo photogrammetry matching software.
The resulting perspective depth-maps were constructed in the left image co-ordinate frame, resulting in a one-to-one correspondence between z-depth and pixel colour values. From this complete data, a subject area of pixels was selected for each of the five datasets. At a captured scale of one pixel, this therefore represents a surface area of approximately. The subject areas were chosen to represent a variety of different dermatological types (those available to us on normal human skin, in the absence of actual cancer examples), and were as follows: Normal, Freckle, Liverspot, Mole, and Scab. These are shown in Figure 1, indicating the variations in depth, size, and colouration. The first dataset acts as a control, representing as it does an area of normal skin. Each dataset also shows the superimposed outline of the mask defining the specific region of interesting skin, all other sample pixels being designated surrounding skin. This represents the only division of the data, and was performed by tracing the outline of the regions. In the case of the control dataset, this boundary simply splits the data with a line down the middle.

The depth information represents the z-axis distance from the sensor. Of immediate note is the fact that the samples were captured from a variety of curved body surfaces, on which the presence of surface features is often obscured by the global structure. The first two of the actual samples (freckle and liverspot) are also affected by a profusion of hair follicles, which has disrupted the stereo recovery process. These are retained to represent the situation in which the depth information should therefore be discounted. The last two samples (mole and scab), however, have managed to preserve a useful amount of depth detail.

Figure 1. Colour/mask and depth. Top to bottom: normal, freckle, liverspot, mole, and scab.

2.2 Processing

Each of the datasets provides direct access to 7 channels of information, representing for every pixel the z-depth, red, green, blue, hue, saturation, and value components. We are interested in accentuating various regions (e.g. rough areas), and in correcting for global surface structure. To this end we adopt the following techniques.

Local Variation. The raw values of each channel can be processed to derive the mean difference of each pixel from its neighbourhood, in order to reflect local variation, as:

1. Let v_c(p) be the value of channel c at pixel p.
2. Calculate the mean mu_c(p) = (1/N) sum_{q in N(p)} v_c(q) over a neighbourhood N(p) of cardinality N.
3. Calculate the mean difference as d_c(p) = v_c(p) - mu_c(p).

The calculation of the mean can easily be implemented as a square convolution matrix of dimensions n x n, with each value apart from the centre cell equalling 1/(n^2 - 1). For n = 3:

0.125 0.125 0.125
0.125 0     0.125
0.125 0.125 0.125

This process can be carried out for each pixel, in all 7 channels, using the convolution shown, replicating edges with the nearest border value where necessary.

Global Orientation. The surface detail of the z-depth channel can be further revealed by fitting the depth-map to an underlying surface orientation - assuming a simple plane in this case - and projecting the values onto that surface, as:

1. Select the four corner points (x_i, y_i) of the depth-map, with z-values z_i, and construct the least-squares fitting system A p = z, where the rows of A are [x_i  y_i  1] and p = (a, b, c) are the plane coefficients.
2. Derive the coefficients of the fitted plane z = a x + b y + c by Singular Value Decomposition of A (i.e. via its pseudo-inverse).
3. Calculate the projected depth data at each pixel as the residual z(x, y) - (a x + b y + c).

Applying this process results in the corrected depth channels of the datasets shown in Figure 2.
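The local-variation step can be sketched in NumPy; the function name `local_variation` and the shifted-sum in place of an explicit convolution routine are illustrative, not the authors' implementation:

```python
import numpy as np

def local_variation(channel, n=3):
    """Mean difference of each pixel from its (n*n - 1) neighbours.

    Equivalent to convolving with an n x n kernel whose off-centre
    weights are 1/(n*n - 1) and whose centre weight is 0, with edges
    replicated by the nearest border value.
    """
    pad = n // 2
    padded = np.pad(np.asarray(channel, dtype=float), pad, mode="edge")
    h, w = channel.shape
    total = np.zeros((h, w))
    for dy in range(n):
        for dx in range(n):
            if dy == pad and dx == pad:
                continue  # centre cell carries zero weight
            total += padded[dy:dy + h, dx:dx + w]
    return channel - total / (n * n - 1)

# A flat region has zero variation; an isolated spike keeps its height.
imp = np.zeros((3, 3))
imp[1, 1] = 8.0
print(local_variation(imp)[1, 1])  # 8.0
```

The same function applies unchanged to each of the 7 channels in turn.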
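The global-orientation correction can likewise be sketched with NumPy's SVD-based least squares; treating "projection" as subtraction of the fitted plane along z is our reading of the text, and the name `remove_plane` is illustrative:

```python
import numpy as np

def remove_plane(depth):
    """Fit z = a*x + b*y + c to the four corner samples of a depth-map
    and return the per-pixel residual from that plane."""
    h, w = depth.shape
    corners = [(0, 0), (0, w - 1), (h - 1, 0), (h - 1, w - 1)]  # (y, x)
    A = np.array([[x, y, 1.0] for (y, x) in corners])
    z = np.array([depth[y, x] for (y, x) in corners], dtype=float)
    # np.linalg.lstsq solves the least-squares system via SVD.
    (a, b, c), *_ = np.linalg.lstsq(A, z, rcond=None)
    yy, xx = np.mgrid[0:h, 0:w]
    return depth - (a * xx + b * yy + c)

# A depth-map that is exactly a tilted plane is flattened to zero.
yy, xx = np.mgrid[0:8, 0:10]
tilted = 0.5 * xx - 0.2 * yy + 3.0
print(np.allclose(remove_plane(tilted), 0.0))  # True
```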
2.3 Selection

Using the mask provided for each dataset, it is possible to divide the point/pixel data into two sets: those within the boundary, which are Interesting, and those outwith the boundary, which are Surrounding. This selection can additionally be refined by a specified amount of erosion (using standard morphological operators) performed in either direction, the effect of which is to eliminate any ambiguous Border points, as shown for example in Figure 3. This results in the partitioning of the datasets as shown for example in Table 1. Notice that the various sizes of the subject areas result in an oversampling of the surrounding points for the smaller freckle and liverspot cases. Also, the larger the region, the longer the circumference of its mask, and consequently the greater the proportion of ambiguous border points which are ignored.

Figure 2. Projected depth-maps onto fitted planar surface.

Figure 3. Mask eroded +/- 20 pixels from boundary for mole dataset.

Table 1. Example division of datasets by erosion of +/- 10 pixels from boundary.

Dataset     Surround   Interest   Border
Freckle     54660      2420       5420
Liverspot   46202      5870       10428
Mole        29969      18730      13801
Scab        38465      15980      8055
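The three-way split of Table 1 can be sketched as erosions of the mask and of its complement; the helper names and the square structuring element are illustrative assumptions:

```python
import numpy as np

def _erode(mask, r):
    """Binary erosion by a (2r+1) x (2r+1) square structuring element,
    with edge replication at the image border."""
    h, w = mask.shape
    padded = np.pad(mask, r, mode="edge")
    out = np.ones((h, w), dtype=bool)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def partition(mask, r):
    """Split pixels into Interesting / Surrounding / Border sets by
    eroding r pixels in from either side of the mask boundary."""
    interest = _erode(mask, r)        # well inside the traced outline
    surround = _erode(~mask, r)       # well outside it
    border = ~(interest | surround)   # the ambiguous band, ignored
    return interest, surround, border

# A 10x10 square region in a 20x20 image, eroded 2 pixels either way.
mask = np.zeros((20, 20), dtype=bool)
mask[5:15, 5:15] = True
interest, surround, border = partition(mask, 2)
print(interest.sum(), surround.sum(), border.sum())  # 36 204 160
```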

3 Results

3.1 Applying Local Variation

For all datasets, every channel was separated and processed as described above for local variation, and the resulting correlations between depth and the other channels were plotted as shown in Figure 6 (compared to normal skin variations in Figure 5a). These results would appear to show that localised variations in depth (and indeed in the colour channels) do not provide sufficient variation to support any robust classification. There are no global variations in the shape of the distributions between types, and furthermore no suitable separation between interesting and surrounding points. Principal Components Analysis (using the two largest eigenvectors) of depth included alongside the other 6 channels as a feature vector confirms this lack of useful variation, as shown by the similarity of the projections in Figure 4, to which depth contributes no significant variation.

Figure 4. PCA projection of mole dataset (interesting skin vs. surrounding skin).

3.2 Applying Global Orientation

For all datasets, the z-depth was first projected onto a fitted planar surface to accommodate global orientation, with the resulting correlations between the projected depth and the unmodified colour channels plotted as shown in Figure 7 (compared to normal skin variations in Figure 5b). These results would appear to show considerably better potential for classification, indicating a wide range of variation between datasets. As anticipated, the disrupted freckle and liverspot cases show no benefit from the addition of depth data in distinguishing between surrounding and interesting skin. However, in the cases of mole and scab (where the stereo recovery process was unhindered) there is a good degree of separation both between and within the data.
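The PCA projection onto the two largest eigenvectors can be sketched as an eigendecomposition of the sample covariance of the 7-channel feature vectors; `pca_project` and the synthetic data are illustrative, not the authors' code:

```python
import numpy as np

def pca_project(features, k=2):
    """Project row-vector samples onto the k principal components
    (eigenvectors of the sample covariance with largest eigenvalues)."""
    centred = features - features.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # k largest
    return centred @ top

# Pixels as rows of a (num_pixels, 7) matrix: z, R, G, B, H, S, V.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 7))
features[:, 0] *= 10.0  # exaggerate variance in one channel
projection = pca_project(features)
print(projection.shape)  # (200, 2)
```

The first projected component then carries the largest share of the total variance, which is what makes the near-identical scatter with and without depth in Figure 4 informative.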
It should be noted that applying global orientation and then analysing local variation does not improve the distributions of the data. That is, it would appear that describing the local roughness of a region requires a more complex treatment of local surface structure.

4 Conclusions

In summary: the use of depth information as another modality for enhanced dermatological classification shows promise, but only under the guarantee that the data can be captured accurately and fitted so as to best preserve surface structure. Capturing the data accurately requires careful control of the environment lighting and of other factors that can affect the stereo recovery process, especially the presence of hair follicles (i.e. shaving the skin area should be performed first). Fitting the data so as to best preserve surface structure might perform better with more complex underlying surfaces (e.g. cylinders), and with the data then projected onto that surface orthogonally. Furthermore, better localised analysis of local roughness would perhaps yield more representative features for a region (e.g. looking at the variation of surface normal incidence, or the curvature of small patches).

This investigation has only looked at a very small sample size and non-exotic subject regions. New camera calibration and specialist lenses could enable ever finer 3D information, and the use of image segmentation and more advanced classification techniques has the potential to greatly improve and fully automate the process of diagnosis.

Figure 5. Correlations for normal dataset, split exactly into 2 arbitrary sets.

Figure 6. Local variation from mean for 3x3 neighbourhood (selection via border erosion +/- 10). Red, green, blue, hue, saturation and value channels plotted against depth for each dataset.

Figure 7. Global oriented depth data plotted against unmodified red, green, blue, hue, saturation and value channels for each dataset (selection via border erosion +/- 20).