Quasi-thematic Features Detection & Tracking. Future Rover Long-Distance Autonomous Navigation


Quasi-thematic Feature Detection and Tracking for Future Rover Long-Distance Autonomous Navigation
Authors: Affan Shaukat, Conrad Spiteri, Yang Gao, Said Al-Milli, and Abhinav Bajpai
Surrey Space Centre, University of Surrey
May 10, 2013

Contents
1 Introduction
  - Problem
  - Research overview
  - Performance Evaluation
2
  - Bottom-up visual saliency
  - Binary Maps (Rock Detection)
  - Object Tracking
3
  - Object Detection (Method of Moments)
  - Object Tracking
4
  - Object Detection Accuracy (N-MODA)
  - Object Tracking Accuracy (MOTA)
  - Datasets
  - Results & Conclusion

Detect & Track Rocks on Planetary Surfaces
- Autonomous navigation systems for planetary rovers need to detect, track and avoid obstacles (e.g., rocks)
- Vision-based perceptual inputs are deemed very useful for rover autonavigation (e.g., the MER and MSL missions)
- More importantly, robust detection and tracking of objects in visual scenes is required for (not limited to):
  - Obstacle detection and avoidance (e.g., rocks)
  - Visual odometry
  - Path planning
  - Path following and autonomous navigation
  - SLAM (loop closures etc.)

Problems...
- Use standard supervised learning techniques (e.g., SVMs, tree learning, Gaussian classifiers), however...
  [Figure: training data versus what is seen in reality. Image courtesy of NASA/JPL-Caltech]
- Use over-saturated contextual features (e.g., SIFT), however...
  - Complex features are detected that describe objects (e.g., rocks)
  - They require computationally expensive tracking algorithms
  - Higher memory requirements
- Possible solutions? (Use symbolic descriptors for objects)
  - Cognitive techniques: saliency maps to describe objects
  - Shape-based features to describe objects

Rock Detection & Tracking Using Thematic Features
1 Visual Saliency Based Detection and Tracking:
  - Bottom-up visual saliency model for object detection (Itti-Koch-Niebur, 1998)
  - Histogram shape-based image thresholding for binary saliency maps (Otsu's method)
  - Binary saliency blobs (describing rocks) are used as semantic feature descriptors
  - A heuristic instance-based search algorithm (k-NN search) tracks these blobs over subsequent frames
2 :
  - Image segmentation via binarisation using a threshold selection criterion
  - Contours of individual patches (i.e., blobs) are extracted using a border following method
  - The Hu set of invariant moments is computed for each contour
  - Hu moments from two subsequent frames are compared to achieve tracking

Quantitative Evaluation Using Ground-truth Data
- Quantitative analysis using the standard detection/tracking evaluation metrics and protocols set out in [1]
- Datasets used: lab-based, simulated (PANGU) and real-world (SEEKER), all replicating planetary surfaces
- Datasets are hand-labelled using a planetary rock annotation tool purpose-built at the Surrey Space Centre

[1] R. Kasturi et al. Framework for performance evaluation of face, text, and vehicle detection and tracking in video: Data, metrics, and protocols, Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 31, no. 2, pp. 319-336, 2009.

Visual Saliency-based Paradigms
- Inspired by the information selection property of biological visual systems
- Models can be based on either computational or cognitive research findings
- A saliency map shows the conspicuity of each pixel in probabilistic terms
- A number of characteristics can act as a stimulus towards conspicuity: texture, colour, size, shape, orientation etc.

Itti-Koch-Niebur model
- We use the Itti-Koch-Niebur model (Itti-98) [2] to describe rocks in terms of saliency maps
- It uses centre-surround differences across multi-scale image features (colour, intensity, and orientation)

[Figure: model pipeline. Input image -> linear filtering (colours, intensity, orientation) -> centre-surround differences and normalisation (3 types of feature maps) -> across-scale combination and normalisation (3 types of conspicuity maps) -> linear combination of all three channels -> saliency map. Example panels: colour conspicuity, intensity conspicuity, orientation conspicuity, combined conspicuity.]

[2] L. Itti et al. A model of saliency-based visual attention for rapid scene analysis, Pattern Analysis and Machine Intelligence, IEEE Transactions on, pp. 1254-1259, 1998.
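The centre-surround idea can be illustrated with a heavily simplified sketch. It is not the Itti-Koch-Niebur model itself: it uses the intensity channel only, box filters instead of Gaussian pyramids, and a plain divide-by-maximum normalisation. The function names and the `scales` radii are assumptions made for this illustration.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window, using an integral image."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row/column of zeros.
    ii = np.pad(padded.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, w = img.shape
    window_sum = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
                  - ii[k:k + h, :w] + ii[:h, :w])
    return window_sum / (k * k)

def intensity_saliency(img, scales=((1, 4), (2, 8))):
    """Toy saliency map: averaged |centre - surround| over a few scale pairs.

    `scales` holds hypothetical (centre_radius, surround_radius) pairs.
    """
    img = np.asarray(img, dtype=float)
    saliency = np.zeros_like(img)
    for centre, surround in scales:
        diff = np.abs(box_blur(img, centre) - box_blur(img, surround))
        if diff.max() > 0:
            diff = diff / diff.max()   # simple per-map normalisation
        saliency += diff
    return saliency / len(scales)

# A bright 5x5 "rock" on a dark background stands out in the map.
scene = np.zeros((40, 40))
scene[18:23, 18:23] = 1.0
sal = intensity_saliency(scene)
```

A full implementation would add the colour and orientation channels and the across-scale combination named in the pipeline figure.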

Otsu's method
- Histogram shape-based thresholding to convert the saliency map into a binary image [3]
- Assumption: saliency map images have a bimodal distribution (i.e., two classes of pixels: salient objects (rocks) and the background)
- Exhaustive search strategy to compute the optimum threshold that minimises the intra-class variance

[3] N. Otsu. A threshold selection method from gray-level histograms, Systems, Man and Cybernetics, IEEE Transactions on, vol. 9, no. 1, pp. 62-66, 1979.
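The exhaustive search can be sketched in a few lines of NumPy. This is a minimal illustration, assuming an 8-bit map with integer values 0 to 255. The function name `otsu_threshold` is chosen here and is not taken from the slides.

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Return the grey level t that minimises the intra-class variance.

    Pixels with value < t form one class, pixels with value >= t the other.
    """
    hist, _ = np.histogram(np.asarray(image).ravel(), bins=bins, range=(0, bins))
    prob = hist / hist.sum()
    levels = np.arange(bins)
    best_t, best_var = 0, np.inf
    for t in range(1, bins):                      # exhaustive search
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0 or w1 == 0:
            continue                              # one class is empty
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var0 = ((levels[:t] - mu0) ** 2 * prob[:t]).sum() / w0
        var1 = ((levels[t:] - mu1) ** 2 * prob[t:]).sum() / w1
        within = w0 * var0 + w1 * var1            # intra-class variance
        if within < best_var:
            best_t, best_var = t, within
    return best_t

# Bimodal toy map: dark background (10) and bright salient pixels (200).
toy = np.array([[10, 10, 200], [10, 200, 200]], dtype=np.uint8)
t = otsu_threshold(toy)
binary = toy >= t
```

The binary map `binary` then marks the salient (rock) pixels.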

k-NN Search-based Tracking
- A k-NN search algorithm tracks the binary maps without requiring explicit model training or a priori knowledge of the dataset
- ROI blobs are collated throughout subsequent frames by applying the Euclidean norm ($\ell_2$ norm)

$$\forall\, l, r:\quad \mathrm{knn}(R_l^t) \rightarrow L, \qquad L = \arg\min_{r} \left\lVert R_l^t - Q_r^{t-1} \right\rVert \tag{1}$$
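As a rough sketch of this matching step, the snippet below assigns each blob in the current frame the label of its nearest blob in the previous frame. The slides do not say which blob features enter the norm, so describing each blob by its centroid is an assumption. So are k = 1 and the name `match_blobs`.

```python
import numpy as np

def match_blobs(prev_feats, curr_feats):
    """For each current blob, return the index of the nearest previous blob.

    Each row is one blob's feature vector (here assumed to be a centroid).
    """
    prev = np.asarray(prev_feats, dtype=float)
    curr = np.asarray(curr_feats, dtype=float)
    # Pairwise Euclidean (l2) distances, shape (n_curr, n_prev).
    dists = np.linalg.norm(curr[:, None, :] - prev[None, :, :], axis=2)
    return dists.argmin(axis=1)

prev_centroids = [[10.0, 10.0], [50.0, 50.0]]   # frame t-1
curr_centroids = [[52.0, 49.0], [11.0, 12.0]]   # frame t
labels = match_blobs(prev_centroids, curr_centroids)
```

Here `labels[i]` is the index of the previous-frame blob that current blob `i` is matched to.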

Binarised Image Segmentation via Thresholding
- Image segmentation using the MAT algorithm [4]
- Utilises local image statistics of mean and variance within a cluster and two thresholds obtained from the intensity distribution histogram
- The algorithm uses a simple percentile (of the brightness) measurement procedure

$$dst(r,c) = \begin{cases} 0 & \text{if } dst(r,c) > t_0 \\ 0 & \text{if } dst(r,c) < t_1 \\ 1 & \text{otherwise} \end{cases} \tag{2}$$

[4] F. Yan et al. A multistage adaptive thresholding method, Pattern Recognition Letters, 26, pp. 1183-1191, 2005.
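The two-threshold rule of equation (2) amounts to keeping only pixels inside an intensity band. The sketch below shows that rule alone, not the full MAT algorithm. The name `band_threshold`, the separate input array `src` and the example threshold values are assumptions for illustration.

```python
import numpy as np

def band_threshold(src, t0, t1):
    """Binarise per equation (2): 0 above t0, 0 below t1, 1 otherwise."""
    src = np.asarray(src)
    return ((src >= t1) & (src <= t0)).astype(np.uint8)

# Hypothetical thresholds; in the slides they come from the intensity histogram.
pixels = np.array([5, 50, 120, 200])
mask = band_threshold(pixels, t0=150, t1=40)
```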

Continued...
- Binary blobs representing ROIs are selected based on the zeroth moment (i.e., area) to eliminate outliers using an a priori defined threshold
- A border following method (based on [5]) is performed to extract blob contours
- The Hu set of (translation and rotation) invariant moments [6] is computed for these contours
- This yields a vector of 7 values describing each blob (rock)

[5] S. Suzuki. Topological Structural Analysis of Digitized Binary Images by Border Following, Computer Vision, Graphics and Image Processing, 32-46.
[6] M. K. Hu. Visual Pattern Recognition by Moment Invariants, Proc. IRE, vol. 49, p. 1428.
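To make the 7-value descriptor concrete, here is a minimal NumPy sketch of the seven Hu invariants. One assumption to flag: it computes region moments over a filled binary mask, whereas the slides compute the moments on extracted contours. The function name `hu_moments` is chosen for this illustration, and the mask must contain at least one foreground pixel.

```python
import numpy as np

def hu_moments(mask):
    """Seven Hu invariants of a non-empty binary mask (region-based)."""
    ys, xs = np.nonzero(mask)
    m00 = float(len(xs))                    # zeroth moment = area in pixels
    x, y = xs - xs.mean(), ys - ys.mean()   # coordinates about the centroid

    def eta(p, q):
        # Normalised central moment of order (p, q).
        return (x ** p * y ** q).sum() / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])

# An L-shaped blob, then the same blob shifted and rotated by 90 degrees.
blob = np.zeros((20, 20), dtype=int)
blob[2:10, 3:6] = 1
blob[8:10, 3:12] = 1
shifted = np.roll(blob, (5, 4), axis=(0, 1))
rotated = np.rot90(blob)
```

The descriptors of `blob`, `shifted` and `rotated` agree up to floating-point error, which is the invariance the tracking step relies on.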

Hu Moments Matching to Achieve Tracking
- An exhaustive search strategy is applied to compare the Hu moments between two subsequent frames
- This results in matched pairs that are uniquely labelled throughout all the frames to achieve tracking

More formally,

$$I(A, B) = \sum_{j=1}^{7} \left| \frac{1}{F_j^A} - \frac{1}{F_j^B} \right| \tag{3}$$

where $F_j^A$ and $F_j^B$ are defined (for Hu moments $h_j^A$ and $h_j^B$ of objects A and B) as follows:

$$F_j^A = \operatorname{sign}(h_j^A) \cdot \log h_j^A \tag{4}$$

$$F_j^B = \operatorname{sign}(h_j^B) \cdot \log h_j^B \tag{5}$$
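The comparison can be sketched as follows. Two assumptions are made: the logarithm is taken base 10, and it is applied to the magnitude of each moment (the sign factor carries the sign), since the slides do not state either. The function names and the example moment vectors are illustrative, not measured data. Moments equal to 0 or with magnitude 1 would need special handling and are not covered.

```python
import numpy as np

def log_hu(h):
    """F_j = sign(h_j) * log10|h_j| (base 10 and |.| are assumptions)."""
    h = np.asarray(h, dtype=float)
    return np.sign(h) * np.log10(np.abs(h))

def match_distance(h_a, h_b):
    """I(A, B) = sum over j of |1/F_j^A - 1/F_j^B|."""
    return float(np.abs(1.0 / log_hu(h_a) - 1.0 / log_hu(h_b)).sum())

def match_frames(prev, curr):
    """Exhaustive search: index of the closest previous blob per current blob."""
    return [min(range(len(prev)), key=lambda i: match_distance(prev[i], h))
            for h in curr]

# Made-up Hu vectors for two rocks, seen again slightly perturbed.
rock_a = np.array([0.2, 1e-2, 1e-3, 1e-4, 1e-7, 1e-5, -1e-8])
rock_b = np.array([0.5, 1e-1, 1e-2, 1e-2, 1e-4, 1e-3, 1e-5])
pairs = match_frames([rock_a, rock_b], [rock_b * 1.05, rock_a * 1.05])
```

Each entry of `pairs` gives the previous-frame blob whose label the current blob inherits.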

Normalised Multiple Object Detection Accuracy
For any given frame $t$, the number of false positives ($fp_t$), misses ($ms_t$) and true positives ($tp_t$) is calculated by measuring the spatial overlap between the ground-truth objects and the detector/tracker outputs. Given that $G_i^t$ is the $i$-th ground-truth object and $D_i^t$ is the $i$-th detected object, the spatial overlap ratio ($OR_i^t$) is calculated as

$$OR_i^t = \frac{\left| G_i^t \cap D_i^t \right|}{\left| G_i^t \cup D_i^t \right|} \tag{6}$$

where

$$\forall\, i, t:\quad D = \begin{cases} D_i^t \ \text{true positive} & : OR_i^t \geq 0.2 \\ D_i^t \ \text{false positive} & : OR_i^t < 0.2 \\ D_i^t \ \text{miss} & : \text{unmatched} \end{cases} \tag{7}$$

We calculate the Normalised Multiple Object Detection Accuracy (N-MODA) for the entire image sequence of each test dataset as follows:

$$\text{N-MODA} = 1 - \frac{\sum_{t=1}^{N_{frames}} \left( c_{ms}(ms_t) + c_f(fp_t) \right)}{\sum_{t=1}^{N_{frames}} N^t} \tag{8}$$

where

$$\forall\, t:\quad N^t = \begin{cases} N_G^t & \text{if } N_G^t \geq N_D^t \\ N_G^t & \text{if } N_G^t < N_D^t \end{cases}$$

For $\sum_{t=1}^{N_{frames}} N^t = 0$ we force N-MODA = 0. The parameters $c_{ms}$ and $c_f$ are weighting parameters (set to $c_{ms} = c_f = 1$). $N_G^t$ is the number of ground-truth objects and $N_D^t$ is the number of detected objects.
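A small worked sketch of the overlap ratio and N-MODA follows. Representing objects as axis-aligned boxes `(x0, y0, x1, y1)` is an assumption, since the slides do not state the annotation geometry. The function names and the per-frame counts are made up for illustration.

```python
def overlap_ratio(g, d):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0, min(g[2], d[2]) - max(g[0], d[0]))
    ih = max(0, min(g[3], d[3]) - max(g[1], d[1]))
    inter = iw * ih
    union = ((g[2] - g[0]) * (g[3] - g[1])
             + (d[2] - d[0]) * (d[3] - d[1]) - inter)
    return inter / union if union else 0.0

def n_moda(misses, false_pos, n_objects, c_ms=1.0, c_f=1.0):
    """N-MODA over a sequence; each argument holds one value per frame."""
    total = sum(n_objects)
    if total == 0:
        return 0.0                      # forced to 0 when there are no objects
    errors = sum(c_ms * m + c_f * f for m, f in zip(misses, false_pos))
    return 1.0 - errors / total

# Two 10x10 boxes offset by 5 pixels overlap with ratio 50/150.
ratio = overlap_ratio((0, 0, 10, 10), (5, 0, 15, 10))
is_true_positive = ratio >= 0.2

# Two frames, 4 objects each, one miss and one false positive in total.
score = n_moda(misses=[1, 0], false_pos=[0, 1], n_objects=[4, 4])
```

With these numbers the score is 1 - 2/8 = 0.75.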

Multiple Object Tracking Accuracy
In order to evaluate the performance of the tracking system, we use the Multiple Object Tracking Accuracy (MOTA) as follows:

$$\text{MOTA} = 1 - \frac{\sum_{t=1}^{N_{frames}} \left( c_{ms}(ms_t) + c_f(fp_t) + c_s(ID_{SW}^t) \right)}{\sum_{t=1}^{N_{frames}} N^t} \tag{9}$$

where $ID_{SW}^t$ is the number of object label mismatches in the current frame $t$ relative to the previous frame $t-1$, and $c_s = \log_{10}$ (the count of mismatches always starts from 1).

We compute these evaluation metrics along with important ROC measures, i.e., true positives per image (Tp/img), false positives per image (Fp/img), false negatives per image (Fn/img), miss rate and true positive rate (TPR).
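The MOTA formula can be sketched in the same style. The function name and the example counts are illustrative. Returning 0 when no objects are present mirrors the N-MODA convention and is an assumption here.

```python
import math

def mota(misses, false_pos, id_switches, n_objects, c_ms=1.0, c_f=1.0):
    """MOTA over a sequence; each argument holds one value per frame.

    id_switches counts start from 1, so a frame without label
    mismatches contributes log10(1) = 0.
    """
    total = sum(n_objects)
    if total == 0:
        return 0.0   # assumption: same convention as N-MODA
    errors = sum(c_ms * m + c_f * f + math.log10(s)
                 for m, f, s in zip(misses, false_pos, id_switches))
    return 1.0 - errors / total

# No mismatches: MOTA reduces to the N-MODA value for the same counts.
no_switch = mota([1, 0], [0, 1], [1, 1], [4, 4])
# Ten mismatches in the second frame cost log10(10) = 1 error.
with_switch = mota([0, 0], [0, 0], [1, 10], [5, 5])
```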

Test Datasets
[Figure: example images from the three test datasets]
- Lab-based dataset
- RAL Space (SEEKER) dataset
- PANGU (simulated) dataset

Evaluation Results
Table:

                        Dataset
Measure      Lab-based   PANGU   SEEKER   Mean
N-MODA       0.80        0.84    0.61     0.75
MOTA         0.80        0.83    0.61     0.75
Tp/img       3.83        2.37    2.19     2.80
Fp/img       0.22        0.05    0.26     0.18
Fn/img       0.74        0.41    1.11     0.75
Objs/img     4.79        2.83    3.56     3.73
Miss rate    0.15        0.11    0.29     0.18
TPR          0.85        0.88    0.71     0.81

Continued...
Table:

                        Dataset
Measure      Lab-based   PANGU   SEEKER   Mean
N-MODA       0.75        0.63    0.65     0.68
MOTA         0.71        0.61    0.62     0.65
Tp/img       4.36        2.14    3.11     3.20
Fp/img       1.36        1.07    1.54     1.32
Fn/img       0.08        0.18    0.13     0.13
Objs/img     5.80        3.39    4.78     4.66
Miss rate    0.01        0.07    0.03     0.04
TPR          0.99        0.93    0.97     0.96

Conclusion
- We proposed two distinct approaches to object detection and tracking using semantic feature descriptors
- The use of sparse semantic features reduces the computational load and enables the use of heuristic tracking techniques
- We performed quantitative performance evaluations using lab-based, simulated and real-world datasets
- We believe such techniques could form a very effective basis for object detection/tracking in future long-distance autonomous rover navigation
- We plan to experiment with more challenging, noisy datasets to establish a solid foundation for the proposed concept

END!
THANK YOU!!