SIFT, BoW architecture and one-against-all Support vector machine

Mohamed Issolah, Diane Lingrand, Frederic Precioso

I3S Lab - UMR 7271 UNS-CNRS
2000, route des Lucioles - Les Algorithmes - bât. Euclide B
06900 Sophia Antipolis - France

Abstract

For this first participation in ImageCLEF Plant Identification, we build on the reference Bag-of-Words (BoW) framework. We extract Points-of-Interest (PoI) using the SIFT detector in every image and describe each local feature with the SIFT descriptor. The visual dictionary is built by running a K-means algorithm with 100 clusters on the local features. Each image is then represented by its histogram over the dictionary, using a hard-assignment strategy. We classify the images with as many binary one-against-all Support Vector Machines as there are plant classes per organ type. Our aim is to evaluate a classic baseline of multi-class image categorization on the plant identification task. Our first results illustrate how difficult this task is, and show that a framework which has become a standard baseline for classifying general image datasets is not immediately relevant to Plant Identification data.

1 Our system

Our system for the ImageCLEF Plant Identification task [1] is built on the reference Bag-of-Words (BoW) framework and a set of binary Support Vector Machines. We use the XML metadata file provided with the images to extract information such as the type of content or the background type. Let us now detail our settings.

1.1 Feature extraction and Image description

We extract Points-of-Interest (PoI) in each image using the SIFT detector and describe them with the SIFT descriptor. We extract about 1000 points in each image, with the standard settings of the OpenCV C++ library:

Number of layers per octave: 3
Minimum threshold to consider a point as a PoI: 0.04
σ of the Gaussian: 1.6

Then we build a visual dictionary using K-means, where K is set to 100. Finally, we represent each image by its histogram, obtained by the hard assignment of each local feature to the BoW clusters. The histogram is normalized by the size of the BoW.
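The hard-assignment step described above can be sketched in standard C++ as follows. This is a minimal illustration, not the authors' OpenCV code: the names `Descriptor`, `squaredDistance` and `bowHistogram` are hypothetical, and the sketch assumes that normalization divides each bin by the number of local features in the image, which is only one possible reading of "normalized by the size of the BoW".

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// A local feature vector, e.g. a 128-dimensional SIFT descriptor.
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hard assignment: each descriptor votes for its single nearest visual word.
// ASSUMPTION: bins are divided by the number of descriptors in the image.
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& dictionary) {
    assert(!dictionary.empty());
    std::vector<float> histogram(dictionary.size(), 0.0f);
    for (const Descriptor& d : descriptors) {
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t k = 0; k < dictionary.size(); ++k) {
            const float dist = squaredDistance(d, dictionary[k]);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        histogram[best] += 1.0f;
    }
    if (!descriptors.empty()) {
        for (float& bin : histogram) {
            bin /= static_cast<float>(descriptors.size());
        }
    }
    return histogram;
}
```

With K = 100 as in the text, `dictionary` would hold 100 centroids and the returned histogram would have 100 bins.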

1.2 Learning

We build one-against-all SVMs for each plant class and for each organ type, and we exploit the XML metadata provided by the challenge during the training phase. We thereby identify the type-of-content information needed to train organ-oriented SVMs separately. The same SVM configuration is used for every organ:

C = 100
Kernel type: LINEAR
Number of iterations: 10000

In the end, we obtain one trained SVM per plant class and per organ type among: Entire, Stem, Fruit, Flower, Leaf NaturalBackground and Leaf SheetAsBackground. SVM outputs are organized into vectors, one vector per organ type, as depicted in Figure 1.

Fig. 1. Each vector contains the classification scores obtained with n one-against-all SVMs (one SVM score per plant class) with respect to one of the 6 organs. In the 2013 dataset, the maximum number of plant classes is n = 250.

1.3 Confidence computation

For a test image, our system performs keypoint extraction and description; then, from the XML file associated with the test image, we extract the organ and type information. We then classify the test image histogram using all the SVMs corresponding to the associated organ. For example, if the organ of the image is Leaf and the type is SheetAsBackground, our system runs the set of n SVMs corresponding to the Leaf SheetAsBackground vector. The final class is the one associated with the SVM providing the highest confidence score. Figure 2 summarizes all the steps of our system.
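The decision rule of this section can be illustrated with a short standard C++ sketch. It is written under stated assumptions rather than taken from the authors' code: `LinearSvm`, `OrganModels`, `decisionScore` and `predictClass` are hypothetical names, and the "confidence score" is assumed to be the signed decision value w · x + b of each linear SVM.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One linear one-against-all SVM (hypothetical layout): score(x) = w . x + b.
struct LinearSvm {
    std::vector<float> weights;
    float bias;
};

// ASSUMPTION: the confidence score is the signed decision value.
float decisionScore(const LinearSvm& svm, const std::vector<float>& histogram) {
    assert(svm.weights.size() == histogram.size());
    float score = svm.bias;
    for (std::size_t i = 0; i < histogram.size(); ++i) {
        score += svm.weights[i] * histogram[i];
    }
    return score;
}

// Organ label (read from the XML metadata) -> one SVM per plant class.
using OrganModels = std::map<std::string, std::vector<LinearSvm>>;

// Runs only the SVMs of the given organ and returns the index of the class
// with the highest score, or -1 if no SVMs exist for that organ.
int predictClass(const OrganModels& models, const std::string& organ,
                 const std::vector<float>& histogram) {
    const auto it = models.find(organ);
    if (it == models.end() || it->second.empty()) {
        return -1;
    }
    const std::vector<LinearSvm>& svms = it->second;
    int best = 0;
    float bestScore = decisionScore(svms[0], histogram);
    for (std::size_t c = 1; c < svms.size(); ++c) {
        const float score = decisionScore(svms[c], histogram);
        if (score > bestScore) {
            bestScore = score;
            best = static_cast<int>(c);
        }
    }
    return best;
}
```

Keying the models by organ label means a test image only ever triggers the n SVMs of its own organ, which matches the per-organ score vectors of Figure 1.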

Fig. 2. Global schema of the system.

2 Experiments

Our experiments focus on three steps of our standard framework:

1. Configuration of K-means
2. Configuration of the SVMs
3. Decreasing the amount of data to process

For the first configuration, several parameters have to be determined, but the main one is K, the number of clusters. The higher the number of clusters, the more discriminant the Bag-of-Words histogram representation. The number of clusters is fixed to 100: because the K-means configuration is common to all organs and plant classes, it must preserve the generalization capability of the BoW representation and must require a reasonable computation time. The same holds for the SVMs, which have common settings for all organs and all classes in our baseline implementation. The larger the parameter C, the more heavily training errors are penalized and the lower the training error rate. C is set to 100.
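To make the role of K concrete, the dictionary construction can be sketched as plain Lloyd iterations in standard C++. This is an illustrative sketch only: the paper relies on OpenCV's K-means, whereas the hypothetical `buildDictionary` below assumes a deterministic initialization with the first K points and stops when no centroid moves or after `maxIterations` passes.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<float>;

// Index of the centroid closest to p (squared Euclidean distance).
std::size_t nearestCentroid(const Point& p, const std::vector<Point>& centroids) {
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t k = 0; k < centroids.size(); ++k) {
        float dist = 0.0f;
        for (std::size_t i = 0; i < p.size(); ++i) {
            const float d = p[i] - centroids[k][i];
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = k;
        }
    }
    return best;
}

// Plain Lloyd iterations over points of identical dimension.
// ASSUMPTION: centroids start as the first K points (not OpenCV's default).
std::vector<Point> buildDictionary(const std::vector<Point>& points,
                                   std::size_t K, int maxIterations) {
    assert(K > 0 && points.size() >= K);
    std::vector<Point> centroids(points.begin(),
                                 points.begin() + static_cast<std::ptrdiff_t>(K));
    const std::size_t dim = points[0].size();
    for (int iter = 0; iter < maxIterations; ++iter) {
        // Assignment step: accumulate each point into its nearest cluster.
        std::vector<Point> sums(K, Point(dim, 0.0f));
        std::vector<std::size_t> counts(K, 0);
        for (const Point& p : points) {
            const std::size_t k = nearestCentroid(p, centroids);
            for (std::size_t i = 0; i < dim; ++i) sums[k][i] += p[i];
            ++counts[k];
        }
        // Update step: move each non-empty cluster's centroid to its mean.
        bool changed = false;
        for (std::size_t k = 0; k < K; ++k) {
            if (counts[k] == 0) continue;  // an empty cluster keeps its centroid
            for (std::size_t i = 0; i < dim; ++i) {
                const float updated = sums[k][i] / static_cast<float>(counts[k]);
                if (updated != centroids[k][i]) changed = true;
                centroids[k][i] = updated;
            }
        }
        if (!changed) break;
    }
    return centroids;
}
```

Each pass costs on the order of (number of points) × K × (descriptor dimension) operations, which is why both K and the number of points fed to the clustering weigh directly on computation time.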

Clustering is the most computationally demanding step. In order to reduce its impact, only 100 of the 1000 points extracted in each image are considered during the clustering. The keypoints to be discarded are chosen randomly.

2.1 Resources

For the implementation, we use the OpenCV library, which offers different types of detectors and different learning methods. The implementation language is C++, for efficiency and speed. The LibXML library is used for XML parsing. The program was run on a server with two 6-core Intel Xeon X5675 processors at 3.06 GHz and 24 GB of DDR3-1333 RAM.

2.2 Results

The goal of our proposal is to evaluate a standard framework for image categorization in multimedia databases, using the classic local feature, SIFT (both for the detector and the descriptor), a BoW architecture, and as many one-against-all SVMs as binary classifications required. The results are presented in Figures 3, 4 and 5.

Run name    runfilename             Entire  Flower  Fruit  Leaf   Stem   NaturalBackground
I3S Run 1   1368034466828 new 100   0.017   0.023   0.041  0.038  0.025  0.026
I3S Run 2   1368165605197 new2 100  0.017   0.023   0.041  0.038  0.025  0.026

Fig. 3. Results on ImageCLEF 2013 with respect to the organs.

3 Conclusion

For our first participation in the ImageCLEF Plant Identification challenge, we have implemented a standard framework which has proved to be powerful for image categorization in multimedia databases. To do so, we used the SIFT algorithm for both local feature detection and description, then represented each image by its histogram over the visual dictionary. The resulting histograms are classified by one-against-all SVMs, one SVM per plant class and per organ. Despite the effectiveness of such an architecture for image categorization, the results are somewhat disappointing on the ImageCLEF Plant Identification task. We are currently working on how to optimize all the parameters of our method to achieve better results.

Fig. 4. NaturalBackground scores.

References

1. Caputo, B., Muller, H., Thomee, B., Villegas, M., Paredes, R., Zellhofer, D., Goeau, H., Joly, A., Bonnet, P., Gomez, J.M., Varea, I.G., Cazorla, M.: ImageCLEF 2013: the vision, the data and the open challenges. Proceedings CLEF 2013, LNCS (2013)

Fig. 5. SheetAsBackground scores.