University of Freiburg at ImageCLEF06 - Radiograph Annotation Using Local Relational Features


University of Freiburg at ImageCLEF06 - Radiograph Annotation Using Local Relational Features

Lokesh Setia, Alexandra Teynor, Alaa Halawani and Hans Burkhardt
Chair of Pattern Recognition and Image Processing (LMB), Albert-Ludwigs-University Freiburg, Germany
{setia, teynor, halawani, burkhardt}@informatik.uni-freiburg.de

Abstract

This paper provides details of the experiments performed by the LMB group at the University of Freiburg, Germany, for the medical automatic annotation task at ImageCLEF 2006. We use local features calculated around interest points, which have recently produced excellent results for various image recognition and classification tasks. We propose the use of relational features, which are highly robust to illumination changes and thus well suited to X-ray images. Results for various feature and classifier settings are reported. A significant improvement is observed when the relative positions of the interest points are also taken into account during matching. On the given test set, our best run had a classification error rate of 16.7 %, just 0.5 % higher than the best overall submission, and was therefore ranked second in the medical automatic annotation task at ImageCLEF 2006.

Categories and Subject Descriptors: H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; I.5 [Pattern Recognition]: I.5.3 Clustering; I.5.4 Applications

General Terms: Algorithms, Experimentation

Keywords: Classification, Radiograph, Image Annotation

1 Introduction

Growing image collections of various kinds have led to the need for sophisticated image analysis algorithms to aid classification and recognition. This is especially true for medical image collections, where the cost of manually classifying images collected during daily clinical routine can be overwhelming.
Automatic classification of these images would spare skilled human labour on this repetitive, error-prone task. As an example, a study carried out by the University Hospital in Aachen [1] revealed over 15 % human error in the assignment of a specific tag for X-ray images taken during normal clinical routine. Unless corrected, these images cannot be

found again with keyword search alone.

Figure 1: Intra-class variability within the class annotated as x-ray, plain radiography, coronal, upper extremity (arm), hand, musculosceletal system.

Our group (LMB) participated in the ImageCLEF 2006 medical automatic annotation task. In this report, we describe our experiments and submitted runs, which use a local version of relational features that are very robust to illumination changes, together with different kinds of accumulators to aid the matching process.

2 Database

The database for the task was made available by the IRMA group of the University Hospital Aachen, Germany. It consists of 10,000 fully classified radiographs taken randomly from medical routine. A further 1,000 radiographs, for which classification labels were not available to the participants, had to be classified. The aim is to find out how well current techniques can identify image modality, body orientation, body region, and the biological system examined, based on the images alone. The results of the classification step can be used for multilingual image annotation as well as for DICOM header corrections. The images are annotated with the complete IRMA code, a multi-axial code for image annotation; for the task, however, only a flat classification scheme with a total of 116 classes was used. Figure 1 shows that a great deal of intra-class variability exists, mostly in the form of illumination changes, small differences in position and scale, and noise. On the other hand, Figure 2 shows that the inter-class distance can be quite small, as the four frontal chest categories are at least pairwise very similar to each other. The confusion matrix between these four classes nevertheless indicates that more than 80 % of the images were correctly classified.

Figure 2: One example image each from classes 108, 109, 110 and 111 respectively. Some classes are hard to distinguish for the untrained eye, but are handled surprisingly well by automatic methods.

3 Features

3.1 Local Feature Generation

We use features inspired by invariant features based on group integration. Let G be the transformation group for the transformations against which invariance is desired; for radiograph images this is the group of translations. For rotation, only partial invariance is desired, as most radiographs are scanned upright. If an element g ∈ G acts on an image M, the transformed image is denoted by gM. An invariant feature must satisfy F(gM) = F(M) for all g ∈ G. Such invariant features can be constructed by integrating f(gM) over the transformation group G [2]:

I(M) = \frac{1}{|G|} \int_G f(gM) \, dg

Although the above features are invariant, they are not particularly discriminative, because the integration runs over the whole image, which contains large insignificant regions. An intuitive extension is to use local features, which have achieved good results in several image processing tasks. The idea is to first use a so-called interest point detector to select pixels with high information content in their neighbourhood. What counts as high information differs from application to application: there are, for example, detectors which select points of high gradient magnitude, or corner points (points with multiple gradient orientations). A desirable characteristic of an interest-point detector is robustness towards the above-mentioned transformations. In this work, we use the wavelet-based salient point detector proposed by Loupias and Sebe [3].
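As a concrete illustration of the group-integration principle, the following sketch averages a kernel function f over all cyclic translations of an image, a direct discretization of I(M) = 1/|G| ∫ f(gM) dg for the translation group. The kernel chosen here is arbitrary and for illustration only, not the authors' setup:

```python
import numpy as np

def group_integration_feature(image, f):
    """Average a kernel function f over all cyclic translations of the
    image -- a discretization of I(M) = 1/|G| * integral_G f(gM) dg
    for the group of translations (with wrap-around borders)."""
    h, w = image.shape
    total = 0.0
    for dy in range(h):
        for dx in range(w):
            shifted = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
            total += f(shifted)
    return total / (h * w)

# Example kernel: product of two neighbouring pixel values (a common
# choice in the group-integration literature, here just illustrative).
f = lambda m: m[0, 0] * m[0, 1]
```

Because the feature is an average over every translation of the image, translating the input merely permutes the terms of the sum, which is exactly the invariance property F(gM) = F(M).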
For each salient point s = (x, y), we extract a relational feature subvector around it, given by

R_k = \mathrm{rel}\bigl(I(x_2, y_2) - I(x_1, y_1)\bigr), \qquad k = 1, \ldots, n

where

(x_1, y_1) = \bigl(x + r_1 \cos(k \cdot 2\pi/n),\; y + r_1 \sin(k \cdot 2\pi/n)\bigr),
(x_2, y_2) = \bigl(x + r_2 \cos(k \cdot 2\pi/n + \varphi),\; y + r_2 \sin(k \cdot 2\pi/n + \varphi)\bigr)

with the rel operator defined as

\mathrm{rel}(x) = \begin{cases} 1 & \text{if } x < -\epsilon \\ \frac{\epsilon - x}{2\epsilon} & \text{if } -\epsilon \le x \le \epsilon \\ 0 & \text{if } \epsilon < x \end{cases} \qquad (1)

Relational functions were first introduced for texture analysis by Schael [4], motivated by the local binary patterns [5]. For each (r_1, r_2, \varphi) combination a local subvector is generated. In this work, we use three parameter sets, (0, 5, 0), (3, 6, \pi/2) and (2, 3, \pi), each with n = 12, to capture local information at different scales and orientations. The subvectors are concatenated to give one local feature vector per salient point. The local feature vectors of all images are pooled and clustered with the k-means algorithm; the number of clusters, denoted N_c, is determined empirically.
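The extraction of one local feature vector can be sketched as follows. This is an illustrative reimplementation, not the authors' code: pixel lookups use nearest-pixel rounding instead of interpolation, coordinates are clamped at the image border, and the threshold eps is a guessed value.

```python
import numpy as np

def rel(x, eps):
    # Soft relational operator from Eq. (1): 1 below -eps, 0 above +eps,
    # linear ramp (eps - x) / (2 * eps) in between.
    if x < -eps:
        return 1.0
    if x > eps:
        return 0.0
    return (eps - x) / (2.0 * eps)

def relational_subvector(image, x, y, r1, r2, phi, n=12, eps=0.05):
    """Compare grey values at point pairs on two circles of radii r1 and r2
    around the salient point (x, y), yielding R_1 ... R_n."""
    h, w = image.shape
    feats = []
    for k in range(1, n + 1):
        a = k * 2.0 * np.pi / n
        x1 = int(round(x + r1 * np.cos(a))); y1 = int(round(y + r1 * np.sin(a)))
        x2 = int(round(x + r2 * np.cos(a + phi))); y2 = int(round(y + r2 * np.sin(a + phi)))
        # Clamp to the image bounds so salient points near the border stay valid.
        x1, y1 = min(max(x1, 0), w - 1), min(max(y1, 0), h - 1)
        x2, y2 = min(max(x2, 0), w - 1), min(max(y2, 0), h - 1)
        feats.append(rel(image[y2, x2] - image[y1, x1], eps))
    return np.array(feats)

def local_feature(image, x, y):
    # Concatenate the three (r1, r2, phi) parameter sets from the paper,
    # giving a 3 * 12 = 36-dimensional vector per salient point.
    params = [(0, 5, 0.0), (3, 6, np.pi / 2), (2, 3, np.pi)]
    return np.concatenate([relational_subvector(image, x, y, *p) for p in params])
```

Note that on a constant image every grey-value difference is zero, so each component equals rel(0) = 0.5, the midpoint of the ramp.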

3.2 Building a Global Feature Vector

An equally important step is estimating the similarity between two images once local features have been calculated for both. If one expects to find, for a given test image, a training image of essentially the same object, correspondence-based matching algorithms are called for: for each local feature vector of the test image, the best training vector and its location are recalled (e.g. through nearest-neighbour search). The location information can then be used for a consistency check of whether the constellation of salient points builds a valid model for the given class. However, the large intra-class variability displayed by the radiograph images suggests that global-feature-based matching would perform better, as a small number of non-correspondences would not penalize the matching heavily. We build the following accumulator features from the cluster index image I(s), i.e. an image which contains, for each salient point, only its assigned cluster index.

3.2.1 All-Invariant Accumulator

The simplest histogram that can be built from a cluster index image is a 1-D accumulator counting how often each cluster occurs in a given image. All spatial information regarding the interest points is lost in the process. The feature can be written as

F(c) = \#\{\, s \mid I(s) = c \,\}, \qquad c = 1, \ldots, N_c

In our experiments, the cross-validation results were never better than 60 %. Clearly, critical spatial information is lost in the histogram construction.

3.2.2 Rotation-Invariant Accumulator

We first incorporate spatial information by counting pairs of salient points lying within a certain distance range and possessing particular cluster indices, i.e. a 3-D accumulator defined by

F(c_1, c_2, d) = \#\{\, (s_1, s_2) \mid (I(s_1) = c_1) \wedge (I(s_2) = c_2) \wedge (D_d < \|s_1 - s_2\|_2 < D_{d+1}) \,\}

where the extra dimension d runs from 1 to N_d (the number of distance bins).
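The 1-D histogram and the pairwise accumulators (both the rotation-invariant version above and the orientation-variant extension of Section 3.2.3) can be sketched as follows. This is an illustrative reimplementation under assumed conventions: pairs are counted in both orders, and the distance bin edges D_d are passed in rather than learned.

```python
import numpy as np

def all_invariant_accumulator(cluster_idx, n_clusters):
    """1-D histogram F(c): how often each k-means cluster index occurs
    among the salient points of one image."""
    return np.bincount(cluster_idx, minlength=n_clusters)

def pairwise_accumulator(points, cluster_idx, n_clusters, dist_edges,
                         n_angle_bins=None):
    """Pairwise accumulator over salient points. With n_angle_bins=None this
    is the rotation-invariant 3-D accumulator F(c1, c2, d); with an integer
    it becomes the orientation-variant 4-D version F(c1, c2, d, a) of
    Section 3.2.3, which quantizes the direction of s2 - s1 so that
    mirrored constellations (left vs. right hand) land in different bins."""
    n_d = len(dist_edges) - 1
    n_a = n_angle_bins or 1
    acc = np.zeros((n_clusters, n_clusters, n_d, n_a))
    for i in range(len(points)):
        for j in range(len(points)):
            if i == j:
                continue
            delta = points[j] - points[i]
            dist = np.linalg.norm(delta)
            d = np.searchsorted(dist_edges, dist) - 1
            if not (0 <= d < n_d):
                continue  # pair falls outside the modelled distance range
            if n_angle_bins is None:
                a = 0
            else:
                angle = np.arctan2(delta[1], delta[0]) % (2 * np.pi)
                a = int(angle / (2 * np.pi) * n_angle_bins) % n_angle_bins
            acc[cluster_idx[i], cluster_idx[j], d, a] += 1
    return acc if n_angle_bins else acc[:, :, :, 0]
```

With 20 clusters, 10 distance bins and 4 angle bins, flattening the 4-D accumulator yields the 16,000-dimensional vector mentioned in the text.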
The cross-validation results improve to about 68 %, but it should be noted that this accumulator is rotation invariant (it depends only on the distances between salient points), while the images are upright. Incorporating unnecessary invariance leads to a loss of discriminative performance. In radiograph classification in particular, mirroring the positions of the interest points can map to an entirely different class: a left hand becomes a right hand, and so on. We therefore incorporate orientation information in the next section.

3.2.3 Orientation-Variant Accumulator

We discussed this method in detail in [6]. The following cooccurrence matrix captures the statistical properties of the joint distribution of the cluster membership indices derived from the local features:

F(c_1, c_2, d, a) = \#\{\, (s_1, s_2) \mid (I(s_1) = c_1) \wedge (I(s_2) = c_2) \wedge (D_d < \|s_1 - s_2\|_2 < D_{d+1}) \wedge (A_a < \angle(s_1, s_2) < A_{a+1}) \,\}

with the new dimension a running from 1 to N_a (the number of angle bins). The size of the matrix is N_c \times N_c \times N_d \times N_a; for 20 clusters, 10 distance bins and 4 angle bins, this leads to a feature vector of size 16,000. The classification is done with the help of a multi-class

Figure 3: Top 8 nearest-neighbour results for different query radiographs. The top left image in each box is the query image; results follow from left to right, then top to bottom. Each caption gives the image name, the class label in brackets, and the distance according to the L1 norm.

SVM, running in one-vs-rest mode and using a histogram-intersection kernel. The libsvmtl implementation 1 was used for the experiments. Despite the large dimensionality, the speed should be acceptable for most cases: training the SVM classifier with 9,000 examples takes about 2 hours of computation time, and the final classification of 1,000 examples just under 10 minutes.

3.3 Results and Conclusion

We submitted two runs for the ImageCLEF medical annotation task. Both runs used the accumulator defined in Section 3.2.3 and were equally sized, with N_c = 20, N_d = 10 and N_a = 4. The only difference was the number of salient points, N_s, selected per image: the first run used N_s = 1000, the second N_s = 800. On the given test set, the runs achieved error rates of 16.7 % and 17.9 % respectively, which means that our best run was ranked second overall for the task. There may have been room for better parameter tuning, but this was not performed for lack of resources and because there are quite a few parameters to tune. Thus, instead of a joint optimization, each parameter was selected individually through empirical tests.

Figure 3 shows the nearest-neighbour results (L1 norm) for four sample query images. It can be seen that the features are mostly well suited to the task. Local features are in general robust against occlusion and partial matching, which is usually advantageous for a recognition task. This also holds for the radiograph annotation task described in this paper, but there are exceptions. For example, the top left box contains a chest radiograph, and the radiographs showing the complete frontal chest and a partial frontal chest belong to different classes (class numbers 108 and 29 respectively). It might be beneficial to use region-specific extra features once it has been determined that the radiograph is, e.g., one of the chest.

References

[1] M. O. Gueld, M. Kohnen, D. Keysers, H. Schubert, B. B. Wein, J. Bredno, and T. M. Lehmann. Quality of DICOM header information for image categorization. In Proc. SPIE Vol. 4685, Medical Imaging 2002, pages 280-287, May 2002.

[2] H. Schulz-Mirbach. Invariant features for gray scale images. In G. Sagerer, S. Posch, and F. Kummert, editors, 17. DAGM-Symposium Mustererkennung, pages 1-14, Bielefeld, 1995. Reihe Informatik aktuell, Springer.

[3] E. Loupias and N. Sebe. Wavelet-based salient points for image retrieval, 1999.

[4] M. Schael. Methoden zur Konstruktion invarianter Merkmale für die Texturanalyse. PhD thesis, Albert-Ludwigs-Universität, Freiburg, June 2005.

[5] T. Ojala, M. Pietikäinen, and T. Mäenpää. Gray scale and rotation invariant texture classification with local binary patterns. In Proc. Sixth European Conference on Computer Vision, pages 404-420, Dublin, Ireland, 2000.

[6] L. Setia, A. Teynor, A. Halawani, and H. Burkhardt. Image classification using cluster-cooccurrence matrices of local relational features. In Proceedings of the 8th ACM International Workshop on Multimedia Information Retrieval (MIR 2006), Santa Barbara, CA, USA, October 26-27, 2006.

1 http://lmb.informatik.uni-freiburg.de/lmbsoft/libsvmtl/