Figure 1. Overview of a semantic-based classification-driven image retrieval framework. image comparison; and (3) Adaptive image retrieval captures us

Similar documents
Semantic-based Biomedical Image Indexing and Retrieval

2 The kind of medical database accessing capability we are developing will have many potential applications in clinical practice and medical education

A Classication Based Similarity Metric for 3D Image Retrieval.

Classification Driven Semantic Based Medical Image Indexing and Retrieval

Content-based 3D Neuroradiologic Image Indexing and Retrieval: Preliminary Results

Learning-based Neuroimage Registration

Prostate Detection Using Principal Component Analysis

Content-based 3D Neuroradiologic Image Retrieval: Yanxi Liu, William E. Rothfus (M.D.), Takeo Kanade Forbes Ave. 320 East North Avenue

MEDICAL IMAGE RETRIEVAL BY COMBINING LOW LEVEL FEATURES AND DICOM FEATURES

Content Based Image Retrieval with Semantic Features using Object Ontology

Object and Action Detection from a Single Example

SIIM 2017 Scientific Session Analytics & Deep Learning Part 2 Friday, June 2 8:00 am 9:30 am

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

TRANSFORM FEATURES FOR TEXTURE CLASSIFICATION AND DISCRIMINATION IN LARGE IMAGE DATABASES

Using Natural Clusters Information to Build Fuzzy Indexing Structure

2 Line-Angle-Ratio Statistics Experiments on various types of images showed us that one of the strongest spatial features of an image is its line segm

Automated fmri Feature Abstraction using Neural Network Clustering Techniques

COMPUTATIONAL INTELLIGENCE SEW (INTRODUCTION TO MACHINE LEARNING) SS18. Lecture 6: k-nn Cross-validation Regularization

Holistic Correlation of Color Models, Color Features and Distance Metrics on Content-Based Image Retrieval

Context-sensitive Classification Forests for Segmentation of Brain Tumor Tissues

Feature Selection for Fluorescence Image Classification

Manifold Learning-based Data Sampling for Model Training

Biology Project 1

Computer-Aided Detection system for Hemorrhage contained region

PATTERN RECOGNITION USING NEURAL NETWORKS

An Introduction to Content Based Image Retrieval

ECG782: Multidimensional Digital Signal Processing

Linear Discriminant Analysis for 3D Face Recognition System

Face detection and recognition. Detection Recognition Sally

Face detection and recognition. Many slides adapted from K. Grauman and D. Lowe

On Modeling Variations for Face Authentication

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition

Dimension Reduction CS534

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

FEATURE EVALUATION FOR EMG-BASED LOAD CLASSIFICATION

Combining Selective Search Segmentation and Random Forest for Image Classification

Table Of Contents: xix Foreword to Second Edition

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Image-Based Face Recognition using Global Features

Human Brain Image Database System (HBIDS) for Epilepsy

Fast Fuzzy Clustering of Infrared Images. 2. brfcm

Big Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394

Boosting Sex Identification Performance

70 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 6, NO. 1, FEBRUARY ClassView: Hierarchical Video Shot Classification, Indexing, and Accessing

Feature Selection for Image Retrieval and Object Recognition

Gender Classification Technique Based on Facial Features using Neural Network

Boosting Image Database Retrieval

String distance for automatic image classification

Multimodal Information Spaces for Content-based Image Retrieval

Analysis of texture patterns in medical images with an application to breast imaging

Pathology Hinting as the Combination of Automatic Segmentation with a Statistical Shape Model

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition

CONTENT BASED IMAGE RETRIEVAL SYSTEM USING IMAGE CLASSIFICATION

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Optimal Keys for Image Database Indexing

Probabilistic Registration of 3-D Medical Images

Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction

8/3/2017. Contour Assessment for Quality Assurance and Data Mining. Objective. Outline. Tom Purdie, PhD, MCCPM

Lecture on Modeling Tools for Clustering & Regression

Applied Statistics for Neuroscientists Part IIa: Machine Learning

Combination of PCA with SMOTE Resampling to Boost the Prediction Rate in Lung Cancer Dataset

Bayes Risk. Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Discriminative vs Generative Models. Loss functions in classifiers

Hybrid Approach for MRI Human Head Scans Classification using HTT based SFTA Texture Feature Extraction Technique

Robustness of Selective Desensitization Perceptron Against Irrelevant and Partially Relevant Features in Pattern Classification

Forward Feature Selection Using Residual Mutual Information

2 Michael E. Leventon and Sarah F. F. Gibson a b c d Fig. 1. (a, b) Two MR scans of a person's knee. Both images have high resolution in-plane, but ha

Supervised Learning for Image Segmentation

Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18.

COMBINING NEURAL NETWORKS FOR SKIN DETECTION

What are we trying to achieve? Why are we doing this? What do we learn from past history? What will we talk about today?

Object Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing

On Classification: An Empirical Study of Existing Algorithms Based on Two Kaggle Competitions

Classifiers for Recognition Reading: Chapter 22 (skip 22.3)

Content-Based Image Retrieval By Relevance. Feedback? Nanjing University of Science and Technology,

Color Image Segmentation

MIT Samberg Center Cambridge, MA, USA. May 30 th June 2 nd, by C. Rea, R.S. Granetz MIT Plasma Science and Fusion Center, Cambridge, MA, USA

Generalized Principal Component Analysis CVPR 2007

Perceptron-Based Oblique Tree (P-BOT)

Ensemble of Bayesian Filters for Loop Closure Detection

Classification of Face Images for Gender, Age, Facial Expression, and Identity 1

Spatio-Temporal Registration of Biomedical Images by Computational Methods

Linear combinations of simple classifiers for the PASCAL challenge

Using Machine Learning to Optimize Storage Systems

Improving Recognition through Object Sub-categorization

Comparing Dropout Nets to Sum-Product Networks for Predicting Molecular Activity

The exam is closed book, closed notes except your one-page cheat sheet.

BITS F464: MACHINE LEARNING


Ensemble registration: Combining groupwise registration and segmentation

Linear Model Selection and Regularization. especially usefull in high dimensions p>>100.

Face Detection and Recognition in an Image Sequence using Eigenedginess

Human Motion Detection and Tracking for Video Surveillance

A Hybrid Face Detection System using combination of Appearance-based and Feature-based methods

A Review on Identifying the Main Content From Web Pages

Image enhancement for face recognition using color segmentation and Edge detection algorithm

Cyber attack detection using decision tree approach

Occluded Facial Expression Tracking

Pathology Hinting as the Combination of Automatic Segmentation with a Statistical Shape Model

Transcription:

Semantic-based Biomedical Image Indexing and Retrieval Y. Liu a, N. A. Lazar b, and W. E. Rothfus c a The Robotics Institute Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, USA b Statistics Department Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, USA c University of Pittsburgh Medical Center, Pittsburgh, PA, USA ABSTRACT This paper summarizes our work on volumetric pathological neuroimage retrieval under the framework of classification-driven feature selection. The main effort concerns image feature space reduction for the purposes of reducing computational cost during image retrieval time as well as improving image indexing feature discriminating power. Keywords: midsagittal plane, brain asymmetry, image classification, Medical images form an essential and inseparable component of diagnosis, intervention and patient follow-ups. It is therefore natural to use these images as a front-end index to retrieve medically relevant cases from digital patient databases. Common practice in the image retrieval and pattern recognition community is to map each image into a set of numerical and/or symbolic attributes called image indexing features. Thus each image corresponds to a point in a multidimensional image feature space. Existing content-based" image retrieval (CBIR) systems, e.g. [1, 6], depend on general visual features such as color and texture to classify diverse, two-dimensional (2D) images. These general visual cues alone, however, often fail to be effective discriminators for image sets taken within a single domain, where images have subtle, domain-specific differences. Furthermore, these global statistical color and texture measures do not necessarily reflect or have proven correspondence to the meaning of an image, i.e. the image semantics. 1. Our Approach We have explored various issues in medical image retrieval [3, 5, 2, 4] using a neuroimage database composed of clinical volumetric CT image sets of hemorrhage (blood), bland infarct (stroke) and normal brains. The image semantics in this context is expressed as the image pathology. For example, normal, right basal ganglion acute bleed or frontal lobe acute infarct. Figure 1 shows a framework of our approach. The three major components in this scheme are (1) Feature extraction maps each volumetric image into a multi-dimensional image feature space; (2) Feature selection determines the relative scale of the feature space and the best metric for

Figure 1. Overview of a semantic-based classification-driven image retrieval framework. image comparison; and (3) Adaptive image retrieval captures user intention by choosing the most suitable image similarity. For each given image database B and a set of potential image features F B extracted from each image in the database, we define two basic parameters: 1. the fundamental dimension of the database is the lowest feature subspace dimension such that the discriminative power of the subspace is no less than the original feature space; 2. the semantic class separability within the database is a quantitative measure for how-well image classes are separated. Our research goal is to construct a systematic framework to learn these two parameters through classification-driven semantic-based image feature space reduction. The net result of the feature space reduction in image retrieval application includes: (1) reduced computational cost during image retrieval; (2) effective scalability to larger image datasets; and (3) improved discriminating power for finding similar images at retrieval time. More formally, given an image database B and a set of potential indexing features F B = ff 1 ;f 2 ; :::; f n g which we call the original feature space of dimension n, feature space reduction is defined as a mapping from F B toaweighted feature space F W = fw 1 f 1 ;w 2 f 2 ; :::; w n f n g where 0» w i» 1. Since some of the w i smay equal 0, jf W j»jf B j. If d is the number of non-zero weights w i above a certain ffl, the ratio n=d is called the reduction ratio. d is an indication of the fundamental dimension of the database under F B. The classification rate under the chosen fundamental dimension of the image set would be a good indicator of the class separability. Two types of features are expected to be removed from F B during feature space reduction: irrelevant features and redundant features. As a result, for any image semantic class c i the posterior probabilities P (c i jf B ) and P (c i jf W ) are equivalent since minimal useful information is taken away from F W.

Feature Space Reduced Feature CR Reduction Method SpaceDimension (fold) Improved MemoryBasedLearning (holdout) 5-10 (5-10) 15% ClassificationTree (holdout) 2-4 (12-24) 8% ClassificationTree (LOO) 3-5 (9-15) 5% Discriminant Analysis (holdout) 6-10 (5-7) 19% Discriminant Analysis (LOO) 12-15 (3-4) 29% Table 1. The original feature space dimension is 46. This table shows the feature space dimension reduction rates after feature selection, in actual dimensions and in fold. Also shown is the classification rate (CR) improvement over the original feature space. Here holdout means separating the original dataset into a training set (2/3 data) and a holdout test set (1/3 data), training the classifier on the training set and evaluating results using the test set. LOO = Leave One Out strategy, it uses each data point as a test set once and the rest of the data points as a training set in turn. 2. Classification-Driven Feature Space Reduction We use a novel midsagittal plane method for 3D pathological neuroimage alignment [2]. Global and local statistical image features are extracted for describing brain asymmetry [4], geometry and texture properties. These features form the initial, high dimensional feature space. There are many existing feature space reduction approaches, not all of which are appropriate for image retrieval. We are looking for methods that can truly reduce computational cost by explicitly weighting each existing feature dimension (so that we can discard unnecessary feature dimensions during on-line retrieval). Principal component analysis, for instance, though it gives an estimate of the fundamental dimensions for data representation NOT for data discrimination, does not meet this standard. Artificial neural networks do not provide the explicit weightings desired for image retrieval either. Table 1 records the results from feature space reduction using three very different methods: memory based learning, classification trees and discriminant analysis. All the experiments are carried out on an image dataset containing 48 3D brains with three broad pathology classes: normal (26), blood (8) and stroke (14). The initial feature space is 46 dimensions composed of statistical measurements describing human brain asymmetry [4, 2]. Memory Based Learning In [4] we have reported in detail our initial effort on finding the most discriminating feature subset(s) using a memory-based learning approach, Bayes classifier, Parzen windows and a non-deterministic search engine. The average precision rate for image retrieval across different pathologies is near 80%. Classification Trees A binary classification tree is used by finding the best binary split among the features at each step [7]. Based on a multinomial probability model, we define a deviance for the tree: D = P i D i where D i = 2 P k n ik log p ik as a sum over leaves. The `best' split is the maximum reduction in deviance over all leaves. The tree grows until the leaf is homogeneous enough (its deviance is

less than certain value). We randomly divided the dataset 50 times into 2/3-1/3 training-testing pairs. A tree was grown on each of the 50 training sets. Results are listed in Table 1. Discriminant Analysis A combination of forward selection [7] and linear discriminant analysis is used with two different criteria for selecting features: (1) the augmented variance ratio, which compares within-group and between-group variances and penalizes features where class means are too close together, and (2) Wilks' lambda, which measures dissimilarity among groups. Results on feature space reduction over 50 randomly divided training-testing pairs are in Table 1. Figure 2. Left panel: a JAVA user interface of our semantic-based 3D Neuroimage Retrieval System, under development by Dr. Liu et al at the Robotics Institute of Carnegie Mellon University. Two right panels: the netscape screens show one query image on the left with right basal ganglion acute blood (shown as a set of 2D slices), the most similar image on the right (top panel)) and the second most similar image (bottom panel) retrieved from an image database of more than 100 pathological, volumetric neuroimages. 3. Discussion and Conclusion Under our classification-driven semantic-based image retrieval framework, image classification is used as a tool for feature space reduction. We have shown in Table 2 that an initial feature space with 46 dimensions can be reduced to 2-15 dimensions (3-24 fold) with increased discriminating power (better classification rates). It is interesting to notice that the best classification rate improvement is usually associated

with the least feature dimension reduction rate and vice versa. This seems to suggest a tradeoff between feature space dimensionality and classification rates. Our effort in looking for the best automated methods for feature space reduction distinguishes our work from most image retrieval practice where the image indexing features are determined by human system designers. These selected feature dimensions form the basis for on-line image indexing at retrieval time. Our method provides the generality and potential to scale up to much larger image databases. Figure 2 demonstrates a JAVA image interface for our adaptive classification-driven neuroimage retrieval system. ACKNOWLEDGMENTS The authors would like to thank M. Buzoianu for carrying out experiments on discriminant analysis as a summer project. This research is supported in part by the Allegheny-Singer Research Institute under prime contract through the Advanced Technology Program of the National Institute of Standards and Technology (NIST#70NANB5H1183) and in part by NIH National Cancer Institute Unconventional Innovation Program, award # = N01-CO-07119. REFERENCES 1. D. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, and W. Equitz. Efficient and effective querying by image content. Journal of Intelligent Information Systems, 1994. 2. Y. Liu, R.T. Collins, and W.E. Rothfus. Robust Midsagittal Plane Extraction from Normal and Pathological 3D Neuroradiology Images. IEEE Transactions on Medical Imaging, 20(3), March 2001. 3. Y. Liu and F. Dellaert. A Classification-based Similarity Metric for 3D Image Retrieval. In Proceedings of Computer Vision and Pattern Recognition Conference (CVPR'98), pages 800 805, Santa Barbara, June 1998. IEEE Computer Society Press. 4. Y. Liu, F. Dellaert, W.E. Rothfus, A. Moore, J. Schneider, and T. Kanade. Classification-driven pathological neuroimage retrieval using statistical asymmetry measures. In International Conference on Medical Imaging Computing and Comptuer Assisted Intervention (MICCAI 2001). Springer, October 2001. 5. Y. Liu, W.E. Rothfus, and T. Kanade. Content-based 3d neuroradiologic image retrieval: Preliminary results. IEEE workshop on Content-based access of Image and Video Databases in conjunction with International Conference of Computer Vision (ICCV'98), January 1998. 6. A. Pentland, R.W. Picard, and S. Sclaroff. Photobook: Content-based manipulation of image databases. IJCV, 18(3):233 254, June 1996. 7. W.N. Venables and B.D. Ripley. Modern Applied Statistics with Splus, 2nd edition. Springer-Verlag, London, Paris, Tokyo, 1997.