Support Vector Machine Density Estimator as a Generalized Parzen Windows Estimator for Mutual Information Based Image Registration

Similar documents
J. Weston, A. Gammerman, M. Stitson, V. Vapnik, V. Vovk, C. Watkins. Technical Report. February 5, 1998

Multi-modal Image Registration Using the Generalized Survival Exponential Entropy

Kernel Density Construction Using Orthogonal Forward Regression

Entropy Based, Multiple Portal to 3DCT Registration for Prostate Radiotherapy Using Iteratively Estimated Segmentation

A Pyramid Approach For Multimodality Image Registration Based On Mutual Information

Fast Image Registration via Joint Gradient Maximization: Application to Multi-Modal Data

A Radiometry Tolerant Method for Direct 3D/2D Registration of Computed Tomography Data to X-ray Images

Hybrid Spline-based Multimodal Registration using a Local Measure for Mutual Information

Leave-One-Out Support Vector Machines

Mixture Models and EM

An Introduction to Statistical Methods of Medical Image Registration

Depth-Layer-Based Patient Motion Compensation for the Overlay of 3D Volumes onto X-Ray Sequences

William Yang Group 14 Mentor: Dr. Rogerio Richa Visual Tracking of Surgical Tools in Retinal Surgery using Particle Filtering

Dynamic Thresholding for Image Analysis

Image Registration. Prof. Dr. Lucas Ferrari de Oliveira UFPR Informatics Department

Rigid and Deformable Vasculature-to-Image Registration : a Hierarchical Approach

Efficient Tuning of SVM Hyperparameters Using Radius/Margin Bound and Iterative Algorithms

University of Cambridge Engineering Part IIB Paper 4F10: Statistical Pattern Processing Handout 11: Non-Parametric Techniques

Adaptive Metric Nearest Neighbor Classification

Multi-Modal Volume Registration Using Joint Intensity Distributions

University of Cambridge Engineering Part IIB Paper 4F10: Statistical Pattern Processing Handout 11: Non-Parametric Techniques

Texture Classification by Combining Local Binary Pattern Features and a Self-Organizing Map

Homework. Gaussian, Bishop 2.3 Non-parametric, Bishop 2.5 Linear regression Pod-cast lecture on-line. Next lectures:

Registration of 2D to 3D Joint Images Using Phase-Based Mutual Information

Fast 3D Mean Shift Filter for CT Images

University of Cambridge Engineering Part IIB Paper 4F10: Statistical Pattern Processing Handout 11: Non-Parametric Techniques.

ASSOCIATIVE LEARNING OF STANDARD REGULARIZING OPERATORS IN EARLY VISION

Non-Parametric Modeling

2D-3D Registration using Gradient-based MI for Image Guided Surgery Systems

Table of Contents. Recognition of Facial Gestures... 1 Attila Fazekas

Local Image Registration: An Adaptive Filtering Framework

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

Medical Image Segmentation Based on Mutual Information Maximization

PET Image Reconstruction using Anatomical Information through Mutual Information Based Priors

A New Method for CT to Fluoroscope Registration Based on Unscented Kalman Filter

Image registration for motion estimation in cardiac CT

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2015

OnDemand3D Fusion Technology

EE795: Computer Vision and Intelligent Systems

Machine Learning Lecture 3

Machine Learning Lecture 3

Nonparametric regression using kernel and spline methods

A Combined Encryption Compression Scheme Using Chaotic Maps

Support Vector Regression with ANOVA Decomposition Kernels

Assessing Accuracy Factors in Deformable 2D/3D Medical Image Registration Using a Statistical Pelvis Model

MR IMAGE SEGMENTATION

Algebraic Iterative Methods for Computed Tomography

Non-Rigid Image Registration III

On Kernel Density Estimation with Univariate Application. SILOKO, Israel Uzuazor

Nonrigid Surface Modelling. and Fast Recovery. Department of Computer Science and Engineering. Committee: Prof. Leo J. Jia and Prof. K. H.

Efficient 3D-3D Vascular Registration Based on Multiple Orthogonal 2D Projections

Overview Citation. ML Introduction. Overview Schedule. ML Intro Dataset. Introduction to Semi-Supervised Learning Review 10/4/2010

Coupled Bayesian Framework for Dual Energy Image Registration

Image Thickness Correction for Navigation with 3D Intra-cardiac Ultrasound Catheter

A Non-Iterative Approach to Frequency Estimation of a Complex Exponential in Noise by Interpolation of Fourier Coefficients

Categorization by Learning and Combining Object Parts

Efficient population registration of 3D data

Hierarchical Mixture Models for Nested Data Structures

Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling

Particle Filtering. CS6240 Multimedia Analysis. Leow Wee Kheng. Department of Computer Science School of Computing National University of Singapore

Anisotropic Methods for Image Registration

Algorithm That Mimics Human Perceptual Grouping of Dot Patterns

Projection-Based Needle Segmentation in 3D Ultrasound Images

APPLICATION OF RADON TRANSFORM IN CT IMAGE MATCHING Yufang Cai, Kuan Shen, Jue Wang ICT Research Center of Chongqing University, Chongqing, P.R.

Nonparametric Methods

INDEPENDENT COMPONENT ANALYSIS WITH QUANTIZING DENSITY ESTIMATORS. Peter Meinicke, Helge Ritter. Neuroinformatics Group University Bielefeld Germany

The role of Fisher information in primary data space for neighbourhood mapping

Comparison of Vessel Segmentations Using STAPLE

Machine Learning Lecture 3

Modern Medical Image Analysis 8DC00 Exam

COMPUTATIONAL STATISTICS UNSUPERVISED LEARNING

Iterative CT Reconstruction Using Curvelet-Based Regularization

Mixture Models and the EM Algorithm

P257 Transform-domain Sparsity Regularization in Reconstruction of Channelized Facies

Image Segmentation. Shengnan Wang

Robust Linear Registration of CT images using Random Regression Forests

Comparison of Different Metrics for Appearance-model-based 2D/3D-registration with X-ray Images

Learning-based Neuroimage Registration

MEDICAL IMAGE COMPUTING (CAP 5937) LECTURE 4: Pre-Processing Medical Images (II)

Good Morning! Thank you for joining us

Probabilistic Tracking and Model-based Segmentation of 3D Tubular Structures

Economics Nonparametric Econometrics

l ealgorithms for Image Registration

Combinatorial Methods in Density Estimation

A 2D-3D Image Registration Algorithm using Log-Polar Transforms for Knee Kinematic Analysis

A New Method of Probability Density Estimation with Application to Mutual Information Based Image Registration

Bumptrees for Efficient Function, Constraint, and Classification Learning

Annales UMCS Informatica AI 1 (2003) UMCS. Registration of CT and MRI brain images. Karol Kuczyński, Paweł Mikołajczak

PATTERN CLASSIFICATION AND SCENE ANALYSIS

3D Registration based on Normalized Mutual Information

Response to Reviewers

Introduction The problem of cancer classication has clear implications on cancer treatment. Additionally, the advent of DNA microarrays introduces a w

Probabilistic Graphical Models

Introduction to Medical Image Registration

Image Registration in Hough Space Using Gradient of Images

Nonparametric Density Estimation

Nonparametric Approaches to Regression

Registration concepts for the just-in-time artefact correction by means of virtual computed tomography

ECSE 626 Project Report Multimodality Image Registration by Maximization of Mutual Information

Automatic Parameter Optimization for De-noising MR Data

Transcription:

Support Vector Machine Density Estimator as a Generalized Parzen Windows Estimator for Mutual Information Based Image Registration Sudhakar Chelikani 1, Kailasnath Purushothaman 1, and James S. Duncan 2 1 Departments of Diagnostic Radiology, 2 Departments of Electrical Engineering and Diagnostic Radiology, Yale University, New Haven, CT 06520-8042 Abstract. Mutual Information is perhaps the most widely used multimodality image registration method. A crucial step in mutual information is the estimation of the probability density function (pdf). In most cases, the Parzen window estimator is employed for this purpose which results in an excessive computational cost. In this paper we demonstrate that replacing the Parzen density estimator with a Support Vector Machine (SVM) density estimation will result in a significant reduction of the computational time. We verified this by registering 2D portal images to DRRs (digitally reconstructed radiographs) projected from 3D CT volumetric data. 1 Introduction One of the prime motivations behind using Mutual Information (MI) in image analysis is the ability to perform multi-modal registration. Since its inception as a similarity metric or measure in 1995 [1,2], it has been successfully applied in many applications concerned with 2D-3D and 3D-3D registration. In principle, mutual information measures the statistical dependence between two images. The metric operates solely on the pixel intensities without reference to any extraneous properties of the object that is imaged, or the imaging system itself. Consequently, it is quite robust when applied across imaging modalities. At the core of the metric s algorithm is the evaluation of the probability densities of the data sets that are being registered. In most imaging modalities, the probability density functions (pdf) are not easily predicted and need to be estimated. The various ways of predicting or estimating pdfs are grouped under parametric, semi-parametric and non-parametric methods. A Gaussian is a standard example of a parametric pdf, while a finite mixture model such as a mixture of Gaussians is an example of a semi-parametric pdf [3]. However, in many cases, non-parametric methods are the models of choice because no prior assumptions about the forms of the pdfs are required. Examples are histograms, K-nearest neighbour method, and kernel based estimators such as the Parzen Windows [4]. Viola et al. use the Parzen windows density estimator because it provides a very smooth estimate [2]. This is achieved by using the full data set R.E. Ellis and T.M. Peters (Eds.): MICCAI 2003, LNCS 2879, pp. 854 861, 2003. c Springer-Verlag Berlin Heidelberg 2003

Support Vector Machine Density Estimator 855 as input to compute subsequent estimates. Employing large data samples results in an accurate estimate, but with the penalty of a higher computational expense [5]. An alternate method of density estimation using the Support Vector Method (SVM) was proposed by Vapnik et al. [6,7]. The SVM technique expresses the density estimation problem as a linear operator equation and uses the regression approach with a regularization term to converge to the solution. It can be classified under kernel based estimators. We integrated the SVM estimator into the mutual information registration method to reduce the computational cost of using the Parzen windows estimator. The SVM method is sparser in comparison to the Parzen windows method because many of the data elements are not used in the estimate. The compromise between data sparsity and estimation accuracy is controlled by the use of a regularization term which is used in addition to the kernel width. 2 Mathematical Background Mutual information has multiple interpretations in standard literature. The Kullback-Leibler distance measure interprets the mutual information between two random variables X and Y as the relative entropy between the joint probability distribution of the variables and the product of their marginal distributions. This in essence is an estimate of their statistical dependence. The metric can be expressed as follows: I (X, Y )=D(p(X, Y ) p (X) p (Y )) [ = E X,Y log p (X, Y ) ] p (X) p (Y ) = p (x, y) p (x, y) log p (x) p (y) x X y Y (1) An alternate formulation is to relate mutual information to the entropy of the random variables under consideration. The entropy of a random variable is the expected value of the negative log probability. H (X) =E X [ log p (X)] = x X p (x) log p (x) (2) Similarly, the joint entropy between X,Y can be expressed as: H (X, Y )=E X, Y [ log p (X, Y )] = p (x, y) log p (x, y) (3) x X y Y Consequently, I (X, Y )=H (X)+H (Y ) H (X, Y ) (4)

856 S. Chelikani, K. Purushothaman, and J.S. Duncan In the context of registering 2D images to a 3D volume, such as 2D portal, X-Ray fluoroscopy etc. to a 3D CT volumetric data set, the mutual information metric can be formulated in the following manner. Let X represent a set of sample points, and U (X) be the 2D portal image, and V (T (X)) be the transformed CTderived 2D DRR (digitally reconstructed radiograph). The objective function is then written as: I (U (X),V(T(X))) = H (U (X)) + H (V (T (X))) H (U (X) V (T (X))) = E U, V [log p (U (x),v(t(x)))] E U [log p (U (x))] E V [log p (V (T (x)))] x X, y Y (5) The computational burden in mutual information registration lies in accurately estimating the joint and marginal pdfs in equation 5. 2.1 Parzen Window Density Estimation This method of density estimation was first introduced by Parzen [8]. The probability density is estimated as a linear combination of a set of symmetric kernels, or window functions, that are centered at the sample points. The Parzen method assumes that it is modeling a continuous pdf p (x). A window function, for instance a Gaussian: g ψ (u) with variance ψ is chosen and we obtain the approximation p n (x, a) to the density p (x) at the sample point x given the sample set a of size n. It is expressed as: p n (x, a) = 1 n g ψ (x x a ) (6) x a a This interpolates between the sample points and each point contributes an amount proportional to its distance from x. The parameter that needs to be estimated is the variance. If it is too wide, the estimate will be too coarse and if it is too narrow, it will have too much variability. Consequently, the variance chosen should be optimal, and the computation tends to be intensive if the sample set is large. Another constraint when using the Parzen method is that the sample set used to represent the density should be different from the set used to compute the density approximation. If N a and N b be the two sets, then at every point in set N b we need to calculate the linear combination of N a window functions, making the total N a N b computations. This computation becomes prohibitively large with increasing N a and N b. 2.2 Support Vector Method of Density Estimation We follow the method suggest by Vapnik et al. [6,7] in estimating the density. Let the distribution function corresponding to the density p (x) bef (x). Then, F (x) = x p (t) dt (7)

Support Vector Machine Density Estimator 857 However, in reality, the distribution function is unknown and only the set of independent and indentically distributed (iid) data is provided. The function F (x) is approximated with the empirical function F l (x) [7]: F l (x) = 1 l θ (x x i ) (8) i=1 where θ (x) is the step-function. Solving equations 7 with the aproximation in 8 is an ill-posed problem. This implies that small deviations in F l can cause large deviations in the solution for p (x). In a linear operator representation, we are solving Ap = F with the approximation F l. One of the standard methods for formulating solutions to such problems was proposed by Tikhonov [9]. A regularizing function ψ (p) is introduced, and the solution is formulated as follows [6]: ψ (p) = p 2 (9) This functional is minimized subject to the constraint x max i F l (x) p (t) dt = ɛ l (10) x=xi where ɛ l is one of the regularization parameters to control accuracy. We now look for a solution of the form p (x) = β i K l (x, x i ) (11) i=1 where K l (x, x i )= 1 l K ( ) x x i l. This a generalized form of the Parzen estimator, where each β i is a non-uniform weighting coefficient. This is equivalent to minimizing ψ (p) = p 2 = β i β j K l (xi, x j ) (12) subject to the constraints max l F l (x) i,=1 x β j j=1 K h (x i,t) dt β i 0, x=xi = ɛ l β i = 1 (13) This is now a standard optimization problem and is solved without much computational expense. Only a few of the β i are non-zero and the x i corresponding to those β i are called the support vectors [6,7]. The chosen kernel function will correspondingly have the smallest number of support vectors. Consequently, the support vector method of density estimation is analogous to the Parzen windows method, but with the additional choice of the regularization term, and a sparser data set, and with a faster convergence. i=1

858 S. Chelikani, K. Purushothaman, and J.S. Duncan 3 Results To test the Parzen windows density estimator and the support vector estimator, we first employed both methods to compute the density contours of an arbitrary data set. The data set consists of 500 points sampled from a set of three isotropic Gaussian functions. The Parzen method employed the full data set in estimating the densities and required 6 10 4 computations. The support vector method used only 80 data points chosen as support vectors and converged to a similar density estimate in 16000 iterations. The computation time for the latter was about one third that of the former. Figure 1(a) is a plot of the density iso-contours (contours on which the probability density is constant) using the Parzen method and figure 1(b) is plot of the same using the support vector method. The figures show the data points used for density computation in both cases. Parzen Window Density Estimation SVM Density Estimation 3 3 2 2 1 1 0 0 1 1 2 2 3 3 4 4 3 2 1 0 1 2 3 4 4 3 2 1 0 1 2 3 (a) Parzen Windows Method (b) Support Vector Method Fig. 1. Comparison of the Parzen and Support Vector methods in density estimation. Figures show contours of constant probability density. The Support vector method used a sparser sample set and computed the same iso-contours in one third the time. Subsequent to the previous simulation, we incorporated both density estimation techniques into a mutual information code and used it to test the registration of anterior-posterior (AP) images of the pelvis derived from a phantom. One of was a portal image acquired using a high energy (6MeV) radiation beam. This image was registered to two DRRs projected from a CT volume of the phantom. Prior to projecting the DRR, the CT volume was translated by 5 pixels (1 pixel = 1.1mm) along the x-axis in one case, and rotated by 5 degrees about the AP axis in the other. The DRRs thus acquired were registered to the portal image. Figure 2 shows the portal image and a sample DRR image that has been rotated about the AP axis. Figures 3 and 4 are plots of the MI values for the translational and rotational misalignment experiments. The MI function is maximum

Support Vector Machine Density Estimator 859 (a) AP Portal Image (b) Misaligned DRR Image Fig. 2. Portal and DRR images of a pelvic phantom. The latter was rotated by 5 o to test the registration algorithm with the Parzen and Support Vector density estimators. in both instances, and for both density estimation methods, close to the correct transformation values. We implemented both algorithms on a Pentium III, 450 MHz computer and the computation times were 25 min and 9 min respectively for the Parzen and SVM estimators respectively. In estimating the density, the SVM method used approximately 25% to 30% of the data while the Parzen extimator used almost all the available data. 1.5 1.4 Parzen Windows Support Vector 1.3 1.2 Mutual Information 1.1 1 0.9 0.8 0.7 0.6 1 2 3 4 5 6 7 8 9 10 Translation (pixels) Fig. 3. Plot of the MI function for registration with misalignment of 5 pixels along the x-axis,using the Parzen Windows and the Support Vector methods.

860 S. Chelikani, K. Purushothaman, and J.S. Duncan 1.5 1.4 Parzen Windows Support Vector 1.3 Mutual Information 1.2 1.1 1 0.9 0.8 1 2 3 4 5 6 7 8 9 10 Rotation (degrees) Fig. 4. Plot of the MI function for registration with misalignment of 5 o about the AP axis, using the Parzen Windows and the Support Vector methods. 4 Conclusion The support vector method provided approximately the same density estimation as the popular Parzen windows method with less computational expense. This was verified by incorporating SVM as part of a mutual information registration algorithm. The computation for the Parzen method scaled as O ( N 2), while the SVM estimator scales as O ( kn 2), with k<1. The SVM method used about one third the data and was about 2.5 times faster than the Parzen windows method. However, these performance characteristics should be taken only as potential indicators and not as true computational measures. We are in the process of testing the SVM algorithm on real data. The true test lies in how the SVM technique will perform when registering real clinical data. The complexity of such data will pose real challenges to the judicious selection of support vectors. If a corresponding speedup is achieved with clinical data also, it will result in a significant saving of time when performing 2D to 3D and 3D to 3D registration. This will be particularly useful when doing real time registration. References 1. Collignon, A. and Maes, F. and Delaere, D. and Vandermeulen, P. and Suetens, P. and Marchal, G.: Automated Multi-Modality Image Registration based on Information Theory. Info. Proc. in Med. Imaging (IPMI) (1998) 263 74 2. Viola, P. and Wells, W.: Alignment by Maximization of Mutual Information. Int. J. Comp. Vision 24 (1997) 137 54

Support Vector Machine Density Estimator 861 3. McLachlan, G.J. and Peel, D.: Finite Mixture Models. Wiley series in Probability and Mathematical Statistics John Wiley & Sons New York (2000) 4. Izenman, A.J.: Recent Developements in Nonparametric Density Estimation. J. Am. Stat. Assoc. 86 (1991) 205 24 5. Thévenaz, P. and Ruttimann, U.E. and Unser M.: A Pyramid Approach to Subpixel Registration based on Intensity. IEEE Trans. Image Processing 7 (1998) 27-41 6. Vapnik, N. and Mukherjee, S.: Support Vector Method for Multivariate Density Estimation. CBCL Paper #170AI Memo #1653, Massachusetts Institute of Technology, Cambridge, MA (1999) 7. Weston, J. and Gammerman, A. and Stitson, M. and Vapnik, V. and Vovk, V. and Watkins, C.: Density Estimation using Support Vector Machines. Technical report CSD-TR-97-23, Royal Holloway College, UK (1997) 8. Parzen, E.: On Estimation of a Probability Density Function and Mode. Ann. Math. Stat. 33 (1962) 1065 74 9. Tikhonov, A. N.: Solution of Incorrectly Formulated Problems and the Regularization Method. Soviet Math. Dokl. 4 (1963) 1035 38