Extracting Discriminative Information from Medical Images: A Multivariate Linear Approach


Carlos E. Thomaz, Nelson A.O. Aguiar, Sergio H.A. Oliveira, Fabio L.S. Duran, Geraldo F. Busatto, Duncan F. Gillies, and Daniel Rueckert

Department of Electrical Engineering, Centro Universitario da FEI, São Paulo, Brazil
Departments of Psychiatry and Radiology, Faculty of Medicine, University of São Paulo, Brazil
Department of Computing, Imperial College, London, UK
cet@fei.edu.br

Abstract

Statistical discrimination methods are suitable not only for classification but also for characterisation of differences between a reference group of patterns and the population under investigation. In recent years, statistical methods have been proposed to classify and analyse morphological and anatomical structures of medical images. Most of these techniques work in high-dimensional spaces of particular features such as shapes or statistical parametric maps and have overcome the difficulty of dealing with the inherent high dimensionality of medical images by analysing segmented structures individually or performing hypothesis tests on each feature separately. In this paper, we present a general multivariate linear framework to identify and analyse the most discriminating hyper-plane separating two populations. The goal is to analyse all the intensity features simultaneously rather than segmented versions of the data separately or feature-by-feature. The conceptual and mathematical simplicity of the approach, whose pivotal step is spatial normalisation, involves the same operations irrespective of the complexity of the experiment or nature of the data, giving multivariate results that are easy to interpret. To demonstrate its performance we present experimental results on an artificially generated data set and on real medical data.

1. Introduction

In the generic discrimination problem, where the training sample consists of the class membership and observations for N patterns, the outcomes of interest fall into g classes and we wish to build a rule for predicting the class membership of an observation based on n variables or features.
However, statistical discrimination methods are suitable not only for classification but also for characterisation of differences between a reference group of patterns and the population under investigation. For example, in clinical diagnosis we might want to understand underlying causes of medical data by exploring the discriminating hyper-plane found by a statistical classifier using image samples of patients and controls.

In recent years, statistical pattern recognition methods have been proposed to classify and analyse morphological and anatomical structures of magnetic resonance (MR) images [4, 6, 8]. Most of these techniques work in high-dimensional spaces of particular features such as shapes or statistical parametric maps and have overcome the difficulty of dealing with the inherent high dimensionality of medical data by analysing segmented structures individually or performing hypothesis tests on each feature separately. Unfortunately, in such approaches changes that are relatively more distributed and involve several structures of the pattern of interest simultaneously (e.g., ventricles and corpus callosum of the brain) might be difficult to detect, despite the ability of some methods [6, 8] to extract statistically multivariate differences between image samples of patients and controls.

In this work, we present a general multivariate statistical framework to identify and analyse the most discriminating hyper-plane separating two populations. The goal is to analyse all the intensity features simultaneously rather than segmented versions of the data separately or feature-by-feature. We use a novel method proposed recently [10], called Maximum uncertainty Linear Discriminant Analysis (MLDA), to overcome the well-known instability of the within-class scatter matrix in limited sample size problems and to increase the computational efficiency of the approach. The approach is not restricted to any particular set of features and describes a simple and straightforward way of mapping multivariate classification results of the whole images back into the original image domain for further interpretation.

The remainder of this paper is organised as follows. In Section 2 we describe the main parts of the multivariate linear framework and its design. This section includes a brief review of Principal Component Analysis (PCA) and the novel MLDA method used. Section 3 presents experimental results of the approach and demonstrates its effectiveness on a simple artificially generated data set and on a real medical data set. In the last section, Section 4, the paper concludes with a short summary of functionalities that form the basis for this methodology of discriminating and analysing the patterns of interest.

2. A Multivariate Linear Approach

Our main concern here is to describe a multivariate framework that highlights the most discriminating differences between two populations when the number of examples per class is much less than the dimension of the original feature space. This problem is indeed quite common nowadays, especially in medical image analysis. For instance, patients and controls are classes defined commonly by a small number of MR images but the features used for recognition may be millions of voxels or hundreds of pre-processed image attributes.

2.1. Principal Component Analysis (PCA)

There are a number of reasons for using PCA to reduce the dimensionality of the original images. PCA is a linear transformation that is not only simple to compute and analytically tractable but also extracts a set of features that is optimal with respect to representing the data back into the original domain. Moreover, using PCA as an intermediate step will reduce dramatically the computational and storage requirements for the subsequent LDA-based covariance method.
Since in our applications of interest the number of training patterns N (or images) is much smaller than the number of features n (for instance, voxels), it is possible to transform data in a way that patterns occupy as compact regions in a lower dimensional feature space as possible with far fewer degrees of freedom to estimate. Although much of the sample variability can be accounted for by a smaller number of principal components, and consequently a further dimensionality reduction can be accomplished by selecting the principal components with the largest eigenvalues, there is no guarantee that such additional dimensionality reduction will not add artefacts on the images when mapped back into the original image space.

Fig. 1. Reconstruction of a reference image (shown on the top left) using several principal components. The row on the bottom illustrates the corresponding differences between the reconstructions and the reference image. The number of components retained and the corresponding total sample variance explained are shown in parentheses. We can see modifications on the reconstructed images where not all principal components with non-zero eigenvalues are selected.

Our aim is to map the classification results back to the image domain for further visual interpretation. For that reason, we must be certain that any modification on the images, such as blurring or subtle differences, is not related to an incomplete or perhaps misleading intermediate feature extraction procedure. For example, Figure 1 illustrates on the top a reference image (shown on the left) reconstructed using several principal components and on the bottom the corresponding differences between these reconstructions and the original image. The values in parentheses represent the number of principal components used and the corresponding total variance explained. We can see clearly that even when we use a set of principal components that represents more than 90% of the total sample variance we still have subtle differences between the reconstructed image and the original one.
Therefore, in order to reproduce the total variability of the samples we have composed the PCA transformation matrix by selecting all principal components with non-zero eigenvalues. To avoid the memory-intensive rank computation of the possibly large total covariance matrix, and because the MLDA approach deals with the singularity of the within-class scatter matrix, we have assumed that all the N training patterns are linearly independent. In other words, we have assumed that the rank of the total covariance matrix is N − 1 and the number of principal components selected is m = N − 1.
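As an illustration of this full-rank choice, the following is a minimal sketch in NumPy, not the authors' implementation. It assumes a toy random data matrix in place of real images, and the function name `full_rank_pca` and the numerical threshold are hypothetical. It computes the components through a thin SVD of the zero mean data, which avoids forming the n x n covariance matrix, and checks that keeping all m = N − 1 components maps the data back to the image space without loss.

```python
import numpy as np

def full_rank_pca(X):
    """PCA keeping every component with a non-zero eigenvalue.

    X is an (N x n) data matrix, one image vector per row, with N << n.
    Returns the average image (n,), the (n x m) component matrix and
    the m non-zero eigenvalues of the sample covariance matrix.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                              # zero mean image vectors
    # Thin SVD: never builds the (n x n) total covariance matrix.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)
    keep = s > 1e-10 * s.max()                 # hypothetical tolerance for "non-zero"
    return mean, Vt[keep].T, eigvals[keep]

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 500))                 # toy stand-in: N = 10 images, n = 500 voxels
mean, P, eigvals = full_rank_pca(X)
m = P.shape[1]                                 # number of components retained
Y = (X - mean) @ P                             # most expressive features, (N x m)
X_back = Y @ P.T + mean                        # mapped back into the image space
max_err = np.abs(X - X_back).max()
```

With linearly independent patterns the sketch retains N − 1 components and the reconstruction error is at the level of floating-point rounding, which is the property the text relies on when results are later mapped back to images.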

2.2. Maximum Uncertainty LDA (MLDA)

The primary purpose of LDA is to separate samples of distinct groups by maximising their between-class separability while minimising their within-class variability. LDA's main objective is to find a projection matrix $P_{lda}$ that maximizes the following ratio (Fisher's criterion):

$$P_{lda} = \arg\max_{P} \frac{\left| P^{T} S_b P \right|}{\left| P^{T} S_w P \right|}, \tag{1}$$

where $S_b$ is the between-class scatter matrix defined as

$$S_b = \sum_{i=1}^{g} N_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^{T} \tag{2}$$

and $S_w$ is the within-class scatter matrix defined as

$$S_w = \sum_{i=1}^{g} (N_i - 1) S_i = \sum_{i=1}^{g} \sum_{j=1}^{N_i} (x_{i,j} - \bar{x}_i)(x_{i,j} - \bar{x}_i)^{T}. \tag{3}$$

The vector $x_{i,j}$ is the n-dimensional pattern j from class $\pi_i$, $N_i$ is the number of training patterns from class $\pi_i$, and g is the total number of classes or groups. The vector $\bar{x}_i$ and matrix $S_i$ are respectively the unbiased sample mean and sample covariance matrix of class $\pi_i$ [5]. The grand mean vector $\bar{x}$ is given by

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{g} N_i \bar{x}_i = \frac{1}{N} \sum_{i=1}^{g} \sum_{j=1}^{N_i} x_{i,j}, \tag{4}$$

where N is the total number of samples, that is, $N = N_1 + N_2 + \cdots + N_g$. The Fisher's criterion described in equation (1) is maximised when the projection matrix $P_{lda}$ is composed of the eigenvectors of $S_w^{-1} S_b$ with at most (g − 1) nonzero corresponding eigenvalues. This is the standard LDA procedure.

It is well known, however, that the performance of the standard LDA can be seriously degraded if there are only a limited number of total training observations N compared to the dimension of the feature space m. Since the within-class scatter matrix $S_w$ is a function of (N − g) or fewer linearly independent vectors, where g is the number of groups, its rank is (N − g) or less. Therefore, in recognition problems where the number of training patterns is comparable to the number of features, $S_w$ might be singular or mathematically unstable and the standard LDA cannot be used to perform the task of the classification stage.

In order to avoid both the singularity and instability issues of the within-class scatter matrix $S_w$ when LDA is used in limited sample and high dimensional problems, we have used a maximum uncertainty LDA-based approach (MLDA) based on a straightforward covariance selection method for the $S_w$ matrix.
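The rank argument above can be checked numerically. The sketch below is illustrative only: it uses random toy data, and the helper name `scatter_matrices` is hypothetical. It builds $S_b$ and $S_w$ as in equations (2) and (3) and confirms that, with N = 12 patterns in g = 2 classes and 30 features, $S_w$ has rank N − g and is therefore singular.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (S_b) and within-class (S_w) scatter, equations (2) and (3)."""
    grand_mean = X.mean(axis=0)                # equation (4)
    dim = X.shape[1]
    S_b = np.zeros((dim, dim))
    S_w = np.zeros((dim, dim))
    for label in np.unique(y):
        Xi = X[y == label]
        mean_i = Xi.mean(axis=0)
        diff = (mean_i - grand_mean)[:, None]
        S_b += Xi.shape[0] * (diff @ diff.T)   # N_i (mean_i - mean)(mean_i - mean)^T
        centred = Xi - mean_i
        S_w += centred.T @ centred             # sum over the patterns of class i
    return S_b, S_w

rng = np.random.default_rng(1)
N, dim, g = 12, 30, 2                          # toy sizes: fewer patterns than features
X = rng.normal(size=(N, dim))
y = np.array([0] * 6 + [1] * 6)
S_b, S_w = scatter_matrices(X, y)
rank_S_w = np.linalg.matrix_rank(S_w)          # at most N - g
rank_S_b = np.linalg.matrix_rank(S_b)          # at most g - 1
```

Because `rank_S_w` is smaller than the number of features, $S_w$ cannot be inverted here, which is the situation the MLDA covariance selection is designed to handle.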
In an earlier study [10], Thomaz and Gillies compared the performance of MLDA with other recent LDA-based methods, such as Chen et al.'s LDA [2], direct LDA [14], and Optimal Fisher Linear Discriminant [13], with application to the face recognition problem. Since the face recognition problem involves small training sets, a large number of features, and a large number of groups, it has become the most used application to evaluate such limited sample size approaches. The experimental results have shown that the MLDA method improved the LDA classification performance with or without an intermediate dimensionality reduction and using fewer linear discriminant features.

The MLDA algorithm can be briefly described as follows:

i. Find the Φ eigenvectors and Λ eigenvalues of $S_p$, where $S_p = S_w / [N - g]$;

ii. Calculate the $S_p$ average eigenvalue $\bar{\lambda}$, that is,

$$\bar{\lambda} = \frac{1}{n} \sum_{j=1}^{n} \lambda_j = \frac{\mathrm{trace}(S_p)}{n}; \tag{5a}$$

iii. Form a new matrix of eigenvalues based on the following largest dispersion values

$$\Lambda^{*} = \mathrm{diag}[\max(\lambda_1, \bar{\lambda}), \ldots, \max(\lambda_n, \bar{\lambda})]; \tag{5b}$$

iv. Form the modified within-class scatter matrix

$$S_w^{*} = S_p^{*} (N - g) = (\Phi \Lambda^{*} \Phi^{T})(N - g). \tag{5c}$$

The maximum uncertainty LDA (MLDA) is constructed by replacing $S_w$ with $S_w^{*}$ in the Fisher's criterion formula described in equation (1). As pointed out by Thomaz and Gillies [10], it is based on a maximum entropy covariance selection idea developed to improve the performance of Bayesian classifiers on limited sample size problems [11].

2.3. Framework Design

We can divide the design of the PCA+MLDA multivariate framework into two main tasks: classification (training and test stages) and visual analysis.
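The four MLDA steps of Section 2.2, equations (5a) to (5c), can be sketched in NumPy as below. This is a hedged illustration rather than the authors' code: the function names and the toy data are assumptions. For the two-class case it uses the standard result that the single discriminant direction is proportional to the inverse within-class scatter applied to the difference of the class means, with $S_w$ replaced by $S_w^{*}$.

```python
import numpy as np

def mlda_within_scatter(S_w, N, g):
    """Modified within-class scatter S_w* following steps i to iv."""
    S_p = S_w / (N - g)                        # step i: pooled covariance
    eigvals, Phi = np.linalg.eigh(S_p)         # S_p is symmetric
    lam_bar = eigvals.mean()                   # step ii: trace(S_p) / n
    lam_star = np.maximum(eigvals, lam_bar)    # step iii: largest dispersions
    S_p_star = (Phi * lam_star) @ Phi.T        # Phi diag(lam_star) Phi^T
    return S_p_star * (N - g)                  # step iv

def mlda_direction(X, y):
    """Unit-norm two-class discriminant vector using S_w* in place of S_w."""
    labels = np.unique(y)
    N, g = X.shape[0], labels.size
    means = [X[y == c].mean(axis=0) for c in labels]
    S_w = sum((X[y == c] - mu).T @ (X[y == c] - mu)
              for c, mu in zip(labels, means))
    S_w_star = mlda_within_scatter(S_w, N, g)
    w = np.linalg.solve(S_w_star, means[1] - means[0])
    return w / np.linalg.norm(w), S_w_star

rng = np.random.default_rng(3)
X = rng.normal(size=(12, 11))                  # toy features: N = 12, m = N - 1 = 11
y = np.array([0] * 6 + [1] * 6)
X[y == 1] += 1.5                               # shift the second class
w, S_w_star = mlda_direction(X, y)
z = X @ w                                      # most discriminant feature of each pattern
min_eig = np.linalg.eigvalsh(S_w_star).min()
```

Raising the small eigenvalues to the average keeps the larger dispersions untouched while making the matrix positive definite, so the linear solve is well posed even though the original $S_w$ has rank at most N − g.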

In the classification task the principal components and the maximum uncertainty linear discriminant vector are generated. As illustrated in Figure 2, first a training set is selected and the average image vector of all the training images is calculated and subtracted from each pre-processed image vector. Then the training matrix composed of zero mean image vectors is used as input to compute the PCA transformation matrix. The columns of this n x m transformation matrix are eigenvectors, not necessarily in eigenvalue descending order. Recall, from Section 2.1, that we have retained all the PCA eigenvectors with non-zero eigenvalues. The zero mean image vectors are projected on the principal components and reduced to m-dimensional vectors representing the most expressive features of each one of the pre-processed n-dimensional image vectors. Afterwards, the N x m data matrix is used as input to calculate the MLDA discriminant eigenvector. Since we are assuming only two classes to separate, there is only one MLDA discriminant eigenvector. The most discriminant feature of each one of the m-dimensional vectors is obtained by multiplying the N x m most expressive features matrix by the m x 1 MLDA linear discriminant eigenvector. Thus, the initial pre-processed training set, consisting of N measurements on n variables, is reduced to a data set consisting of N measurements on only 1 most discriminant feature.

Fig. 2. Design of the multivariate linear framework. Classification, training stage: the training images (N x n data matrix, each row an n-dimensional image vector) have the average image (1 x n) subtracted, are projected by PCA (n x m matrix, each column an eigenvector) onto their m most expressive features (N x m), and then by the MLDA linear discriminant eigenvector (m x 1) onto the most discriminant feature (N x 1). Test stage: a test image vector (1 x n) follows the same path, using the average image calculated in the training stage, to its m most expressive features (1 x m) and its most discriminant feature (1 x 1). Visual analysis: a particular point on the most discriminant feature space (1 x 1) is mapped to its m most expressive features (1 x m), to its corresponding n-dimensional vector without the average image (1 x n), and finally to its n-dimensional image vector (1 x n).

The other main task that can be implemented by this two-stage multivariate statistical approach is to visually analyse the most discriminant feature found by the maximum uncertainty method. According to Figure 2, more specifically from right to left in its Visual Analysis frame, any point on the most discriminant feature space can be converted to its corresponding n-dimensional image vector by simply: (1) multiplying that particular point by the transpose of the linear discriminant vector previously computed; (2) multiplying its m most expressive features by the transpose of the principal components matrix; and (3) adding the average image calculated in the training stage to the n-dimensional image vector. Therefore, assuming that the clouds of the classes follow a multidimensional Gaussian distribution and applying limits to the variance of each cloud, such as ± s, where s is the standard deviation of each group, we can move along this most discriminant feature and map the result back into the image domain. This mapping procedure provides an intuitive interpretation of the classification experiments and, as we will show in the experimental results on real medical data, biologically plausible results that are often not detectable simultaneously.

3. Experimental Results

To illustrate the performance of the multivariate linear approach we present in this section experimental results of the framework based on a simple artificially generated data set and on a real medical data set.

Fig. 3. A synthetic data set.

3.1. A Synthetic Data Example

We have chosen a very simple artificial data set composed of 8 binary images of circles (ellipses) and 8 binary images of squares (rectangles).
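Before turning to the images themselves, the three-step mapping of Section 2.3 from a point on the most discriminant feature back to an image vector can be sketched as follows. The data here are random toy vectors rather than the circle and square images, the discriminant vector is a simple mean-difference stand-in for the MLDA eigenvector, and the multiplier `k` on the standard deviation is a hypothetical parameter.

```python
import numpy as np

def to_discriminant(x, mean, P_pca, w_lda):
    """Forward path: image vector -> most discriminant feature (a scalar)."""
    return ((x - mean) @ P_pca) @ w_lda

def to_image(z, mean, P_pca, w_lda):
    """Backward path: point z on the discriminant axis -> image vector."""
    features = z * w_lda                       # (1) point times the transposed discriminant vector
    return features @ P_pca.T + mean           # (2) back through PCA, (3) add the average image

rng = np.random.default_rng(2)
X = rng.normal(size=(8, 64))                   # toy stand-in: 8 "images" of 64 "pixels"
X[4:] += 1.0                                   # second group differs from the first
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P_pca = Vt[:-1].T                              # all N - 1 components with non-zero eigenvalue
Y = (X - mean) @ P_pca
w_lda = Y[4:].mean(axis=0) - Y[:4].mean(axis=0)   # stand-in for the MLDA eigenvector
w_lda /= np.linalg.norm(w_lda)
z = Y @ w_lda                                  # most discriminant feature of each pattern

k = 1.0                                        # hypothetical multiplier on the standard deviation
z_group = z[:4].mean() - k * z[:4].std(ddof=1) # a point inside the first group's limits
image = to_image(z_group, mean, P_pca, w_lda)
round_trip = to_discriminant(image, mean, P_pca, w_lda)
```

Because the principal components are orthonormal and the discriminant vector has unit norm, projecting the synthesised image forward again returns the same point, so moving `z_group` along the axis yields a consistent sequence of images.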
Figure 3 shows both samples of images composed of 70 x 70 pixels. As described in the previous section, we use such training examples (without any spatial normalisation) to construct the multivariate linear classifier for labelling new examples and identifying the most discriminating hyper-plane separating circles (or ellipses) from squares (or rectangles). Since those samples are very simple and easily separable, the classifier achieved 100% leave-one-out accuracy. Figure 4 presents the PCA+MLDA most discriminant feature of the synthetic database using all the 6 examples as training images. It displays the image regions captured by the classifier that change when we move from one side (squares or rectangles) of the dividing hyper-plane to the other (circles or ellipses), following limits to the variance ( ± s standard deviations) of each sample group. Despite the changes due to misalignments of the images, Figure 4 shows clearly that the statistical mapping effectively extracts the group differences. It is important to note that these differences could be very subtle on samples that are very close to the dividing boundary and consequently difficult to characterise as belonging to one of the groups.

Fig. 4. Image display of the regions captured by the classifier that change when we move from one side (squares or rectangles) of the dividing PCA+MLDA hyper-plane to the other (circles or ellipses), following limits of ± standard deviations for each sample group.

3.2. A Real Data Example

In order to demonstrate the effectiveness of the methodology on medical data, we have used an Alzheimer MR brain data set that contains images of 4 patients and 4 healthy controls. All these images were acquired using a 1.5 T Philips Gyroscan S5-ACS MRI scanner (Philips Medical Systems, Eindhoven, The Netherlands), including a series of contiguous .mm thick coronal images across the entire brain, using a T1-weighted fast field echo sequence (TE = 9ms, TR = 0ms, flip angle 0°, field of view = 40mm, 256 x 256 matrix). All images were reviewed by an MR neuroradiologist. Ethical permission for this study was granted by the Ethics Committee of the Clinical Hospital, University of Sao Paulo Medical School, Sao Paulo, Brazil.

3.2.1. Mass-univariate Statistical Analysis

For comparison purposes, Statistical Parametric Mapping (SPM, version SPM2) [4] analyses were conducted using an optimised Voxel-based Morphometry (VBM) protocol [7]. In contrast to the multivariate approach, SPM has been designed to enable voxel-by-voxel inferences about localised differences between the groups and, consequently, does not characterise interregional dependences between the structures of the brain [3].

A standard template set selected by the psychiatrists was created specifically for this study, consisting of a mean T1-weighted image, and a priori gray matter, white matter and CSF templates based on the images of all AD (Alzheimer Disease) and healthy control subjects. Initially, images were spatially normalized to the standard SPM T1-MRI template [9], using linear 12-parameter affine transformations.
Spatially normalized images were then segmented into gray matter, white matter and cerebrospinal fluid (CSF) compartments, using a modified mixture model cluster analysis technique [7]. The segmentation method also included an automated brain extraction procedure to remove non-brain tissue and an algorithm to correct for image intensity non-uniformity. Finally, images were smoothed with an isotropic Gaussian kernel (8mm FWHM), and averaged to provide the gray matter, white matter and CSF templates in stereotactic space.

To boost the signal-to-noise ratio, the image processing of the original images from all AD patients and controls was then carried out, beginning with image segmentation. The segmented images were spatially normalized to the customized templates previously created by using 12-parameter linear as well as nonlinear (7 x 9 x 7 basis functions) transformations. The parameters resulting from this spatial normalization step were reapplied to the original structural images. These fully normalized images were re-sliced using trilinear interpolation to a final voxel size of x x mm, and segmented into gray matter, white matter and CSF partitions. Voxel values were modulated by the Jacobian determinants derived from the spatial normalisation, thus allowing brain structures that had their volumes reduced after spatial normalisation to have their total counts decreased by an amount proportional to the degree of volume shrinkage [7]. Finally, images from AD patients and controls were smoothed using a mm Gaussian kernel and compared statistically between the two groups using unpaired Student's t-tests at p < 0. 0 (level of significance).

Fig. 5. Brain regions where significant differences in Alzheimer patients relative to controls were detected by the SPM voxel-wise statistical tests at p < 0. 0. We can see between-group differences in the occipital, parietal and frontal lobes, inter-hemispheric fissure, and corpus callosum.

Figure 5 illustrates the locations where significant differences between the groups were detected. The underlying image is the reference template used in the spatial normalisation of all MR images. As can be seen, there are some localised differences in the Alzheimer patients relative to controls in the occipital, parietal and frontal lobes, in the inter-hemispheric fissure, and corpus callosum. These structures, especially where significant gray matter changes were observed, are among the regions thought to be the most prominently affected by atrophic changes in Alzheimer disease [].

3.2.2. Multivariate Statistical Analysis

Evaluating the classifier's performance. In order to evaluate the PCA+MLDA classification rule, we have used the Bhattacharyya bound to estimate the error probability of the multivariate statistical framework. For two-class problems, the upper bound of the error probability $e_u$ is defined as [5]

$$e_u = \sqrt{p_1 p_2}\, \exp(-d), \tag{6}$$

where $p_1$ and $p_2$ are the prior probabilities of classes $\pi_1$ and $\pi_2$ respectively, and d is the Bhattacharyya distance between the two classes defined as

$$d = \frac{1}{8} (\bar{x}_2 - \bar{x}_1)^{T} \left[ \frac{S_1 + S_2}{2} \right]^{-1} (\bar{x}_2 - \bar{x}_1) + \frac{1}{2} \ln \frac{\left| \frac{S_1 + S_2}{2} \right|}{\sqrt{|S_1|\,|S_2|}}, \tag{7}$$

where the notation |.| denotes the determinant of a matrix. As described previously, the vector $\bar{x}_i$ and matrix $S_i$ are respectively the unbiased sample mean and sample covariance matrix of class $\pi_i$ (i = 1, 2).

Since the data set under investigation comes with the same proportion of patient images relative to controls, we have assumed that the prior probabilities of both groups are equal. Thus, assuming $p_1 = p_2 = 0.5$ and calculating the Bhattacharyya distance d using all the patient and control samples, the multivariate statistical classifier achieves an error bound of 1.56%. This result confirms the classifier's ability to discriminate the brains of controls from those of patients with a successful classification rate of 98.44%, using the closed-form method for the error probability.

Visual Analysis of discriminative information. The visual analysis of the linear discriminant feature found by the multivariate approach is summarised in Figure 6.
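The Bhattacharyya bound of equations (6) and (7) used in the performance evaluation above is straightforward to compute. The sketch below is an assumption-laden illustration, not the authors' code: the function names are hypothetical and the inputs are toy one-dimensional statistics, since the most discriminant feature is a scalar.

```python
import numpy as np

def bhattacharyya_distance(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between two Gaussian classes, equation (7)."""
    mean1, mean2 = np.atleast_1d(mean1), np.atleast_1d(mean2)
    cov1, cov2 = np.atleast_2d(cov1), np.atleast_2d(cov2)
    cov = 0.5 * (cov1 + cov2)                  # average covariance (S_1 + S_2) / 2
    diff = mean2 - mean1
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Log-determinants avoid overflow or underflow in higher dimensions.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    term2 = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return term1 + term2

def error_upper_bound(d, p1=0.5, p2=0.5):
    """Upper bound on the error probability, equation (6)."""
    return np.sqrt(p1 * p2) * np.exp(-d)

# Toy one-dimensional check: means 0 and 4, unit variances.
d_toy = bhattacharyya_distance(0.0, 1.0, 4.0, 1.0)   # (1/8) * 16 / 1 = 2
bound_toy = error_upper_bound(d_toy)                 # 0.5 * exp(-2)
d_same = bhattacharyya_distance(0.0, 1.0, 0.0, 1.0)  # identical classes
```

With equal priors the bound is 0.5 exp(−d), so identical classes give the uninformative value 0.5, and the 1.56% reported above corresponds to a distance of about d ≈ 3.47.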
As mentioned earlier, the one-dimensional vector found by the PCA+MLDA approach corresponds to a hyper-plane on the original image space whose direction describes statistically the most discriminant differences between the control and patient images used for training. Figure 6 shows the differences between the control (on the left column) and patient (on the right column) images captured by the multivariate statistical classifier using MR intensity features as inputs. These images correspond to one-dimensional points on the PCA+MLDA space projected back into the image domain and located at standard deviations of each sample group. We can understand this mapping procedure as a way of defining intensity changes that come from definitely control and definitely patient samples captured by the statistical classifier. We can see the following brain differences in the Alzheimer patients relative to the controls: (1) enlargement of the ventricular system, (2) atrophy of the hippocampus, (3) cortical degeneration of the occipital, parietal, and frontal lobes, (4) enlargement of the inter-hemispheric fissure, and (5) atrophy of the corpus callosum. These multivariate results are consistent with the SPM between-group differences presented previously and with other common findings in patients who have developed the pathology [], such as the enlargement of the ventricular system. Therefore, the use of the multivariate approach has allowed the simultaneous identification not only of localised between-group differences but also of distributed ones that are often measured separately in the voxel-wise statistical approaches.

Fig. 6. Statistical differences between the control (on the left) and Alzheimer patient (on the right) images captured by the multivariate statistical classifier. We can see the following brain differences in the Alzheimer patients relative to the controls: (1) enlargement of the ventricular system, (2) atrophy of the hippocampus, (3) cortical degeneration of the occipital, parietal, and frontal lobes, (4) enlargement of the inter-hemispheric fissure, and (5) atrophy of the corpus callosum.

4. Conclusion

We have presented a general PCA+MLDA multivariate linear framework to identify and analyse the most discriminating hyper-plane separating two populations. The statistical analysis generates a detailed description of the neuroanatomical changes due to diseases and can facilitate studies of brain disorders, such as Alzheimer's disease, through understanding of the captured anatomical changes.

The idea of using PCA plus an LDA-based approach to discriminate patterns of interest is not new. In this paper we have added to the functionality of this approach the following important points for medical image analysis.

(1) The use of a full rank version of the PCA transformation matrix allows a valuable low-dimensional representation of high dimensional data, providing optimal reconstruction of the most discriminant intensity features without adding any artefacts on the patterns when mapped back into the original image space.

(2) By selecting a slightly biased within-class scatter matrix composed of the most informative dispersions we not only resolve the LDA singularity problem but also stabilise the maximisation of the Fisher's criterion on limited sample size problems.

(3) The conceptual and mathematical simplicity of the approach, whose pivotal step is spatial normalisation, involves the same operations irrespective of the complexity of the experiment or nature of the data, giving multivariate results that are easy to interpret.

Although the approach has been demonstrated in two-class problems, it is extensible to several classes.
The only difference is the visual analysis of the discriminant features, which might be performed pairwise.

References

[1] G. F. Busatto, G. E. J. Garrido, O. P. Almeida, C. C. Castro, C. H. P. Camargo, C. G. Cid, C. A. Buchpiguel, S. Furuie, and C. M. Bottino, A voxel-based morphometry study of temporal lobe gray matter reductions in Alzheimer's disease, Neurobiology of Aging, vol. 24, pp. 221-231, 2003.
[2] L. Chen, H. Liao, M. Ko, J. Lin, and G. Yu, A new LDA-based face recognition system which can solve the small sample size problem, Pattern Recognition, vol. 33, no. 10, pp. 1713-1726, 2000.
[3] K. J. Friston and J. Ashburner, Generative and recognition models for neuroanatomy, NeuroImage, vol. 23, pp. 21-24, 2004.
[4] K. J. Friston, A. P. Holmes, K. J. Worsley, J. P. Poline, C. D. Frith, and R. S. J. Frackowiak, Statistical Parametric Maps in Functional Imaging: A General Linear Approach, Human Brain Mapping, vol. 2, pp. 189-210, 1995.
[5] K. Fukunaga, Introduction to Statistical Pattern Recognition, second edition. Boston: Academic Press, 1990.
[6] P. Golland, W. Grimson, M. Shenton, and R. Kikinis, Detection and Analysis of Statistical Differences in Anatomical Shape, Medical Image Analysis, vol. 9, pp. 69-86, 2005.
[7] C. D. Good, I. S. Johnsrude, J. Ashburner, R. N. Henson, K. J. Friston, and R. S. Frackowiak, A voxel-based morphometric study of ageing in 465 normal adult human brains, NeuroImage, vol. 14, pp. 21-36, 2001.
[8] Z. Lao, D. Shen, Z. Xue, B. Karacali, S. Resnick, and C. Davatzikos, Morphological classification of brains via high-dimensional shape transformations and machine learning methods, NeuroImage, vol. 21, pp. 46-57, 2004.
[9] J. C. Mazziotta, A. W. Toga, A. Evans, P. Fox, and J. Lancaster, A probabilistic atlas of the human brain: Theory and rationale for its development, NeuroImage, vol. 2, pp. 89-101, 1995.
[10] C. E. Thomaz and D. F. Gillies, A Maximum Uncertainty LDA-based approach for Limited Sample Size problems - with application to Face Recognition, in Proceedings of SIBGRAPI 05, IEEE CS Press, pp. 89-96, 2005.
[11] C. E. Thomaz, D. F. Gillies, and R. Q. Feitosa, A New Covariance Estimate for Bayesian Classifiers in Biometric Recognition, IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 2, pp. 214-223, 2004.
[12] P. M. Thompson, J. Moussai, S. Zohoori, A. Goldkorn, A. Khan, M. S. Mega, G. Small, J. Cummings, and A. W. Toga, Cortical variability and asymmetry in normal aging and Alzheimer's disease, Cerebral Cortex, vol. 8, pp. 492-509, 1998.
[13] J. Yang and J. Yang, Why can LDA be performed in PCA transformed space?, Pattern Recognition, vol. 36, pp. 563-566, 2003.
[14] H. Yu and J. Yang, A direct LDA algorithm for high dimensional data with application to face recognition, Pattern Recognition, vol. 34, pp. 2067-2070, 2001.