Diffusion MRI Analysis Techniques Inspired by the Preterm Infant Brain


by

Brian Gregory Booth

M.Sc., University of Alberta, 2008
B.Sc., University of Alberta, 2005

Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in the School of Computing Science, Faculty of Applied Science

© Brian Gregory Booth 2015
SIMON FRASER UNIVERSITY
Fall 2015

All rights reserved. However, in accordance with the Copyright Act of Canada, this work may be reproduced without authorization under the conditions for Fair Dealing. Therefore, limited reproduction of this work for the purposes of private study, research, criticism, review and news reporting is likely to be in accordance with the law, particularly if cited appropriately.

Approval

Name: Brian Gregory Booth
Degree: Doctor of Philosophy
Title: Diffusion MRI Analysis Techniques Inspired by the Preterm Infant Brain
Examining Committee:
Dr. Ping Tan (chair), Assistant Professor
Dr. Ghassan Hamarneh, Senior Supervisor, Professor
Dr. Mark Drew, Co-Supervisor, Professor
Dr. Brian Funt, Internal Examiner, Professor
Dr. Colin Studholme, External Examiner, Professor, Departments of Pediatrics and Bioengineering, University of Washington
Date Defended: 24 November 2015

Abstract

Diffusion MRI (dMRI) is a powerful imaging modality that allows us to non-invasively examine the organization and integrity of fibrous tissue, particularly the brain's white matter. The result of a dMRI scan is a 3D image where each voxel contains a model that describes the local diffusion pattern of water molecules. The complicated nature and high dimensionality of these voxel-wise models make dMRI analysis especially challenging. The challenges increase when we look at dMRI scans from infants born prematurely. The smaller brain size for these infants, and the still-emerging brain structures these infants possess, increase the challenges involved in processing, analyzing, and interpreting dMRI scans.

This thesis introduces four computational contributions in the area of dMRI analysis that attempt to address challenges exacerbated when imaging the preterm infant brain. Specifically, these four contributions are: (a) the first information content estimators for unaltered dMRI data, including a mutual information estimator for image registration; (b) a novel dMRI segmentation algorithm based on a cross-sectional piecewise constant image model; (c) the first global, closed-form probabilistic tractography algorithm, one where tracts compete for space in the brain; and (d) STEAM: the first patient-specific statistical abnormality mapping technique. In these four contributions, we model the dMRI data more accurately in order to improve the accuracy of dMRI analysis techniques, particularly in the presence of the small brain sizes and still-emerging brain structures seen in preterm infant dMRI. We further make the source code for these contributions publicly available to aid in the reproducibility of our research.

Keywords: Diffusion MRI; Preterm Infants; Segmentation; Tractography; Entropy Estimation; Voxel-based Analysis; Statistical Modelling

Dedication

To the universe, or whoever decided that someone else would be Leonardo and that I would be...less remarkable.

Acknowledgements

First and foremost, I would like to thank my senior supervisor Ghassan Hamarneh for his mentorship, support, and infinite patience over the past few years. I would also like to thank my other collaborators for their help, feedback, and support on various projects. In particular, I would like to thank Dr. Steven Miller and Dr. Ken Poskitt for their help in understanding the clinical perspective of the research presented in this thesis.

Further, I would like to acknowledge my various funding sources who have made it possible for me to pursue this research. In particular, I would like to thank NSERC; the governments of British Columbia, Alberta, and France; IODE Canada; and Simon Fraser University for the funding they provided to me.

I would also like to thank my examiners Mark Drew, Brian Funt, and Colin Studholme for their feedback on this thesis. I would further like to thank my fellow students in the Medical Image Analysis Lab, particularly Chris McIntosh, Colin Brown, Saba El-Hilo, and Pradeep Raamana, for their camaraderie and support throughout my doctoral studies. It was a pleasure to share a lab with all of you.

Finally, I would like to thank my parents for always holding me to a higher standard than the rest of the world does. Any success I achieve is as much due to their hard work as it is due to mine.

Table of Contents

Approval
Abstract
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Acronyms

1 Introduction
    Background and Motivation
    Thesis Contributions
    Information Content Estimators for Diffusion MRI
    Diffusion MRI Segmentation via Cross-sectional Piecewise Constancy
    Global, Competitive, Closed-Form Probabilistic Tractography
    STEAM - Statistical Template Estimation for Abnormality Mapping
    Reproducible Research and Published Software
    Full Bibliography of Doctoral Work
    Thesis Outline

2 Fundamentals of Diffusion MRI Acquisition and Modelling
    Introduction
    Biological Basis for Diffusion MRI
    Diffusion Weighted Image Acquisition
    Magnetic Resonance Imaging
    Diffusion Weighted Imaging
    Correction of Image Artifacts
    Eddy Currents
    Subject Motion
    Rician Noise
    The Diffusion Tensor Model
    Tensor Image Visualization
    High Angular Resolution Diffusion Models
    Compartment Models
    Higher Order Tensors
    Diffusion Orientation Distribution Functions
    HARDI versus the Diffusion Tensor
    Preterm Infant dMRI Acquisition and Modelling
    Conclusions

3 Information Content Estimators for Diffusion MRI
    Introduction and Motivation
    Background: Binning Estimators
    Extensions to Tensor-Valued Data
    Limitations of Binning Estimators
    Methods: Nearest Neighbour Estimators
    The Shannon Entropy Estimator
    Extension to Mutual Information
    Results: Comparison of Entropy Estimators
    Noise Estimation
    Application to Image Segmentation
    Application to Image Registration
    Discussion
    Conclusions

4 Diffusion MRI Segmentation via Cross-sectional Piecewise Constancy
    Introduction and Motivation
    Methods
    Overview
    Anchor Curve Generation
    Obtaining Tract Cross-Sections
    Cross-Sectional Piecewise Constancy
    Mapping Dissimilarities to the Image Space
    Dissimilarity Map Segmentation
    Experimental Setup and Results
    Phantom Experiment
    Real Data Experiment
    Discussion
    Design Decisions
    Segmenting Preterm Infant dMRI
    Conclusions

5 Global, Competitive, Closed-Form Probabilistic Tractography
    Introduction and Motivation
    Methods
    Overview
    dMRI Graph Embedding
    Analytical Edge Weight Integration
    Random Walker Tractography
    Modelling the Background
    Experimental Results
    Graph Construction
    Tractography
    Synthetic dMRI Data
    Real dMRI Data
    Discussion
    Conclusions

6 STEAM - Statistical Template Estimation for Abnormality Mapping
    Introduction and Motivation
    Methods
    Statistical Template Estimation
    Abnormality Mapping
    Completing the Template Collection
    Demographics and Clinical Factors of our Preterm Infant Cohort
    Cohort Demographics
    Image Scoring and Quality Control
    Defining Experimental and Control Groups
    Experimental Results
    Validation of Normative Statistical Templates
    Subject-Specific Abnormality Maps: Proof of Concept
    Relating STEAM Abnormalities to Outcome
    Comparing STEAM Abnormalities to T1 Abnormalities
    Discussion
    Conclusion

7 Conclusion
    Thesis Summary
    Conclusions
    Future Work in Preterm dMRI Analysis

Bibliography

List of Tables

Table 5.1: The Legendre polynomials and their indefinite integrals of even degree up to l = …

Table 5.2: Computation time and normalized root-mean-squared error results for various methods of calculating P_odf in (5.3). Results shown for our exact method versus numerical integration using different order tessellations of an icosahedron. Note our proposed method gives accurate results within machine precision ε.

Table 6.1: Demographics of the experimental and control groups for the preterm infant cohort used in this thesis to validate STEAM. Experimental and control groups are defined in Section … P-values are shown between the two groups. Note that there are no significant differences in birth age, scan age, sex, or brain volume between the experimental and control groups.

Table 6.2: Demographics for the preterm infant cohort divided by age window for which we created a statistical template. Numbers are provided for the control group first, followed by the experimental group, with the p-value (1-way ANOVA) of the group differences shown in brackets. Note that there are significant age differences (highlighted in bold) between experimental and control groups for the second- and fourth-youngest statistical template age windows.

Table 6.3: The DTI scans used to test the relationship between our voxel-based analysis and motor outcome at 18 months corrected age. The scans are grouped according to post-menstrual age and Bayley motor score. Scans with clinically abnormal motor outcomes are highlighted in red while scans with normal motor outcomes are highlighted in green. Borderline, or low normal, cases are highlighted in yellow.

Table 6.4: The DTI scans used to test the relationship between our voxel-based analysis and the presence of white matter injury (WMI) scored according to [149]. The scans are grouped according to post-menstrual age and WMI score. Scans with significant white matter lesions are highlighted in red while lesion-free scans are highlighted in green. Scans with single, small lesions are highlighted in yellow.

List of Figures

Figure 1.1: The biological basis for Diffusion MRI. Water molecules move faster when they are not restricted by cell structures. As a result, diffusion along fiber tracts is faster than perpendicular to those tracts. From Beaulieu [37].

Figure 1.2: Slices of a preterm infant's brain shown in diffusion tensor MRI at post-menstrual ages (PMA) of 29 and 44 weeks, respectively. Note the large change in both brain shape and size, leading to a more challenging image registration problem. Also note the decreased anisotropy (i.e., brightness) around the edge of the brain as the infant matures.

Figure 1.3: An example of the cingulum fiber tract as highlighted on the dMRI scan of a healthy adult. Note that the cingulum forms a semicircular shape and given that diffusion is strongest along the tract, the diffusion measurements along this tract are not homogeneous. From Nand et al. [160].

Figure 1.4: The segmentation workflow of our algorithm based on our cross-sectional piecewise constant image model. For further details, please refer to the text below.

Figure 1.5: Examples of tract jumping on the same dMRI scan using both traditional streamline tractography and more recent probabilistic tractography. Note that the tract identified by the purple arrow is not anatomically correct, yet these tractography algorithms erroneously identify this as a valid anatomical connection in the brain.

Figure 1.6: Our proposed tractography algorithm is based on a random walk over the image grid where the grid's edge weights are computed from the dMRI data; the more diffusion we measured along an edge direction, the more likely the random walker will select that edge.

Figure 1.7: Flow diagram for the statistical template estimation procedure first presented in [96]. Diffusion MRI scans from normative subjects are aligned to a target image (yellow) and Gaussian distributions are fit to the measurements at each voxel (red), resulting in a statistical template for the normative population. The image transformations from the alignment step are then inverted, averaged, and applied to the statistical template images (blue) to transform them to an unbiased shape and size. Multiple iterations can then be done to reduce image registration error.

Figure 1.8: The proposed analysis pipeline for a new dMRI scan. The new scan is aligned to the STEAM statistical template, then values at each voxel are compared to the Gaussian distributions in the template. The voxels whose tensors are significantly different than their corresponding Gaussian distribution are identified and visualized.

Figure 2.1: Synthetic examples of the diffusion seen in cerebrospinal fluid (left), gray matter (middle), and white matter (right) within the brain. The diffusion rates for various directions are shown in red. Adapted from [7].

Figure 2.2: Nuclear spin s generating a magnetic moment m. The particle spins around a rotational axis shown here in gray. Adapted from [138].

Figure 2.3: The Stejskal-Tanner diffusion weighted imaging sequence. Adapted from [230].

Figure 2.4: Axial slices of (left to right) (a) standard T2 image and (b) its corresponding diffusion weighted images from gradient pulses in the horizontal, vertical, and out of plane directions. Note the differences in measured diffusion in the Splenium due to gradient direction (highlighted by the white arrows). Adapted from [119].

Figure 2.5: Examples of the ellipsoidal representation of prolate (a) and oblate (b) diffusion tensors. Adapted from [119].

Figure 2.6: Various methods of visualizing the information contained in a diffusion tensor field. Images generated using MedINRIA on data obtained from [153].

Figure 2.7: Example of crossing fibers and how they are modeled using Diffusion MRI. Adapted from [7, 71, 230] respectively.

Figure 2.8: Sample visualization techniques for diffusion ODFs obtained from HARDI.

Figure 3.1: Slices of a preterm infant's brain shown in diffusion tensor MRI at post-menstrual ages (PMA) of 29 and 44 weeks respectively. Note the large change in both brain shape and size, leading to a more challenging image registration problem. Also note the decreased anisotropy (i.e., brightness) around the edge of the brain as the infant matures.

Figure 3.2: A simple 2D example showing the limitations of the averaging estimator. The two unique distributions in (a) and (b) have the same 1D histograms for each of their channels (shown in (c) and (d) respectively). An averaging estimator would obtain the same entropy estimate for both examples (H = 1 bit) despite the fact that (a) has higher entropy than (b).

Figure 3.3: A simple 2D example showing the limitations of the grouping estimator. The two unique distributions in (a) and (b) have the same 1D histogram when grouping the samples from the different channels (shown in (c)). The grouping estimator obtains the same entropy estimate for both examples (H = 1 bit) despite the fact that (a) has higher entropy than (b).

Figure 3.4: Synthetic tensor data sampled from Wishart distributions of different degrees of freedom and same mean tensor diag(3, 1, 1). Note that as the degree of freedom increases, the variability of the sample data decreases.

Figure 3.5: Entropy estimates for noisy synthetic data. Entropy estimates using various estimators shown in (a) and grouped into distance-based (b) and binning-based (c) methods for a close up view. Note that, as expected, the distance-based estimates decrease as noise decreases, while the binning-based estimates behave inconsistently with respect to the underlying uncertainty in the data.

Figure 3.6: Entropy estimates from various estimators for different quality segmentations of white matter fiber bundles in the ICBM DTI-81 Atlas. Figure (a) shows the entropy estimates as a function of ground truth ROI dilation for the left Cingulum. Note the slight dip in the entropy estimate for the FA estimator at around 3 ROI dilations (red arrow). To better quantify non-convexity, we study the existence of local minima. Figure (b) shows, for each estimator, the number of regions in this experiment where we avoid generating misleading local minima. Note that our proposed estimators are the only ones that, in all cases, did not generate additional local minima, thereby demonstrating their greater suitability in a gradient-based optimization of a DTI segmentation problem. Figure (c) shows the average slope of the entropy vs. ROI dilation curves near the optimal segmentations. The greater slope provided by our proposed estimators suggests faster convergence in the optimization step of an image segmentation problem.

Figure 3.7: Mutual information estimates using various tensor-valued estimators for rotational misalignments with additive noise. Figure (a) shows the range of z-axis rotations tested while figure (b) shows the range of additive Rician noise values. Mutual information for different rotational alignments is shown in (c) for the case without additive Rician noise. Note the significantly higher slope around the optimum for our proposed estimators. Figure (d) shows how the margin between the optimal mutual information value and the baseline changes as noise increases. Note the added robustness of our nearest-neighbor estimators in the presence of additive noise.

Figure 3.8: Axial slices of a preterm infant's brain shown in diffusion tensor MRI at post-menstrual ages of 29 and 44 weeks respectively, as well as the result of registering the earlier scan to the later scan using the proposed mutual information measure (with the Log-Euclidean distance metric). Note that the registration performs poorly as a result of large topological changes, highlighted in white, between the two original scans. These topological changes cannot be accounted for in diffusion tensor image registration. See text for further details.

Figure 4.1: Examples of the additional challenges seen for preterm infant dMRI segmentation. One challenge shown in (a) is that curved fiber tracts, like the cingulum, possess non-homogeneous diffusion properties, making their dMRI data difficult to model. An additional challenge is that fiber tracts which are easily identifiable in adult dMRI scans are still emerging in the preterm brain. One example is the inferior occipitofrontal fasciculi (IOF) highlighted by white arrows in (b) and (c). As can be seen in (c), these emerging fiber tracts are more difficult to distinguish from their surrounding tissue.

Figure 4.2: Our proposed segmentation workflow. Tractography is employed to generate an anchor curve (blue) which is then used to generate cross-sections of the fiber bundle (magenta). For each cross-section, we measure diffusion dissimilarities between the points on the plane and the intersection point between the plane and the anchor curve. These dissimilarities are then interpolated back into a 3D image and a scalar segmentation algorithm provides us with the final segmentation.

Figure 4.3: Segmentation results for the ring tract in (a). Note that we obtain significantly higher Dice coefficients than competing methods as we are able to better model curved tracts and fiber crossings. Further, our approach generates consistent results across various noise levels.

Figure 4.4: Results on the segmentation of cingulum bundles from real dMRI scans. A sample is shown in (b). Note that we obtain higher Dice coefficients than competing methods. For the methods that were able to segment the cingulum, we were better able to reduce undersegmentation as highlighted by the blue arrows in (c).

Figure 4.5: At the endpoints of the anchor curve, we can create additional planes spanned by T, B and T, N respectively to capture the extremities of the fiber tract.

Figure 4.6: Distance image histograms for the four cases related to FA normalization of distances and log mapping of the distances. Histograms were computed for the synthetic phantom used in our experiments, with σ = … Distances within the tract are shown in blue while distances outside the tract are shown in red. Note that including FA normalization makes the classes more easily separable. Meanwhile, applying the log mapping leads to more Gaussian-like behavior and similar magnitude ranges for each class, thereby making the distribution of distances within each class easier to model.

Figure 4.7: Segmentation results on the inferior occipitofrontal fasciculi (IOF) in the brain of an infant born preterm (the same infant shown in Figure 4.1c). The ground truth segmentation, shown in (a), was drawn manually by an expert. Note that our proposed segmentation algorithm provided a result in (b) that is visually similar to the ground truth, while the two competing algorithms either undersegmented (c) or oversegmented (d) the IOF.

Figure 5.1: Examples of tract jumping in both the adult and preterm infant brain. The streamline tractography (performed by FACT [30]) results are shown for streamlines passing through a region-of-interest located in the splenium. Note that in both cases, the forceps major (i.e., the upside-down U-shaped tract) is correctly identified, but that the preterm infant's tractography result shows a greater degree of spurious streamlines in the anterior direction. Those spurious streamlines are examples of tract jumping in the tractography algorithm.

Figure 5.2: Our proposed tractography algorithm is based on a random walk over the image grid where the grid's edge weights are computed from the dMRI data; the more diffusion we measured along an edge direction, the more likely the random walker will select that edge.

Figure 5.3: Connection probabilities using the exact and the numerical approximation of P_odf in (5.3). The tractography results shown are based on our exact solution for obtaining the graph edge weights. The regions highlighted in cyan are compared to results for numerical approximation with 2nd-order tessellation of an icosahedron. Maximum intensity projections along the coronal and axial planes are shown in (a) and (b) respectively. Note the qualitative differences in connection probabilities in the highlighted regions.

Figure 5.4: Percentage error in connection probabilities for the examples in Figure 5.3. Maximum intensity projections along the coronal and axial planes are shown respectively. Note how the error from numerical approximation reaches over 15%.

Figure 5.5: Synthetic dMRI phantoms showing complex fiber relationships of kissing and crossing fibers. These two phantoms are used in our tractography experiments with the competing target regions delineated by white and yellow boxes respectively.

Figure 5.6: Tractography results for the crossing and kissing fiber phantoms. Note that our approach in (b) and (c) produces smoother tractograms than the minimal path approach in (a) as we avoid local computation of the connection probabilities. Further, the introduction of competition reduces the chances of tract jumping as shown by the decreasing probability we see away from the targeted vertical tracts. This result is highlighted by the green arrows.

Figure 5.7: Target regions defined for examining the strength of the tracts passing through different regions of the Corpus Callosum (CC). These target regions are chosen within the crossing region between the CC and the internal capsule in order to test the effect that the competing seeds within the internal capsule have on connections within the CC.

Figure 5.8: Tractography results near the mid-sagittal plane for the target regions in Figure 5.7. The different sections of the Corpus Callosum (CC) are delineated in blue using atlas-based segmentation. Tractography results within the CC are scaled up and shown in insets. Note that the connection probabilities for the minimal path approach in (a) are noisy as a result of their local computation. An example is pointed out by the blue arrow. Our approach is less susceptible to noise and shows a smoother result. Further, note that the addition of competition reduces the connection probabilities outside the seeded sections of the CC, as shown by the darker regions highlighted by the green arrows.

Figure 5.9: Marginal connection probabilities within the different sections of the Corpus Callosum with respect to the three target regions displayed in Figure 5.7. Note that our approach with competition shows the highest marginal connection probabilities, reflecting improved localization of different sections of the Corpus Callosum.

Figure 5.10: Tractography results for the splenium for a preterm infant scanned at 29 weeks PMA. The splenium was targeted with the region highlighted in green in (a). For our competitive tractography algorithm, the red region around the fornix was used as a competing region. Note that with the introduction of competition, we see lower connection probabilities between the splenium and the fornix.

Figure 6.1: Popular preterm infant DTI analysis techniques presented according to the amount of the brain included in the analysis (coverage), the resolution at which the statistical analysis is undertaken (scale), and whether the analysis technique is group-based or personalized. Note that our proposed technique, STEAM, is the only technique that provides a personalized, fine-scale analysis of the whole brain.

Figure 6.2: Flow diagram of the template creation procedure of Guimond et al. [96]. The subjects' scans are aligned to a given target image and averaged. The corresponding image transformations are inverted, averaged, then applied to the average image to adjust the template to an unbiased shape and size. Multiple iterations are then done to reduce registration error. Each step in the pipeline is colour-coded by section, with the image alignment (yellow) discussed in Section 6.2.1, the model fitting (red) discussed in Section 6.2.1, and the bias correction steps (blue) discussed in Section …

Figure 6.3: STEAM centers around a 3D statistical template modeling the DTI scan of a healthy preterm infant brain. This template consists of the three 3D images seen on the left: a mean image M, a covariance image S, and a normalcy p-value image P. To conserve space, we will refer to a 3D statistical template using the stacked visualization on the right.

Figure 6.4: The proposed analysis pipeline for VBA. A new DTI scan is aligned to the STEAM statistical template, then values at each voxel are compared to the Gaussian distributions in the template using a χ² test. The voxels whose tensors are significantly different than their corresponding Gaussian distribution, after multiple comparison correction, are identified and visualized.

Figure 6.5: A visualization of the various diffusion tensor measures for which we generated preterm infant statistical templates. Among these measures are mean diffusivity (MD), the tensor shape measures (c_l, c_p, c_s) defined in [230], fractional anisotropy (FA), Log-Euclidean FA (LFA), relative anisotropy (RA), each individual eigenvalue (λ_1, λ_2, λ_3) of the diffusion tensor, radial diffusivity (RD), tensor norm (‖D‖_F), and volume ratio (VR). Note that templates for all of these measures are computed at multiple spatial scales.

Figure 6.6: Exclusion criteria for the group of preterm infant DTI scans used to create the normative statistical templates. The number of scans excluded by each criterion is listed below that criterion (WMI = white matter injury, IVH = intraventricular hemorrhage, Bayley-III = Bayley Scales of Infant and Toddler Development, PDMS-II = Peabody Developmental Motor Scales). Excluded scans of sufficient quality were used to validate the VBA aspect of STEAM. A more detailed description of that VBA test set is given with the corresponding experiments in Section 6.4. Note that scans were included in the template only if the infant's measures of neurodevelopment are within 1 standard deviation of the normal mean (> 85). Further details on these exclusion criteria are given in Section …

Figure 6.7: The distribution of DTI scans used to generate each statistical preterm infant template. Bars are color-coded based on the age windows used for each template (see Section 6.3.3).

Figure 6.8: Axial, coronal, and sagittal slices of the mean color FA maps for our four preterm infant DTI templates. Note that all figures are drawn with the same scaling so any change in size is due to brain development. Also note the template quality makes it easy to distinguish major fiber tracts.

Figure 6.9: Axial, coronal, and sagittal slices of the FA coefficient of variation for our four preterm infant DTI templates. Note that as the brain develops the inter-subject variability increases. Also, the variability is greater in the posterior part of the brain, suggesting greater development in that part of the brain over this … week PMA time period.

Figure 6.10: A case study of how STEAM can be used to identify DTI abnormalities and how those detected abnormalities compare to structural MRI abnormalities and outcome. The results from STEAM's voxel-based analysis for FA, MD, RD, and λ_1 are shown in (a). These results show abnormally high MD, RD, and λ_1 over a large region encompassing deep gray and white matter. These results are consistent both with the presence of white matter lesions on the infant's T1 scan shown in (b) and the infant's significantly reduced neurodevelopmental test scores at 18 months corrected age (shown in the table above). The registration accuracy between the infant's DTI scan and the STEAM statistical template is shown in (c). We do see some misregistration around the posterior portion of the ventricles on the MD blended image, which is a result of ventriculomegaly. However, this misregistration is small in comparison to the STEAM-detected DTI abnormalities. The combination of all these results suggests that STEAM is identifying a true structural abnormality in this infant.

Figure 6.11: A second case study of how STEAM can be used to identify DTI abnormalities and how those detected abnormalities compare to structural MRI abnormalities and outcome. The results from STEAM's voxel-based analysis for FA, MD, RD, and λ_1 are shown in (a). These results show abnormally high FA and λ_1 in areas of cortical gray matter and superficial white matter on the left side of the brain. These results are consistent with the presence of nearby white matter lesions on the infant's T1 scan shown in (b) as well as the infant's significantly reduced motor test scores at 18 months corrected age (shown in the table above). The registration accuracy between the infant's DTI scan and the STEAM statistical template is shown in (c). While some of the abnormalities are around the cortex, there is no discernible registration error in these regions. The combination of all these results suggests that STEAM is identifying a true structural abnormality in this infant.

Figure 6.12: Comparison of STEAM-detected abnormalities in fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (λ_1) between infants with normal and abnormal motor development. P-values were computed using two-way ANCOVA with sex, birth age, and brain volume included as covariates. Note that for all diffusion features except FA, the extent of STEAM-detected abnormalities is significantly higher for the infants with abnormal motor outcome.

Figure 6.13: Comparison of STEAM-detected abnormalities in fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (λ_1) between infants with and without white matter lesions. P-values were computed using two-way ANCOVA with sex, birth age, and brain volume included as covariates. Note that for all diffusion features except FA, the extent of STEAM-detected abnormalities is significantly higher for the infants with white matter lesions, which is consistent with previous literature [20].

Figure 6.14: A collection of views from the STEAM website. From the website, one can download the STEAM source code, template collections (a) or even individual templates within each collection (b). Also, the STEAM website boasts an online image viewer (c) that allows an interested user to examine each STEAM statistical template used in this work (d).

Figure 6.15: An example of image misregistration leading to false positives in the STEAM abnormality maps. In (a), a large region of FA abnormality is identified by STEAM around the right corticospinal tract (shown highlighted by blue arrows). In (b), we see in the same area that the subject's right corticospinal tract (in purple) does not align with the corresponding tract in the template (again highlighted by blue arrows). In this case, the misalignment pattern matches the abnormality pattern, strongly suggesting that these STEAM-detected abnormalities are false positives.

List of Acronyms

ADC         Apparent Diffusion Coefficient
ANCOVA      Analysis of Covariance
ANOVA       Analysis of Variance
Bayley-III  Bayley Scales of Infant and Toddler Development, Third Edition
CSF         Cerebrospinal Fluid
dMRI        Diffusion Magnetic Resonance Imaging
DOT         Diffusion Orientation Transform
DSC         Dice Similarity Coefficient
DTI         Diffusion Tensor Imaging (or Diffusion Tensor MRI)
DWI         Diffusion Weighted Imaging
EPI         Echo Planar Imaging
FA          Fractional Anisotropy
FDR         False Discovery Rate
FDT         FSL's Diffusion Toolbox
FLIRT       FSL's Linear Image Registration Tool
FMRIB       Oxford Centre for Functional MRI of the Brain
fODF        Fiber Orientation Distribution Function
FOV         Field Of View
FWHM        Full Width at Half-Maximum
FSL         FMRIB Software Library
GA          Gestational Age
GRF         Gaussian Random Field
HARDI       High Angular Resolution Diffusion Imaging
IOF         Inferior Occipitofrontal Fasciculi
IVH         Intraventricular Hemorrhage
MD          Mean Diffusivity
MOW         Mixture of Wisharts
MRI         Magnetic Resonance Imaging
ODF         Orientation Distribution Function
PAS         Persistent Angular Structure
PDF         Probability Density Function
PDMS-II     Peabody Developmental Motor Scales, Second Edition
PMA         Post-Menstrual Age
RBF         Radial Basis Function
RD          Radial Diffusivity
ROI         Region of Interest
STEAM       Statistical Template Estimation for Abnormality Mapping
TE          Time to Echo
TEND        Tensor Deflection Tractography
TR          Time to Repetition
VBA         Voxel-Based Analysis
VM          Ventriculomegaly
WMI         White Matter Injury

Chapter 1

Introduction

1.1 Background and Motivation

The majority of the human body is composed of water molecules [97], and those molecules undergo diffusion: random Brownian motion [59]. While the diffusion process is random, our cell structures restrict the movement of water molecules [36], as shown in Figure 1.1. This relationship between diffusion and cell structure leads to two noticeable phenomena. First, a greater amount of cell structure leads to slower, more restricted diffusion. Second, in fibrous tissues, the rate of diffusion is greater along the direction of the fibers than perpendicular to them. The ability to measure diffusion would provide us with a window into these two phenomena and, in turn, the ability to assess tissue organization and integrity in a non-invasive fashion.

Figure 1.1: The biological basis for diffusion MRI. Water molecules move faster when they are not restricted by cell structures. As a result, diffusion along fiber tracts is faster than perpendicular to those tracts. From Beaulieu [37].

Diffusion MRI (dMRI) is a powerful imaging protocol that allows us to measure, at each point in a subject, the diffusion rate along a given direction in 3D [119]. Repeating the dMR imaging protocol allows us to produce multiple 3D images, each of which measures, voxel-by-voxel, the diffusion rate along a unique 3D direction. After obtaining multiple diffusion measurements at each voxel, it is then common to fit a model that concisely summarizes and interpolates between those diffusion measurements [8]. One common example of such a model is the diffusion tensor [29], but more flexible models have also been proposed [18].

With models fit to the diffusion measurements at each voxel, we obtain a 3D image whose voxel values are high-dimensional and typically manifold-valued (as these models usually rule out negative diffusion values). As a result, a dMRI scan can be very difficult to interpret through visual inspection [174], which has led to a need for computational methods to reduce the image's complexity [99] and extract its clinically relevant data for further analysis [30, 118, 203, 234]. A more thorough review of diffusion MRI can be found in [50, 52] and is included as the second chapter of this thesis.

Imaging Preterm Infants

Diffusion MRI is most commonly used for the non-invasive assessment of the brain's white matter [203]. Recently, there has been an increasing desire to use dMRI in the assessment of infants born prematurely [77]. This desire stems from the fact that preterm infants are at a high risk of long-term neurodevelopmental problems due to white matter brain injuries occurring around the time of the infant's birth [20]. Diffusion MRI would allow us to examine the preterm infant brain and gauge an infant's risk of long-term developmental abnormalities.
While the high dimensionality and manifold-valued nature of dMRI data make them difficult to analyze, the scans obtained from preterm infants pose three additional challenges: decreased image quality due to motion, as infants are less cooperative during the imaging process than the average adult patient [158]; increased aliasing due to the combination of the infant's small brain size and the limited image resolution of dMRI (on the order of millimetres) [17]; and the infant's brain being at an earlier stage of development than that of the average adult patient, with various brain structures not yet fully formed at the time of image acquisition [181]. The combination of these three challenges results in dMRI scans that are of lower quality, and show greater variability, than what is seen for adult patients. In fact, these challenges have highlighted various limitations in computational dMRI analysis techniques, many of which were built for and only tested on adult dMRI scans [23, 76]. Due to these limitations, there remains a need for dMRI analysis algorithms that take into consideration the unique challenges posed by preterm infant dMRI scans.

1.2 Thesis Contributions

In this thesis, we take on the additional challenges of preterm infant dMRI analysis by introducing novel contributions in the areas of dMRI registration, segmentation, tractography, and statistical analysis. In particular, those contributions are:

1. The creation of the first information content estimators for unaltered diffusion MRI data [46], thereby generating a robust dMRI mutual information metric that can be used for longitudinal preterm dMRI registration.

2. The development of a preprocessing technique to simplify dMRI scans for segmentation. This preprocessing technique highlights local changes in diffusion due to tissue differences while ignoring local changes in diffusion that are due to a tissue's pose or orientation. We show that differentiating between these two types of diffusion changes leads to more accurate tract segmentations [49], which should better highlight emerging brain structures in a preterm infant's dMRI scan.

3. The establishment of the first probabilistic tractography technique that allows the user to map out multiple neuronal pathways simultaneously, effectively introducing competition between pathways for the space those pathways occupy [47, 48]. By introducing this competitive aspect to tractography, we reduce the number of spurious neuronal pathways that existing tractography algorithms currently identify.

4. The construction of a whole-brain, voxel-based, statistical analysis pipeline called STEAM: Statistical Template Estimation for Abnormality Mapping [53]. This pipeline is the first to allow a user to take a single preterm dMRI scan, compare it to a normative preterm population, and highlight voxels whose measured diffusion is significantly different from that of the normative population.

In all four cases, we make the source code for these contributions publicly available to aid in the reproducibility of our research.
In the remainder of this section, we introduce each contribution and discuss how it relates to preterm infant dMRI analysis.

Information Content Estimators for Diffusion MRI

Information theory is a key component of computer vision and has seen uses in image compression, feature detection, segmentation, and registration [81]. Attempts have been made to apply information theory to dMRI data [4, 103, 163], and the success of these applications, and of other similar work, relies on being able to estimate entropy in a manner that is mathematically consistent and accurate. Usually, entropy is estimated using histogram binning, but this estimation technique only works well if the dimensionality of the samples is low [201]. For high-dimensional dMRI data, this form of entropy estimation is prone to bias and quantization errors [222].
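The weakness of histogram binning in high dimensions can be illustrated with a small experiment. The sketch below is an illustration under stated assumptions rather than code from this thesis: it uses synthetic Gaussian samples in place of dMRI data, six dimensions as a stand-in for the six free parameters of a diffusion tensor, and a hypothetical helper name (histogram_entropy_bits). It computes the plug-in entropy of the bin labels, which differs from differential entropy by a bin-volume term.

```python
import numpy as np

def histogram_entropy_bits(samples, bins_per_axis=10):
    """Plug-in entropy (in bits) of the bin labels of `samples`.

    Hypothetical helper for illustration. Returns the entropy and the
    number of occupied histogram cells.
    """
    counts, _ = np.histogramdd(np.asarray(samples, dtype=float),
                               bins=bins_per_axis)
    occupied = counts[counts > 0]
    p = occupied / occupied.sum()
    return float(-(p * np.log2(p)).sum()), int(occupied.size)

rng = np.random.default_rng(0)
n = 1000

# 1-D samples: 10 cells shared by 1000 samples, so each cell is well populated.
h1, cells1 = histogram_entropy_bits(rng.normal(size=(n, 1)))

# 6-D samples: 10**6 cells for 1000 samples, so almost every sample sits
# alone in its own cell and the estimate saturates near log2(n).
h6, cells6 = histogram_entropy_bits(rng.normal(size=(n, 6)))

print(f"1-D: {cells1} occupied cells, entropy {h1:.2f} bits")
print(f"6-D: {cells6} occupied cells, entropy {h6:.2f} bits, "
      f"log2(n) = {np.log2(n):.2f} bits")
```

In the six-dimensional case the estimate reflects the sample count rather than the shape of the underlying distribution, which is one way to see why binning-based estimators become unreliable as dimensionality grows.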

Existing works that estimate the entropy of dMRI datasets reduce the dimensionality of the data prior to computing the estimate, resulting in some dMRI information being thrown away [4, 103, 163]. Ideally, we desire a dMRI entropy estimator that is both consistent and accurate without requiring dimensionality reduction.

Motivation from a Neurodevelopmental Perspective

An important marker in determining whether a preterm infant will experience long-term neurodevelopmental problems is how quickly the brain develops over the first few weeks after birth [66]. Diffusion MRI has been used to measure that development by comparing a scan taken early on to one taken at a later date [57, 150, 181]. One way that comparison could be performed is by registering the initial scan to the follow-up scan and looking at the image difference. The key computational challenge in such an analysis is the image registration: deforming one dMRI scan in a way that maximizes its similarity to another, stationary, dMRI scan. This step requires a way to measure similarity between two dMRI scans.

That similarity can be difficult to define between a preterm infant's initial and follow-up dMRI scans, as the preterm infant brain experiences rapid growth and maturation in the first few weeks of life [181]. An example of that rapid growth is shown in Figure 1.2. The slices shown are from two dMRI scans of the same infant taken 15 weeks apart, yet the brain in the latter scan is nearly double the size of the brain in the initial scan. Further, the brain structures develop dramatically over this time period, leading to changes in the diffusion measurements. An example can be seen in the cortex, where a decrease in diffusion anisotropy (shown as brightness in Figure 1.2) can be observed over time. Even so, major structures, like the corpus callosum (in red) and the corticospinal tracts (in blue), are clearly recognizable in both scans.

Figure 1.2: Slices of a preterm infant's brain shown in diffusion tensor MRI at post-menstrual ages (PMA) of (a) 29 weeks and (b) 44 weeks. Note the large change in both brain shape and size, leading to a more challenging image registration problem. Also note the decreased anisotropy (i.e., brightness) around the edge of the brain as the infant matures.

Mutual information is commonly used as an image registration similarity measure in situations like this, where both images contain similar structures whose appearance varies between the scans [145]. We would like to use mutual information for this dMRI registration task, but estimating mutual information for dMRI data requires being able to estimate entropy for high-dimensional, manifold-valued datasets.

High Level View of our Information Content Estimators

We propose novel information content estimators for diffusion MR images using binless approaches based on nearest-neighbour distances. By measuring these nearest-neighbour distances using existing dMRI distance metrics, we can generate entropy estimates that are mathematically consistent and respect the manifold of the dMRI models. Further, we are able to obtain such estimators without having to reduce the dimensionality of the dMRI data to the point where a binning estimator can be reliably used.

Our dMRI entropy estimator is derived from the nearest-neighbour entropy estimator first presented by Kozachenko and Leonenko [129]. That estimator can be described as

\[ H_{nn}(X) \approx \frac{1}{N} \sum_{j=1}^{N} \log_2(\eta_j) + f_{bias}(N) \tag{1.1} \]

where $N$ is the number of samples, and $\eta_j$ is the distance between sample $j$ and its nearest neighbour in the sample set. At a high level, the first term of the estimator is the average log nearest-neighbour distance between samples and captures how much these samples are spread out.
The other term, $f_{bias}(\cdot)$, then corrects for any bias in the estimate due to the size of the sample set. The error of this estimator is bounded by $|H_{nn}(X) - H(X)| \leq O(N^{-1/2})$, ensuring that as the number of samples increases, the estimator converges to the true value [129].

We propose to combine this entropy estimator with dMRI distance metrics to obtain an entropy estimator for dMRI datasets. The dMRI distance metrics will be used to measure the nearest-neighbour distances $\eta$ in a fashion that respects the underlying manifold for the given dMRI model. Such distance metrics exist for all common dMRI models, including diffusion tensors [13, 183, 225], spherical harmonic representations [69], and higher-order tensor representations [26].
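A nearest-neighbour entropy estimate of this kind can be sketched in a few lines. The code below is a minimal illustration, not the implementation released with this thesis. It uses the commonly quoted textbook form of the Kozachenko-Leonenko estimator (in nats, with the sample dimension d and the unit-ball volume written out explicitly), which may differ in detail from Equation (1.1). As one possible choice of tensor distance, it assumes the Log-Euclidean metric, under which diffusion tensors map to ordinary 6-vectors. The function names are hypothetical.

```python
import math
import numpy as np

def log_euclidean_vector(tensor):
    """Map a 3x3 symmetric positive-definite tensor to a 6-vector.

    Euclidean distances between these vectors equal Log-Euclidean distances
    between the tensors (an assumed choice of dMRI metric for this sketch).
    """
    evals, evecs = np.linalg.eigh(np.asarray(tensor, dtype=float))
    log_t = (evecs * np.log(evals)) @ evecs.T  # matrix logarithm
    s = math.sqrt(2.0)  # off-diagonal entries appear twice in the Frobenius norm
    return np.array([log_t[0, 0], log_t[1, 1], log_t[2, 2],
                     s * log_t[0, 1], s * log_t[0, 2], s * log_t[1, 2]])

def nn_entropy_nats(points):
    """Textbook Kozachenko-Leonenko entropy estimate for an (N, d) array."""
    x = np.asarray(points, dtype=float)
    n, d = x.shape
    sq = (x ** 2).sum(axis=1)
    # Pairwise squared Euclidean distances via the Gram matrix.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    np.fill_diagonal(d2, np.inf)       # a sample is not its own neighbour
    eta = np.sqrt(d2.min(axis=1))      # nearest-neighbour distances
    log_unit_ball = 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)
    return float(d * np.mean(np.log(eta)) + log_unit_ball
                 + math.log(n - 1) + np.euler_gamma)

# Sanity check on data with a known answer: a 2-D standard normal has
# differential entropy ln(2*pi*e) nats.
rng = np.random.default_rng(0)
estimate = nn_entropy_nats(rng.normal(size=(2000, 2)))
exact = math.log(2.0 * math.pi * math.e)
print(f"estimate {estimate:.3f} nats, exact {exact:.3f} nats")
```

To apply the sketch to tensor data, one would stack the vectors returned by log_euclidean_vector for each voxel into an (N, 6) array and pass that array to nn_entropy_nats. Mutual information between two sample sets can then be assembled from such estimates through the standard identity I(X; Y) = H(X) + H(Y) - H(X, Y).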

With the proposed entropy estimator, it becomes straightforward to estimate additional dMRI-specific information-theoretic measures, as shown in [46], including joint entropy, conditional entropy, mutual information, and variation of information. Our mutual information estimator in particular moves us one step closer to addressing the preterm dMRI registration problem described earlier.

Contributions and Publications

We published our dMRI entropy estimator, as well as its extensions to additional information-theoretic measures, in the following paper:

Brian G. Booth and Ghassan Hamarneh. Consistent Information Content Estimation for Diffusion Tensor MR Images, in Proceedings of 1st IEEE Conference on Healthcare Informatics, Imaging and Systems Biology (HISB), July 2011, pp. (32% Acceptance Rate, Best Paper Award).

As part of that paper, we showed that our estimators more accurately reflect the underlying dMRI data and provide faster convergence rates for image segmentation and registration algorithms. An extended version of this paper makes up the third chapter of this thesis. The source code for our dMRI information content estimator has also been made publicly available.

Our proposed dMRI entropy estimator is the first to compute entropy estimates on dMRI data without requiring a lossy dimensionality reduction preprocessing step.

Diffusion MRI Segmentation via Cross-sectional Piecewise Constancy

Segmentation has become an important component of medical image analysis, as it is a way of extracting anatomical structure from image data. The resulting structures can then be examined in terms of their shape, size, and appearance to obtain useful imaging biomarkers of clinical measures. One common assumption used in image segmentation algorithms is that the image can be approximated by a piecewise constant function, with each homogeneous region being an anatomical structure of interest [64, 74].
In dMRI, segmentation is often used to delineate axonal fiber tracts connecting functional brain regions [19]. The challenge in segmenting fiber tracts from dMRI is that the tracts often curve [61]. A common example of a curved fiber tract is the cingulum, shown in green and blue in Figure 1.3. As the diffusion measurements in dMRI are dependent on tract orientation, the resulting dMRI data is not homogeneous over the whole fiber tract. This violates the commonly used piecewise constancy assumption that many segmentation algorithms make. In order to handle dMRI segmentation tasks, we require the ability to (a) use a more complex model of the image, and/or (b) preprocess the dMRI scan so that the piecewise constancy assumption can be applied effectively.

Figure 1.3: An example of the cingulum fiber tract as highlighted on the dMRI scan of a healthy adult. Note that the cingulum forms a semicircular shape and, given that diffusion is strongest along the tract, the diffusion measurements along this tract are not homogeneous. From Nand et al. [160].

Motivation from a Neurodevelopmental Perspective

A common task in preterm infant dMRI analysis is to perform statistical analysis of diffusion values within a fiber tract or region of interest [1]. Currently, the most popular way of identifying the voxels that make up a region of interest is to identify them manually [66]. Segmentation can be used to extract the boundaries of those regions, but it remains a challenge to do so accurately with the low quality and resolution of dMRI scans of preterm infants [76].

An additional concern with preterm infant dMRI is the fact that the brain is in the process of developing [210], and neighbouring brain structures that are easily distinguishable on an adult dMRI scan can have similar diffusion measurements at this younger age [191]. As a result, neighbouring regions of interest can have low inter-region variability in their diffusion values (i.e., low contrast), making them more difficult to segment using traditional segmentation algorithms. At the same time, the orientation dependency of the diffusion can lead to high intra-region variability within a curved fiber tract [166]. In order to be successful, a preterm infant dMRI segmentation algorithm would have to (a) increase inter-region variability and/or (b) decrease intra-region variability.
High Level View of the Algorithm

To overcome the limitations of the piecewise constant image model, and to decrease intra-region variability due to curving fiber tracts, we propose a cross-sectional piecewise constant model for highly curved fiber tracts in diffusion MRI scans. The cross-sectional piecewise constant model captures the idea that 2D cross-sections normal to the medial curve of a 3D structure can be modelled by a piecewise constant function in regions local to that structure. Under this image model, we are able to use the medial curve as an intrinsic frame of reference for the tract and homogenize the tract's diffusion data through the use of that curve. We apply this concept to preprocess dMRI scans for segmentation using the workflow in Figure 1.4.

Figure 1.4: The segmentation workflow of our algorithm based on our cross-sectional piecewise constant image model. For further details, please refer to the text below.

As an initial step, our algorithm employs a tractography technique [235] to generate an anchor curve, which is used as an approximation of the medial curve of the fiber tract. The anchor curve is then used to define the curve's intrinsic Frenet frame and generate the cross-sections of the fiber tract. For each cross-section, we use dMRI distance metrics [13, 26, 69] to measure dissimilarities between all points on the cross-sectional plane and the intersection point between the plane and the anchor curve. As our anchor curve is intrinsic to the tract, its intersection with the cross-section serves as a local representative for the tract, and this comparison step effectively homogenizes the diffusion measurements within the tract. Once computed, these diffusion dissimilarities are then interpolated back into a 3D scalar image using a scattered data interpolation technique [87], resulting in a scalar image that fits a piecewise constant image model. Finally, a segmentation algorithm based on the piecewise constant image assumption is used to segment the scalar 3D dissimilarity image to obtain the final segmentation [54].

As shown in Figure 1.4, the cross-sections of the tract can appear very different from each other, but the resulting diffusion dissimilarities on each cross-section will be lower inside the tract than outside it. By measuring dissimilarities on each cross-section independently, we are able to eliminate intra-region variability due to tract curvature and obtain a 3D dissimilarity image that can be well modelled by a piecewise constant function.
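The geometric core of this workflow (Frenet frames along an anchor curve, cross-sectional sampling planes, and per-plane dissimilarities) can be sketched as follows. This is a simplified illustration under several assumptions, not the released implementation: the anchor curve is a densely sampled polyline with non-zero curvature, finite differences stand in for analytic derivatives, and sample_fn and dist_fn are hypothetical callables standing in for the dMRI interpolator and the dMRI distance metric. The tractography, scattered data interpolation, and final segmentation steps are omitted.

```python
import numpy as np

def frenet_frames(curve):
    """Discrete tangent, normal, and binormal vectors along an (M, 3) polyline.

    Assumes the curve is sampled densely and bends everywhere; on straight
    segments the normal is undefined and this sketch would divide by zero.
    """
    pts = np.asarray(curve, dtype=float)
    tangent = np.gradient(pts, axis=0)
    tangent /= np.linalg.norm(tangent, axis=1, keepdims=True)
    dt = np.gradient(tangent, axis=0)
    # Keep only the component of the tangent's derivative orthogonal to it.
    dt -= (dt * tangent).sum(axis=1, keepdims=True) * tangent
    normal = dt / np.linalg.norm(dt, axis=1, keepdims=True)
    binormal = np.cross(tangent, normal)
    return tangent, normal, binormal

def cross_section_points(center, normal, binormal, half_width, step):
    """Grid of 3D points on the plane through `center` spanned by normal/binormal."""
    offsets = np.arange(-half_width, half_width + step / 2, step)
    u, v = np.meshgrid(offsets, offsets, indexing="ij")
    return center + u[..., None] * normal + v[..., None] * binormal

def cross_section_dissimilarity(sample_fn, dist_fn, center, plane_pts):
    """Dissimilarity of every plane point to the value at the anchor point."""
    reference = sample_fn(center)
    flat = plane_pts.reshape(-1, 3)
    values = np.array([dist_fn(sample_fn(p), reference) for p in flat])
    return values.reshape(plane_pts.shape[:-1])

# Toy anchor curve: a semicircle, loosely resembling the cingulum's shape.
theta = np.linspace(0.0, np.pi, 41)
anchor = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
t, n, b = frenet_frames(anchor)

i = 20  # a point in the middle of the curve
plane = cross_section_points(anchor[i], n[i], b[i], half_width=1.0, step=0.5)

# Stand-ins for the image and the metric: the "measurement" at a point is its
# own position, and the distance is the Euclidean norm.
dissim = cross_section_dissimilarity(lambda p: p,
                                     lambda a, r: float(np.linalg.norm(a - r)),
                                     anchor[i], plane)
print(plane.shape, dissim[2, 2])
```

Because each plane is compared only against its own anchor point, the orientation of the tract at that location drops out of the comparison, which is the property the cross-sectional model relies on.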

Contributions and Publications

We published our dMRI image model and segmentation technique in the following venue:

Brian G. Booth and Ghassan Hamarneh. A Cross-sectional Piecewise Constant Model for Segmenting Highly Curved Fiber Tracts in Diffusion MR Images, in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), September 2013, pp. (32% Acceptance Rate).

As part of that publication, we showed that our proposed segmentation algorithm produced more accurate segmentation results for the cingulum (see Figure 1.3) than five different state-of-the-art dMRI segmentation algorithms [19, 74, 84, 139, 166]. An extended version of this paper makes up the fourth chapter of this thesis. The source code for our dMRI segmentation algorithm has also been made publicly available.

Our proposed technique is the first to introduce the concept of a cross-sectional piecewise constant image model. Further, it is the first to employ such an image model in a dMRI segmentation algorithm.

Global, Competitive, Closed-Form Probabilistic Tractography

The innovation of dMRI has uniquely led to the ability to perform tractography: the delineation of neural connections in the brain. As diffusion is strongest along the fiber tracts that make up neural pathways, the directions of maximal diffusion at each voxel can be linked up to reconstruct the fiber tracts, thereby mapping out connectivity in the brain [30].

The tractography problem can be complicated by many factors. Image noise and aliasing pose major challenges, potentially leading tractography algorithms off the path of the underlying neural pathways, resulting in a phenomenon known as tract jumping [156]. An example of tract jumping is highlighted by a purple arrow in Figure 1.5(a). Fitting an overly simplistic model (like the diffusion tensor) to a voxel's diffusion measurements is also a common source of tractography errors [71].
While more recent work in tractography, particularly with probabilistic [88] and minimal path [235] tractography algorithms, attempts to improve confidence and accuracy in tractography results, the same tract jumping errors can still occur, as can be seen in Figure 1.5(b). These errors can be reduced if we address two aspects of tractography algorithms that limit their effectiveness: the independent delineation of each fiber tract, and the reliance on purely local image cues to guide tractography algorithms.

Figure 1.5: Examples of tract jumping on the same dMRI scan using both (a) traditional streamline tractography and (b) more recent probabilistic tractography. Note that the tract identified by the purple arrow is not anatomically correct, yet these tractography algorithms erroneously identify this as a valid anatomical connection in the brain.

Motivation from a Neurodevelopmental Perspective

Tractography is being used increasingly often in preterm dMRI analysis, both for segmenting fiber tracts [1] and for connectome analysis [57] (a connectome is a concise mapping of the neural pathways throughout the full brain; see [195] for further details). The accuracy of both analysis tasks is limited by the accuracy of the tractography itself. The tractography problem is further complicated in preterm infant dMRI as (a) the images are generally of lower quality than those obtained from adult subjects, (b) the brain's fiber tracts are in an early state of development and are more difficult to identify, and (c) there is greater aliasing in the dMRI due to the limited image resolution and smaller brain size. All three of these complications can introduce tract jumping errors similar to, or worse than, those shown in Figure 1.5. A more robust tractography algorithm would be desirable to reduce the number of erroneous connections identified, leading to a more accurate tractography-based analysis of the preterm infant brain.

High Level View of the Algorithm

To improve the robustness of tractography algorithms, we must address the aspects of tractography algorithms that limit their effectiveness: locally driven decision making and independent tract formation. With these characteristics in mind, we propose a novel approach to tractography by incorporating the notion of competition. At a high level, we propose a tractography algorithm where tracts compete for space within the brain.

Our proposed algorithm is based on the concept of a random walk on the image grid (see Figure 1.6(a)), with the steps of the random walker being driven by the dMRI data. In a fashion similar to that proposed in [108], we assign a probability to each random walker step that is proportional to the local rate of diffusion in the step direction (see Figure 1.6(b)). That way, the random walker's trajectory aligns with the underlying neural pathways.

Figure 1.6: Our proposed tractography algorithm is based on a random walk over the image grid where the grid's edge weights are computed from the dMRI data; the more diffusion we measure along an edge direction, the more likely the random walker is to select that edge. (a) A random walk on an image grid. The trajectory of the walker is shown in green. (b) The grid's edge weights are computed from the diffusion model.

With the image grid defined and weighted by the dMRI data, we introduce competition by defining multiple target regions at which the random walker will terminate. Random walkers are then initialized at each voxel, and we compute the probability of a walker reaching a particular target region first. By introducing this temporal aspect to the tractography problem, the target regions end up competing to be the terminus for a random walker and, in doing so, reduce the likelihood of discovering meandering, erroneous connections like those identified in Figure 1.5. Our resulting random walk problem is a special case of the random walker problem described in [95] and can be solved using the algorithm presented therein.
As our proposed algorithm is a special case of the more general random walker, we inherit additional beneficial properties of that algorithm, in particular the ability to obtain a deterministic, closed-form solution for the connection probabilities and the ability to incorporate global image cues. We hypothesize that these benefits, combined with the competition we introduce, will reduce the frequency of anatomically erroneous connections being discovered.
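The closed-form flavour of this formulation can be seen on a toy graph. The sketch below is an illustration under stated assumptions rather than the thesis code: it solves the standard combinatorial Dirichlet problem for first-arrival probabilities on a small weighted graph using a dense linear solve, and it weights edges by the diffusivity e^T D e of a diffusion tensor D along the unit edge direction e, which is a simpler choice than the analytically integrated weights used in this thesis. Function names are hypothetical.

```python
import numpy as np

def directional_diffusivity(tensor, direction):
    """Diffusivity e^T D e of tensor D along the unit vector e (assumed weighting)."""
    e = np.asarray(direction, dtype=float)
    e = e / np.linalg.norm(e)
    return float(e @ np.asarray(tensor, dtype=float) @ e)

def first_arrival_probabilities(weights, targets):
    """Probability that a walker started at each node reaches each target first.

    weights: (n, n) symmetric non-negative edge-weight matrix.
    targets: list of node-index lists, one list per competing target region.
    Returns an (n, K) array whose rows sum to one.
    """
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    laplacian = np.diag(w.sum(axis=1)) - w
    seeded = np.concatenate([np.asarray(t, dtype=int) for t in targets])
    free = np.setdiff1d(np.arange(n), seeded)
    # Indicator matrix: each seeded node belongs to exactly one target region.
    labels = np.zeros((seeded.size, len(targets)))
    row = 0
    for k, region in enumerate(targets):
        labels[row:row + len(region), k] = 1.0
        row += len(region)
    l_free = laplacian[np.ix_(free, free)]
    l_cross = laplacian[np.ix_(free, seeded)]
    probs = np.zeros((n, len(targets)))
    probs[free] = np.linalg.solve(l_free, -l_cross @ labels)
    probs[seeded] = labels
    return probs

# Toy example: five voxels in a row along x, each holding the same tensor.
tensor = np.diag([3.0, 1.0, 1.0])
w_edge = directional_diffusivity(tensor, [1.0, 0.0, 0.0])
weights = np.zeros((5, 5))
for a in range(4):
    weights[a, a + 1] = weights[a + 1, a] = w_edge

# Two competing target regions at the two ends of the row.
probs = first_arrival_probabilities(weights, [[0], [4]])
print(probs)
```

The probabilities come from a single linear solve rather than from sampling many random walks, so the result is deterministic. For a full 3D image grid a sparse solver would be the natural replacement for the dense solve used here.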

Contributions and Publications

We have published the following two papers related to this tractography algorithm:

Brian G. Booth and Ghassan Hamarneh. Exact Integration of Diffusion Orientation Distribution Functions for Graph-based Diffusion MRI Analysis, in Proceedings of 8th IEEE International Symposium on Biomedical Imaging (ISBI), March 2011, pp. (Podium Presentation).

Brian G. Booth and Ghassan Hamarneh. Multi-region Competitive Tractography via Graph-based Random Walks, in Proceedings of 11th IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA), January 2012, pp. (Podium Presentation).

The first paper introduced the only technique to date that analytically computes the random walker transition weights in Figure 1.6(b) from dMRI data (instead of approximating them using a numerical technique) [47]. By computing these weights analytically, we remove a source of error from the tractography algorithm and increase the speed at which the graph-based dMRI representation can be built. The second paper introduced the full tractography algorithm and showed its ability to reduce the number of erroneous connections obtained on both synthetic and real dMRI scans [48].

The proposed tractography algorithm is the first to: introduce the notion of tracts competing for space as a way of reducing the number of erroneous connections being detected; obtain a closed-form solution to the problem of probabilistic tractography; and provide the ability to introduce global image cues into the tractography problem.

These two papers make up the fifth chapter of this thesis. We have also made the source code from these two publications publicly available.
The graph construction code and the tractography code are each available online.

STEAM - Statistical Template Estimation for Abnormality Mapping

While the contributions in this thesis all relate to dMRI analysis, the previous contributions stop short of answering clinical questions with their analysis results. For example, our other contributions have not addressed the question of whether the dMRI scan, or a measurement taken from that scan, falls within the range associated with normal neurodevelopment. This question is one of the most important in medical image analysis, and answering it often relies on the outcome of a statistical test (e.g., t-test, analysis of covariance) [147].

To date, various statistical analysis techniques have been proposed for or applied to dMRI scans. Examples range from region of interest (ROI) analysis, which tests for statistical differences in a few small ROIs [66], to connectome analysis, which tests for statistical differences across the whole brain [57]. One particular technique of note is tract-based spatial statistics (TBSS): a dMRI-specific statistical analysis technique that first identifies the major fiber tracts, then performs voxel-by-voxel statistical tests along those tracts to identify abnormalities [203].

While these statistical analysis techniques have been widely applied to identify group differences between a normative population and a clinically adverse group [2, 70, 191], they stop short of applying those group-level conclusions to individual cases. As a result, there is still the open question of, "Given the dMRI scan of a single subject, what cues can we extract from that one scan to gauge whether the structural properties of that subject's brain fall within a normal range?"

Motivation from a Neurodevelopmental Perspective

Preterm infants are at high risk of adverse neurodevelopmental outcomes [43], and it is believed that these poor neurological outcomes are a result of white matter injuries acquired around the period of the infant's birth [20]. The earlier these abnormalities are detected, the sooner clinicians can intervene and either improve a preterm infant's neurodevelopmental health, or set up appropriate rehabilitation care to aid in that child's growth and maturation.

Existing preterm infant dMRI statistical analyses, including those using the techniques identified in the previous section, have all been group-based studies [2, 66, 70, 191].
While those studies provide a broad, population-based view of dMRI abnormalities, they do not provide us with the ability to analyze a single dMRI scan and make a determination as to whether a pattern of brain abnormality is present for that specific infant.

High Level View of our Statistical Technique

We propose to generate a subject-specific dMRI analysis technique which can flag brain regions that have abnormal dMRI measurements indicative of future neurodevelopmental delay. We refer to this technique as STEAM: Statistical Template Estimation for Abnormality Mapping.

The STEAM technique consists of two parts. First, we generate a collection of 3D statistical template images that capture, at the scale of each individual voxel, the distribution of diffusion measurements for a group of preterm infants with normal developmental outcomes. This template collection acts as our normative statistical model, against which new scans can be compared in order to identify abnormalities.

Figure 1.7: Flow diagram for the statistical template estimation procedure first presented in [96]. Diffusion MRI scans from normative subjects are aligned to a target image (yellow) and Gaussian distributions are fit to the measurements at each voxel (red), resulting in a statistical template for the normative population. The image transformations from the alignment step are then inverted, averaged, and applied to the statistical template images (blue) to transform them to an unbiased shape and size. Multiple iterations can then be performed to reduce image registration error.

Figure 1.8: The proposed analysis pipeline for a new dMRI scan. The new scan is aligned to the STEAM statistical template, then values at each voxel are compared to the Gaussian distributions in the template. The voxels whose tensors are significantly different from their corresponding Gaussian distribution are identified and visualized.
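The two stages pictured in Figures 1.7 and 1.8 can be sketched on synthetic data as follows. This is a deliberately reduced illustration, not the STEAM implementation: it assumes all images are already aligned (the registration and unbiasing steps are omitted), works on a single scalar map per subject (for example FA) rather than on full tensors, uses a plain two-sided z-test against the voxel-wise Gaussian, and applies no multiple-comparison correction. The function names and the synthetic values are hypothetical.

```python
import math
import numpy as np

def estimate_template(aligned_maps):
    """Voxel-wise mean and standard deviation over aligned normative scans.

    aligned_maps: array of shape (subjects, x, y, z).
    """
    stack = np.asarray(aligned_maps, dtype=float)
    return stack.mean(axis=0), stack.std(axis=0, ddof=1)

def abnormality_map(scan, mean, std, alpha=0.05):
    """Flag voxels whose value is unlikely under the voxel-wise Gaussian.

    Returns a boolean map and the z-scores. The p-value is the two-sided
    tail probability of a standard normal, erfc(|z| / sqrt(2)).
    """
    z = (np.asarray(scan, dtype=float) - mean) / std
    p = np.vectorize(math.erfc)(np.abs(z) / math.sqrt(2.0))
    return p < alpha, z

# Synthetic normative population: 40 "subjects" on a 4x4x4 grid.
rng = np.random.default_rng(0)
normative = rng.normal(loc=0.4, scale=0.05, size=(40, 4, 4, 4))
template_mean, template_std = estimate_template(normative)

# A new scan that matches the template everywhere except one voxel.
new_scan = template_mean.copy()
new_scan[1, 2, 3] += 10.0 * template_std[1, 2, 3]

flagged, z_scores = abnormality_map(new_scan, template_mean, template_std)
print(int(flagged.sum()), "voxel(s) flagged;", np.argwhere(flagged))
```

With tens of thousands of voxels tested at once, a real analysis would need to control for multiple comparisons; the sketch leaves that step out to keep the template-then-test structure visible.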

An overview of the statistical template estimation procedure is shown in Figure 1.7. The technique is an extension of the atlas-creation pipeline presented in [96], where MRI scans are aligned and averaged to create an unbiased mean MRI of the population. We propose to extend this technique to dMRI, and to compute not just an unbiased mean image but also an unbiased standard deviation image. These contributions will allow us to model, at each voxel, a full Gaussian distribution over the dMRI data. Second, we use these statistical templates to analyze a new dMRI scan. We do so by first aligning the new dMRI scan to the mean image of the template, then applying single-sample statistical tests at each voxel to identify abnormalities. An overview of this process is given in Figure 1.8. The result of this analysis pipeline is a spatial abnormality map: a map that identifies all voxels that are significantly different from the normative population modelled by the statistical template.

Contributions and Publications

Our proposed STEAM dMRI analysis technique is the first to provide a patient-specific mapping of where a patient's dMRI scan differs significantly from the normative population. Further, STEAM is able to display those differences over the whole brain and at the level of individual voxels, two benefits that most group-based dMRI statistical analysis techniques have yet to claim. STEAM was recently published in the following journal:

Brian G. Booth, Steven P. Miller, Colin J. Brown, Kenneth J. Poskitt, Vann Chau, Ruth E. Grunau, Anne R. Synnes, and Ghassan Hamarneh. STEAM - Statistical Template Estimation for Abnormality Mapping: a Personalized DTI Analysis Technique with Applications to the Screening of Preterm Infants, NeuroImage, Vol. 125, 2016.

A website for the project is online and contains both the source code for STEAM and our generated statistical templates.
As part of the submitted manuscript, we showed that a relationship exists between the volume of STEAM-detected abnormalities and neurodevelopmental outcomes measured 18 months after the dMRI scan was acquired. Further, we showed that infants with similar adverse outcomes can show different patterns of abnormality, a conclusion that is widely believed [20] but that group-based studies do not have the power to show. This paper makes up the sixth chapter of this thesis.

Reproducible Research and Published Software

As part of this thesis research, we made the software we created publicly available, and we used publicly available datasets when possible. In doing so, we ensure that our research can be reproduced and expanded on by other groups. As part of this thesis, we produced the following software packages:

DT-ICE: Diffusion Tensor Information Content Estimators [46]. DT-ICE is a MATLAB tool to generate information content estimates from tensor-valued data. Estimators are provided for entropy, joint entropy, conditional entropy, mutual information, and variation of information. Both Shannon-based and Rényi-based information content estimators are implemented. DT-ICE is available online.

Frenet-Frame DTI Segmentation: Diffusion tensor image segmentation using the cross-sectional piecewise constant image model defined by the tract's Frenet frame [49]. The DTI Frenet Segmentation package is implemented in MATLAB and provides code to generate the anchor tract, create the cross-sectional planes, employ the cross-sectional piecewise constant image model, and perform the segmentation. DTI Frenet Segmentation is available online.

dMRI Graph Embedding Toolbox: This diffusion MRI graph-embedding toolbox is a MATLAB tool that takes a field of diffusion orientation distribution functions (ODFs) and maps them to a graph representation where each node is a voxel in the dMRI scan and graph edges are weighted by the ODF values [47]. The resulting graphs are commonly used in graph-based tractography algorithms. The dMRI Graph Embedding Toolbox is available online.

Dijkstra-Tract: Minimal Path Tractography [47]. This minimal path tractography toolbox is a MATLAB tool that takes a dMRI scan and generates a graph representation where each node is a voxel in the dMRI scan and graph edges are weighted by the diffusion measurements. This graph is then combined with Dijkstra's algorithm to perform minimal path tractography [235]. Dijkstra-Tract is available online.

WalkTract: Random Walker Tractography [48]. WalkTract is a MATLAB tool for performing competitive, multi-region, probabilistic tractography. The tractography problem is modelled as a random walk problem and the resulting connection probabilities are computed quickly and analytically. WalkTract is available online.

STEAM: Statistical Template Estimation for Abnormality Mapping [53]. STEAM is a whole brain voxel-based analysis engine for the examination of diffusion tensor images (DTIs) of the developing preterm infant brain. Key to our STEAM analysis engine is a collection of 3D statistical DTI templates that represent a normal control population. These templates are used as a statistical model that new scans can be compared to in order to highlight regions of abnormality. STEAM is available online.

These software packages are a subset of the published software from my doctoral work. Additional software packages developed (or co-developed) during my doctoral research include:

DT-STRUCT (co-developed with K. Krishna Nand): DTI Structure Detectors [160]. DT-STRUCT is a MATLAB toolbox that can highlight structures in diffusion tensor images. In particular, DT-STRUCT can highlight corners, tubular structures, and sheet-like structures. Each structure detector can be run at multiple spatial scales. DT-STRUCT is available online.

DeformIt, Version 2.0: Image Data Augmentation Tool [51]. DeformIt simulates novel images with ground truth segmentations from a single image-segmentation pair. Version 2.0 includes support for scalar, vector, and tensor-valued 2D and 3D images. The software package is available online.

Bilateral Filtering of Diffusion Tensor MRI (co-developed with Judith Hradsky): The scalar version of bilateral image filtering is extended to perform edge-preserving smoothing of DT field data. The bilateral DT filtering is performed in the Log-Euclidean framework, which guarantees valid output tensors. The software is available online.

View3D (co-developed with Hossein Badakhshannoory): MATLAB viewer for 3D scalar, vector, and tensor-valued medical images. This software package is available online.

Full Bibliography of Doctoral Work

This thesis is based on the following published work (publications are listed in order of appearance in this thesis):

Brian G. Booth and Ghassan Hamarneh. Diffusion MRI for Brain Connectivity Mapping and Analysis (Chapter 7), in Angshul Majumdar and Rabab Kreidieh Ward (Eds.), MRI: Physics, Image Reconstruction, and Analysis, CRC Press, 2015.

Brian G. Booth and Ghassan Hamarneh. Brain Connectivity Mapping and Analysis using Diffusion MRI (Chapter 19), in Troy Farncombe and Krzysztof Iniewski (Eds.), Medical Imaging: Technology and Applications, CRC Press, 2013.

Brian G. Booth and Ghassan Hamarneh. Consistent Information Content Estimation for Diffusion Tensor MR Images, in Proceedings of 1st IEEE Conference on Healthcare Informatics, Imaging and Systems Biology (HISB), July 2011.

Brian G. Booth and Ghassan Hamarneh. A Cross-sectional Piecewise Constant Model for Segmenting Highly Curved Fiber Tracts in Diffusion MR Images, in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), September 2013.

Brian G. Booth and Ghassan Hamarneh. Exact Integration of Diffusion Orientation Distribution Functions for Graph-based Diffusion MRI Analysis, in Proceedings of 8th IEEE International Symposium on Biomedical Imaging (ISBI), March 2011.

Brian G. Booth and Ghassan Hamarneh. Multi-region Competitive Tractography via Graph-based Random Walks, in Proceedings of 11th IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA), January 2012.

Brian G. Booth, Steven P. Miller, Colin J. Brown, Kenneth J. Poskitt, Vann Chau, Ruth E. Grunau, Anne R. Synnes, and Ghassan Hamarneh. STEAM - Statistical Template Estimation for Abnormality Mapping: a Personalized DTI Analysis Technique with Applications to the Screening of Preterm Infants, NeuroImage, Vol. 125, 2016.

The publications above make up a subset of my doctoral research. Additional contributions to the field of diffusion MRI were also published in the following papers:

Colin J. Brown, Steven P. Miller, Brian G. Booth, Shawn Andrews, Vann Chau, Kenneth J. Poskitt, and Ghassan Hamarneh. Structural network analysis of brain development in young preterm neonates, NeuroImage, Vol. 101, 2014.

Colin J. Brown, Steven P. Miller, Brian G. Booth, Kenneth J. Poskitt, Vann Chau, Anne R. Synnes, Jill G. Zwicker, Ruth E. Grunau, and Ghassan Hamarneh. Prediction of Motor Function in Very Preterm Infants using Connectome Features and Local Synthetic Instances, in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), October 2015.

Brian G. Booth and Ghassan Hamarneh. DTI-DeformIt: Generating Ground-Truth Validation Data for Diffusion Tensor Images, in Proceedings of 11th IEEE International Symposium on Biomedical Imaging (ISBI), May 2014.

Colin J. Brown, Brian G. Booth, and Ghassan Hamarneh. Uncertainty in Tractography via Tract Confidence Regions, in Proceedings of MICCAI Workshop on Computational Diffusion MRI (CDMRI), September 2013.

Colin J. Brown, Brian G. Booth, and Ghassan Hamarneh. K-Confidence: Assessing Uncertainty in Tractography using k Optimal Paths, in Proceedings of 10th IEEE International Symposium on Biomedical Imaging (ISBI), April 2013.

K. Krishna Nand, Rafeef Abugharbieh, Brian G. Booth, and Ghassan Hamarneh. Detecting Structure in Diffusion Tensor MR Images, in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), September 2011.

Thesis Outline

In this thesis, we present four main contributions, each published or in the process of being published, that address dMRI analysis limitations exacerbated when imaging preterm infants:

We present information content estimators for dMRI scans that maintain consistent and accurate behaviour in the presence of their high-dimensional, manifold-valued data [46]. Our proposed mutual information estimator may improve preterm dMRI image registration, particularly for registering initial and follow-up dMRI scans whose appearance can vary dramatically.

We present a cross-sectional piecewise constant image model, along with a corresponding segmentation algorithm, that will allow us to accurately model and segment curved fiber tracts in dMRI scans [49]. We believe these contributions will aid in the segmentation of fiber tracts in preterm infant dMRI, as the tracts in the preterm brain are still developing and are difficult to distinguish from surrounding brain regions. Our proposed algorithm isolates and discards diffusion differences due to tract curvature, allowing us to emphasize tissue differences in the segmentation process.

We present a tractography algorithm that reduces tract jumping by introducing a competition for space between multiple targeted fiber bundles. Our proposed algorithm also provides an analytical solution that removes approximation error in the computation of tract probabilities, while also giving the ability to include non-local image cues to guide the tractography process [47, 48]. These contributions should benefit the application of tractography to the preterm infant brain, as the lower dMRI scan quality, the smaller brain size, and the fact that the fiber tracts are still developing all increase the potential for tract jumping.

We present STEAM: Statistical Template Estimation for Abnormality Mapping [53]. The STEAM technique is the first to provide a patient-specific map of where the patient's dMRI scan differs significantly from the normative population. STEAM has the potential to identify, at an early stage, brain abnormalities in preterm infants that can affect their long-term neurodevelopment.

These four contributions make up Chapters 3-6 of the thesis, following this introductory chapter and a chapter reviewing the fundamentals of diffusion MRI (based on [50, 52]). A seventh and final chapter will conclude this thesis and discuss future work. With these four contributions, we address computational challenges in the areas of image segmentation, registration, tractography, and statistical analysis. We further address challenges unique to preterm infant dMRI analysis, particularly the greater aliasing due to smaller brain size, and the fact that certain brain structures in the preterm infant are not yet fully formed.

Chapter 2

Fundamentals of Diffusion MRI Acquisition and Modelling

This chapter is based on our work in [50, 52].

2.1 Introduction

Diffusion MRI (dMRI) is a powerful imaging protocol that allows for the assessment of the organization and integrity of fibrous tissue. The imaging works by measuring the diffusion of water molecules within the body. This diffusion is restricted by cell membranes and, as a result, rates of diffusion are far lower across tissue fibers than parallel to them. With enough diffusion measurements along different directions in 3D, we can non-invasively obtain a full model of the diffusion at different points within an imaged subject. The modeled dMRI scan is then used for later visualization, processing, and analysis tasks. This thesis will present various dMRI processing and analysis algorithms, and their details will depend on how dMRI scans are acquired and how the diffusion measurements are modeled. As such, we present here an examination of the current state of dMRI acquisition and modeling techniques.

Biological Basis for Diffusion MRI

The biological basis for diffusion MRI dates back to 1828, when botanist Robert Brown noticed the continuous and random motion of pollen grains suspended in water [59]. What Robert Brown had discovered was later determined to be the motion of water molecules due to thermal agitation [119]. This motion, now known as Brownian motion or diffusion, was later characterized by Albert Einstein [79], resulting in Einstein's equation:

$$\langle r^2 \rangle = 6dt. \qquad (2.1)$$


Einstein's equation states that the mean squared displacement of molecules ($\langle r^2 \rangle$) with a given diffusion rate ($d$) is proportional to the observation time ($t$). If we can measure this molecular displacement over a fixed time, we can obtain the diffusion rate of different substances under different conditions.

As the majority of the human body is water [97], the diffusion phenomenon occurs within us as well. While the diffusion process is random, our cell structures can restrict or hinder the motion of water molecules [36]. As such, the diffusion of water molecules in our body depends on the microstructure of our tissues. Fast molecular diffusion occurs within and around a cell, as there are few microstructures to inhibit motion. Diffusion through the cell, however, is slower, as the cell membrane and other structures (e.g., myelin sheaths in the brain's white matter) restrict molecular motion.

Since the diffusion of water within the body is dependent on local cell structure, we can discuss how different organizations of these structures affect diffusion rates. Consider, for example, the human brain, where functional regions (gray matter) are connected by a collection of neural pathways (white matter). Figure 2.1 presents diffusion measures for the brain's cerebrospinal fluid (CSF), gray matter, and white matter respectively. When the cell structure is minimal, as in CSF, we see fast isotropic diffusion. More complex cell structure that is not consistently organized, such as gray matter, shows slower, but still isotropic, diffusion. Yet if the local cell structure is organized along a consistent orientation, as it is in white matter, the diffusion rates become anisotropic, i.e., they vary with respect to direction [36].

Figure 2.1: Synthetic examples of the diffusion seen in cerebrospinal fluid (left), gray matter (middle), and white matter (right) within the brain. The diffusion rates for various directions are shown in red. Adapted from [7].
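Returning to Einstein's equation (2.1), the relation can be rearranged to recover a diffusion rate from a measured displacement, or to predict a typical displacement from a known diffusion rate. The sketch below does both. The function names are hypothetical, and the numerical values (a diffusion rate of roughly 3.0e-3 mm^2/s, of the order of free water near body temperature, and a 50 ms observation time) are illustrative assumptions rather than values from the text.

```python
import math


def diffusion_rate(mean_squared_displacement, observation_time):
    """Invert <r^2> = 6 d t for the diffusion rate d."""
    if observation_time <= 0:
        raise ValueError("observation time must be positive")
    return mean_squared_displacement / (6.0 * observation_time)


def rms_displacement(d, observation_time):
    """Root-mean-square displacement sqrt(6 d t) in three dimensions."""
    return math.sqrt(6.0 * d * observation_time)


# Illustrative (assumed) values: d in mm^2/s, t in seconds.
d = 3.0e-3
t = 0.05
r_rms = rms_displacement(d, t)          # typical displacement in mm
d_back = diffusion_rate(r_rms ** 2, t)  # recovers the diffusion rate
print(r_rms, d_back)
```

Under these assumed values the typical displacement is on the order of tens of micrometres, which is comparable to cellular dimensions and is one way to see why tissue microstructure can influence the measured diffusion.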
These diffusion differences within the brain are potentially useful cues in analyzing brain structure and function. For example, measuring the average diffusion rate or the anisotropy of a tissue can give us significant information about the tissue's organization and integrity [167]. Diffusion measurements would be most informative in white matter regions, where the orientation of the microstructure can be inferred from the diffusion. This microstructural orientation within the brain's white matter is, in turn, known to describe the direction of neural pathways in the brain [37]. As a result, by measuring the diffusion using Einstein's equation, we could infer the orientational structure of the brain's white matter and, ultimately, map out the brain's neural pathways. This is precisely what diffusion MRI is used to accomplish.

2.3 Diffusion Weighted Image Acquisition

To understand how diffusion can be measured through magnetic resonance imaging, we must first address the basic concepts of nuclear magnetic resonance. The most fundamental of those concepts is the physical property of spin ($s$) that all elementary particles possess. An example of a particle's spin is shown in Figure 2.2. The spin property rotates the particle around a rotational axis, thereby giving the particle a magnetic moment ($m$). This magnetic moment can then be manipulated using magnetic fields like those generated by an MRI scanner. As the body is mostly water, the spins of hydrogen atoms within water molecules become a good candidate for MR imaging.

Figure 2.2: Nuclear spin $s$ generating a magnetic moment $m$. The particle spins around a rotational axis shown here in gray. Adapted from [138].

Magnetic Resonance Imaging

Magnetic resonance imaging manipulates the magnetic moments of hydrogen atoms in a specific pattern in order to generate an image of tissue measurements. This pattern consists of three principal steps: precession, resonance, and relaxation. We consider each in turn.

Precession: A static magnetic field $B_0$ is applied to the body. This magnetic field aligns the rotational axis of each spin with its field direction. These spins now rotate (i.e., precess) around the same magnetic axis. Note that roughly as many spins will be aligned with the positive direction of the magnetic axis as with the negative direction, so the overall signal generated during precession will be negligible.

Figure 2.3: The Stejskal-Tanner diffusion weighted imaging sequence. (Diagram labels: 90° pulse, 180° pulse, signal, g, δ, Δ.) Adapted from [230].

Resonance: With the magnetic field $B_0$ in place, a second, weaker, magnetic pulse is applied to the body in the direction g. This second field results in the magnetic moment $m$ of each spin aligning with the pulse direction g. The spin's axis of rotation remains aligned with $B_0$. The result of the resonance phase is to cause the net magnetism of the spin to veer away from the main magnetic field $B_0$.

Relaxation: The second magnetic pulse is removed and the magnetic moments of the hydrogen atoms realign with $B_0$. As this realignment occurs, the changing magnetic field generated by the realignment of the spins induces a current in the coil of the MRI scanner. From this current, two common measurements can be taken:

1. Spin-spin relaxation time ($T_2$): The amount of time it takes for the magnetism in the direction of g to reduce to 37% of its maximum value.

2. Spin-lattice relaxation time ($T_1$): The amount of time it takes for the magnetism in the direction of $B_0$ to recover 63% of the magnetism it lost when the second pulse was applied in the direction g.

These relaxation times can be visualized at multiple locations in the brain, resulting in what are known as $T_1$ and $T_2$ weighted images.

Diffusion Weighted Imaging

Since MR imaging depends on the magnetic moments of hydrogen atoms, it is possible to develop a sequence of precession, resonance, and relaxation periods that allows MRI to measure the movement of hydrogen atoms over time and, in turn, of the water molecules of which they are a part. Such an imaging sequence was initially described by Stejskal and Tanner [207], and later adapted to the scanning of the human body by Le Bihan and Breton [42]. This imaging sequence is summarized in Figure 2.3 for a given diffusion direction g.
The sequence in Figure 2.3 assumes the magnetic field $B_0$ has been applied and that the spins are precessing around $B_0$. In this state, a magnetic pulse is applied at an angle of 90° from the direction of $B_0$. This pulse aligns the spins that were separately aligned to either the positive or negative $B_0$ axis. Once the spins are aligned, the 90° pulse is removed and a second pulse, known as a gradient pulse, is applied in the direction g. This gradient pulse sensitizes the induced current to a specific angular direction. A third magnetic pulse, in the direction 180° from $B_0$, follows the gradient pulse, and then the gradient pulse is reapplied. The 180° pulse plays a key role in that it flips the spin direction of the atoms to the opposite of what it was during the precession phase. As a result of this flip, the current induced by stationary atoms during the application of the second gradient pulse will cancel out the current induced by the same atoms during the first gradient pulse [138]. Therefore, the resulting signal measured after all gradient pulses have been applied relates solely to the molecules experiencing motion in the direction g.

The $T_2$ relaxation time is then measured from this final signal for multiple locations in the brain and visualized in what are known as diffusion weighted images. Figure 2.4 displays a conventional $T_2$ image next to sample diffusion weighted images (DWIs) for various gradient directions g. Note here that rapid diffusion results in fast $T_2$ relaxation times, leading to a low intensity in the diffusion weighted image. Further note the different rates of diffusion for different directions within the brain's white matter, as pointed out by the white arrows in Figure 2.4.

Figure 2.4: Axial slices of (left to right) (a) a standard $T_2$ image (the B0 image) and (b) its corresponding diffusion weighted images from gradient pulses in the horizontal, vertical, and out-of-plane directions. Note the differences in measured diffusion in the splenium due to gradient direction (highlighted by the white arrows). Adapted from [119].
From the diffusion weighted image for gradient direction g, the diffusion rate ($d$) can be computed using the Stejskal-Tanner equation:

$$S = S_0 \exp(-bd) \qquad (2.2)$$

where $S$ is the diffusion weighted image intensity, $S_0$ is the standard $T_2$ image intensity, and $b$ is the diffusion weighting [207]. The diffusion weighting $b$ is in turn proportional to the strength and duration of the gradient pulse. The $T_2$ image used in (2.2) is typically referred to in this context as a B0 image, as it is acquired without the application of the gradient pulses (i.e., $b = 0$). The scalar $d$ is commonly referred to as the apparent diffusion coefficient (ADC).

2.4 Correction of Image Artifacts

While dMRI provides us with measurements of diffusion, it must be acknowledged that the quality of the diffusion weighted images is affected and limited by the image acquisition process. All further analysis depends on the accuracy of these diffusion measurements and, as such, we must address the presence of noise and imaging artifacts within these diffusion weighted images. Diffusion MRI is susceptible to various artifacts, the three most common being eddy currents, subject motion, and Rician noise [28]. Let us consider each in turn.

Eddy Currents

As seen in the diffusion imaging sequence in Figure 2.3, multiple magnetic gradient pulses are applied in rapid succession. Switching between these gradients can result in fluctuations in the scanner's magnetic field. These fluctuations induce what are known as eddy currents in the coil of the MRI scanner. The eddy currents interfere with the currents induced by the scanned subject, thereby distorting the resulting diffusion weighted images [28]. Much is known about eddy currents, namely that they are dependent on the magnitude of the gradient pulse, that they are independent of the subject being scanned, and that they result in related geometric and intensity distortions in diffusion weighted images [102]. The geometric distortion produced by eddy currents has been shown to consist of a translation, scale, and shear of the resulting image and is commonly rectified using affine registration [102, 146, 44]. The diffusion weighted images are corrected by registering them to a $T_2$ weighted image, with the mutual information similarity measure providing the best results [146].
As the $T_2$ image is acquired without the gradient pulses that produce eddy currents, it is assumed to be free of geometric distortion, thereby making it an appropriate template to which we can register the DWIs. Intensity corrections are then calculated directly from the magnitudes of the shear, scaling, and translations of the affine warp [102, 146]. One benefit of eddy currents being independent of the subject scanned is that the affine warp used in the correction can be obtained by imaging a physical phantom with known ground truth [67]. This warp can then be applied to later subject scans.

Subject Motion

Depending on the number of diffusion weighted images being acquired, the length of a diffusion MRI scan can range from a couple of minutes [158] to a few hours [228]. During that time, the subject may move either voluntarily or involuntarily (e.g., breathing). As a result, the same voxel location in two diffusion weighted images is not guaranteed to correspond to the same anatomical location in the subject. While correcting for subject motion in a single image has been well studied (see [215] for a survey), the problem of correcting motion between separate diffusion weighted images has yet to receive a strong theoretical treatment [28]. Even so, two main approaches have been proposed to correct for subject motion between diffusion weighted images, both involving image registration. First, we can, as with eddy current correction, align the diffusion weighted images to a $T_2$ weighted image using the mutual information similarity measure [146, 189]. The alternative approach is to model the diffusion at each voxel (as discussed further in Section 2.5) and to align images so as to minimize the residual of the model fit [11, 21]. A recent quantitative comparison of these approaches suggests that both methods are equally capable of correcting for subject motion [196]. Note, however, that if the rotational motion of the subject is large, the directions of the applied gradient pulses would need to be corrected for any further model fitting or analysis [136, 189].

Rician Noise

Any environment is going to contain a certain amount of background noise. In the case of diffusion MRI, this noise has been well modeled using a Rician distribution [32], given as

$$p(x \mid \mu, \sigma) = \frac{x}{\sigma^2} \exp\left(-\frac{x^2 + \mu^2}{2\sigma^2}\right) I_0\left(\frac{x\mu}{\sigma^2}\right) \qquad (2.3)$$

where $x$ is the observed image intensity, $\mu$ is the noise-free signal, $\sigma$ is the standard deviation of the noise, and $I_0$ is the zeroth-order modified Bessel function of the first kind. At high signal-to-noise ratios, the Rician distribution is occasionally approximated using a Gaussian distribution [89].
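The Rician density in (2.3) can be evaluated directly, which also makes it easy to see how close it comes to a Gaussian at high signal-to-noise ratio. The sketch below is a minimal illustration: the function names are hypothetical, $I_0$ is computed from its standard power series rather than from a library routine, and the values of $\mu$ and $\sigma$ are made up. The direct evaluation is only suitable for moderate arguments, since the exponential and Bessel factors are computed separately.

```python
import math


def bessel_i0(z):
    """Zeroth-order modified Bessel function of the first kind, computed
    from its power series I0(z) = sum_k (z^2 / 4)^k / (k!)^2."""
    quarter_z2 = z * z / 4.0
    term = 1.0
    total = 1.0
    k = 1
    # Stop once the next term no longer changes the sum at double precision.
    while term > 1e-17 * total:
        term *= quarter_z2 / (k * k)
        total += term
        k += 1
    return total


def rician_pdf(x, mu, sigma):
    """Rician density of equation (2.3); zero for negative intensities."""
    if x < 0:
        return 0.0
    s2 = sigma * sigma
    return (x / s2) * math.exp(-(x * x + mu * mu) / (2.0 * s2)) * bessel_i0(x * mu / s2)


def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


# Assumed example: signal-to-noise ratio mu / sigma = 10.
rician_at_mean = rician_pdf(10.0, 10.0, 1.0)
gaussian_at_mean = gaussian_pdf(10.0, 10.0, 1.0)
print(rician_at_mean, gaussian_at_mean)
```

With $\mu = 0$ the same function reduces to a Rayleigh density, which is the regime where a Gaussian approximation is least appropriate.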
This noise can have an adverse effect on the diffusion rates calculated using the Stejskal-Tanner equation, particularly for images taken at a high diffusion weighting [121]. Historically, variational methods have been applied to remove this Rician noise, with anisotropic filtering [180] and total variation regularization [32, 89] both showing success. Weighted mean filtering approaches have also been used [38, 214, 231]. The main conceptual difference between denoising algorithms for diffusion weighted images is whether they denoise one image at a time (the scalar approach) or all images at once (the vector approach) [89]. Recent results suggest that the vector-based algorithms improve signal-to-noise ratio to a greater extent [214].

2.5 The Diffusion Tensor Model

Since the introduction of diffusion MRI, two key advancements have propelled the field to where it is today: first, the introduction of the diffusion tensor by Basser et al. [29] and second, the introduction of high angular resolution diffusion imaging (HARDI) [217]. The introduction of the diffusion tensor brought forth the concept of modeling the diffusion rates from the DWIs as a three-dimensional function within each voxel. HARDI, on the other hand, allowed us to increase the complexity of these models to better represent the local diffusion properties. The remainder of this chapter will show how these two contributions underlie the ability to perform a more holistic dMRI analysis.

While the Stejskal-Tanner equation (2.2) relates diffusion rates to the diffusion weighted image intensities, we can consider a more general formulation of the diffusion properties at a voxel. Since water molecules undergo random Brownian motion, we can consider a probability density function (PDF) $p_t(\vec{x})$ describing the probability of a water molecule experiencing a displacement $\vec{x}$ over the observation time $t$. It has been shown that the distribution $p_t$ is related to the diffusion weighted image intensities via the Fourier transform [8]:

$$\frac{S(\vec{g})}{S_0} = \int_{\vec{x} \in \mathbb{R}^3} p_t(\vec{x}) \exp\left(-ib\,\vec{g} \cdot \vec{x}\right) d\vec{x} \qquad (2.4)$$

As earlier, $S(\vec{g})$ represents the diffusion weighted signal for the gradient direction $\vec{g}$, $S_0$ is the unweighted B0 image signal, and $b$ is the diffusion weighting. With enough diffusion weighted images $S(\vec{g})$, the Fourier transform can be inverted to obtain $p_t$. This is known as q-space imaging [228]. In practice, however, the number of diffusion weighted images required to accurately perform the inversion leads to scanning times on the order of hours [111], which is generally not feasible in a clinical setting.
As a result, it has become common to assume a model for $p_t$, the simplest model being a zero-mean Gaussian:

$$p_t(\vec{x}) = \frac{1}{\sqrt{(2\pi)^3 \, |2tD|}} \exp\left(-\frac{\vec{x}^T D^{-1} \vec{x}}{4t}\right) \qquad (2.5)$$

where the covariance is $2tD$. Plugging the Fourier transform of (2.5) into (2.4) results in a more general case of the Stejskal-Tanner equation:

$$S(\vec{g}) = S_0 \exp\left(-b\,\vec{g}^T D \vec{g}\right) \qquad (2.6)$$

The 3×3 second-order symmetric positive-definite matrix $D$ is referred to as the diffusion tensor [29]. It contains six unique elements, and therefore six diffusion weighted images are required, along with the B0 image, to estimate the tensor. The diffusion weighted images are obtained from uniform, non-collinear gradient directions so as not to favor a given direction in the tensor fitting process. These seven images can be obtained with an MRI scan on the order of 1-2 minutes [158], thereby making it a clinically feasible imaging protocol.
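To make the link between (2.2) and (2.6) concrete, the sketch below evaluates the tensor signal model for a few gradient directions and then recovers the apparent diffusion coefficient along one direction by inverting (2.2). The function names are hypothetical, and the tensor, the diffusion weighting $b = 1000$ s/mm^2, and $S_0 = 1$ are assumed illustrative values, not values taken from the text.

```python
import math


def quadratic_form(g, D):
    """Compute g^T D g for a 3-vector g and a 3x3 matrix D (nested lists)."""
    return sum(g[i] * D[i][j] * g[j] for i in range(3) for j in range(3))


def dwi_signal(S0, b, g, D):
    """Diffusion weighted signal of equation (2.6) for gradient direction g.

    The direction is normalised to unit length before use.
    """
    norm = math.sqrt(sum(c * c for c in g))
    unit = [c / norm for c in g]
    return S0 * math.exp(-b * quadratic_form(unit, D))


def apparent_diffusion_coefficient(S, S0, b):
    """Invert the Stejskal-Tanner equation (2.2) for the diffusion rate d."""
    return -math.log(S / S0) / b


# Assumed tensor in mm^2/s: fast diffusion along x, slow along y and z.
D = [
    [1.7e-3, 0.0, 0.0],
    [0.0, 0.3e-3, 0.0],
    [0.0, 0.0, 0.3e-3],
]
b = 1000.0  # assumed diffusion weighting in s/mm^2
S0 = 1.0    # assumed B0 image intensity

S_parallel = dwi_signal(S0, b, [1, 0, 0], D)       # along the fast axis
S_perpendicular = dwi_signal(S0, b, [0, 1, 0], D)  # across the fast axis
S_oblique = dwi_signal(S0, b, [1, 1, 0], D)        # between the two

adc_parallel = apparent_diffusion_coefficient(S_parallel, S0, b)
print(S_parallel, S_perpendicular, S_oblique, adc_parallel)
```

The signal is lowest along the fast diffusion axis, in line with the earlier observation that rapid diffusion produces a low intensity in the diffusion weighted image.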

Many factors affect the quality of the diffusion tensors. As mentioned earlier, noise, motion, and distortions in the diffusion weighted images will result in poor tensor estimates. Aside from post-processing the diffusion weighted images, it is also common to obtain DWIs from more than six gradient directions in order to overdetermine the tensor fit, thereby reducing the effect of having some corrupted DWI signals [121]. The fitting procedure also affects the quality of the resulting tensors. The simplest approach is to take the logarithm of (2.6) and fit the tensor using least squares [29]. This approach, however, does not ensure that the resulting tensor is positive-definite (i.e., has positive eigenvalues). Non-linear fitting allows for this constraint and generally results in a less noisy tensor field [121], especially if spatial regularization is incorporated into the fitting procedure [227]. Additional approaches include a weighted least squares fitting of the log-transformed equation (2.6), which is used to detect and remove outlier DWI signals prior to the final tensor fit [65].

2.6 Tensor Image Visualization

The power of the diffusion tensor lies in its ability to capture and visualize more detailed properties of the diffusion than the scalar diffusion weighted images provide. For example, we can look at how the rates of diffusion vary with direction or calculate the average diffusion rate at each voxel. In fact, significant diagnostic information can be obtained from the diffusion tensor by analyzing its principal components, obtained through the tensor's eigendecomposition [100]. Given a diffusion tensor $D$, we can obtain the eigendecomposition

$$D = [\vec{e}_1 \; \vec{e}_2 \; \vec{e}_3] \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} [\vec{e}_1 \; \vec{e}_2 \; \vec{e}_3]^T \qquad (2.7)$$

where the eigenvalues are positive and sorted in descending order (i.e., $\lambda_1 \geq \lambda_2 \geq \lambda_3$). The eigenvectors $\vec{e}_1$, $\vec{e}_2$, $\vec{e}_3$ are considered the main axes of diffusion, while the eigenvalues encode the rates of diffusion along each corresponding axis.
Given this interpretation, we can visualize the diffusion tensor as an ellipsoid as shown in Figure 2.5. The axes of the ellipsoid are the eigenvectors of the tensor while the tensor's eigenvalues describe the ellipsoid's stretch along each axis. Another interpretation of the ellipsoid is as an isoprobability surface of the Gaussian diffusion model given in (2.5). The tensor eigendecomposition allows for the computation of two key diffusion properties: the Mean Diffusivity (MD) and the Fractional Anisotropy (FA) [119, 230]. These two measures respectively capture the mean and variance of the diffusion rate with respect to direction. They are computed from the tensor's eigenvalues as

$$MD = \frac{1}{3} (\lambda_1 + \lambda_2 + \lambda_3) \tag{2.8}$$

$$FA = \sqrt{\frac{3 \left[ (\lambda_1 - MD)^2 + (\lambda_2 - MD)^2 + (\lambda_3 - MD)^2 \right]}{2 \left( \lambda_1^2 + \lambda_2^2 + \lambda_3^2 \right)}} \tag{2.9}$$

Figure 2.5: Examples of the ellipsoidal representation of prolate (a) and oblate (b) diffusion tensors (axes labelled $\vec{e}_1$, $\vec{e}_2$, $\vec{e}_3$). Adapted from [119].

Examples of MD and FA on a slice of the brain are shown in Figure 2.6. From these images, we observe the brain microstructure described in Section 2.2. Note that the mean diffusivity is higher in the ventricles than the rest of the brain due to the lack of tissue structure. Conversely, the fractional anisotropy is highest in the white matter of the brain due to coherent orientation of the tissue fibers. We can further estimate the orientation of this microstructure as being equivalent to $\vec{e}_1$. Of course, the quality of this estimate will depend in part on the FA. Low fractional anisotropy would imply a less coherent orientation in the tissue fibers, making this estimation of the fiber orientation a potentially poor one. Other scalar measures have been generated to characterize both the shape and anisotropy of diffusion tensors, but FA and MD are most commonly used in practice. A review of other scalar tensor measures can be found in [160, 230].

Aside from visualizing the scalar FA and MD maps, approaches have been developed to display the tensor's orientation information as well. The two most common approaches are shown in Figure 2.6. First, the primary diffusion direction $\vec{e}_1$ can be visualized as a color image, where the RGB values are $R = FA \, |\vec{e}_1 \cdot [1, 0, 0]|$, $G = FA \, |\vec{e}_1 \cdot [0, 1, 0]|$, and $B = FA \, |\vec{e}_1 \cdot [0, 0, 1]|$ [174]. Such a scheme allows for an intuitive visualization of fiber orientation weighted by the reliability of that estimate, yet color assignments are not unique. For example, the color yellow would be assigned to voxels with $\vec{e}_1 = [1, 1, 0]$ and $\vec{e}_1 = [-1, 1, 0]$, leading to ambiguity of the underlying fiber direction. As a result, it is occasionally necessary to visualize the tensor ellipsoids themselves as seen in Figure 2.6b.
Note that generating color representations of tensor data remains an area of open research [99].
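The scalar and color-coded measures above follow directly from the eigendecomposition in (2.7). The sketch below computes MD from (2.8), FA from (2.9), and the FA-weighted RGB color for a single tensor, and then reproduces the yellow-color ambiguity mentioned in the text. The helper names and the eigenvalues used for the example tensors are hypothetical choices for illustration only.

```python
import numpy as np

def tensor_scalars(D):
    """Return (MD, FA, RGB) for one symmetric 3x3 diffusion tensor."""
    evals, evecs = np.linalg.eigh(D)        # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]         # re-sort so lambda_1 >= lambda_2 >= lambda_3
    evals, evecs = evals[order], evecs[:, order]
    md = evals.mean()                                                # (2.8)
    fa = np.sqrt(1.5 * np.sum((evals - md) ** 2) / np.sum(evals ** 2))  # (2.9)
    rgb = fa * np.abs(evecs[:, 0])          # FA-weighted |e_1| components
    return md, fa, rgb

def prolate_tensor(direction, lam_par=1.7e-3, lam_perp=0.3e-3):
    """Hypothetical helper: axially symmetric tensor with main axis `direction`."""
    e = np.asarray(direction, dtype=float)
    e = e / np.linalg.norm(e)
    return lam_perp * np.eye(3) + (lam_par - lam_perp) * np.outer(e, e)

# Two different fiber directions that receive the same (yellow) color.
_, _, rgb_a = tensor_scalars(prolate_tensor([1, 1, 0]))
_, _, rgb_b = tensor_scalars(prolate_tensor([-1, 1, 0]))
```

Because the color uses the absolute value of each component of the primary eigenvector, `rgb_a` and `rgb_b` coincide even though the two fiber directions are perpendicular.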


Figure 2.6: Various methods of visualizing the information contained in a diffusion tensor field. (a) Mean Diffusivity (left), Fractional Anisotropy (center), and Color-coded Orientation Map (right). (b) Ellipsoidal Visualization. Images generated using MedINRIA on data obtained from [153].



Figure 2.7: Example of crossing fibers and how they are modeled using Diffusion MRI. (a) Crossing Fibers. (b) Corresponding Diffusion ODF. (c) Corresponding Diffusion Tensor. Adapted from [7, 71, 230] respectively.

Regardless of the visualization strategy, the value of the orientation information in Diffusion MRI is significant as it allows us to infer the orientation of neural pathways in the white matter of the brain. If we take, for example, the color-coded orientation map in Figure 2.6, we can see four major neural pathways. The genu of the corpus callosum can be seen in red in the upper portion of the image and arching upwards in a U-shape. A similar looking pathway, the splenium of the corpus callosum, can be seen in the bottom half of the image flanked on each side by the optic radiations in green. Finally, the corticospinal tract, which connects the spinal cord to the motor cortex, can be seen in blue (coming out of the page) in the middle of the brain. These pathways, as seen in diffusion MRI, agree with histological studies [61], thereby making diffusion MRI a powerful tool for mapping out these neural pathways non-invasively.

2.7 High Angular Resolution Diffusion Models

While the diffusion tensor model provides a powerful tool for visualizing and assessing the microstructure of brain tissue, it suffers from a significant limitation: the assumption that diffusion follows a Gaussian model. While this model may hold for simple examples such as those in Figure 2.1, there exist many situations where the local diffusion is more complex. Take, for example, the situation shown in Figure 2.7a. This example shows a mixture of fibrous tissues oriented along the positive and negative diagonal directions, resulting in a diffusion profile in red. Ideally, we would like to model this diffusion as shown in Figure 2.7b. Unfortunately, the diffusion tensor model assumes ellipsoidal Gaussian diffusion.
As a result, we would obtain for this example the tensor shown in Figure 2.7c. This tensor would misleadingly suggest that diffusion is equal for all directions in the plane of the crossing.
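The crossing-fiber effect described above can be reproduced numerically. The sketch below simulates the signal of two equally weighted prolate tensors oriented along the positive and negative diagonals, fits a single tensor to that signal with log-linear least squares, and inspects the fitted eigenvalues. Everything here is an illustrative assumption: the helper names, the tissue diffusivities, the b-value of 1000 s/mm^2, and the gradient set, which is deliberately chosen to be symmetric under 90 degree rotations about the z axis so that the symmetry of the result is easy to see.

```python
import numpy as np

def prolate_tensor(direction, lam_par=1.7e-3, lam_perp=0.3e-3):
    """Hypothetical helper: axially symmetric tensor along `direction`."""
    e = np.asarray(direction, dtype=float)
    e = e / np.linalg.norm(e)
    return lam_perp * np.eye(3) + (lam_par - lam_perp) * np.outer(e, e)

def mixture_signal(g, tensors, fractions, b):
    """Normalized signal S/S0 of a mixture of Gaussian compartments."""
    adc = np.stack([np.einsum('ij,jk,ik->i', g, D, g) for D in tensors])
    return np.asarray(fractions, dtype=float) @ np.exp(-b * adc)

def fit_single_tensor(ratio, g, b):
    """Log-linear least-squares fit of one tensor to S/S0."""
    gx, gy, gz = g[:, 0], g[:, 1], g[:, 2]
    A = -b * np.column_stack([gx * gx, gy * gy, gz * gz,
                              2 * gx * gy, 2 * gx * gz, 2 * gy * gz])
    d, *_ = np.linalg.lstsq(A, np.log(ratio), rcond=None)
    return np.array([[d[0], d[3], d[4]],
                     [d[3], d[1], d[5]],
                     [d[4], d[5], d[2]]])

# Gradient directions on a grid of polar and azimuthal angles (assumed scheme).
polar, azim = np.meshgrid(np.deg2rad([30.0, 60.0, 90.0]),
                          np.arange(16) * np.pi / 8, indexing='ij')
g = np.stack([np.sin(polar) * np.cos(azim),
              np.sin(polar) * np.sin(azim),
              np.cos(polar)], axis=-1).reshape(-1, 3)

b = 1000.0
fibers = [prolate_tensor([1, 1, 0]), prolate_tensor([1, -1, 0])]
ratio = mixture_signal(g, fibers, [0.5, 0.5], b)

D_fit = fit_single_tensor(ratio, g, b)
evals = np.linalg.eigvalsh(D_fit)[::-1]   # descending eigenvalues
```

In this setup the two largest eigenvalues of `D_fit` coincide and the smallest lies along the axis perpendicular to both fibers, so the fitted tensor is oblate and its primary eigenvector is not uniquely defined within the crossing plane.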

Further, the primary eigenvector of the tensor is not guaranteed to align with either fiber direction. Such an example is common in the white matter of the brain. The neural pathways are made up of aligned tissue fibers whose diameter is on the order of microns [6]. In contrast, the resolution of diffusion weighted images is typically on the order of cubic millimeters. As a result, this type of averaging of diffusion from multiple pathways is unavoidable. In fact, it has been estimated that at least one third [39] to two thirds [71] of voxels in the brain may exhibit this crossing fiber property.

Tuch et al. first proposed the use of more descriptive diffusion models by showing that there are regions in the brain where fibers cross [217]. They noted that in order to detect these crossing fibers, diffusion weighted images from a greater number of diffusion directions, and at a higher gradient weighting, were required. Thus was born the concept of High Angular Resolution Diffusion Imaging (HARDI). Various attempts have been made to come up with HARDI models that can describe multiple fiber populations at a single voxel. While work has been done in reviewing and comparing these different HARDI approaches [7, 6, 5, 186], there is generally no consensus as to which HARDI model is best suited to represent diffusion MR characteristics. While research in this area is ongoing, three main classes of HARDI models have established themselves. We examine each in turn.

Compartment Models

Initial attempts to model more complicated diffusion profiles revolved around fitting multiple tensors to the different fiber populations (i.e., compartments) of the diffusion weighted image signals [217]. A mixture of Gaussians model is commonly assumed and the Stejskal-Tanner equation is updated to incorporate the mixture:

$$\frac{S(\vec{g})}{S_0} = \sum_i f_i \exp\left( -b \, \vec{g}^T D_i \vec{g} \right) \tag{2.10}$$

Each tensor $D_i$ has a corresponding volume fraction $f_i$ representing the fraction of the local diffusion the fiber population represents. Later work using the CHARMED [16, 15] and FORECAST [10] methods assumed a particular shape for each fitted tensor. The former approach attempts to model intra-fiber and extra-fiber diffusion using prolate (cigar-like) and spherical tensors respectively. The latter approach models prolate tensors with an equal and known mean diffusivity. More recent work has instead assumed a mixture of Wishart (MOW) distributions, effectively a distribution over tensors, as the choice of diffusion model [117, 116]. A recent and thorough review of these compartment models can be found in [176]. While these mixture model approaches allow for the same intuitive representation as the single tensor model, they also have their limitations. These include:

• The number of tensors being fitted to the DWI signal has to be specified ahead of time. While there has been work on estimating this number from the data [8, 217], there is no ground truth specification for the number of tensors to fit at a voxel.
• There is no guarantee that the assumed shape of the fitted tensors is appropriate for the underlying diffusion. If the shape assumption is poor, the volume fractions can be poorly estimated [10].
• The mixture model, and not the underlying mixture components, is fit to the DWIs. While the peaks of the mixture model will align with the directions of maximal diffusion, there is no guarantee that the peaks of the underlying distributions will align with these directions as well [217, 224].

Some of these limitations have been addressed in recent work. For example, instead of fitting a fixed number of tensors to the data, volume fractions can be calculated for a set of basis tensors [140, 117, 116]. Those tensors with a volume fraction above a given threshold are maintained to model the diffusion.

Higher Order Tensors

While compartment models increase the fidelity of the tensor fitting by mixing multiple, simple tensor models, it is also possible to increase the complexity of the tensor model itself [26, 72, 143, 171]. By introducing a higher-order tensor representation into the Stejskal-Tanner equation, we obtain

$$\frac{S(\vec{g})}{S_0} = \exp\left( -b \sum_{i_1=1}^{3} \cdots \sum_{i_l=1}^{3} D_{i_1 \ldots i_l} \, g_{i_1} \cdots g_{i_l} \right) \tag{2.11}$$

where the $l$-th order tensor $D$ represents an $l$-dimensional grid of entries and has the degrees of freedom necessary to model non-Gaussian diffusion [171, 143]. Note that by moving to this higher-dimensional representation, we no longer make a Gaussian assumption on the diffusion PDF. Instead, the higher-order tensor captures higher-order moments of the diffusion signal [26].
One such example of modeling these higher-order moments is in Diffusion Kurtosis Imaging, where the diffusion PDF is modeled using both a regular diffusion tensor for the Gaussian part of the diffusion PDF and a 4th order tensor for the kurtosis of the diffusion PDF [114]. Like ordinary diffusion tensors, various diffusion features can be extracted from a higher order tensor. The majority of these features are based on the higher order tensor's directional profile

$$D(\vec{g}) = \sum_{i_1=1}^{3} \cdots \sum_{i_l=1}^{3} D_{i_1 \ldots i_l} \, g_{i_1} \cdots g_{i_l} \tag{2.12}$$

where $D(\vec{g})$ is the diffusion rate along unit direction $\vec{g}$ and $g_{i_j}$ is the vector $\vec{g}$ projected onto the $i_j$-th axis of the higher order tensor. From this directional profile, we can obtain features equivalent to mean diffusivity and fractional anisotropy by taking, respectively, the mean and variance of $D(\vec{g})$ [171]. A recent review of higher order tensors is available in [26]. Unlike compartment models, higher order tensors have the benefit of modeling non-Gaussian diffusion without having to specify the number of fiber compartments present. However, there persists the concern that the maxima of (2.12) are not guaranteed to line up with the underlying fiber tract directions [224].

Diffusion Orientation Distribution Functions

On the other end of the spectrum, model-free approaches have also been proposed to capture local diffusion properties. Again, Tuch instituted this approach through the introduction of q-ball imaging [216]. Based on the earlier q-space approach described by the Fourier transform in (2.4), Tuch noticed that the directional dependence of the diffusion rate is the most commonly used information for diffusion MRI analysis and that the radial distance component of the diffusion does not play a significant role. As a result, instead of modeling the diffusion as a PDF $p(\vec{x})$, where $\vec{x}$ is a vector of any length, q-ball imaging models the diffusion orientation distribution function (ODF) $\psi(\theta, \phi)$, where $\theta, \phi$ are spherical angles. As such, the ODF captures the probability of diffusion along different angular directions $(\theta, \phi)$ but without a radial distance parameter. An estimate of the ODF can be obtained efficiently through the use of the Funk-Radon transform [216]. By ignoring the radial component, the q-ball ODF can be estimated with fewer diffusion weighted image samples than the original PDF from q-space imaging, leading to more reasonable scanning times. Other model-free approaches have also gained traction in the diffusion MRI community.
First, the diffusion orientation transform (DOT) shares similarities with q-ball imaging as both are based on the earlier q-space approach. In contrast, DOT assumes diffusion is Gaussian along the radial direction and uses this assumption to perform the Fourier transform in (2.4) using fewer diffusion weighted image samples [173]. The DOT diffusion ODF is then obtained by analytically integrating the resulting PDF along the radial direction. An alternative model-free approach is Jansons and Alexander's Persistent Angular Structure (PAS) approach [111]. The goal behind PAS is to find a diffusion PDF $p(\vec{x})$ from (2.4) that is smooth yet captures the key angular structure of the diffusion. This goal is achieved through optimization by finding a PDF $p(\vec{x})$ that maximizes entropy while minimizing the error in fitting to the diffusion weighted image samples. A Lagrange multiplier is used to weight the two competing terms [111].

Unlike the diffusion tensor, the diffusion ODF is a spherical function that can be represented in many ways [115, 187]. The most popular choice for its representation is a real spherical harmonic expansion [10, 73, 115]. The diffusion ODF $\psi$ can be represented as:

$$\psi(\theta, \phi) = \sum_{l=0}^{K} \sum_{m=-l}^{l} F_l^m \, Y_l^m(\theta, \phi) \tag{2.13}$$

where integers $l$ and $m$ are the degree and order of the harmonics respectively. The basis harmonics $Y_l^m$ are given as:

$$Y_l^m = \begin{cases} \sqrt{\dfrac{2l+1}{4\pi}} \, P_l^0(\cos\phi), & \text{if } m = 0 \\ \sqrt{2} \sqrt{\dfrac{2l+1}{4\pi} \dfrac{(l-m)!}{(l+m)!}} \, P_l^m(\cos\phi) \cos(m\theta), & \text{if } m > 0 \\ \sqrt{2} \sqrt{\dfrac{2l+1}{4\pi} \dfrac{(l-|m|)!}{(l+|m|)!}} \, P_l^{|m|}(\cos\phi) \sin(|m|\theta), & \text{if } m < 0 \end{cases} \tag{2.14}$$

where $P_l^m$ is the associated Legendre function of degree $l$ and order $m$. As the ODF is anti-podally symmetric, only the even degree basis harmonics are used [73]. Typically, the expansion is limited to degree $l \leq 8$ to suppress noise artifacts in the resulting ODF [71]. Other ODF representations have also seen limited use, including von Mises-Fisher and Watson distributions [187].

The notions of mean diffusivity and fractional anisotropy have also been extended to HARDI diffusion ODFs, with the latter being referred to in this context as generalized anisotropy (GA). Similar to the tensor case, the two measures correspond to the mean and variance of the diffusion ODF $\psi$:

$$MD = \frac{1}{4\pi} \int \psi(\theta, \phi) \, dS \qquad GA = \frac{1}{4\pi} \int \left( \psi(\theta, \phi) - MD \right)^2 dS \tag{2.15}$$

Unlike in the diffusion tensor model, MD and GA generally do not have an elegant solution. Analytical solutions have been proposed for both measures [172] but involve ad hoc scaling and normalization weights. Examples of MD and GA images are shown in Figure 2.8. We can also visualize the orientation information in the diffusion ODF by visualizing the spherical functions themselves as seen in Figure 2.8. Various other model-free diffusion ODF estimation procedures have been recently proposed and a thorough review of these techniques is available in [18].

One of the key limitations of the model-free HARDI approaches is precisely that a model is not assumed. In areas of low anisotropy, both PAS and q-ball imaging can overestimate the directional dependence of the diffusion as a result of image noise [111, 216].
This overestimation can result in spurious maxima in the diffusion ODFs.

Figure 2.8: Sample visualization techniques for diffusion ODFs obtained from HARDI. (a) Mean Diffusivity (top) and Generalized Anisotropy (bottom). (b) q-ball Diffusion ODFs.

HARDI versus the Diffusion Tensor

Despite the presence of these HARDI models that better represent the underlying diffusion properties, the use of the diffusion tensor model still persists in a clinical setting [158, 159, 167]. There are various reasons for the use of what is perceived to be an inferior model and these reasons highlight some of the limitations of HARDI:

• The number of gradient directions, and in turn diffusion weighted images, required for the reconstruction of HARDI models is still significantly larger than for diffusion tensor imaging. With scanning time as a bottleneck, the opportunity to obtain enough diffusion weighted images for a HARDI reconstruction remains, in many cases, a luxury.
• To observe non-Gaussian diffusion, the strength of the magnetic gradients used in the scan is increased [217]. Increasing the gradient strength increases the diffusion weighting, which in turn lowers the diffusion weighted image intensity. If we increase the gradient strength enough, the diffusion weighted image intensities can fall below the noise floor, an effect seen with HARDI imaging settings [121].
• Recent research suggests that limitations of the tensor model with regard to crossing fibers might be overcome by taking into account neighborhood information [27, 197]. With such advancements, it remains unclear at this time if HARDI can provide enough additional information over a diffusion tensor image to warrant the added imaging cost.

For the above reasons, and given the wealth of diffusion tensor medical research [167], the tensor model cannot be ignored.

2.8 Preterm Infant dMRI Acquisition and Modelling

The fundamentals regarding diffusion MRI acquisition and modelling have seen significant use in the examination of the preterm infant brain [2, 131, 192, 70, 31, 190, 66, 191]. As the incidence of preterm birth (i.e., birth earlier than 37 weeks gestational age) continues to increase [43], and given the higher rate of adverse neurodevelopmental outcome for preterm infants, it is hoped that dMRI can provide clinicians with the insights necessary to improve a preterm infant's neurodevelopmental health. The ability of diffusion MRI to non-invasively assess the organization and integrity of the brain's white matter is of particular interest to neurodevelopmental researchers as it is believed that the abnormal neurodevelopment common to preterm infants may be a result of injuries to the white matter around the time of the preterm infant's birth [20, 77].

Over the years, researchers have identified positive correlations between fractional anisotropy and neurodevelopmental outcome in preterm infants [2, 66, 70, 192], suggesting that long term cognitive and motor deficiencies may be linked to the organization of white matter axons at an early age. Researchers have also established a negative correlation between mean diffusivity and neurodevelopmental outcome [66, 131, 190, 191], suggesting that similar long term outcomes may relate to the amount of cell structure present in the brain at an early age. Finally, there is further evidence that preterm infants with abnormal neurodevelopmental outcomes show less dramatic changes in fractional anisotropy and mean diffusivity over the first few weeks after birth [66, 150]. These results suggest that preterm infants with poor neurodevelopmental outcomes may experience restricted, or delayed, growth and development of the brain in the first few weeks of life.
In this thesis, we will make use of dMRI scans from a cohort of preterm infants to explore and evaluate our proposed algorithms. Our cohort consists of 195 premature newborns born between 24 and 32 weeks gestational age (GA) at the Children's & Women's Health Centre of British Columbia; 177 are described in Chau et al. [66], and an additional 18 infants were recruited since that work was published. Full demographics for the cohort are provided, and discussed further, in Section 6.3. The MRI studies on this cohort were carried out on a Siemens (Erlangen, Germany) 1.5T Avanto scanner using a multi-slice 2D axial EPI diffusion MR acquisition (TR 4900 ms; TE 104 ms; FOV 160 mm; slice thickness, 3 mm; no gap). Three averages of 12 non-collinear gradient directions were acquired, resulting in an in-plane resolution of mm. The dMRI acquisition was repeated twice, once with a diffusion weighting (b-value) of 600 s/mm² and once with a diffusion weighting of 700 s/mm². The two dMRI acquisitions were then combined to create a single diffusion tensor image. The combined diffusion weighted image set was preprocessed (i.e., eddy current corrected and skull stripped) using the FSL Diffusion Toolbox (FDT) pipeline and tensors were then fit using RESTORE [65], a weighted least-squares tensor fitting algorithm implemented in the Camino toolkit.

Conclusions

When it comes to processing dMRI scans, traditional image processing tasks (e.g., segmentation, registration) are preceded by a sequence of image acquisition, preprocessing, and modeling steps. As each of these steps impacts further processing, we have provided this introduction to the main image acquisition and modeling concepts that will appear throughout the remainder of this thesis. These concepts include the notion of using nuclear magnetic resonance techniques to acquire measurements of molecular diffusion along different directions, resulting in diffusion-weighted MR images. It is these diffusion-weighted MRIs that are then preprocessed to remove noise, eddy currents, and motion artifacts. From the preprocessed diffusion weighted MRIs, we then fit, at each voxel, a diffusion model to come up with a holistic representation of the local molecular diffusion. The most ubiquitous of these models is the diffusion tensor, but other HARDI models are becoming more common. Once modeled, the dMRI data can be visualized in various ways. Two common visualizations are fractional anisotropy (which is generally interpreted to represent how fibrous the tissue is) and mean diffusivity (which is typically interpreted to represent how much tissue structure is present). In subsequent chapters, we will consider the task of processing and analyzing dMRI scans that have already gone through this process of acquisition, preprocessing, and modeling. Extended versions of this chapter were published in [50, 52] and we refer the reader to them if interested.

Chapter 3 Information Content Estimators for Diffusion MRI

3.1 Introduction and Motivation

The advent of diffusion MRI (dMRI) has provided clinicians with the ability to assess the integrity of the brain's neural pathways in a non-invasive manner. This unique ability has quickly allowed dMRI to become an established imaging protocol generating images that, like other medical images, benefit from having computational methods for processing and analysis [230]. Yet unlike other medical image data, diffusion MRI scans typically contain high-dimensional, manifold-valued data at each voxel. For example, diffusion tensor images contain at each voxel a 3×3 symmetric positive definite matrix. The computational analysis of such images is complicated both by the manifold-valued nature of the data as well as the high dimensionality. The result of these complications is that, frequently, analysis is performed only on a single dimension of the data (e.g., fractional anisotropy). Working directly with the dMRI data, on the other hand, typically requires adjusting existing medical image analysis algorithms¹.

¹ This chapter is based on our published work in [46].

Information theory in particular is a key component of computer vision and has seen uses in image compression, feature detection, segmentation, and registration (see [81] for a recent survey). Attempts have been made to apply information theory to diffusion MRI data in the contexts of image segmentation [225, 226], image registration [209, 103, 137], and feature detection [4, 163, 172].

In our context of preterm brain analysis, information theory may be a valuable tool for longitudinal image registration. Consider the problem of aligning two dMRI scans, like those in Figure 3.1, that come from a preterm infant at different post-menstrual ages. As can be readily observed, the preterm infant brain changes dramatically in both its geometry and microstructural properties over the first few weeks of life [150]. These dramatic brain changes make it unreasonable to assume that these two dMRI scans have directly comparable diffusion measurements. That said, we also note in Figure 3.1 that the main white matter fiber tracts appear clearly in both scans, to the point that one can visually identify a correspondence between them. In this respect, the brain structures are comparable even though their individual voxel values (i.e., diffusion measurements) may not be. In image registration problems where the images have this property, the information-theoretic measure of mutual information has been shown to perform well [145].

Figure 3.1: Slices of a preterm infant's brain shown in diffusion tensor MRI at post-menstrual ages (PMA) of 29 and 44 weeks respectively. (a) 29 weeks post-menstruation. (b) 44 weeks post-menstruation. Scale bar: 1 cm. Note the large change in both brain shape and size, leading to a more challenging image registration problem. Also note the decreased anisotropy (i.e., brightness) around the edge of the brain as the infant matures.

The success of this image registration application, and other potentially similar work, relies on being able to estimate entropy in a manner that is consistent² and accurate. It is from these estimates of entropy that we build up all information-theoretic measures, including mutual information. For scalar data, the most popular method of entropy estimation relies on histogram binning [201]. For this estimator, a normalized histogram is generated from a set of samples by discretizing the domain of the sample set. The resulting normalized histogram is then used as the probability distribution from which entropy can be estimated. Binning estimators work well if the dimensionality of our samples is low (e.g., d = 1, 2).
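The binning estimator described in this paragraph can be sketched in a few lines for scalar samples: build a histogram, normalize it into a probability distribution, and sum the entropy terms over the occupied bins. This is only an illustration of the general technique and not the implementation evaluated in this chapter. The function name, the default bin count, and the toy sample sets are assumptions.

```python
import numpy as np

def binned_entropy(samples, bins=32, value_range=None):
    """Histogram-based Shannon entropy estimate (in bits) of scalar samples."""
    counts, _ = np.histogram(samples, bins=bins, range=value_range)
    p = counts[counts > 0] / counts.sum()   # empty bins contribute nothing
    return -np.sum(p * np.log2(p))

# Four equally likely values, one per bin: two bits of information.
h_uniform = binned_entropy(np.repeat([0, 1, 2, 3], 25), bins=4)

# A perfectly homogeneous sample set: zero entropy.
h_constant = binned_entropy(np.full(100, 0.5), bins=4)
```

For d-dimensional samples the same idea needs on the order of `bins ** d` bins, which is the source of the undersampling problem for a fixed number of samples.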
However, for higher dimensional samples, the binning used for the estimator becomes an issue. In practice, we often have a fixed set of samples and as the size of the space increases, we require exponentially more bins to cover said space [201]. The combination of more bins and a fixed set of samples leads to biases and increasing mean-squared error as a result of undersampling [222]. Increased computational costs also result from higher bin counts. Bin sizes can be increased to overcome these limitations, but doing so introduces significant quantization effects that also increase the estimator's mean-squared error [132].

² Consistent here refers to the formal mathematical notion that as the number of samples an estimator uses approaches infinity, the generated estimates converge to the exact value.

The limitations of binning-based entropy estimators make them a poor choice for dMRI data like tensors, which make up a non-linear, convex half-cone in $\mathbb{R}^6$ [222]. As a result, the estimation of information content in diffusion MRI has, to date, focused on reducing the dimensionality of the data to the point where a binning estimator can be used. In [4, 163], features of diffusion tensors (e.g., fractional anisotropy) are computed and used for entropy estimation. In [103, 137, 209], the information content of each channel is computed separately, then averaged. In [103], data from all channels are grouped together into a single one-dimensional histogram for entropy estimation. Reducing the dimensionality of the dMRI data using such methods does not preserve uniqueness; multiple tensors, for example, can be mapped to the same point in the lower-dimensional space. As such, a set of non-homogeneous tensors could be reduced to the same point in the lower-dimensional space, thereby generating an erroneous estimate of zero entropy. While some work has been done to develop information-theoretic measures for the features of individual diffusion models [172, 225, 226], these methods do not naturally extend to sets of dMRI data.

We propose that the entropy of high-dimensional, manifold-valued images should be computed using a different estimator that does not require reducing the data's dimensionality.
In particular, we contribute an entropy estimator based on nearest-neighbour dmri distances. By using one of the established dmri distance metrics [13, 33, 182, 183, 225], we are able to obtain nearest neighbour distances that respect the underlying manifold of the diffusion measurements. Using these nearest neighbour distances for entropy estimation provides us with entropy estimates that are known to be consistent and accurate for high-dimensional data [222]. The contribution of this estimator also allows us to avoid the lossy dimensionality reductions used in existing binning-based entropy estimators for dmri data. To our knowledge, this is the first work on computing information content of a set of unaltered dmri values. In the following two sections, we present our nearest-neighbour entropy estimator for dmri data by focusing on the diffusion tensor model. We will note that the approach presented can be trivially extended to other dmri models while also placing it in the context of existing binning estimators. We further show how the same nearest-neighbour approach can be used to estimate mutual information, a similarity measure commonly used in image registration. We compare our estimator to existing binning estimators in Section 3.4. Specifically, we show how our proposed estimator behaves more consistently in the presence of noise and captures information in the tensor data that existing estimators 42

67 miss. We further show that in the context of image segmentation, our entropy measure shows better agreement with the homogeneity properties of various anatomical regions. Finally, we show that for image registration, our estimator provides a more robust estimate of mutual information than existing binning estimators. We will revisit the task of preterm infant dmri registration in Section 3.5 before concluding with discussion in Section Background: Binning Estimators For the purposes of later comparisons, we first summarize the existing binning-based entropy estimators for dmri data. Our nearest neighbour distance-based estimator will follow in Section 3.3. Also, for the sake of simplicity, we will restrict our initial discussion of entropy estimation to the diffusion tensor model [230] before mentioning how the presented work trivially extends to other dmri models. The Shannon entropy of a random variable is described as H(X) = p(x)log 2 (p(x)) (3.1) where p(x) is the probability of the sample x occurring for a random variable X. typical image processing applications, the distribution p(x) does not have a parametric form, thereby forcing us to estimate both p(x) and its entropy. The most common approach to estimating entropy is to discretize the domain of p(x) into a finite set of bins. The sample data is binned and the resulting normalized histogram is used as p(x). The integration in (3.1) is then converted to a sum and the resulting estimate obtained as For H(X) x X p(x)log 2 (p(x)) (3.2) The same histogram binning concept also applies to joint entropy estimation between two sets X and Y. H(X, Y) = x X x y y Y p(x, y)log 2 (p(x, y)) (3.3) p(x, y)log 2 (p(x, y)) (3.4) From the above definitions of entropy and joint entropy, we can estimate many information theoretic measures, including (but not limited to) conditional entropy, mutual information, and variation of information as follows H(Y X) = H(X, Y) H(X) (3.5) 43

$$MI(X, Y) = H(X) + H(Y) - H(X, Y) \tag{3.6}$$

$$VI(X, Y) = H(X) + H(Y) - 2\,MI(X, Y). \tag{3.7}$$

The consistency of these estimators is inherently linked to that of their underlying entropy estimators, which further highlights the necessity of having a consistent and accurate entropy estimator for diffusion tensor data.

3.2.1 Extensions to Tensor-Valued Data

For diffusion tensor data, the above estimators have been used following a scalarization of the data. Different scalarization schemes have led to three main entropy estimators:

FA Estimator: The entropy estimation in (3.2) is performed on the fractional anisotropy (FA) of the tensors and not on the tensors themselves [4]. The FA can be computed from each tensor $D$ as

$$\mathrm{FA}(D) = \sqrt{\frac{3}{2}} \, \frac{\left\| D - \frac{1}{3}\operatorname{trace}(D)\, I \right\|}{\| D \|}. \tag{3.8}$$

Averaging Estimator: Given the six unique elements $\{d_1, \ldots, d_6\}$ of each tensor $D$, the entropy is estimated for each element independently using (3.2), then averaged [209, 103]. Specifically, given a set of tensors $X$, the entropy is estimated as

$$H(X) \approx \frac{1}{6} \sum_{i=1}^{6} H(X_{d_i}) \tag{3.9}$$

where $X_{d_i}$ is the set of scalars from channel $i$ of the tensors in $X$.

Grouping Estimator: Samples in the six dimensions of $D$ are considered as six samples in one dimension and a single one-dimensional histogram is used to estimate entropy [103]. Given the scalar sets $X_{d_i}$ for each unique tensor channel $i$, this approach uses (3.2) to estimate entropy as

$$H(X) \approx H(X_{d_1} \cup \cdots \cup X_{d_6}). \tag{3.10}$$

The same estimators can be developed for other dMRI models by computing the entropy of the generalized anisotropy, by averaging entropy estimates over all channels in a diffusion model, or by grouping all channels of the diffusion models into a single histogram.

3.2.2 Limitations of Binning Estimators

The grouping, averaging, and FA estimators have shown promise in various areas of application [209, 103, 4], yet they have their limitations. Clearly, the FA estimator does not take into account the orientation or size of the tensors, thereby omitting these aspects of the tensors from its analysis. The limitations of the grouping and averaging estimators are more subtle and, therefore, we examine them here.

Figure 3.2: A simple 2D example showing the limitations of the averaging estimator. Panels: (a) Example 1 (H = 2 bits); (b) Example 2 (H = 1.7219 bits); (c) Y dimension; (d) X dimension. The two unique distributions in (a) and (b) have the same 1D histograms for each of their channels (shown in (c) and (d) respectively). An averaging estimator would obtain the same entropy estimate for both examples (H = 1 bit) despite the fact that (a) has higher entropy than (b).

Figure 3.2 shows a simple example where the averaging estimator in (3.9) would have difficulty. The two discrete 2D distributions in Figures 3.2a and 3.2b are clearly different, with different entropy values. Yet, when looking at the X and Y channels independently in Figures 3.2d and 3.2c, we see that data from both channels are equivalent for both 2D distributions. The averaging estimator would be unable to detect any difference between the information content of these two distributions.

The example in Figure 3.2 also provides us with some intuition as to where the averaging estimator might be insensitive to information content. The particular reason why the distribution in Figure 3.2a gives the same entropy estimate as that of the distribution in Figure 3.2b is that there is a correlation between the X and Y dimensions. In general, an N-dimensional distribution can cause difficulties for the averaging estimator if there is a negative correlation between different dimensions of the distribution. In such a situation, a distribution can be altered along one dimension (e.g., X = 1 in Figure 3.2) only to have that alteration be cancelled out by a related alteration in a second dimension (e.g., X = 4 in Figure 3.2). Further, an increase in entropy along one dimension can be offset by a decrease in entropy along a second dimension without affecting the average entropy of all channels.
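The insensitivity of the averaging estimator to correlated channels can be checked numerically. The sketch below is only an illustration: the two joint distributions are hypothetical (the actual distributions plotted in Figure 3.2 are not reproduced here), and the function names are assumptions. Both distributions share the same per-channel histograms, so a two-channel analogue of (3.9) reports 1 bit for each, even though their joint entropies differ.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution (empty bins ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def averaging_estimate(joint):
    """Mean of the per-channel (marginal) entropies, a 2-channel analogue of (3.9)."""
    joint = np.asarray(joint, dtype=float)
    marginals = [joint.sum(axis=1), joint.sum(axis=0)]
    return float(np.mean([entropy_bits(m) for m in marginals]))

# Hypothetical joint distributions over two X values and two Y values.
independent = np.array([[0.25, 0.25],
                        [0.25, 0.25]])   # channels are independent
correlated = np.array([[0.40, 0.10],
                       [0.10, 0.40]])    # channels are correlated

h_independent = entropy_bits(independent)      # joint entropy, 2 bits
h_correlated = entropy_bits(correlated)        # joint entropy, about 1.72 bits
avg_independent = averaging_estimate(independent)
avg_correlated = averaging_estimate(correlated)
```

The joint entropies differ while the two averaged estimates coincide, which is the failure mode described above.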


Figure 3.3: A simple 2D example showing the limitations of the grouping estimator. Panels: (a) Example 1 (H = 2 bits); (b) Example 2 (H = 1.58 bits); (c) X and Y grouped. The two unique distributions in (a) and (b) have the same 1D histogram when grouping the samples from the different channels (shown in (c)). The grouping estimator obtains the same entropy estimate for both examples (H = 1 bit) despite the fact that (a) has higher entropy than (b).

Meanwhile, Figure 3.3 shows a simple example where the grouping estimator in (3.10) would have difficulty. The two discrete 2D distributions in Figures 3.3a and 3.3b are again unique, with different entropy. Grouping the samples from the X and Y channels in each of the two distributions results in identical 1D histograms, as shown in Figure 3.3c. The grouping estimator would provide the same estimate for both 2D distributions despite their differences in true entropy.

Unlike the averaging estimator, the properties of a distribution that can expose the limitations of the grouping estimator cannot be concisely expressed. Yet, Figure 3.3 shows that such limiting situations exist and would affect further analysis based on these entropy estimates.

3.3 Methods: Nearest Neighbour Estimators

We propose the use of the nearest-neighbour entropy estimator to determine the information content of diffusion MRI volumes. The nearest-neighbour distances (denoted in this section by $\eta$) can be calculated using an appropriate diffusion MRI distance metric.

3.3.1 The Shannon Entropy Estimator

The nearest-neighbour entropy estimator was first presented by Kozachenko and Leonenko [129] and is given as

$$H_{nn}(X) = \frac{d}{N} \sum_{j=1}^{N} \log_2(\|\eta_j\|) + \log_2\!\left( \frac{\pi^{d/2} (N-1)}{\Gamma\!\left(\frac{d}{2} + 1\right)} \right) + \frac{\gamma}{\ln(2)} \tag{3.11}$$

where $N$ is the number of samples in set $X$, $\eta_j$ is the distance vector from sample $x_j$ to its nearest neighbour in set $X$, $d$ is the number of dimensions in the entropy estimation (i.e., $d = \mathrm{length}(\eta_j)$), $\Gamma$ is the standard Gamma function, and $\gamma$ is the Euler-Mascheroni constant ($\gamma \approx 0.5772$). The error of this estimator is known to be bounded as $|H_{nn}(X) - H(X)| \le O(N^{-1/2})$ [132]. Given that a typical diffusion tensor image of an adult brain has $N > 10^5$ tensors, this error, in practice, is rather small. Further, the estimator is known to be consistent, meaning that $\lim_{N \to \infty} H_{nn}(X) = H(X)$ [129].

The nearest-neighbour estimator, as presented in (3.11), stems from observations of nearest-neighbour graphs and the weak law of large numbers [184], in particular that the sum of edge weights in a nearest-neighbour graph is exponentially related to the Shannon entropy of the graph's coordinates in the embedding space.

The nearest-neighbour estimator has the undesirable property of generating an estimate of negative infinity when presented with non-unique data (i.e., $\eta_j = 0$). This is not a significant concern, as two methods have been proposed in the literature to address these infinite estimates. First, we can slightly perturb the data prior to computing the nearest-neighbour distances to ensure uniqueness of the samples in $X$ [133]. An alternative approach is to simply add a small constant $\epsilon$ to each nearest-neighbour distance $\eta_j$ [132]. We implement the latter approach here and note that, in practice, non-uniqueness occurs for a very small percentage of the data (e.g., for the data used in the results section, it affected only a small fraction of the tensors per image).

While the guaranteed consistency of the estimator is appealing, so too is the estimator's flexibility when it comes to measuring distance. We are free to calculate $\eta_j$ using any appropriate distance metric. In our context of tensor-valued data, this allows us to use any established tensor distance metric, including the Euclidean metric [182], the log-Euclidean metric [13], the Riemannian metric [33, 183], and the J-Divergence [225] (a comparison of these distance metrics can be found in [182] and is beyond the scope of this thesis). These distance metrics are shown in Equations (3.12)-(3.15) respectively, and their use in computing $\eta_j$ in (3.11) gives us four different entropy estimators for tensor-valued data:

$$d_E(D_1, D_2) = \| D_1 - D_2 \|_2 \tag{3.12}$$

$$d_{LE}(D_1, D_2) = \| \log(D_1) - \log(D_2) \|_2 \tag{3.13}$$

$$d_R(D_1, D_2) = \left\| \log\!\left(D_1^{-1/2} D_2 D_1^{-1/2}\right) \right\|_2 \tag{3.14}$$

$$d_J(D_1, D_2) = \frac{1}{2} \sqrt{\operatorname{tr}\!\left(D_1^{-1} D_2 + D_2^{-1} D_1\right) - 6}. \tag{3.15}$$

Note that our proposed entropy estimator can be equally applied to other dMRI models using their appropriate distance metrics [69, 26].
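A minimal sketch of the estimator in (3.11) combined with the four tensor distances in (3.12)-(3.15) is given below. It is an illustration under stated assumptions rather than the thesis implementation: the function names are hypothetical, the matrix norm is taken to be the Frobenius norm, the nearest-neighbour search is brute force, and d = 6 (the number of unique tensor elements) is an assumed choice for 3x3 tensors.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def spd_apply(D, func):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, Q = np.linalg.eigh(D)
    return (Q * func(w)) @ Q.T

def d_euclidean(D1, D2):
    return float(np.linalg.norm(D1 - D2))          # Frobenius norm (assumption)

def d_log_euclidean(D1, D2):
    return float(np.linalg.norm(spd_apply(D1, np.log) - spd_apply(D2, np.log)))

def d_riemannian(D1, D2):
    S = spd_apply(D1, lambda w: w ** -0.5)          # D1^(-1/2)
    M = S @ D2 @ S
    M = 0.5 * (M + M.T)                             # remove round-off asymmetry
    return float(np.linalg.norm(spd_apply(M, np.log)))

def d_j_divergence(D1, D2):
    t = np.trace(np.linalg.solve(D1, D2) + np.linalg.solve(D2, D1))
    return 0.5 * math.sqrt(max(t - 2.0 * D1.shape[0], 0.0))

def nn_entropy(samples, dist, d, eps=1e-12):
    """Nearest-neighbour entropy estimate in bits, following the form of (3.11).

    samples: sequence of samples; dist: callable distance; d: dimensionality.
    eps is the small constant added to each distance to avoid log(0).
    """
    n = len(samples)
    log_sum = 0.0
    for j in range(n):
        # Brute-force nearest neighbour (fine for a sketch, slow for images).
        eta = min(dist(samples[j], samples[k]) for k in range(n) if k != j)
        log_sum += math.log2(eta + eps)
    log_ball = (0.5 * d * math.log(math.pi) + math.log(n - 1)
                - math.lgamma(0.5 * d + 1.0)) / math.log(2.0)
    return d / n * log_sum + log_ball + EULER_GAMMA / math.log(2.0)
```

Passing a different `dist` callable switches between the four estimators; for a real image, a spatial index or approximate nearest-neighbour search would replace the quadratic loop.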

3.3.2 Extension to Mutual Information

Note that (3.11) can also be used to estimate the joint entropy of two distributions from sample sets $X$ and $Y$. If we consider the joint random variable $Z = (X, Y)$ and its sample set $Z = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, it has been proposed in [222] that $H_{nn}(X, Y) = H_{nn}(Z)$. This joint entropy estimate could then be used to compute the additional information-theoretic measures in (3.5)-(3.7). It has been noted, however, that the differences in scale between the joint space ($Z$) and marginal spaces ($X$ and $Y$), combined with competing bias correction terms (the last two terms in (3.11)) from the entropy estimates, can lead to biased estimates of the measures in (3.5)-(3.7) [130].

We propose instead the use of α-Mutual Information (α-MI), first presented for scalar data in [162]. Like our proposed entropy estimator, the α-MI estimator relies on nearest-neighbour distances to obtain an estimate. Unlike our proposed entropy estimator, α-MI is based on Renyi entropy, a generalization of the Shannon entropy seen in (3.1). The nearest-neighbour estimator for α-MI is

$$MI(X, Y, \alpha) = \frac{1}{\alpha - 1} \log \left( \frac{1}{N^{\alpha}} \sum_{i=1}^{N} \left( \frac{\eta_i(Z)}{\sqrt{\eta_i(X)\,\eta_i(Y)}} \right)^{2\beta} \right) \tag{3.16}$$

where $Z = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, $\eta_i(X)$ is, as in (3.11), the nearest-neighbour distance of $x_i \in X$ using the tensor distance metrics in (3.12)-(3.15), and $\beta = d(1 - \alpha)$. In this work, as in [162], we used $\alpha = 0.99$ to approximate the Shannon entropy case.

3.4 Results: Comparison of Entropy Estimators

For the following experiments, we estimate entropy using seven different estimators from the aforementioned two estimator families:

Binning Estimators: (a) the FA estimator, (b) the averaging estimator, and (c) the grouping estimator. Unless otherwise stated, bin widths are chosen using Scott's rule

$$h = \frac{3.5\,\sigma}{N^{1/3}} \tag{3.17}$$

where $\sigma$ is the standard deviation of the scalar data and $N$ is the number of scalar samples [201].

Proposed Nearest-Neighbour Estimators: nearest-neighbour distances computed using the (d) Euclidean distance metric, (e) log-Euclidean distance metric, (f) Riemannian distance metric, and (g) J-Divergence.

We aim to show how our proposed nearest-neighbour entropy formulation provides more robust and consistent behaviour than existing binning estimators in the application areas of noise estimation, image segmentation, and image registration.
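For the binning family, the FA estimator can be sketched by combining the FA definition in (3.8) with a histogram whose bin width follows Scott's rule (3.17). The code below is a hedged illustration: the function names are hypothetical, Frobenius norms and the population standard deviation are assumed, and rounding the bin count up is one of several reasonable conventions.

```python
import numpy as np

def fractional_anisotropy(D):
    """FA of a 3x3 tensor following (3.8), using Frobenius norms (assumption)."""
    deviatoric = D - np.trace(D) / 3.0 * np.eye(3)
    return float(np.sqrt(1.5) * np.linalg.norm(deviatoric) / np.linalg.norm(D))

def scott_bin_width(values):
    """Bin width h = 3.5 * sigma / N^(1/3), as in (3.17)."""
    values = np.asarray(values, dtype=float)
    return 3.5 * values.std() / len(values) ** (1.0 / 3.0)

def binned_entropy(values):
    """Histogram entropy estimate in bits, in the spirit of (3.2)."""
    values = np.asarray(values, dtype=float)
    h = scott_bin_width(values)
    span = values.max() - values.min()
    # Fall back to a single bin when all values coincide (h == 0).
    n_bins = max(1, int(np.ceil(span / h))) if h > 0 else 1
    counts, _ = np.histogram(values, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def fa_entropy(tensors):
    """FA estimator: scalarize each tensor to its FA, then bin."""
    return binned_entropy([fractional_anisotropy(D) for D in tensors])
```

The averaging and grouping estimators would reuse `binned_entropy` on the individual tensor channels or on their union, respectively.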


Figure 3.4: Synthetic tensor data sampled from Wishart distributions of different degrees of freedom (panels: 5 DOF, 10 DOF, 25 DOF, 50 DOF) and same mean tensor diag(3, 1, 1). Note that as the degree of freedom increases, the variability of the sample data decreases.

3.4.1 Noise Estimation

The objective of this experiment is to test whether the given entropy estimators behave consistently with respect to noise levels within a tensor image. We expect that, on average, as noise increases, entropy also increases.

For this experiment, two-dimensional tensor images were generated by sampling from a Wishart distribution. The Wishart distribution is defined over a nonnegative-definite matrix-valued random variable with a given scale matrix V and degree of freedom k [232]. The degree of freedom of the Wishart distribution follows the convention of statistical estimation; as the degree of freedom increases, the variability of the samples produced by the distribution decreases. Sample data for various degrees of freedom are shown in Figure 3.4. We vary the degree of freedom parameter between 3 and 50 while keeping the mean tensor fixed at diag(3, 1, 1) (other choices of mean tensor gave similar results). Fifty synthetic images are generated for each degree of freedom and entropy estimates are computed for each estimator-image pair.

The mean entropy estimates for each noise level are shown in Figure 3.5a. Note how the nearest-neighbour estimators provide a much higher entropy estimate than the binning estimators. This result stems directly from our proposed estimator avoiding the dimensionality reduction used by the binning estimators. By reducing the size of the space that the sampled data can occupy, the binning estimators reduce the unpredictability of the sampled tensor space. It is precisely this unpredictability that entropy measures. By avoiding this dimensionality reduction, our proposed estimators do not suffer from this entropy loss.
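The synthetic data generation for this experiment can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the function name is hypothetical, and it assumes the scale matrix is chosen as V = mean / k so that the Wishart mean kV equals the requested mean tensor. Each sample is built as a sum of k outer products of Gaussian vectors with covariance V.

```python
import numpy as np

def sample_wishart_tensors(mean, dof, count, rng):
    """Draw `count` symmetric tensors from a Wishart distribution.

    mean:  desired mean tensor (dim x dim, positive definite)
    dof:   degree of freedom k (an integer, at least dim)
    The scale matrix is set to mean / dof (an assumption of this sketch).
    """
    mean = np.asarray(mean, dtype=float)
    dim = mean.shape[0]
    chol = np.linalg.cholesky(mean / dof)           # V = chol @ chol.T
    # Each row of g[n] is a Gaussian vector with covariance V.
    g = rng.standard_normal((count, dof, dim)) @ chol.T
    # Sum of outer products over the dof axis gives one Wishart sample.
    return np.einsum('nki,nkj->nij', g, g)

rng = np.random.default_rng(0)
mean_tensor = np.diag([3.0, 1.0, 1.0])
noisy = sample_wishart_tensors(mean_tensor, dof=5, count=1000, rng=rng)
smooth = sample_wishart_tensors(mean_tensor, dof=50, count=1000, rng=rng)
```

Under this construction, samples drawn with a higher degree of freedom scatter less around the mean tensor, matching the trend described for Figure 3.4.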
Figure 3.5 also shows the entropy estimates grouped by estimator families. For our proposed nearest-neighbour estimators in Figure 3.5b, we see that, as expected, entropy monotonically decreases as noise in the images decreases. However, we notice erroneous behaviour for the binning estimators in Figure 3.5c. At high noise levels, the entropy estimates provided by the binning estimators increase as noise decreases. Further, the averaging and FA estimators effectively plateau around a degree of freedom of 20. These inconsistencies appear as a direct result of the dimensionality reductions performed prior to entropy estimation. As a result of the dimensionality reductions, the fidelity of the entropy estimates to the original data is lost, thereby producing inconsistent estimator behaviour.

Figure 3.5: Entropy estimates for noisy synthetic data. Panels: (a) Entropy Estimates; (b) NN Distance Estimators; (c) Binning Estimators. Entropy estimates using various estimators are shown in (a) and grouped into distance-based (b) and binning-based (c) methods for a close-up view. Note that, as expected, the distance-based estimates decrease as noise decreases, while the binning-based estimates behave inconsistently with respect to the underlying uncertainty in the data.

3.4.2 Application to Image Segmentation

The problem of image segmentation can be described as the problem of splitting an image into two or more regions that are homogeneous in some fashion. As entropy is related to variability in a set of samples, segmentation methods have been proposed that define an optimal partition of an image as that which results in homogeneous regions with minimal entropy values in each region [105]. The estimators described herein can be used in such a context and we therefore examine the properties of our proposed estimators for image segmentation³.

To examine the potential of our proposed entropy estimators in this context, we estimate the entropy of the fifty labelled white matter fiber bundles of the ICBM DTI-81 Atlas [155]. Using these ground truth atlas segmentations, we proceed to gradually perturb and worsen the segmentation. We expect that for a measure of entropy to be effective for segmentation, the measure must increase as the image partitions include more and more voxels from other anatomical regions with different diffusion properties. To gradually worsen the segmentation one iteration at a time, we morphologically dilate the segmented regions with a spherical structuring element of radius r = 1 voxel. Entropy is then estimated using each estimator after each dilation step. A maximum of seventy-five region dilations were performed for each of the fifty segmented fiber bundles.

Figure 3.6 displays how the different estimators behave with respect to different quality segmentations. A representative result is shown for the left Cingulum in Figure 3.6a. As expected, each entropy estimator provides a minimum entropy value at the ground truth segmentation with higher entropy values for poorer quality segmentations. However, for the FA estimator, the increase in entropy was not monotonic as the segmentation quality got worse. Instead, we see a local minimum develop at around 3 ROI dilations. Such local minima are a common cause of difficulties for gradient-based optimization schemes commonly used to solve image segmentation problems.

Figure 3.6b shows the number of anatomical regions (out of 50) for which no local minimum was created as the segmentation was perturbed. The higher the number, the more suitable the measure is for segmentation. We see that, for all regions, our proposed entropy estimators do not generate additional local minima, whereas the binning-based estimators introduce non-convexity into the optimization landscape for one or more of the regions in this experiment.

³ We emphasize that our goal is not to perform and validate segmentation, but rather to study the effects of entropy estimation on a key aspect of segmentation algorithms (the objective function landscape). We do not carry out the optimization leading to segmentation in order to avoid polluting the comparisons with other optimization-related parameters (e.g., step size in gradient descent) that are irrelevant to the contribution of this work.
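The two landscape properties examined in this experiment, the presence of superfluous local minima and the slope near the optimum, can be quantified from an entropy-versus-dilation curve. The sketch below is a hypothetical criterion written for illustration; the text does not specify how minima were detected or over how many dilation steps the slope was averaged, so both the definition of a local minimum and the `steps` parameter are assumptions.

```python
def count_spurious_minima(entropy_curve):
    """Count interior local minima of an entropy-vs-dilation curve.

    Index 0 is taken to be the ground-truth segmentation, so any interior
    point lower than its left neighbour and no higher than its right
    neighbour is counted as a superfluous local minimum.
    """
    count = 0
    for i in range(1, len(entropy_curve) - 1):
        if (entropy_curve[i] < entropy_curve[i - 1]
                and entropy_curve[i] <= entropy_curve[i + 1]):
            count += 1
    return count

def initial_slope(entropy_curve, steps=3):
    """Average entropy increase per dilation over the first `steps` dilations."""
    return (entropy_curve[steps] - entropy_curve[0]) / steps

# Hypothetical curves: one monotonic, one with a dip after a few dilations.
monotonic_curve = [1.0, 1.6, 2.1, 2.5, 2.8]
dipping_curve = [1.0, 1.4, 1.5, 1.45, 1.6, 1.8]
```

A curve with zero spurious minima and a large initial slope corresponds to the favourable objective landscape discussed in the text.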


Figure 3.6: Entropy estimates from various estimators for different quality segmentations of white matter fiber bundles in the ICBM DTI-81 Atlas. Panels: (a) Entropy as a function of ROI dilation (segmentation error) for the left Cingulum; (b) Number of regions without superfluous local minima; (c) Slope of entropy function near optimal segmentation. Figure (a) shows the entropy estimates as a function of ground truth ROI dilation for the left Cingulum. Note the slight dip in the entropy estimate for the FA estimator at around 3 ROI dilations (red arrow). To better quantify non-convexity, we study the existence of local minima. Figure (b) shows, for each estimator, the number of regions in this experiment where we avoid generating misleading local minima. Note that our proposed estimators are the only ones that, in all cases, did not generate additional local minima, thereby demonstrating their greater suitability in a gradient-based optimization of a DTI segmentation problem. Figure (c) shows the average slope of the entropy vs. ROI dilation curves near the optimal segmentations. The greater slope provided by our proposed estimators suggests faster convergence in the optimization step of an image segmentation problem.

Figure 3.6c displays the mean slope of the entropy versus ROI dilation curves for each estimator around the ground truth segmentations. We see a significantly higher slope for our proposed estimators than for existing binning-based estimators. Such high slopes are beneficial for optimization methods used in image segmentation as they increase the rate of convergence of the optimizer, thereby generating a solution in fewer computational steps. These results provide evidence that our proposed estimators would be better suited for energy-minimizing segmentation by generating a more favourable objective function landscape.

3.4.3 Application to Image Registration

Finally, we aim to show improvements in image registration using the mutual information estimates provided by our nearest-neighbour estimators⁴. We perform this task by computing the image registration energy landscape around the optimal alignment for given pairs of tensor-valued images. We expect that our proposed estimators would generate a more robust and distinguishable optimum value than the existing binning-based estimators.

Twelve diffusion tensor MR images publicly available from the Laboratory of Brain Anatomical MRI (LBAM) at Johns Hopkins Medical Institute were used in this experiment [153]. For each of the twelve diffusion tensor MR images in the dataset, we generate candidate registered (i.e., transformed) images by applying a deformation to the original image followed by the addition of noise. For our deformation step, we apply rigid rotations about the inferior-superior axis (the image Z-axis) followed by tensor reorientation. The image is then decomposed into its six diffusion weighted images using the methods in [230]. Rician noise of zero mean and a given standard deviation is added to each diffusion weighted image. The tensors are then recomputed using the log-least-squares approach in [230] to generate our candidate image.
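The noise step of this pipeline can be sketched as below. This is an assumption-laden illustration: the function name is hypothetical, and Rician noise is modelled in the usual way as the magnitude of a complex signal whose real and imaginary parts each receive zero-mean Gaussian noise of standard deviation sigma. The decomposition into diffusion weighted images and the log-least-squares refit from [230] are not reproduced here.

```python
import numpy as np

def add_rician_noise(signal, sigma, rng):
    """Return a Rician-corrupted copy of a non-negative magnitude image.

    Zero-mean Gaussian noise with standard deviation `sigma` is added to the
    real and imaginary channels, and the magnitude is taken afterwards.
    """
    signal = np.asarray(signal, dtype=float)
    real = signal + rng.normal(0.0, sigma, signal.shape)
    imag = rng.normal(0.0, sigma, signal.shape)
    return np.sqrt(real ** 2 + imag ** 2)

rng = np.random.default_rng(1)
clean_dwi = np.full((64, 64), 1.0)       # hypothetical diffusion weighted image
noisy_dwi = add_rician_noise(clean_dwi, sigma=0.050, rng=rng)
```

Because the magnitude is taken after adding the noise, the corrupted signal stays non-negative, which keeps the logarithm in a subsequent log-least-squares fit well defined wherever the signal is non-zero.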
We compute the mutual information using (3.6) and (3.16) between the original image and the candidate image for various degrees of rotation (from θ = -20° to θ = 20°) and different levels of additive Rician noise (standard deviation from σ = 0.005 to σ = 0.050). The ranges of rotations and noise levels are displayed in Figures 3.7a and 3.7b respectively.

Figure 3.7c shows the resulting energy landscape for the case without additive noise. We note that all estimators obtain an optimum at the correct alignment, as expected, but that the slope around the optimal solution is clearly much higher for our nearest-neighbour estimators than for the binning estimators. Again, this greater slope improves the convergence rate of many optimizers used in image registration.

Figure 3.7d displays how each estimator behaves as we add noise to the candidate images. For each noise level, we compute the optimization margin: the mutual information at the optimum minus the asymptotic baseline mutual information value. We see that as we add noise, the margin for the binning estimators decreases rapidly while the margin for our proposed estimators decreases at a significantly slower rate. This sharper decrease in margin for the binning estimators demonstrates that they are more sensitive to slight differences in data than our proposed nearest-neighbour estimators.

Figure 3.7: Mutual information estimates using various tensor-valued estimators for rotational misalignments with additive noise. Panels: (a) Range of rotations around the inferior-superior axis (θ = -20°, θ = 0°, θ = 20°); (b) Range of additive Rician noise levels (including σ = 0.005 and σ = 0.020); (c) Mutual information as a function of rotation angle; (d) Margin loss as a function of additive noise. Figure (a) shows the range of z-axis rotations tested while figure (b) shows the range of additive Rician noise values. Mutual information for different rotational alignments is shown in (c) for the case without additive Rician noise. Note the significantly higher slope around the optimum for our proposed estimators. Figure (d) shows how the margin between the optimal mutual information value and the baseline changes as noise increases. Note the added robustness of our nearest-neighbour estimators in the presence of additive noise.

⁴ As in the image segmentation experiment, our goal is to study the effects of entropy estimation on the objective function landscape, not to perform the registration itself.

3.5 Discussion

Our primary motivation in creating these information content estimators for dMRI data was to address the longitudinal image registration problem highlighted in Figure 3.1. Specifically, we wish to align a preterm infant's dMRI scan taken soon after birth to one taken weeks later. Notable studies suggest that a preterm infant's abnormal neurodevelopmental outcomes may be identifiable by examining the rate at which the brain develops [66, 150]. By performing this longitudinal image registration and measuring the differences between the aligned scans, we may be able to identify patterns of development that are indicative of abnormal neurodevelopmental outcome. Such an analysis requires accurate image registration to ensure that image differences are true anatomical changes and are not a result of misalignment.

To explore the potential of our dMRI mutual information measure in image registration, we tested the α-mutual information measure in (3.16), with the log-Euclidean diffusion tensor metric in (3.13), in registering the two scans shown in Figure 3.1. Our registration algorithm was built on the α-mutual information B-spline registration of Staring et al. [206] that is currently implemented in elastix [126]. We expanded on the work of Staring et al. by adding the log-Euclidean diffusion tensor metric for the computation of nearest-neighbour distances. Figure 3.8 shows the original dMRI scans for the preterm infant as well as the result of registering the infant's first scan to their follow-up scan.
While the log-Euclidean, α-mutual information registration appears to align the corticospinal tracts and the corpus callosum well, there are noticeably large alignment errors around the optic radiations (highlighted in the lower white ovals) and the anterior portions of the internal capsule (highlighted in the upper white ovals). We hypothesize that the registration failed in these regions because they are regions where the topology of the brain was not preserved. We can see on the earlier scan that there is a gap between the forceps minor (upward u-shaped tract) and the nearby anterior portion of the internal capsule (in green). In the follow-up scan, however, that gap has been reduced to the point that the two tracts are touching. The same phenomenon can be seen between the forceps major (downward u-shaped tract) and the nearby optic radiations (in green). There too, the gap between the two fiber tracts disappears over time. It has been noted that the brain changes topology during the neonatal time period [82], and these results agree with those earlier observations.

Figure 3.8: Axial slices of a preterm infant's brain shown in diffusion tensor MRI at postmenstrual ages of 29 and 44 weeks respectively, as well as the result of registering the earlier scan to the later scan using the proposed mutual information measure (with the log-Euclidean distance metric). Panels: (a) 29 weeks; (b) 44 weeks; (c) 29 wks. registered to 44 wks. Note that the registration performs poorly as a result of large topological changes, highlighted in white, between the two original scans. These topological changes cannot be accounted for in diffusion tensor image registration. See text for further details.

To obtain an accurate alignment for this longitudinal image registration problem, one would require an image registration algorithm that allows for changes in topology. Unfortunately, dMRI registration techniques cannot accommodate changes in topology once a diffusion model (like the diffusion tensor) has been fit to the diffusion-weighted images. The reason for this limitation relates to the geometric relationship between the fiber tracts and the measured diffusion. One of the main strengths of dMRI is that the fiber tracts align with the direction of maximal diffusion [91]. As we deform one image to align it to another, we are implicitly deforming the geometry of the fiber tracts. To maintain the relationship between the tract geometry and the measured diffusion values, the diffusion models have to be rotated in step with the deformation [9] so that their maxima remain aligned with the deforming tracts. The rotations required for the diffusion models are computed from the inverse of the deformation's Jacobian matrix [9]. In locations where topology is not preserved, the deformation's Jacobian matrix is not invertible [221], and the rotation matrix for the diffusion models cannot be computed.
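The dependence of tensor reorientation on an invertible Jacobian can be illustrated with a small sketch. The scheme below is an assumption: it extracts the rotational part of a Jacobian J through the polar decomposition R = (J Jᵀ)^(-1/2) J, a common finite-strain style choice that may differ from the exact procedure of [9] used in the thesis, and it does not settle whether the forward or inverse Jacobian is supplied. The function names are hypothetical.

```python
import numpy as np

def rotation_from_jacobian(J, tol=1e-10):
    """Rotational part of a Jacobian via R = (J J^T)^(-1/2) J.

    Raises ValueError when J is (numerically) singular, which is the
    situation described for locations where topology is not preserved.
    """
    J = np.asarray(J, dtype=float)
    w, Q = np.linalg.eigh(J @ J.T)
    if w.min() <= tol:
        raise ValueError("Jacobian is singular; no rotation can be extracted")
    inv_sqrt = (Q / np.sqrt(w)) @ Q.T       # (J J^T)^(-1/2)
    return inv_sqrt @ J

def reorient_tensor(D, J):
    """Rotate a diffusion tensor in step with the local deformation."""
    R = rotation_from_jacobian(J)
    return R @ D @ R.T
```

For an invertible Jacobian combining a rotation and a stretch, the function recovers the rotation alone; for a Jacobian that collapses a direction, the inverse square root does not exist and the sketch raises an error instead of returning a rotation.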
As a result, addressing topological changes in dMRI registration will likely require working with the original diffusion-weighted images directly, where rotations can be accounted for by artificially rotating the gradient vectors.

3.6 Conclusions

We have presented, for the first time, estimators for entropy and mutual information of tensor-valued images. Using nearest-neighbour distances computed with metrics that respect the manifold of the tensor data, we generate entropy estimators that work directly in the tensor space and require no dimensionality reduction. Our proposed estimators are also guaranteed to behave consistently and the error in their estimates has a theoretical bound. We have shown that our entropy estimators behave as expected in the presence of varying levels of noise, a result not seen with state-of-the-art methods [4, 103, 209]. Further, we applied our estimators to DTI image segments that mismatch, to various degrees, their underlying anatomical regions. Results show that our estimators' use in image segmentation can lead to objective function landscapes that are easier to optimize (with higher slopes and devoid of local minima). Finally, we have shown that, using nearest-neighbour distances, we can obtain estimates of mutual information between two images that are more robust to noise, a desirable and key result for image registration algorithms.

While our application to preterm infant longitudinal dMRI registration was limited by the additional challenge of changing brain topology, we stress that a wide variety of computer vision methods rely on information theory. We believe that our tensor-valued information content estimators can show further benefits in other areas of application (e.g., feature detectors, recognition) [81]. We also note that our approach extends naturally to higher order diffusion MRI data by using existing distance metrics [26, 69].

82 Chapter 4 Diffusion MRI Segmentation via Cross-sectional Piecewise Constancy 4.1 Introduction and Motivation Segmentation has become an important component of medical image analysis as it is a way of extracting anatomical structure from image data 1. The resulting structures can then be examined in terms of their shape, size, and appearance to obtain useful biomarkers of clinical measures. In diffusion MRI (dmri), segmentation is often used to delineate axonal fiber tracts connecting functional brain regions [188]. In preterm dmri analysis, these tract segmentations are commonly used to collect statistics about the diffusion properties of the segmented tract [1, 31, 101]. Initial attempts to segment fiber tracts focused around performing streamline tractography, then defining the segment as the set of voxels that contain the streamlines (e.g., [168]). However, the goal of tractography is to capture a tract s direction and orientation, not its width. As a result, tractography cannot capture fine details along the surface of a fiber tract, routinely leading to under-segmentation [24]. Instead of relying on a collection of 3D streamlines with an unclear encapsulating surface, segmentation algorithms that label the underlying 3D image domain are preferred for defining a tract s volumetric region. Among these volumetric dmri segmentation algorithms, many assume a piecewiseconstant model of the image [74, 139] where each image segment can be approximated by a constant function value. However, given the fact that dmri data is tied to the orientation of a tract, the success of these piecewise constant approaches is limited to tracts that have little curvature (e.g., corpus callosum [74], optic radiation [80]). For highly-curved tracts like 1 This chapter is based on our published work in [49]. 58

the cingulum shown in Figure 4.1a, the dMRI data can have high intra-tract variability, making a constant value a poor approximation of the tract's diffusion properties.

A good approximation of a tract's diffusion properties is important not just for segmenting highly-curved tracts as a single structure, but also for distinguishing fiber tracts from surrounding tissue with similar diffusion properties. The latter situation is a concern in the preterm infant brain, where fiber tracts are still emerging and are harder to identify visually. One example of such tracts is the inferior occipitofrontal fasciculi (IOF) shown in Figures 4.1b and 4.1c. The IOF appear as the large anterior-posterior tracts, easily identifiable in green in an adult dMRI scan, yet the same tracts in the preterm brain are difficult to visually distinguish from the surrounding tissue.

Figure 4.1: Examples of the additional challenges seen for preterm infant dMRI segmentation. (a) Cingulum segmentation. (b) Adult IOF. (c) Preterm IOF (29 wks.). One challenge, shown in (a), is that curved fiber tracts like the cingulum possess non-homogeneous diffusion properties, making their dMRI data difficult to model. An additional challenge is that fiber tracts which are easily identifiable in adult dMRI scans are still emerging in the preterm brain. One example is the inferior occipitofrontal fasciculi (IOF) highlighted by white arrows in (b) and (c). As can be seen in (c), these emerging fiber tracts are more difficult to distinguish from their surrounding tissue.

Segmenting highly curved, or emerging, tracts in dMRI scans requires extending segmentation techniques to more accurately represent a tract's appearance. This can be done either by increasing the complexity of the image model (e.g., piecewise smooth [220]) or by pre-processing the dMRI scan so that tracts either appear more homogeneous or contrast better with their surrounding tissues.
The latter approach has been more popular over the past decade, with examples including segmentation based on pre-computed edge information [84] and clustering voxels using local statistics pre-computed from Parzen windows [19]. More recently, tractography results have been used to provide global tract shape information as input to the segmentation process [24, 148, 166], allowing for the pre-processing of an image based on the orientation of the tract. This global shape information may well

complement the local appearance information obtained using Parzen windowing or edge maps, yet individually, these approaches are limited by either susceptibility to noise or lack of fidelity between the data and the image model [148].

Figure 4.2: Our proposed segmentation workflow. Tractography is employed to generate an anchor curve (blue), which is then used to generate cross-sections of the fiber bundle (magenta). For each cross-section, we measure diffusion dissimilarities between the points on the plane and the intersection point between the plane and the anchor curve. These dissimilarities are then interpolated back into a 3D image, and a scalar segmentation algorithm provides us with the final segmentation.

We propose that a hybrid approach, where local appearance information is combined with global shape information, can show increased segmentation accuracy for diffusion MR images. We base this hybrid algorithm on the assumption that a tract's cross-section (i.e., the plane perpendicular to the tract's local direction) shows relatively constant diffusion. Under this assumption, we generate an approximation of the tract's medial curve and, from that curve, we define an intrinsic frame of reference for the tract through the use of Frenet frames. We can then homogenize the dMRI data within a tract by comparing, within a cross-section, the data at each point to the data at its representative point on the medial curve. By performing this dMRI data homogenization, we simplify the image content to the point where a piecewise-constant image segmentation algorithm can be applied effectively. Our results on both synthetic and real data show improved segmentation quality compared to state-of-the-art methods [19, 74, 84, 139, 166], particularly in areas of crossing fiber tracts.

4.2 Methods Overview

Figure 4.2 displays the general workflow of our segmentation algorithm. We use an anchor curve, obtained via tractography, to define cross-sections through our fiber bundle. We then model each cross-section as having a piecewise-constant appearance. Under this model, we are able to compute dissimilarities between the diffusion in the cross-sectional plane and the diffusion at the intersection of the anchor curve and the cross-section. This provides us with 2D piecewise-constant dissimilarity images embedded in the 3D image space. The fiber tract is then segmented from an interpolated 3D dissimilarity map. The following subsections elaborate on each step.

Anchor Curve Generation

Given a dMR image $I : \Omega \to \mathcal{M}$ that maps a point $x$ in our image space $\Omega \subset \mathbb{R}^3$ to a diffusion representation (e.g., tensor, ODF) on a manifold $\mathcal{M}$, we can generate an anchor curve $r : [0, 1] \to \Omega$ from various tractography algorithms. In this work, we employ the minimal path tractography algorithm of Zalesky [235] to generate $r$. This tractography algorithm is effectively an implementation of Dijkstra's algorithm on a graph where the nodes are image voxels and edges connect 26-neighbours in 3D. Edge weights for the graph used by Zalesky's algorithm were computed analytically [47].

While generating an anchor curve can be accomplished using any tractography algorithm, the quality of the resulting curve is dependent on the choice of algorithm. For example, the tensor deflection (TEND) tractography algorithm [135] is known to have difficulty tracking through areas of high curvature, which would make it a poor choice here if we were to segment the cingulum in Figure 4.1a.
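As a rough illustration of the graph search described above, the following sketch runs Dijkstra's algorithm between two seed voxels on a 26-connected grid. It is not Zalesky's implementation: the function names and the `edge_weight` callable are hypothetical, and the placeholder weight (geometric step length) merely stands in for the analytically computed diffusion-based weights of [47], which are not reproduced here.

```python
import heapq
import itertools
import math


def minimal_path(shape, edge_weight, start, goal):
    """Dijkstra shortest path between two voxels on a 26-connected 3D grid.

    shape       : (nx, ny, nz) size of the voxel grid.
    edge_weight : callable (u, v) -> positive cost of stepping from voxel u
                  to neighbouring voxel v (hypothetical; supplied by caller).
    start, goal : (i, j, k) voxel index tuples (the two seeds).
    """
    # All 26 neighbour offsets (every combination except the zero step).
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=3) if any(o)]
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for o in offsets:
            v = (u[0] + o[0], u[1] + o[1], u[2] + o[2])
            if not all(0 <= v[a] < shape[a] for a in range(3)):
                continue  # neighbour falls outside the image
            nd = d + edge_weight(u, v)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    # Walk back from the goal to recover the ordered anchor-curve voxels.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def euclidean_step(u, v):
    """Placeholder edge weight: the geometric length of the step."""
    return math.dist(u, v)
```

Passing the edge weight as a callable keeps the search independent of how diffusion data is turned into costs, so a diffusion-derived weight can be substituted without touching the traversal.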
One advantage of the minimal path tractography algorithm we use here is that we can specify starting and ending seed points, allowing us to more reliably track through crossing fiber regions.

Obtaining Tract Cross-Sections

For each point $s$ along the given anchor curve $r$, we compute the Frenet frame defined by the curve's local tangent $T$, normal $N$, and binormal $B$ vectors:

$$T = \frac{\partial r}{\partial s}, \qquad N = \frac{\partial T / \partial s}{\lVert \partial T / \partial s \rVert}, \qquad B = T \times N.$$

The resulting normal and binormal vectors span (and parameterize) the cross-sectional plane normal to $r$. In the situation where $\partial T / \partial s = 0$, the curve's normal vector $N$ and, in turn, its binormal vector $B$ are undefined. In such cases, an alternative way of computing the local Frenet frame is required. Algebraically, we set

$$T(s) = R\,T(s_0), \qquad N(s) = R\,N(s_0), \qquad B(s) = R\,B(s_0),$$

where $s_0$ is the closest point to $s$ on $r$ that satisfies $\partial T / \partial s \neq 0$ and $R$ is a rotation matrix that aligns $T(s_0)$ with $T(s)$. This rotation preserves the orthogonality between $T$, $N$, and $B$, so we can be sure that $N$ and $B$ span our cross-sectional plane. Note that this rotation does not ensure that $N(s)$ and $B(s)$ are the true normal and binormal vectors for $r(s)$, but this is not a concern for our algorithm. We simply require two vectors that span our cross-sectional plane.

Given the local Frenet frame $\{T, N, B\}$ and a point $s$ on the anchor curve $r$, we generate points $x \in \Omega$ on the cross-sectional plane by sampling along the normal and binormal vectors:

$$x = r(s) + u N + v B. \tag{4.1}$$

The diffusion representation (e.g., tensor, ODF) at $x$, and correspondingly at $(u, v)$ in the cross-sectional image space $\Phi_s$, is linearly interpolated from the original image $I$. This procedure produces our cross-sectional images $I_s : \Phi_s \to \mathcal{M}$ and the corresponding 3D coordinates of each cross-sectional image pixel, $\Pi_s : \Phi_s \to \Omega$.

Cross-Sectional Piecewise Constancy

Our approach is based on the assumption that the diffusion data within a cross-section of a fiber bundle can be well-modelled using a piecewise-constant function. Given that a fiber tract is a collection of coherently aligned axons, we expect the diffusion within the cross-section of the tract to be similar to that at the plane's intersection with the anchor curve. Meanwhile, we expect diffusion on that cross-sectional plane but outside the fiber tract to be different from that at the plane's anchor curve intersection point. As a result, computing dissimilarities between the diffusion data on the anchor curve $I(r(s))$ and the diffusion data throughout the cross-section $I_s$ will provide us with a scalar feature $D_s$ that will correlate well with fiber bundle membership. Various dissimilarity metrics can be employed, including those for second-order tensors (e.g., [13]), 4th-order tensors [26], and spherical harmonic ODF representations [69].
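The frame construction and the plane sampling of Eq. (4.1) can be sketched as follows for a discretely sampled anchor curve. This is a minimal sketch under stated assumptions: the function names are hypothetical, derivatives are approximated by finite differences, the tangent is normalized to unit length, and "closest point" for the degenerate case is taken as closest in sample index rather than in arc length.

```python
import numpy as np


def rotation_between(a, b):
    """Rotation matrix taking unit vector a onto unit vector b (Rodrigues)."""
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.linalg.norm(v) < 1e-12:
        if c > 0:
            return np.eye(3)  # already aligned
        # Antiparallel: rotate by pi about any axis orthogonal to a.
        axis = np.cross(a, np.eye(3)[np.argmin(np.abs(a))])
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + K @ K / (1.0 + c)


def frenet_frames(curve, eps=1e-8):
    """Discrete Frenet frames (T, N, B) for an (n, 3) array of curve samples."""
    T = np.gradient(np.asarray(curve, dtype=float), axis=0)
    T /= np.linalg.norm(T, axis=1, keepdims=True)
    dT = np.gradient(T, axis=0)
    # Remove any component along T so that N is exactly orthogonal to T.
    dT -= np.sum(dT * T, axis=1, keepdims=True) * T
    norms = np.linalg.norm(dT, axis=1)
    valid = norms > eps
    if not valid.any():
        raise ValueError("curve has no point with a defined normal")
    N = np.zeros_like(T)
    N[valid] = dT[valid] / norms[valid, None]
    B = np.cross(T, N)
    valid_idx = np.flatnonzero(valid)
    for i in np.flatnonzero(~valid):
        # Degenerate point: borrow the frame of the nearest valid sample s0
        # and rotate it so that T(s0) aligns with T(s).
        j = valid_idx[np.argmin(np.abs(valid_idx - i))]
        R = rotation_between(T[j], T[i])
        N[i], B[i] = R @ N[j], R @ B[j]
    return T, N, B


def cross_section_points(r_s, N, B, half_width, step):
    """3D points x = r(s) + u N + v B on a regular (u, v) grid, as in Eq. (4.1)."""
    u = np.arange(-half_width, half_width + step / 2, step)
    uu, vv = np.meshgrid(u, u, indexing="ij")
    return r_s + uu[..., None] * N + vv[..., None] * B
```

The returned grid of 3D points plays the role of $\Pi_s$; the diffusion data would then be interpolated from the image at those locations to form $I_s$.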
Given a chosen metric $d(\cdot, \cdot)$, we employ the following mapping

$$D_s(u, v) = \log\left( \frac{d\big(I_s(u, v),\, I(r(s))\big)}{\mathrm{FA}\big(I(r(s))\big)} + \epsilon \right) \tag{4.2}$$

which applies a log mapping to the dMRI dissimilarities in order to expand the range of values within the fiber bundle (where dissimilarities are close to zero) and compress the range of values outside the fiber bundle (where dissimilarities have a much larger range). The results of this mapping are distributions of $D_s$ that are more unimodal within and outside the tract of interest, making these distributions easier to model in the subsequent segmentation step. The variable $\epsilon$ is added to ensure numerical stability of the log transformation. Finally,

we normalize the distances by the anchor tract's fractional anisotropy (FA) to ensure a more consistent range of dissimilarities across cross-sections. Further details about the impact of these distance mapping choices are presented in the discussion of design decisions in Section 4.4.

Mapping Dissimilarities to the Image Space

Once we have the dMRI distance features $D_s$ computed from (4.2) for a collection of points $\Pi_s$ defined by (4.1), we proceed with reconstructing a 3D distance image. This task is a basic scattered data interpolation problem, and we employ an approach based on radial basis functions and k-nearest neighbours. The interpolated 3D dissimilarity map $D$ is given as

$$D(x) = \sum_{i=1}^{k} \frac{\exp\big(-\lVert \Pi_{s(i)}(u(i), v(i)) - x \rVert\big)}{\sum_{j=1}^{k} \exp\big(-\lVert \Pi_{s(j)}(u(j), v(j)) - x \rVert\big)} \, D_{s(i)}(u(i), v(i)) \tag{4.3}$$

where $(u(i), v(i))$ in cross-section $s(i)$ is the $i$-th nearest neighbour to $x$ in $\Omega$. Effectively, voxel values for our 3D distance map are interpolated from the k-nearest neighbours in the cross-sectional data, and those nearest neighbours are weighted using a radial basis function.

Dissimilarity Map Segmentation

Using the local diffusion dissimilarities from (4.2), we have reduced our dMR image, with its variable region appearance and manifold-valued data, to a scalar image that is well modelled by a piecewise-constant function. As a result, it now makes sense to employ a piecewise-constant segmentation algorithm. We use a probabilistic variant of the Chan-Vese segmentation algorithm that minimizes

$$E(S, \mu_{in}, \sigma_{in}, \mu_{out}, \sigma_{out}) = \alpha \int_{\partial S} dx + \beta \int_{x \in S} -\log\big(p(x \mid \mu_{in}, \sigma_{in})\big)\, dx + \beta \int_{x \notin S} -\log\big(p(x \mid \mu_{out}, \sigma_{out})\big)\, dx \tag{4.4}$$

where $S \subset \Omega$ is the segmentation, $\mu_{in}, \sigma_{in}$ ($\mu_{out}, \sigma_{out}$) represent the mean and standard deviation of distances $D$ inside (outside) $S$, and the weights $\alpha, \beta$ regulate the trade-off between the contour regularization and image fidelity terms. We optimize the segmentation energy in (4.4) using the total variational approach of Bresson et al. [54].
Note that this is the same optimization scheme used by the competing approach of Niethammer et al. [166], though they use it to segment images of diffusion tensor primary eigenvectors. As with the approach of Niethammer et al., all voxels at a distance greater than $d_{max}$ from the anchor curve are set to belong to the background, while the foreground segment containing the anchor curve is taken as the final segmentation.
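To make the dissimilarity feature of Eq. (4.2) and the interpolation of Eq. (4.3) concrete, here is a minimal NumPy sketch for second-order tensors. The function names are hypothetical, the log-Euclidean metric is only one of the admissible choices for $d$, and the neighbour search is brute force for readability; a spatial index such as a k-d tree would be the practical choice at real image sizes.

```python
import numpy as np


def log_euclidean_distance(A, B):
    """Log-Euclidean distance between two symmetric positive-definite tensors."""
    def logm_spd(M):
        w, V = np.linalg.eigh(M)
        return (V * np.log(w)) @ V.T  # V diag(log w) V^T
    return np.linalg.norm(logm_spd(A) - logm_spd(B), ord="fro")


def fractional_anisotropy(D):
    """Fractional anisotropy of a 3x3 diffusion tensor."""
    w = np.linalg.eigvalsh(D)
    return np.sqrt(1.5 * np.sum((w - w.mean()) ** 2) / np.sum(w ** 2))


def dissimilarity_feature(tensor, anchor_tensor, eps=np.exp(-4)):
    """Eq. (4.2): log of the FA-normalized dissimilarity, offset by eps."""
    d = log_euclidean_distance(tensor, anchor_tensor)
    return np.log(d / fractional_anisotropy(anchor_tensor) + eps)


def interpolate_dissimilarity(points, values, queries, k=5):
    """Eq. (4.3): exp(-distance) weights over the k nearest samples, normalized.

    points  : (n, 3) 3D locations of cross-sectional pixels.
    values  : (n,) dissimilarity features at those locations.
    queries : (m, 3) voxel centres at which to interpolate.
    """
    # Brute-force pairwise distances (assumption: small problem size).
    d = np.linalg.norm(queries[:, None, :] - points[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    dk = np.take_along_axis(d, idx, axis=1)
    w = np.exp(-dk)
    return np.sum(w * values[idx], axis=1) / np.sum(w, axis=1)
```

Because the weights are normalized over only the k nearest samples, each voxel value is a convex combination of nearby features, so the interpolated map stays within the range of the input dissimilarities.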

4.3 Experimental Setup and Results

To evaluate the effectiveness of our segmentation approach, we perform two quantitative experiments, one on the synthetic phantom presented in [161] and another on 18 cingulum bundles from dMRI scans from the IXI database. In both cases, resulting segmentations were compared to expert-drawn manual segmentations using the Dice similarity coefficient (DSC). The segmentation algorithms from [74, 84, 139, 166] are used as comparison methods. In all cases, $k = 5$, $\epsilon = \exp(-4)$, and the log-Euclidean distance metric was used to compute the dissimilarity maps [13]. To ensure fairness of comparison between segmentation algorithms, we optimize the weights of all energy terms in all segmentation algorithms (e.g., $\alpha$, $\beta$) using genetic algorithms. Results are shown for the weights that produce the maximum DSC.

Phantom Experiment

Figure 4.3a displays the middle slice of the synthetic phantom from [161]. We seek to segment the ring tract in the phantom in order to test our algorithm's ability to handle both tract curvature and crossing regions. We further test the impact of image noise by adding Rician noise of different magnitudes to the phantom. Twenty-five noisy images are generated for each noise level, and all competing segmentation methods are applied.

Figure 4.3b shows the resulting DSC for each algorithm and noise level. Note that our approach significantly outperforms the algorithms presented in [74, 84, 139, 166]. The reasons for this improvement can be seen in Figure 4.3c. The approach of Niethammer et al. [166], which assumes a piecewise-constant image after rotating tensors to the anchor curve's Frenet frame, has difficulty segmenting the crossing regions, where the global piecewise-constant assumption does not hold. However, our assumption of cross-sectional piecewise constancy still holds in these regions, resulting in a more accurate segmentation. Further, Niethammer et al.
rely only on the primary eigenvector for segmentation, leading to over-segmentation that leaks into isotropic regions in which the primary eigenvector may align with those within the segment of interest. Meanwhile, the geodesic active contours approach of Feddern et al. [84] is limited by poor edge information due to image noise. Finally, the piecewise-constant segmentation approaches of Descoteaux et al. [74] and Lenglet et al. [139] (the latter of which adds a geodesic active contour edge term to the segmentation energy) fail to model the tensor image appropriately, leading to poor segmentations. Our approach avoids these problems by applying the piecewise-constant image model on a per cross-section basis.
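For reference, the Dice similarity coefficient used throughout these experiments, together with a count of under-segmented voxels, can be computed from binary masks as in the sketch below. The function names are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here rather than something stated in the text.

```python
import numpy as np


def dice_coefficient(seg, truth):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) of two binary masks."""
    seg = np.asarray(seg, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = seg.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(seg, truth).sum() / total


def under_segmented_voxels(seg, truth):
    """Number of ground-truth voxels missed by the segmentation."""
    seg = np.asarray(seg, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return int(np.logical_and(truth, ~seg).sum())
```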

Figure 4.3: Segmentation results for the ring tract in (a). (a) Crossing fiber phantom (from [161], visualized using [25]). (b) DSC for the phantom at different noise levels. Inset are FA maps for three of the noise levels, including σ = 0 and σ = 0.02. (c) Sample segmentation results for all methods at noise level σ = 0.01: the proposed method, Niethammer et al. [166], Feddern et al. [84], Descoteaux et al. [74], and Lenglet et al. [139]. Over-segmentation is shown in yellow while under-segmentation is shown in red. The ground truth segmentation is shown in gray. Note that we obtain significantly higher Dice coefficients than competing methods as we are able to better model curved tracts and fiber crossings. Further, our approach generates consistent results across various noise levels.

4.3.2 Real Data Experiment

We employ 18 expertly-drawn manual segmentations of cingulum bundles from 9 dMRI scans as ground truth segmentations to test the accuracy of our algorithm on real data. Figure 4.4b shows a representative example of the region of interest and its corresponding anchor curve. Note that the curvature of the cingulum makes a piecewise-constant image model a poor choice for this segmentation task.

Figure 4.4a shows the resulting DSC for all 18 cingulum segmentations. Note that our proposed approach performs better than the competing methods, even beating the average DSC reported on only two cingulum bundles in [19]. The piecewise-constant approaches in [74, 139] failed to segment the cingulum. Instead, the segmentation leaked and delineated the corpus callosum, seen in red in Figure 4.4b. Meanwhile, the geodesic active contours approach of Feddern et al. [84] showed difficulty dealing with noisy edge information, leading to over-segmentation.

The closest competing method to ours is that of Niethammer et al. [166], one sample of which is shown in Figure 4.4c. Although the DSC values are somewhat comparable (Fig. 4.4a), our approach showed a consistent (i.e., over all 18 tracts) reduction in the amount of under-segmentation compared to [166]. This reduction, likely due to our use of a more localized image appearance model, was most pronounced around the genu and splenium (as highlighted by the blue arrows). Quantitatively, we observed a significant reduction of 10.48% in the number of under-segmented voxels relative to Niethammer et al. [166] (p = 0.018).

4.4 Discussion

We have proposed a cross-sectional piecewise-constant model for diffusion MRI segmentation, allowing us to combine local diffusion information with global shape information.
We presented the segmentation algorithm in Figure 4.2 based on this image model and have shown its ability to segment accurately in crossing regions and in highly curved tract regions.

Figure 4.4: Results on the segmentation of cingulum bundles from real dMRI scans. (a) Dice coefficients for all algorithms over 18 cingulum bundles. Results for Awate et al. are over only 2 datasets (taken from [19]). Although Niethammer et al.'s DSC is comparable, it suffered from localized under-segmentation in difficult areas (see text and panel (c)). (b) Sample anchor curve (yellow) and ground truth segmentation (gray) of a cingulum bundle. (c) Sample cingulum segmentations (gray) for the proposed method and Niethammer et al. [166], with under-segmentation shown in red. Note that we obtain higher Dice coefficients than competing methods. For the methods that were able to segment the cingulum, we were better able to reduce under-segmentation, as highlighted by the blue arrows in (c).

Design Decisions

The proposed segmentation algorithm consists of multiple steps, each of which includes its own design decisions. We review these design decisions in detail here and justify the choices we made in the creation of this algorithm.

Anchor Curve Generation

We employed the minimal path tractography algorithm of Zalesky [235] to generate our anchor curves. One limitation of the minimal path tractography algorithm is the quantization effects that arise from the graph representation of the dMRI data. The image space is discretized, and the edge connectivity leads to a limited, discrete set of directions along which a tract can grow. The result of these discretizations is that the generated tracts can look jagged (an artifact commonly known as the stair-step artifact) and will have difficulty following the exact curvature of the underlying axons.

To reduce the quantization effects of the minimal path tractography algorithm in our work, we up-sampled our dMR images to a resolution of 1 mm³, bringing the resolution of our data in line with those used in other graph-based tractography algorithms [47, 108, 205, 235]. Further, we smoothed the resulting anchor curve $r(s) = [x(s), y(s), z(s)]$ by applying an averaging filter of size 5 mm to each of $x(s)$, $y(s)$, and $z(s)$.

In practice, we noted that these quantization effects had little measurable impact on our algorithm, even without the up-sampling and smoothing performed here. This result is due to the fact that, despite producing a jagged anchor curve, the resulting cross-sections produced in our experiments still satisfied the piecewise constancy assumption. This fortunate outcome may not occur in all situations, so up-sampling the image and smoothing the anchor curve is recommended.

We did notice, however, that the quantization effects from minimal path tractography had a significant effect on the approach of Niethammer et al. [166], where tensors are reoriented to the Frenet frame of the anchor curve prior to assuming a global piecewise-constant image model. In this case, the quantization errors in the anchor curve led to errors in the tensor reorientation, leading to a noisier tensor field within the tract of interest. For example, when we did not up-sample the image and smooth the anchor tract, we saw a reduction in Dice similarity coefficient (DSC) of roughly 0.05 (from 0.72 to 0.67) for the approach of Niethammer et al. in our phantom experiment.
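The anchor curve smoothing described above amounts to a moving average applied to each coordinate separately. The sketch below assumes the curve is sampled at 1 mm spacing, so that a 5 mm filter corresponds to a window of 5 samples, and it replicates the endpoints to keep the curve length unchanged; neither detail is specified in the text, and the function name is hypothetical.

```python
import numpy as np


def smooth_curve(curve, window=5):
    """Moving-average filter applied independently to x(s), y(s), and z(s).

    curve  : (n, 3) array of anchor-curve samples.
    window : odd number of samples in the averaging window.
    """
    curve = np.asarray(curve, dtype=float)
    pad = window // 2
    # Replicate the endpoints so the output has as many samples as the input.
    padded = np.pad(curve, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    return np.stack(
        [np.convolve(padded[:, a], kernel, mode="valid")
         for a in range(curve.shape[1])],
        axis=1,
    )
```

Straight stretches of the curve pass through unchanged away from the endpoints, while stair-step corners are rounded off, which is the effect sought here.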
It was this sensitivity of the approach of Niethammer et al. to the minimal path tractography that led us to up-sample the image and smooth the anchor tract, thereby providing us with a more accurate comparison of the two methods.

Obtaining Tract Cross-Sections

When producing cross-sections of the fiber tract, the first question that arises is how many cross-sections to generate. At a high level, one would like to ensure that the shape of the tract being segmented does not vary significantly between two neighbouring cross-sections. This is the hidden assumption under which we can linearly interpolate the final 3D distance map. If the shape of our tract of interest varies significantly between neighbouring cross-sections, then the interpolated 3D distance map would not capture that variability.

In our work, we sampled the anchor curve, and produced tract cross-sections, at both 1 mm intervals (i.e., roughly voxel resolution) and 0.5 mm intervals. We found no measurable differences between the results at these two sampling frequencies, which makes sense as little discernible change in shape can occur at or below voxel resolution. A more

intelligent solution would be to vary the frequency of cross-sections in proportion to the curvature of the anchor curve, but this idea has yet to be explored.

When generating the 2D cross-sectional images, we also need to decide what in-plane resolution to use. In this work, we tested cross-sectional plane pixel sizes of 1 mm², 0.6 mm², and 0.3 mm². We found that the qualitative appearance of the resulting 3D distance maps improved as the resolution increased, but saw no changes in the resulting DSC in the segmentation.

Another concern with the cross-sectional images is their overall length and width. In our work, we extend the cross-sectional images until they leave the original image space $\Omega$, except when the anchor curve intersects the cross-sectional plane in more than one location. For such occurrences, we crop the cross-sectional plane at the midpoint between anchor curve intersection points. A more intelligent solution would be to limit the size of the cross-sectional planes in relation to the spatial distance $d_{max}$ used in the segmentation step.

Finally, we note that, depending on the quality of the seeds provided to generate the anchor curve, the curve may not extend to the extremities of the tract. In these cases, the cross-sectional planes may not cover the full tract from end to end. However, we also note that at the endpoints of the anchor curve ($s = 0$ and $s = 1$), the vectors $T$ and $N$ do define a plane that extends beyond the anchor curve to capture the tract's extremities. The same can be said for a plane defined by $T$ and $B$. Further, we can rotate these two planes around $N$ and $B$ respectively to sweep out the extreme ends of each tract, as shown in Figure 4.5. These planes can then be added to our collection of cross-sectional planes and processed in the same manner. In our work, we selected our endpoints carefully, but this additional sweep of planes can reduce the dependence on providing quality seeds.
Figure 4.5: At the endpoints of the anchor curve, we can create additional planes spanned by {T, B} and {T, N} respectively to capture the extremities of the fiber tract. (a) Rotating {T, B} planes about B. (b) Rotating {T, N} planes about N.

Cross-Sectional Piecewise Constancy

Ideally, the cross-sectional piecewise constancy assumption should ensure that diffusion dissimilarities computed between the anchor curve and the cross-sectional image show a clear, bimodal distribution. One mode would represent distances within the tract of interest and the second mode would represent distances from outside the tract of interest. Unfortunately, two concerns arise that impact our ability to obtain the bimodal distribution we desire:

1. The impact of FA on diffusion dissimilarities: With mean diffusivity being relatively constant throughout the brain, the majority of diffusion dissimilarity is dependent upon orientation and anisotropy. Further, the magnitude of diffusion dissimilarity due to orientation will vary as anisotropy varies. For example, two tensors that vary only in orientation will have a smaller dissimilarity if they are nearly isotropic than if they are highly anisotropic. As a result, anisotropy plays a major role in the range of dissimilarities that can be obtained between diffusion measurements. As our algorithm computes diffusion dissimilarities with the anchor curve on a per cross-section basis, the range of dissimilarities we obtain will be different for each cross-section. In particular, in areas where the FA of the anchor curve is low, the range of dissimilarities will be smaller than where the FA of the anchor curve is high. As a result, when we map these dissimilarities back into 3D to perform segmentation, the division between tract and non-tract dissimilarity values becomes blurred. Figures 4.6a and 4.6c show an example of this effect on the phantom used in our experiments. The blue and red histograms show the dissimilarity values inside and outside the tract of interest respectively. In Figure 4.6a, it can be seen that these dissimilarities overlap considerably.
However, when we divide the dissimilarity values by the FA of the anchor curve, the dissimilarity values within each class become more clearly separable, as can be seen in Figure 4.6c. Normalizing dissimilarity values by the FA on the anchor curve helps maintain a consistent range of dissimilarities across the 3D image space $\Omega$. For this reason, we employ this FA normalization in our algorithm.

2. Disproportionate ranges in diffusion dissimilarities: Given the cross-sectional piecewise constancy assumption, one would expect diffusion dissimilarities within the tract of interest to cover a rather tight range of small values. Meanwhile, diffusion dissimilarities outside the tract of interest, where a wide variety of anatomical fibers exist, could potentially cover a large range of values. In fact, we see this phenomenon in practice, as shown in Figures 4.6a and 4.6c. These distributions of dissimilarities within and outside the tract are particularly difficult to model in a segmentation algorithm as there is no clear bimodal distribution of values. In fact, the histogram in Figure 4.6c could just as easily be modelled using three Gaussian distributions (e.g.,

two for outside the tract and one for inside the tract) as it could be modelled by one Gaussian distribution.

Figure 4.6: Distance image histograms for the four cases related to FA normalization of distances and log mapping of the distances. (a) No FA normalization, no log mapping. (b) No FA normalization, with log mapping. (c) FA normalized, no log mapping. (d) FA normalized, with log mapping. Histograms were computed for the synthetic phantom used in our experiments, with σ = 0.02. Distances within the tract are shown in blue while distances outside the tract are shown in red. Note that including FA normalization makes the classes more easily separable. Meanwhile, applying the log mapping leads to more Gaussian-like behavior and similar magnitude ranges for each class, thereby making the distribution of distances within each class easier to model.

To ensure that our diffusion dissimilarity values better conform to bimodal distributions that segmentation algorithms can efficiently deal with, we perform a log mapping of the FA-normalized dissimilarity data. The effect of the log mapping is seen in Figure 4.6d. The range of diffusion dissimilarities within the tract of interest is expanded to generate a more pronounced Gaussian distribution within that class. Further, the range of diffusion dissimilarities outside the tract of interest is condensed to create

a distribution that more closely resembles one that is unimodal. The resulting segmentation can more accurately separate the classes in Figure 4.6d than in Figure 4.6a as a result of this log mapping.

Mapping Distances to Image Space

Generating a 3D image of our cross-sectional diffusion dissimilarities is a standard scattered data interpolation problem. There are many ways to perform scattered data interpolation [87], with the most popular being based on radial basis functions (RBFs) and Delaunay triangulation. Given the large image space and large number of cross-sectional image pixels generated in our work, these popular methods consume large amounts of memory (> 20 GB on real dMRI data), making them ill-suited for practical use.

Instead, we use a variant of the radial basis function approach by assuming more compact support. Voxel values for our 3D distance map are interpolated from the k-nearest neighbours in the cross-sectional data, and those nearest neighbours are weighted using an RBF. The compact support created from using only the k-nearest neighbours (and not all data points) allows us to avoid the large memory requirements of traditional scattered data interpolation methods.

With the phantom image used in our experiments, the interpolation problem was small enough to perform a full radial basis function interpolation. We used the resulting 3D distance maps from this full RBF interpolation to select our choice of k in the compact-support RBF interpolation approach. Setting k = 5 in our algorithm provided us with 3D distance maps that had the minimal sum-squared difference between the 3D images interpolated from the compact and full RBF approaches.

Distance Map Segmentation

To segment the tract of interest from the 3D distance map, we employ a probabilistic version of the Chan-Vese segmentation method as presented in [54]. The initial binary segmentation for the algorithm is provided by traditional k-means clustering of the image data.
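A k-means initialization of this kind can be sketched for a scalar distance map as below. This is an illustrative two-class version only: the function name, the choice of the image minimum and maximum as starting centres, and the convention that the low-distance cluster is the tract are all assumptions, not details taken from the text.

```python
import numpy as np


def two_means_init(dist_map, n_iter=50):
    """Binary initial segmentation from 2-class k-means on a scalar image.

    Returns a boolean mask that is True for the low-distance cluster
    (assumed here to correspond to the tract of interest).
    """
    x = np.asarray(dist_map, dtype=float)
    lo, hi = x.min(), x.max()
    if lo == hi:
        return np.ones(x.shape, dtype=bool)  # constant image: single class
    for _ in range(n_iter):
        mask = np.abs(x - lo) <= np.abs(x - hi)  # True = closer to low centre
        new_lo, new_hi = x[mask].mean(), x[~mask].mean()
        if new_lo == lo and new_hi == hi:
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return mask
```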
Given that the algorithm in [54] reduces to a convex optimization problem, we expect that alternative initializations should also give similar results, but the convergence speed may differ.

Segmenting Preterm Infant dMRI

Our primary motivation in improving dMRI segmentation techniques is to obtain an algorithm that can accurately segment fiber tracts that are emerging in the preterm infant brain. We showed an example of these emerging tracts in the IOF of the infant's brain in Figure 4.1c. We hypothesize that our segmentation approach can more accurately segment that preterm infant's IOF than competing algorithms. To explore that hypothesis, we applied our segmentation algorithm, as well as the four competing algorithms used earlier, to the IOF from the preterm infant dMRI scan shown in Figure 4.1c.

Figure 4.7 displays the segmentation results on the preterm infant IOF for our technique as well as those of Niethammer et al. [166] and Descoteaux et al. [74]. For this example, we found that our algorithm generated a result more similar to the manually delineated ground truth than the two competing algorithms shown in the figure. For the approach of Niethammer et al., the algorithm was only able to identify two regions if the smoothness term was significantly high. Increasing the weight on that smoothness term made the algorithm unable to identify the narrow endpoints of the tract. For the algorithm of Descoteaux et al., the Gaussian used to model the diffusion inside the IOF was not an accurate enough choice and led to oversegmentation. Two other competing algorithms were also applied to this segmentation task: the active contour approach of Feddern et al. [84], and the hybrid active contour and active region approach of Lenglet et al. [139]. In both cases, the algorithm was unable to identify multiple regions.

(a) Ground Truth (b) Proposed (c) Niethammer et al. [166] (d) Descoteaux et al. [74]
Figure 4.7: Segmentation results on the inferior occipitofrontal fasciculi (IOF) in the brain of an infant born preterm (the same infant shown in Figure 4.1c). The ground truth segmentation, shown in (a), was drawn manually by an expert. Note that our proposed segmentation algorithm provided a result in (b) that is visually similar to the ground truth, while the two competing algorithms either undersegmented (c) or oversegmented (d) the IOF.

This preliminary result for the dMRI segmentation of preterm infant fiber tracts is encouraging in the sense that our algorithm generated an anatomically plausible segmentation result, something the competing algorithms were unable to accomplish.
Even so, a more thorough validation of this technique would be required before concluding that our algorithm shows greater accuracy in segmenting preterm infant fiber tracts.

4.5 Conclusions

We proposed herein a cross-sectional piecewise constant model for diffusion MRI segmentation, allowing us to combine local diffusion information with global shape information.

Using an anchor curve obtained via tractography, we are able to generate cross-sections of the tract and apply the piecewise constant model at that local level. Using that model, we were able to homogenize the dMRI data so that a piecewise constant image segmentation algorithm can subsequently be applied. We have shown that the resulting segmentation algorithm is better capable of handling curved tracts and crossing regions than many competing methods [19, 74, 84, 139, 166]. We further applied our algorithm to the segmentation of an emerging tract in the preterm infant brain and the results were promising. These results suggest that our cross-sectional piecewise constant model can be important in accurately modelling the diffusion measurements of different fiber tracts and in obtaining quality segmentation results.

Chapter 5

Global, Competitive, Closed-Form Probabilistic Tractography

5.1 Introduction and Motivation

One of the strengths of diffusion MRI (dMRI) is that it provides us with the ability to non-invasively assess the integrity and neural connectivity of the brain's white matter.¹ The connectivity is inferred from the measured local diffusion of water molecules, where diffusion rates are known to be maximal along the direction of the underlying axons [36]. The innovation of dMRI has led to the ability to perform tractography: the delineation of neural connections in the brain.

¹ This chapter is based on our published work in [47, 48].

Initial streamline approaches gave us the ability to visualize these connections by generating 3D space curves tangent to the direction of maximal local diffusion [30, 154]. These initial methods were limited in that they did not consider the effects of noise, partial voluming, or other imaging artifacts. The streamlines are also formed independently of each other, making it impossible for them to adjust to the position of neighbouring tracts. Further, the streamlines grow based on local information; only the local diffusion profile and the current curvature of the streamline are used to determine its immediate next growth step. This combination of local decision-making, independent curve formation, and disregard of possible errors in the diffusion measurements can lead streamlines off the path of the underlying neuronal fibers, resulting in a phenomenon known as tract jumping [156].

The phenomenon of tract jumping is amplified when scanning preterm infants. The smaller size of the infant's brain combined with the limited resolution of dMRI leads to a greater amount of aliasing between tracts. The effects of this aliasing can be seen in Figure 5.1. For a seed region placed in the splenium of the corpus callosum, the degree to which we see spurious streamlines connecting to the anterior portion of the brain is much higher


in the preterm brain than in the adult brain. These spurious streamlines give clinicians a false sense of connectivity between functional brain regions.

(a) Adult DTI - Splenium (b) 29 wk. Preterm - Splenium
Figure 5.1: Examples of tract jumping in both the adult and preterm infant brain. The streamline tractography (performed by FACT [30]) results are shown for streamlines passing through a region-of-interest located in the splenium. Note that in both cases, the forceps major (i.e., the upside-down U-shaped tract) is correctly identified, but that the preterm infant's tractography result shows a greater degree of spurious streamlines in the anterior direction. Those spurious streamlines are examples of tract jumping in the tractography algorithm.

Later tractography algorithms attempted to address tract jumping in various ways. Probabilistic tractography methods use a Monte Carlo generation of streamline curves where the streamlines grow based on a noise model [40, 88]. These probabilistic techniques incorporate the impact of noise into the resulting output, which provides a measure of confidence for the connections the algorithm maps out. Approaches involving diffusion simulation have also been proposed [62, 124]. These methods simulate diffusion in a local neighbourhood and use the resulting diffusion front, along with various heuristics, to discover the underlying fiber tracts. By simulating the diffusion over the whole local neighbourhood, these techniques incorporate more dMRI data into growing a tract than streamline-based tractography algorithms. Minimal path-based approaches have also been proposed based on Dijkstra's algorithm [108, 235]. While these are popular due to their short computation time, guarantees on optimality, and their ability to encode the noise models, they still generate tracts in a local and greedy fashion.
Any information we wish to incorporate into tractography algorithms is currently limited to a local scale (e.g., fractional anisotropy, local tract curvature). Recent global tractography algorithms have attempted to identify fiber tracts through

a combination of global optimization and curve fitting [3, 85]. Even so, these algorithms continue to be driven solely by local tract properties. What is common to all these algorithms is that their estimates of the fiber tracts are independent of each other's locations, thus overlooking the fact that these tracts exist in the same spatial domain and affect each other's position.

To improve the robustness of tractography algorithms, we must address the aspects of tractography algorithms that limit their effectiveness: locally-driven decision making and independent tract formation. With these characteristics in mind, we propose a novel approach to tractography by incorporating the notion of competition. At a high level, we propose a tractography algorithm where tracts compete for space within the brain. This competition is introduced using multiple target regions in a graph-based random walk framework [95]. The steps of the random walker are driven by the dMRI data so that the walker's trajectory aligns with the underlying neural pathways. In this framework, we compute the probability that a random walker will reach a particular target region first. The temporal aspect of this random walk encodes the notion of competition into the tractography process. The connection probabilities are then computed as a closed-form solution that explicitly incorporates the effects of multiple target regions.

We present our tractography algorithm in the following section and apply the algorithm to both synthetic data and 30 diffusion MR images from the MIDAS database [60]. Results show that introducing competition allows us to reduce erroneous connectivity while also reducing the effect local noise has on the resulting connection probabilities.

5.2 Methods

Overview

We begin by stating the problem formally. Let $I : \Omega \to M$ be a diffusion MRI volume that maps a point $x$ in the image space $\Omega$ to a fiber orientation distribution function (fODF) $\psi_x \in M$.
The fODFs $\psi_x(r)$ capture the probability of a fiber tract being incident to the unit vector $r$. Fiber ODFs can be computed from any of the diffusion models presented in Chapter 2. Typical techniques include sharpening the diffusion model [98, 127, 71, 125], using statistical sampling techniques [39, 88, 134, 120], or learning from training data [212]. We consider a point $x$ and its relation to a set of target regions $\Upsilon = \{R_1, \ldots, R_K\}$ in terms of a random walk problem. Our goal is to determine a mapping

\[ z : \Upsilon \times x \times I \to [0, 1]^K \tag{5.1} \]

that would tell us the probabilities of a dMRI-driven random walker, originating at a position $x$, stumbling on target region $R_k$ before any other region in $\Upsilon \setminus R_k$. By controlling the random walker using the fODF information, the probabilities we generate are interpreted


as connection probabilities in the same way as traditional probabilistic tractography results (i.e., tractograms). A visual example of the random walk problem is shown in the figure below.

(a) A random walk on an image grid. The trajectory of the walker is shown in green. (b) The grid's edge weights are computed from the diffusion model.
Figure 5.2: Our proposed tractography algorithm is based on a random walk over the image grid where the grid's edge weights are computed from the dMRI data; the more diffusion we measured along an edge direction, the more likely the random walker will select that edge.

dMRI Graph Embedding

To set up the random walk problem described above, we first require a graph-based representation on which the random walk can be modelled. As in [108, 235, 205], we represent a diffusion MRI volume as an undirected graph $G = (V, E)$ where the set of nodes $V$ is the set of voxels, i.e., $V = \{v \in \Omega\}$, and edges $e_{ij} \in E$, $E \subseteq V \times V$, connect neighbouring voxels in the dMRI volume. As the fODFs capture only local directional diffusion information, we restrict the graph's edges to a voxel's 26 neighbours in 3D. Finally, a weight function $w(e_{ij}) : E \to \mathbb{R}^+$ assigns a strength to each edge. Note that the edge weight corresponds to the likelihood of the random walker traversing that edge.

We would like the edge weight $w(e_{ij})$ to be proportional to the probability of an anatomical connection between voxels $i$ and $j$. Graph-based tractography methods employ a widely-used weighting function that fits that description [108, 235, 205]. It is given as

\[ w(e_{ij}) \propto P_{odf}(i, r_{ij}) + P_{odf}(j, r_{ji}) \tag{5.2} \]

where $r_{ij}$ is the direction of the vector connecting the voxels implied by edge $e_{ij}$ and $P_{odf}$ is given by

\[ P_{odf}(v, r_{uv}) = \frac{1}{Z(v)} \int_{(\theta,\phi) \in \beta} \psi_v(\theta, \phi)\, ds \tag{5.3} \]

Here, $\beta$ is the solid angle around the edge direction $r_{uv}$ and $\psi_v$ is the fODF at voxel $v$. The variable of integration $ds$ is the infinitesimal spherical surface element and $Z(v)$ is a scaling constant set as in [108, 205]. Typically, $\beta$ spans the spherical cap of a cone centred around $r_{uv}$ [108, 205]. A graphical example of this weighting function is shown in Figure 5.2b.

The integration in (5.3) has been shown to be necessary in capturing the diffusion information between voxels, particularly in areas of high diffusion anisotropy where the direction of maximal diffusion does not align with an edge [235]. To date, this ODF integral has been approximated numerically using a tessellation of the sphere [108, 235, 205]. This numerical integration is a concern not only in terms of approximation error, but also in terms of the computational burden of performing numerical integration for the large number of edges connecting voxels in a typical 3D dMRI field. In the following section, we derive an exact analytical solution to the edge weight integration in (5.3) and we show how the derivation of this analytical solution is achieved using a spherical harmonic representation of the fODF along with precise angular rotations that address the complicated limits of integration.

Analytical Edge Weight Integration

Spherical harmonics have been a popular choice in representing fiber ODFs from various diffusion models [71, 115, 236]. An ODF $\psi_v(\theta, \phi)$ at voxel $v$ can be represented using a real spherical harmonic expansion as

\[ \psi_v(\theta, \phi) = \sum_{l=0}^{K} \sum_{m=-l}^{l} F_l^m Y_l^m(\theta, \phi) \tag{5.4} \]

where $F_l^m$ are coefficients and $Y_l^m(\theta, \phi)$ are the harmonic basis functions. The integer $m$ ($-l \le m \le l$) is known as the harmonic's order while the integer $l \ge 0$ is the degree of the harmonic.
The harmonic basis functions are defined using the associated Legendre functions $P_l^m$ as

\[ Y_l^m = \begin{cases} \sqrt{\frac{2l+1}{4\pi}}\, P_l^0(\cos(\phi)), & \text{if } m = 0 \\ N(l, m)\, P_l^m(\cos(\phi)) \cos(m\theta), & \text{if } m > 0 \\ N(l, m)\, P_l^m(\cos(\phi)) \sin(m\theta), & \text{if } m < 0 \end{cases} \tag{5.5} \]

where $N(l, m) = \sqrt{2}\,\sqrt{\frac{2l+1}{4\pi} \frac{(l-m)!}{(l+m)!}}$ is a normalization constant. Note that $\theta$ represents the azimuthal angle (from the positive x-axis) and $\phi$ the polar angle (from the positive z-axis) of the spherical coordinate system. For fiber ODFs, only the even-numbered degrees are required due to the antipodal symmetry of the diffusion MRI imaging sequence [115]. Although our method works for any $l$, typically degrees $l \le 8$ are used as larger-degree harmonic representations often over-fit the dMRI data [71].
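To make the size of such an expansion concrete, the following minimal sketch enumerates the (l, m) index pairs that remain when only even degrees are kept. The function name `even_degree_indices` is hypothetical and introduced only for illustration; it is not part of the thesis code.

```python
def even_degree_indices(max_degree):
    """List the (l, m) index pairs for even degrees l = 0, 2, ..., max_degree.

    Hypothetical helper for illustration: each even degree l contributes
    the 2l + 1 orders m = -l, ..., l.
    """
    if max_degree < 0 or max_degree % 2 != 0:
        raise ValueError("max_degree must be a non-negative even integer")
    return [(l, m)
            for l in range(0, max_degree + 1, 2)
            for m in range(-l, l + 1)]


# Number of coefficients per fODF for the degrees mentioned in the text.
for max_degree in (6, 8):
    pairs = even_degree_indices(max_degree)
    print(max_degree, len(pairs))
```

Under these assumptions, a degree-6 expansion carries 28 coefficients per voxel and a degree-8 expansion carries 45, which is the closed form (l + 1)(l + 2) / 2 for an even maximum degree l.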

Considering $\psi_v(\theta, \phi)$ as our fODF from (5.3) in the spherical coordinate system, we can greatly simplify the edge weight integration. Using the linearity of integration, we obtain

\[ \int_\beta \psi_v(\theta, \phi)\, ds = \sum_{l=0}^{K} \sum_{m=-l}^{l} F_l^m \int_\beta Y_l^m(\theta, \phi)\, ds. \tag{5.6} \]

Note that the effect of the dMRI data is limited to the setting of the constants $F_l^m$ and that the integration occurs only over the spherical harmonic basis functions. As a result, the integral in (5.6) does not vary over the graph and only needs to be calculated once for each edge direction. Given integral limits defining the solid angle $\beta$, the integration in (5.6) can be performed offline to speed up later graph constructions.

While we have been able to reduce the number of integrals that need to be calculated, we have yet to show that the integrals in (5.6) can be calculated analytically. The first step in achieving this goal is to rewrite the integral limits in a way that facilitates an analytical solution. In [108, 205], the area of integration includes all directions $(\theta, \phi)$ that are within an angular distance $\alpha$ of the edge direction (denoted as $(\theta_0, \phi_0)$). This results in a cone of influence with its central axis along $(\theta_0, \phi_0)$ and an apex angle of $2\alpha$. The spherical cap created by this cone is our area of integration, $\beta$, and is usually expressed as a solid angle. The angle $\alpha$ is usually chosen to obtain a spherical cap with a solid angle of $4\pi/M$, where $4\pi$ sr is the solid angle of the complete sphere and $M$ is the neighbourhood size (e.g., $M = 26$ for 26-connectivity).

In general, integrating over the spherical cap of this cone leads to complicated limits of integration that make an analytical solution to the integral difficult, if not impossible [229]. We have, however, identified a special case where these integral limits simplify into constant values.
If an edge is aligned with the direction $\phi = 0$ (or $\phi = \pi$), then the small circle used to define our limits of integration can be traced by setting $\phi = \alpha$ and letting $\theta$ vary between zero and $2\pi$. The spherical cap defining our area of integration becomes the set of concentric circles centred at $(0, 0)$ and having angular radii less than $\alpha$. As such, we can integrate this spherical cap with the following integral limits:

\[ \phi \in [0, \alpha] \quad \text{and} \quad \theta \in [0, 2\pi]. \tag{5.7} \]

These constant integral limits are imperative in obtaining an analytical solution to (5.3), as we will show later in this section. Note that these integral limits are only valid for the special case where the edge is aligned with $\phi = 0$. In order to be able to use these integral limits for any arbitrarily-directed edge, we must first rotate the spherical harmonic expansion so as to align the edge in question with $\phi = 0$. The rotation of a spherical harmonic expansion is performed by multiplying the coefficients $F_l^m$ by a rotation matrix

$R_l$. We create the rotation matrices using the recurrence relations defined in [109] to obtain the rotated spherical harmonic coefficients. While there is a computational expense in rotating the ODFs, we will show in the results section that this cost is not prohibitive. Further, we note that these rotation matrices can be precomputed and only need to be computed once for a given graph connectivity scheme.

At this point, we have shown that by using spherical harmonics and rotations, we can simplify (5.3) to

\[ P_{odf}(v, r_{uv}) = \sum_{l=0}^{K} \sum_{m=-l}^{l} F_{v,l}^m \int_0^{\alpha} \int_0^{2\pi} Y_l^m(\theta, \phi) \sin(\phi)\, d\theta\, d\phi \tag{5.8} \]

where $F_{v,l}^m$ are the rotated spherical harmonic coefficients for voxel $v$. Note that $ds$ has been replaced with the appropriate substitution $ds = \sin(\phi)\, d\theta\, d\phi$. What remains to be shown is that the simplified integral in (5.8) can be solved analytically and that solving (5.8) analytically is more computationally efficient than performing numerical integration.

Given the real spherical harmonic basis and the integral limits from (5.7), we can prove that for $m \neq 0$:

\[ \int_0^{\alpha} \int_0^{2\pi} Y_l^m(\theta, \phi) \sin(\phi)\, d\theta\, d\phi = 0. \tag{5.9} \]

This result can be seen by expanding (5.9). For $m > 0$, we see that (5.9) can be rewritten as

\begin{align*}
\int_0^{\alpha} \int_0^{2\pi} N(l, m)\, P_l^m(\cos(\phi)) \cos(m\theta) \sin(\phi)\, d\theta\, d\phi
&= N(l, m) \int_0^{\alpha} P_l^m(\cos(\phi)) \sin(\phi) \int_0^{2\pi} \cos(m\theta)\, d\theta\, d\phi \\
&= N(l, m) \int_0^{\alpha} P_l^m(\cos(\phi)) \sin(\phi) \left[ \frac{\sin(m\theta)}{m} \right]_0^{2\pi} d\phi = 0
\end{align*}

as $\sin(2\pi m) = \sin(0)$. The same can be shown for $m < 0$.

This result is encouraging as it allows us to further reduce the number of integrals we need to compute in order to obtain the edge weight defined in (5.8), thereby increasing the computational efficiency of the integration. Further, we require only the rotated spherical harmonic coefficients $F_{v,l}^m$ for cases where $m = 0$. The other rotated spherical harmonic coefficients no longer need to be calculated. Finally, for the spherical harmonic basis functions with $m = 0$, it can be shown that

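\begin{align}
\int_0^{\alpha} \int_0^{2\pi} Y_l^0(\theta, \phi) \sin(\phi)\, d\theta\, d\phi
&= \int_0^{\alpha} \int_0^{2\pi} \sqrt{\frac{2l+1}{4\pi}}\, P_l^0(\cos(\phi)) \sin(\phi)\, d\theta\, d\phi \notag \\
&= \sqrt{\frac{2l+1}{4\pi}} \int_0^{\alpha} P_l^0(\cos(\phi)) \sin(\phi) \int_0^{2\pi} d\theta\, d\phi \notag \\
&= 2\pi \sqrt{\frac{2l+1}{4\pi}} \int_0^{\alpha} P_l^0(\cos(\phi)) \sin(\phi)\, d\phi \tag{5.10}
\end{align}

Let $\omega = \cos(\phi)$. Then $d\omega = -\sin(\phi)\, d\phi$. Substituting $\omega$ into the above gives us

\begin{align}
\int_0^{\alpha} \int_0^{2\pi} Y_l^0(\theta, \phi) \sin(\phi)\, d\theta\, d\phi
&= 2\pi \sqrt{\frac{2l+1}{4\pi}} \int_0^{\alpha} P_l^0(\cos(\phi)) \sin(\phi)\, d\phi \notag \\
&= -2\pi \sqrt{\frac{2l+1}{4\pi}} \int_1^{\cos(\alpha)} P_l^0(\omega)\, d\omega \tag{5.11}
\end{align}

The result in (5.11) shows that the ODF edge weight integration reduces to integrations over the Legendre polynomials. Table 5.1 shows the first few Legendre polynomials used in our edge weight integration as well as their indefinite integrals (ignoring the additive constants). Higher order Legendre polynomials follow the same pattern as in Table 5.1 and are easily analytically integrable. Evaluating the integrals in Table 5.1 using the limits from (5.11) completes the integration in (5.8).

l    $P_l(\omega)$                                                                     $\int P_l(\omega)\, d\omega$
0    $1$                                                                               $\omega$
2    $\frac{1}{2}(3\omega^2 - 1)$                                                      $\frac{1}{2}(\omega^3 - \omega)$
4    $\frac{1}{8}(35\omega^4 - 30\omega^2 + 3)$                                        $\frac{1}{8}(7\omega^5 - 10\omega^3 + 3\omega)$
6    $\frac{1}{16}(231\omega^6 - 315\omega^4 + 105\omega^2 - 5)$                       $\frac{1}{16}(33\omega^7 - 63\omega^5 + 35\omega^3 - 5\omega)$
8    $\frac{1}{128}(6435\omega^8 - 12012\omega^6 + 6930\omega^4 - 1260\omega^2 + 35)$  $\frac{1}{128}(715\omega^9 - 1716\omega^7 + 1386\omega^5 - 420\omega^3 + 35\omega)$

Table 5.1: The Legendre polynomials and their indefinite integrals of even degree up to l = 8.

By relying on the spherical harmonic representation, rotating the fODFs to simplify the limits of integration, and using the integrals of the Legendre polynomials in Table 5.1, we are able to analytically compute the edge weights for graph-based dMRI analysis techniques including our random walker tractography algorithm as well as those algorithms in [108, 235, 205].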
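The closed-form cap integral of the m = 0 basis functions can be checked numerically with a short sketch. This is an illustration under stated assumptions, not the thesis implementation: the function names are hypothetical, the Legendre integral is evaluated with the standard identity (2l + 1) P_l = P'_{l+1} - P'_{l-1} rather than by tabulating the polynomials, and the half-angle α follows from equating the cap solid angle 2π(1 - cos α) with 4π/M.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) evaluated with Bonnet's recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def cap_half_angle(neighbourhood_size):
    """Half-angle alpha of a spherical cap with solid angle 4*pi/M.

    Uses cap solid angle = 2*pi*(1 - cos(alpha)), so cos(alpha) = 1 - 2/M.
    """
    return math.acos(1.0 - 2.0 / neighbourhood_size)


def cap_integral_closed_form(l, alpha):
    """Integral of Y_l^0 over phi in [0, alpha], theta in [0, 2*pi].

    Evaluates 2*pi*sqrt((2l+1)/(4*pi)) times the integral of P_l from
    cos(alpha) to 1, which equals (P_{l-1}(c) - P_{l+1}(c)) / (2l+1)
    for l >= 1 and 1 - c for l = 0, with c = cos(alpha).
    """
    c = math.cos(alpha)
    if l == 0:
        legendre_integral = 1.0 - c
    else:
        legendre_integral = (legendre(l - 1, c) - legendre(l + 1, c)) / (2 * l + 1)
    return 2.0 * math.pi * math.sqrt((2 * l + 1) / (4.0 * math.pi)) * legendre_integral


def cap_integral_midpoint(l, alpha, steps=20000):
    """Same integral approximated with the midpoint rule in phi."""
    h = alpha / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h
        total += legendre(l, math.cos(phi)) * math.sin(phi)
    return 2.0 * math.pi * math.sqrt((2 * l + 1) / (4.0 * math.pi)) * total * h


alpha = cap_half_angle(26)  # 26-connectivity, as in the text
for degree in (0, 2, 4, 6, 8):
    exact = cap_integral_closed_form(degree, alpha)
    approx = cap_integral_midpoint(degree, alpha)
    print(degree, exact, abs(exact - approx))
```

Because these integrals depend only on l and α, they can be tabulated once per connectivity scheme; the per-voxel work then reduces to a weighted sum of the rotated m = 0 coefficients.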

5.2.4 Random Walker Tractography

With the graph-based dMRI representation computed in the previous section, we can now set up the random walk problem defined earlier. Recall that our random walk problem seeks to find the mapping in (5.1) that would tell us the probabilities of a dMRI-driven random walker, originating at a voxel $x$, stumbling on target region $R_k$ before any other competing target region.

To solve for the connection probabilities in (5.1), we use the fact that the solution of the combinatorial Dirichlet problem is equivalent to the solution that we seek [95]. In our context, the Dirichlet problem is essentially to find the probabilities $z$ that minimize the Dirichlet integral

\[ D[z] = \frac{1}{2} \int_\Omega |\nabla z|^2\, d\Omega \tag{5.12} \]

over the diffusion volume $\Omega$. Using our graph representation, we can rewrite (5.12), as in [95], to be

\[ D[z] = \frac{1}{2} z^T L z \tag{5.13} \]

where $L$ is the Laplacian matrix of the graph defined earlier. The graph Laplacian is given in [95] as

\[ L_{ij} = \begin{cases} d_i & \text{if } i = j, \\ -w_{ij} & \text{if } (i, j) \in E, \\ 0 & \text{otherwise} \end{cases} \tag{5.14} \]

where $w_{ij}$ is the edge weight for edge $e_{ij}$, given by (5.2), and $d_i$ is the degree of node $i$ (the sum of edge weights incident on $i$).

Once we have the Dirichlet integral set up as in (5.13), we can differentiate with respect to $z$ and find its critical points. Since $L$ is positive semi-definite, the only critical point of (5.13) will be the global minimum of (5.12). This derivation can be found in [95] and leads to a system of linear equations with respect to the given target regions $\{R_1, \ldots, R_K\}$. If we consider, for each region $R_k$ in turn, a set $C_k$ of nodes within the region, we can define a matrix $M = [m_1, \ldots, m_K]$ describing the known probabilities for our seed regions:

\[ m_k(x) = \begin{cases} 1 & \text{if } x \in C_k \\ 0 & \text{otherwise.} \end{cases} \tag{5.15} \]

The resulting linear system of equations takes the form

\[ L_u Z_u = -B^T M \tag{5.16} \]

where our matrix Laplacian from (5.14) is reordered as

\[ L = \begin{bmatrix} L_s & B \\ B^T & L_u \end{bmatrix} \tag{5.17} \]

Essentially, the Laplacian matrix is ordered so that the nodes within target regions $\{R_1, \ldots, R_K\}$ appear first and the remaining nodes follow. As such, $L_u$ is the Laplacian of the subgraph excluding regions $\{R_1, \ldots, R_K\}$ while $B$ encodes the connectivity between those non-targeted nodes and the given target regions (whose Laplacian is $L_s$). The resulting matrix $Z_u = [z_1, \ldots, z_K]$ contains the connection probabilities for the remaining unseeded nodes.

Directly solving (5.16) for $Z_u$ provides us with the probabilities we seek: the probability a neural connection exists between $x$ and each target region. Note that the resulting probabilities from (5.16) are obtained by matrix inversion and do not require the Monte Carlo sampling seen in [40, 88]. This fact guarantees a repeatable result. For a more detailed version of this derivation, please refer to [95].

Modelling the Background

Conventional tractography algorithms typically include a termination condition whereby tracking is halted in regions where the fiber direction becomes unclear. This termination condition, typically described by a threshold on fractional anisotropy [30, 40, 88] or local variance in tract direction [154], is used to ensure that the algorithm does not generate fiber tracts where no evidence of coherent axonal tracts exists. We capture this same tract termination condition by introducing an additional background target region $R_{bgnd}$. This region is used to model background regions where the diffusion MRI data is incapable of capturing the underlying axonal fibers. Even with this additional target region, the interpretation of the probabilities we compute in (5.16) remains the same. They represent the probability of a random walker reaching a particular target region before any other target region, including the background.
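The linear system in (5.16) can be illustrated on a toy graph. The sketch below is a minimal example under stated assumptions, not the thesis code: the function names, the five-node path graph, and the edge weights are all hypothetical, and a small dense Gaussian elimination stands in for the sparse solver that a real dMRI volume would require.

```python
def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def random_walker_probabilities(num_nodes, weights, targets):
    """First-arrival probabilities for every unseeded node.

    weights: dict mapping an undirected edge (i, j) to its weight w_ij.
    targets: list of K lists of seeded node indices (the sets C_k).
    Returns a dict: unseeded node -> list of K probabilities.
    """
    # Graph Laplacian: weighted degree on the diagonal, -w_ij off-diagonal.
    L = [[0.0] * num_nodes for _ in range(num_nodes)]
    for (i, j), w in weights.items():
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    seeded = {node for region in targets for node in region}
    unseeded = [node for node in range(num_nodes) if node not in seeded]
    L_u = [[L[a][b] for b in unseeded] for a in unseeded]
    probabilities = {node: [] for node in unseeded}
    for region in targets:
        # Column k of -B^T M: minus the sum of Laplacian entries that
        # link each unseeded node to the nodes of target region k.
        rhs = [-sum(L[a][s] for s in region) for a in unseeded]
        for node, z in zip(unseeded, solve_dense(L_u, rhs)):
            probabilities[node].append(z)
    return probabilities


# Path graph 0 - 1 - 2 - 3 - 4 with target regions at the two ends.
uniform = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0}
print(random_walker_probabilities(5, uniform, [[0], [4]]))

# Strengthening the edge towards node 0 pulls the walker to that target.
biased = {(0, 1): 3.0, (1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0}
print(random_walker_probabilities(5, biased, [[0], [4]]))
```

Each row of probabilities sums to one because every walker eventually reaches some target, so adding a further target region (such as a background region) redistributes probability away from the others. That redistribution is the competition effect the section describes.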
5.3 Experimental Results

In the construction of our competitive tractography algorithm, we developed two main contributions. First, we developed an exact solution to the dMRI graph embedding problem by analytically solving for the edge weights in (5.2). Second, we showed that our competitive tractography algorithm can be formulated as a random walker problem and solved using the technique in [95].


5.3.1 Graph Construction

In this section, we aim to show the gains in the numerical accuracy and computational efficiency of our dMRI graph embedding technique compared to commonly adopted numerical schemes. These gains will be shown both separately and in the context of existing minimal path dMRI tractography algorithms.

The following results were obtained using three 3D diffusion MRI volumes publicly available from the 2009 Pittsburgh Brain Competition [200]. Each dataset comprised 256 diffusion weighted images acquired with a b-value of 1500 s/mm² at 2 mm isotropic voxel resolution. Fiber ODFs were computed as in [71, 115] with a spherical harmonic expansion of degree l = 6. As a comparison technique, we performed the integration in (5.3) using numerical integration from equally-spaced samples on the sphere. The samples on the sphere were obtained from different order tessellations of an icosahedron.

Method                  Comp. Time (s)          NRMS Error (%)
                        Mean      Std. Dev.     Mean      Std. Dev.
Proposed                                        ε         ε
1st order (42 pts.)
2nd order (162 pts.)
3rd order (642 pts.)

Table 5.2: Computation time and normalized root-mean-squared error results for various methods of calculating $P_{odf}$ in (5.3). Results shown for our exact method versus numerical integration using different order tessellations of an icosahedron. Note our proposed method gives accurate results within machine precision ε.

Table 5.2 displays the mean and standard deviation of the normalized root mean squared (NRMS) error for all values of $P_{odf}$ calculated over the approximately 200,000 fiber ODFs in each volume for different order tessellations (i.e., numerical approximation with 42, 162, and 642 vertices respectively). Also shown are the mean and standard deviation in computation time for solving (5.3) for a dMRI volume. Timing results for our proposed solution include the spherical harmonic rotations described earlier. As the quality of the tessellation improves, the numerical approximations approach our exact result.
However, as the numerical approximation improves, the computational burden increases. Yet, our solution is able to compute (5.3) roughly 5.8 times faster than the coarsest numerical approximation and almost 44 times faster than numerical approximation using a third-order tessellation.


(a) Seed ROI in Corpus Callosum body (green arrow) (b) Seed ROI in Splenium (green arrow). Panel labels: Exact, Approx.
Figure 5.3: Connection probabilities using the exact and the numerical approximation of $P_{odf}$ in (5.3). The tractography results shown are based on our exact solution for obtaining the graph edge weights. The regions highlighted in cyan are compared to results for numerical approximation with 2nd order tessellation of an icosahedron. Maximum intensity projections along the coronal and axial planes are shown in (a) and (b) respectively. Note the qualitative differences in connection probabilities in the highlighted regions.

We applied our exact ODF integration to the graph-based minimal path tractography algorithm in [205]. Sample connectivity results are shown in Figure 5.3 for seed regions in the body of the Corpus Callosum (CC) and in the Splenium (shown by the green arrows). Our exact ODF integration generates regions of high fiber probability that are less diffuse and extend further from the seed regions provided to the minimal path tractography algorithm. These results, highlighted respectively in the two displayed regions of interest (denoted in cyan), are likely due to the avoidance of error accumulation when approximating the integral in (5.3). Figure 5.4 shows the percentage error between the connection probabilities generated using the exact and approximate ODF integration solutions. Note that we are able to reduce error in the resulting probabilities by over 15% in places.

Tractography

To examine the properties of our proposed competitive tractography algorithm, we apply our approach to both synthetic dMRI data and 30 diffusion tensor images from the MIDAS database [60]. The tensor images are interpolated up to 1 mm isotropic resolution using the approach described in [108].
Fiber ODFs are estimated from the tensor data by first sharpening the diffusion tensors as in [127], then computing the ODF of the sharpened tensors as is done in [108]. For comparison purposes, we generate results using three different algorithms:

The minimal path approach presented in [108, 235]. This tractography algorithm uses Dijkstra's algorithm on the same graph formulation we present here. This algorithm does not incorporate competition.


Our proposed random walker tractography algorithm without competition. The algorithm is run for each seed region and background region pair separately. Results from this approach are used to isolate the differences between our algorithm and the minimal path tractography algorithms from [108, 235].

Our proposed random walker tractography algorithm with competition. We run our algorithm with all seed regions and the background region simultaneously. We use this algorithm to assess the effect of competition on tractography.

Voxels with fractional anisotropy less than 0.15 are used to define the background region. This threshold matches the termination criteria seen in existing tractography algorithms [30, 40, 88].

(a) Corpus Callosum (b) Splenium
Figure 5.4: Percentage error in connection probabilities for the examples in Figure 5.3. Maximum intensity projections along the coronal and axial planes are shown respectively. Note how the error from numerical approximation reaches over 15%.

Synthetic dMRI Data

Figure 5.5 shows two synthetic examples motivating the use of target region competition in tractography. Both the dMRI with kissing vertical and c-shaped fibers as well as the image with the crossing fibers contain regions where two fiber tracts may compete for space. The corresponding fibers are represented by the target regions shown in yellow and white respectively. Rician noise was added to the corresponding diffusion weighted images to obtain images with signal-to-noise ratios of dB (kissing phantom) and dB (crossing phantom) respectively.

We hypothesize that the use of competition will reduce the chances of detecting erroneous connections. This hypothesis will be validated if lower connection probabilities are observed within the non-targeted tracts. Figure 5.6 shows the tractography results corresponding to the target regions in the vertical tracts for the three different tractography algorithms listed earlier. Note that the

6a show high connection probabilities in the unseeded (curved and horizontal) tracts. Further, the connection probabilities do not vary smoothly with respect to the underlying fiber structure.

112 (a) Kissing Fibers (b) Crossing Fibers Figure 5.5: Synthetic dmri phantoms showing complex fiber relationships of kissing and crossing fibers. These two phantoms are used in our tractography experiments with the competing target regions delineated by white and yellow boxes respectively. minimal path tractography results in Figure 5.6a show high connection probabilities in the unseeded (curved and horizontal) tracts. Further, the connection probabilities do not vary smoothly with respect to the underlying fiber structure. This can be seen in the crossing region in Figure 5.6a. Additional non-zero connection probabilities are also seen in the background regions, which demonstrate a lack of coherent connectivity. This noise in the tractograms stems from computing connection probabilities based solely on local information. Figure 5.6b displays the results of our algorithm without competition. Note that, like probabilistic tractography algorithms shown in [40, 88], our connection probabilities decrease as a function of distance and are therefore analyzed in logarithmic scale. Even in logarithmic scale, we see a decrease in tract jumping in the non-targeted tract. We also see a smoother spatial distribution in connection probabilities. These differences are due to the computation of all connection probabilities through the closed-form solution to (5.16). Similar results were seen for the seeded curved and horizontal tracts. Figure 5.6c displays the tractography results from our algorithm with competition. Note that, as hypothesized, the competition further reduces the connection probabilities within the non-targeted tract. For the kissing phantom, the connection probabilities in the nontargeted curved tract decrease by an average (over the tract) of 42.80% with the addition of competition. The crossing phantom shows a similar decrease in connection probabilities 88

Note that our approach in (b) and (c) produces smoother tractograms than the minimal path approach in (a) as we avoid local computation

113 (a) Minimum Path [235] (b) Proposed (without competition) (c) Proposed (with competition) Figure 5.6: Tractography results for the crossing and kissing fiber phantoms. Note that our approach in (b) and (c) produces smoother tractograms than the minimal path approach in (a) as we avoid local computation of the connection probabilities. Further, the introduction of competition reduces the chances of tract jumping as shown by the decreasing probability we see away from the targeted vertical tracts. This result is highlighted by the green arrows. 89

(b) Seeds for the corpus callosum body (a) Seeds for the genu (c) Seeds for the splenium Figure 5.

These target regions are chosen within the crossing region between the CC and the internal capsule in order to test the effect that the competing seeds within the internal capsule have on connections

These results support the hypothesis that spatial competition in a tractography algorithm reduces the generation of erroneous connections. 5.3.

114 (b) Seeds for the corpus callosum body (a) Seeds for the genu (c) Seeds for the splenium Figure 5.7: Target regions defined for examining the strength of the tracts passing through different regions of the Corpus Callosum (CC). These target regions are chosen within the crossing region between the CC and the internal capsule in order to test the effect that the competing seeds within the internal capsule have on connections within the CC. in the non-targeted, horizontal tract (48.57%). Similar results were again seen when targeting the curved and horizontal tracts. These results support the hypothesis that spatial competition in a tractography algorithm reduces the generation of erroneous connections Real dmri Data To test the effect of our competition-based tractography on real data, we set up an experiment to examine the different sections of the corpus callosum (CC): the Genu, the Splenium, and the CC Body. For 30 diffusion MRI scans from the MIDAS database [60], we place target regions in the anterior, middle, and posterior regions of the internal capsule where the internal capsule crosses with the CC as shown in Figure 5.7. The target regions are mapped from the LONI ICBM DTI-81 atlas [155] using DT-REFinD deformable registration [234]. These target regions were chosen to test the effect competing regions have in restricting connections through the internal capsule to different sections of the CC. We hypothesize that the Genu, CC Body, and Splenium would be most strongly connected to the anterior, middle, and posterior regions of the internal capsule respectively. We further hypothesize that adding competition will reduce the connection probabilities in the non-targeted sections of the CC. Figure 5.8 shows a representative example of the tractograms generated near the midsaggital plane for the given target regions. Shown in Figure 5.8a are the tractography results for the minimal path approach in [108, 235]. 
These results show the effect imaging noise have on the computed connection probabilities. The connections probabilities vary erratically within the CC due to their local computation, as highlighted in the splenium by the blue arrow. 90
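The Rician noise added to the synthetic phantoms earlier in this section can be illustrated with a minimal sketch. It is not the thesis code: the function names are hypothetical, and the decibel convention used to convert a signal-to-noise ratio into a noise level (20 log10 of the unweighted signal over sigma) is an assumption, since the source does not state its SNR definition. Rician noise arises from taking the magnitude of a signal corrupted by independent Gaussian noise in its real and imaginary channels.

```python
import math
import random


def sigma_from_snr_db(s0, snr_db):
    """Noise standard deviation for a given SNR in dB.

    Assumption: SNR_dB = 20 * log10(s0 / sigma), with s0 the unweighted signal.
    """
    return s0 / (10.0 ** (snr_db / 20.0))


def add_rician_noise(signal, sigma, rng):
    """Return the magnitude of each sample after adding complex Gaussian noise."""
    noisy = []
    for s in signal:
        real = s + rng.gauss(0.0, sigma)   # noise in the real channel
        imag = rng.gauss(0.0, sigma)       # noise in the imaginary channel
        noisy.append(math.hypot(real, imag))
    return noisy


rng = random.Random(0)
clean = [1000.0, 800.0, 450.0, 120.0]      # illustrative diffusion-weighted signals
sigma = sigma_from_snr_db(s0=1000.0, snr_db=20.0)
noisy = add_rician_noise(clean, sigma, rng)
```

Because the magnitude is taken after the noise is added, the noisy values are always non-negative and are biased upward at low signal levels, which is why low-SNR diffusion weighted images are harder to model than images with additive Gaussian noise.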

Genu, Corpus Callosum Body, Splenium
(a) Minimal Path Tractography Results [235] (b) Tractography Results using Proposed (without competition) (c) Tractography Results using Proposed (with competition)
Figure 5.8: Tractography results near the mid-sagittal plane for the target regions in Figure 5.7. The different sections of the Corpus Callosum (CC) are delineated in blue using atlas-based segmentation. Tractography results within the CC are scaled up and shown in insets. Note that the connection probabilities for the minimal path approach in (a) are noisy as a result of their local computation. An example is pointed out by the blue arrow. Our approach is less susceptible to noise and shows a smoother result. Further, note that the addition of competition reduces the connection probabilities outside the seeded sections of the CC, as shown by the darker regions highlighted by the green arrows.

Figures 5.8b and 5.8c display the tractography results from our proposed approach without and with competition respectively. Note that our connection probabilities are smoother and that the different sections of the CC show more homogeneous probabilities than in the minimal path tractography results. Again, this stems from computing the connection probabilities through the closed-form solution to (5.16). Further, the results with competition show sharper connection patterns than seen without competition, as highlighted by the green arrows in the CC body and splenium. These results imply less erroneous connectivity for the different sections of the CC.

To further investigate the connectivity detected by the three tractography algorithms, we compute the marginal connection probabilities for each target region (i.e., the connection probabilities of each target region divided by the sum of all three targets' probabilities). For the proposed approach, we follow the convention established in [88] and analyze the max-min normalized log-probabilities. These marginal probabilities capture how well the algorithms can differentiate between the connections of different sections of the CC. Figure 5.9 displays the resulting marginal probabilities. Note that in all cases, our competitive approach gives the highest marginal probabilities. In all cases, a Kruskal-Wallis test shows these results to be statistically significant with p-values of , , and for the Genu, CC Body, and Splenium respectively. These results support our hypothesis that the addition of competition into tractography improves the quality of the connectivity patterns we compute.

Figure 5.9: Marginal connection probabilities within the different sections of the Corpus Callosum with respect to the three target regions displayed in Figure 5.7. Note that our approach with competition shows the highest marginal connection probabilities, reflecting improved localization of different sections of the Corpus Callosum.

5.4 Discussion

Our primary motivation in pursuing these contributions was to improve the quality of tractography results from preterm infant dMRI scans. We showed in Figure 5.1b an example of tract jumping from the splenium in the preterm infant brain. It is our goal to reduce this tract jumping using our competitive tractography algorithm.

To explore our ability to achieve this goal, we applied our competitive tractography algorithm, as well as the competing minimal path tractography algorithm, to identify connections to the splenium in the dMRI scan from the preterm infant in Figure 5.1b. We placed a target region in the splenium as well as a competing region around the fornix, as shown in Figure 5.10a. As with our earlier experiments, we perform tractography with the minimal path technique of Zalesky [235], with our random walker tractography algorithm, and with competition introduced into our tractography algorithm. We followed the precedent set by [1] and used an FA threshold of 0.1 to define our background region. Results for all three algorithms are shown in Figure 5.10.

We observed that, for all three algorithms, the obtained connection probabilities were more diffuse than what we saw in the adult brain. This result is consistent with the fact that the FA within the preterm infant brain is lower than in the adult brain. As a result, the connectivity is more difficult to localize in general. It is also interesting to note that the minimal path tractography result is less noisy than what was seen in the adult data in Figure 5.8a. This discrepancy is a result of the image quality: the adult dMRI scans from the MIDAS database contained the minimum of six diffusion weighted images [60], while our preterm infant dMRI scans contained 72 diffusion weighted images.
The greater number of diffusion weighted images reduced the impact of image noise in the model fitting and tractography steps.

We further note that the normal limitations of graph-based tractography algorithms, and of probabilistic tractography algorithms, are present in our results. As with our earlier adult dMRI results, the connection probabilities decrease as a function of tract length. This is a common effect in probabilistic tractography algorithms [88]. Further, graph-based tractography algorithms are prone to quantization errors, as the edges in the graph are a sparse representation of the directions along which a tract can grow [235]. These quantization errors can make it difficult to track fibers that curve at an angle different from those of the edges in the graph.

Even so, we see improvement in the preterm infant dMRI connection probabilities as a result of introducing competition. In Figure 5.10c, we see connection probabilities extend into the fornix and the internal capsule. By introducing a competing target region around the fornix, those connection probabilities decrease as shown in Figure 5.10d. These results are encouraging, and suggest that including the notion of competition in tractography could help in reducing spurious connections detected in the preterm infant brain.

(a) Competing Target Regions (b) Splenium with Minimal Path Tractography [235] (c) Splenium with Proposed Algorithm (no competition) (d) Splenium with Proposed Algorithm (with competition)
Figure 5.10: Tractography results for the splenium for a preterm infant scanned at 29 weeks PMA. The splenium was targeted with the region highlighted in green in (a). For our competitive tractography algorithm, the red region around the fornix was used as a competing region. Note that with the introduction of competition, we see lower connection probabilities between the splenium and the fornix.

5.5 Conclusions

We presented herein a multi-region tractography approach that allows for the incorporation of competition between multiple target regions. Our approach is based on random walks on a graph-based representation of the dMRI scan. Our competitive tractography algorithm consists of two main contributions. First, we presented for the first time an exact solution to the dMRI graph embedding employed in [108, 235, 205]. We showed that our analytical solution to the edge weight ODF integral is both computationally efficient and numerically accurate, which speeds up and improves the accuracy of existing graph-based dMRI analysis. Second, we presented the first competitive tractography algorithm and showed how its connection probabilities are obtained through a closed-form solution that ensures repeatability of the results. Further, our algorithm is capable of incorporating knowledge that is beyond the local scale at which most current tractography algorithms operate.

Results of our algorithm on thirty adult brain dMRI scans from the MIDAS database show less noisy connection probabilities in various fiber tracts by avoiding local computation of these probabilities. We further show that incorporating competition into the tractography process improves the localization of fiber tracts and reduces connectivity to competing tracts.
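As a supplementary illustration of the marginal connection probabilities reported in Figure 5.9, the following sketch divides each target region's connection probability by the sum over all targets, as described in the text. The function names and the example numbers are hypothetical. The min-max normalization of log-probabilities is shown as one plausible reading of the "max-min normalized log-probabilities" convention; the exact procedure of [88] is not given in the source, so that part is an assumption.

```python
import math


def marginal_probabilities(probs):
    """Divide each target's connection probability by the sum over all targets."""
    total = sum(probs.values())
    if total == 0.0:
        # No connectivity to any target: return zeros rather than dividing by zero.
        return {name: 0.0 for name in probs}
    return {name: p / total for name, p in probs.items()}


def minmax_normalized_log(values, floor=1e-300):
    """Map log-probabilities linearly onto [0, 1] (assumed normalization)."""
    logs = [math.log(max(v, floor)) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        return [0.0 for _ in logs]
    return [(x - lo) / (hi - lo) for x in logs]


# Illustrative connection probabilities at one voxel of the genu.
voxel = {"anterior": 0.6, "middle": 0.3, "posterior": 0.1}
marginals = marginal_probabilities(voxel)

# Illustrative probabilities that decay with distance, analyzed in log scale.
scaled = minmax_normalized_log([1e-6, 1e-4, 1e-2])
```

A marginal probability close to one for the expected target (for example, the anterior target within the genu) indicates that the algorithm separates that section of the CC well from the other two.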

Chapter 6
STEAM - Statistical Template Estimation for Abnormality Mapping

6.1 Introduction and Motivation

Worldwide, more than one in ten infants are born prematurely (earlier than 37 weeks gestational age) and are at high risk of adverse neurodevelopmental outcome [43]. This abnormal neurodevelopment is believed to be due to white matter dysmaturation or injuries acquired over the period of the infant's neonatal intensive care [20, 77]. As a result, there has been a strong effort in identifying these white matter abnormalities early. The earlier these abnormalities are detected, the sooner clinicians can intervene and either improve a preterm infant's neurodevelopmental health, or set up appropriate rehabilitative care to aid in that child's growth and maturation.¹

¹ This chapter is based on our published work in [53].

Diffusion tensor imaging (DTI) provides us with the ability to probe both white matter organization and integrity, potentially making it a valuable tool to identify such white matter abnormalities. That potential has been examined in recent studies with various links being identified between diffusion measures (like fractional anisotropy [FA] and mean diffusivity [MD]) and neurodevelopmental outcome [2, 131, 192, 70, 31, 190, 66, 191]. These group-based studies have provided us with a further understanding of how DTI-based abnormalities could indicate future neurodevelopmental delay, but they stop short of applying those conclusions to individual cases. As a result, there is still the open question: "Given the DTI scan of a single preterm infant, what cues can we extract from that one scan to gauge whether that infant will have an adverse neurodevelopmental outcome?"

The question of subject-specific outcome projection has been examined in structural MRI with observer rating systems being proposed based on the number and size of white matter lesions [149] or on the presence of intraventricular hemorrhages (IVH) [177]. However, these techniques are specific to structural MRI and do not harness the potential DTI has of providing additional diagnostic information. Alternatively, an argument could be made that recent DTI group studies, like those cited earlier, identify where to look and what to look for. Unfortunately, if an experimental group contains a wide variety of abnormalities (compared to a control group), that experimental group variance makes it a challenge to identify statistically significant differences in group studies. This limitation in group studies is a concern with preterm infants as recent studies have noted that adverse neurodevelopmental outcomes in preterm infants can manifest themselves in multiple ways [20]. This intra-group variability in preterm infant group-level analysis might be masking certain outcome-predictive image cues. To capture this intra-group variability and determine its relevance, we require a technique that produces a personalized result. By identifying abnormal diffusion measurements at the level of an individual DTI scan, we would have the potential to identify and examine the impact of variability within experimental or control groups.

Figure 6.1: Popular preterm infant DTI analysis techniques presented according to the amount of the brain included in the analysis (coverage), the resolution at which the statistical analysis is undertaken (scale), and whether the analysis technique is group-based or personalized. Note that our proposed technique, STEAM, is the only technique that provides a personalized, fine-scale analysis of the whole brain.

Our goal is to generate a subject-specific DTI analysis technique which can flag brain regions with abnormal DTI measurements that are indicative of future neurodevelopmental delay. Such an analysis technique can take on many forms, as is evidenced by the presence of multiple comparable group-based analysis techniques. Figure 6.1 summarizes group-based analysis techniques, placing them according to their scale (i.e., the spatial resolution at which the statistical analysis is performed) and their coverage (i.e., the amount of the DTI scan that is analyzed). In terms of scale, existing techniques range from computing statistics at the level of each scan (e.g., connectome mapping [57]), to the level of each segmented region (e.g., tractography-based [31] or atlas-based [191]), to the level of individual voxels (e.g., Tract-based Spatial Statistics [70]). In terms of coverage, we see techniques that range from small regions of interest (e.g., ROI-based analysis [190, 66]), to the white matter skeleton (e.g., Tract-based Spatial Statistics [70]), to the whole brain (e.g., voxel-based analysis [2]).

Different choices of scale and coverage result in algorithms with different strengths. While a fine scale analysis technique has the ability to accurately localize specific abnormalities, a coarse scale analysis technique can detect small but widespread changes in the brain [169]. Similarly, a full-brain analysis is valuable in an exploratory setting where the locations of structural abnormalities are not known a priori. On the other hand, a localized analysis allows for the examination of specific brain structures without introducing potential confounding factors from the rest of the brain. In our context of subject-specific DTI screening, we would prefer an exploratory analysis technique that can localize potential abnormalities.
Covering the whole brain would be valuable as recent ROI-based studies in sub-cortical white matter [66] and cortical gray matter [223] suggest that not only are patterns of maturation observable within those regions on a DTI scan, but also that deviation from the normal maturation pattern could be indicative of adverse neurodevelopmental outcome. Adding those regions to the more frequently studied deep white matter would provide us with greater potential to identify clinically valuable abnormalities. Equally, analyzing a DTI scan at the scale of its individual voxels would give us a greater ability to identify and localize regions with abnormal diffusion measurements. This advantage features prominently in the voxel-based analysis works of Aeby et al. [2] and Gimenez et al. [93] where diffusion measurements in small, but clinically important, brain regions were identified as being related to neurodevelopmental outcome.

It is evident from Figure 6.1 that voxel-based analysis (VBA) [14] provides the greatest coverage and finest scale of analysis compared to other techniques. Using two groups of scans, VBA spatially aligns all scans from both groups, then computes statistical tests at each voxel, resulting in a fine-scale, group-based, statistical analysis of diffusion measurements over the whole brain. While VBA is attractive due to its ability to maximize scale and coverage, it is currently limited to group-based studies and is susceptible to various design decisions in the analysis pipeline [122, 123]. In particular, the choice of image registration algorithm, the size of the image smoothing kernel, the choice of multiple comparison correction scheme, and the decision of how to handle non-normally distributed data can all impact the conclusions that one draws from a VBA analysis [122]. As a result of these VBA susceptibilities, one has to take care in setting up and documenting a VBA pipeline.

With these points in mind, we desire the ability to perform a subject-specific analysis of a DTI scan in a similar fashion to VBA while obtaining a reliable result that can be related to neurodevelopmental outcome. It is towards this goal that we introduce STEAM: Statistical Template Estimation for Abnormality Mapping. The STEAM technique consists of two parts. First, we generate a collection of 3D statistical template images that capture, at the scale of each individual voxel, the distribution of diffusion measurements for a group of preterm infants with normal developmental outcome. This template collection acts as our normative statistical model and can be computed offline (i.e., during downtime) from a control group's DTI scans. Second, we use these statistical templates to perform VBA by spatially aligning a new DTI scan to the corresponding template and applying single-sample statistical tests. In this fashion, we are able, for the first time, to compare a single preterm infant's DTI scan to a normative preterm population and generate results at the level of individual voxels. As part of this VBA-style analysis, we provide a thorough review of the choices we make in our analysis pipeline to ensure a reliable outcome for this form of whole brain DTI analysis. We also provide the code for STEAM, as well as our statistical templates, online to those who wish to use this analysis technique.²
Finally, we show that our STEAM analysis engine (the incorporation of our preterm statistical DTI templates into a VBA pipeline) can provide further insights into how abnormal diffusion measurements can impact the neurodevelopment of individual subjects, a benefit that existing analysis techniques cannot provide.

The remainder of the chapter is organized as follows. Section 6.2 provides an overview of our STEAM analysis engine, including the creation of statistical templates (in Section 6.2.1), the voxel-based analysis (in Section 6.2.2), and the generation of a full collection of DTI templates (in Section 6.2.3). Section 6.3 provides an overview of the cohort we scanned to validate this technique, the imaging parameters we used, and the decisions made on which DTI scans to include in the statistical templates. In Section 6.4, we examine and validate STEAM by comparing our statistical templates and VBA results to existing literature. Finally, in Section 6.5, we conclude with a discussion on the suitability of VBA for preterm DTI analysis and the potential for this technique in future studies.

6.2 Methods

At a high level, STEAM works by producing a statistical model for the DTI scans of healthy preterm infant brains, then comparing a new DTI scan to that model. We refer to the model creation step as statistical template estimation (i.e., the STE in STEAM) while we call the model comparison step abnormality mapping (i.e., the AM in STEAM). We present both steps in the following sub-sections before rounding out STEAM by addressing different diffusion measurements (e.g., FA, MD, etc.) and different image scales.

Figure 6.2: Flow diagram of the template creation procedure of Guimond et al. [96]. The subjects' scans are aligned to a given target image and averaged. The corresponding image transformations are inverted, averaged, then applied to the average image to adjust the template to an unbiased shape and size. Multiple iterations are then done to reduce registration error. Each step in the pipeline is colour-coded by section, with the image alignment (yellow) discussed in Section 6.2.1, the model fitting (red) discussed in Section 6.2.1, and the bias correction steps (blue) discussed later in this section.

Statistical Template Estimation

To facilitate subject-specific VBA for preterm infants, we first establish a statistical reference model for the normative preterm DTI brain scan. We provide this model through the construction of population-specific statistical templates. These templates are computed offline from a collection of DTI scans from a normative control group, resulting in a succinct statistical model of a normal preterm infant population.

The templates are generated using a DTI-extended version of the scalar image atlas-building technique of Guimond et al. [96]. An overview of the technique is presented in Figure 6.2. The technique of Guimond et al. involves three basic steps: (a) aligning all given scans to a chosen target image T; (b) computing, at each voxel x, the mean M(x) and (co-)variance S(x) of the image data; and (c) transforming the resulting mean and (co-)variance images to an average frame of reference that is not biased by the choice of the initial target image T.
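Step (b) can be sketched for a single voxel as follows. This is a minimal illustration under stated assumptions rather than the STEAM implementation: it assumes the scans have already been aligned, maps each tensor through the matrix logarithm (the log-Euclidean treatment the chapter describes in its model-fitting discussion), and uses the plain sample covariance where the thesis relies on a shrinkage estimator. The function names and the sqrt(2) weighting of the off-diagonal entries are choices made for this sketch.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def logm_spd(D):
    """Matrix logarithm of a symmetric positive-definite 3x3 tensor."""
    w, V = np.linalg.eigh(D)
    return (V * np.log(w)) @ V.T


def expm_sym(L):
    """Matrix exponential of a symmetric 3x3 matrix."""
    w, V = np.linalg.eigh(L)
    return (V * np.exp(w)) @ V.T


def vec6(L):
    """Stack the six unique entries; sqrt(2) keeps the Frobenius norm."""
    return np.array([L[0, 0], L[1, 1], L[2, 2],
                     SQRT2 * L[0, 1], SQRT2 * L[0, 2], SQRT2 * L[1, 2]])


def unvec6(v):
    """Inverse of vec6: rebuild the symmetric 3x3 matrix."""
    a, b, c = v[3] / SQRT2, v[4] / SQRT2, v[5] / SQRT2
    return np.array([[v[0], a, b], [a, v[1], c], [b, c, v[2]]])


def fit_voxel_gaussian(tensors):
    """Mean and sample covariance of the log-mapped tensors at one voxel."""
    X = np.stack([vec6(logm_spd(D)) for D in tensors])
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)  # sample covariance, not a shrinkage estimate
    return mean, cov


# Two illustrative isotropic tensors from two aligned scans at the same voxel.
tensors = [np.eye(3) * 1.0, np.eye(3) * 4.0]
mean, cov = fit_voxel_gaussian(tensors)
mean_tensor = expm_sym(unvec6(mean))  # back in tensor space, for inspection only
```

For the two isotropic tensors above, the log-domain mean corresponds to the geometric mean (twice the identity) rather than the arithmetic mean, which is the behaviour that motivates averaging tensors in the log domain.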

Image alignment

The template creation procedure of Guimond et al. begins with a sample set of DTI scans, $W = \{I_1, \ldots, I_k\}$, from a control group and the objective of spatially aligning all scans to the same frame of reference. In order to begin this alignment process, a scan from the group is selected as our initial target image $T$ and all other DT images $I_i$ ($i \in [1, k]$) are registered to the target using the state of the art in DT image registration techniques. While the choice of target introduces a bias on brain shape and size, we discuss how to correct for that bias later in this section.

The registration is performed by independently aligning each DTI scan $I_i$ to the chosen target image $T$ in two steps. First, we perform an affine registration using FSL's Linear Image Registration Tool (FLIRT) [112, 113] to remove any pose or scale differences between the given image $I_i$ and the target $T$. The affine transformation obtained from this registration step is the one that maximizes the normalized mutual information between the FA of the given image $I_i$ and that of the target $T$. Thus, this registration step aligns each image to the target image as well as possible without non-linearly warping the images themselves.

The affine registration is then followed by a deformable registration using DT-REFinD [234]: a full tensor version of the deformable demons algorithm [221]. The resulting deformation from DT-REFinD minimizes the log-Euclidean [13] sum-squared difference between the given tensor image $I_i$ and the target $T$. We relaxed the smoothness parameters for the DT-REFinD registration (using $\sigma_{def} = 1$, $\sigma_{update} = 0.0$, maximum step length = 2.0 voxels [234]) as they were found, by qualitative inspection, to give good results over a variety of ventricle sizes. This registration step introduces non-linear warping to the sample DTI scans $I_i$ ($i \in [1, k]$) in order to better align them to the target image $T$.
Note that DT-REFinD operates using the full diffusion tensor and not a measure, like FA, that is derived from the tensors. The use of the full diffusion tensor in image registration has been shown to significantly improve registration accuracy [178] over the use of individual tensor features. Voxel-wise Model Fitting Once all DTI scans have been anatomically aligned, we proceed with fitting a multivariate Gaussian distribution to the tensor data at each voxel. The choice of Gaussian distribution here is equivalent to the commonly-used t-test in standard VBA frameworks and comes with the same assumptions of normalcy on the image data. It has been well established that tensors do not form a vector space [225] and so algebraic operations on tensors are not guaranteed to give a tensor as a result. Due to this constraint, we model the tensor data in the log-euclidean space, thereby ensuring that we respect the manifold of positive semi-definite 2 nd order tensors [13]. The log-euclidean mean tensor 101

126 image M is computed as M(x) = 1 k k logm (φ i I i (x)). (6.1) i=1 where φ i is the concatenation of the affine and non-rigid deformations for the scan I i, and matrix logarithm function logm( ) is used to map the tensors to log-euclidean space. The log-euclidean tensor covariance image S(x) is computed as well, though with the limited number of images commonly used in preterm DTI analysis studies, we rely on the shrinkage approach of Schäfer and Strimmer to get a more reliable estimate of the voxel-by-voxel covariance matrices than the maximum likelihood estimator can provide [198]. Note that the mean and covariance images are kept in the log-euclidean space throughout this work to ensure that the tensors are manipulated properly. As each multivariate Gaussian distribution is fit at each voxel, it becomes important to record whether the log-euclidean tensors at a voxel are indeed well represented by a multivariate Gaussian distribution. Therefore, we employ a Henze-Zirkler multivariate normalcy test at each voxel to determine the probability that the given log-euclidean tensors were obtained from a multivariate Gaussian distribution [104]. These probabilities are captured in an image, P, for each template and can then be taken into consideration when performing VBA. We will discuss how P can be used while performing VBA in Section Removing Target Image Bias Once the Gaussian distributions are fit at each voxel, the resulting mean and covariance images are still in the reference space of the chosen target image T. Therefore, the choice of target image impacts the size and shape of the brain in the resulting mean and covariance images. In order to remove this bias to the target, it is necessary to deform both the mean image M and covariance image S so that the brain shape and size in both images more closely resembles the shape and size of the average brain of the population. 
As given by Guimond et al., the target image bias correction can be computed from the deformations $\phi_i$ ($i \in [1, k]$):

$$\bar{\phi}^{-1}(x) = \frac{1}{k} \sum_{i=1}^{k} \phi_i^{-1}(x) \tag{6.2}$$

where $\phi_i^{-1}$ is the inverse of the deformation that warps image $I_i$ to the target. By inverting those deformations, we obtain warps that deform the target towards each individual sample image $I_i$. Averaging these inverted deformations generates the correction warp $\bar{\phi}^{-1}$ that deforms the template towards an average brain shape and size. The correction warp is applied to the mean image $M$ to obtain a new target image $T'$,

$$T' = \bar{\phi}^{-1} \circ M \tag{6.3}$$

and the atlas-creation process then repeats itself, using this new target, in order to remove any errors that may be caused by the registration algorithm's limited ability to match to a target image far from the average brain shape and size. The template-creation procedure is repeated three times, as this number of iterations has been empirically shown to be enough for the algorithm to converge [96, 178]. With this correction, the initial choice of target image $T$ does not bias the final template [96]. Note that the covariance image $S$ only needs to be computed in the final iteration, after the bias correction steps have been applied, so it does not need to be corrected in the way the mean image is in (6.3).

Together, the mean image $M$, covariance image $S$, and p-value image $P$ form a single statistical template for the given normative population. An example of a statistical template is visualized in Figure 6.3. Note that the template provides a distribution of the diffusion tensors at each voxel as well as a measure of how trustworthy those distributions are, resulting in a full statistical description of what a normal preterm infant brain looks like on a DTI scan.

Figure 6.3: STEAM centers around a 3D statistical template modeling the DTI scan of a healthy preterm infant brain. This template consists of the three 3D images seen on the left: a mean image $M$, a covariance image $S$, and a normalcy p-value image $P$. To conserve space, we will refer to a 3D statistical template using the stacked visualization on the right.

Figure 6.4: The proposed analysis pipeline for VBA. A new DTI scan is aligned to the STEAM statistical template, then values at each voxel are compared to the Gaussian distributions in the template using a $\chi^2$-test. The voxels whose tensors are significantly different from their corresponding Gaussian distribution, after multiple comparison correction, are identified and visualized.

Abnormality Mapping

Our statistical template provides us with a model of how the preterm infant brain should look on an early DTI scan. Such a model becomes valuable in its ability to assess the normality of a newly-obtained preterm DTI scan.
This new scan can be aligned to that template and voxel-by-voxel statistical tests can be done to identify regions of the brain where the measured diffusion is significantly different from what we see in the normal preterm infant population. This notion of aligning scans and detecting outliers at a voxel-by-voxel level is what underpins the concept of VBA.

The VBA framework consists of two main tasks: image alignment and voxel-by-voxel comparisons (Figure 6.4). Typically, VBA is done by aligning images from both the experimental and control groups, then performing a paired t-test (or $T^2$-test in the case of multivariate data) at each voxel to identify group differences [14]. In our case, we already have our statistical template as a normative model. As a result, we propose aligning individual scans to our template and performing single-sample statistical tests to identify voxels with significantly abnormal diffusion. By performing the VBA in this way, we are uniquely able to produce a subject-specific abnormality map that highlights where the diffusion abnormalities are. We are also able to shift the majority of the time-consuming registration steps to the template creation process, a process which can be run offline.

Given a new DTI scan $I_{test}$, our VBA process begins by aligning it to the mean image $M$ in our statistical template. This image alignment step is performed in the same manner as during template creation. First, we perform an affine registration with FA images to align image $I_{test}$ to the mean image $M$ of the template using FSL FLIRT [112, 113]. This registration step is followed by a full tensor deformable registration using DT-REFinD [234]. As in the atlas-creation step, these registration algorithms were selected and tuned to make use of all the information in the diffusion tensor during the image alignment process.

Once the scan is aligned to the template's image space, we can perform voxel-by-voxel statistical tests to identify outliers. For full tensor images we compute the Mahalanobis distance at each voxel $x$ in log-Euclidean space:

$$d(x) = \operatorname{vec}\big(\operatorname{logm}(I_{test}(x)) - M(x)\big)^{T} \, S^{-1}(x) \, \operatorname{vec}\big(\operatorname{logm}(I_{test}(x)) - M(x)\big) \tag{6.4}$$

where $\operatorname{vec}(\cdot)$ is the matrix vectorization function. Note that $M$ is already in the log-Euclidean space (as mentioned in Section 6.2.1) and does not require the matrix logarithm transformation. Conceptually, the Mahalanobis distance can be seen as a multi-dimensional version of a squared z-score: the squared difference between a sample and a distribution's mean, divided by the variance of the distribution (in this case captured by the inverse of $S$).

To identify outliers, we first assume the tensors in the new scan $I_{test}$ are from the same distributions represented by the template. Under this assumption, the Mahalanobis distances from (6.4) follow a Chi-squared distribution $\chi^2_p$ with $p = 6$ degrees of freedom (as there are six unique elements in each tensor) [147]. If a Mahalanobis distance exceeds the $(1 - \alpha)$-quantile of the distribution $\chi^2_6$, then that distance is an outlier of the distribution. For these outliers, we can reject (with confidence $1 - \alpha$) the assumption that their log-Euclidean tensor values come from the distributions described in the preterm infant statistical template [147].

While the voxel-wise test above provides us with a way of identifying statistical outliers, it comes with two major caveats. First, it is possible that tensors at certain voxels in our statistical template are unlikely to follow a Gaussian distribution. For those voxels, the results of our statistical test may be unreliable, which raises questions about how those results should be reported. Second, we are performing multiple statistical tests and we have to make our significance threshold $\alpha$ more stringent to account for these multiple comparisons. VBA results, like the ones STEAM produces, are known to be susceptible to these two design decisions [122], so we examine these two points in greater detail and justify the decisions we make with respect to both of them.

Addressing Non-Gaussian Data

When generating a statistical DTI template, we assume that at each voxel, the distribution of diffusion measurements can be modeled as Gaussian. Previous DTI VBA studies have shown that this is not always the case [122]. Consequently, when creating the statistical templates, we test for normalcy at each voxel and provide the probability of the data being normally distributed through the image $P$ (see Section 6.2.1). We can then judge whether the diffusion data at voxel $x$ are consistent with a normal distribution by checking whether $P(x) > \alpha$, where $\alpha$ is a multiple comparison corrected significance threshold. We should be wary of statistics computed at voxels where $P(x) < \alpha$, as the normal distributions fit at these voxel locations do not accurately represent the underlying distribution of possible diffusion measurements.
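As an illustration, the voxel-wise test of (6.4) can be sketched in Python as below. This is a minimal sketch rather than the STEAM implementation: the helper names (`logm_spd`, `vec6`, `mahalanobis_sq`, `is_outlier`) are hypothetical, the $\sqrt{2}$ scaling of off-diagonal terms in the vectorization is an assumed convention that the text does not specify, and the template mean and covariance at the voxel are taken as already computed.

```python
import numpy as np
from scipy.stats import chi2


def logm_spd(tensor):
    """Matrix logarithm of a symmetric positive-definite 3x3 tensor."""
    eigvals, eigvecs = np.linalg.eigh(tensor)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def vec6(sym):
    """Stack the six unique elements of a symmetric 3x3 matrix.

    Assumption: off-diagonal terms are scaled by sqrt(2) so that the
    Euclidean norm of the vector equals the Frobenius norm of the matrix.
    """
    r2 = np.sqrt(2.0)
    return np.array([sym[0, 0], sym[1, 1], sym[2, 2],
                     r2 * sym[0, 1], r2 * sym[0, 2], r2 * sym[1, 2]])


def mahalanobis_sq(test_tensor, mean_log_vec, cov):
    """Statistic d(x) of (6.4) for one voxel.

    mean_log_vec is the vectorized log-Euclidean template mean M(x);
    cov is the 6x6 template covariance S(x) at the same voxel.
    """
    diff = vec6(logm_spd(test_tensor)) - mean_log_vec
    # Solve S z = diff instead of forming the explicit inverse of S.
    return float(diff @ np.linalg.solve(cov, diff))


def is_outlier(d, alpha=0.05, dof=6):
    """True if d exceeds the (1 - alpha)-quantile of chi-squared(dof)."""
    return bool(d > chi2.ppf(1.0 - alpha, df=dof))
```

In use, `alpha` would be the multiple comparison corrected threshold rather than the raw 0.05 default, and any voxel with $P(x) < \alpha$ would be handled according to whichever policy for non-Gaussian voxels is chosen.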
Depending on how conservative we want to be, we could discard the statistics computed at these voxels, weight the contribution of the statistics at these voxels by how likely their data is to be normally distributed, or simply use the statistics despite the lack of normalcy in the data at those locations.

Multiple Comparison Correction

When performing multiple statistical tests, as we are doing at each voxel in an image, there is always the potential that a certain fraction of voxel values will be marked as outliers by chance. For example, if we select $\alpha = 0.05$ for our Chi-squared Mahalanobis test above, then on average, 5% of the normal voxels in $I_{test}$ will be marked as outliers when they should not be. To address this phenomenon and obtain a more reliable threshold for detecting outliers, we can use various techniques to correct for multiple comparisons [144].

Generally, there is no consensus on how to perform multiple comparison correction [165]. Still, four techniques are commonly seen in the VBA community: (a) Bonferroni correction [45], (b) Gaussian Random Field Theory [233], (c) False Discovery Rate [41], and (d) Permutation testing [164]. The most conservative approach is Bonferroni correction, which assumes that the Mahalanobis distances computed in (6.4) are independent from one voxel to another. When performing VBA, this assumption is overly strict as the values at image voxels are usually similar to those of their neighbours [165]. As a result, Bonferroni correction is rarely used as it can miss certain outliers by ignoring this neighborhood correlation.

Gaussian Random Field (GRF) Theory shares some similarities with Bonferroni correction as it scales the significance threshold $\alpha$ by the number of independent statistical comparisons. However, GRF theory assumes that the neighboring image values follow a Gaussian-like profile (i.e., are smooth), thereby allowing us to more accurately estimate the number of independent statistical comparisons [233]. Unfortunately, due to sharp transitions between narrow fiber tracts, the typical DTI scan is usually not smooth enough for GRF theory to provide a notably less conservative correction scheme than Bonferroni [165]. Smoothing the DT images with a Gaussian kernel would alleviate this problem, but the amount of smoothing required to get a less conservative significance threshold is relatively high (greater than 6 mm full width at half-max Gaussian smoothing given our cohort size) [165]. Such extensive smoothing would also blur out narrow fiber tracts, making it more difficult to localize abnormalities with our statistical template (footnote 3). For these reasons, GRF theory is ill-suited to correct for multiple comparisons in DTI studies.

Permutation testing is popular for multiple comparison correction when performing group studies as it makes few assumptions about the relationships between input statistics [164]. Instead, permutation testing computes statistical group differences for various random permutations of the group labels.
The significance threshold is then chosen from the distribution of these group differences so that only a fraction ($\alpha$) of the random group label permutations are above this threshold. Permutation testing has become popular as it is based on distance metrics and not any specific distribution. This condition has allowed for some preprocessing of statistical maps, as is done in Threshold-Free Cluster Enhancement [204]. Unfortunately, applying permutation testing in STEAM would require maintaining two groups: one group containing only $I_{test}$, and the other containing all the DTI scans used to create the statistical template. Maintaining these two groups may be cumbersome if the number of scans used to create the template is large. Also, retaining the DTI scans used for template creation leads to redundancy between those scans and the template itself. For these reasons, permutation testing is not ideal for multiple comparison correction in STEAM.

As a result of the limitations of these other methods, we use False Discovery Rate (FDR) to perform multiple comparison correction. Like permutation testing, FDR does not make any assumptions on the relationships between input statistics or on the smoothness of the images, yet it is less conservative than Bonferroni correction or GRF theory [41]. FDR also has the benefit of being applicable to the single-sample statistical tests that are performed in STEAM.

By taking into consideration multiple comparison correction, as well as non-Gaussian distributed data, the STEAM analysis engine addresses two major caveats surrounding voxel-based analysis, resulting in a technique that can be used to identify abnormalities at the voxel level in a single DTI scan.

Footnote 3: Note that this GRF-based smoothing for multiple-comparison correction is different from the scale-space smoothing discussed later in this chapter.

Completing the Template Collection

While we have presented STEAM with respect to a full tensor statistical template, it has been common to examine preterm infant DTI scans using simpler features of interest from the diffusion tensors, in particular FA and MD [2, 93]. It is also common in VBA to smooth the images being analyzed in order to examine their content at different spatial scales [123]. STEAM also provides this functionality through an expansion of its statistical template collection, a collection that we describe below.

Templates for Diffusion Features

While we described STEAM's template-creation technique with respect to the full diffusion tensor image, the same technique is used to generate statistical templates for various tensor features including fractional anisotropy, mean diffusivity, and all others shown in Figure 6.5. As in the full tensor case, the DTI scans are aligned using a combination of FSL FLIRT and DT-REFinD to obtain the same deformations $\phi_i$ ($i \in [1, k]$) as in the full tensor analysis. Once the scans are aligned, the scalar diffusion features (e.g., FA, MD) are computed from the aligned tensor images $\phi_i \circ I_i$ ($i \in [1, k]$). The model fitting step then simplifies to computing scalar means and variances at each voxel, while using the Lilliefors test for normalcy at each voxel [142]. The template bias correction is then computed from the deformations $\phi_i$ ($i \in [1, k]$) in the same way as in the full tensor case.
We refer to these individual scalar templates by their mean image $M_f$, variance image $S_f$, and normalcy p-value image $P_f$, where $f$ is the tensor feature being modeled (e.g., $f$ = FA, or $f$ = MD, etc.). STEAM can compute statistical templates for 14 different diffusion features (including the full log-Euclidean tensor), obtaining the mean, (co-)variance, and p-value images shown in Figure 6.5.

When a new DTI scan needs to be analyzed on a specific tensor feature of interest, STEAM performs the analysis in a similar fashion to the full tensor case. The DTI scan is aligned to the mean image of the full tensor template using FSL FLIRT and DT-REFinD. Once aligned, we compute the scalar diffusion feature (e.g., FA, MD) from the aligned test image and compare it to the statistical template for that feature. Note that since the same image registration algorithms were used regardless of the choice of template, each template should be anatomically aligned. The statistical test for the scalar images simplifies to the z-test and the same multiple comparison correction is applied. In this way, STEAM produces VBA results for individual tensor features from a single DTI scan.

Figure 6.5: A visualization of the various diffusion tensor measures for which we generated preterm infant statistical templates. Among these measures are mean diffusivity (MD), the tensor shape measures ($c_l$, $c_p$, $c_s$) defined in [230], fractional anisotropy (FA), log-Euclidean FA (LFA), relative anisotropy (RA), each individual eigenvalue ($\lambda_1$, $\lambda_2$, $\lambda_3$) of the diffusion tensor, radial diffusivity (RD), tensor norm ($\|D\|_F$), and volume ratio (VR). Note that templates for all of these measures are computed at multiple spatial scales.

Templates for Different Image Scales

It has also been general practice in the VBA community to smooth images before performing VBA. The rationale behind this smoothing is (a) to reduce the number of false outliers identified due to misregistration, (b) to make the image data at each voxel more likely to be normally distributed, and (c) to introduce a spatial scale to the analysis [123]. Unfortunately, it has been well noted that different levels of smoothing can result in very different conclusions being drawn from a VBA analysis [122, 123]. It has been suggested that when VBA is performed on DTI scans, researchers should provide results for a range of smoothing scales [123].

In order to accommodate different spatial scales, we expand STEAM to produce templates at various smoothing scales. Specifically, Gaussian smoothing is applied to the aligned images $\phi_i \circ I_i$ ($i \in [1, k]$) prior to the computation of the mean and (co-)variance images described earlier. STEAM can create templates smoothed with a range of Gaussian filters whose full width at half maximum (FWHM) values have been seen in previous DTI VBA literature [123].
When a new DTI scan $I_{test}$ is obtained for analysis, STEAM can smooth $I_{test}$ with the equivalent Gaussian function, then perform VBA at that spatial scale in the same manner as described earlier. In this fashion, STEAM can perform VBA and report results across multiple spatial scales.
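The scalar-feature analysis at a single spatial scale can be sketched as follows, assuming the test feature image has already been aligned to the template. The sketch smooths with a Gaussian specified by its FWHM, computes two-sided z-test p-values against the template mean and variance, and applies FDR control. The function names are hypothetical, and the Benjamini-Hochberg step-up procedure is an assumption, since the text does not name the specific FDR procedure that STEAM uses.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import norm

# For a Gaussian kernel, FWHM = 2 * sqrt(2 * ln 2) * sigma.
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def smooth_fwhm(image, fwhm_mm, voxel_mm):
    """Smooth a 3D scalar image with a Gaussian given by its FWHM in mm."""
    image = np.asarray(image, dtype=float)
    if fwhm_mm <= 0:
        return image.copy()  # scale 0 means no smoothing
    # Convert the kernel width from millimetres to voxels along each axis.
    sigma_vox = [fwhm_mm * FWHM_TO_SIGMA / v for v in voxel_mm]
    return gaussian_filter(image, sigma=sigma_vox)


def z_test_pvalues(feature, mean_img, var_img):
    """Two-sided p-values of a voxel-wise z-test against the template."""
    z = (feature - mean_img) / np.sqrt(var_img)
    return 2.0 * norm.sf(np.abs(z))


def fdr_mask(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR variant).

    Returns a boolean array marking the p-values declared significant
    while controlling the false discovery rate at level q.
    """
    pvals = np.asarray(pvals, dtype=float)
    ranked = np.sort(pvals.ravel())
    m = ranked.size
    passed = ranked <= q * np.arange(1, m + 1) / m
    if not passed.any():
        return np.zeros(pvals.shape, dtype=bool)
    cutoff = ranked[np.nonzero(passed)[0].max()]
    return pvals <= cutoff
```

In practice the p-values passed to `fdr_mask` would be restricted to voxels inside a brain mask, so that background voxels do not inflate the number of comparisons.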

By introducing $n$ spatial scales and 14 different diffusion features, a full STEAM analysis engine includes $14n$ statistical templates, each of which includes a mean, (co-)variance, and p-value image as shown in Figure 6.5. These additional templates give STEAM the added flexibility to isolate specific diffusion abnormalities that manifest at different spatial scales.

6.3 Demographics and Clinical Factors of our Preterm Infant Cohort

Cohort Demographics

To validate STEAM, we make use of an existing cohort of 195 premature newborns born between 24 and 32 weeks gestational age (GA) at the Children's & Women's Health Centre of British Columbia. Of these, 177 are described in Chau et al. [66] and an additional 18 infants were recruited since that work was published. Cohort exclusion criteria included 1) congenital malformation or syndrome; 2) antenatal infection; or 3) large parenchymal hemorrhagic infarction (> 2 cm) detected using head ultrasound scanning. This prospective study was approved by the University of British Columbia Clinical Research Ethics Board.

The newborns enrolled in this cohort were evaluated with MRI scans in the neonatal period (outlined below) and had neurodevelopmental assessments at a corrected age of 18 months with the Bayley Scales of Infant and Toddler Development, Third Edition (Bayley-III) [35] and the Peabody Developmental Motor Scales, Second Edition (PDMS-II) [86]. The 3 composite scores (cognitive, language and motor scores) of the Bayley-III have a mean of 100 and standard deviation of 15. The PDMS-II provides a more sensitive assessment of motor function, yielding gross, fine and total motor scores with a mean of 100 and standard deviation of 15.

Image Scoring and Quality Control

Of the 195 very preterm neonates, 170 were scanned within the first weeks of life once they were clinically stable. One hundred and fifty-two (152) of these 170 infants were scanned again at term-equivalent age, with 0.85 to (7.98±3.32) weeks between scans.
The resulting 322 diffusion MRI scans cover the age range of to (36.38±4.89) weeks post-menstrual age (PMA). Our MRI studies were carried out on a Siemens (Erlangen, Germany) 1.5T Avanto using VB 13A software and included the following sequences: 3D coronal volumetric T1-weighted images (repetition time [TR], 36 ms; echo time [TE], 9.2 ms; field of view [FOV], 200 mm; slice thickness, 1 mm; no gap) and a multi-slice 2D axial EPI diffusion MR acquisition (TR 4900 ms; TE 104 ms; FOV 160 mm; slice thickness, 3 mm; no gap) with 3 averages of 12 non-collinear gradient directions, resulting in an in-plane resolution of mm. The DTI acquisition was repeated twice, once with a diffusion weighting (b-value) of 600 s/mm$^2$ and once with a diffusion weighting of 700 s/mm$^2$. The two DTI acquisitions were then combined to create a single diffusion tensor image. The combined diffusion weighted image set was preprocessed (i.e., eddy current corrected and skull stripped) using the FSL Diffusion Toolbox (FDT) pipeline (footnote 4) and tensors were then fit using RESTORE [65], a weighted least-squares tensor fitting algorithm implemented in the Camino toolkit (footnote 5).

An experienced neuroradiologist (K.J.P.) reviewed the resulting MR images for presence of white matter injury (WMI), intraventricular hemorrhages (IVH), ventriculomegaly (VM), and poor image quality. The full neuroradiological review was performed on the T1 images using the following protocols. The presence of WMI was identified using a system found to be predictive of adverse neurodevelopmental outcome at 12 to 18 months of age [149]. We noted IVH using the grading of Papile et al. [177] and VM using the grading system of Cardoza et al. [63].

We employ a high standard for image quality by visually checking for evidence of motion corruption and the various image artifacts discussed in Tournier et al. [213] and Gallichan et al. [90]. To avoid corrupting our DTI analysis of the whole brain, we included a scan in our study only if the entire scan was free of all degradations. Of the 322 scans we collected, 192 met those stringent criteria and were included in this study.

Table 6.1: Demographics of the experimental and control groups for the preterm infant cohort used in this thesis to validate STEAM. Experimental and control groups are defined below. P-values are shown between the two groups. Note that there are no significant differences in birth age, scan age, sex, or brain volume between the experimental and control groups.

Cohort Demographic | Control Group | Experimental Group | P-Value
Number of Subjects | | |
Number of Scans | | |
Sex (M/F) | 30 / | / 41 |
Avg. GA at Birth | | |
Avg. PMA at Scan | | |
Brain Volume (cm$^3$) | | |
Of the 130 excluded scans, 42 were excluded due to excessive motion, 76 were removed due to vibrational artifacts similar to those described by Gallichan et al. [90], and the remaining 12 were removed due to the presence of other artifacts described by Tournier et al. [213].

Defining Experimental and Control Groups

To generate a set of statistical templates that capture the range of normal brain development, we first must define what criteria we use to decide whether an infant's DTI scan and neurodevelopmental outcome are normal, then determine which scans in our cohort fit those criteria. The scans that fit these criteria comprise our control group, from which our statistical templates are built.

The full control group selection criteria are shown in Figure 6.6. Infants were included in the control group if their scores on all six composite measures of neurodevelopment (Bayley-III and PDMS-II) were at least within 1 standard deviation of the normal mean (> 85). Of the infants that met these criteria, we further excluded any that showed acquired brain injury, either white matter lesions or intraventricular hemorrhage, on MRI (as identified by the protocols described in Section 6.3.2). In our cohort, we obtained 76 scans from 55 infants that satisfied these criteria. The distribution by PMA of the DTI scans in the control group is given in Figure 6.7. Full demographics of these infants are given in Table 6.1.

[Flowchart: the 192 quality DTI scans are screened in turn for presence of WMI on T1 (43 scans excluded), presence of IVH on T1 (31 scans excluded), Bayley-III score < 85 (34 scans excluded), and PDMS-II score < 85 (8 scans excluded), leaving a control group of 76 scans and an experimental group of 113 scans.]

Figure 6.6: Exclusion criteria for the group of preterm infant DTI scans used to create the normative statistical templates. The number of scans excluded by each criterion is listed below the corresponding criterion (WMI = white matter injury, IVH = intraventricular hemorrhage, Bayley-III = Bayley Scales of Infant and Toddler Development, PDMS-II = Peabody Developmental Motor Scales). Excluded scans of sufficient quality were used to validate the VBA aspect of STEAM. A more detailed description of that VBA test set is given with the corresponding experiments in Section 6.4. Note that scans were included in the template only if the infant's measures of neurodevelopment are within 1 standard deviation of the normal mean (> 85). Further details on these exclusion criteria are given in the accompanying text.
Figure 6.7: The distribution of DTI scans used to generate each statistical preterm infant template. Bars are color-coded based on the age windows used for each template (see Section 6.3.3).

Table 6.2: Demographics for the preterm infant cohort divided by the age window for which we created a statistical template. Numbers are provided for the control group first, followed by the experimental group, with the p-value (1-way ANOVA) of the group differences shown in brackets. Note that there are significant age differences (highlighted in bold) between experimental and control groups for the second and fourth-youngest statistical template age windows.

Demographics, control / exp (p-val) | 28-31 weeks PMA | 32-36 weeks PMA | 37-40 weeks PMA | 41-45 weeks PMA
Number of Subjects | 24 / | / | / | / 17
Males - Females | / | / | / | / 7-10
Number of Scans | 24 / | / | / | / 17
Avg. GA at Birth | / (0.55) | / (0.03) | / (0.13) | / (0.04)
Avg. PMA at Scan | / (0.32) | / (0.003) | / (0.43) | / (0.68)
Brain Volume (cm$^3$) | / (0.31) | / (0.06) | / (0.03) | / (0.55)

Given a control group consisting of 55 infants and 76 scans, we have the luxury of sub-dividing our control group according to PMA at time of scan. By performing this subdivision, we are able to reduce the amount of variance in our statistical templates that is caused by PMA, resulting in a greater ability to identify statistical abnormalities. We chose to sub-divide our control group into roughly 4-5 week time windows as highlighted by the different colors in Figure 6.7. This sub-division allows us to maintain a similar number of scans (i.e., similar statistical power) in each time window while also balancing the trade-off between the number of scans per time window and the age-related image variance within each window. Full demographics for each sub-group are given in Table 6.2.

As can be seen in Table 6.2, the experimental and control sub-groups are generally similar to each other, with only the four bolded comparisons showing significant differences. As far as demographic differences across time windows, we saw the control group for the earliest age window had a significantly lower GA at birth than the second (32-36 weeks PMA, p = ) and fourth (41-45 weeks PMA, p = ) age windows, but no other significant differences were found between GA at birth for any pair of templates (smallest p = ). No other significant differences in demographics were found for the control groups across time windows.

6.4 Experimental Results

Our proposed STEAM analysis engine contains two steps: the creation of a normative statistical template collection, and a VBA pipeline to identify areas of abnormality in a single DTI scan. The following sections present results on both these steps, specifically how these results compare with known anatomical findings as well as how STEAM-generated results compare to existing single-scan evaluations of preterm infants (e.g., [149]).
We examine the hypotheses that STEAM:

- Generates statistical templates that show the growth, development, and inter-subject variability we would expect to see in the normal preterm infant brain over the examined time period.
- Generates subject-specific abnormality maps that can be reliably interpreted and are consistent with a subject's structural MR evaluation (e.g., [149]).
- Identifies brain abnormalities that relate to neurodevelopmental outcome at a corrected age of 18 months.
- Identifies brain abnormalities that are separate from, yet complementary to, those that can be identified on structural MRI.

The results that follow support these hypotheses but should not be considered a full clinical evaluation of STEAM. Our goal is simply to show a broad proof of concept.

Validation of Normative Statistical Templates

Our STEAM statistical template creation procedure generated the four preterm DTI templates whose mean images, $M$, are displayed in Figure 6.8. Qualitatively, the templates displayed the expected anatomical organization of major fiber tracts, with the Genu, Splenium, Optic Radiations, and Corticospinal Tracts clearly identifiable from each mean image. Further, we see greater lateral growth of these tracts as post-menstrual age increases, which agrees with earlier DTI findings [191, 181, 210]. This lateral growth is also consistent with histological findings that have identified a reduction of the subplate zone and an expansion of the white matter over this period [128].

We also saw increased contrast between the major fiber tracts and the rest of the brain as gestational age increases. This result was expected as the maturation of the major fiber tracts during this period increases the FA within those tracts [77]. This increase in FA contrast is also aided by a decrease of FA in cortical and sub-cortical regions, a decrease that is consistent with a decrease in the radial organization of neurons [210]. This FA decrease has also been reported in an ROI-based study [223] and is likely due to dendritic arborization of neurons in the subplate zone [152].

Figure 6.8: Axial, coronal, and sagittal slices of the mean color FA maps for our four preterm infant DTI templates (28-31, 32-36, 37-40, and 41-45 weeks PMA). Note that all figures are drawn with the same scaling so any change in size is due to brain development. Also note the template quality makes it easy to distinguish major fiber tracts.

Sagittal Coronal Axial 28 31 Weeks PMA 32 36 Weeks PMA 37 40 Weeks PMA 41 45

9: Axial, coronal, and saggital slices of the FA coefficient of variation for

Note that as the brain develops the inter-subject variability increases.

greater development in that part of the brain over this 28-45 week PMA time

For each of our templates, we also examined the covariance at each voxel to

$fraction of the mean value µ. Figure 6.$ 9 shows a representative example: the coefficient of variation images for the

Qualitatively, we saw greater variation in FA in the posterior portion of the

140 Sagittal Coronal Axial Weeks PMA Weeks PMA Weeks PMA Weeks PMA Figure 6.9: Axial, coronal, and saggital slices of the FA coefficient of variation for our four preterm infant DTI templates. Note that as the brain develops the inter-subject variability increases. Also, the variability is greater in the posterior part of the brain, suggesting greater development in that part of the brain over this week PMA time period. For each of our templates, we also examined the covariance at each voxel to determine where we see the greatest variability within the normal preterm infant brain. In particular, we examined the coefficient of variation images c(x) = σ(x) µ(x) (6.5) where the standard deviation, σ at each voxel x is displayed as a fraction of the mean value µ. Figure 6.9 shows a representative example: the coefficient of variation images for the FA templates. Qualitatively, we saw greater variation in FA in the posterior portion of the brain, which agrees with greater development of the occipital lobes during this time period [107, 211]. This greater development in the occipital lobes has also been shown in earlier DTI studies [57, 191, 210]. We further observed an increase in the coefficient of variation over time, suggesting that the brain structure of preterm infants becomes more diverse over time. This result agrees with the work of Brown et al. which showed increased variability in connectome measures over the same time period [57]. This result also agrees with the development of sulci over this age period [34] and that sulci have features that are unique to each individual [170]. We note that this trend appears to be independent of age at birth. Only the first template control 116

group showed a significantly lower distribution of birth age compared to other templates (see Section 6.3.3), yet the coefficient of variation increases across the four templates. To avoid excessive clutter in this thesis, we have posted all our templates on the STEAM project website, where they can be viewed online (via a web-based 3D image viewer) and downloaded. The observations identified here were generally consistent across diffusion features, specifically the lateral growth of white matter tracts, the decreased radial organization for neurons in cortical regions, and greater variability in the posterior region of the brain. The consistency of these results with existing DTI and histological studies suggests that our STEAM templates capture the expected growth and variability previously identified in the normal preterm infant brain over the examined time period.

Subject-Specific Abnormality Maps: Proof of Concept

We further generated abnormality maps for the 113 DTI scans in our experimental group by comparing them to the STEAM templates of the appropriate postmenstrual age. These 113 scans were analyzed using our VBA pipelines for four common diffusion measures: fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (λ1). We focused on these four diffusion measures as they have been the focus of many previous preterm DTI studies [2, 66, 181, 218], allowing us to examine our results in the context of that earlier work. To account for multiple imaging scales, we generated abnormality maps for all scales from 0 to 8 mm (at 1 mm intervals) and marked a voxel as abnormal if its value was significantly different from the STEAM template on a majority of image scales. As a proof of concept, we present two notable cases to highlight how STEAM-generated abnormality maps can be interpreted and how they can be used to inform on the presence of anatomical abnormalities.
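The majority rule across image scales can be sketched in a few lines. This is an illustrative sketch only, not the STEAM source code: the function name `majority_abnormal` and the array layout (one boolean significance map per scale, stacked along the first axis) are assumptions made for the example.

```python
import numpy as np

def majority_abnormal(significant_by_scale):
    """Combine per-scale significance maps with a strict majority vote.

    significant_by_scale: boolean array of shape (n_scales, X, Y, Z),
    True where the voxel differed significantly from the template at
    that smoothing scale (hypothetical layout, assumed for this sketch).
    """
    maps = np.asarray(significant_by_scale, dtype=bool)
    n_scales = maps.shape[0]
    votes = maps.sum(axis=0)      # number of scales flagging each voxel
    return votes * 2 > n_scales   # strict majority of the scales

# Nine scales correspond to 0 to 8 mm at 1 mm intervals.
maps = np.zeros((9, 2, 2, 2), dtype=bool)
maps[:5, 0, 0, 0] = True   # flagged on 5 of 9 scales
maps[:4, 1, 1, 1] = True   # flagged on 4 of 9 scales
result = majority_abnormal(maps)
```

With nine scales, a voxel needs to be flagged on at least five of them to be marked abnormal under this reading of the rule.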
The first case is shown in Figure 6.10. The STEAM-generated abnormality maps for the four diffusion measures are presented in Figure 6.10(a) and show the presence of widespread abnormalities in MD, RD, and axial diffusivity. The infant's corresponding T1 scan was rigidly aligned to the template space and is displayed in Figure 6.10(b). This infant's T1 scan showed the presence of multiple small hyperintense lesions, all of which were manually identified by a trained expert and highlighted in red. Finally, Figure 6.10(c) shows the FA and MD images from the subject and the template mean in the complementary colours of purple and green respectively. The blending of these images results in shades of gray where the images match and highlights differences between the aligned scans in one of the two image colours. These blended images allow us to check for image registration errors that may impact the abnormality maps STEAM generates (e.g., neighbouring green and purple structures would imply misalignment). A more thorough introduction to image blending can be found in [110]. The infant's T1 MRI scores, as well as their neurodevelopmental test scores at 18 months, are shown in the table at the bottom of Figure 6.10.
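The purple and green blending described above can be illustrated with a small sketch. It assumes two already-aligned 2D slices with non-negative intensities, and the helper names `_unit_scale` and `blend_purple_green` are hypothetical; the scaling used for the thesis figures is not specified in the text.

```python
import numpy as np

def _unit_scale(image):
    """Scale a non-negative image to the range [0, 1] (assumed normalization)."""
    image = np.asarray(image, dtype=float)
    peak = image.max()
    return image / peak if peak > 0 else np.zeros_like(image)

def blend_purple_green(subject, template):
    """Return an RGB image with the subject in purple and the template in green.

    The subject drives the red and blue channels, the template drives the
    green channel, so equal intensities produce a shade of gray.
    """
    s = _unit_scale(subject)
    t = _unit_scale(template)
    return np.stack([s, t, s], axis=-1)

# Matching structures give gray; mismatched ones show as purple or green.
matched = blend_purple_green(np.array([[0.0, 2.0]]), np.array([[0.0, 4.0]]))
shifted = blend_purple_green(np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]))
```

In `shifted`, the first pixel is pure purple (subject only) and the second is pure green (template only), which is the neighbouring purple and green pattern that would indicate misalignment.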


[Figure 6.10 panels: (a) FA, MD, RD, λ1: widespread STEAM-detected abnormalities identified in deep gray and white matter; (b) infant T1 rigidly registered to template (white matter lesions shown in red); (c) blended MD and FA images of subject and template mean (subject in purple, template in green); (d) comparative clinical and neurodevelopmental scores for this infant]

MRI-based Scores (32 wks. PMA): WMI [149]: 2 - Moderate; IVH [177]: 0 - Absent; VM [63]: 1 - Mild
Bayley-III Test Scores (18 mo. corrected): Cognitive, Motor

Figure 6.10: A case study of how STEAM can be used to identify DTI abnormalities and how those detected abnormalities compare to structural MRI abnormalities and outcome. The results from STEAM's voxel-based analysis for FA, MD, RD, and λ1 are shown in (a). These results show abnormally high MD, RD, and λ1 over a large region encompassing deep gray and white matter. These results are consistent both with the presence of white matter lesions on the infant's T1 scan shown in (b) and the infant's significantly reduced neurodevelopmental test scores at 18 months corrected age (shown in the table above). The registration accuracy between the infant's DTI scan and the STEAM statistical template is shown in (c). We do see some misregistration around the posterior portion of the ventricles on the MD blended image, which is a result of ventriculomegaly. However, this misregistration is small in comparison to the STEAM-detected DTI abnormalities. The combination of all these results suggests that STEAM is identifying a true structural abnormality in this infant.

As an initial step in interpreting these STEAM-generated abnormality maps, we looked for the possibility of image registration errors. By examining Figure 6.10(c), we do see purple regions around the posterior portion of the ventricles, suggesting that, after aligning to the template, the ventricles in the infant's DTI scan remained slightly larger than those in the template's mean image. This result is not surprising as this infant showed presence of mild ventriculomegaly (VM) that, apparently, our image registration techniques could not fully account for. Even so, the misregistration is limited to a much smaller region than the abnormalities present on the STEAM-generated abnormality maps. These detected abnormalities extend well beyond the periventricular region and into regions occupied by major white matter fiber tracts (e.g., splenium), and those major tracts are well-aligned as evidenced in the blended FA image in Figure 6.10(c). While misregistration would account for some of the detected abnormalities near the posterior portion of the ventricles, it alone cannot explain the widespread MD, RD, and axial diffusivity abnormalities identified by STEAM.

Instead, the infant's T1 scan provides some additional clues that corroborate the abnormalities identified by STEAM. Specifically, multiple hyperintense white matter lesions appear scattered in the same areas as the abnormalities identified by STEAM. It is believed that these lesions are an indication of a more widespread diffuse white matter injury in the neighboring tissue [20]. The abnormalities identified by STEAM match that description, suggesting that we are capturing a greater extent of the diffuse white matter injury than the lesions display on the T1 scan. This interpretation also agrees with the low neurodevelopmental test scores obtained at 18 months, as one would expect that such widespread brain abnormalities would have a profound impact on later neurodevelopmental outcome.
The STEAM-generated results for a second infant are shown in Figure 6.11(a) along with their T1 scan in Figure 6.11(b), the blended FA and MD images in Figure 6.11(c), and the infant's T1 evaluation and neurodevelopmental scores in the corresponding table. The STEAM results suggest increased FA and axial diffusivity in the left occipital lobe, the left frontal-temporal lobe, and in various cortical regions. The blended MD image in Figure 6.11(c) shows no notable image registration error, while the blended FA image shows clear FA differences, none of which appear to be due to anatomical misalignment (which would appear as neighboring purple and green structures). The lack of registration error suggests that the abnormalities identified by STEAM are indicative of anatomical abnormalities. Comparing the STEAM abnormality maps to the infant's T1 scan, we see that the larger regions of STEAM-detected abnormalities are in the proximity of a large white matter lesion in the left hemisphere. This proximity suggests a relationship between the lesion and the nearby FA abnormalities that is consistent with previous findings [20]. Further, the increased FA in these parts of the subplate zone suggests a reduced maturation of those regions, a result commonly seen in the presence of injury [210]. One would hypothesize that


[Figure 6.11 panels: (a) FA, MD, RD, λ1: widespread STEAM-detected abnormalities identified in sub-cortical gray and white matter; (b) infant T1 rigidly registered to template (white matter lesions shown in red); (c) blended MD and FA images of subject and template mean (subject in purple, template in green); (d) comparative clinical and neurodevelopmental scores for this infant]

MRI-based Scores (29 wks. PMA): WMI [149]: 3 - Severe; IVH [177]: 2 - Moderate; VM [63]: 0 - Absent
Bayley-III Test Scores (18 mo. corrected): Cognitive, Motor

Figure 6.11: A second case study of how STEAM can be used to identify DTI abnormalities and how those detected abnormalities compare to structural MRI abnormalities and outcome. The results from STEAM's voxel-based analysis for FA, MD, RD, and λ1 are shown in (a). These results show abnormally high FA and λ1 in areas of cortical gray matter and superficial white matter on the left side of the brain. These results are consistent with the presence of nearby white matter lesions on the infant's T1 scan shown in (b) as well as the infant's significantly reduced motor test scores at 18 months corrected age (shown in the table above). The registration accuracy between the infant's DTI scan and the STEAM statistical template is shown in (c). While some of the abnormalities are around the cortex, there is no discernible registration error in these regions. The combination of all these results suggests that STEAM is identifying a true structural abnormality in this infant.

the extent of these abnormalities would predict neurodevelopmental outcome as reflected in the lower than expected motor function at 18 months.

While these two cases show what abnormalities STEAM captures at the level of a single scan, further insights can be gathered by considering both scans together. First, we note that these two scans show very different patterns of abnormality despite the fact that both have T1 scans showing white matter lesions and both have altered functional outcomes at 18 months corrected age. If we performed a group-based study where the experimental group contained both of these infants' scans, the best that study would be able to do is identify brain regions where the overwhelming majority of experimental group scans were different from the corresponding control group. That group study would not capture potentially large intra-group differences as displayed in these two cases.

Further, it is interesting to note that the infant in Figure 6.10 showed a greater amount of STEAM-detected abnormalities, as well as lower neurodevelopmental scores at 18 months, than the infant in Figure 6.11. The amount of abnormality identified by STEAM in these two cases agrees with their later neurodevelopmental outcome. The same cannot be said for the scores obtained from the T1 scan. The T1 scan for the second infant (in Figure 6.11) showed a greater presence of white matter lesions than the first infant (in Figure 6.10), as well as presence of intraventricular hemorrhage, yet the second infant scored better on neurodevelopmental tests at 18 months corrected age. While these results are only for two cases out of many, they raise the question of whether the volume of STEAM-detected abnormalities may be, in part, indicative of future neurodevelopmental outcome.
We examine the potential of that abnormality-outcome relationship in the following section.

Relating STEAM Abnormalities to Outcome

While STEAM can be used to generate personalized abnormality maps, the question remains as to whether the abnormalities identified by STEAM are indeed meaningful and clinically relevant. We saw in the previous section that the volume of STEAM-detected abnormalities was indicative of neurodevelopmental outcome for two selected infants. Here, we examine whether that trend holds for our cohort as a whole. As a proof of concept, we narrow our goal to identifying whether the volume of STEAM-detected abnormalities can be used to differentiate between infants with normal and abnormal motor outcomes. Table 6.3 lists our cohort's experimental group according to Bayley Motor Score (18 months corrected age) and PMA at scan. Within this experimental group, we identify three main sub-groups based on outcome: those with a normal motor outcome (Bayley score > 85, highlighted in green), those with a clinically abnormal outcome (Bayley score ≤ 70, highlighted in red), and borderline cases (highlighted in yellow). To compare the volume of STEAM-detected abnormalities between the normal and abnormal groups, we

Table 6.3: The DTI scans used to test the relationship between our voxel-based analysis and motor outcome at 18 months corrected age. The number of scans are grouped according to post-menstrual age and Bayley motor score. Scans with clinically abnormal motor outcomes are highlighted in red while scans with normal motor outcomes are highlighted in green. Borderline, or low normal, cases are highlighted in yellow.

[Table 6.3 layout: rows by Bayley Motor Score, columns by Post-Menstrual Age (wks.), with Scan Total row and column]

first quantify the extent of STEAM-detected abnormalities as

\[ \text{Extent} = \frac{\text{Volume of Abnormal Region (in voxels)}}{\text{Brain Volume (in voxels)}} \tag{6.6} \]

so that the volume of STEAM-detected abnormalities is normalized by the brain volume. We then hypothesize that the extent of STEAM abnormalities should be significantly higher in the abnormal group than in the normal group. To test for this group difference, we perform a two-way ANCOVA (analysis of covariance) with sex, age at birth, and brain volume included as covariates. We note that each of these three covariates has been identified as having an impact on infant neurodevelopment [190, 78, 101, 22] and we wish to isolate the effect of these covariates from the group differences that STEAM may identify.

The two-way ANCOVA results are presented in the boxplots in Figure 6.12 for the four most commonly studied diffusion measures: FA, MD, RD, and λ1. For three of the four diffusion measures, we saw a significantly higher extent of STEAM-detected abnormalities in the abnormal outcome group than in the normal outcome group (MD: p = , RD: p = , λ1: p = ). In the case of FA, the p-value of was not significant, but was still small enough to suggest that a group difference could exist depending on the cohort and on the template settings. We discuss this point further in Section 6.5.
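The extent measure in (6.6) is a simple ratio of voxel counts. The sketch below shows one way to compute it from two binary masks; the function name `abnormality_extent` is hypothetical, and restricting the abnormal voxels to the brain mask is an assumption of this example rather than something stated in the text.

```python
import numpy as np

def abnormality_extent(abnormal_mask, brain_mask):
    """Fraction of brain voxels marked abnormal, following Eq. (6.6).

    Both inputs are boolean arrays of the same shape. Abnormal voxels
    outside the brain mask are ignored (an assumption of this sketch).
    """
    brain = np.asarray(brain_mask, dtype=bool)
    abnormal = np.asarray(abnormal_mask, dtype=bool) & brain
    n_brain = int(brain.sum())
    if n_brain == 0:
        raise ValueError("brain mask is empty")
    return float(abnormal.sum()) / n_brain

# Toy volume: 8 brain voxels, 2 of them abnormal, 1 abnormal voxel outside.
brain = np.zeros((4, 4, 4), dtype=bool)
brain[:2, :2, :2] = True
abnormal = np.zeros((4, 4, 4), dtype=bool)
abnormal[0, 0, 0] = True
abnormal[1, 1, 1] = True
abnormal[3, 3, 3] = True   # outside the brain mask, not counted
extent = abnormality_extent(abnormal, brain)
```

Because the measure is dimensionless, scans with different brain volumes can be compared directly, although brain volume is still included as a covariate in the group analysis.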
Note that the experimental group does not include the scans used to create the statistical templates, so the group differences identified here are present despite the fact that the scans in the normal motor outcome group showed abnormalities that eliminated them from being used in the templates themselves. Had our experimental group contained scans that met our control group criteria, we hypothesize that the group difference would be even larger. With regard to the covariates of sex, age at birth, and brain volume, we found no significant relationships between them and the extent of STEAM-detected abnormalities (smallest p-value = for brain volume and RD abnormalities, largest p-value = for birth age and λ1 abnormalities). We believe that the lack of significant results for these

[Figure 6.12 panels: boxplots for (a) FA, (b) MD, (c) RD, (d) λ1]

Figure 6.12: Comparison of STEAM-detected abnormalities in fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (λ1) between infants with normal and abnormal motor development. P-values were computed using two-way ANCOVA with sex, birth age, and brain volume included as covariates. Note that for all diffusion features except FA, the extent of STEAM-detected abnormalities is significantly higher for the infants with abnormal motor outcome.

covariates is a result of the fact that the templates do not differ along the dimensions of these covariates. When generating the statistical templates, all healthy scans are used equally and so the variability due to sex, age at birth, and brain volume is incorporated into the template itself, making it difficult to identify abnormalities related to those factors.
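To make the role of the covariates concrete, the following is a simplified sketch of a covariate-adjusted group comparison. It is not the two-way ANCOVA used in this work: it tests a single binary group factor by comparing two nested ordinary least-squares models, one with and one without the group term. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def _rss(design, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def ancova_group_f(y, group, covariates):
    """F statistic for a binary group effect after adjusting for covariates.

    Reduced model: intercept + covariates.
    Full model:    intercept + covariates + group indicator.
    """
    y = np.asarray(y, dtype=float)
    group = np.asarray(group, dtype=float)
    covariates = np.asarray(covariates, dtype=float).reshape(len(y), -1)
    reduced = np.hstack([np.ones((len(y), 1)), covariates])
    full = np.hstack([reduced, group[:, None]])
    df_num = 1
    df_den = len(y) - full.shape[1]
    rss_full = _rss(full, y)
    f_stat = ((_rss(reduced, y) - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, (df_num, df_den)

# Synthetic example: 40 scans, 3 covariates, and a strong group effect.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 20)
covs = rng.normal(size=(40, 3))
extent = 3.0 * group + covs @ np.array([0.5, -0.2, 0.1]) + 0.1 * rng.normal(size=40)
f_stat, dof = ancova_group_f(extent, group, covs)
```

A p-value would follow from the F distribution with the returned degrees of freedom, for example via `scipy.stats.f.sf(f_stat, *dof)` if SciPy is available.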

While the extent of STEAM abnormalities is able to differentiate, on average, between infants with normal and abnormal motor outcomes, we do note with red circles in Figure 6.12 the presence of outlier results (i.e., false positives) for the normal outcome group. These outliers can be explained by the fact that the normal motor outcome group is composed of scans that displayed at least one other measure of abnormality. Of the ten scans identified as outliers, three were from infants that had moderate to severe white matter injury as scored using [149], two were from infants that had low Bayley language scores (< 85), one showed presence of moderate IVH, and one was an outlier due to poor image registration (this outlier is shown in Figure 6.15 and discussed further in Section 6.5). The remaining three outlier scans were only outliers on the FA measure where the group difference was not significant.

Finally, we note that while the group differences were significant, the correlations between the extent of STEAM-detected abnormalities and Bayley motor scores were not (lowest p-value = for axial diffusivity; p-value computed using a linear mixed effects model to account for the presence of multiple scans from the same infants). As a result, we cannot guarantee that the extent of STEAM-detected abnormalities is sufficient to predict motor outcome. It is likely that abnormality severity and location may also play roles for more refined predictions of outcome. Even so, these group differences are consistent with the results from the previous section and suggest that STEAM is capturing meaningful abnormalities across our cohort.

Comparing STEAM Abnormalities to T1 Abnormalities

While we have shown examples of how STEAM can generate personalized results and have shown that those results can be related to neurodevelopmental outcome, we have yet to look at how STEAM compares to existing ways of grading a preterm infant's individual MR scan.
Table 6.4: The DTI scans used to test the relationship between our voxel-based analysis and the presence of white matter injury (WMI) scored according to [149]. The number of scans are grouped according to post-menstrual age and WMI score. Scans with significant white matter lesions are highlighted in red while lesion-free scans are highlighted in green. Scans with single, small lesions are highlighted in yellow.

[Table 6.4 layout: rows by WMI Grade [149] (0 - Absent, Mild, Moderate, Severe), columns by Post-Menstrual Age (wks.), with Scan Total row and column]

In particular, we are interested in knowing if the results obtained from STEAM are consistent with those obtained from another MRI grading system and whether STEAM

provides any additional information over such a grading system. To examine this issue, we compare STEAM to a white matter injury (WMI) grading system that was found to be predictive of adverse neurodevelopmental outcome in preterm infants at 12 to 18 months of age [149]. This WMI grading system is based on the examination of the size and number of hyperintense white matter lesions seen on the infant's T1 MRI. These lesions have been identified as being indicative of axonal dematuration [20]. Given these lesion grades, we expect to see the extent of STEAM-detected abnormalities increase in the presence of lesions.

Table 6.4 lists our cohort's experimental group according to their white matter injury grade as defined in [149]. Lesion-free scans are identified in green while scans with a significant number of lesions are highlighted in red. More marginal cases containing single, small lesions (less than 2 mm in diameter) are highlighted in yellow. To test whether the presence of lesions affects STEAM-detected abnormalities, we perform a two-way ANCOVA between the lesion-free group (in green) and the group of scans containing significant lesions (in red). We use ANCOVA here to test for a difference in STEAM-detected abnormalities between these groups while isolating the effects of sex, age at birth, and brain volume as covariates.

The two-way ANCOVA results are presented as boxplots in Figure 6.13 for the four studied diffusion measures: FA, MD, RD, and λ1. In all cases, we saw a larger amount of STEAM-detected abnormalities in the group with lesions than in the group without lesions. Excluding FA, the group differences were statistically significant (MD: p = , RD: p = , λ1: p = ). In the case of FA, the larger extent of STEAM-detected abnormalities in the group with lesions was not suggestive of a true group difference (p = ).
Once again, note that these group differences are present despite the fact that our experimental group does not contain any of the scans used to create the statistical templates. As a result, the lesion-free scans used in this analysis still contain some abnormalities that eliminated them from being used to create the templates themselves.

While we are able to use STEAM abnormality extent to differentiate between lesion and non-lesion groups, we once again note the presence of outliers. These outlier scans in the non-lesion group can also be explained by some other measure of abnormality. Of the 12 scans identified as outliers, six were from infants with low Bayley motor scores (< 85), two were from infants with low Bayley language scores (< 85), one showed presence of moderate IVH, and one scan was an outlier due to poor registration (this is the same outlier shown in Figure 6.15). The two remaining scans were only outliers on the FA measure where the group difference was not significant.

The presence of these outliers makes it impossible for us to guarantee that the presence of white matter lesions always results in a greater extent of STEAM-detected abnormalities. In fact, Figures 6.10 and 6.11 show examples of scans where a greater lesion volume actually resulted in a lower extent of STEAM-detected abnormalities. That said, the group

[Figure 6.13 panels: boxplots for (a) FA, (b) MD, (c) RD, (d) λ1]

Figure 6.13: Comparison of STEAM-detected abnormalities in fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (λ1) between infants with and without white matter lesions. P-values were computed using two-way ANCOVA with sex, birth age, and brain volume included as covariates. Note that for all diffusion features except FA, the extent of STEAM-detected abnormalities is significantly higher for the infants with white matter lesions, which is consistent with previous literature [20].

differences reported in Figure 6.13 do agree with the literature reviewed in [20] suggesting a link between hyperintense T1 lesions and more diffuse brain injuries.

6.5 Discussion

We have introduced herein the STEAM technique for the personalized analysis of DTI scans of the developing preterm infant brain. STEAM consists of two parts. First, we created a collection of statistical DTI templates for both the full diffusion tensor as well as a range of features derived from the diffusion tensors (e.g., FA, MD). Our template-creation pipeline is based on the technique of Guimond et al. and ensures an unbiased estimate of the average DTI scan of a population [96]. As part of that template estimation, we employed DT-REFinD, a full tensor DTI registration algorithm, to obtain the greatest accuracy we could in aligning anatomical structures across the DTI scans from our control group. The resulting templates contained the mean, variance, and normalcy p-value estimates at each voxel for a normative preterm infant population. This template estimation allows us to generate a statistical model offline, which reduces the number of image registrations and statistical computations that need to be done to analyze a DTI scan on-the-fly.

The second component of STEAM is a full VBA processing pipeline that involves aligning an individual DTI scan to the template, then performing voxel-by-voxel statistical tests to identify abnormalities. Following the advice given in [122, 123], we examined various choices involved in setting up a VBA pipeline, in particular the multiple comparison correction scheme, what level of image smoothing to perform, and what to do with data that is not normally distributed. In all three cases, we followed the accepted convention in the VBA field and, in the latter two cases, proposed the use of normalcy p-value images P i and a collection of smoothed templates that capture a range of spatial scales. The results from our STEAM analysis engine are summarized as subject-specific abnormality maps, maps that no other existing technique provides.
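As a rough illustration of the voxel-by-voxel testing step, the sketch below compares a registered subject image against a template mean and standard deviation and then applies a multiple comparison correction. Several choices here are assumptions made only for the example and are not taken from the STEAM pipeline: the values are treated as Gaussian (a two-sided z-test), the template standard deviation is assumed to be positive everywhere, and Benjamini-Hochberg false discovery rate control stands in for whichever correction scheme the pipeline uses. All function names are hypothetical.

```python
import math
import numpy as np

def voxelwise_p_values(subject, template_mean, template_std):
    """Two-sided p-value per voxel under a Gaussian assumption."""
    z = (np.asarray(subject, dtype=float) - template_mean) / template_std
    erfc = np.vectorize(math.erfc)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return erfc(np.abs(z) / math.sqrt(2.0))

def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of voxels rejected at false discovery rate alpha."""
    flat = np.asarray(p_values, dtype=float).ravel()
    order = np.argsort(flat)
    m = flat.size
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = flat[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        last = np.max(np.nonzero(passed)[0])   # largest rank that passes
        reject[order[:last + 1]] = True
    return reject.reshape(np.shape(p_values))

# Toy template with zero mean and unit variance; one clearly deviating voxel.
mean = np.zeros((2, 2, 2))
std = np.ones((2, 2, 2))
subject = np.zeros((2, 2, 2))
subject[0, 0, 0] = 6.0
p = voxelwise_p_values(subject, mean, std)
abnormal = benjamini_hochberg(p, alpha=0.05)
```

In this toy case only the deviating voxel survives the correction. Because the template statistics are computed offline, a step like this needs only one registration and one pass over the voxels per new scan.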
We evaluated STEAM first qualitatively by showing that our generated templates display the type of brain development that is consistent with the reduction of the subplate zone [128, 191], the increased dendritic arborization in the cortex [223, 152], and the rapid development of the occipital lobes [210, 107, 211] that has been observed in previous DTI, MRI, and histological studies. We also showed qualitative examples of STEAM's voxel-based analysis on four common diffusion features (FA, MD, RD, λ1) and identified how the resulting abnormality maps both corroborate and expand upon the results seen on T1 MRI scans [20].

We further evaluated STEAM quantitatively by performing VBA on the 113 DTI scans from our cohort that were not used in the creation of our templates. We showed that there exists a relationship between the extent of abnormalities detected by STEAM and neurodevelopmental outcome at 18 months corrected age. We also identified a relationship between the presence of white matter lesions and an increased volume of STEAM-detected abnormalities, which is consistent with existing literature [20]. These results serve as a

Figure 6.14: A collection of views from the STEAM website. From the website, one can download the STEAM source code, template collections (a) or even individual templates within each collection (b). Also, the STEAM website offers an online image viewer (c) that allows an interested user to examine each STEAM statistical template used in this work (d).

proof of concept and show that STEAM is sufficiently reliable to be useful for preterm DTI analysis.

Finally, we have made our STEAM templates, as well as the source code for STEAM, publicly available to further facilitate research involving preterm DTI analysis. The code and templates are available on the STEAM website, which allows users to download the whole template collection, the source code, or even individual templates (Figure 6.14(a)). When a request is submitted (Figure 6.14(b)), a PHP script collects the requested files (the template images are provided in NIfTI format) into a single zip archive and sends a link to the archive to the requesting user. The STEAM website also allows for online 3D viewing of the STEAM templates used in this work (Figure 6.14(c)). Every image in our STEAM experiments can be viewed through this online image viewer and users can browse between different diffusion measures, age ranges, statistics, and image scales (Figure 6.14(d)). While we do provide access to our statistical templates, we recommend that users create their own templates for their studies as the choice of scanner and imaging protocol can impact diffusion measurements [237].


[Figure 6.15 panels: (a) FA STEAM abnormality map; (b) blended FA images (subject in purple, template in green)]

Figure 6.15: An example of image misregistration leading to false positives in the STEAM abnormality maps. In (a), a large region of FA abnormality is identified by STEAM around the right corticospinal tract (highlighted by blue arrows). In (b), we see in the same area that the subject's right corticospinal tract (in purple) does not align with the corresponding tract in the template (again highlighted by blue arrows). In this case, the misalignment pattern matches the abnormality pattern, strongly suggesting that these STEAM-detected abnormalities are false positives.

While STEAM has the strength of providing a fine-scale, personalized assessment of DTI abnormality over the whole brain, it is not without limitations. The abnormality detection that STEAM performs is based on VBA: a technique that has seen its share of criticism over the years [203, 123, 122]. The primary criticism has been the impact of registration quality on VBA results. If the image registration step does not succeed in aligning the anatomy in the DTI scans, the resulting statistical tests would not be comparing properties of the same anatomical region. The result of this registration error in the template creation step would be increased variance in the template, variance that could hide true abnormalities from being detected by STEAM. If there is registration error when comparing a new scan to the template, then misaligned structures would be identified as abnormalities. One such example was shown in Figure 6.10(c) where the posterior portion of an infant's ventricles did not align to the template's mean image.

We have attempted to mitigate the impact of image registration errors in multiple ways. First, we used DT-REFinD: a state-of-the-art tensor image registration algorithm that uses the full diffusion tensor to guide the image registration process [234].
In doing so, DT-REFinD is able to provide a more accurate structural alignment than registration algorithms based only on FA or some feature derived from the tensors [178]. Further, DT-REFinD is a non-linear registration algorithm that allows us to deform and align images with greater freedom than a linear registration algorithm like FSL FLIRT [112, 113]. Even so, results in Figure 6.10 suggest that registration error can still persist in STEAM, potentially because topological differences may exist between brains that image registration algorithms have difficulty addressing [82]. It is for this reason that we recommend examining the
