William Yang Group 14 Mentor: Dr. Rogerio Richa Visual Tracking of Surgical Tools in Retinal Surgery using Particle Filtering

Similar documents
3D Registration based on Normalized Mutual Information

Speeding up Mutual Information Computation Using NVIDIA CUDA Hardware

Optical Flow Estimation with CUDA. Mikhail Smirnov

Matching. Compare region of image to region of image. Today, simplest kind of matching. Intensities similar.

Image Registration Lecture 4: First Examples

Nonrigid Surface Modelling. and Fast Recovery. Department of Computer Science and Engineering. Committee: Prof. Leo J. Jia and Prof. K. H.

Multi-modal Image Registration Using the Generalized Survival Exponential Entropy

Analyzing Stochastic Gradient Descent for Some Non- Convex Problems

B. Tech. Project Second Stage Report on

Efficient Histogram Algorithms for NVIDIA CUDA Compatible Devices

Georgia Institute of Technology, August 17, Justin W. L. Wan. Canada Research Chair in Scientific Computing

Motion estimation for video compression

Comparing Dropout Nets to Sum-Product Networks for Predicting Molecular Activity

Fast Image Registration via Joint Gradient Maximization: Application to Multi-Modal Data

Capturing, Modeling, Rendering 3D Structures

ANALYSIS OF RELIABILITY AND IMPACT FACTORS OF MUTUAL INFORMATION SIMILARITY CRITERION FOR REMOTE SENSING IMAGERY TEMPLATE MATCHING

Displacement estimation

l ealgorithms for Image Registration

Direct Rendering. Direct Rendering Goals

CS 231A Computer Vision (Fall 2012) Problem Set 3

Peripheral drift illusion

Image Registration. Prof. Dr. Lucas Ferrari de Oliveira UFPR Informatics Department

EE795: Computer Vision and Intelligent Systems

Autonomous Navigation for Flying Robots

2D-3D Registration using Gradient-based MI for Image Guided Surgery Systems

Hybrid Spline-based Multimodal Registration using a Local Measure for Mutual Information

RIGID IMAGE REGISTRATION

OpenCL Implementation Of A Heterogeneous Computing System For Real-time Rendering And Dynamic Updating Of Dense 3-d Volumetric Data

Face Detection CUDA Accelerating

Parallelism. CS6787 Lecture 8 Fall 2017

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

CS664 Lecture #16: Image registration, robust statistics, motion

The SIFT (Scale Invariant Feature

Real-Time Scene Reconstruction. Remington Gong Benjamin Harris Iuri Prilepov

Novel Lossy Compression Algorithms with Stacked Autoencoders

A Pyramid Approach For Multimodality Image Registration Based On Mutual Information

Medical Image Registration by Maximization of Mutual Information

Stereo Vision II: Dense Stereo Matching

Feature Tracking and Optical Flow

Scan Matching. Pieter Abbeel UC Berkeley EECS. Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

Accurate 3D Face and Body Modeling from a Single Fixed Kinect

Direct Matrix Factorization and Alignment Refinement: Application to Defect Detection

GPU Accelerating Speeded-Up Robust Features Timothy B. Terriberry, Lindley M. French, and John Helmsen

A Fast GPU-Based Approach to Branchless Distance-Driven Projection and Back-Projection in Cone Beam CT

4.12 Generalization. In back-propagation learning, as many training examples as possible are typically used.

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Graphics Processing Unit (GPU) Acceleration of Machine Vision Software for Space Flight Applications

Dense Image-based Motion Estimation Algorithms & Optical Flow

ICA mixture models for image processing

Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology

CS 223B Computer Vision Problem Set 3

NIH Public Access Author Manuscript Proc Int Conf Image Proc. Author manuscript; available in PMC 2013 May 03.

FFT-Based Astronomical Image Registration and Stacking using GPU

Robust Shape Retrieval Using Maximum Likelihood Theory

Outline 7/2/201011/6/

Visual Tracking. Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania.

Image Processing. Image Features

Random projection for non-gaussian mixture models

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs

XIV International PhD Workshop OWD 2012, October Optimal structure of face detection algorithm using GPU architecture

Tracking Computer Vision Spring 2018, Lecture 24

Discriminative classifiers for image recognition

Local Image Features

Motion Estimation and Optical Flow Tracking

GPUML: Graphical processors for speeding up kernel machines

Utilizing Graphics Processing Units for Rapid Facial Recognition using Video Input

2D Image Processing Feature Descriptors

IMAGE registration [1] is a technique for defining. Efficient Acceleration of Mutual Information Computation for Nonrigid Registration using CUDA

High-Performance Image Registration Algorithms for Multi-Core Processors. A Thesis. Submitted to the Faculty. Drexel University

A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors

The Lucas & Kanade Algorithm

EE795: Computer Vision and Intelligent Systems

Parallel Tracking. Henry Spang Ethan Peters

Registration of Hyperspectral and Trichromatic Images via Cross Cumulative Residual Entropy Maximisation

Learning-based Neuroimage Registration

A 2D-3D Image Registration Algorithm using Log-Polar Transforms for Knee Kinematic Analysis

Discussion on Bayesian Model Selection and Parameter Estimation in Extragalactic Astronomy by Martin Weinberg

Feature Tracking and Optical Flow

Nonrigid Registration using Free-Form Deformations

Pattern Matching & Image Registration

Non-Rigid Image Registration III

K-Means and Gaussian Mixture Models

implementation using GPU architecture is implemented only from the viewpoint of frame level parallel encoding [6]. However, it is obvious that the mot

Lecture 19: Depth Cameras. Visual Computing Systems CMU , Fall 2013

Feature Based Registration - Image Alignment

Advanced Imaging Applications on Smart-phones Convergence of General-purpose computing, Graphics acceleration, and Sensors

Mixture Models and EM

A Novel Image Super-resolution Reconstruction Algorithm based on Modified Sparse Representation

Ninio, J. and Stevens, K. A. (2000) Variations on the Hermann grid: an extinction illusion. Perception, 29,

Fast marching methods

Portland State University ECE 588/688. Graphics Processors

Local Features Tutorial: Nov. 8, 04

Overview of Proposed TG-132 Recommendations

Large Displacement Optical Flow & Applications

Visual Tracking. Antonino Furnari. Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania

ELEC Dr Reji Mathew Electrical Engineering UNSW

Digital Volume Correlation for Materials Characterization

Object Detection Design challenges

Map-Enhanced UAV Image Sequence Registration and Synchronization of Multiple Image Sequences

Transcription:

Mutual Information Computation and Maximization Using GPU Yuping Lin and Gérard Medioni Computer Vision and Pattern Recognition Workshops (CVPR) Anchorage, AK, pp. 1-6, June 2008 Project Summary and Paper Selection Our project, which involves developing a software package for the visual tracking of surgical tools during retinal surgery, involves the use of a particle filter for stochastic optimization tracking of mutual information, a visual similarity measure useful under conditions of limited texture information and variable illumination and rotation. A robust optimization procedure would use a higher number of particles in the filter, which results in an increase in computational power to maintain a high refresh rate. Part of our maximum deliverable is to run part of the software on a graphics processing unit (GPU) using CUDA, NVIDIA s GPU computing architecture, which is capable of certain types of massively parallel calculations, and would increase the computational throughput available to us. This paper, by Lin and Medioni, evaluate a GPU implementation of calculating the mutual information metric. The direct relevance to our project as a result of the use of mutual information with a GPU/CUDA is apparent. In addition, images from retinal surgery were used as testing datasets for this paper, which should give us a realistic idea of what to expect from a GPU in our environment. Technical Summary Introduction Mutual information is an information theory quantity that is useful for registration and comparison of multi-modal images. Aligning these images is especially important in medical imaging, but is a difficult problem to solve especially with direct pixel-to-pixel similarity measures such as sum squared difference (SSD) or normalized cross-correlation (NCC). Maximization of mutual information is a common approach and is the one taken in this paper; however, it is computationally expensive as calculating the marginal and joint probabilities at each pixel requires an exponential number of computations. This paper focuses on adapting

Viola s approach to computing mutual information since it provides a close form solution of the derivatives needed for a gradient descent approach to maximization. Mutual Information Approximation The mutual information of two random variables u and v is defined as MI(u,v) = h(u)+h(v) h(u,v) where the entropy and joint entropy are defined respectively as: h(v) = -sum v [p(v) ln(p(v))] h(u,v) = -sum u,v [p(u,v) * ln(p(u,v))] Viola s method estimates the probability density p(v) over the samples A by the Parzen Window method: p(v) = (1/N A ) * sum (vj from A) G w (v-v j ) where G w is a Gaussian function with variance w. This technique, robust to noise, uses a Gaussian Mixture model to estimate the density of samples drawn from an unknown distribution. Since the entropy function can also be written as the negative expectation of ln p(v), the entropy of a random variable can then be approximated as where B is another sample set. This leads us to arrive at an approximation of mutual information: MI(u,v) is then maximized over a rigid transformation T using an approximated derivative: where W is a weighting factor. CUDA Implementation On a fundamental level, CUDA features the Single Instruction, Multiple Data (SIMD) computational architecture, where a single instruction operates on multiple data streams in lock step. Use of the shared memory is key to speeding up computations as it is located on-chip;

however, it is relatively small in size (16K in the GPU tested in the paper). The inner summations of MI(u,v) and (d/dt)mi(u,v) were parallelized as each element in B can be calculated independently of the others. Such an approach also facilitates fast loading of shared memory from global memory since parallel loading is possible when there are no bank conflicts. Validation and Results Registration of two retinal images using a rigid transformation was performed by maximization of mutual information. A triangular mesh was used, where the mutual information derivative is computed over the displacements of the vertices of each triangle. Gradient descent was then performed iteratively until convergence. The paper found that, as the number of samples increased, the time for the GPU to compute MI derivatives remained relatively constant while CPU time increased quadratically. At 1000 samples, a CPU running four parallel threads was still two orders of magnitude slower than the GPU. In another experiment, a 50 iteration registration between two retinal images was performed. Samples on fractional positions used bilinear interpolation on the CPU; since each iteration required a new interpolation, additional overhead of copying the interpolated data to the GPU memory for every iteration was incurred. Nevertheless, the CPU implementation took hours while the GPU implementation completed in minutes.

Conclusions and Future Work Using the approximation methods described here as well as effectively using shared memory enables using mutual information with a high enough number of samples and iterations for use in image registration within a reasonable amount of time as compared to a CPU-based approach. Future work would entail integration into applications and conducting additional experiments. The authors also expressed interest in comparing the performance with other approximation methods. Critical Analysis This paper is interesting from both a technical standpoint and in terms of presenting a novel approach using current technology. The mathematical details of the algorithm were presented adequately, and a serial pseudocode algorithm was also included in the paper. On the other hand, while there is some description of the experiments that the authors performed to evaluate performance, I found that more detail about the test datasets would have been helpful. In addition, it would be ideal if the retinal images used could be released or be from a standardized research dataset. The paper could have also discussed the pros and cons of choosing Viola s algorithm as compared to other options, although the authors do mention exploring other algorithms as part of future work. Variance between different testing datasets was also absent, which limited my ability to assess how data-specific their results were. From a technical standpoint, I felt that some consideration should have been given to the memory bandwidth of the CPU implementation, especially as it was considered an important factor in GPU performance. Bilinear interpolation is expensive on the CPU and can be done on the GPU using texture memory 1, and is future work that I would like to see. I will grant the

benefit of the doubt that the older CUDA architecture being used in the paper may not yet have had this feature, which is present in the CUDA architecture today. Relevance to Project and Conclusions We currently use a joint histogram method to calculate mutual approximation, which the authors of the paper identified as being more difficult to extend to a GPU for computation of MI derivatives. We may proceed with our current histogram method as a paper is cited as having successfully implemented mutual information with histograms on a GPU 2. In addition, we do not need to calculate the derivatives of MI for gradient descent maximization; we identified early on that, for our purposes, gradient descent was inappropriate as an optimization method due to the problem of local minima in tool tracking. The factor not addressed is the performance of calculating many mutual information measures in succession over a smaller number of samples, as would effectively be our case with the particle filter. Preliminary testing with source code released by the authors of this paper as well as by Shams has not yielded promising results; however, the current method employed is admittedly inefficient where the high number of memory transfers between the host and GPU device is too high to realize any benefits from the parallel processing on the GPU. In addition, we hope to run multiple mutual information calculations on the GPU in parallel, rather than accelerate the speed of an individual MI calculation. Nevertheless, this paper effectively demonstrates that a GPU implementation of mutual information possesses the capability to significantly increase computational throughput. Reading List 1. NVIDIA CUDA C Programming Guide 4.1. 2011 2. R. Shams and N. Barnes. Speeding up mutual information computation using NVIDIA CUDA hardware. Digital Image Computing Techniques and Applications, pages 555 560, 2007.