Registration by continuous optimisation
Stefan Klein, Erasmus MC, the Netherlands
Biomedical Imaging Group Rotterdam (BIGR)
Registration = optimisation
[Figure: cost C as a function of translation parameters t_x and t_y]
Example
[Figure: fixed image and moving image]
Math
F(x) = fixed image, M(x) = moving image, x = voxel coordinate
Transformation function: T(x; p), with p = vector of transformation parameters
Cost function: C(p) measures the similarity of the fixed image F(x) and the deformed moving image M(T(x; p))
Find the p that minimises C
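A minimal sketch of this formulation, assuming a 1-D image and a pure translation T(x; p) = x + p (function and variable names are illustrative, not from the slides):

```python
def cost_ssd(F, M, p):
    """Mean of squared differences between fixed image F and
    moving image M shifted by translation parameter p (1-D toy case).
    Samples of M that fall outside the image are skipped."""
    total, n = 0.0, 0
    for x in range(len(F)):
        tx = x + p  # transformation T(x; p) = x + p
        if 0 <= tx < len(M):
            total += (F[x] - M[tx]) ** 2
            n += 1
    return total / n

# moving image is the fixed image shifted right by 2 voxels
F = [0, 0, 1, 4, 1, 0, 0, 0]
M = [0, 0, 0, 0, 1, 4, 1, 0]
# C(p) reaches its minimum (zero) at p = 2
assert cost_ssd(F, M, 2) == 0.0
```

In a real registration, p would be a vector (e.g. B-spline coefficients) and the sum would run over 3-D voxel coordinates, but the structure of C(p) is the same.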
Iterative optimisation
p_{k+1} = p_k + a_k d_k
d_k = search direction, a_k = step size
Gradient descent: d_k = -(∂C/∂p)(p_k) = -g_k
Gradient descent
p_{k+1} = p_k - a_k g_k, written out per component:
(p_1, p_2, p_3, ...)_{k+1} = (p_1, p_2, p_3, ...)_k - a_k (g_1, g_2, g_3, ...)_k,
where g_i = ∂C/∂p_i evaluated at p_k
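The update rule above can be sketched as a generic loop; here g_k is approximated by central finite differences of the cost (names and default values are illustrative):

```python
def gradient_descent(C, p0, step=0.1, iters=100, eps=1e-6):
    """p_{k+1} = p_k - a_k * g_k with a constant step size a_k = step.
    The gradient g_k is estimated by central finite differences of C."""
    p = list(p0)
    for _ in range(iters):
        g = []
        for i in range(len(p)):
            p_hi = p[:]; p_hi[i] += eps
            p_lo = p[:]; p_lo[i] -= eps
            g.append((C(p_hi) - C(p_lo)) / (2 * eps))
        p = [pi - step * gi for pi, gi in zip(p, g)]
    return p

# quadratic cost with minimum at (1, -2)
C = lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2
p_opt = gradient_descent(C, [0.0, 0.0])
```

In practice the derivative is computed analytically (next slide) rather than by finite differences, which would cost one cost-function evaluation per parameter per iteration.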
Cost function derivative
Example for the mean of squared differences:

C(p) = (1/N) Σ_x [ F(x) - M(T(x; p)) ]²

∂C/∂p = -(2/N) Σ_x [ F(x) - M(T(x; p)) ] · (∂M/∂x)(T(x; p)) · (∂T/∂p)(x; p)
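The chain-rule expression above, sketched for the 1-D translation toy case where ∂T/∂p = 1 and ∂M/∂x is approximated by central differences (all names are hypothetical):

```python
def ssd_and_gradient(F, M, p):
    """C(p) = (1/N) sum (F(x) - M(x+p))^2 and its derivative
    dC/dp = -(2/N) sum (F(x) - M(x+p)) * M'(x+p) * 1,
    using dT/dp = 1 for the translation T(x; p) = x + p."""
    c, g, n = 0.0, 0.0, 0
    for x in range(len(F)):
        tx = x + p
        if 1 <= tx < len(M) - 1:               # need neighbours for M'
            diff = F[x] - M[tx]
            dM = (M[tx + 1] - M[tx - 1]) / 2.0  # central difference
            c += diff ** 2
            g += -2.0 * diff * dM
            n += 1
    return c / n, g / n

F = [0, 1, 4, 1, 0, 0]
M = [0, 0, 1, 4, 1, 0]   # F shifted right by 1
# at the optimum p = 1 both the cost and the gradient vanish
```

Note that only the image derivative ∂M/∂x and the transformation Jacobian ∂T/∂p are needed, so the same scheme extends to any differentiable transformation model.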
Choice of d_k
p_{k+1} = p_k + a_k d_k
Choice of d_k
[Contour plots of C over (p_1, p_2): gradient descent; smarter steps; cheaper steps]
Choice of d_k: p_{k+1} = p_k + a_k d_k
gradient descent: d_k = -g_k
Newton: d_k = -[H_k]^{-1} g_k   (smarter steps)
quasi-Newton: d_k = -B_k g_k   (smarter steps)
conjugate gradient: d_k = -g_k + β_k d_{k-1}   (smarter steps)
stochastic gradient: d_k ≈ -g_k   (cheaper steps)
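A sketch of the conjugate-gradient variant from the list, using the Fletcher-Reeves choice of β_k (one common option; the slide does not fix a particular β, and real implementations combine this with a line search rather than a constant step):

```python
def conjugate_gradient_descent(grad, p0, step=0.1, iters=50):
    """d_k = -g_k + beta_k * d_{k-1}, with the Fletcher-Reeves
    coefficient beta_k = (g_k . g_k) / (g_{k-1} . g_{k-1})."""
    p = list(p0)
    g_prev, d = None, None
    for _ in range(iters):
        g = grad(p)
        if d is None:
            d = [-gi for gi in g]          # first step: steepest descent
        else:
            beta = (sum(gi * gi for gi in g)
                    / sum(gi * gi for gi in g_prev))
            d = [-gi + beta * di for gi, di in zip(g, d)]
        p = [pi + step * di for pi, di in zip(p, d)]
        g_prev = g
    return p

# gradient of C(p) = p1^2 + p2^2, minimum at the origin
grad = lambda p: [2 * p[0], 2 * p[1]]
p_opt = conjugate_gradient_descent(grad, [1.0, -2.0])
```

The β_k d_{k-1} term adds "memory" of the previous direction, which is what distinguishes conjugate gradient from plain gradient descent.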
Experimental comparison
Cardiac CT, 97×97×97 voxels, artificially deformed
Experimental comparison
Error measure: e = (1/N) Σ_x ‖ T(x) - T̂(x) ‖
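This error measure, sketched for a set of sample points (T̂ is the ground-truth transformation; all names are illustrative):

```python
import math

def registration_error(T, T_true, points):
    """e = (1/N) * sum_x || T(x) - T_true(x) ||: the mean Euclidean
    distance between estimated and ground-truth transformed points."""
    total = 0.0
    for x in points:
        a, b = T(x), T_true(x)
        total += math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return total / len(points)

# ground truth: translation by (2, 0); estimate: translation by (2, 1)
T_true = lambda x: (x[0] + 2, x[1])
T_est = lambda x: (x[0] + 2, x[1] + 1)
points = [(0, 0), (1, 5), (3, 2)]
# every point is off by exactly 1 mm in y, so e = 1.0
assert registration_error(T_est, T_true, points) == 1.0
```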
Experimental comparison 3 gradient descent quasi-newton conjugate gradient stochastic gradient e [mm] 2.5 2 1.5 1 0.5 0 0.001 0.01 0.1 1 10 100 1000 computation time
Choice of a_k
p_{k+1} = p_k + a_k d_k
Choice of a_k
[Contour plots of C over (p_1, p_2): too small steps; too large steps]
Choice of a_k: p_{k+1} = p_k + a_k d_k
constant: a_k = a
slowly decaying: a_k = f(k) = a / (A + k)
exact line search: a_k = argmin_a C(p_k + a d_k)
inexact line search: a_k ≈ argmin_a C(p_k + a d_k)   [Wolfe conditions]
adaptive: a_k = F(progress in previous iterations)
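The inexact line search can be sketched with a simple backtracking scheme that enforces the sufficient-decrease (Armijo) condition, which is the first of the two Wolfe conditions mentioned on the slide (parameter names and defaults are illustrative):

```python
def backtracking_step(C, p, d, g, a0=1.0, rho=0.5, c1=1e-4):
    """Shrink the trial step a until the sufficient-decrease condition
    C(p + a*d) <= C(p) + c1 * a * (g . d) holds.
    g is the gradient at p and d a descent direction (g . d < 0)."""
    a = a0
    gd = sum(gi * di for gi, di in zip(g, d))
    while C([pi + a * di for pi, di in zip(p, d)]) > C(p) + c1 * a * gd:
        a *= rho
    return a

C = lambda p: (p[0] - 1) ** 2   # 1-D quadratic, minimum at p = 1
p, g = [3.0], [4.0]             # g = dC/dp at p
d = [-4.0]                      # steepest descent direction
a = backtracking_step(C, p, d, g)
# a = 0.5, which here happens to land exactly on the minimum
```

A full Wolfe line search additionally checks a curvature condition on the new gradient; the backtracking version above is the cheaper variant most often used in practice.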
Stochastic gradient descent with adaptive strategy for a_k
p_{k+1} = p_k - f(t_k) g_k, with f(t_k) = a / (A + t_k)
t_{k+1} = t_k + sigmoid(-g_k^T g_{k-1})
[Plots: f(t) decaying from a at t = 0; sigmoid ranging from -1 to 1]
Choose a such that: max. voxel displacement per iteration < δ [mm] (with 95% probability)
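A sketch of one iteration of this adaptive scheme, assuming a sigmoid that ranges in (-1, 1) as in the slide's plot (the exact sigmoid shape and the constants a, A are user-chosen; names are illustrative):

```python
import math

def adaptive_sgd_step(p, g, g_prev, t, a=1.0, A=20.0):
    """One iteration of stochastic gradient descent with adaptive
    step sizes: p_{k+1} = p_k - f(t_k) g_k with f(t) = a / (A + t),
    and t_{k+1} = max(0, t_k + sigmoid(-g_k . g_{k-1})).
    Correlated consecutive gradients (g . g_prev > 0) decrease the
    "time" t and hence increase the step size; oscillating gradients
    increase t and shrink the step size."""
    sigmoid = lambda x: 2.0 / (1.0 + math.exp(-x)) - 1.0  # range (-1, 1)
    step = a / (A + t)
    p_new = [pi - step * gi for pi, gi in zip(p, g)]
    dot = sum(gi * hi for gi, hi in zip(g, g_prev))
    t_new = max(0.0, t + sigmoid(-dot))
    return p_new, t_new
```

The appeal of this rule is that it reacts to the optimisation itself: far from the optimum the stochastic gradients stay correlated and the steps grow, while near the optimum they start to oscillate and the steps automatically decay.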
Experimental comparison
6 prostate MR image pairs: nonrigid registration
Evaluation measure: overlap of manual segmentations after registration
Experimental comparison
[Plots: segmentation overlap (0.8-0.95) as a function of δ (0.03125-8.0 mm) and a (1.25-320), with A = 2000; non-adaptive vs. adaptive]
Experimental comparison
Experiments with: brain, lung, prostate; CT, MRI; sum of squared differences, mutual information, normalized mutual information; rigid, nonrigid; A = 20, δ = voxel size
Good results in all experiments!
Local similarity measures
MI = mutual information; assumes the grey-value distribution does not vary over the image domain
LMI = localised mutual information = (1/N) Σ_x MI(x)   (aka regional MI, conditional MI, spatial information encoded MI)
Can be efficiently implemented with stochastic gradient descent!
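A toy 1-D sketch of the LMI idea: estimate MI from a joint histogram, but on local patches rather than the whole image, then average (histogram-based MI on discrete values; window size and names are illustrative):

```python
import math
from collections import Counter

def mutual_information(a, b):
    """MI of two equally long discrete-valued sequences,
    estimated from their joint histogram."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    mi = 0.0
    for (va, vb), c in joint.items():
        # p_ab * log( p_ab / (p_a * p_b) ), rewritten with counts
        mi += (c / n) * math.log(c * n / (pa[va] * pb[vb]))
    return mi

def localised_mi(F, M, window=4):
    """LMI = (1/N) * sum over windows x of MI(x): mutual information
    computed on local patches instead of the whole image (1-D sketch)."""
    patches = [(F[i:i + window], M[i:i + window])
               for i in range(0, len(F) - window + 1, window)]
    return sum(mutual_information(f, m) for f, m in patches) / len(patches)
```

Real implementations use smoothed (e.g. Parzen-window) histograms and overlapping regions, but the structure, a per-region MI averaged over the image, is the same.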
Summary
Parametric formulation can be solved by continuous optimisation
Derivative-based methods: require ∂C/∂p; extensive literature
Basic method: gradient descent
Popular choice: quasi-Newton or conjugate gradient, in combination with inexact line search
Recommended: stochastic gradient descent with adaptive step sizes
Literature
Nocedal & Wright, Numerical Optimization
Klein, Staring, Pluim, "Evaluation of optimization methods for nonrigid medical image registration using mutual information and B-splines", IEEE Trans. Image Processing, 2007
Klein, Pluim, Staring, Viergever, "Adaptive stochastic gradient descent optimisation for image registration", Int. J. Computer Vision, 2009
Thévenaz, Unser, "Optimization of mutual information for multiresolution image registration", IEEE Trans. Image Processing, 2000
elastix
Rigid and nonrigid registration; various cost functions, transformation models, multiresolution strategies, etc.
Many optimisation algorithms implemented
Free: http://elastix.isi.uu.nl
Based on Insight ToolKit (ITK): http://www.itk.org
Klein, Staring, Murphy, Viergever, Pluim, "elastix: a toolbox for intensity-based medical image registration", IEEE Trans. Medical Imaging, 2010