Registration by continuous optimisation
Stefan Klein, Erasmus MC, the Netherlands
Biomedical Imaging Group Rotterdam (BIGR)
Registration = optimisation
[Figure: cost C as a function of translation parameters t_x and t_y]
Example
[Figure: fixed image and moving image]
Math
F(x) = fixed image, M(x) = moving image, x = voxel coordinate
Transformation function: T(x; p), with p = vector of transformation parameters
Cost function: C(p) measures the similarity of the fixed image F(x) and the deformed moving image M(T(x; p))
Find the p that minimises C
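A minimal sketch of this formulation, assuming a 1-D image and a pure translation T(x; p) = x + p (function and variable names are illustrative, not from the slides):

```python
def cost_ssd(F, M, p):
    """Mean of squared differences between fixed image F and
    moving image M shifted by translation parameter p (1-D toy case).
    Samples of M that fall outside the image are skipped."""
    total, n = 0.0, 0
    for x in range(len(F)):
        tx = x + p  # transformation T(x; p) = x + p
        if 0 <= tx < len(M):
            total += (F[x] - M[tx]) ** 2
            n += 1
    return total / n

# moving image is the fixed image shifted right by 2 voxels
F = [0, 0, 1, 4, 1, 0, 0, 0]
M = [0, 0, 0, 0, 1, 4, 1, 0]
# C(p) reaches its minimum (zero) at p = 2
assert cost_ssd(F, M, 2) == 0.0
```

In a real registration, p would be a vector (e.g. B-spline coefficients) and the sum would run over 3-D voxel coordinates, but the structure of C(p) is the same.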
Iterative optimisation
p_{k+1} = p_k + a_k d_k
d_k = search direction, a_k = step size
Gradient descent: d_k = -(∂C/∂p)(p_k) = -g_k
Gradient descent
p_{k+1} = p_k - a_k g_k, written out per component:
(p_1, p_2, p_3, ...)_{k+1} = (p_1, p_2, p_3, ...)_k - a_k (g_1, g_2, g_3, ...)_k,
where g_i = ∂C/∂p_i evaluated at p_k
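The update rule above can be sketched as a generic loop; here g_k is approximated by central finite differences of the cost (names and default values are illustrative):

```python
def gradient_descent(C, p0, step=0.1, iters=100, eps=1e-6):
    """p_{k+1} = p_k - a_k * g_k with a constant step size a_k = step.
    The gradient g_k is estimated by central finite differences of C."""
    p = list(p0)
    for _ in range(iters):
        g = []
        for i in range(len(p)):
            p_hi = p[:]; p_hi[i] += eps
            p_lo = p[:]; p_lo[i] -= eps
            g.append((C(p_hi) - C(p_lo)) / (2 * eps))
        p = [pi - step * gi for pi, gi in zip(p, g)]
    return p

# quadratic cost with minimum at (1, -2)
C = lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2
p_opt = gradient_descent(C, [0.0, 0.0])
```

In practice the derivative is computed analytically (next slide) rather than by finite differences, which would cost one cost-function evaluation per parameter per iteration.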
Cost function derivative
Example for the mean of squared differences:

C(p) = (1/N) Σ_x [ F(x) - M(T(x; p)) ]²

∂C/∂p = -(2/N) Σ_x [ F(x) - M(T(x; p)) ] · (∂M/∂x)(T(x; p)) · (∂T/∂p)(x; p)
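The chain-rule expression above, sketched for the 1-D translation toy case where ∂T/∂p = 1 and ∂M/∂x is approximated by central differences (all names are hypothetical):

```python
def ssd_and_gradient(F, M, p):
    """C(p) = (1/N) sum (F(x) - M(x+p))^2 and its derivative
    dC/dp = -(2/N) sum (F(x) - M(x+p)) * M'(x+p) * 1,
    using dT/dp = 1 for the translation T(x; p) = x + p."""
    c, g, n = 0.0, 0.0, 0
    for x in range(len(F)):
        tx = x + p
        if 1 <= tx < len(M) - 1:               # need neighbours for M'
            diff = F[x] - M[tx]
            dM = (M[tx + 1] - M[tx - 1]) / 2.0  # central difference
            c += diff ** 2
            g += -2.0 * diff * dM
            n += 1
    return c / n, g / n

F = [0, 1, 4, 1, 0, 0]
M = [0, 0, 1, 4, 1, 0]   # F shifted right by 1
# at the optimum p = 1 both the cost and the gradient vanish
```

Note that only the image derivative ∂M/∂x and the transformation Jacobian ∂T/∂p are needed, so the same scheme extends to any differentiable transformation model.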
Choice of d_k
p_{k+1} = p_k + a_k d_k
Choice of d_k
[Contour plots of C over (p_1, p_2): gradient descent; smarter steps; cheaper steps]
Choice of d_k: p_{k+1} = p_k + a_k d_k
gradient descent: d_k = -g_k
Newton: d_k = -[H_k]^{-1} g_k   (smarter steps)
quasi-Newton: d_k = -B_k g_k   (smarter steps)
conjugate gradient: d_k = -g_k + β_k d_{k-1}   (smarter steps)
stochastic gradient: d_k ≈ -g_k   (cheaper steps)
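A sketch of the conjugate-gradient variant from the list, using the Fletcher-Reeves choice of β_k (one common option; the slide does not fix a particular β, and real implementations combine this with a line search rather than a constant step):

```python
def conjugate_gradient_descent(grad, p0, step=0.1, iters=50):
    """d_k = -g_k + beta_k * d_{k-1}, with the Fletcher-Reeves
    coefficient beta_k = (g_k . g_k) / (g_{k-1} . g_{k-1})."""
    p = list(p0)
    g_prev, d = None, None
    for _ in range(iters):
        g = grad(p)
        if d is None:
            d = [-gi for gi in g]          # first step: steepest descent
        else:
            beta = (sum(gi * gi for gi in g)
                    / sum(gi * gi for gi in g_prev))
            d = [-gi + beta * di for gi, di in zip(g, d)]
        p = [pi + step * di for pi, di in zip(p, d)]
        g_prev = g
    return p

# gradient of C(p) = p1^2 + p2^2, minimum at the origin
grad = lambda p: [2 * p[0], 2 * p[1]]
p_opt = conjugate_gradient_descent(grad, [1.0, -2.0])
```

The β_k d_{k-1} term adds "memory" of the previous direction, which is what distinguishes conjugate gradient from plain gradient descent.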
Experimental comparison
Cardiac CT, 97×97×97 voxels, artificially deformed
Experimental comparison
Error measure: e = (1/N) Σ_x ‖ T(x) - T̂(x) ‖
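This error measure, sketched for a set of sample points (T̂ is the ground-truth transformation; all names are illustrative):

```python
import math

def registration_error(T, T_true, points):
    """e = (1/N) * sum_x || T(x) - T_true(x) ||: the mean Euclidean
    distance between estimated and ground-truth transformed points."""
    total = 0.0
    for x in points:
        a, b = T(x), T_true(x)
        total += math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return total / len(points)

# ground truth: translation by (2, 0); estimate: translation by (2, 1)
T_true = lambda x: (x[0] + 2, x[1])
T_est = lambda x: (x[0] + 2, x[1] + 1)
points = [(0, 0), (1, 5), (3, 2)]
# every point is off by exactly 1 mm in y, so e = 1.0
assert registration_error(T_est, T_true, points) == 1.0
```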
Experimental comparison 3 gradient descent quasi-newton conjugate gradient stochastic gradient e [mm] 2.5 2 1.5 1 0.5 0 0.001 0.01 0.1 1 10 100 1000 computation time
Choice of a_k
p_{k+1} = p_k + a_k d_k
Choice of a_k
[Contour plots of C over (p_1, p_2): too small steps; too large steps]
Choice of a_k: p_{k+1} = p_k + a_k d_k
constant: a_k = a
slowly decaying: a_k = f(k) = a / (A + k)
exact line search: a_k = argmin_a C(p_k + a d_k)
inexact line search: a_k ≈ argmin_a C(p_k + a d_k)   [Wolfe conditions]
adaptive: a_k = F(progress in previous iterations)
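The inexact line search can be sketched with a simple backtracking scheme that enforces the sufficient-decrease (Armijo) condition, which is the first of the two Wolfe conditions mentioned on the slide (parameter names and defaults are illustrative):

```python
def backtracking_step(C, p, d, g, a0=1.0, rho=0.5, c1=1e-4):
    """Shrink the trial step a until the sufficient-decrease condition
    C(p + a*d) <= C(p) + c1 * a * (g . d) holds.
    g is the gradient at p and d a descent direction (g . d < 0)."""
    a = a0
    gd = sum(gi * di for gi, di in zip(g, d))
    while C([pi + a * di for pi, di in zip(p, d)]) > C(p) + c1 * a * gd:
        a *= rho
    return a

C = lambda p: (p[0] - 1) ** 2   # 1-D quadratic, minimum at p = 1
p, g = [3.0], [4.0]             # g = dC/dp at p
d = [-4.0]                      # steepest descent direction
a = backtracking_step(C, p, d, g)
# a = 0.5, which here happens to land exactly on the minimum
```

A full Wolfe line search additionally checks a curvature condition on the new gradient; the backtracking version above is the cheaper variant most often used in practice.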
Stochastic gradient descent with adaptive strategy for a_k
p_{k+1} = p_k - f(t_k) g_k, with f(t_k) = a / (A + t_k)
t_{k+1} = t_k + sigmoid(-g_k^T g_{k-1})
[Plots: f(t) decaying from a at t = 0; sigmoid ranging from -1 to 1]
Choose a such that: max. voxel displacement per iteration < δ [mm] (with 95% probability)
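A sketch of one iteration of this adaptive scheme, assuming a sigmoid that ranges in (-1, 1) as in the slide's plot (the exact sigmoid shape and the constants a, A are user-chosen; names are illustrative):

```python
import math

def adaptive_sgd_step(p, g, g_prev, t, a=1.0, A=20.0):
    """One iteration of stochastic gradient descent with adaptive
    step sizes: p_{k+1} = p_k - f(t_k) g_k with f(t) = a / (A + t),
    and t_{k+1} = max(0, t_k + sigmoid(-g_k . g_{k-1})).
    Correlated consecutive gradients (g . g_prev > 0) decrease the
    "time" t and hence increase the step size; oscillating gradients
    increase t and shrink the step size."""
    sigmoid = lambda x: 2.0 / (1.0 + math.exp(-x)) - 1.0  # range (-1, 1)
    step = a / (A + t)
    p_new = [pi - step * gi for pi, gi in zip(p, g)]
    dot = sum(gi * hi for gi, hi in zip(g, g_prev))
    t_new = max(0.0, t + sigmoid(-dot))
    return p_new, t_new
```

The appeal of this rule is that it reacts to the optimisation itself: far from the optimum the stochastic gradients stay correlated and the steps grow, while near the optimum they start to oscillate and the steps automatically decay.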
Experimental comparison
6 prostate MR image pairs: nonrigid registration
Evaluation measure: overlap of manual segmentations after registration
Experimental comparison
[Plots: segmentation overlap (0.8-0.95) as a function of δ (0.03125-8.0 mm) and a (1.25-320), with A = 2000; non-adaptive vs. adaptive]
Experimental comparison
Experiments with: brain, lung, prostate; CT, MRI; sum of squared differences, mutual information, normalized mutual information; rigid, nonrigid; A = 20, δ = voxel size
Good results in all experiments!
Local similarity measures
MI = mutual information; assumes the grey-value distribution does not vary over the image domain
LMI = localised mutual information = (1/N) Σ_x MI(x)   (aka regional MI, conditional MI, spatial information encoded MI)
Can be efficiently implemented with stochastic gradient descent!
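A toy 1-D sketch of the LMI idea: estimate MI from a joint histogram, but on local patches rather than the whole image, then average (histogram-based MI on discrete values; window size and names are illustrative):

```python
import math
from collections import Counter

def mutual_information(a, b):
    """MI of two equally long discrete-valued sequences,
    estimated from their joint histogram."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    mi = 0.0
    for (va, vb), c in joint.items():
        # p_ab * log( p_ab / (p_a * p_b) ), rewritten with counts
        mi += (c / n) * math.log(c * n / (pa[va] * pb[vb]))
    return mi

def localised_mi(F, M, window=4):
    """LMI = (1/N) * sum over windows x of MI(x): mutual information
    computed on local patches instead of the whole image (1-D sketch)."""
    patches = [(F[i:i + window], M[i:i + window])
               for i in range(0, len(F) - window + 1, window)]
    return sum(mutual_information(f, m) for f, m in patches) / len(patches)
```

Real implementations use smoothed (e.g. Parzen-window) histograms and overlapping regions, but the structure, a per-region MI averaged over the image, is the same.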
Summary
Parametric formulation can be solved by continuous optimisation
Derivative-based methods: require ∂C/∂p; extensive literature
Basic method: gradient descent
Popular choice: quasi-Newton or conjugate gradient, in combination with inexact line search
Recommended: stochastic gradient descent with adaptive step sizes
Literature
Nocedal & Wright, Numerical Optimization
Klein, Staring, Pluim, "Evaluation of optimization methods for nonrigid medical image registration using mutual information and B-splines", IEEE Trans. Image Processing, 2007
Klein, Pluim, Staring, Viergever, "Adaptive stochastic gradient descent optimisation for image registration", Int. J. Computer Vision, 2009
Thévenaz, Unser, "Optimization of mutual information for multiresolution image registration", IEEE Trans. Image Processing, 2000
elastix
Rigid and nonrigid registration; various cost functions, transformation models, multiresolution strategies, etc.
Many optimisation algorithms implemented
Free: http://elastix.isi.uu.nl
Based on Insight ToolKit (ITK): http://www.itk.org
Klein, Staring, Murphy, Viergever, Pluim, "elastix: a toolbox for intensity-based medical image registration", IEEE Trans. Medical Imaging, 2010