Compressive Sensing Algorithms for Fast and Accurate Imaging Wotao Yin, Department of Computational and Applied Mathematics, Rice University SCIMM 10, ASU, Tempe, AZ
Acknowledgements: results come in part from joint work with 1. Wolfgang Stefan, Yilun Wang, Yin Zhang (Rice) 2. Junfeng Yang (Nanjing, China) 3. Weihong Guo (Case Western) Work supported by the NSF, ONR, and Sloan Foundation
Compressive Sensing Pioneered by Candes-Romberg-Tao 05, Donoho 05 Goal: acquire a signal fast and robustly CS requires: 1. Signal has a sparse structure: u = Ψx with x sparse/compressible, or jointly sparse (for multiple signals) 2. Encoding: linear, incoherent with Ψ, physically implementable 3. Decoding: takes advantage of sparsity, fast, robust to noise
Decoding Algorithmic Challenges: 1. Large, dense data (sensing matrices and measurements) 2. Standard methods (simplex, interior-point) not suitable 3. Some applications need real-time reconstruction Opportunities: 1. Fast ways of computing Ax and A^T x 2. Structured random matrices help 3. Solution structure helps
Model u: the signal; A: the sensing matrix; f: the measurements. For this talk, we consider the model min_u α TV_v(u) + β ‖Ψu‖_{w,1} + (1/2) ‖Au − f‖²_M and its extensions. TV_v(u) = Σ_pixels v_i ‖(Du)_i‖, an extension of total variation (Rudin-Osher-Fatemi 92); captures sharp edges; useful in image reconstruction. A fast splitting algorithm, various applications; integration with support/edge detection, Bregman regularization; GPU implementation.
Variable Splitting Consider min f(Lx) + g(x), with L a linear operator. Introduce y = Lx; obtain min { f(y) + g(x) : Lx − y = 0 }. Apply the augmented Lagrangian: L(x, y; λ) = f(y) + ⟨λ, Lx − y⟩ + (c/2) ‖Lx − y‖²_2 + g(x) 1. (x^{k+1}, y^{k+1}) ← argmin_{x,y} L(x, y) 2. Update λ, e.g., λ^{k+1} ← λ^k + c (Lx^{k+1} − y^{k+1}) Instead, the alternating direction method (ADM) does 1. x^{k+1} ← argmin_x L(x, y^k) 2. y^{k+1} ← argmin_y L(x^{k+1}, y) 3. λ^{k+1} ← λ^k + γ c (Lx^{k+1} − y^{k+1}) ADM goes back to Douglas-Rachford in the 1950s and to various methods for PDEs and monotone operator equations.
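The three ADM steps can be sketched on a toy instance: 1D total-variation denoising, with f(y) = α‖y‖_1, g(x) = (1/2)‖x − f‖², and L = D a finite-difference operator. A minimal NumPy sketch (function name and parameter choices are illustrative, not the talk's code):

```python
import numpy as np

def admm_tv_denoise(f, alpha=1.0, c=1.0, iters=200):
    """ADM for min_x alpha*||Dx||_1 + 0.5*||x - f||^2, split via y = Dx."""
    n = len(f)
    D = np.diff(np.eye(n), axis=0)        # forward differences, (n-1) x n
    x, y = f.copy(), D @ f
    lam = np.zeros(n - 1)
    M = np.eye(n) + c * D.T @ D           # x-subproblem: fixed SPD system
    for _ in range(iters):
        # 1. x-step: least squares, (I + c D'D) x = f + D'(c y - lam)
        x = np.linalg.solve(M, f + D.T @ (c * y - lam))
        # 2. y-step: component-wise soft shrinkage of Dx + lam/c
        t = D @ x + lam / c
        y = np.sign(t) * np.maximum(np.abs(t) - alpha / c, 0.0)
        # 3. multiplier update
        lam = lam + c * (D @ x - y)
    return x
```

For a noisy piecewise-constant signal, the output has far smaller total variation than the input while staying close to it.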
Split the Right Way! Let us simplify: keep TV and ignore ‖Ψu‖_{w,1}. Consider min α TV(u) + (1/2)‖Au − f‖²_2. Alternate x/y-directions? Doesn't make it simpler. Not right. Alternate TV(u) and (1/2)‖Au − f‖²_2? One subproblem is ROF: min TV(u) + (c/2)‖u − v‖²_2. The other is least-squares. Not bad, but solving multiple ROFs is not fast.
Split the Right Way Let us simplify: keep TV and ignore ‖Ψu‖_{w,1}. Consider min α TV(u) + (1/2)‖Au − f‖²_2. TV(·) = Σ_i ‖(·)_i‖ composed with (D·). Split them. Keep (D·) with (1/2)‖A· − f‖²_2. One subproblem is shrinkage. The other is least-squares, with closed-form solutions for many A. The right way for denoising, deblurring, color, some CS problems, etc. FTVd (Wang-Yang-Y-Zhang 07), 1 year before split Bregman.
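The shrinkage subproblem has the closed form min_y τ‖y‖ + (1/2)‖y − t‖², solved pixel-wise by y = max(‖t‖ − τ, 0)·t/‖t‖. A minimal isotropic (two-component) sketch, with illustrative names:

```python
import numpy as np

def iso_shrink(t1, t2, tau):
    """Pixel-wise isotropic shrinkage: solves min_y tau*||y|| + 0.5*||y - t||^2
    for t = (t1, t2), e.g. the horizontal/vertical difference components."""
    norm = np.maximum(np.sqrt(t1**2 + t2**2), 1e-12)  # guard against 0/0
    mag = np.maximum(norm - tau, 0.0)
    return mag * t1 / norm, mag * t2 / norm
```

It is component-wise, so the whole y-subproblem costs one pass over the pixels.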
Variable Splitting Split Bregman (Goldstein & Osher) applies ADM to min α Σ_i v_i ‖(Du)_i‖ + β ‖Ψu‖_{w,1} + (1/2)‖Au − f‖²_M. Introduce y = Du and z = Ψu; obtain min_{u,y,z} { α Σ_i v_i ‖y_i‖ + β ‖z‖_{w,1} + (1/2)‖Au − f‖²_M : y = Du, z = Ψu } Augmented Lagrangian: min_{u,y,z} α Σ_i v_i ‖y_i‖ + (c_1/2)‖y − Du − λ_1‖²_2 + β ‖z‖_{w,1} + (c_2/2)‖z − Ψu − λ_2‖²_2 + (1/2)‖Au − f‖²_M Apply ADM: minimizing over y and z is shrinkage, closed-form, component-wise; minimizing over u can be easy too; λ_i^{k+1} ← λ_i^k + γ(·), i = 1, 2. Global convergence for γ ∈ (0, (√5+1)/2) by He et al.
FTVd: Denoising+Deblurring A is a known convolution. Minimizing over u solves (c_1 D^T D + c_2 I + A^T A) u = r: one FFT and one IFFT. Test 1: 512×512 Lena; Test 2: 1024×1024 Man. Timings from 2 years ago, on an old PC.
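When A and D are periodic convolutions, the u-subproblem above is diagonalized by the 2-D FFT, which is why one FFT and one IFFT suffice. A minimal sketch (variable names are assumptions, not the FTVd code; kernel_hat, dx_hat, dy_hat are the 2-D FFTs of the blur kernel and the two difference stencils):

```python
import numpy as np

def ftvd_u_step(r, kernel_hat, dx_hat, dy_hat, c1, c2):
    """Solve (c1*D'D + c2*I + A'A) u = r in the Fourier domain, assuming
    A and D are periodic convolutions: the system is diagonal after fft2."""
    denom = (c1 * (np.abs(dx_hat)**2 + np.abs(dy_hat)**2)
             + c2 + np.abs(kernel_hat)**2)
    return np.real(np.fft.ifft2(np.fft.fft2(r) / denom))
```

The per-iteration cost is dominated by the two transforms, independent of the kernel size.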
FTVd: Color Deblurring and Impulsive Noise Uses 3-channel TV and the L1 fidelity ‖Au − f‖_1; needs another splitting. Main computation is FFTs and IFFTs.
RecPF: CS from Partial Fourier A is a subsampled Fourier matrix; pioneering work by Lustig-Donoho-Pauly 08. Works with any sampling pattern (radial lines, spiral, random, etc.). Main computation is still FFTs and IFFTs. 250×250 images, 17% samples, 0.2 s CPU time.
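A subsampled Fourier operator and its adjoint reduce to a masked FFT, which is why any sampling pattern works. A minimal sketch with a boolean mask over the 2-D frequency grid (names illustrative):

```python
import numpy as np

def partial_fourier(u, mask):
    """A u: keep the 2-D Fourier coefficients where mask is True
    (radial lines, spiral, random -- any pattern)."""
    return np.fft.fft2(u, norm="ortho")[mask]

def partial_fourier_adj(f, mask):
    """A^H f: zero-fill the unsampled frequencies and invert."""
    F = np.zeros(mask.shape, dtype=complex)
    F[mask] = f
    return np.fft.ifft2(F, norm="ortho")
```

With the "ortho" normalization the forward map and its adjoint satisfy ⟨Au, f⟩ = ⟨u, A^H f⟩ exactly.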
Self-Feeding Sparse SENSE Self-Feeding Sparse SENSE (Huang-Chen-Y-Lin-Ye-Guo-Reykowski 09): min α TV_g(u) + β ‖Ψu‖_1 + Σ_{j=1}^{#CH} ‖A(S_j u) − f_j‖²_2 (PPI, g-factor, sensitivity maps). An 8-channel head coil, 5x acceleration, low RMSE. Better RMSE than SENSE, GRAPPA, and CG-CS SENSE.
RecPC: CS from Partial Circulant/Toeplitz Toeplitz and circulant matrices:
T = [ t_n t_{n−1} … t_1 ; t_{n+1} t_n … t_2 ; … ; t_{2n−1} t_{2n−2} … t_n ] and C = [ t_n t_{n−1} … t_1 ; t_1 t_n … t_2 ; … ; t_{n−1} t_{n−2} … t_n ]
A is a subsampled circulant matrix (Tropp et al 06, Baraniuk et al 07, Marcia et al 08, Haupt 08, Romberg 09, Ying's group 09). Important: C = F*DF, where D is diagonal and F is the DFT. Sensing matrix: A = P(F*DF).
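The factorization C = F*DF means applying C is just an FFT, a diagonal scaling, and an inverse FFT; subsampling with P then picks a few entries. A minimal sketch of the sensing operator (names illustrative):

```python
import numpy as np

def partial_circulant(x, d, idx):
    """A x with A = P (F* D F): FFT, scale by the diagonal d,
    inverse FFT, then keep only the entries indexed by idx."""
    return np.fft.ifft(d * np.fft.fft(x))[idx]
```

Choosing d = fft(c) makes the full (unsampled) operator the circulant matrix whose first column is c, i.e., circular convolution with c.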
Random Circulant vs I.I.D. Random A random circulant matrix has fewer degrees of freedom and is less random. However, for most signals, the two make no difference for CS.
RecPC: CS from Partial Circulant/Toeplitz For images, partial circulant sampling seems to work better than partial Fourier. (Figure panels: Radial Fourier, Random Fourier, Random Circulant.)
Other applications of operator splitting Frame-based image processing CS reconstruction for permuted Walsh measurements Nonparametric statistics: copula estimation Matrix completion, rank minimization Semidefinite programming Next, other exciting techniques...
EdgeCS: Edge Guided CS Reconstruction Guo-Y 10; runs 2-5 iterations: 1. RecPF with weighted TV 2. Subpixel edge detection (partial edges OK) 3. Update the TV weights Has theoretical justification. From noiseless samples:
From noisy samples:
Bregman Iterative Regularization: noisy CS reconstruction min µ ‖u‖_1 + (1/2)‖Au − b‖²_2 (Plots: true signal vs. BPDN recovery.) µ = 48.5: not sparse. µ = 49: poor fitting. Bregman, µ = 150: u^k ← argmin µ ‖u‖_1 + (1/2)‖Au − b^k‖²_2, b^{k+1} ← b^k + (b − Au^k) (Plot: true signal vs. Bregman recovery, iteration 5.)
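The Bregman loop above only needs some inner solver for the µ‖u‖_1 + (1/2)‖Au − b^k‖² subproblem; a minimal sketch using plain ISTA as a stand-in inner solver (an illustrative choice, not the solver used in the talk):

```python
import numpy as np

def ista(A, b, mu, iters=500):
    """Plain ISTA for min mu*||u||_1 + 0.5*||Au - b||^2 (stand-in inner solver)."""
    L = np.linalg.norm(A, 2)**2          # Lipschitz constant of the gradient
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        g = u - A.T @ (A @ u - b) / L    # gradient step on the quadratic
        u = np.sign(g) * np.maximum(np.abs(g) - mu / L, 0.0)
    return u

def bregman(A, b, mu, outer=5):
    """Bregman iterative regularization: re-solve with the residual added back."""
    bk = b.copy()
    u = np.zeros(A.shape[1])
    for _ in range(outer):
        u = ista(A, bk, mu)
        bk = bk + (b - A @ u)            # add-back step from the slide
    return u
```

On the easiest case A = I, a single solve gives the biased soft-thresholded answer, while a few add-back iterations recover b exactly, illustrating how Bregman removes the shrinkage bias.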
ADM+Bregman Applied to l_p Minimization Chartrand 09: min { Σ_i ‖(Du)_i‖^p : Au = b } Iterations: 1. for Bregman iterations 2. for ADM iterations 3. … 4. Bregman update (add back) Phantom: 256×256, noiseless, 10 radial lines, 3.8% samples. Uterus: 1024×1024, noisy, random k-space, 15% samples.
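The inner ADM step for an l_p model replaces soft thresholding with a generalized shrinkage. One commonly used p-shrinkage rule is sketched below (an illustrative choice, not necessarily the exact operator in Chartrand 09); for p = 1 it reduces to ordinary soft thresholding:

```python
import numpy as np

def p_shrink(t, lam, p):
    """Generalized (p-)shrinkage: max(|t| - lam^(2-p)*|t|^(p-1), 0)*sign(t).
    p = 1 gives the usual soft threshold; p < 1 shrinks large entries less."""
    a = np.abs(t)
    safe = np.maximum(a, 1e-12)          # guard 0^(p-1) for p < 1
    mag = np.maximum(a - lam**(2 - p) * safe**(p - 1), 0.0)
    return mag * np.sign(t)
```

For p < 1 the penalty on large coefficients is weaker, which is the usual motivation for nonconvex l_p models in CS.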
Edge Detection from Incomplete Fourier Samples Stefan, Viswanathan, Gelb, Renaut: uses l_1 and matching waveforms; no Gibbs edges. Stefan has a preliminary GPU implementation; see his poster. In MATLAB, on a Xeon CPU: 2.5 minutes. On a Tesla C1060 GPU: 15 seconds (10x speedup).
Papers, Software, and Demos available at http://www.caam.rice.edu/~optimization/l1