Introduction to Topics in Machine Learning


Introduction to Topics in Machine Learning
Namrata Vaswani
Department of Electrical and Computer Engineering, Iowa State University

Compressed Sensing / Sparse Recovery: given y := Ax, recover x from y when y is shorter than x. Use the sparsity of x.

Low-rank Matrix Completion: given a subset of the entries of a low-rank matrix M, i.e. y = P_Ω(M), find M. Ω: set of indices of the observed entries.

Matrix Sensing: given a set of n linear functions of M, i.e. y = A(M) where A(.) is a linear operator, find M using the fact that M is low-rank. This can be written as y_i = <A_i, M>, where <A, B> := trace(A'B) is the usual inner product.

Robust PCA: given Y := X + L, find X and L. L = unknown low-rank matrix; X = sparse matrix (corresponds to outliers).

Phase retrieval: compute the vector x from y := |Ax|^2. Here |.| means the element-wise magnitude of the vector. More specifically, y_i = |A_i x|^2 (here A_i is the i-th row of A).

The term "phase retrieval" comes from Fourier imaging, where A is the DFT matrix; it is now used more generally for any matrix A.
Related problem: ranking and individualized ranking estimation.

Applications
- CS: projection imaging - MRI, CT, single-pixel camera, radar, ...
- MC: recommendation system design, e.g., the Netflix problem
- Matrix sensing: one special case is phase retrieval. Notice that we can rewrite y_i = A_i x x' A_i' = <xx', A_i' A_i>
- RPCA: recommendation system design in the presence of outliers; video analytics; survey data analysis
- Phase retrieval: astronomy, X-ray crystallography, ...
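The lifting identity above (rewriting the phaseless measurement y_i as a linear function of xx') can be checked numerically. A minimal sketch; the sizes and seed are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
x = rng.standard_normal(n)
A = rng.standard_normal((3, n))

y = np.abs(A @ x) ** 2                  # phaseless measurements y_i = |A_i x|^2
X_lift = np.outer(x, x)                 # rank-1 lift xx'
# y_i = <xx', A_i' A_i> with <A, B> = trace(A'B)
y_lift = np.array([np.trace(np.outer(Ai, Ai) @ X_lift) for Ai in A])
```

The check `np.allclose(y, y_lift)` passes: the quadratic measurements are linear in the lifted matrix xx'.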

Non-convex Problems: Alternating Minimization and Gradient Descent I

Alt-Min goal: compute min_{x,y} f(x, y) when f(.) is non-convex.
Clearly min_{x,y} f(x, y) = min_x (min_y f(x, y)), but of course in most cases the RHS is also hard to compute.
Consider the class of problems where the min is easy when one variable is fixed, i.e. min_y f(x_0, y) is easy for a given x_0 and min_x f(x, y_0) is easy for a given y_0.
A common solution, Alt-Min:
- Start with an initial guess x_0.
- Compute y_1 = arg min_y f(x_0, y).
- Compute x_1 = arg min_x f(x, y_1).
- Repeat the above until a stopping criterion is met.
Guarantees? Until very recently, none. Recent work:
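The steps above can be sketched on a toy problem where both inner minimizations are closed-form: rank-1 factorization, f(x, y) = ||M - x y'||_F^2. This is an illustrative sketch (the problem, sizes, and seed are my own choices, not from the slides); fixing one variable, the other is a least-squares solve:

```python
import numpy as np

rng = np.random.default_rng(1)
x_true = rng.standard_normal(8)
y_true = rng.standard_normal(6)
M = np.outer(x_true, y_true)            # exactly rank-1

x = rng.standard_normal(8)              # initial guess x_0
for _ in range(50):
    y = M.T @ x / (x @ x)               # arg min_y f(x, y): least squares
    x = M @ y / (y @ y)                 # arg min_x f(x, y): least squares

rel_err = np.linalg.norm(M - np.outer(x, y)) / np.linalg.norm(M)
```

On this easy rank-1 instance Alt-Min converges (up to the scale ambiguity between x and y) from a random start; in general, as the next slide notes, a careful initialization is what makes guarantees possible.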

Non-convex Problems: Alternating Minimization and Gradient Descent II

If initialized carefully, Alt-Min gets to within a small error of the true solution in a finite number of iterations; it is possible to bound this number as well.
A common approach to initialization: the spectral method - compute the top eigenvector of an appropriately defined matrix.
Guarantees exist for Matrix Completion and for Phase Retrieval.
Gradient descent based approaches for non-convex problems: with a suitable initialization, it is possible to get a guarantee.
Truncated gradient descent idea of the truncated Wirtinger flow paper: the gradient turns out to be a weighted average of certain vectors; discard the terms whose weights are too large and compute a truncated gradient estimate.

Old slides on sparse recovery

The sparse recovery / compressed sensing problem

Given y := Ax where A is a fat matrix, find x. The system is underdetermined: without any other info, it has infinitely many solutions.
Key applications where this occurs: Computed Tomography (CT) and MRI.
- CT: acquire the Radon transform of the cross-section of interest. Typical setup: obtain line integrals of the cross-section along a set of parallel lines at a given angle, repeated for a number of angles from 0 to π. Common setup: 22 angles, 256 parallel lines per angle.
- By the Fourier slice theorem, the Radon transform can be used to compute the DFT along radial lines in the 2D-DFT plane.
- Projection MRI is similar: directly acquire DFT samples along radial lines.
- Parallel lines is the most common CT geometry; other geometries are also used.
Given 22x256 data points of the 2D-DFT of the image, we need to compute the 256x256 image.

Limitation of zero-filling

A traditional solution: zero filling + inverse DFT - set the unknown DFT coefficients to zero, take the inverse DFT. Not good: leads to spatial aliasing.
Zero-filling is the minimum energy (2-norm) solution, i.e. it solves min_x ||x||_2 s.t. y = Ax.
Reason: clearly, the minimum energy solution in the DFT domain is to set all unknown coefficients to zero, i.e. zero-fill. Since (energy in signal) = (energy in DFT) * 2π (Parseval), the min energy solution in the DFT domain is also the min energy solution for the signal.
The min energy solution will not be sparse, because the 2-norm is not sparsity promoting.
In fact it will not be sparse in any other orthonormal basis either, because ||x||_2 = ||Φx||_2 for any orthonormal Φ. Thus the min energy solution is also the min energy solution in the Φ basis, and hence is not sparse in the Φ basis either.
But most natural images, including medical images, are approximately sparse (or are sparse in some basis).

Sparsity in natural signals/images

Most natural images, including medical images, are approximately sparse (or are sparse in some basis), e.g.:
- angiograms are sparse
- brain images are well approximated by piecewise constant functions (the gradient is sparse): sparse in the TV sense
- brain, cardiac, and larynx images are approximately piecewise smooth: wavelet sparse
Sparsity is what lossy data compression relies on: JPEG-2000 uses wavelet sparsity, JPEG uses DCT sparsity. But there we first acquire all the data, then compress (throw away data).
In MRI or CT, we are just acquiring less data to begin with - can we still achieve exact/accurate reconstruction?

Use sparsity as a regularizer

The min energy solution, min_x ||x||_2 s.t. y = Ax, is not sparse, but it is easy to compute: x̂ = A'(AA')^{-1} y.
Can we instead find the minimum sparsity solution, i.e. solve min_x ||x||_0 s.t. y = Ax?
Claim: if the true signal x_0 is exactly S-sparse, this has a unique solution that is EXACTLY equal to x_0 if spark(A) > 2S.
- spark(A) = smallest number of columns of A that are linearly dependent; in other words, any set of (spark(A) - 1) columns is always linearly independent. (Proof in class.)
Even when x is only approximately sparse, this gives a good solution.
But finding the solution requires a combinatorial search: O(Σ_{k=1}^{S} C(m, k)) = O(m^S).
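Both solutions above are easy to compare numerically. A small sketch (sizes, support, and seed are made up): the min-energy solution A'(AA')^{-1}y is dense, while the brute-force l0 search over size-S supports recovers the true x, since a generic Gaussian A has spark(A) = m + 1 > 2S here:

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
m, n, S = 8, 16, 2
A = rng.standard_normal((m, n))         # generic A: spark = m + 1 = 9 > 2S
x_true = np.zeros(n)
x_true[[2, 11]] = [1.0, -2.0]
y = A @ x_true

# minimum-energy solution: dense in general
x_min_energy = A.T @ np.linalg.solve(A @ A.T, y)

# combinatorial l0 search: try every size-S support
x_l0 = None
for T in itertools.combinations(range(n), S):
    idx = list(T)
    c, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
    if np.linalg.norm(A[:, idx] @ c - y) < 1e-8:
        x_l0 = np.zeros(n)
        x_l0[idx] = c
        break
```

The loop over C(16, 2) = 120 supports is trivial here, but the C(m, S) growth is exactly the combinatorial cost the slide warns about.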

Practical solutions [Chen, Donoho 95] [Mallat, Zhang 93]

Basis Pursuit: replace the l0 norm by the l1 norm, the closest norm to l0 that is convex:
  min_x ||x||_1 s.t. y = Ax
Greedy algorithms: Matching Pursuit, Orthogonal MP.
Key idea: all these methods "work" if the columns of A are sufficiently incoherent.
"Work": they give exact reconstruction for exactly sparse signals and zero noise, and small-error reconstruction for approximately sparse (compressible) signals or noisy measurements.
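Basis Pursuit above can be posed as a linear program via the standard split x = u - v with u, v >= 0, so that ||x||_1 = Σ(u + v) at the optimum. A rough sketch (matrix sizes, support, and seed are made up) using scipy's linprog:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
m, n = 20, 40
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[[4, 17, 25]] = [1.5, -0.7, 2.0]
y = A @ x_true

# min 1'(u + v)  s.t.  [A, -A][u; v] = y,  u, v >= 0
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_bp = res.x[:n] - res.x[n:]
```

With a random Gaussian A and only 3 nonzeros in 40 unknowns from 20 measurements, the LP recovers x_true exactly (up to solver tolerance), illustrating the "exact reconstruction with incoherent columns" claim.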

Compressive Sensing

The name: instead of capturing the entire signal/image and then compressing, can we just acquire less data, i.e. can we compressively sense?
- MRI (or CT): data is acquired one line of Fourier projections at a time (or Radon transform samples at one angle at a time). Needing less data means faster scan time.
- New technologies that use the CS idea: single-pixel camera; A-to-D converters that take random samples in time (works when the signal is Fourier sparse); imaging by random convolution; decoding sparse channel transmission errors.
Main contribution of CS: theoretical results.

General form of Compressive Sensing

Assume that an N-length signal z is S-sparse in the basis Φ, i.e. z = Φx with x S-sparse. We sense y := Ψz = (ΨΦ)x =: Ax.
It is assumed that Ψ is incoherent w.r.t. Φ, or that A := ΨΦ is incoherent.
Find x, and hence z = Φx, by solving min_x ||x||_1 s.t. y = Ax.
A random Gaussian matrix Ψ is incoherent w.h.p. for S-sparse signals if it contains O(S log N) rows. It is also incoherent w.r.t. any orthonormal basis Φ w.h.p.: if Ψ is random Gaussian, then ΨΦ is also random Gaussian (for any orthonormal Φ). The same property holds for random Bernoulli matrices.

Quantifying incoherence

Rows of A need to be dense, i.e. they need to compute a global transform of x.
- Mutual coherence parameter: µ := max_{i ≠ j} |A_i' A_j| / (||A_i||_2 ||A_j||_2), over columns A_i of A.
- spark(A) = smallest number of columns of A that are linearly dependent; equivalently, any set of (spark(A) - 1) columns of A is always linearly independent.
- RIP, ROP; many newer approaches...
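The mutual coherence parameter above is cheap to compute directly. A small sketch; the identity-plus-Hadamard dictionary used as an example is my own choice (two maximally incoherent orthonormal bases in dimension 4, where µ = 1/√4 = 0.5):

```python
import numpy as np

def mutual_coherence(A):
    # normalize columns, then take the largest off-diagonal entry
    # of the Gram matrix in absolute value
    An = A / np.linalg.norm(A, axis=0)
    G = np.abs(An.T @ An)
    np.fill_diagonal(G, 0.0)
    return float(G.max())

# identity + normalized 4x4 Hadamard: mu = 1/sqrt(4) = 0.5
H = np.array([[1, 1, 1, 1],
              [1, -1, 1, -1],
              [1, 1, -1, -1],
              [1, -1, -1, 1]], dtype=float)
mu = mutual_coherence(np.hstack([np.eye(4), H / 2.0]))
```

For a single orthonormal basis µ = 0; the smaller µ is for a dictionary, the better the incoherence-based recovery guarantees.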

Quantifying incoherence: RIP

A K x N matrix A satisfies the S-Restricted Isometry Property if the constant δ_S defined below is small (δ_S < 1). Let A_T, T ⊂ {1, 2, ..., N}, be the sub-matrix obtained by extracting the columns of A corresponding to the indices in T. Then δ_S is the smallest real number s.t.
  (1 - δ_S) ||c||_2^2 <= ||A_T c||_2^2 <= (1 + δ_S) ||c||_2^2
for all subsets T ⊂ {1, 2, ..., N} of size |T| <= S and for all c ∈ R^|T|.
In other words:
- every set of S or fewer columns of A has squared singular values between 1 - δ_S and 1 + δ_S
- every set of S or fewer columns of A is approximately orthogonal
- A acts as an approximate isometry on any S-sparse vector c.
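The definition above can be computed by brute force for tiny matrices (the enumeration over supports is exponential, so this is only a sanity-check sketch; the sizes and seed are made up):

```python
import itertools
import numpy as np

def rip_constant(A, S):
    # enumerate every support of size <= S and track how far the
    # squared singular values of A_T stray from 1
    N = A.shape[1]
    delta = 0.0
    for k in range(1, S + 1):
        for T in itertools.combinations(range(N), k):
            s = np.linalg.svd(A[:, list(T)], compute_uv=False)
            delta = max(delta, s.max() ** 2 - 1.0, 1.0 - s.min() ** 2)
    return delta

d_eye = rip_constant(np.eye(5), 2)      # orthonormal columns: delta = 0

rng = np.random.default_rng(5)
A = rng.standard_normal((30, 10)) / np.sqrt(30)  # roughly unit-norm columns
d1 = rip_constant(A, 1)
d2 = rip_constant(A, 2)                 # delta_S is non-decreasing in S
```

The monotonicity d1 <= d2 falls out of the definition (size-1 supports are a subset of the size-<=2 enumeration), matching the later remark that δ_S is non-decreasing in S.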

Examples of RIP

If A is a random Gaussian, random Bernoulli, or partial Fourier matrix with about O(S log N) rows, it satisfies RIP(S) w.h.p.
Partial Fourier * Wavelet: somewhat incoherent.

Use for spectral estimation and comparison with MUSIC

Given a periodic signal with period N that is a sparse sum of S sinusoids, i.e. x[n] = Σ_k X[k] e^{j2πkn/N}, where the DFT vector X is 2S-sparse.
In other words, x[n] does not contain sinusoids at arbitrary frequencies (as allowed by MUSIC), but only harmonics of 2π/N, and the fundamental period N is known.
In matrix form, x = F* X, where F is the (unitary) DFT matrix, so that F^{-1} = F*.

Suppose we only receive samples of x[n] at random times, i.e. we receive y = Hx, where H is an undersampling matrix (exactly one 1 in each row and at most one 1 in each column).
With random time samples it is not possible to compute the covariance of the vector [x[n], x[n-1], ..., x[n-M]], so we cannot use MUSIC or the other standard spectral estimation methods. But we can use CS.
We are given y = HF* X, and we know X is sparse. Also, A := HF* is the conjugate of a partial Fourier matrix and thus satisfies RIP w.h.p.
If we have O(S log N) random samples, we can find X exactly by solving
  min_X ||X||_1 s.t. y = HF* X

Quantifying incoherence: ROP

θ_{S1,S2} measures the angle between the subspaces spanned by A_{T1} and A_{T2}, for disjoint sets T1, T2 of sizes at most S1, S2 respectively.
θ_{S1,S2} is the smallest real number such that
  |c1' A_{T1}' A_{T2} c2| <= θ_{S1,S2} ||c1|| ||c2||
for all c1, c2 and all disjoint sets T1, T2 with |T1| <= S1, |T2| <= S2. In other words,
  θ_{S1,S2} = max_{T1,T2: |T1| <= S1, |T2| <= S2} max_{c1,c2} |c1' A_{T1}' A_{T2} c2| / (||c1|| ||c2||)
One can show that δ_S is non-decreasing in S and θ_{S1,S2} is non-decreasing in S1, S2.
Also θ_{S1,S2} <= δ_{S1+S2}, and ||A_{T1}' A_{T2}|| <= θ_{|T1|,|T2|}.

Theoretical Results

If x is S-sparse, y = Ax, and δ_S + θ_{S,2S} < 1, then basis pursuit exactly recovers x.
If x is S-sparse, y = Ax + w with ||w||_2 <= ε, and δ_2S < √2 - 1, then the solution x̂ of basis-pursuit-noisy,
  min_x ||x||_1 s.t. ||y - Ax||_2 <= ε,
satisfies ||x - x̂|| <= C_1(δ_2S) ε.

MP and OMP
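A minimal Orthogonal Matching Pursuit sketch (the standard greedy scheme: pick the column most correlated with the residual, then re-fit by least squares on the current support; the test instance, sizes, and seed are made up):

```python
import numpy as np

def omp(A, y, S):
    # greedy: grow the support one column at a time, re-solving a
    # least-squares problem on the current support at each step
    n = A.shape[1]
    support = []
    x = np.zeros(n)
    resid = y.copy()
    for _ in range(S):
        j = int(np.argmax(np.abs(A.T @ resid)))   # best-matching column
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        resid = y - A @ x                          # orthogonal residual
    return x

rng = np.random.default_rng(6)
m, n = 40, 60
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)          # unit-norm columns
x_true = np.zeros(n)
x_true[[5, 22, 40]] = [2.0, -1.5, 1.0]
y = A @ x_true
x_hat = omp(A, y, 3)
```

Plain Matching Pursuit differs only in skipping the least-squares re-fit (it updates one coefficient per iteration), which is why OMP typically needs fewer iterations.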

Applications: DSP

- Fourier sparse signals: randomly sample in time, or use a random demodulator + integrator + uniform low-rate sampling (A-to-D).
- N-length signal that is sparse in any given basis Φ: circularly convolve with an N-tap all-pass filter with random phase, then randomly sample in time or use the random demodulator architecture.

Compressibility: one definition

Papers to Read

- Decoding by Linear Programming (CS without noise, sparse signals)
- Dantzig Selector (CS with noise)
- Near Optimal Signal Recovery (CS for compressible signals)
Applications of interest for DSP:
- Beyond Nyquist: ..., Tropp et al.
- Sparse MRI: ..., Lustig et al.
- Single-pixel camera: Rice, Baraniuk's group
- Compressive sampling by random convolution: Romberg

Sparse Reconstruction with Partial Support Knowledge

- Modified-CS (our group's work)
- Weighted l1, von Borries et al.

Treating Outliers as Sparse Vectors

- Dense Error Correction via l1 minimization
- Robust PCA
- Recursive Robust PCA (our group's work)