The convex geometry of inverse problems


The convex geometry of inverse problems. Benjamin Recht, Department of Computer Sciences, University of Wisconsin-Madison. Joint work with Venkat Chandrasekaran, Pablo Parrilo, and Alan Willsky.

Linear Inverse Problems. Find a solution of $y = \Phi x$, where $\Phi$ is $m \times n$ with $m < n$. Of the infinite collection of solutions, which one should we pick? Leverage structure: sparsity, rank, smoothness, symmetry. How do we design algorithms to solve underdetermined systems with priors?

Sparsity. Atoms: the 1-sparse vectors of Euclidean norm 1. Their convex hull is the unit ball of the $\ell_1$ norm.

minimize $\|x\|_1$ subject to $\Phi x = y$. Compressed Sensing: Candès, Romberg, Tao, Donoho, Tanner, etc.
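To make the $\ell_1$ heuristic concrete, here is a minimal sketch of basis pursuit written as a linear program via the standard split $x = u - v$ with $u, v \ge 0$. The dimensions, sparsity level, and random Gaussian $\Phi$ are illustrative assumptions, not values from the talk.

```python
# Sketch: basis pursuit (minimize ||x||_1 subject to Phi x = y) as an LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, s = 60, 24, 3                       # ambient dim, measurements, sparsity (illustrative)
x_true = np.zeros(n)
support = rng.choice(n, s, replace=False)
x_true[support] = rng.standard_normal(s)

Phi = rng.standard_normal((m, n))         # random Gaussian measurement matrix
y = Phi @ x_true

# Split x = u - v with u, v >= 0, so ||x||_1 = sum(u) + sum(v):
c = np.ones(2 * n)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(x_hat - x_true))
```

With enough measurements (quantified by the Gaussian width results below), the LP typically recovers the planted sparse vector exactly.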

Rank. Symmetric 2x2 matrices with entries $(x, y, z)$ plotted in 3D; the rank-1 matrices of unit norm satisfy $x^2 + z^2 + 2y^2 = 1$. Their convex hull is the unit ball of the nuclear norm $\|X\|_* = \sum_i \sigma_i(X)$.

Nuclear Norm Heuristic: minimize $\|X\|_* = \sum_i \sigma_i(X)$. Rank minimization / matrix completion: Fazel 2002; Recht, Fazel, and Parrilo 2007.
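By analogy with soft-thresholding for $\ell_1$, the proximal operator of the nuclear norm soft-thresholds singular values. A small numpy sketch, with illustrative sizes and threshold:

```python
# Sketch: nuclear norm ||X||_* = sum_i sigma_i(X) and its proximal operator
# (singular-value soft-thresholding), the matrix analogue of l1 shrinkage.
import numpy as np

def nuclear_norm(X):
    return np.linalg.svd(X, compute_uv=False).sum()

def svt(Z, tau):
    """Prox of tau*||.||_*: soft-threshold the singular values of Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 5))  # rank-2 matrix
Y = X + 0.1 * rng.standard_normal((5, 5))                      # noisy observation
print(nuclear_norm(X))
print(np.linalg.svd(svt(Y, 0.5), compute_uv=False))            # small noise directions zeroed out
```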

Integer Programming. Integer solutions: all components of x are ±1, i.e., the vertices $(\pm 1, \pm 1)$. Their convex hull is the unit ball of the $\ell_\infty$ norm.

minimize $\|x\|_\infty$ subject to $\Phi x = y$. Donoho and Tanner 2008; Mangasarian and Recht 2009.

Parsimonious Models. A model is a weighted combination of atoms: search for the best linear combination of the fewest atoms. "Rank" = fewest atoms needed to describe the model.

Union of Subspaces. X has structured sparsity: a linear combination of elements from a set of subspaces $\{U_g\}$. Atomic set: unit-norm vectors living in one of the $U_g$. $\|x\|_G = \inf\{ \sum_{g \in G} \|w_g\| : x = \sum_{g \in G} w_g,\ w_g \in U_g \}$. Proposed by Jacob, Obozinski, and Vert (2009).

Permutation Matrices. X is a sum of a few permutation matrices. Examples: multi-object tracking (Huang et al.), ranked elections (Jagabathula, Shah). Convex hull of the permutation matrices: the Birkhoff polytope of doubly stochastic matrices. Permutahedra: convex hulls of the permutations of a fixed vector, e.g., [1, 2, 3, 4].

Moments: convex hull of $[1, t, t^2, t^3, t^4, \ldots]$, $t \in T$, some basic set. System identification, image processing, numerical integration, statistical inference. Solve with semidefinite programming.
Cut matrices: sums of rank-one sign matrices. Collaborative filtering, clustering in genetic networks, combinatorial approximation algorithms. Approximate with semidefinite programming.
Low-rank tensors: sums of rank-one tensors. Computer vision, image processing, hyperspectral imaging, neuroscience. Approximate with alternating least squares.

Atomic Norms. Given a basic set of atoms $A$, define the function $\|x\|_A = \inf\{ t > 0 : x \in t\,\mathrm{conv}(A) \}$. When $A$ is centrosymmetric, we get a norm: $\|x\|_A = \inf\{ \sum_{a \in A} c_a : x = \sum_{a \in A} c_a a,\ c_a \ge 0 \}$. IDEA: minimize $\|z\|_A$ subject to $\Phi z = y$. When does this work? How do we solve the optimization problem?
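For a finite atomic set, the second definition is itself a linear program: put the atoms in the columns of a matrix and minimize the total weight. A sketch under that assumption (the function name and the choice of atoms are illustrative); with atoms $\{\pm e_i\}$ it reproduces the $\ell_1$ norm:

```python
# Sketch: the atomic norm for a finite atomic set, computed from the definition
# ||x||_A = inf{ sum_a c_a : x = sum_a c_a * a, c_a >= 0 }.
import numpy as np
from scipy.optimize import linprog

def atomic_norm(x, atoms):
    """atoms: (n, p) array whose columns are the atoms."""
    p = atoms.shape[1]
    res = linprog(np.ones(p), A_eq=atoms, b_eq=x, bounds=[(0, None)] * p)
    return res.fun if res.success else np.inf   # inf if x is not in cone(atoms)

n = 4
atoms = np.hstack([np.eye(n), -np.eye(n)])      # atoms {+-e_i}: gauge = l1 norm
x = np.array([1.0, -2.0, 0.0, 0.5])
print(atomic_norm(x, atoms))                    # 3.5 = ||x||_1
```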

Tangent Cones. The set of directions that decrease the norm from x forms a cone: $T_A(x) = \{ d : \|x + \alpha d\|_A \le \|x\|_A \text{ for some } \alpha > 0 \}$. For the problem: minimize $\|z\|_A$ subject to $\Phi z = y$, x is the unique minimizer if the intersection of this cone with the null space of $\Phi$ equals $\{0\}$.

Gaussian Widths. When does a random subspace U intersect a convex cone C only at the origin? Gordon '88: with high probability if $\mathrm{codim}(U) \ge w(C)^2$, where $w(C) = \mathbb{E} \max_{x \in C \cap S^{n-1}} \langle x, g \rangle$, $g \sim \mathcal{N}(0, I_n)$, is the Gaussian width. Corollary, for inverse problems: if $\Phi$ is a random Gaussian matrix with m rows, need $m \ge w(T_A(x))^2$ for recovery of x.
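The width of the $\ell_1$ tangent cone can be estimated numerically by the "distance to the normal cone" computation mentioned at the end of the talk: $w(T)^2 \le \mathbb{E}\,\mathrm{dist}(g, \mathrm{cone}(\partial\|x\|_1))^2$. A Monte Carlo sketch for an s-sparse x (support and signs taken canonical without loss of generality; the trial count and the grid over the cone scaling $\tau$ are illustrative choices):

```python
# Sketch: Monte Carlo upper bound on w(T_A(x))^2 for the l1 norm at an
# s-sparse x, via the expected squared distance from a Gaussian vector to
# the cone generated by the subdifferential of ||.||_1 at x.
import numpy as np

def l1_width_bound(n, s, trials=2000, taus=np.linspace(0.0, 6.0, 200)):
    rng = np.random.default_rng(0)
    vals = []
    for _ in range(trials):
        g = rng.standard_normal(n)
        gS, gSc = g[:s], g[s:]          # wlog: support = first s coords, signs = +1
        # dist(g, tau * subdifferential)^2, minimized over the grid of tau >= 0
        d2 = [np.sum((gS - t) ** 2) + np.sum(np.maximum(np.abs(gSc) - t, 0.0) ** 2)
              for t in taus]
        vals.append(min(d2))
    return float(np.mean(vals))

n, s = 200, 10
print(l1_width_bound(n, s), 2 * s * np.log(n / s))   # roughly the 2s log(n/s) scaling below
```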

Robust Recovery. Suppose we observe $y = \Phi x + w$ with $\|w\|_2 \le \delta$, and solve: minimize $\|z\|_A$ subject to $\|\Phi z - y\| \le \delta$. If $\hat{x}$ is an optimal solution, then $\|x - \hat{x}\| \le 2\delta/\epsilon$ provided that $m \ge c_0\, w(T_A(x))^2 / (1 - \epsilon)^2$.

Calculating Widths.
Hypercube: $m \ge n/2$.
Sparse vectors (length n, sparsity $s < 0.25n$): $m \ge 2s \log(n/s) + s + 1$.
Block sparse (M groups, possibly overlapping, maximum group size B, k active groups): $m \ge 2k(\log(M - k) + B) + k$.
Low-rank matrices ($n_1 \times n_2$, $n_1 < n_2$, rank r): $m \ge 3r(n_1 + n_2 - r)$.

General Cones. Theorem: Let C be a nonempty cone with polar cone C*. Suppose C* subtends normalized solid angle $\mu$. Then $w(C) \le 3\sqrt{\log(4/\mu)}$. Corollary: for a vertex-transitive (i.e., "symmetric") polytope with p vertices, O(log p) Gaussian measurements are sufficient to recover a vertex via convex optimization. For an n x n permutation matrix: m = O(n log n). For an n x n cut matrix: m = O(n).

Algorithms. minimize over z: $\|\Phi z - y\|_2^2 + \mu \|z\|_A$. Naturally amenable to a projected gradient (shrinkage) algorithm: $z_{k+1} = \Pi_{\eta\mu}(z_k - \eta \Phi^T r_k)$ with residual $r_k = \Phi z_k - y$, where the shrinkage operator is $\Pi_\tau(z) = \arg\min_u \tfrac{1}{2}\|z - u\|^2 + \tau \|u\|_A$. Relaxations: $A_1 \subseteq A_2$ implies $\|x\|_{A_2} \le \|x\|_{A_1}$. NB: the tangent cone gets wider. A hierarchy of relaxations based on θ-bodies yields progressively tighter bounds on the atomic norm.
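For the $\ell_1$ instance of $\|\cdot\|_A$, the shrinkage $\Pi_\tau$ is coordinatewise soft-thresholding and the iteration above becomes the familiar ISTA loop. A minimal sketch with illustrative choices: step size $\eta = 1/\|\Phi\|_2^2$ and the $\tfrac{1}{2}\|\Phi z - y\|^2$ scaling of the loss.

```python
# Sketch: proximal gradient for min_z 0.5*||Phi z - y||^2 + mu*||z||_1,
# the l1 instance of the atomic-norm shrinkage iteration.
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_gradient(Phi, y, mu, iters=500):
    eta = 1.0 / np.linalg.norm(Phi, 2) ** 2      # step size 1/L, L = ||Phi||_2^2
    z = np.zeros(Phi.shape[1])
    for _ in range(iters):
        r = Phi @ z - y                          # residual r_k
        z = soft_threshold(z - eta * Phi.T @ r, eta * mu)
    return z

rng = np.random.default_rng(2)
Phi = rng.standard_normal((30, 80))
x = np.zeros(80); x[:4] = 5.0                    # illustrative sparse signal
print(np.linalg.norm(prox_gradient(Phi, Phi @ x, mu=0.1) - x))
```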

Atomic Norm Decompositions. Propose a natural convex heuristic for enforcing prior information in inverse problems. Bounds for the linear case: the heuristic succeeds for most sufficiently large sets of measurements. Stability without restricted isometries. A standard program for computing these bounds: distance to normal cones. Algorithms and approximation schemes for computationally difficult priors.

Extensions: width calculations for more general structures; recovery bounds for structured measurement matrices (application specific); incorporating stochastic noise models; understanding the loss due to convex relaxation and norm approximation; scaling generalized shrinkage algorithms to massive data sets.

JELLYFISH: SGD for Matrix Factorizations (with Christopher Ré). Example: minimize over X: $\sum_{(u,v) \in E} (X_{uv} - M_{uv})^2 + \mu \|X\|_*$. Idea: approximate $X \approx L R^T$ and minimize over (L, R): $\sum_{(u,v) \in E} (L_u R_v^T - Z_{uv})^2 + \mu_u \sum_u \|L_u\|_F^2 + \mu_v \sum_v \|R_v\|_F^2$. Step 1: pick $(u, v) \in E$ and compute the residual $e = L_u R_v^T - Z_{uv}$. Step 2: take a gradient step: $L_u \leftarrow (1 - \gamma\mu_u) L_u - \gamma e R_v$, $R_v \leftarrow (1 - \gamma\mu_v) R_v - \gamma e L_u$.
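A sketch of the two-step update from the slide, as one pass over the observed entries; variable names mirror the slide's $L_u$, $R_v$, and the learning rate γ and regularization weights are illustrative defaults:

```python
# Sketch: per-entry SGD for the factored objective
# min_{L,R} sum_{(u,v) in E} (L_u R_v^T - Z_uv)^2 + mu_u ||L||_F^2 + mu_v ||R||_F^2.
import numpy as np

def sgd_epoch(L, R, entries, gamma=0.01, mu_u=0.1, mu_v=0.1):
    """entries: iterable of observed (u, v, Z_uv) triples; L: (n1, r), R: (n2, r)."""
    for u, v, z in entries:
        e = L[u] @ R[v] - z                                   # Step 1: residual
        Lu_old = L[u].copy()
        L[u] = (1 - gamma * mu_u) * L[u] - gamma * e * R[v]   # Step 2: gradient steps
        R[v] = (1 - gamma * mu_v) * R[v] - gamma * e * Lu_old
    return L, R
```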

JELLYFISH. Observation: with-replacement sampling = poor locality. Idea: bias the sample to improve locality. Algorithm: shuffle the data, then 1. Process {L1R1, L2R2, L3R3} in parallel. 2. Process {L1R2, L2R3, L3R1} in parallel. 3. Process {L1R3, L2R1, L3R2} in parallel. Big win: no locks (on model access)!
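The schedule generalizes to a p x p grid of blocks: round d processes the blocks $\{(i, (i + d) \bmod p)\}$, which share no rows or columns, so workers never contend for the same rows of L or R. A sketch of the schedule only; the parallel execution machinery is omitted:

```python
# Sketch: the lock-free block schedule. Blocks processed in the same round
# touch disjoint rows of L and disjoint rows of R, so no locks are needed.
def block_rounds(p):
    return [[(i, (i + d) % p) for i in range(p)] for d in range(p)]

for d, blocks in enumerate(block_rounds(3)):
    print(f"round {d + 1}:", blocks)
# round 1: [(0, 0), (1, 1), (2, 2)]   ~ {L1R1, L2R2, L3R3}
# round 2: [(0, 1), (1, 2), (2, 0)]   ~ {L1R2, L2R3, L3R1}
# round 3: [(0, 2), (1, 0), (2, 1)]   ~ {L1R3, L2R1, L3R2}
```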

Example Optimization. Shuffle all the rows and columns; apply block partitioning; train on each block independently; repeat. Solves the Netflix Prize problem in under 1 minute on a 40-core machine, over 100x faster than standard solvers.