RECONSTRUCTION ALGORITHMS FOR COMPRESSIVE VIDEO SENSING USING BASIS PURSUIT

Ida Wahidah 1, Andriyan Bayu Suksmono 1
1 School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Jl. Ganesa No. 10, Bandung
email: wahidah@ittelkom.ac.id 1, suksmono@yahoo.com 2

ABSTRACT
Compressive sensing is a novel field in signal processing. In this paper, we study the possibility of applying compressive sensing to video processing and reconstruction. Principally, there are two stages of signal processing at the transmitter side of a communication system, i.e. the acquisition stage and the projection (measurement) stage. The acquisition stage usually transforms the signal into a sparse basis, in which only a few coefficients are significant. The projection stage then selects coefficients according to a random distribution, for instance a Gaussian or Bernoulli process. The reconstruction stage, commonly performed at the receiver side, has to recover the original signal from the altered and/or corrupted one. To date, using basis pursuit (BP) to reconstruct a video signal remains quite time consuming, because video consists of a huge amount of data in the form of frames, pixels, and bits. In the simulation scenario, we use a low frame rate, low resolution video. For a resolution of 80x60 pixels, a frame rate of 15 fps, and a duration of 8 seconds, there are approximately 5.8x10^5 pixels to be examined (80 x 60 x 15 x 8 = 576,000). This is equivalent to 9000 processing steps for the standard block size of 64 pixels.

Keywords: Compressive sensing, video coding, reconstruction algorithm, basis pursuit.

1 INTRODUCTION
Various algorithms have been proposed to reconstruct highly incomplete signals. These algorithms fall into three major classes, i.e. convex optimization, greedy methods, and iterative thresholding.
In this study, we compare convex optimization, represented by basis pursuit (BP), with matching pursuit (MP), which is classified as a greedy algorithm. Theoretically, basis pursuit should outperform MP in terms of accuracy. In contrast, MP might be faster due to its simplicity. We endeavor to recommend a suitable algorithm for video processing. To support this algorithm, we employ the inverse discrete cosine transform as the sparsity basis and a Gaussian distribution for the projection basis.

In general, basis pursuit finds the optimum reconstructed signal by means of linear programming. The measured signal is decomposed into smaller chunks taken from an overcomplete dictionary. The decision on which dictionary elements are combined results from minimizing the $\ell_1$ norm. In principle, basis pursuit is an optimization problem rather than an algorithm. Among the common algorithms used to solve the BP problem are Dantzig's simplex method and the interior point method. These methods find a solution by exploring the points of a polyhedron such that all equalities and inequalities are met.

2 COMPRESSIVE SENSING
In conventional sensing and sampling, the amount of acquired data is very large. This is due to Nyquist-rate analog sampling, as well as massive acquisition during digital sampling. With compressive sensing, it has been shown that signals characterized as sparse can be sampled below the Nyquist rate, while the reconstructed signal/image/video quality remains satisfactory in terms of PSNR.

2.1 Definition
Generally, compressive sensing is based on exploiting the sparsity of the signal in some domain. It consists of several main stages, i.e. sparsification by transformation, projection (measurement), and reconstruction. Let $x = \{x[1], \ldots, x[N]\}$ be a set of $N$ samples of a real-valued random process $X$.
Let $s$ be the representation of $x$ in the $\Psi$ domain, that is:

\[ x = \Psi s = \sum_{i=1}^{N} s_i \psi_i \tag{1} \]

where $s = [s_1, \ldots, s_N]^T$ is an $N$-vector of weighted coefficients $s_i = \langle x, \psi_i \rangle$, and $\Psi = [\psi_1\ \psi_2\ \cdots\ \psi_N]$ is an $N \times N$ basis matrix with $\psi_i$ being the $i$-th basis column vector. The number of non-zero coefficients $K$ after transformation is usually much less than $N$.

The 6th International Conference on Information & Communication Technology and Systems

The next stage is signal projection or measurement. This is where compressive sensing differs from other coding mechanisms. For an image or video frame, the projection stage determines which pixels will be taken for the next processing block. The transformation is conducted by means of a measurement matrix drawn from a basic random distribution. The main idea is to remove sampling redundancy by requiring only $M$ samples of the signal, where $K < M \ll N$. Let $y$ be an $M$-length measurement vector given by $y = \Phi x$, where $\Phi$ is an $M \times N$ measurement matrix. The above expression can be written in terms of $s$ as:

\[ y = \Phi x = \Phi \Psi s \tag{2} \]

At the decoder, each key frame $x = \Psi s$ of size $N$ is approximated via basis pursuit, which solves the convex optimization problem:

\[ \hat{s} = \arg\min_{s} \|s\|_1 \quad \text{such that} \quad \Phi \Psi s = A s = y \tag{3} \]

where $y$ is an $M \times 1$ vector, $y = \Phi x$, $A = \Phi \Psi$ is an $M \times N$ matrix, and $\|s\|_1$ is the $\ell_1$ norm of $s$, i.e. the sum of the absolute values of the components of $s$.

Basis Pursuit (BP) is a principle for decomposing a signal into an optimal superposition of dictionary elements, where optimal means having the smallest $\ell_1$ norm of coefficients. BP is applied to a linear programming formulation of Eq. (3), in which the search path for each iteration is obtained by projecting the received signal onto the feasible dictionary set.

2.2 Sparsity Basis
We use an i.i.d. Gaussian measurement matrix for $\Phi$ and the inverse DCT for $\Psi$ in equation (2). This choice of $\Phi$ ensures, with high probability, that the restricted isometry property is fulfilled.
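The roles of $\Phi$ and $\Psi$ in Eq. (2) can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' code: it uses a one-dimensional orthonormal inverse-DCT basis, a K-sparse coefficient vector drawn at random, and only the Python standard library. The helper names (`idct_basis`, `matvec`, `measure`) and the sizes are hypothetical.

```python
import math
import random


def idct_basis(n):
    """Return the n x n orthonormal inverse-DCT synthesis matrix Psi.

    Column k holds the k-th cosine atom, so x = Psi s maps DCT
    coefficients s to signal samples x.
    """
    psi = [[0.0] * n for _ in range(n)]
    for m in range(n):        # sample index
        for k in range(n):    # frequency index
            alpha = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            psi[m][k] = alpha * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
    return psi


def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def measure(n=64, m=26, k=5, seed=0):
    """Build a k-sparse s, the signal x = Psi s, and measurements y = Phi x."""
    rng = random.Random(seed)
    psi = idct_basis(n)
    s = [0.0] * n
    for idx in rng.sample(range(n), k):   # k random non-zero coefficients
        s[idx] = rng.gauss(0.0, 1.0)
    x = matvec(psi, s)                    # Eq. (1): x = Psi s
    phi = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    y = matvec(phi, x)                    # Eq. (2): y = Phi x = Phi Psi s
    return psi, phi, s, x, y
```

With the default arguments, a 64-sample block is reduced to 26 measurements (a measurement rate of about 40%); recovering `s` from `y` is the task of the basis pursuit step in Eq. (3).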
The inverse discrete cosine transform for two-dimensional signals can be expressed as follows:

\[ A_{mn} = \sum_{p=0}^{M-1} \sum_{q=0}^{N-1} \alpha_p \alpha_q B_{pq} \cos\frac{\pi (2m+1) p}{2M} \cos\frac{\pi (2n+1) q}{2N}, \qquad 0 \le m \le M-1,\ 0 \le n \le N-1 \tag{4} \]

where

\[ \alpha_p = \begin{cases} 1/\sqrt{M}, & p = 0 \\ \sqrt{2/M}, & 1 \le p \le M-1 \end{cases} \qquad \alpha_q = \begin{cases} 1/\sqrt{N}, & q = 0 \\ \sqrt{2/N}, & 1 \le q \le N-1 \end{cases} \]

In Eq. (4), $M$ and $N$ denote the number of rows and columns of the image, not the measurement and signal lengths of Section 2.1.

There are various transformation methods that can be used to sparsify signals, including DHWT, DCT, curvelet, IDCT, etc. We choose the IDCT for its simplicity and reliable results. After applying the IDCT to the original signal, we have a sparser signal in the $\Psi$ domain. More than 5% of the coefficients are considered very low compared to the coefficients at low frequency.

Figure 1. IDCT coefficient comparison for a 32x32 image (plot title: IDCT Result for Image Input)

The IDCT coefficients are then reshaped into a vector so that they can be multiplied by the measurement matrix. The data type should also be adjusted to a more suitable one. At the receiver, the output of the reconstruction stage is processed by a DCT block in order to obtain the original image.

2.3 Projection Basis
The projection stage determines the coefficients to be transmitted or stored. The more coefficients are selected, the better the signal quality at the receiver. On the other hand, more coefficients require more bits or bandwidth to transmit the signal. The measurement matrix is generated with the randn function in Matlab. This matrix converts an N-length vector to an M-length vector (written below with $k$ rows and $n$ columns):

\[ y = \Phi x, \qquad \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{bmatrix} = \begin{bmatrix} \varphi_{11} & \varphi_{12} & \cdots & \varphi_{1n} \\ \varphi_{21} & \varphi_{22} & \cdots & \varphi_{2n} \\ \vdots & \vdots & & \vdots \\ \varphi_{k1} & \varphi_{k2} & \cdots & \varphi_{kn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{5} \]

The measurement matrix is converted to an orthonormal basis by the orth function. It is presumed that the receiver knows this matrix; hence we multiply the received signal $y$ by the transpose of the matrix.
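The projection step of Section 2.3 can be sketched as follows. This is a hedged illustration only: the paper uses Matlab's randn and orth, whereas the sketch substitutes `random.gauss` and a modified Gram-Schmidt pass over the rows as a stand-in for orth. All function names are hypothetical, and the matrix is assumed to have full row rank.

```python
import random


def gaussian_matrix(m, n, rng):
    """m x n matrix with i.i.d. standard Gaussian entries (stand-in for randn)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def orthonormalize_rows(a):
    """Modified Gram-Schmidt on the rows of a (assumes full row rank).

    This is only a stand-in for Matlab's orth, which is SVD based; both
    yield rows that are mutually orthogonal with unit norm.
    """
    q = []
    for row in a:
        v = list(row)
        for u in q:
            proj = sum(vi * ui for vi, ui in zip(v, u))
            v = [vi - proj * ui for vi, ui in zip(v, u)]
        norm = sum(vi * vi for vi in v) ** 0.5
        q.append([vi / norm for vi in v])
    return q


def project(phi, x):
    """Measurement y = Phi x, as in Eq. (5)."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in phi]


def back_project(phi, y):
    """Receiver-side product Phi^T y."""
    n = len(phi[0])
    return [sum(phi[i][j] * y[i] for i in range(len(phi))) for j in range(n)]
```

Because the rows are orthonormal, $\Phi \Phi^T = I$, so `back_project` returns the minimum-energy vector consistent with the measurements. It is a starting point for the reconstruction, not the reconstruction itself.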
3 BASIS PURSUIT METHODS
BP finds signal representations in overcomplete dictionaries by convex optimization. It obtains the decomposition that minimizes the $\ell_1$ norm of the coefficients. Because of the nondifferentiability of the $\ell_1$ norm, this optimization principle leads to sparser decompositions. Because it is based on global optimization, it can stably super-resolve in some ways better than methods from the class of greedy algorithms. BP can be used with noisy data by solving an optimization problem that trades off a quadratic misfit measure against the $\ell_1$ norm of the coefficients. Examples show that it can suppress noise while preserving structure that is well expressed in the dictionary.

BP is closely connected with linear programming. Recent advances in large scale linear programming, associated with interior-point methods, can be applied to BP and make it possible, with certain dictionaries, to nearly solve the BP optimization problem in linear time. There are a few implementations of a basis pursuit environment, such as Atomizer by Donoho et al. and l1-magic by Candes et al. Experiments with standard time-frequency dictionaries indicate some of the benefits of BP.

3.1 Linear Programming
Basis Pursuit develops a connection with linear programming (LP). The linear program in so-called standard form is a constrained optimization problem defined in terms of a variable $x \in \mathbb{R}^m$ by

\[ \min\ c^T x \quad \text{subject to} \quad Ax = b,\ x \ge 0 \tag{6} \]

where $c^T x$ is the objective function, $Ax = b$ is a collection of equality constraints, and $x \ge 0$ is a set of bounds. The Basis Pursuit problem can be reformulated as a linear program in the standard form (6) by making the following translations:

\[ x \Leftrightarrow (u; v); \quad c \Leftrightarrow (1; 1); \quad A \Leftrightarrow (\Phi, -\Phi); \quad b \Leftrightarrow s \tag{7} \]

Here the coefficient vector is written as the difference $u - v$ of two nonnegative vectors, so that its $\ell_1$ norm becomes the linear objective $1^T u + 1^T v$. In (7), $\Phi$ denotes the dictionary and $s$ the signal to be decomposed; in the notation of Eq. (3) these correspond to $A$ and $y$ respectively. The connection between Basis Pursuit and linear programming is very useful in several ways.

3.2 Interior Point Method
In the last fifty years, a tremendous amount of work has been done on the solution of linear programs.
Until the 1980's, most work focused on variants of Dantzig's simplex algorithm. More recently, some spectacular breakthroughs have been made by the use of "interior-point methods", which use an entirely different principle. From our point of view, we are free to consider any algorithm from the literature as a candidate for solving the BP optimization problem; both the simplex and interior-point algorithms offer interesting insights into BP.

The set of feasible points $\{x : Ax = b,\ x \ge 0\}$ is a convex polyhedron in $\mathbb{R}^m$ (a "simplex"). The simplex method works by walking around the boundary of this simplex, jumping from one vertex of the polyhedron to an adjacent vertex at which the objective is better. Interior point methods start from a point $x^{(0)}$ well inside the interior of the simplex ($x^{(0)} \gg 0$) and go "through the interior". Since an LP that has a solution attains it at an extreme point of the simplex, the current iterate $x^{(k)}$ approaches the boundary as the method converges. One may then abandon the basic iteration and conduct a "crossover" procedure that uses simplex iterations to find the extreme point.

We start from a solution to the overcomplete representation problem $\Phi \alpha^{(0)} = s$ with $\alpha^{(0)} > 0$, where $\alpha$ denotes the coefficient vector. One iteratively modifies the coefficients, maintaining feasibility $\Phi \alpha^{(k)} = s$ and applying a transformation that sparsifies the vector $\alpha^{(k)}$. At some iteration, the vector has $n$ significantly nonzero entries, and it "becomes clear" that those correspond to the basis appearing in the final solution. The method then forces all the other coefficients to zero and "jumps" to the decomposition in terms of the $n$ selected atoms.

Advances in interior point methods for convex optimization over the past 20 years, led by Nesterov's work, have produced prominent solvers for several BP problems. Boyd and Vandenberghe outline a relatively simple primal-dual algorithm for linear programming, which we have followed for the implementation of (P1), (PA), and (PD). To set up notation, we briefly review their algorithm here.
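As a small illustration of the extreme-point property mentioned above, the following sketch solves a tiny basis pursuit problem by enumerating the vertices of the feasible polyhedron. It is not the interior-point or simplex method discussed in this section: it simply tries every support of M columns, solves the resulting square system, and keeps the candidate with the smallest $\ell_1$ norm. The cost grows combinatorially, so it is only usable for toy sizes; the function names are hypothetical, and the matrix is assumed to have full row rank.

```python
import itertools
import random


def solve_square(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.

    Returns None when the matrix is (numerically) singular.
    """
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = acc / aug[i][i]
    return z


def basis_pursuit_by_vertices(a, y):
    """Minimise ||s||_1 subject to a s = y by checking every basic solution."""
    m, n = len(a), len(a[0])
    best, best_norm = None, float("inf")
    for support in itertools.combinations(range(n), m):
        sub = [[a[i][j] for j in support] for i in range(m)]
        z = solve_square(sub, y)
        if z is None:          # columns not linearly independent: no vertex
            continue
        l1 = sum(abs(v) for v in z)
        if l1 < best_norm:
            best_norm = l1
            best = [0.0] * n
            for j, v in zip(support, z):
                best[j] = v
    return best
```

For a 3 x 6 Gaussian matrix this checks only 20 supports. An interior-point solver reaches a minimiser of the same problem without visiting the vertices one by one.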
The standard-form linear program is

\[ \min_{z}\ \langle c_0, z \rangle \quad \text{subject to} \quad A_0 z = b,\ f_i(z) \le 0 \tag{8} \]

where the search vector $z \in \mathbb{R}^N$, $b \in \mathbb{R}^K$, $A_0$ is a $K \times N$ matrix, and each of the $f_i$, $i = 1, \ldots, m$, is a linear function: $f_i(z) = \langle c_i, z \rangle + d_i$ for some $c_i \in \mathbb{R}^N$, $d_i \in \mathbb{R}$. At the optimal point $z^*$, there exist dual vectors $\nu^* \in \mathbb{R}^K$, $\lambda^* \in \mathbb{R}^m$, $\lambda^* \ge 0$, such that the Karush-Kuhn-Tucker (KKT) conditions are satisfied:

\[ c_0 + A_0^T \nu^* + \sum_i \lambda_i^* c_i = 0, \qquad \lambda_i^* f_i(z^*) = 0,\ i = 1, \ldots, m, \qquad A_0 z^* = b, \qquad f_i(z^*) \le 0,\ i = 1, \ldots, m. \tag{9} \]

The primal-dual algorithm finds the optimal $z^*$ (along with the optimal dual vectors $\nu^*$ and $\lambda^*$) by solving this system of nonlinear equations. The solution procedure is the Newton method: at an interior point $(z^k, \nu^k, \lambda^k)$ (by which we mean $f_i(z^k) < 0$, $\lambda^k > 0$), the system is linearized and solved. However, the step to the new point $(z^{k+1}, \nu^{k+1}, \lambda^{k+1})$ must be modified so that we remain in the interior.

4 SIMULATION RESULTS
In this section we report our experimental results, namely the effect of compressive sensing with different values of the measurement parameter M on the reconstructed video PSNR. The system is applied to a grayscale video sequence, i.e. the Traffic sequence of 80x60 pixels. In all our simulations, a block size of N = 8 x 8 = 64 pixels is used. This block size was observed to provide a good trade-off between efficiency, reconstruction complexity, and decoding time.

Fig. 2 to Fig. 5 show the original and reconstructed video frames for various measurement rates. It can be seen that a larger measurement matrix results in more satisfactory quality. The original IDCT coefficients for the 77th, 80th, and 83rd frames of Traffic.avi are shown in Fig. 7 to Fig. 9 respectively. As we use a higher measurement rate for the 80th frame, the average error is lower. For the 77th frame, the mean error of the interior point method is 1.9 with a processing time of 74 seconds. In contrast, the mean error for the 80th frame is only 0.697, with a total elapsed time of 151 seconds.

Figure 2. Input and reconstructed 12th frame with measurement rate of 20%
Figure 3. Input and reconstructed 77th frame with measurement rate of 40%
Figure 4. Input and reconstructed 80th frame with measurement rate of 60%
Figure 5. Input and reconstructed 83rd frame with measurement rate of 80%
Figure 6. Original and error signal for 12th frame (MR = 20%). Plot title: Original signal (x) and error signal (e); legend: projection input, reconstruction error.
Figure 7. Original and error signal for 77th frame (MR = 40%). Plot title: Video processing "traffic.avi" 77th frame (80x60 pixels, 15 fps, total frame 120, grayscale), IDCT result (k=24, pdtol=1exp-3) and reconstruction error; legend: original, error.
Figure 8. Original and error signal for 80th frame (MR = 60%). Plot title: Original (x) and error (e) signal; legend: projection input, reconstruction error.
Figure 9. Original and error signal for 83rd frame (MR = 80%). Plot title: Original (x) and error (e) signal for measurement rate of 80%; legend: projection input, reconstruction error.

The PSNR calculations for measurement rates of 20%, 40%, 60%, and 80% are shown in Fig. 10 to Fig. 13. The average PSNR values are 21.59 dB, 24.99 dB, 28.11 dB, and 34.42 dB respectively, and the PSNR can be improved by increasing the measurement matrix size. From our experiment at M = 80%, the average PSNR achieves 36.11 dB. The video signal quality is very good compared with the original signal. If we set 30 dB as the acceptable PSNR, then the measurement rate should be around 65%.

Figure 10. PSNR measurement for MR = 20% and frame number 1 to 120 (vertical axis: PSNR in dB)
Figure 11. PSNR measurement for MR = 40% and frame number 1 to 120 (vertical axis: PSNR in dB)
Figure 12. PSNR measurement for MR = 80% and frame number 1 to 120 (vertical axis: PSNR in dB)

Another interesting observation from the graphs is that the minimum PSNR for MR 40% occurs between frame numbers 36 and 40. In contrast, despite the local minimum at frame 40, the global minimum PSNR for MR 60% occurs at frame 78.

Figure 13. PSNR comparison for various measurement rates (MR 20%, MR 40%, MR 60%, MR 80%; vertical axis: PSNR in dB)
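The measurement rate of about 65% quoted for a 30 dB target can be checked with a short sketch. It makes three assumptions that are not stated in the paper: the rates are read as 20%, 40%, 60%, and 80%, PSNR is computed against an 8-bit peak value of 255, and the averages are interpolated linearly between neighbouring rates. The function names are hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences.

    The 8-bit peak of 255 is an assumption about the grayscale frames.
    """
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)


def rate_for_target(points, target_db):
    """Linearly interpolate the measurement rate that reaches target_db.

    points is a list of (rate_percent, average_psnr_db) pairs sorted by rate.
    Returns None when the target lies outside the tabulated range.
    """
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if p0 <= target_db <= p1:
            return r0 + (r1 - r0) * (target_db - p0) / (p1 - p0)
    return None


# Average PSNR values reported in the text for each measurement rate.
reported = [(20, 21.59), (40, 24.99), (60, 28.11), (80, 34.42)]
print(rate_for_target(reported, 30.0))
```

Under these assumptions the interpolated rate is close to 66%, in line with the figure of around 65% stated in the text.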
5 CONCLUSION
In this paper, we propose a system based on the novel concept of compressive sensing to achieve low complexity acquisition of video. Each frame of the video is split into a number of smaller non-overlapping blocks of equal size to reduce the complexity of CS algorithms and to exploit the varying sparsity across blocks within a frame. Compressive sensing is performed on all frames that satisfy a simple sparsity test. Reference frames are used to predict the sparsity of the blocks within successive frames. Experimental results show great potential for compressive sensing in video acquisition, with up to 40% savings with good reconstruction quality. Acceptable PSNR is achieved for a projection/measurement rate of 60%.