CHAPTER 9 INPAINTING USING SPARSE REPRESENTATION AND INVERSE DCT

9.1 Introduction

In the previous chapters, inpainting was treated as an iterative algorithm. The PDE-based method iterates to converge to the solution, and the exemplar-based method iterates to propagate information from the boundary to the interior of the inpainting area. Exemplar-based methods incur additional cost because they perform an exhaustive search for the best-matching patch. All of these inpainting algorithms depend on the information present at the boundary, so any error introduced near the boundary is propagated into the interior of the inpainting region. The method proposed in this chapter avoids these problems, thus addressing the last research objective.

9.2 Compressive Sensing

Compressive sensing is a data acquisition technique that aims to reduce the number of measurements required to represent a signal or image by exploiting its compressibility or sparsity. Most signals and images are sparse in some transform domain. A signal is said to be sparse if it can be compactly expressed as a linear combination of a small number of elementary signals [37]. Sparse representation theory states that sparse signals can be represented accurately by a small number of elementary signals. The sparse representation of a signal can be

achieved by exploiting its compressibility or sparsity, where only a small fraction of the coefficients are nonzero.

The discrete cosine transform (DCT) represents an image as a sum of sinusoids of varying magnitudes and frequencies. For a given image, the DCT concentrates most of the visually significant information in only a few coefficients. This property is called energy compaction. It allows the relatively small coefficients to be discarded without introducing visible distortion in the reconstructed image. Energy compaction is generally used for compression, but in this chapter it is used to inpaint images.

Fig 9.1: (a) image compression, (b) reconstruction. X: input image; Ψ: transform matrix; S: sparse transform coefficients, $S = \langle X, \Psi \rangle$; Φ: stable measurement matrix, which ensures that the salient information in any K-sparse or compressible signal is not damaged by the dimensionality reduction from X (M x N) to Y (m x n); Y: compressed image (ΦX)
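The energy compaction property described above can be checked numerically. The following is a minimal sketch, assuming an orthonormal DCT-II kernel built directly in NumPy and a synthetic smooth image standing in for a natural one; the function names `dct_kernel` and `energy_in_top_left` are illustrative and do not come from the thesis.

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II kernel of size n x n (each row is one basis vector)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    x = np.arange(n).reshape(1, -1)   # sample index
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    phi[0, :] = np.sqrt(1.0 / n)      # DC row has its own normalisation
    return phi

def energy_in_top_left(image, fraction):
    """Share of the total energy held by the top-left (low-frequency) block
    that contains roughly `fraction` of all DCT coefficients."""
    rows, cols = image.shape
    coeffs = dct_kernel(rows) @ image @ dct_kernel(cols).T
    r = max(1, int(round(rows * np.sqrt(fraction))))
    c = max(1, int(round(cols * np.sqrt(fraction))))
    return float(np.sum(coeffs[:r, :c] ** 2) / np.sum(coeffs ** 2))

# Synthetic smooth image (an assumption; any grayscale array would do).
yy, xx = np.mgrid[0:64, 0:64]
image = 128 + 60 * np.sin(xx / 10.0) + 40 * np.cos(yy / 14.0)

share = energy_in_top_left(image, 0.25)
print(f"energy share in 25% of the coefficients: {share:.6f}")
```

For smooth content such as this, nearly all of the energy sits in the low-frequency quarter of the coefficient matrix; images with strong texture or noise would compact less well.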

Compressive sensing predicts that sparse signals and images can be reconstructed from what was previously believed to be incomplete information. Instead of acquiring all the data of an image or signal, it retains only the data where the coefficient values are high. The process of compressive sensing is shown in Fig 9.1. Hence, any image or signal can be expressed as $X = \Psi S$, where $S = \langle X, \Psi \rangle = \Psi^{T} X$ and $T$ denotes the transpose operation. Clearly, X and S are equivalent representations of the same signal or image, with X in the time (or, for images, spatial) domain and S in the Ψ domain.

Relating these terms to the proposed method: X is the input image; Ψ is the transform matrix, which can be thought of as the DCT kernel; Φ is the random selection in CS, which here corresponds to the pixels left out during downsampling; S is the set of sparse transform coefficients in CS, which here is the transform matrix that remains after thresholding; and Y is the downsampled image. The final step of recovering the image in CS is $X = \Psi S$; here it is the zero-padded transform matrix passed through the inverse discrete cosine transform.

Inpainting can be considered as the reconstruction step of compressive sensing, where the image has to be predicted from a few coefficients. A super-resolution image [32] is a higher-resolution image obtained from one or more low-resolution images. This can also be viewed as an inpainting process in which the new pixels at the higher resolution are treated as unknown pixels. In all these cases, the sparse coefficients in the transform domain can be calculated from the known image pixels, and these coefficients are then used to predict the unknown values by applying an inverse DCT of suitable size.

The theory is derived from compressive sensing as developed by Donoho [40], [41]. More theory on sparse representation can be found in [42], [43]. Inpainting in the sparse domain is treated as a denoising task in [44], [45]. These methods create

dictionaries as in [46] to perform the denoising. In [47], a Bayesian framework is used with an iterative EM algorithm to inpaint the image.

9.3 Inpainting Using Inverse DCT (IDCT)

The two-dimensional DCT of an image is described in Section 3.2.3. The transformation kernel can be represented in matrix form as φ, which remains the same for any image of the same size. The DCT coefficients can be represented as the transformed matrix A. For a square image a, A is calculated from the kernel as given in Eq. 9.1:

\[ [A] = [\varphi]\,[a]\,[\varphi^{T}] \tag{9.1} \]

where [.] denotes a matrix and $\varphi^{T}$ is the transpose of φ. Similarly, the image can be reconstructed from the coefficients using Eq. 9.2:

\[ [a] = [\varphi^{T}]\,[A]\,[\varphi] \tag{9.2} \]

The first element of the transformed matrix is the DC element, and entries with increasing vertical and horizontal index values represent higher vertical and horizontal frequencies. The values at the upper left correspond to the low frequencies and those at the bottom right correspond to the high frequencies. For most images, the significant values are concentrated at the top left and the values in the bottom right corner are negligible. An image can therefore be reconstructed from a few significant values in a visually plausible manner. Conversely, adding insignificant (zero) values in the high-frequency range makes it possible to reconstruct an enlarged image. This principle is used in this chapter for inpainting and for producing super-resolution images. The block diagram for inpainting is shown in Fig 9.2.
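Eq. 9.1 and Eq. 9.2 can be verified on a small example. The sketch below assumes that φ is the orthonormal DCT-II kernel (the chapter does not state the normalisation explicitly), so that the transpose of φ is also its inverse; `dct_kernel` is an illustrative helper name.

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II kernel of size n x n (each row is one basis vector)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    x = np.arange(n).reshape(1, -1)   # sample index
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    phi[0, :] = np.sqrt(1.0 / n)
    return phi

a = np.arange(16, dtype=float).reshape(4, 4)  # small square "image"
phi = dct_kernel(4)

A = phi @ a @ phi.T        # Eq. 9.1: forward transform
a_rec = phi.T @ A @ phi    # Eq. 9.2: reconstruction

print("DC element A[0, 0]:", A[0, 0])
print("max reconstruction error:", np.max(np.abs(a_rec - a)))
```

With this normalisation the DC element equals the sum of all pixels divided by the side length of the image, and the reconstruction matches the input up to floating-point rounding.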

Fig 9.2: Block diagram showing the process of inpainting

To inpaint an image, the mask is accepted from the user and the mask pixels are eliminated, which reduces the size of the image. The DCT coefficients of the reduced image (m x n) are computed using a DCT kernel of the same size (m x n), and zeros are added in the high-frequency area so that the transformed matrix matches the original size (M x N). The image is then reconstructed from these coefficients using the inverse DCT kernel of size (M x N). For irregular, aperiodic masks, eliminating only the mask pixels creates distortion. To avoid this, the entire lines containing mask pixels are eliminated, which keeps the image in a regular (rectangular) shape.

For color images, the algorithm is applied separately to each of the three color components. The RGB components of the image are separated, each component is processed on its own, and the results are combined to obtain the inpainted color image.

To produce super-resolution images, the DCT coefficients are obtained from the low-resolution image and zeros are padded at the right and bottom ends of the transformed matrix to bring it to the required resolution. An inverse DCT kernel of the updated size is then used, and the image is reconstructed using Eq 9.2.
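The inpainting pipeline described above (eliminate masked lines, transform, zero-pad, inverse transform) can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: the mask is assumed to consist of whole rows and columns, the DCT is assumed orthonormal, and the optional `rescale` argument is an addition that is not part of the described method.

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II kernel of size n x n (each row is one basis vector)."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    phi[0, :] = np.sqrt(1.0 / n)
    return phi

def inpaint_by_idct(image, mask, rescale=False):
    """Inpaint a grayscale image whose mask (True = unknown) is made of
    complete rows and columns."""
    M, N = image.shape
    keep_rows = ~mask.all(axis=1)          # rows that are not fully masked
    keep_cols = ~mask.all(axis=0)          # columns that are not fully masked
    if mask[np.ix_(keep_rows, keep_cols)].any():
        raise ValueError("mask must be expanded to whole rows/columns first")
    reduced = image[np.ix_(keep_rows, keep_cols)]
    m, n = reduced.shape
    coeffs = dct_kernel(m) @ reduced @ dct_kernel(n).T   # m x n coefficients
    padded = np.zeros((M, N))
    padded[:m, :n] = coeffs                # zeros fill the high frequencies
    if rescale:
        # Assumption, not in the text: compensate the brightness drop.
        padded *= np.sqrt((M * N) / (m * n))
    return dct_kernel(M).T @ padded @ dct_kernel(N)      # M x N image

# Periodic grid mask: every fourth row and column is unknown.
image = np.full((16, 16), 100.0)
mask = np.zeros((16, 16), dtype=bool)
mask[3::4, :] = True
mask[:, 3::4] = True

plain = inpaint_by_idct(image, mask)
compensated = inpaint_by_idct(image, mask, rescale=True)
print(plain.mean(), compensated.mean())
```

For a color image, the same function would be called once per RGB channel and the three results stacked. Without rescaling, the constant test image comes back darker than the input, which is consistent with the darkening the chapter reports in its experiments.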

9.4 Experimental Analysis

The effectiveness of the DCT in energy compaction is analyzed at various levels. A percentage of the DCT coefficients of each image is discarded from the lower right part of the matrix, and the image is reconstructed from the remaining coefficients. The mean square error between the actual image and the reconstructed image is then calculated. The error values for various percentages of unknown DCT coefficients are plotted for a few images in Fig 9.3. The images used for the analysis are given in Table 9.4.

Fig 9.3: Effect of unknown DCT coefficients in image reconstruction (error versus % of unknown coefficients, at 50, 75, 80 and 90%, for images DCT_1 to DCT_6)

The quality of the reconstructed image decreases as the number of known coefficients is reduced. Fig 9.4 compares the percentage of information known with the error value for an image. It can be seen that the error value rises sharply when less than 25% of the information is known.
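The experiment described above can be sketched as below. The chapter does not specify exactly how the discarded region is chosen, so this sketch assumes that a top-left block holding the desired fraction of coefficients is kept and everything else is set to zero; the test image is synthetic, and the helper names are illustrative.

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II kernel of size n x n (each row is one basis vector)."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    phi[0, :] = np.sqrt(1.0 / n)
    return phi

def reconstruct_with_fraction(image, keep_fraction):
    """Keep the top-left block holding about `keep_fraction` of the DCT
    coefficients, zero the rest, and reconstruct at the original size."""
    rows, cols = image.shape
    phi_r, phi_c = dct_kernel(rows), dct_kernel(cols)
    coeffs = phi_r @ image @ phi_c.T
    r = max(1, int(round(rows * np.sqrt(keep_fraction))))
    c = max(1, int(round(cols * np.sqrt(keep_fraction))))
    truncated = np.zeros_like(coeffs)
    truncated[:r, :c] = coeffs[:r, :c]
    return phi_r.T @ truncated @ phi_c

def mse(a, b):
    """Mean square error between two images of equal size."""
    return float(np.mean((a - b) ** 2))

# Synthetic image: smooth content plus mild noise (an assumption).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
image = 128 + 60 * np.sin(xx / 10.0) + 40 * np.cos(yy / 14.0)
image = image + rng.normal(0.0, 5.0, size=image.shape)

errors = {p: mse(image, reconstruct_with_fraction(image, 1 - p / 100))
          for p in (50, 75, 80, 90)}   # p = % of unknown coefficients
for p, e in errors.items():
    print(f"{p}% unknown: MSE = {e:.3f}")
```

Because the kept blocks are nested and the transform is orthonormal, the error in this sketch cannot decrease as more coefficients are discarded, which mirrors the trend the chapter reports.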

Fig 9.4: Performance of image reconstruction using reduced DCT coefficients (error value versus percentage of information known, at 97.67, 94.6, 84.26, 80.71, 64.12, 36.65 and 17.14%)

The performance of the system in producing super-resolution images is measured by downsampling images and reconstructing them with an IDCT kernel of suitable size. The enlargement is carried out in several levels. At each level, the image is enlarged to twice the size of the reconstructed image of the previous level. Reconstruction up to two levels gives good results; beyond that, the reconstructed images show ringing effects. These are demonstrated in Fig. 9.5.

The result of eliminating consecutive rows and columns and reconstructing the image from the remaining portions is given in Table 9.1. The percentage of known pixels used for the reconstruction and the mean square error with respect to the actual image are also tabulated.
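The level-by-level enlargement can be sketched as follows. This is a minimal illustration assuming an orthonormal DCT and a synthetic low-resolution image; the `rescale` option, which keeps the mean brightness unchanged, is an assumption added here and is not described in the chapter.

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II kernel of size n x n (each row is one basis vector)."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    phi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    phi[0, :] = np.sqrt(1.0 / n)
    return phi

def enlarge_by_idct(image, factor=2, rescale=False):
    """Enlarge an image by zero-padding its DCT coefficients at the right and
    bottom and applying an inverse DCT of the larger size."""
    m, n = image.shape
    M, N = factor * m, factor * n
    coeffs = dct_kernel(m) @ image @ dct_kernel(n).T
    padded = np.zeros((M, N))
    padded[:m, :n] = coeffs
    if rescale:
        # Assumption, not in the text: preserve the mean brightness.
        padded *= np.sqrt((M * N) / (m * n))
    return dct_kernel(M).T @ padded @ dct_kernel(N)

# Synthetic 32 x 32 low-resolution image (an assumption).
yy, xx = np.mgrid[0:32, 0:32]
low = 128 + 60 * np.sin(xx / 5.0) + 40 * np.cos(yy / 7.0)

level1 = enlarge_by_idct(low, rescale=True)      # 64 x 64
level2 = enlarge_by_idct(level1, rescale=True)   # 128 x 128
print(level1.shape, level2.shape)
```

Each call doubles both dimensions, so two calls correspond to the two levels that the chapter reports as giving good results.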

Fig 9.5: (a), (b) 256 x 256 image constructed after one and three levels, respectively

For the inpainting application, masks in the form of periodic, regular grids and aperiodic, irregular grids are considered for various images. The results of inpainting by removing the grids from gray scale images are given in Table 9.2. The results of inpainting the periodic mask area in color images are listed in Table 9.3. The reconstructed images are clearly recognizable without much distortion, but they appear darker than the original image as the inpainting area increases. This effect is most noticeable in color images.

Table 9.1 Image reconstruction after ignoring varied number of rows and columns

Before inpainting    After inpainting    % of coefficients known    Mean square error
(image)              (image)             64%                        616.04
(image)              (image)             36.7%                      2.26e+03
(image)              (image)             17%                        4.27e+04

Table 9.2 Inpainting with various masks on gray scale images

Image with grids    Reconstructed image    % of known coefficients    Mean square error
(image)             (image)                80%                        716.09
(image)             (image)                64%                        699.68
(image)             (image)                84%                        1.2e+03
(image)             (image)                36.7%                      2.3e+03
(image)             (image)                94.6%                      134.96
(image)             (image)                84.6%                      180.94
(image)             (image)                89%                        359.05

Table 9.3 Inpainting of color images with periodic mask

Original image    Image with grids    After reconstruction    % known coefficients
(image)           (image)             (image)                 64.13% per color channel
(image)           (image)             (image)                 17.14% per color channel
(image)           (image)             (image)                 17.14% per color channel
(image)           (image)             (image)                 36.19% per color channel

Table 9.4 Input images used for experimenting with DCT compaction: DCT_1, DCT_2, DCT_3, DCT_4, DCT_5, DCT_6

9.5 Conclusion

The DCT makes an image sparse in the transform domain. The transformed image consists of a few significant values from which the image can be reconstructed. Experiments show that with 50% of the coefficients the images are reconstructed in a visually plausible way. Mild distortions are produced in certain

images when reconstructed with 30% of the coefficients. Visually noticeable distortions are produced when fewer than 25% of the coefficients are used.

This chapter explores adding more insignificant values to the coefficient matrix and reconstructing a larger image from it. The concept is used for both super resolution and inpainting. The image is transformed into its sparse representation using the known coefficients, and the required number of insignificant values is added to it. The image is then reconstructed using the inverse DCT. Owing to the periodic nature of the DCT, image pixels missing at periodic intervals are reconstructed better. Aperiodic masks are handled by converting them to a periodic form, which means discarding known information along with the masked pixels; this leads to additional loss of information. As the aperiodic mask becomes thicker, the quality of the reconstruction drops. The overall brightness of the reconstructed image decreases as the number of known coefficients is reduced, which is most pronounced in color images. This is because the amplitude of the DC coefficient has to be shared with the newly added pixels.

The method discussed in this chapter is non-iterative and obtains the result with a few matrix multiplications. This makes it the fastest method proposed in the thesis. The inpainted values do not depend on the boundary alone but on the entire set of known pixels. The dependence on boundary pixels is thus eliminated, which addresses the final research objective.
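The brightness reduction attributed to the DC coefficient can be made concrete with a short derivation. It assumes an orthonormal DCT kernel, which is a normalisation the chapter does not state explicitly, and writes $\bar{a}$ for the mean of the reduced $m \times n$ image and $\bar{a}_{\text{rec}}$ for the mean of the reconstructed $M \times N$ image (both symbols are introduced here for illustration).

```latex
% DC coefficient of the reduced m x n image under an orthonormal DCT
A(0,0) = \frac{1}{\sqrt{mn}} \sum_{i=1}^{m} \sum_{j=1}^{n} a(i,j) = \sqrt{mn}\,\bar{a}

% After zero padding to M x N, only A(0,0) contributes to the mean of the output
\bar{a}_{\text{rec}} = \frac{A(0,0)}{\sqrt{MN}} = \bar{a}\,\sqrt{\frac{mn}{MN}}

% Example: 64% of the coefficients known, i.e. mn/(MN) = 0.64
\bar{a}_{\text{rec}} = \bar{a}\,\sqrt{0.64} = 0.8\,\bar{a}
```

Under this assumption, the mean brightness scales with the square root of the fraction of known coefficients, so multiplying the padded coefficients by $\sqrt{MN/(mn)}$ before the inverse transform would restore the original mean.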