946 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL /$ IEEE

Size: px
Start display at page:

Download "946 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL /$ IEEE"

Transcription

1 946 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL 2010 Compress Compound Images in H.264/MPGE-4 AVC by Exploiting Spatial Correlation Cuiling Lan, Guangming Shi, Member, IEEE, and Feng Wu, Senior Member, IEEE Abstract Compound images are a combination of text, graphics and natural image. They present strong anisotropic features, especially on the text and graphics parts. These anisotropic features often render conventional compression inefficient. Thus, this paper proposes a novel coding scheme from the H.264 intraframe coding. In the scheme, two new intramodes are developed to better exploit spatial correlation in compound images. The first is the residual scalar quantization (RSQ) mode, where intrapredicted residues are directly quantized and coded without transform. The second is the base colors and index map (BCIM) mode that can be viewed as an adaptive color quantization. In this mode, an image block is represented by several representative colors, referred to as base colors, and an index map to compress. Every block selects its coding mode from two new modes and the previous intramodes in H.264 by rate-distortion optimization (RDO). Experimental results show that the proposed scheme improves the coding efficiency even more than 10 db at most bit rates for compound images and keeps a comparable efficient performance to H.264 for natural images. Index Terms Base colors and the index map, compound image compression, dynamic programming, residual scalar quantization. I. INTRODUCTION B ESIDES natural images, there are millions of artificial visual contents generated by computers every day, such as Web pages, PDF files, slides, online games, and captured screens. They are usually a combination of text, graphics, and natural image. This is why they are generally called as compound images. In particular, with cloud computing becoming more and more popular [1], compound images often need to be displayed on remote clients, wireless projectors, and thin clients. Some of the clients are unable to directly render them from files. Compressing and transmitting compound images provides a generic solution to these clients. However, in this solution, how to efficiently compress compound images has become a prevalent and critical problem. The state-of-the-art image compression standards (e.g., JPEG [2], JPEG2000 [3] and the intraframe coding of H.264 [4]) are Manuscript received January 20, 2009; revised October 13, First published December 15, 2009; current version published March 17, This work was done while C. Lan was with Microsoft Research Asia, Beijing, , China. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Oscar C. Au. C. Lan and G. Shi are with the Department of Electrical Engineering, Xidian University, Xi an, , China ( lancuiling@see.xidian.edu.cn; gmshi@xidian.edu.cn). F. Wu is with Microsoft Research Asia, Beijing, , China ( fengwu@microsoft.com). Color versions of one or more of the figures in this paper are available online at Digital Object Identifier /TIP all designed for natural images. The correlation among samples is mainly exploited by transforms (e.g., 2-D DCT or wavelet). Because of the complexity issue, 2-D transform is implemented by two 1-D transforms. Such separate implementation cannot handle anisotropic correlation rather than the horizontal and vertical. Thus, approaches are proposed to efficiently exploit the anisotropic correlation among samples. One category of these approaches is to perform directional prediction before transform as H.264 intraframe coding does [5]. After removing the directional correlation, prediction residues can be assumed as isotropic again. Another category is to apply directional transforms, such as directional wavelet [6], [7] and directional DCT [8], [9], which incorporate directional information into transforms, to reduce the high-frequency energy in the transform domain. They significantly improve the coding performance on natural images with rich anisotropic correlations. However, all of these approaches are not efficient enough when compressing compound images. Taking the extreme anisotropic features of text and graphics into account, some schemes are proposed for compound image coding. The general ideas can be categorized into three groups. Image-coding-based approaches They adopt conventional image coding schemes but improve the bit allocation between text/graphics and natural image areas because the text/graphics areas are often blurred after compression [10], [11]. Thus, the quantization steps in text/graphics areas are decreased and more bits are allocated to them. For a fixed bit budget, it would correspondingly decrease bits for the coding of natural image areas. Consequently, the overall quality after compression is still not good. Layer-based approaches They adopt the mixed raster content (MRC) image model for compression [12] [17], where one compound image is decomposed into a foreground layer, a background layer, and a binary mask plane at block or image level. The mask plane indicates which layer each pixel belongs to and can be compressed by mature binary coding schemes, such as JBIG and JBIG2. The foreground and background layers are smoothed by data filling algorithms and then compressed by conventional image coding schemes. A detailed survey on the MRC schemes has been reported in [15]. It demonstrates significant gains over conventional image coding schemes. However, there are several drawbacks in the approaches. First, the performance is greatly influenced by segmentation. Block-threshold and rate-distortion optimized methods are proposed to optimize the segmentation in [16] and [17]. Second, without special processing, the holes resulted from segmentation will deteriorate the /$ IEEE

2 LAN et al.: COMPRESS COMPOUND IMAGES IN H.264/MPGE-4 AVC BY EXPLOITING SPATIAL CORRELATION 947 coding performance. For that, iterative block filling in spatial or transform domain is developed in [18] [20]. Third, separately coding text colors in the foreground layer and text shapes in the mask plane will also hurt the coding performance. A shape primitive extraction and coding (SPEC) scheme is proposed in [21], where shape primitives containing both colors and shapes are losslessly compressed by a combined shape-based and palette-based coding algorithm. Block-based approaches They first classify blocks in compound images into different types according to their spatial properties [22]. Image features, such as histogram, gradient and the number of colors, are often used for classification [23], [24]. Then different type blocks are compressed by different coding schemes to better adapt to their statistical properties [23] [25]. Considering the sparse histogram distribution of colors in text/graphics blocks, a novel method is proposed to represent a text/graphics block by several base colors and an index map. Furthermore, Ding et al. develop this method as an intracoding mode and incorporate it into H.264 [26]. Thus, the coding performance of that scheme on compressing compound images is significantly improved. The proposed scheme in this paper adopts the block-based architecture as a basis. However, our focus is how to better exploit the spatial correlation in compound images without transform. H.264 intraframe coding is taken as the benchmark to develop our scheme, where the mode-based design and the ratedistortion optimization mode selection provide an easy way to combine spatial-domain and transform-domain methods. Our main contribution is to develop a comprehensive and systematic coding scheme by fully taking the properties of compound images into account. In our scheme, two new intramodes (RSQ and BCIM) are proposed to exploit the spatial correlation in compound images. In the RSQ mode, intraprediction residues are directly scalar quantized and coded by CABAC without transform. The idea of directly compressing intraprediction residues has been reported in [27] for lossless coding of natural images. In this paper, we will study the lossy case that is suitable for compound images and how to select the quantization in spatial domain. In the BCIM mode, the conversion from an image block to several base colors and an index map is extended to different size blocks to adapt to the nonstationary properties of compound images. Moreover, the rate cost is considered in the proposed BCIM mode and the number of base colors is determined in the sense of rate-distortion optimization. Following the coding of base colors, each index in the map is coded by context-based entropy coding. With the two new modes and the RDO mode selection method, the proposed scheme significantly outperforms all existing schemes on compressing compound images; meanwhile, it keeps the comparable performance to H.264 on coding natural images. The rest of this paper is organized as follows. Section II gives a brief overview of the proposed coding scheme. The two new intracoding modes are discussed in detail in Section III. The mode structure and the mode selection are also explained in this section. The typical experimental results presented in Section IV Fig. 1. Exemplified text block: (a) amplified block; (b) luminance sample values. show the excellent performance of our scheme. Finally, we draw conclusions in Section V. II. PROPOSED COMPRESSION SCHEME We would like to first analyze the properties of compound images before introducing our scheme. Different from natural images, compound images have their own characteristics especially on the text and graphics parts. To explain it clearly, Fig. 1 shows an exemplified block with two letters P and o. First, edges in compound images between letters and background are much sharper than those in natural images. Some edges have several-pixel transition because of shadow effects, whereas the others do not have any transition. Second, geometries of edges are usually complicated and irregular. They are difficult to predict along a certain direction. Third, the block only has limited number of different sample values. For such text blocks, the intuitive feeling is that traditional transform will fail to give a compact representation in the transform domain. To verify it, we analyze the properties of text and graphics blocks in quantity by introducing two features: spectral activity measure (SAM) and spatial frequency measure (SFM). Both of them are proposed in [28]. Let us denote an image block as, and. and are the total numbers of samples in one column and one row, respectively. is the corresponding signal to in the frequency domain. SAM is a measurement of image predictability and it is defined in the frequency domain as In this paper, we use DCT instead of DFT to get because it is the transform for compression in our scheme. SAM has a dynamic range of. Lower values of SAM imply lower predictability. SFM indicates the activity level of an image. It is defined as (1) (2)

3 948 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL 2010 Fig. 2. Histograms of SAM and SFM on text/graphics and natural image blocks. and are defined as row and column frequency, respectively. They indicate the variations between two rows and between two columns. An image with high activity in the spatial domain will have a large value of SFM. Images with small SAM and large SFM are usually difficult to compress by transform because they are not predictable. With the two measurements, we analyze 3150 text/graphics blocks and 4552 natural image blocks of size The SAM and SFM are calculated on each block. The histograms of SAM and SFM on those two types of blocks are depicted in Fig. 2. The SAM of the text and graphics blocks is concentrated in the lowvalue areas and it indicates a low predictability of such blocks; the SAM of the natural image blocks is scattered over a much wider range. However, the SFM of the text/graphics blocks is scattered over high-value areas and it means high activities in the spatial domain. Meanwhile, the natural image blocks are much smoother and have a compact distribution of SFM in low-value areas. All these indicate that the text and graphics blocks are hard to compress efficiently by transform coding compared with natural images. If transform is removed from the compression of text and graphics block, what techniques can be used to exploit the spatial correlations among samples? One way is to directly compress the prediction residues by entropy coding without transform. It has been successfully adopted in H.264 interframe coding in [29] and lossless intraframe coding in [27]. However, it is not applied to lossy intraframe coding of natural images yet because the residues still have some correlations that can be further exploited by transform. However, considering the extreme sharp edges and the surrounding shadows in text and

4 LAN et al.: COMPRESS COMPOUND IMAGES IN H.264/MPGE-4 AVC BY EXPLOITING SPATIAL CORRELATION 949 Fig. 3. Block diagram of the proposed scheme. graphics blocks as depicted in Fig. 1, it should be more efficient than the transform coding. That motivates us to propose the spatial domain residual scalar quantization method to compress them. Moreover, we know that the number of colors in text and graphics parts is usually very limited compared with natural images, as shown in Fig. 1. In other words, text and graphics parts have much sparse distributions of histogram. This property is utilized in our paper by converting a text/graphics block to several base colors and an index map. This leads to a reduced symbol set representing a block. The causal neighboring indexes are treated as contexts for index coding. It is known as the proposed base colors and the index map method. How can the proposed two methods be integrated into the existing image coding techniques to efficiently compress various images? Fortunately, H.264 intraframe coding provides a flexible and efficient framework to accommodate them. The block diagram of the proposed scheme for coding compound images as well as natural images is depicted in Fig. 3. Similar to H.264, an input image is partitioned into regular and nonoverlapped blocks. There are three possible paths for coding each block. The first path is the existing method in H.264 intraframe coding. Intradirectional prediction and transform coding contribute much to the high compression efficiency on natural image parts. The second path is the proposed RSQ method. After the directional prediction, the residual is directly quantized and entropy coded. The third path is the proposed BCIM method. It does not use intraprediction. The input block is converted to a compact representation of several base colors and an index map. The above three methods can all be applied to different block sizes of 16 16, 8 8, and 4 4. The mode selection is completed by the RDO algorithm similar to that in H.264. In addition, no matter which mode is selected, the reconstructed samples will update the reference buffer as the reference for subsequent samples. III. DETAILS OF THE COMPRESSION SCHEME To adapt to the H.264 intraframe coding, the two proposed methods are developed as two intracoding modes: RSQ and BCIM. They will be discussed in detail in this section. A. RSQ Mode For text and graphics blocks containing edges of many directions as shown in Fig. 1, intraprediction along a single direction cannot completely remove the directional correlation among samples. After intraprediction, residues still preserve strong anisotropic correlation. In this case, it is not efficient to perform a transform on them. One method is to skip the transform and directly code prediction residues, which is similar to traditional pulse-code modulation (PCM). However, the question is whether the performance of PCM is better than that of a transform for text and graphics residual blocks. To answer it, we introduce the method proposed in [30], [31] to analyze the coding gain of PCM over a transform. Given the same rate, the coding gain is defined as the ratio of distortions on transform coefficients and residual samples, respectively, as is the distortion of PCM and is the distortion on transform coefficients. We assume the distortions result from the optimal quantization. Considering a stationary source, each sample will have the same variance. is then equal to is a factor that depends on the probability distribution function of signal and it is defined as is equal to is a factor that has a similar form to (5) and it depends on the probability distribution of the th transform coefficient, meanwhile corresponds to its variance. is the number of transform coefficients in a block, and it equals 64 for 8 8 size blocks. Substituting (4) (6) into (3), we have Thus, the coding gain can be obtained numerically in a statistical sense. If is larger than 1, it indicates that nontransform coding is more efficient than transform coding. To statistically investigate the coding gain of PCM over a transform on text and graphics residual blocks, we collect 8297 residual blocks of size 8 8 from text and graphics parts. Taking DCT as the transform, we get the coding gain by (7) numerically; it is 1.96 (larger (3) (4) (5) (6) (7)

5 950 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL 2010 number in that block. of and given as is the inverse quantization function (10) Fig. 4. Intraprediction in RSQ mode. (a) Eight prediction directions with direction 2 (DC mode) as an exception. (b) Example of intraprediction along direction 4. than 1). That indicates nontransform coding is more efficient than DCT transform coding on compressing text and graphics residual blocks. Thus, the residual scalar quantization method is developed to directly compress them. In the RSQ mode, eight prediction directions, as illustrated in Fig. 4(a), are designed to get better residues in the rate distortion sense. Different from H.264, the residual quantization and sample reconstruction are done pixel by pixel within a block in our scheme. The reconstructed pixel can be used for the prediction of the next pixel along the same direction. The use of intrablock reconstructed pixels for prediction is based on the assumption that the shorter the prediction distance is, the stronger the correlation will be. Taking direction 4 as an example as shown in Fig. 4(b), the pixel marked by b that is to be predicted will use the nearest reconstructed integer pixel a for prediction instead of the filtered boundary reconstructed sample M. For a given prediction direction, only the nearest reconstructed integer pixel along that direction without filtering is used for prediction. The reason is that the filtering performed on reconstructed pixels would blur sharp edges in text and graphics blocks and decrease the prediction accuracy. Thus, in directions 5, 6, 7, and 8, the reconstructed nearest integer pixel that is chosen for prediction will have a distance from the current pixel farther than a combination of reconstructed pixels. After directional intraprediction, quantization is performed on the residues. Similar to the method proposed in [29] and [32], which has been integrated into the KTA software [33], the quantization in our coding scheme is a sort of dead-zone plus uniform threshold quantization. It is described by a function of residual signal as is the quantization step and is the adaptive rounding offset value. Parameter controls the width of the dead zone, which is equal to. is used to get the largest integer that is not greater than. In order to maintain an equal expected value before quantization and after inverse quantization, the rounding offset is adaptively updated within a block as (9) is a positive weighting factor and is initialized to zero at the beginning of the quantization of each block. is the pixel (8) is used to get the nearest integer of. is a parameter related to the nonzero region offset. Note that if is zero, is zero too. To balance the rate with distortion, parameters, and all change with the quantization parameter in H.264. In the entropy coding part, quantized residues are compressed in a similar manner to the coding of DCT transform coefficients. However, they use separate context models considering the difference of their statistical distributions. Without the energy compacted to the left-up corner, the quantized samples are scanned line by line but not in a zigzag order. In addition, we find that nonzero quantized residues usually have the same amplitude within a text/graphics block. They are also bit-consuming for their large amplitudes. Therefore, we will use a flag to indicate whether the current quantized residual is the same as the former one or not. If it is the same, only the flag needs to be coded for that quantized residual. B. BCIM Mode Having limited colors but complicated shapes is another property of the text and graphics parts on compound images. Such text/graphics blocks can be expressed concisely by several base colors together with an index map. It is somewhat like color quantization that is a process of choosing a representative set of colors to approximate all the colors of an image [34]. In the BCIM mode, we first get the base colors of a block by using a clustering algorithm. All the base colors constitute a base color table. Then, each sample in the block will be quantized to its nearest base color. The index map indicates which base color is used by each sample. Different from color quantization, each text/graphics block, but not an entire image, has its own base colors and an index map for representation in our scheme. Thus, it is content adaptive for each block. In addition, since the base color number of a block is small, fewer bits are required to represent each mapped index. Let us take the luminance plane of a text/graphics block as an example, with each sample expressed by 8 bits. If four base colors are selected to approximate colors of that block, only two bits are required to represent each sample s index without compression. Thus, the total bits required to represent the block can be reduced from to, saving almost 3/4 of the bits. Taking chrominance components of a block into account, the base colors and an index map representation can also be applied on a vector block consisting of three color components such as that in the color space. In this case, each base color is a vector but only one index map is required for that vector block. For convenience, we will refer to the representation on a vector block as Dim3 and that only on luminance with chrominance components coded by conventional method as Dim1. The Dim3 case is efficient for blocks having high correlation among three color components and the Dim1 case is efficient for those with low correlation among them.

6 LAN et al.: COMPRESS COMPOUND IMAGES IN H.264/MPGE-4 AVC BY EXPLOITING SPATIAL CORRELATION 951 In this mode, there are several key problems to solve. First, base colors should be chosen to well represent a block. Second, after base colors are selected, there should be a well designed entropy coding method to code the base colors and index map. If the coding method has been decided and is good enough, the problem is then simplified as the searching of the best base colors for that block. Base colors involve two aspects: Base color number and their values,. is a scalar in the Dim1 case and a vector in Dim3. In general, the larger the base color number is, the less the distortion of the approximation will be; but more bits will be spent on the coding of base colors and the index map because of the increasing of symbol set and base colors. Thus, a trade-off between distortion and rate should be given. Taking the minimization of the rate-distortion cost (Lagrangian cost) as the optimization objective, we can formulate it as a problem of getting the optimal base color number,as (11) An upper limit of base color number is set to restrict the symbol set size of indexes globally. It is four in our scheme considering the characteristics of text/graphics blocks. Thus, has a dynamic range of. is the Lagrangian multiplier the same as that in H.264. It is related to the quantization parameter in H.264. is the distortion of the representation using base colors. Correspondingly, represents its rate cost. To achieve the optimization in (11), should be optimal first at the given base color number. An image block of pixels corresponds to a set of points,. is a scalar in Dim1 and a vector, indicated by in the Dim3 case. Given the target base color number, the set is partitioned into subsets, (, and ). The colors in subset are then approximated by the base color. Targeting at minimizing the total distortion of the set, the determination of base color values on representing a block can be formulated as the searching of the optimal partition and their representative centers as (12) Clustering methods such as K-mean, LBG-VQ, and TSVQ (tree-structure vector quantization) can be adopted to solve (12). However, K-mean and LBG-VQ algorithms are sensitive to initial cluster centers and easy to get into local optimization. For the iterative updating, they are also time-consuming. In addition, to get with ranging from 1 to in (11), clustering should be done times. However, this will greatly increase the computational burden. One way to solve that is to set an adaptive termination condition and get the base color number adaptively. Ding et al. incorporate a distortion threshold into TSVQ to get the number of base colors by clustering only once [26] but they have not considered the rate cost. However, it will hurt the performance leaving the rate out of consideration or when the base colors are not representative enough. In this paper, we adopt dynamic programming to not only achieve the global optimal partition but also avoid intensive computation caused by clustering of multiple times and the iterative updating. [34], [35] give detailed discussions on dynamic programming for quantization. On one hand, as a clustering method, it is close to global optimization at the distortion criteria, especially for the Dim1 case. On the other hand, by calculating the best partition with base color number set to, the partition results with base color numbers ranging from 1 to are also available, facilitating the evaluation of rate costs for different base color numbers. This is because they are just the sub problems of the partition. For the Dim1 case, the complexity is just [34], where indicates the number of different colors in a block. Usually, for a text/graphics block. If dynamic programming is adopted for partition, the original data of a block should be sorted first. In the Dim1 case, it is easy to compare the sample values. In the Dim3 case, the comparison is done on the projected scalar values, which are attained by projecting input vectors,, to the principle axis of the data set as (13) is the principal eigenvector of the covariance matrix of the data set. For the projection, multiple vectors may be projected to the same scalar value. Then vectors having the same projected scalar value are replaced by their averages. With points of the same value recorded by their value and count, the points of set evolve to points of the set,. The count of will be indicated by. After the sorting of,wehave for the Dim1 case and for the Dim3 case. Then, dynamic partition is performed on the sorted data of the set. Here,. In our scheme, the partitioning of data set to clusters denoted by is set as the target problem. Correspondingly,,, as its sub problems, can be solved once is figured out. It can be explained by the fact that the partition of the first points of set must be optimal in the partition. This property ensures its high computational efficiency. In general, the partition of subset to clusters can be denoted by, named partition, where,. Let denote the th cutting point (last cutting point) in. Then the principle of optimality can be denoted as (14) is the distortion of the partition for data set. is the distortion of the set and equal to (15) is the representative point of set and represented by their mean here. can then be determined by a linear

7 952 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL 2010 Fig. 6. Structure of modes in the proposed scheme. Fig. 5. Illustration of base colors and the index map coding: (a) the text block; (b) luminance sample values; (c) base color table with equal to four; (d) the index map; (e) the defined neighboring indexes as contexts of the current index. search by (14) provided that,, is known. Once is determined, becomes known as (16) That is how the optimal partition is constructed by the bottom-up dynamic programming, so are and its sub problems. Once the partition and representative base colors are available, the rate in (11) can be obtained by entropy coding. The base color number, base color values, and the index map will be encoded in turn. To better explain the coding procedure, we will take the Dim1 case on a block as an example and illustrate it in Fig. 5. The text block to be coded is depicted in (a), with (b) indicating the values of luminance samples. After the dynamic programming clustering, the partitions and corresponding representative base colors are obtained. One case with equal to four is shown in (c). According to that base color table, we choose a nearest base color for each sample of the block. This can be considered as the quantization of original colors to base colors. With quantized values replaced by their indexes in the color table, an index map is gotten and shown in (d). Afterwards, each index can be compressed by context-based entropy coding with four neighboring indexes illustrated in (e) as its contexts. However, if is 4, which means four possible symbols for each neighboring index, there are about kinds of contexts in total for the coding index considering its four neighboring positions. To reduce the memory and calculation requirements for so many kinds of contexts, we adopt the technology proposed in [26] to decrease the number of contexts. By mapping contexts having the same structure to one pattern, they can be reduced to 15 basic patterns as. If the boundary of the index map is considered, another nine context patterns need to be added and 24 context patterns are required in total. The letters in the four positions of each pattern denote indexes in the left, left-up, up, and right-up, respectively, identifying their similarities and differences. Taking the index forms of ({3111}) and ({2000}) with indicating the current index as examples, we can see that they all belong to the context pattern of, which means the indexes in the left-up, up, and right-up are the same but they are different from the left one. Before being sent to the context-entropy encoder, the current index should be first remapped to a new index related to its context pattern, adapting to the reduced context patterns. For simplicity of description, letters also indicate the index values for a concrete context pattern. The remapping rule of to is described as if if if (17) if. If is the same as some index in the neighborhood, will be directly encoded under its context pattern. However, if is not the same with any neighboring index, it will be supposed to be the same with letters out of that pattern and remapped. For of the pattern,if is 1, which is the same with, it will be remapped to 0; if is 3 and the same with, it will be remapped to 1; but when is 0 that is different from all the neighboring indexes, it will be remapped to 2 considering that the first letter out of that pattern is ;if is 2, larger than 0, it

8 LAN et al.: COMPRESS COMPOUND IMAGES IN H.264/MPGE-4 AVC BY EXPLOITING SPATIAL CORRELATION 953 Fig. 7. Six testing images/documents used in our experiments: (a) Web Page; (b) Various; (c) Slide; (d) Files; (e) Natural Image; (f) Compound1. will be remapped to 3 corresponding to letter. The remapped index is then entropy coded under its context pattern. After the coding, the actual rate cost of is then found. We get the rate cost for each,, fairly straightforwardly. They will be used to (11) to decide the optimal base color number. Furthermore, when is known, base colors and the index map will be coded in the same method and put into the final bit stream. Instead of using the binarization schemes in H.264, such as Exp-Golomb, the base color number, values and remapped indexes are all simply expressed by their binary format. By exploring correlation among bits when coding each symbol by binary arithmetic as proposed in [36], even better performance than multisymbol arithmetic codec is achieved. Besides, to save bits on coding the values of base colors, we cut the size of their symbol set from 256 (0 255) to 33 (0,8,16 248,255) by (18) and represent the base color values before and after the operation respectively. It is beneficial to the saving of bits since it can decrease the size of the symbol set and concentrate their distributions. However, this will introduce more distortion because of the precision loss. In order to balance them, this processing is only performed on blocks. C. Mode Selection and Mode Structure Each mode has its advantages at dealing with blocks of different features. One question that arises here is how to fully take advantage of each mode in the proposed scheme. It can be solved by the RDO algorithm that has been adopted by H.264. The best mode with the best block partition having the minimum rate-distortion cost will be selected to compress the current block. All modes in the proposed scheme can be categorized into two types: spatial domain (SD) and DCT frequency domain (FD). There is a flag in the bit stream to distinguish them. The organized structure of all the modes is shown in Fig. 6. FD indicates the original intramodes in H.264, where the compression is performed in the DCT domain. SD indicates our proposed RSQ and BCIM modes. To adapt to the local nonstationary property ofcompound images, the spatial domain (SD) modes are applied to

9 954 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL 2010 Fig. 8. Experimental results on rate distortion curves , 8 8, and 4 4 block sizes as those DCT frequency domain (FD) modes. The best mode in the spatial domain is compared with the best mode in the DCT frequency domain for the same size block in the rate-distortion sense. The better one is selected. For 8 8 and 4 4 size blocks, 8 prediction directions are designed for the RSQ mode as shown in Fig. 4(a). The DC mode in the spatial domain is replaced by the BCIM mode in the stream syntax. For those small size blocks, the BCIM mode is only performed on the luminance component in our scheme for simplicity. In blocks, the BCIM mode takes the place of the DC intramode. The better one between Dim3 and Dim1 is selected based on the rate-distortion criteria. For the Dim3 case, when the input image format is YUV 4:2:0, interpolation will be performed on UV color planes to get the same size color planes to facilitate the 3-D clustering. To be consistent with the block size of the DCT transform, the selection between direct quantization and transform coding on the residual block, which is obtained by the prediction with three possible directions, is performed on 4 4 sub blocks. Fig. 9. Visual quality comparison of part ( ) of decoded Various at about 0.37 bpp: (a) part of the original image; (b) decoded by JPEG2000, PSNR Y db; (c) decoded by original H.264, PSNR Y db; (d) decoded by Proposed PSNR Y db.

10 LAN et al.: COMPRESS COMPOUND IMAGES IN H.264/MPGE-4 AVC BY EXPLOITING SPATIAL CORRELATION 955 Fig. 10. Comparisons of Ding s approach, the scheme with the BCIM mode and the proposed scheme. IV. EXPERIMENTAL RESULTS We integrate the proposed methods into H.264/MPEG-4 AVC reference software JM14.0 [37] to evaluate their performances. Since the de-blocking filter in H.264 often blurs decoded compound images, it is disabled in our experiments. As shown in Fig. 7, five captured screen images and a compound document are used as test images: (a) is a web page, (b) is a snapshot of a typical screen scene, (c) is a slide, (d) is a combination of files, and (e) is a natural image. Their size is (f) is a compound document with size of Text and graphics are different in different compound images or in different regions of the same image. Some text and graphics blocks have no transition, whereas others have rich shadows. Some symbols on them are small and only occupy a or smaller size block but others may take up several blocks. Experimental results of the PSNR on the luminance signal versus bit-rate curves are depicted in Fig. 8. is set 47 to 7 with a step of 5. The results of H.264 integrated with only the proposed RSQ mode are marked by RSQ and those integrated with only the BCIM mode are marked by BCIM. The curves of H.264 with the RSQ and BCIM modes together are our proposed scheme, marked by Proposed. Another two schemes are selected for comparison: JPEG2000 and H.264 intraframe coding. For JPEG2000, we only compress the luminance plane by all bits with chrominance planes uncompressed. For the same bit budget, the results should be better than those all three color planes are compressed. As depicted in Fig. 8, the BCIM mode has a better performance at low and middle bit rates and the RSQ mode has a better performance at middle and high bit rates for the first four compound images. By selecting the RSQ mode or the BCIM mode by RDO, the proposed scheme presents good performance at all bit rates. Compared with H.264 intraframe coding and JPEG2000, more than 10-dB gain at most bit rates is achieved in Web Page. Similar results are observed in the other compound images. For the fifth image, which is a natural image, the proposed scheme has a comparable performance to H.264. For Compound1, about 10 db is obtained with RSQ mode or BCIM mode integrated, similar to the performance gain in the layer-based scheme in [14]. TABLE I PERCENTAGES OF THE RSQ AND BCIM MODES AT DIFFERENT RATES The percentages of the RSQ and BCIM modes in all testing images are given in Table I. They are obtained at three different rates: 0.4, 0.8, and 1.2 bpp. One can observe that the percentages of the RSQ mode increase with rate increasing, whereas the percentages of the BCIM mode decrease. But the phenomenon is not clear in Natural Image. It cannot be observed in Compound1 because of many binary texts. To evaluate the visual quality, parts of the magnified reconstructed images Various are shown in Fig. 9, at about 0.37 bpp. The images decoded by JPEG2000 and H.264 [(b) and (c), respectively] show severe blur and ring artifacts on the text and graphics parts. With the proposed two modes, the perceptual quality is greatly improved, close to the original image. The proposed scheme is also compared with Ding s approach [26]. Although the BCIM mode has the similar idea as that, the technology in the BCIM mode is improved significantly. In Ding s approach, the rate cost is not considered and the base colors are directly selected by tree-structure vector quantization; it cannot achieve optimized rate-distortion performance. In BCIM mode, the rate cost is considered and the base colors are selected in the sense of RDO with clustering method of dynamic programming. Moreover, the BCIM mode is adaptive to different block sizes and component combinations. Experimental results of Ding s approach, the scheme with BCIM mode only and the proposed scheme are shown in Fig. 10. It demonstrates

11 956 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 4, APRIL 2010 the BCIM mode outperforms Ding s approach with a 1 4dB gain at low and middle bit rates and more than 3 db gain at high bit rates. Furthermore, the integration of the RSQ mode enables the proposed scheme to achieve a much higher performance at middle and high bit rates. Finally, the complexity of the proposed scheme is discussed. Since the transform is skipped in the proposed RSQ and BCIM modes, the decoding complexity of the proposed modes is lower than that of the H.264 intramodes. At the encoder, the complexity of the RSQ mode is low too for the same reason. But the proposed BCIM mode needs to select base colors by clustering and its complexity is a little higher than that of H.264 intramodes. In the mode selection, the rate-distortion costs of all modes are calculated and then the mode with the minimum cost is selected for coding. Because of the proposed modes, the mode selection needs to check double the choices than it does in H.264 intracoding. It can, however, be significantly decreased by fast mode selection in future. V. CONCLUSION We propose a compound image compression scheme by fully exploring spatial domain properties of compound images. Two spatial domain modes, called residual scalar quantization (RSQ) and base colors and the index map (BCIM), are integrated into H.264 intraframe coding; they achieve significant gains at all bit rates. The RSQ mode can cope with complicated text and graphics blocks in a simple way, which is just to quantize the intraprediction residues without a transform. The BCIM mode provides the ability to have a high performance improvement for the efficient representation form of the text/graphics block. They are both able to preserve the spatial structures of the text and graphics parts, important to visual quality. A rate distortion optimal method, similar to that in H.264, simplifies the mode selection and avoids the performance loss imported by the inaccurateness of segmentation. In short, this paper points out a good way to extend H.264 to compress compound images with simple technical extensions and to moderate complexity increasing because of addition mode selections. ACKNOWLEDGMENT The authors would like to thank Dr. Y. Lu, Dr. B. Qin, and W. Ding for their valuable discussions. REFERENCES [1] Cloud Computing [Online]. Available: Cloud_computing [2] W. P. Pennebaker and J. L. Mitchell, JPEG: Still Image Compression Standard. New York: Van Nostrand Reinhold, [3] D. Taubman and M. Marcellin, JPEG2000: Image Compression Fundamentals, Standards, and Practice. Norwell, MA: Kluwer, [4] Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264/ISO/IEC AVC),, 2003, JVT of ISO/IEC MPEG and ITU-T, JVTG-050. [5] T. Wiegand, G. J. Sullivan, G. Bjntegaard, and A. Luthra, Overview of the H.264/AVC video coding standard, IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp , Jul [6] W. Ding, F. Wu, X. Wu, S. Li, and H. Li, Adaptive directional lifting-based wavelet transform for image coding, IEEE Trans. Image Process., vol. 16, no. 2, pp , Feb [7] C. L. Chang and B. Girod, Direction-adaptive discrete wavelet transform for image compression, IEEE Trans. Image Process., vol. 16, no. 5, pp , May [8] H. Xu, J. Xu, and F. Wu, Lifting-based directional DCT-like transform for image coding, IEEE Trans. Circuits Syst. Video Technol., pp , Oct [9] B. Zeng and J. J. Fu, Directional discrete cosine transforms A new framework for image coding, IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 3, pp , Oct [10] K. Konstantinides and D. Tretter, A method for variable quantization in JPEG for improved text quality in compound documents, in Proc. Int. Conf. Image Processing, Chicago, IL, Oct. 1998, vol. 2, pp [11] A. Zaghetto and R. L. de Queiroz, Segmentation-driven compound document coding based on H.264/AVC-INTRA, IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp , Jul [12] Mixed Raster Content (MRC) ITU-T Study Group 8, Draft Recommendation T.44, [13] L. Bottou, P. Haffner, P. Howard, P. Simard, Y. Bengio, and Y. LeCun, High quality document image compression using DjVu, J. Electron. Imag., vol. 7, no. 3, pp , Jul [14] A. Zaghetto and R. L. de Queiroz, MRC compression of compound documents using H.264/AVC-I, presented at the XXV Simpósio Brasileiro de Telecomunicações, Recife, Brazil, Sep [15] R. L. de Queiroz, Compressing compound documents, in The Document and Image Compression Handbook, M. Barni, Ed. New York: Marcel-Dekker, [16] R. L. de Queiroz, Z. Fan, and T. Tran, Optimizing block-thresholding segmentation for multilayer compression of compound images, IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp , Oct [17] H. Cheng and C. A. Bouman, Document compression using rate-distortion optimized segmentation, J. Electron. Imag., vol. 10, pp , Apr [18] R. L. de Queiroz, On data-filling algorithms for MRC layers, in Proc. Int. Conf. Image Processing, Vancouver, BC, Canada, Sep. 2000, vol. 2, pp [19] G. Lakhani and R. Subedi, Optimal filling of FG/BG layers of compound document images, in Proc. Int. Conf. Image Processing, Atlanta, GA, Oct. 2006, pp [20] R. L. de Queiroz, Pre-processing for MRC layers of scanned images, in Proc. Int. Conf. Image Processing, Atlanta, GA, Oct. 2006, pp [21] T. Lin and P. Hao, Compound image compression for real-time computer screen image transmission, IEEE Trans. Image Process., vol. 14, no. 7, pp , Jul [22] A. Said and A. Drukarev, Simplified segmentation for compound image compression, in Proc. Int. Conf. Image Processing, Kobe, Japan, 1999, vol. 1, pp [23] X. Li and S. Lei, Block-based segmentation and adaptive coding for visually lossless compression of scanned documents, in Proc. Int. Conf. Image Processing, Oct. 1999, vol. I, pp [24] W. Ding, D. Liu, Y. He, and F. Wu, Block-based fast compression for compound images, in Proc. Int. Conf. Multimedia and Expo (ICME), 2006, pp [25] D. Mukherjee, C. Chrysafis, and A. Said, Low complexity guaranteed fit compound document compression, in Proc. I, Sep. 2002, vol. 1, pp [26] W. Ding, Y. Lu, and F. Wu, Enable efficient compound image compression in H.264/AVC intra coding, in Proc. Int. Conf. Image Processing, Oct. 2007, vol. 2, pp [27] Y. Lee, K. Han, and G. Sullivan, Improved lossless intra coding for H.264/MPEG-4 AVC, IEEE Trans. Image Process., vol. 15, pp , [28] M. Mrak, S. Grgic, and M. Grgic, Picture quality measures in compression systems, in Proc. Int. IEEE Region 8 Conf. Computer as a Tool EUROCON, Sep. 2003, vol. 1, pp [29] M. Narroschke and J. Ostermann, Adaptive Prediction Error Coding in Spatial and Frequency Domain for H.264/AVC ITU-T Q.6/GS16,doc. VCEG-AB06. Bangkok, Thailand, Jan [30] Y. Wang, J. Ostermann, and Y. Zhang, Digital Video Processing and Communications. Upper Saddle River, NJ: Prentice-Hall, [31] N. Jayant, Signal Compression, Coding of Speech, Audio, Image and Video. Singapore: World Scientific, 1997.

12 LAN et al.: COMPRESS COMPOUND IMAGES IN H.264/MPGE-4 AVC BY EXPLOITING SPATIAL CORRELATION 957 [32] G. J. Sullivan, Adaptive quantization encoding technique using an equal expected-value rule, presented at the JVT Meeting Contribution JVT-N011, Hong Kong, Jan [33] Key Technical Area Software of the ITU-T, 2008 [Online]. Available: version jm11.0kta1.8 [34] X. Wu, Color quantization by dynamic programming and principal analysis, ACM Trans. Graph., vol. 11, no. 4, pp , Oct [35] J. D. Bruce, Optimum Quantization, D.Sc. thesis, Massachusetts Inst. Technol., Cambridge, May 14, [36] C. S. Lee and H. W. Park, Multisymbol data compression using a binary arithmetic coder, Electron. Lett., vol. 38, no. 3, pp , Jan [37] Reference Software of H.264/AVC 2008 [Online]. Available: iphome.hhi.de/suehring/tml/download/old_jm/, version jm14.0 Cuiling Lan received the B.Sc. degree in electrical engineering from Xidian University, China, in She is currently pursuing the Ph.D. degree at the College of Electrical Engineering, Xidian University. Currently, her research interests focus on image and video compression. Guangming Shi (M 07) received the B.S. degree in electronic engineering in 1985, the M.S. degree in automation in 1988, and the Ph.D. degree in intelligent signal processing in 2002, all from the Xidian University, China. His research interests include image and video compression, compressed sensing theory and its application on UWB, signal sampling and processing theory, inverse problems in imaging, and optimization. Feng Wu (M 99 SM 06) received the B.S. degree in electrical engineering from Xidian University, China, in He received the M.S. and Ph.D. degrees in computer science from the Harbin Institute of Technology, China, in 1996 and 1999, respectively. He joined in Microsoft Research Asia, formerly named Microsoft Research China, as an Associate Researcher in He has been a researcher with Microsoft Research Asia since 2001 and is now a Senior Researcher/Research Manager. His research interests include image and video representation, media compression and communication, as well as computer vision and graphics. He has authored or coauthored over 150 papers published in various IEEE journals and some other International conferences and forums, e.g., MOBICOM, INFOCOM, ICIP, ICME, and ISCAS. Dr. Wu received (as a coauthor) the best paper award at IEEE CSVT 2009, PCM 2008, and SPIE VCIP 2007.

ISSN: An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding

ISSN: An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding Ali Mohsin Kaittan*1 President of the Association of scientific research and development in Iraq Abstract

More information

USING H.264/AVC-INTRA FOR DCT BASED SEGMENTATION DRIVEN COMPOUND IMAGE COMPRESSION

USING H.264/AVC-INTRA FOR DCT BASED SEGMENTATION DRIVEN COMPOUND IMAGE COMPRESSION ISS: 0976-9102(OLIE) ICTACT JOURAL O IMAGE AD VIDEO PROCESSIG, AUGUST 2011, VOLUME: 02, ISSUE: 01 USIG H.264/AVC-ITRA FOR DCT BASED SEGMETATIO DRIVE COMPOUD IMAGE COMPRESSIO S. Ebenezer Juliet 1, V. Sadasivam

More information

Histogram Based Block Classification Scheme of Compound Images: A Hybrid Extension

Histogram Based Block Classification Scheme of Compound Images: A Hybrid Extension Histogram Based Block Classification Scheme of Compound Images: A Hybrid Extension Professor S Kumar Department of Computer Science and Engineering JIS College of Engineering, Kolkata, India Abstract The

More information

Enhanced Hybrid Compound Image Compression Algorithm Combining Block and Layer-based Segmentation

Enhanced Hybrid Compound Image Compression Algorithm Combining Block and Layer-based Segmentation Enhanced Hybrid Compound Image Compression Algorithm Combining Block and Layer-based Segmentation D. Maheswari 1, Dr. V.Radha 2 1 Department of Computer Science, Avinashilingam Deemed University for Women,

More information

Mixed Raster Content for Compound Image Compression

Mixed Raster Content for Compound Image Compression Mixed Raster Content for Compound Image Compression Final Project Presentation EE-5359 Spring 2009 Submitted to: Dr. K.R. Rao Submitted by: Pritesh Shah (1000555858) MOTIVATION In today s world it is impossible

More information

Performance Comparison between DWT-based and DCT-based Encoders

Performance Comparison between DWT-based and DCT-based Encoders , pp.83-87 http://dx.doi.org/10.14257/astl.2014.75.19 Performance Comparison between DWT-based and DCT-based Encoders Xin Lu 1 and Xuesong Jin 2 * 1 School of Electronics and Information Engineering, Harbin

More information

Block-Matching based image compression

Block-Matching based image compression IEEE Ninth International Conference on Computer and Information Technology Block-Matching based image compression Yun-Xia Liu, Yang Yang School of Information Science and Engineering, Shandong University,

More information

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS Xie Li and Wenjun Zhang Institute of Image Communication and Information Processing, Shanghai Jiaotong

More information

Mixed Raster Content for Compound Image Compression

Mixed Raster Content for Compound Image Compression Mixed Raster Content for Compound Image Compression SPRING 2009 PROJECT REPORT EE-5359 MULTIMEDIA PROCESSING Submitted to: Dr. K. R. Rao Submitted by: (1000555858) 1 P a g e INDEX Sr. # TOPIC page (1)

More information

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation 2009 Third International Conference on Multimedia and Ubiquitous Engineering A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation Yuan Li, Ning Han, Chen Chen Department of Automation,

More information

Reduced Frame Quantization in Video Coding

Reduced Frame Quantization in Video Coding Reduced Frame Quantization in Video Coding Tuukka Toivonen and Janne Heikkilä Machine Vision Group Infotech Oulu and Department of Electrical and Information Engineering P. O. Box 500, FIN-900 University

More information

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE 5359 Gaurav Hansda 1000721849 gaurav.hansda@mavs.uta.edu Outline Introduction to H.264 Current algorithms for

More information

CONTENT ADAPTIVE SCREEN IMAGE SCALING

CONTENT ADAPTIVE SCREEN IMAGE SCALING CONTENT ADAPTIVE SCREEN IMAGE SCALING Yao Zhai (*), Qifei Wang, Yan Lu, Shipeng Li University of Science and Technology of China, Hefei, Anhui, 37, China Microsoft Research, Beijing, 8, China ABSTRACT

More information

Multilayer Document Compression Algorithm

Multilayer Document Compression Algorithm Multilayer Document Compression Algorithm Hui Cheng and Charles A. Bouman School of Electrical and Computer Engineering Purdue University West Lafayette, IN 47907-1285 {hui, bouman}@ ecn.purdue.edu Abstract

More information

An Efficient Mode Selection Algorithm for H.264

An Efficient Mode Selection Algorithm for H.264 An Efficient Mode Selection Algorithm for H.64 Lu Lu 1, Wenhan Wu, and Zhou Wei 3 1 South China University of Technology, Institute of Computer Science, Guangzhou 510640, China lul@scut.edu.cn South China

More information

CMPT 365 Multimedia Systems. Media Compression - Image

CMPT 365 Multimedia Systems. Media Compression - Image CMPT 365 Multimedia Systems Media Compression - Image Spring 2017 Edited from slides by Dr. Jiangchuan Liu CMPT365 Multimedia Systems 1 Facts about JPEG JPEG - Joint Photographic Experts Group International

More information

Complexity Reduced Mode Selection of H.264/AVC Intra Coding

Complexity Reduced Mode Selection of H.264/AVC Intra Coding Complexity Reduced Mode Selection of H.264/AVC Intra Coding Mohammed Golam Sarwer 1,2, Lai-Man Po 1, Jonathan Wu 2 1 Department of Electronic Engineering City University of Hong Kong Kowloon, Hong Kong

More information

PERFORMANCE ANALYSIS OF INTEGER DCT OF DIFFERENT BLOCK SIZES USED IN H.264, AVS CHINA AND WMV9.

PERFORMANCE ANALYSIS OF INTEGER DCT OF DIFFERENT BLOCK SIZES USED IN H.264, AVS CHINA AND WMV9. EE 5359: MULTIMEDIA PROCESSING PROJECT PERFORMANCE ANALYSIS OF INTEGER DCT OF DIFFERENT BLOCK SIZES USED IN H.264, AVS CHINA AND WMV9. Guided by Dr. K.R. Rao Presented by: Suvinda Mudigere Srikantaiah

More information

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

Reducing/eliminating visual artifacts in HEVC by the deblocking filter. 1 Reducing/eliminating visual artifacts in HEVC by the deblocking filter. EE5359 Multimedia Processing Project Proposal Spring 2014 The University of Texas at Arlington Department of Electrical Engineering

More information

Compression of RADARSAT Data with Block Adaptive Wavelets Abstract: 1. Introduction

Compression of RADARSAT Data with Block Adaptive Wavelets Abstract: 1. Introduction Compression of RADARSAT Data with Block Adaptive Wavelets Ian Cumming and Jing Wang Department of Electrical and Computer Engineering The University of British Columbia 2356 Main Mall, Vancouver, BC, Canada

More information

Implementation and analysis of Directional DCT in H.264

Implementation and analysis of Directional DCT in H.264 Implementation and analysis of Directional DCT in H.264 EE 5359 Multimedia Processing Guidance: Dr K R Rao Priyadarshini Anjanappa UTA ID: 1000730236 priyadarshini.anjanappa@mavs.uta.edu Introduction A

More information

signal-to-noise ratio (PSNR), 2

signal-to-noise ratio (PSNR), 2 u m " The Integration in Optics, Mechanics, and Electronics of Digital Versatile Disc Systems (1/3) ---(IV) Digital Video and Audio Signal Processing ƒf NSC87-2218-E-009-036 86 8 1 --- 87 7 31 p m o This

More information

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Yongying Gao and Hayder Radha Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48823 email:

More information

Video Coding Using Spatially Varying Transform

Video Coding Using Spatially Varying Transform Video Coding Using Spatially Varying Transform Cixun Zhang 1, Kemal Ugur 2, Jani Lainema 2, and Moncef Gabbouj 1 1 Tampere University of Technology, Tampere, Finland {cixun.zhang,moncef.gabbouj}@tut.fi

More information

Modified SPIHT Image Coder For Wireless Communication

Modified SPIHT Image Coder For Wireless Communication Modified SPIHT Image Coder For Wireless Communication M. B. I. REAZ, M. AKTER, F. MOHD-YASIN Faculty of Engineering Multimedia University 63100 Cyberjaya, Selangor Malaysia Abstract: - The Set Partitioning

More information

A NOVEL SCANNING SCHEME FOR DIRECTIONAL SPATIAL PREDICTION OF AVS INTRA CODING

A NOVEL SCANNING SCHEME FOR DIRECTIONAL SPATIAL PREDICTION OF AVS INTRA CODING A NOVEL SCANNING SCHEME FOR DIRECTIONAL SPATIAL PREDICTION OF AVS INTRA CODING Md. Salah Uddin Yusuf 1, Mohiuddin Ahmad 2 Assistant Professor, Dept. of EEE, Khulna University of Engineering & Technology

More information

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

2014 Summer School on MPEG/VCEG Video. Video Coding Concept 2014 Summer School on MPEG/VCEG Video 1 Video Coding Concept Outline 2 Introduction Capture and representation of digital video Fundamentals of video coding Summary Outline 3 Introduction Capture and representation

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Discrete Cosine Transform Fernando Pereira The objective of this lab session about the Discrete Cosine Transform (DCT) is to get the students familiar with

More information

An Intraframe Coding by Modified DCT Transform using H.264 Standard

An Intraframe Coding by Modified DCT Transform using H.264 Standard An Intraframe Coding by Modified DCT Transform using H.264 Standard C.Jilbha Starling Dept of ECE C.S.I Institute of technology Thovalai,India D.Minola Davids C. Dept of ECE. C.S.I Institute of technology

More information

Performance analysis of Integer DCT of different block sizes.

Performance analysis of Integer DCT of different block sizes. Performance analysis of Integer DCT of different block sizes. Aim: To investigate performance analysis of integer DCT of different block sizes. Abstract: Discrete cosine transform (DCT) has been serving

More information

NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC. Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho

NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC. Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho Gwangju Institute of Science and Technology (GIST) 261 Cheomdan-gwagiro, Buk-gu, Gwangju, 500-712,

More information

Learning based face hallucination techniques: A survey

Learning based face hallucination techniques: A survey Vol. 3 (2014-15) pp. 37-45. : A survey Premitha Premnath K Department of Computer Science & Engineering Vidya Academy of Science & Technology Thrissur - 680501, Kerala, India (email: premithakpnath@gmail.com)

More information

JPEG IMAGE CODING WITH ADAPTIVE QUANTIZATION

JPEG IMAGE CODING WITH ADAPTIVE QUANTIZATION JPEG IMAGE CODING WITH ADAPTIVE QUANTIZATION Julio Pons 1, Miguel Mateo 1, Josep Prades 2, Román Garcia 1 Universidad Politécnica de Valencia Spain 1 {jpons,mimateo,roman}@disca.upv.es 2 jprades@dcom.upv.es

More information

Optimizing the Deblocking Algorithm for. H.264 Decoder Implementation

Optimizing the Deblocking Algorithm for. H.264 Decoder Implementation Optimizing the Deblocking Algorithm for H.264 Decoder Implementation Ken Kin-Hung Lam Abstract In the emerging H.264 video coding standard, a deblocking/loop filter is required for improving the visual

More information

Wireless Communication

Wireless Communication Wireless Communication Systems @CS.NCTU Lecture 6: Image Instructor: Kate Ching-Ju Lin ( 林靖茹 ) Chap. 9 of Fundamentals of Multimedia Some reference from http://media.ee.ntu.edu.tw/courses/dvt/15f/ 1 Outline

More information

Video compression with 1-D directional transforms in H.264/AVC

Video compression with 1-D directional transforms in H.264/AVC Video compression with 1-D directional transforms in H.264/AVC The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation Kamisli, Fatih,

More information

IBM Research Report. Inter Mode Selection for H.264/AVC Using Time-Efficient Learning-Theoretic Algorithms

IBM Research Report. Inter Mode Selection for H.264/AVC Using Time-Efficient Learning-Theoretic Algorithms RC24748 (W0902-063) February 12, 2009 Electrical Engineering IBM Research Report Inter Mode Selection for H.264/AVC Using Time-Efficient Learning-Theoretic Algorithms Yuri Vatis Institut für Informationsverarbeitung

More information

1740 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 7, JULY /$ IEEE

1740 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 7, JULY /$ IEEE 1740 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 7, JULY 2010 Direction-Adaptive Partitioned Block Transform for Color Image Coding Chuo-Ling Chang, Member, IEEE, Mina Makar, Student Member, IEEE,

More information

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 7, NO. 2, APRIL 1997 429 Express Letters A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation Jianhua Lu and

More information

FOR compressed video, due to motion prediction and

FOR compressed video, due to motion prediction and 1390 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 24, NO. 8, AUGUST 2014 Multiple Description Video Coding Based on Human Visual System Characteristics Huihui Bai, Weisi Lin, Senior

More information

EE Low Complexity H.264 encoder for mobile applications

EE Low Complexity H.264 encoder for mobile applications EE 5359 Low Complexity H.264 encoder for mobile applications Thejaswini Purushotham Student I.D.: 1000-616 811 Date: February 18,2010 Objective The objective of the project is to implement a low-complexity

More information

A COMPARISON OF CABAC THROUGHPUT FOR HEVC/H.265 VS. AVC/H.264. Massachusetts Institute of Technology Texas Instruments

A COMPARISON OF CABAC THROUGHPUT FOR HEVC/H.265 VS. AVC/H.264. Massachusetts Institute of Technology Texas Instruments 2013 IEEE Workshop on Signal Processing Systems A COMPARISON OF CABAC THROUGHPUT FOR HEVC/H.265 VS. AVC/H.264 Vivienne Sze, Madhukar Budagavi Massachusetts Institute of Technology Texas Instruments ABSTRACT

More information

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Project Title: Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Midterm Report CS 584 Multimedia Communications Submitted by: Syed Jawwad Bukhari 2004-03-0028 About

More information

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding Jung-Ah Choi and Yo-Sung Ho Gwangju Institute of Science and Technology (GIST) 261 Cheomdan-gwagiro, Buk-gu, Gwangju, 500-712, Korea

More information

Reduced 4x4 Block Intra Prediction Modes using Directional Similarity in H.264/AVC

Reduced 4x4 Block Intra Prediction Modes using Directional Similarity in H.264/AVC Proceedings of the 7th WSEAS International Conference on Multimedia, Internet & Video Technologies, Beijing, China, September 15-17, 2007 198 Reduced 4x4 Block Intra Prediction Modes using Directional

More information

Block-based Watermarking Using Random Position Key

Block-based Watermarking Using Random Position Key IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.2, February 2009 83 Block-based Watermarking Using Random Position Key Won-Jei Kim, Jong-Keuk Lee, Ji-Hong Kim, and Ki-Ryong

More information

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H. EE 5359 MULTIMEDIA PROCESSING SPRING 2011 Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.264 Under guidance of DR K R RAO DEPARTMENT OF ELECTRICAL ENGINEERING UNIVERSITY

More information

Compression of Stereo Images using a Huffman-Zip Scheme

Compression of Stereo Images using a Huffman-Zip Scheme Compression of Stereo Images using a Huffman-Zip Scheme John Hamann, Vickey Yeh Department of Electrical Engineering, Stanford University Stanford, CA 94304 jhamann@stanford.edu, vickey@stanford.edu Abstract

More information

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD Siwei Ma, Shiqi Wang, Wen Gao {swma,sqwang, wgao}@pku.edu.cn Institute of Digital Media, Peking University ABSTRACT IEEE 1857 is a multi-part standard for multimedia

More information

A new predictive image compression scheme using histogram analysis and pattern matching

A new predictive image compression scheme using histogram analysis and pattern matching University of Wollongong Research Online University of Wollongong in Dubai - Papers University of Wollongong in Dubai 00 A new predictive image compression scheme using histogram analysis and pattern matching

More information

Image Compression Algorithm and JPEG Standard

Image Compression Algorithm and JPEG Standard International Journal of Scientific and Research Publications, Volume 7, Issue 12, December 2017 150 Image Compression Algorithm and JPEG Standard Suman Kunwar sumn2u@gmail.com Summary. The interest in

More information

Features. Sequential encoding. Progressive encoding. Hierarchical encoding. Lossless encoding using a different strategy

Features. Sequential encoding. Progressive encoding. Hierarchical encoding. Lossless encoding using a different strategy JPEG JPEG Joint Photographic Expert Group Voted as international standard in 1992 Works with color and grayscale images, e.g., satellite, medical,... Motivation: The compression ratio of lossless methods

More information

Interactive Progressive Encoding System For Transmission of Complex Images

Interactive Progressive Encoding System For Transmission of Complex Images Interactive Progressive Encoding System For Transmission of Complex Images Borko Furht 1, Yingli Wang 1, and Joe Celli 2 1 NSF Multimedia Laboratory Florida Atlantic University, Boca Raton, Florida 33431

More information

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding 2009 11th IEEE International Symposium on Multimedia Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding Ghazaleh R. Esmaili and Pamela C. Cosman Department of Electrical and

More information

Digital Video Processing

Digital Video Processing Video signal is basically any sequence of time varying images. In a digital video, the picture information is digitized both spatially and temporally and the resultant pixel intensities are quantized.

More information

VHDL Implementation of H.264 Video Coding Standard

VHDL Implementation of H.264 Video Coding Standard International Journal of Reconfigurable and Embedded Systems (IJRES) Vol. 1, No. 3, November 2012, pp. 95~102 ISSN: 2089-4864 95 VHDL Implementation of H.264 Video Coding Standard Jignesh Patel*, Haresh

More information

An Efficient Image Compression Using Bit Allocation based on Psychovisual Threshold

An Efficient Image Compression Using Bit Allocation based on Psychovisual Threshold An Efficient Image Compression Using Bit Allocation based on Psychovisual Threshold Ferda Ernawan, Zuriani binti Mustaffa and Luhur Bayuaji Faculty of Computer Systems and Software Engineering, Universiti

More information

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS Television services in Europe currently broadcast video at a frame rate of 25 Hz. Each frame consists of two interlaced fields, giving a field rate of 50

More information

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Efficient MPEG- to H.64/AVC Transcoding in Transform-domain Yeping Su, Jun Xin, Anthony Vetro, Huifang Sun TR005-039 May 005 Abstract In this

More information

Xin-Fu Wang et al.: Performance Comparison of AVS and H.264/AVC 311 prediction mode and four directional prediction modes are shown in Fig.1. Intra ch

Xin-Fu Wang et al.: Performance Comparison of AVS and H.264/AVC 311 prediction mode and four directional prediction modes are shown in Fig.1. Intra ch May 2006, Vol.21, No.3, pp.310 314 J. Comput. Sci. & Technol. Performance Comparison of AVS and H.264/AVC Video Coding Standards Xin-Fu Wang (ΞΠΛ) and De-Bin Zhao (± ) Department of Computer Science, Harbin

More information

Design of a High Speed CAVLC Encoder and Decoder with Parallel Data Path

Design of a High Speed CAVLC Encoder and Decoder with Parallel Data Path Design of a High Speed CAVLC Encoder and Decoder with Parallel Data Path G Abhilash M.Tech Student, CVSR College of Engineering, Department of Electronics and Communication Engineering, Hyderabad, Andhra

More information

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS Ye-Kui Wang 1, Miska M. Hannuksela 2 and Moncef Gabbouj 3 1 Tampere International Center for Signal Processing (TICSP), Tampere,

More information

MANY image and video compression standards such as

MANY image and video compression standards such as 696 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL 9, NO 5, AUGUST 1999 An Efficient Method for DCT-Domain Image Resizing with Mixed Field/Frame-Mode Macroblocks Changhoon Yim and

More information

EXPLORING ON STEGANOGRAPHY FOR LOW BIT RATE WAVELET BASED CODER IN IMAGE RETRIEVAL SYSTEM

EXPLORING ON STEGANOGRAPHY FOR LOW BIT RATE WAVELET BASED CODER IN IMAGE RETRIEVAL SYSTEM TENCON 2000 explore2 Page:1/6 11/08/00 EXPLORING ON STEGANOGRAPHY FOR LOW BIT RATE WAVELET BASED CODER IN IMAGE RETRIEVAL SYSTEM S. Areepongsa, N. Kaewkamnerd, Y. F. Syed, and K. R. Rao The University

More information

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai

More information

A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264

A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264 A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264 The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published

More information

A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm

A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm International Journal of Engineering Research and General Science Volume 3, Issue 4, July-August, 15 ISSN 91-2730 A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm

More information

Research on Distributed Video Compression Coding Algorithm for Wireless Sensor Networks

Research on Distributed Video Compression Coding Algorithm for Wireless Sensor Networks Sensors & Transducers 203 by IFSA http://www.sensorsportal.com Research on Distributed Video Compression Coding Algorithm for Wireless Sensor Networks, 2 HU Linna, 2 CAO Ning, 3 SUN Yu Department of Dianguang,

More information

Efficient Dictionary Based Video Coding with Reduced Side Information

Efficient Dictionary Based Video Coding with Reduced Side Information MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Efficient Dictionary Based Video Coding with Reduced Side Information Kang, J-W.; Kuo, C.C. J.; Cohen, R.; Vetro, A. TR2011-026 May 2011 Abstract

More information

Adaptive Quantization for Video Compression in Frequency Domain

Adaptive Quantization for Video Compression in Frequency Domain Adaptive Quantization for Video Compression in Frequency Domain *Aree A. Mohammed and **Alan A. Abdulla * Computer Science Department ** Mathematic Department University of Sulaimani P.O.Box: 334 Sulaimani

More information

A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation

A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 1, JANUARY 2001 111 A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation

More information

An Efficient Context-Based BPGC Scalable Image Coder Rong Zhang, Qibin Sun, and Wai-Choong Wong

An Efficient Context-Based BPGC Scalable Image Coder Rong Zhang, Qibin Sun, and Wai-Choong Wong IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 9, SEPTEMBER 2006 981 An Efficient Context-Based BPGC Scalable Image Coder Rong Zhang, Qibin Sun, and Wai-Choong Wong Abstract

More information

A reversible data hiding based on adaptive prediction technique and histogram shifting

A reversible data hiding based on adaptive prediction technique and histogram shifting A reversible data hiding based on adaptive prediction technique and histogram shifting Rui Liu, Rongrong Ni, Yao Zhao Institute of Information Science Beijing Jiaotong University E-mail: rrni@bjtu.edu.cn

More information

( ) ; For N=1: g 1. g n

( ) ; For N=1: g 1. g n L. Yaroslavsky Course 51.7211 Digital Image Processing: Applications Lect. 4. Principles of signal and image coding. General principles General digitization. Epsilon-entropy (rate distortion function).

More information

Video Compression Standards (II) A/Prof. Jian Zhang

Video Compression Standards (II) A/Prof. Jian Zhang Video Compression Standards (II) A/Prof. Jian Zhang NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2009 jzhang@cse.unsw.edu.au Tutorial 2 : Image/video Coding Techniques Basic Transform coding Tutorial

More information

BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION FILTERS

BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION FILTERS 4th European Signal Processing Conference (EUSIPCO ), Florence, Italy, September 4-8,, copyright by EURASIP BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION

More information

EE 5359 Low Complexity H.264 encoder for mobile applications. Thejaswini Purushotham Student I.D.: Date: February 18,2010

EE 5359 Low Complexity H.264 encoder for mobile applications. Thejaswini Purushotham Student I.D.: Date: February 18,2010 EE 5359 Low Complexity H.264 encoder for mobile applications Thejaswini Purushotham Student I.D.: 1000-616 811 Date: February 18,2010 Fig 1: Basic coding structure for H.264 /AVC for a macroblock [1] .The

More information

One-pass bitrate control for MPEG-4 Scalable Video Coding using ρ-domain

One-pass bitrate control for MPEG-4 Scalable Video Coding using ρ-domain Author manuscript, published in "International Symposium on Broadband Multimedia Systems and Broadcasting, Bilbao : Spain (2009)" One-pass bitrate control for MPEG-4 Scalable Video Coding using ρ-domain

More information

SCREEN CONTENT IMAGE SEGMENTATION USING LEAST ABSOLUTE DEVIATION FITTING. Shervin Minaee and Yao Wang

SCREEN CONTENT IMAGE SEGMENTATION USING LEAST ABSOLUTE DEVIATION FITTING. Shervin Minaee and Yao Wang SCREEN CONTENT IMAGE SEGMENTATION USING LEAST ABSOLUTE DEVIATION FITTING Shervin Minaee and Yao Wang Department of Electrical and Computer Engineering, Polytechnic School of Engineering, New York University,

More information

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC) STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC) EE 5359-Multimedia Processing Spring 2012 Dr. K.R Rao By: Sumedha Phatak(1000731131) OBJECTIVE A study, implementation and comparison

More information

H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC. Jung-Ah Choi, Jin Heo, and Yo-Sung Ho

H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC. Jung-Ah Choi, Jin Heo, and Yo-Sung Ho H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC Jung-Ah Choi, Jin Heo, and Yo-Sung Ho Gwangju Institute of Science and Technology {jachoi, jinheo, hoyo}@gist.ac.kr

More information

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000 Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000 EE5359 Multimedia Processing Project Proposal Spring 2013 The University of Texas at Arlington Department of Electrical

More information

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework System Modeling and Implementation of MPEG-4 Encoder under Fine-Granular-Scalability Framework Final Report Embedded Software Systems Prof. B. L. Evans by Wei Li and Zhenxun Xiao May 8, 2002 Abstract Stream

More information

5.1 Introduction. Shri Mata Vaishno Devi University,(SMVDU), 2009

5.1 Introduction. Shri Mata Vaishno Devi University,(SMVDU), 2009 Chapter 5 Multiple Transform in Image compression Summary Uncompressed multimedia data requires considerable storage capacity and transmission bandwidth. A common characteristic of most images is that

More information

Spatial Sparsity Induced Temporal Prediction for Hybrid Video Compression

Spatial Sparsity Induced Temporal Prediction for Hybrid Video Compression Spatial Sparsity Induced Temporal Prediction for Hybrid Video Compression Gang Hua and Onur G. Guleryuz Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005 ganghua@rice.edu

More information

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING Dieison Silveira, Guilherme Povala,

More information

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC Randa Atta, Rehab F. Abdel-Kader, and Amera Abd-AlRahem Electrical Engineering Department, Faculty of Engineering, Port

More information

Image Error Concealment Based on Watermarking

Image Error Concealment Based on Watermarking Image Error Concealment Based on Watermarking Shinfeng D. Lin, Shih-Chieh Shie and Jie-Wei Chen Department of Computer Science and Information Engineering,National Dong Hwa Universuty, Hualien, Taiwan,

More information

Lossless and Lossy Minimal Redundancy Pyramidal Decomposition for Scalable Image Compression Technique

Lossless and Lossy Minimal Redundancy Pyramidal Decomposition for Scalable Image Compression Technique Lossless and Lossy Minimal Redundancy Pyramidal Decomposition for Scalable Image Compression Technique Marie Babel, Olivier Déforges To cite this version: Marie Babel, Olivier Déforges. Lossless and Lossy

More information

Reconstruction PSNR [db]

Reconstruction PSNR [db] Proc. Vision, Modeling, and Visualization VMV-2000 Saarbrücken, Germany, pp. 199-203, November 2000 Progressive Compression and Rendering of Light Fields Marcus Magnor, Andreas Endmann Telecommunications

More information

DCT-BASED IMAGE COMPRESSION USING WAVELET-BASED ALGORITHM WITH EFFICIENT DEBLOCKING FILTER

DCT-BASED IMAGE COMPRESSION USING WAVELET-BASED ALGORITHM WITH EFFICIENT DEBLOCKING FILTER DCT-BASED IMAGE COMPRESSION USING WAVELET-BASED ALGORITHM WITH EFFICIENT DEBLOCKING FILTER Wen-Chien Yan and Yen-Yu Chen Department of Information Management, Chung Chou Institution of Technology 6, Line

More information

Professor, CSE Department, Nirma University, Ahmedabad, India

Professor, CSE Department, Nirma University, Ahmedabad, India Bandwidth Optimization for Real Time Video Streaming Sarthak Trivedi 1, Priyanka Sharma 2 1 M.Tech Scholar, CSE Department, Nirma University, Ahmedabad, India 2 Professor, CSE Department, Nirma University,

More information

An Efficient Intra Prediction Algorithm for H.264/AVC High Profile

An Efficient Intra Prediction Algorithm for H.264/AVC High Profile An Efficient Intra Prediction Algorithm for H.264/AVC High Profile Bo Shen 1 Kuo-Hsiang Cheng 2 Yun Liu 1 Ying-Hong Wang 2* 1 School of Electronic and Information Engineering, Beijing Jiaotong University

More information

Motion Estimation. Original. enhancement layers. Motion Compensation. Baselayer. Scan-Specific Entropy Coding. Prediction Error.

Motion Estimation. Original. enhancement layers. Motion Compensation. Baselayer. Scan-Specific Entropy Coding. Prediction Error. ON VIDEO SNR SCALABILITY Lisimachos P. Kondi, Faisal Ishtiaq and Aggelos K. Katsaggelos Northwestern University Dept. of Electrical and Computer Engineering 2145 Sheridan Road Evanston, IL 60208 E-Mail:

More information

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames Ki-Kit Lai, Yui-Lam Chan, and Wan-Chi Siu Centre for Signal Processing Department of Electronic and Information Engineering

More information

Data Hiding in Video

Data Hiding in Video Data Hiding in Video J. J. Chae and B. S. Manjunath Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 9316-956 Email: chaejj, manj@iplab.ece.ucsb.edu Abstract

More information

Advanced Video Coding: The new H.264 video compression standard

Advanced Video Coding: The new H.264 video compression standard Advanced Video Coding: The new H.264 video compression standard August 2003 1. Introduction Video compression ( video coding ), the process of compressing moving images to save storage space and transmission

More information

Fast Wavelet-based Macro-block Selection Algorithm for H.264 Video Codec

Fast Wavelet-based Macro-block Selection Algorithm for H.264 Video Codec Proceedings of the International MultiConference of Engineers and Computer Scientists 8 Vol I IMECS 8, 19-1 March, 8, Hong Kong Fast Wavelet-based Macro-block Selection Algorithm for H.64 Video Codec Shi-Huang

More information

Partial Video Encryption Using Random Permutation Based on Modification on Dct Based Transformation

Partial Video Encryption Using Random Permutation Based on Modification on Dct Based Transformation International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 2, Issue 6 (June 2013), PP. 54-58 Partial Video Encryption Using Random Permutation Based

More information

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Jeffrey S. McVeigh 1 and Siu-Wai Wu 2 1 Carnegie Mellon University Department of Electrical and Computer Engineering

More information