NOVEL TECHNIQUE FOR IMPROVING THE METRICS OF JPEG COMPRESSION SYSTEM

N. Baby Anusha 1, K. Deepika 2 and S. Sridhar 3
JNTUK, Lendi Institute of Engineering & Technology, Dept. of Electronics and Communication, India

Abstract: JPEG (Joint Photographic Experts Group) is an international compression standard for continuous-tone still images, both grayscale and color. The JPEG standard supports two basic compression methods: the DCT-based lossy compression method and the predictive lossless compression method. The DCT-based lossy method is widely used today for a large number of applications. It converts a signal into elementary frequency components. This work involves the design and implementation of a JPEG encoder and decoder for the compression of grayscale as well as color images. Compression applies the Discrete Cosine Transform and quantization, followed by zigzag scan and run-length encoding, while decompression applies the inverse operations in reverse order.

Keywords: Discrete Cosine Transform (DCT), JPEG Image Compression, Quantization, Run Length Coding (RLC), Zigzag Scan.

1. INTRODUCTION
Image compression deals with the application of data compression techniques to digital images. Digital representation of analog signals requires a large amount of storage, and it has always been a challenge to transfer such files within the available bandwidth and storage constraints. Unlike many other compression methods, JPEG is not a single algorithm. Instead, it may be thought of as a toolkit of image compression methods that may be altered to fit the needs of the user. JPEG may be adjusted to produce very small compressed images that are of relatively poor quality in appearance but still suitable for many applications. Conversely, JPEG is capable of producing very high-quality compressed images that are still far smaller than the original uncompressed data.

2.
IMAGE COMPRESSION FUNDAMENTALS
The need for image compression becomes apparent when the number of bits per image resulting from typical sampling rates and quantization methods is computed.

2.1 PRINCIPLES BEHIND COMPRESSION
The number of bits required to represent the information in an image can be minimized by removing the redundancy present in it. There are three types of redundancy:
(i) spatial redundancy, which is due to the correlation or dependence between neighboring pixel values;
(ii) spectral redundancy, which is due to the correlation between different color planes or spectral bands;
(iii) temporal redundancy, which is present because of correlation between different frames in a sequence of images.
Image compression research aims to reduce the number of bits required to represent an image by removing the spatial and spectral redundancies as much as possible.

Data redundancy is a central issue in digital image compression. If n1 and n2 denote the number of information-carrying units in the original and compressed image respectively, then the compression ratio CR is defined as CR = n1/n2, and the relative data redundancy RD of the original image is defined as RD = 1 - 1/CR. Three cases arise:
(1) If n1 = n2, then CR = 1 and hence RD = 0, which implies that the original image does not contain any redundancy between the pixels.
(2) If n1 >> n2, then CR → ∞ and hence RD → 1, which implies a considerable amount of redundancy in the original image.
(3) If n1 << n2, then CR → 0 and hence RD → -∞, which indicates that the compressed image contains more data than the original image.

2.2 IMAGE COMPRESSION
Image compression is very important for efficient transmission and storage of images. Demand for communication of multimedia data through the telecommunications network and for access to multimedia data through the Internet is growing explosively. With the use of digital cameras, requirements for storage, manipulation, and transfer of digital images have also grown explosively. These image files can be very large and can occupy a lot of memory.
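The compression ratio and relative redundancy defined in Section 2.1 can be checked numerically. The following is a minimal sketch; the function names and the sample sizes are illustrative assumptions and are not taken from the experiments in this paper.

```python
def compression_ratio(n1, n2):
    """CR = n1 / n2, where n1 and n2 are the sizes of the original
    and compressed representations in the same unit (bits or bytes)."""
    return n1 / n2


def relative_redundancy(n1, n2):
    """RD = 1 - 1/CR, the fraction of the original data that is redundant."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)


# Hypothetical sizes: 1000 units compressed to 100 units.
cr = compression_ratio(1000, 100)
rd = relative_redundancy(1000, 100)
print(f"CR = {cr:.1f}, RD = {rd:.2f}")
```

Since RD = 1 - n2/n1, multiplying RD by 100 gives the percentage form of the compression ratio that Section 4.3 uses when reporting results.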
A grayscale image of 256 x 256 pixels has 65,536 elements to store, and a typical 640 x 480 color image has nearly a million. Downloading these files from the Internet can be a very time-consuming task. Image data comprise a significant portion of multimedia data and occupy the major portion of the communication bandwidth for multimedia communication. Therefore, the development of efficient techniques for image compression has become necessary. A common characteristic of most images is that the neighboring pixels are highly correlated and therefore contain highly redundant information. The basic objective of image compression is to find an image representation in which pixels are less correlated.

Volume 2, Issue 2 March April 2013 Page 165

The two fundamental principles used in image compression are redundancy reduction and irrelevancy reduction. Redundancy reduction removes duplication from the signal source, and irrelevancy reduction omits parts of the signal that are not noticeable to the human eye. JPEG and JPEG 2000 are two important techniques used for image compression.

Committees and standards bodies have been formed to produce de jure standards (such as JPEG), while several commercially successful initiatives have effectively become de facto standards (such as GIF). Image compression standards bring about many benefits, such as:
(1) easier exchange of image files between different devices and applications;
(2) reuse of existing hardware and software for a wider array of products;
(3) existence of benchmarks and reference data sets for new and alternative developments.

As our use of and reliance on computers continues to grow, so too does our need for efficient ways of storing large amounts of data. For example, someone with a web page or online catalog that uses dozens or perhaps hundreds of images will more than likely need to use some form of image compression to store those images, because the amount of space required to hold uncompressed images can be prohibitively large in terms of cost. Fortunately, there are several methods of image compression available today. These fall into two general categories: lossless and lossy image compression.

The JPEG standard is a collaboration among the International Telecommunication Union (ITU), the International Organization for Standardization (ISO), and the International Electrotechnical Commission (IEC). Its official name is "ISO/IEC Digital compression and coding of continuous-tone still images", and "ITU-T Recommendation T.81". The JPEG process is a widely used form of lossy image compression that centers around the Discrete Cosine Transform.
JPEG has the following modes of operation:
(a) Lossless mode: The image is encoded to guarantee exact recovery of every pixel of the original image, even though the compression ratio is lower than in the lossy modes.
(b) Sequential mode: The image is compressed in a single left-to-right, top-to-bottom scan.
(c) Progressive mode: The image is compressed in multiple scans. When transmission time is long, the image is displayed progressively, from an indistinct to a clear appearance.
(d) Hierarchical mode: The image is compressed at multiple resolutions so that the lower resolutions can be accessed first without decompressing the image at its full resolution.
The last three DCT-based modes (b, c, and d) are lossy because the limited precision of the DCT computation and the quantization process introduce distortion in the reconstructed image. The lossless mode uses a predictive method and does not have a quantization process. The hierarchical mode can optionally use DCT-based coding or predictive coding. The most widely used mode in practice is the baseline JPEG system, which is based on the sequential mode, DCT-based coding, and Huffman coding for entropy encoding.

2.3 COMPRESSION TECHNIQUES
Image compression techniques are broadly classified into two categories, depending on whether or not an exact replica of the original image can be reconstructed from the compressed image. They are:
1. Lossy Image Compression
2. Lossless Image Compression

LOSSY IMAGE COMPRESSION
Lossy schemes provide much higher compression ratios than lossless schemes. They are widely used since the quality of the reconstructed images is adequate for most applications. With a lossy scheme, the decompressed image is not identical to the original image, but reasonably close to it. A transformation is first applied to the original image. The quantization step that follows results in loss of information. The entropy coding after the quantization step, however, is lossless. Decoding is the reverse process. Firstly, entropy decoding is applied to the compressed data to get the quantized data.
Secondly, reverse quantization is applied to it, and finally the inverse transformation is applied to get the reconstructed image. Major performance considerations of a lossy compression scheme include:
Compression ratio
Signal-to-noise ratio
Speed of encoding and decoding
Lossy compression techniques include the following schemes:
1. Transformation coding
2. Vector quantization
3. Fractal coding
4. Block Truncation Coding
5. Sub band coding

VECTOR QUANTIZATION
The basic idea in this technique is to develop a dictionary of fixed-size vectors, called code vectors. A vector is usually a block of pixel values. A given image is then partitioned into non-overlapping blocks (vectors) called image vectors. Then, for each image vector, the closest matching vector in the dictionary is determined, and its index in the dictionary is used as the encoding of the original image vector. Thus, each image is represented by a sequence of indices that can be further entropy coded.

LOSSLESS IMAGE COMPRESSION
In lossless compression techniques, the original image can be perfectly recovered from the compressed (encoded) image. These techniques are also called noiseless since they do not add noise to the signal (image). Lossless compression is also known as entropy coding since it uses statistical/decomposition
techniques to eliminate or minimize redundancy. Lossless compression is used only for a few applications with stringent requirements, such as medical imaging. Lossless compression techniques include the following schemes:
1. Run length encoding
2. Huffman encoding
3. LZW coding
4. Area coding

RUN LENGTH ENCODING
This is a very simple compression method used for sequential data. It is very useful in the case of repetitive data. The technique replaces sequences of identical symbols (pixels), called runs, by shorter symbols. The run-length code for a grayscale image is represented by a sequence {Vi, Ri}, where Vi is the intensity of a pixel and Ri is the number of consecutive pixels with the intensity Vi. If both Vi and Ri are represented by one byte, a span of 12 pixels that consists of four runs is coded using eight bytes, yielding a compression ratio of 1.5:1.

3. PROPOSED ARCHITECTURE
The prescribed architecture of JPEG image compression using DCT for grayscale images is shown below.
Fig. 3.1: Prescribed Architecture of JPEG Image Compression using DCT for Grayscale Images
We will discuss in detail each block in the above block diagram:
DCT (Discrete Cosine Transform)
Quantization
Zigzag Scan
RLC (Run Length Coding)
Inverse RLC
Inverse Zigzag
Reverse Quantization
IDCT (Inverse Discrete Cosine Transform)

3.1 DISCRETE COSINE TRANSFORM
1. ONE-DIMENSIONAL DCT
The most common DCT definition of a 1D sequence of length N is

$$X[k] = \alpha(k) \sum_{n=0}^{N-1} x[n] \cos\left[\frac{(2n+1)k\pi}{2N}\right], \qquad k = 0, 1, \ldots, N-1,$$

with the inverse transform

$$x[n] = \sum_{k=0}^{N-1} \alpha(k)\, X[k] \cos\left[\frac{(2n+1)k\pi}{2N}\right], \qquad n = 0, 1, \ldots, N-1.$$

In both equations, α(k) is defined as:

$$\alpha(k) = \begin{cases} \sqrt{1/N}, & k = 0 \\ \sqrt{2/N}, & k = 1, 2, \ldots, N-1 \end{cases}$$

The basis sequences of the 1D DCT are real, discrete-time sinusoids defined by:

$$c_k[n] = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$

Each element X[k] of the transformed list in the forward DCT equation is the inner (dot) product of the input list x[n] and a basis vector. The constant factors are chosen so that the basis vectors are orthogonal and normalized. The DCT can therefore be written as the product of a vector (the input list) and the N x N orthogonal matrix whose rows are the basis vectors.

2. TWO-DIMENSIONAL DCT
The two-dimensional discrete cosine transform (2D DCT) is used for processing signals such as images.
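Because the 1D DCT defined above is the building block of the 2D transform, it may help to see it as a matrix product first. The sketch below builds the N x N orthonormal matrix whose rows are the basis vectors and applies it to a sample sequence. NumPy, the function names, and the sample values are assumptions made for illustration only.

```python
import numpy as np


def dct_matrix(N):
    """Return the N x N matrix T with T[k, n] = alpha(k) * cos((2n+1) k pi / (2N))."""
    n = np.arange(N)
    k = n.reshape(-1, 1)
    T = np.cos((2 * n + 1) * k * np.pi / (2 * N))
    T[0, :] *= np.sqrt(1.0 / N)   # alpha(0)
    T[1:, :] *= np.sqrt(2.0 / N)  # alpha(k) for k >= 1
    return T


def dct_1d(x):
    """Forward 1D DCT: inner product of x with every basis vector."""
    x = np.asarray(x, dtype=float)
    return dct_matrix(len(x)) @ x


def idct_1d(X):
    """Inverse 1D DCT: the transpose inverts an orthogonal matrix."""
    X = np.asarray(X, dtype=float)
    return dct_matrix(len(X)).T @ X


T = dct_matrix(8)
x = np.array([52, 55, 61, 66, 70, 61, 64, 73], dtype=float)  # hypothetical samples
X = dct_1d(x)
print(np.round(X, 2))
```

The product of `T` with its transpose is the identity matrix, which is the orthonormality property stated above, and `idct_1d(dct_1d(x))` returns the input up to floating-point rounding.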
The 2D DCT resembles the 1D DCT since it is a separable linear transformation; that is, the two-dimensional transform is equivalent to a one-dimensional DCT performed along a single dimension followed by a one-dimensional DCT in the other dimension. For example, for an n x m matrix S, the 2D DCT is computed by applying the 1D DCT to each row of S and then to each column of the result.

The 2D DCT is similar to a Fourier transform but uses purely real arithmetic. It has purely real transform-domain coefficients and incorporates strictly positive frequencies. The DCT is equivalent to a DFT of roughly twice the length, operating on real data with even symmetry, where in some variants the input and/or output data are shifted by half a sample. As the 2D DCT is simpler to evaluate than the Fourier transform, it has become the transform of choice in image compression standards such as JPEG.

The 2D DCT represents an image as a sum of sinusoids of varying magnitudes and frequencies. It has the property that, for a typical image, most of the visually significant information about the image is concentrated in just a few coefficients of the DCT. For the 8 x 8 blocks used by JPEG, the mathematical definition of the DCT is:

$$S(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} s(x,y) \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]$$

$$C(n) = \begin{cases} 1/\sqrt{2}, & n = 0 \\ 1, & n \neq 0 \end{cases}$$
The above equation is called the analysis formula or the forward transform. Because the DCT uses cosine functions, the resulting matrix depends on the horizontal, diagonal, and vertical frequencies. Therefore an image block with a lot of change in frequency has a very random-looking resulting matrix, while an image matrix of just one color has a resulting matrix with a large value for the first element and zeros for the other elements. Mathematically, the DCT is perfectly reversible, and there is no loss of image definition until the coefficients are quantized.

The coefficients of the DCT describe the proportion of each two-dimensional basis function present in the image. Each basis matrix is characterized by a horizontal and a vertical spatial frequency. The matrices are arranged from left to right and top to bottom in order of increasing frequency. The top-left function is the basis function of the "DC" coefficient, with frequency {0,0}, and represents zero spatial frequency. The DC coefficient is proportional to the average of the pixels in the input block, and is typically the largest coefficient in the DCT of "natural" images. Along the top row the basis functions have increasing horizontal spatial frequency content. Down the left column the functions have increasing vertical spatial frequency content.

3.2 QUANTIZATION
Each 8 x 8 block of DCT coefficients is divided, element by element, by an 8 x 8 quantization table. In quantization, the small DCT coefficients at the high frequencies are discarded. Quantization thus allows further compression by the entropy encoder, because insignificant small coefficients are neglected. Many of the higher frequencies of an image can be discarded without any perceived degradation of image quality. In lossy image compression, quantization exploits this by scaling the DCT coefficients to levels that zero most of the higher frequencies while maintaining most of the image's energy. The 8 x 8 block of DCT coefficients is now ready for compression by quantization.
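Before turning to the quantization tables, the DC-coefficient property and the reversibility of the transform described in Section 3.1 can be illustrated with a short sketch. It computes the 8 x 8 2D DCT separably, as a product with the orthonormal 1D DCT matrix on both sides. NumPy, the function names, and the test blocks are assumptions for illustration, not the implementation used in this work.

```python
import numpy as np


def dct_matrix(N=8):
    """Orthonormal 1D DCT matrix: T[k, n] = alpha(k) * cos((2n+1) k pi / (2N))."""
    n = np.arange(N)
    k = n.reshape(-1, 1)
    T = np.cos((2 * n + 1) * k * np.pi / (2 * N))
    T[0, :] *= np.sqrt(1.0 / N)
    T[1:, :] *= np.sqrt(2.0 / N)
    return T


def dct_2d(block):
    """Forward 2D DCT of a square block: 1D DCT on columns, then on rows."""
    T = dct_matrix(block.shape[0])
    return T @ block @ T.T


def idct_2d(coeffs):
    """Inverse 2D DCT of a square coefficient block."""
    T = dct_matrix(coeffs.shape[0])
    return T.T @ coeffs @ T


# A block of a single gray level: only the DC coefficient survives.
flat = np.full((8, 8), 100.0)
S_flat = dct_2d(flat)

# A hypothetical busy block: the round trip recovers it before quantization.
rng = np.random.default_rng(0)
busy = rng.integers(0, 256, size=(8, 8)).astype(float)
S_busy = dct_2d(busy)
restored = idct_2d(S_busy)

print(round(S_flat[0, 0], 6))
```

For the 8 x 8 case the DC coefficient equals eight times the block mean, so the flat block of value 100 yields a DC term of 800 and AC terms that vanish up to rounding error.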
A remarkable and highly useful feature of the JPEG process is that in this step, varying levels of image compression and quality are obtainable through the selection of specific quantization matrices. This enables the user to decide on quality levels ranging from 1 to 100, where 1 gives the poorest image quality and highest compression, while 100 gives the best quality and lowest compression. As a result, the quality/compression ratio can be tailored to suit different needs.

Subjective experiments involving the human visual system have resulted in the JPEG standard quantization matrix. With a quality level of 50, this matrix renders both high compression and excellent decompressed image quality. If another level of quality and compression is desired, scalar multiples of the JPEG standard quantization matrix may be used. For a quality level greater than 50 (less compression, higher image quality), the standard quantization matrix is multiplied by (100 - quality level)/50. For a quality level less than 50 (more compression, lower image quality), the standard quantization matrix is multiplied by 50/quality level. The scaled quantization matrix is then rounded and clipped to have positive integer values ranging from 1 to 255. Quantization is achieved by dividing each element in the transformed image matrix D by the corresponding element in the quantization matrix, and then rounding to the nearest integer value.

3.3 ZIGZAG SCAN
After the 8 x 8 DCT and quantization of a block, we have a new 8 x 8 block that holds the frequency-domain values of the original block. We then have to reorder the values into one-dimensional form in order to encode them. The AC terms are scanned in a zigzag manner. The reason for this zigzag traversal is that the 8 x 8 DCT coefficients are visited in order of increasing spatial frequency, so we obtain a vector sorted by spatial frequency. After traversing the 8 x 8 matrix in zigzag order, we have a vector with 64 coefficients (0, 1, ..., 63).
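The quality-level scaling of Section 3.2 and the zigzag reordering just described can be sketched together. We assume here that the "JPEG standard quantization matrix" is the example luminance table from Annex K of the JPEG standard; the function names and the sample DCT block are hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumption: example luminance table from Annex K of the JPEG standard,
# used here as the quality-50 "standard quantization matrix".
Q50 = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])


def scaled_quant_matrix(quality):
    """Scale Q50 for a quality level in 1..100, then round and clip to 1..255."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    scale = (100 - quality) / 50 if quality >= 50 else 50 / quality
    # np.rint rounds exact halves to the nearest even integer.
    return np.clip(np.rint(Q50 * scale), 1, 255).astype(int)


def quantize(D, Q):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return np.rint(np.asarray(D, dtype=float) / Q).astype(int)


def zigzag_indices(n=8):
    """(row, col) pairs ordered along anti-diagonals, alternating direction."""
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def zigzag_scan(block):
    """Flatten a square block into a list in zigzag order."""
    block = np.asarray(block)
    return [int(block[i, j]) for i, j in zigzag_indices(block.shape[0])]


# Hypothetical DCT block: a strong DC term and two small AC terms.
D = np.zeros((8, 8))
D[0, 0], D[0, 1], D[1, 0] = 800.0, -30.0, 25.0
C = quantize(D, scaled_quant_matrix(50))
vector = zigzag_scan(C)
print(vector[:4])
```

Note that at quality level 100 the scale factor is zero, so every table entry is clipped up to 1 and quantization reduces to plain rounding of the coefficients.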
Fig 3.1: Zigzag Scan

3.4 RUN-LENGTH CODING
Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs: for example, simple graphic images such as icons, line drawings, and animations. It is not useful with files that do not have many runs, as it could greatly increase the file size.

We now have the one-dimensional quantized vector with a lot of consecutive zeroes. We can process this by run-length coding of the consecutive zeroes. Consider first the 63 AC coefficients of the 64-element quantized vector. For example, we have:
57, 45, 0, 0, 0, 0, 23, 0, 30, 16, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, ..., 0
For each value that is not 0, we emit a pair consisting of the number of consecutive zeroes preceding that value, followed by the value itself. The RLC (run length coding) is:
(0,57) ; (0,45) ; (4,23) ; (1,30) ; (0,16) ; (2,1) ; EOB
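The zero-run coding of the example above can be reproduced with a few lines of code. This is a minimal sketch under simplifying assumptions: the function names are hypothetical, the pairs are kept as Python tuples rather than packed bits, and the limit of 15 zeroes per run imposed by a 4-bit run-length field is not modelled.

```python
EOB = (0, 0)  # End of Block marker, equivalent to the pair (0, 0)


def rlc_encode(ac):
    """Encode AC coefficients as (zero_run, value) pairs, ending with EOB
    when the vector finishes with one or more zeroes."""
    pairs = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run > 0:
        pairs.append(EOB)
    return pairs


def rlc_decode(pairs, length=63):
    """Rebuild the AC vector; everything after EOB is filled with zeroes."""
    out = []
    for run, value in pairs:
        if (run, value) == EOB:
            break
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * (length - len(out)))
    return out


# The 63 AC coefficients from the example, padded with trailing zeroes.
ac = [57, 45, 0, 0, 0, 0, 23, 0, 30, 16, 0, 0, 1] + [0] * 50
pairs = rlc_encode(ac)
print(pairs)
```

Decoding the pairs with `rlc_decode` restores the original 63-element vector, which is the operation the inverse RLC block of the decoder performs.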
The EOB (End of Block) is a special coded value. If we reach a position in the vector after which only zeroes remain until the end of the vector, we mark that position with EOB and finish the RLC of the quantized vector. Note that if the quantized vector does not finish with zeroes (the last element is not 0), we do not add the EOB marker. EOB is equivalent to (0,0), so we have:
(0,57) ; (0,45) ; (4,23) ; (1,30) ; (0,16) ; (2,1) ; (0,0)

3.5 RUN-LENGTH DECODER
The 4-bit run-length is a count of the number of zero data values that occurred between the last nonzero data value and the current one. The 4-bit data-length is the number of bits following this 8-bit word that make up the actual nonzero data point. A data-length of 0 signifies either the end of a data block or, if the run-length is 15, a run of 16 consecutive zero data values.

3.6 INVERSE ZIGZAG
The frequency matrix is ordered in a zigzag fashion as described in the following Figure 3.2.
Fig 3.2: After Inverse Zigzag

3.7 REVERSE QUANTIZATION
The Reverse Quantization unit requests data values from its input. It multiplies these data values by the corresponding values in the quantization table and then places them in the appropriate location in the 8 x 8 JPEG data block. During JPEG encoding the frequency components of the data block are ordered so that the low-frequency components are at the beginning and the higher-frequency components follow. This data block is then passed on to the Inverse Discrete Cosine Transform unit. The Reverse Quantization block is in the middle of the JPEG decoder pipeline and is relatively simple. It requests the Huffman Decoder to give it data, and with that data it assembles a data block and requests the Inverse Discrete Cosine Transform unit to decode it. This allows almost all of the operations of the Quantization Unit to be done while the Huffman decoder, which takes a long time since it has to make decisions at every bit, is running.

Reconstruction of our image begins by decoding the bit stream representing the quantized matrix C. Each element of C is then multiplied by the corresponding element of the quantization matrix originally used, which gives the matrix R. The IDCT is next applied to matrix R, and the result is rounded to the nearest integer.

3.8 INVERSE DISCRETE COSINE TRANSFORM
The Inverse Discrete Cosine Transform unit is the most complex unit in the JPEG decoder. The IDCT requires many multiplications and additions of irrational values and is computationally intensive. Since a floating-point ALU is difficult to design, large, and slow, floating-point arithmetic is rarely done in custom hardware designs, except in the datapath of a microprocessor where it can be shared among many different uses. Here are the equations for the two-dimensional 8 x 8 Inverse Discrete Cosine Transform:

$$s(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, S(u,v) \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]$$

$$C(n) = \begin{cases} 1/\sqrt{2}, & n = 0 \\ 1, & n \neq 0 \end{cases}$$

4. DESIGN METRICS
4.1 MEAN SQUARE ERROR
The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance of the estimator. Like the variance, the MSE has the same units of measurement as the square of the quantity being estimated. In analogy to the standard deviation, taking the square root of the MSE yields the root mean square error or root mean square deviation (RMSE or RMSD), which has the same units as the quantity being estimated; for an unbiased estimator, the RMSE is the square root of the variance, known as the standard deviation.
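For images, the MSE and RMSE described above are computed over the pixel differences between the original and the reconstructed image, and the PSNR of Section 4.2 is then derived from the MSE. The following is a minimal sketch, assuming 8-bit images so that the peak value is 255; NumPy, the function names, and the test arrays are illustrative assumptions.

```python
import numpy as np


def mse(original, reconstructed):
    """Mean of the squared pixel differences."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    return float(np.mean((a - b) ** 2))


def rmse(original, reconstructed):
    """Square root of the MSE, in the same units as the pixel values."""
    return float(np.sqrt(mse(original, reconstructed)))


def psnr(original, reconstructed, max_value=255.0):
    """PSNR in decibels; max_value = 255 assumes 8-bit samples."""
    err = mse(original, reconstructed)
    if err == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))


# Hypothetical 4 x 4 image and a copy in which every pixel is off by 5.
I = np.zeros((4, 4))
K = I + 5
print(mse(I, K), rmse(I, K), round(psnr(I, K), 2))
```

A larger PSNR corresponds to a smaller MSE, so the two metrics rank reconstructions in opposite directions.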
If $\hat{Y}$ is a vector of n predictions, and $Y$ is the vector of the true values, then the MSE of the predictor is:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{Y}_i - Y_i\right)^2$$

4.2 PEAK SIGNAL TO NOISE RATIO
Peak Signal-to-Noise Ratio, often abbreviated PSNR, is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, PSNR is usually expressed in terms of the logarithmic decibel scale. PSNR is most easily defined via the mean squared error (MSE). Given a noise-free m x n monochrome image I and its noisy approximation K, MSE is defined as:

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[I(i,j) - K(i,j)\right]^2$$

The PSNR is defined as:

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{MAX_I^2}{\mathrm{MSE}}\right) = 20 \log_{10}\left(\frac{MAX_I}{\sqrt{\mathrm{MSE}}}\right)$$

where $MAX_I$ is the maximum possible pixel value of the image (255 for 8-bit samples).

International Journal of Emerging Trends & Technology in Computer Science (IJETTCS) Volume 2, Issue 2, March April 2013 ISSN

4.3 COMPRESSION RATIO
The size of the compressed image is divided by the size of the original image, and this value is subtracted from 1; the result, expressed as a percentage, gives the compression ratio:
Compression ratio = (1 - Compressed image size / Original image size) x 100%
Note that this percentage corresponds to the relative redundancy RD of Section 2.1 rather than to the ratio n1/n2. It gives an indication of how much compression is achieved for a particular image. Most algorithms have a typical range of compression ratios that they can achieve over a variety of images. Because of this, it is usually more useful to look at an average compression ratio for a particular method. The compression ratio typically affects the picture quality. Generally, the higher the compression ratio, the poorer the quality of the resulting image. The tradeoff between compression ratio and picture quality is an important one to consider when compressing images.

4.4 BITS PER PIXEL
Bits per pixel (bpp) is the number of bits of information stored per pixel of an image or displayed by a graphics adapter. The more bits there are, the more colors can be represented, but the more memory is required to store or display the image.
Bpp = number of bits / number of pixels

5. EXPERIMENTAL ANALYSIS
RESULTS FOR GRAY SCALE IMAGES
For Baboongray
[Table: Quality Level | Size of the image | PSNR | MSE | CR (values not recoverable from the source)]
Fig 5.1: Quality level vs Design metrics for Baboongray
For Lenagray
[Table: Quality Level (10, 40, 60, 80) | Size of the image | PSNR | MSE | CR (values not recoverable from the source)]
Fig 5.2: Quality level vs Design metrics for Lenagray
For Rosesgray
[Table: Quality Level (10, 40, 60, 80) | Size of the image | PSNR | MSE | CR (values not recoverable from the source)]
Fig 5.3: Quality level vs Design metrics for Rosesgray

6. CONCLUSION
For the grayscale images, different objective fidelity quality metrics like Peak Signal to Noise
Ratio (PSNR), Mean Square Error (MSE) and Compression Ratio (CR) have been obtained. The findings for grayscale images are:
For Rosesgray, the PSNR value is higher than for the other input grayscale images.
For Rosesgray, the CR value is 96.7%, which is higher than for the other input grayscale images.
For Rosesgray, the MSE value is lower than for the other input grayscale images.
Finally, the proposed JPEG algorithm can be extended to a Region Of Interest (ROI) segmentation-based compression technique, which involves dividing the image into two parts, namely the front portion of the image and the back portion of the image. The back portion of the image is treated as redundant data, so it can be compressed to the maximum extent. This does not affect the front portion of the image, so the quality can be preserved in the desired area only.