An introduction to JPEG compression using MATLAB

Arno Swart, 30 October 2003

1 Introduction

This document describes the popular JPEG still image coding format. The aim is to compress images while maintaining acceptable image quality. This is achieved by dividing the image into blocks of 8 × 8 pixels and applying a discrete cosine transform (DCT) to each block. The resulting coefficients are quantised; less significant coefficients are set to zero. After quantisation two encoding steps follow: zero run length encoding (RLE) and then entropy coding. Part of the JPEG encoding is lossy (information may be lost, so the reconstruction may not be exact), part is lossless (no loss of information).

JPEG allows for some flexibility in the different stages; not every option is explored here. The focus is on the DCT and quantisation. The entropy coding will only be briefly discussed and the file structure definitions will not be considered.

There is a MATLAB implementation of (a subset of) JPEG that can be used alongside this document. It can be downloaded from the course's internet pages; follow the link to "Wavelet & FFT resources". Try to understand the different functions in the source code while reading this document. Some questions will ask you to modify the code slightly.

2 Tiling and the DCT

To simplify matters we will assume that the width and height of the image are a multiple of 8 and that the image is either gray scale or RGB (red, green, blue). Each pixel is assigned a value between 0 and 255, or a triplet of RGB values. In the case of a color RGB picture a pointwise transform is made to the YUV (luminance, blue chrominance, red chrominance) color space. This space is in some sense more decorrelated than the RGB space and will allow for better quantisation later.

The transform is given by

\[
\begin{pmatrix} Y \\ U \\ V \end{pmatrix}
=
\begin{pmatrix}
 0.299 &  0.587 &  0.114 \\
-0.1687 & -0.3313 &  0.5 \\
 0.5 & -0.4187 & -0.0813
\end{pmatrix}
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0.5 \\ 0.5 \end{pmatrix},
\tag{1}
\]

and the inverse transform is

\[
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 1.402 \\
1 & -0.34414 & -0.71414 \\
1 & 1.772 & 0
\end{pmatrix}
\begin{pmatrix} Y \\ U - 0.5 \\ V - 0.5 \end{pmatrix}.
\tag{2}
\]

The next step is dividing each component of the image into blocks of 8 × 8 pixels. This is needed since the DCT has no locality information whatsoever; there is only frequency information available. We continue by shifting every block by 128, which makes the values symmetric around zero with an average close to zero. The DCT of a signal f(x, y) and the inverse DCT (IDCT) of coefficients F(u, v) are given by

\[
F(u, v) = \tfrac{1}{4}\, C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos\Big(\tfrac{(2x+1)u\pi}{16}\Big) \cos\Big(\tfrac{(2y+1)v\pi}{16}\Big),
\tag{3}
\]

\[
f(x, y) = \tfrac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u, v) \cos\Big(\tfrac{(2x+1)u\pi}{16}\Big) \cos\Big(\tfrac{(2y+1)v\pi}{16}\Big),
\tag{4}
\]

where

\[
C(z) =
\begin{cases}
1/\sqrt{2} & \text{if } z = 0, \\
1 & \text{otherwise}.
\end{cases}
\]

Question 0 What is the relation between YUV and f?

Question 1 The DCT is half of the Fourier transform. Why not use the full Fourier representation instead of the cosine basis functions? It is because the DCT can approximate linear functions with fewer coefficients. To see this take the array 8, 16, 24, 32, 40, 48, 56, 64. Do a forward Fourier transform, set the last 4 coefficients to zero (this is what will happen in the quantisation step explained later) and do an inverse Fourier transform. Perform the same procedure for the DCT. What do you notice? Can you explain the results?

Question 2 The DCT is given as a two-dimensional transform. For implementation purposes it is more efficient to write it as a sequence of two one-dimensional transforms. Derive and implement this.

Question 3 The DCT only uses 64 basis functions. One can precompute these and reuse them in the algorithm at the expense of a little extra memory. A function that returns the basis functions is provided with the source. Build this into the main code.

The blocks are processed row by row, from left to right.
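As an illustration of Questions 2 and 3, the 8 × 8 DCT can be implemented as two one-dimensional transforms by precomputing a basis matrix and applying it from the left and from the right. The following is only a sketch; the variable names (T, f, F, g) are chosen here for illustration and are not the names used in the provided MATLAB code.

    % Minimal sketch of the separable 8x8 DCT (cf. Questions 2 and 3).
    % Row u of T is the u-th 1-D cosine basis vector, so that
    % F = T*f*T' implements equation (3) and f = T'*F*T equation (4).
    [x, u] = meshgrid(0:7, 0:7);
    c = ones(8, 1);  c(1) = 1/sqrt(2);
    T = diag(c / 2) * cos((2*x + 1) .* u * pi / 16);

    f = double(randi([0 255], 8, 8)) - 128;   % a level-shifted 8x8 block
    F = T * f * T';                           % forward 2-D DCT via two 1-D passes
    g = T' * F * T;                           % inverse 2-D DCT
    max(abs(g(:) - f(:)))                     % only round-off error remains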

3 Quantisation and Zigzag scan

So far there has been no compression. We will now proceed by introducing zeros in the arrays by means of quantisation. Every coefficient c_{ij} is divided by an integer q_{ij} and rounded to the nearest integer. The importance of the coefficients depends on the human visual system; the eye is much more sensitive to low frequencies. Measurements led to standard quantisation tables, listed below.^1 Note that the tables for luminance and chrominance are different (for gray scale images only the luminance table is used). The tables are

\[
Q_{\mathrm{LUM}} = \frac{1}{s}
\begin{pmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{pmatrix},
\tag{5}
\]

\[
Q_{\mathrm{CHR}} = \frac{1}{s}
\begin{pmatrix}
17 & 18 & 24 & 47 & 99 & 99 & 99 & 99 \\
18 & 21 & 26 & 66 & 99 & 99 & 99 & 99 \\
24 & 26 & 56 & 99 & 99 & 99 & 99 & 99 \\
47 & 66 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99
\end{pmatrix}.
\tag{6}
\]

(In the actual computations, q_{ij} is an integer whose value is obtained by rounding the table entry divided by s to the nearest integer and taking the maximum with 1. Why the maximum with 1?) The variable s in the above formulas stands for scale. The higher s is, the fewer coefficients will be set to zero. This is also known as the quality level.

Question 4 Another, more primitive, quantisation technique is thresholding: every coefficient is divided by the same constant value (all q_{ij} are equal). Implement this. Can you find a noticeable difference?

The next ingredient is the so-called zigzag scan. Every block is converted to a vector by gathering elements in zigzag fashion; see Figure 1 for the zigzag scheme. This method has the advantage that low-frequency coefficients are placed first, while high-frequency components come last in the array. This is likely to introduce long substrings of zeros at the end of the array.

Question 5 What do low-frequency and high-frequency components look like in an image? Can you see the logic in discarding (quantising to zero) the high-frequency components?

^1 The JPEG standard also allows for inclusion of custom tables in the file header.
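As a sketch of the quantisation and zigzag steps, the luminance table of equation (5) can be applied to a block of DCT coefficients and the result read out in zigzag order as follows. The variable names (Qlum, order, v) and the way the zigzag order is generated are illustrative and need not match the provided code.

    % Sketch of quantisation with the scaled luminance table (equation (5))
    % followed by the zigzag scan of Figure 1. Names are illustrative.
    Qlum = [16 11 10 16 24 40 51 61;
            12 12 14 19 26 58 60 55;
            14 13 16 24 40 57 69 56;
            14 17 22 29 51 87 80 62;
            18 22 37 56 68 109 103 77;
            24 35 55 64 81 104 113 92;
            49 64 78 87 103 121 120 101;
            72 92 95 98 112 100 103 99];

    F = 50 * randn(8, 8);               % stand-in for a block of DCT coefficients
    s = 1;                              % scale (quality) factor
    q = max(round(Qlum / s), 1);        % integer table entries, never smaller than 1
    Fq = round(F ./ q);                 % quantised coefficients, many become zero

    % Zigzag scan: order the 64 positions by anti-diagonal i+j, reversing the
    % direction on every other diagonal, and read the block in that order.
    [i, j] = ndgrid(0:7, 0:7);
    d = i + j;
    k = i;  k(mod(d, 2) == 0) = 7 - i(mod(d, 2) == 0);
    [~, order] = sortrows([d(:), k(:)]);
    v = Fq(order).';                    % 1 x 64 vector, low frequencies first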

Figure 1: The zigzag scan order, running from position (0, 0) to position (7, 7) of the 8 × 8 block.

Question 6 In what sense is the YUV color space more decorrelated than the RGB space (cf. the second paragraph of Section 2)?

The first element of the resulting array is called the DC component. The other 63 entries are the AC components. They are treated separately in the entropy coding process.

Question 7 What is the interpretation of the DC coefficient? (In the signal processing community DC stands for direct current while AC stands for alternating current.)

4 RLE and entropy coding

This section describes the final step in the compression. We will start with the DC component, which is simplest. We expect the DC components of a cluster of neighbouring blocks to differ only slightly from each other. For this reason we use differential coding: we visit the DC coefficients of the blocks in row-by-row, left-to-right fashion and record the difference with the previous DC value (except for the first, which remains unchanged). Then, on the array of differences, we proceed with Huffman coding; for a brief description of Huffman coding, see the next section. Huffman coding is an entropy coder.

For the remaining 63 AC coefficients we use zero RLE. This means we first store the number of zeros preceding a nonzero value and then the nonzero value itself. For example, the values 0, 2, 5, 0, 0, 0, 1 become the pairs (1, 2), (0, 5), (3, 1). The pair (0, 0) is a special value signaling that the remaining coefficients are all zero.
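These two coding steps can be sketched in a few lines of MATLAB. The example AC values are the ones used above; the variable names and the exact data layout are illustrative and are not taken from the provided code.

    % Sketch of differential coding of the DC values and zero run-length
    % encoding of one block's AC coefficients. Example values only.

    % DC: record differences between consecutive blocks (first one unchanged).
    dc = [52 55 54 60];                 % example DC values of four blocks
    dc_diff = [dc(1), diff(dc)];        % -> 52  3  -1  6

    % AC: (number of preceding zeros, nonzero value) pairs; (0, 0) ends the block.
    ac = [0 2 5 0 0 0 1, zeros(1, 56)]; % example zigzag-ordered AC coefficients
    pairs = zeros(0, 2);
    zrun = 0;
    for a = ac
        if a == 0
            zrun = zrun + 1;
        else
            pairs = [pairs; zrun a];    % append one (run, value) pair
            zrun = 0;
        end
    end
    pairs = [pairs; 0 0];               % end-of-block marker
    disp(pairs)                         % -> (1,2), (0,5), (3,1), (0,0)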

Since our zigzag scheme clusters the zeros at the end of the array, we can expect good compression here.

Now we come to a point where there is a difference between the JPEG standard and our implementation. We use 8 bits per value in the pair and do Huffman coding on the bit stream. The official JPEG standard works differently, but will probably not achieve significantly more compression. The standard proposes to store no number greater than 15 in the first component of the pair. For example, 17 zeros followed by a one would become (15, 0), (2, 1). Now only four bits (a nibble) are needed for this number. For the second number we store the number of bits that the coefficient needs. This also needs at most 4 bits, and we put all these values in a bit stream and apply Huffman coding. After decoding we know the number of bits per coefficient, so we can reconstruct our zero-run lengths and bit lengths. For the actual coefficients we also make one bit stream, using only the number of bits we need per coefficient. This stream is also Huffman compressed. When decoding we know this number of bits (it came from our first stream), so we can also retrieve these coefficients.

Question 8 What part of the JPEG encoding is lossy and what part is lossless? How much compression is achieved with the lossless part of JPEG? When (for what value of s) do we have lossless compression?

5 Huffman coding (optional)

Huffman coding is an optimal lossless scheme for compressing a stream of symbols. It works by fixing an n and dividing the input bit stream into sequences of length n. For simplicity of explanation, assign symbols, say A, B, C, D, ..., to all sequences of zeros and ones (bit-codes) of length n. A bit stream that we want to encode might then look like ABAABDAACB. Next, the Huffman encoder calculates how often each symbol occurs. Then the symbols are assigned new bit-codes, but now not necessarily of length n. The more often a symbol occurs (as the A in our example), the shorter its new bit-code will be. The output of the Huffman encoder is a bit stream formed by the new bit-coded symbols.

For decoding, we need to know where the bit-code for one symbol stops and a new one begins. This is solved by enforcing the unique prefix condition: the bit-code of no symbol is a prefix of the bit-code of another symbol. For instance, the four symbols A, B, C and D might be coded as in Table 1: A might be coded as 0 in the output bit stream, while it represents 00 if n = 2 (assuming there are only the four symbols A, B, C and D) in the input bit stream. The number of bits to represent the string ABAABDAACB of 10 symbols reduces from 10 × 2 = 20 bits to 5 × 1 + 3 × 2 + 1 × 3 + 1 × 3 = 17 bits. Note that the reduction would be even better if A occurred more often than in the example. JPEG allows for coding according to the number of occurrences or according to predefined tables included in the header of the jpg picture. There are standard tables compiled by counting the number of occurrences in a large number of images.
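To experiment with this example in MATLAB, one option is the huffmandict/huffmanenco functions, assuming the Communications Toolbox is available (it is not part of the course code, and the frequency counting and code construction can just as well be done by hand). The exact codewords produced may differ from those in Table 1 below, but the code lengths 1, 2, 3, 3 and the 17-bit total are the same.

    % Sketch: Huffman coding of the example string ABAABDAACB (A=1, B=2, C=3, D=4),
    % assuming the Communications Toolbox (huffmandict/huffmanenco) is available.
    msg  = [1 2 1 1 2 4 1 1 3 2];                 % A B A A B D A A C B
    prob = accumarray(msg(:), 1).' / numel(msg);  % relative frequencies 0.5 0.3 0.1 0.1
    [dict, avglen] = huffmandict(1:4, prob);      % codewords may differ from Table 1
    bits = huffmanenco(msg, dict);                % encoded bit stream
    fprintf('%d symbols -> %d bits (%.1f bits per symbol)\n', ...
            numel(msg), numel(bits), avglen);     % 10 symbols -> 17 bits
    back = huffmandeco(bits, dict);               % lossless: back equals msg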

original code   symbol   output code
00              A        0
01              B        11
10              C        100
11              D        101

original bit stream:       00010000011100001001
symbolized:                ABAABDAACB
Huffman-coded bit stream:  01100111010010011

Table 1: If the original input bit stream is divided into sequences of length 2, then the table gives the coding for the four symbols as obtained with Huffman coding for, for instance, the string ABAABDAACB. Note that reading the Huffman-coded bit stream from left to right leads to exact reconstruction of the original bit stream.

6 Experiments

Try out for yourself how well JPEG compresses images. The source code comes with a few images in MATLAB .mat format. Try to make graphs of the scaling factor versus the number of bits per pixel. Also try to make visual estimations of the quality. A more quantitative measure is the mean squared error of the reconstructed image with respect to the original. Later on you can compare the results with the performance of the more advanced JPEG 2000 format (it uses wavelets for compression).
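A possible skeleton for these experiments is sketched below. The function names jpeg_encode and jpeg_decode, the file name image1.mat and the variable name img are placeholders, not the names used in the provided sources; substitute the actual routines and images that come with the course code.

    % Skeleton for the rate/distortion experiments of Section 6.
    % jpeg_encode/jpeg_decode and the .mat file name are placeholders for the
    % routines and images supplied with the course code.
    load image1.mat                               % supplies an image, here called img
    img = double(img);

    scales = [0.25 0.5 1 2 4 8];                  % values of the scale factor s
    bpp = zeros(size(scales));
    mse = zeros(size(scales));

    for k = 1:numel(scales)
        stream = jpeg_encode(img, scales(k));     % encoded bit stream (placeholder)
        rec    = jpeg_decode(stream);             % reconstructed image (placeholder)
        bpp(k) = numel(stream) / numel(img);      % bits per pixel
        mse(k) = mean((rec(:) - img(:)).^2);      % mean squared error
    end

    figure;
    subplot(1, 2, 1); plot(scales, bpp, 'o-'); xlabel('scale s'); ylabel('bits per pixel');
    subplot(1, 2, 2); plot(bpp, mse, 'o-');    xlabel('bits per pixel'); ylabel('MSE');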