Analysis of Huffman and Run-length encoding Compression Algorithms on Different Image Files


Aliyu Ishola Nasiru and Afolayan Tolulope Ambibola
Department of Information and Communication Science, University of Ilorin
Email: aliyu.in@unilorin.edu.ng

Abstract

Viewing and downloading uncompressed images from the internet on mobile devices can take a long time. This makes the data plan costly and leads to an unpleasant user experience. Usually, when a virtual server is hired to host a website with images, money is paid for the amount of storage and for the amount of data that the server sends and receives over a period of time. Image compression allows more images to be streamed to viewers without paying more for bandwidth. For this purpose, this paper studied and implemented two compression algorithms, Run-length Encoding (RLE) and Huffman coding, on four image file formats: Joint Photographic Experts Group (JPEG), Bitmap Image File (BMP), Graphics Interchange Format (GIF), and Portable Network Graphics (PNG), using C#. Experimental results show that RLE performs better than Huffman in compressing GIF, BMP, JPG, and PNG images, with a very low compression ratio and a high saving percentage. The only instance where Huffman performed better was on a BMP file with fewer repeating strings. RLE also compresses in less time than Huffman. It is recommended that Huffman and RLE be used when lossless compression is required. When Huffman and RLE are the available options, RLE could be considered first, but for complicated images likely to contain fewer repeating strings, Huffman should be considered.

Keywords: Run-length Encoding (RLE), Huffman Coding (HC), Image Compression, Image files.

1. Introduction

Pictures have been with us since the dawn of time. However, the way pictures are represented and displayed has changed greatly. Originally, every picture was unique, represented or displayed in a physical way, such as paint on a cave wall or etchings on stone. The use of digital images has increased at a rapid pace over the past decade due to computer-generated (synthetic) images, particularly for special effects in advertising and entertainment (Shankar, 2010). Image compression plays a major role in the digital domain. The more an image is compressed, the less storage it requires (Mahmud, 2012; Arora and Kumar, 2018).

Data compression is the science of reducing the amount of data used to convey information. It relies on the fact that information, by its nature, is not random but exhibits order and patterns. If that order and those patterns can be extracted, the essence of the information can be represented and transmitted using less data than the original would need. Then, at the receiving end, the original can be reconstructed exactly or closely (David and Giovanni, 2010). Basically, data compression is performed by a program that uses a formula or algorithm to determine how to shrink the size of particular data (Joshi, Raval, Dandawate, Joshi and Metkar, 2014). These programs find common blocks of data that can be omitted, shrunk, removed or substituted with smaller patterns. The more repeated blocks a program finds, the more it can compress (David and Giovanni, 2010). In this way, compressing data can save storage capacity, speed up file transfer, and decrease costs for storage hardware and network bandwidth (Joshi et al., 2014).

Images are composed of pixels, and each pixel represents the color at a single point in the image; an image can therefore consist of millions of pixels. The richer the image, the more pixels it has and the larger its size, and the more bandwidth and space it requires. An uncompressed image, that is, an image in its raw form, is quite expensive in terms of space and bandwidth. Hence, image compression, which removes redundant information from the image to save storage space and ease transfer, is needed.

Compression techniques can be categorized into two types: lossless and lossy. Lossless compression enables the original data to be reconstructed exactly as it was before compression, without the loss of a single bit. It is usually used for text, executable files, and medical data, where the loss of words or numbers could change the information or be harmful (Mozammil, Zakariya and Inamullah, 2012). Lossy compression, on the other hand, permanently eliminates bits of data that are redundant, unimportant or imperceptible. It is used in graphics, audio, video, and images, where the removal of some data bits has little or no discernible effect on the representation of the content (Joshi et al., 2014). This paper presents a comparative analysis of two lossless compression algorithms, Huffman and Run-length encoding, on the image file formats JPEG, BMP, GIF, and PNG using C#.

2. Related Work

RLE and Huffman encoding are usually compared on text; they are rarely compared on images. In this work, the comparison is based on their performance on different image file formats. Sharma (2010) studied various compression techniques and compared them based on their usage in different applications and their advantages and disadvantages. The work concludes that Huffman coding is easy to implement and produces optimal, compact codes, but is relatively slow, depends on a statistical model of the data, is difficult to decode because of the differing code lengths, and carries an overhead for the Huffman tree; it is used in JPEG. Run-length coding is simple to implement and fast to execute, but it compresses less effectively than other algorithms; it is used mostly for TIFF, BMP and PCX files. Ibrahim and Mustafa (2015) compared Huffman and RLE using a C++ program to compress a set of text files, and the results show that Huffman performs better than RLE on all types of text file. Shankar (2010) compared run-length coding and Huffman on a single image and concluded that RLE is very easy to implement but does not necessarily reduce the size of an image, and that a greater compression ratio can be achieved on a crowded image. Huffman coding can provide optimal compression and error-free decompression. Kodituwakku and Amarasinghe (2010) and Maan (2013) opined that in most cases Huffman performs better than RLE on text files and images. The interest of this work is to see on which image file formats Huffman outperforms RLE. Yuan, Guo, Sun and Ju (2016) proposed a power-efficient System-on-a-Chip test data compression method using alternating statistical run-length coding. Experimental results show a high compression ratio, low scan-in test power dissipation and little extra area overhead during System-on-a-Chip scan testing. Shukla and Gupta (2015) combined DCT and run-length encoding for image compression; the results show that high compression rates are achieved with a visually negligible difference between the compressed and original images.

3. Methodology

In this work, two popular data compression algorithms are implemented, analyzed and compared. Performance is measured using the following parameters: compression ratio, saving percentage, and computational time. The file formats used are JPEG, BMP, GIF, and PNG.
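The paper does not spell out formulas for its first two parameters. The sketch below assumes the definitions implied by the reported values: compression ratio is compressed size divided by original size, and saving percentage is (1 - ratio) x 100. The function names are illustrative, and Python is used for brevity rather than the C# of the authors' implementation.

```python
def compression_ratio(original_kb: float, compressed_kb: float) -> float:
    """Compressed size divided by original size (lower is better)."""
    if original_kb <= 0:
        raise ValueError("original size must be positive")
    return compressed_kb / original_kb


def saving_percentage(original_kb: float, compressed_kb: float) -> float:
    """Share of the original size removed by compression, in percent."""
    return (1 - compression_ratio(original_kb, compressed_kb)) * 100


# Example: a 147 KB file compressed to 105 KB
ratio = round(compression_ratio(147, 105), 2)   # 0.71
saving = round(saving_percentage(147, 105), 1)  # 28.6
```

Under these assumed definitions, the 147 KB file that Huffman reduces to 105 KB in Table 1 gives a ratio of 0.71 and a saving of 28.6%, which agrees with the values reported there.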

3.1 RLE Algorithm (Run Length Encoding)

Run-length encoding is a data compression algorithm that is supported by most bitmap file formats, such as TIFF, BMP, and PCX. RLE can compress any type of data regardless of its information content, but the content of the data affects the compression ratio that RLE achieves. Although most RLE algorithms cannot achieve the high compression ratios of more advanced compression methods, RLE is both easy to implement and quick to execute, making it a good alternative to either using a complex compression algorithm or leaving image data uncompressed. RLE works by reducing the physical size of a repeating string of characters. This repeating string, called a run, is typically encoded into two bytes (Ibrahim and Mustafa, 2015).

RLE Pseudocode

Given a binary image of dimension n x m, with a background pixel intensity of 0 and a foreground intensity of 1:

    set color to 0
    set count to 0
    for each pixel in the image
        if current pixel not equal to color
            write count
            set color to current pixel color
            set count to 1
        else
            increment count by 1
    if count not equal to 0
        write count    // record last run

3.2 Huffman Algorithm

The Huffman algorithm generates variable-length codes in such a way that high-frequency symbols are represented with a minimum number of bits and low-frequency symbols with a relatively high number of bits (Yadav, 2006). Huffman coding is an entropy encoding algorithm used for lossless data compression in computer science and information theory. The term refers to the use of a variable-length code table for encoding a source symbol (such as a character in a file), where the code table has been derived in a particular way from the estimated probability of occurrence of each possible value of the source symbol.
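The two-queue tree construction listed in the steps that follow can be sketched in Python. This is a minimal illustration and not the authors' C# implementation: the function names, the tuple layout for nodes, and the use of text bit strings instead of packed bits are all assumptions made for readability.

```python
from collections import Counter, deque


def build_huffman_codes(data):
    """Return {symbol: bit string} built with the two-queue method."""
    freq = Counter(data)
    if not freq:
        return {}
    # Node layout (assumed): (weight, symbol, left_child, right_child).
    # First queue: leaves in increasing order of weight.
    leaves = deque(sorted(((w, s, None, None) for s, w in freq.items()),
                          key=lambda node: node[0]))
    # Second queue: internal nodes, appended in non-decreasing weight order.
    internal = deque()

    def pop_lowest():
        # The lightest node is always at the front of one of the two queues.
        if not internal or (leaves and leaves[0][0] <= internal[0][0]):
            return leaves.popleft()
        return internal.popleft()

    while len(leaves) + len(internal) > 1:
        a = pop_lowest()
        b = pop_lowest()
        internal.append((a[0] + b[0], None, a, b))

    root = internal[0] if internal else leaves[0]
    codes = {}

    def walk(node, prefix):
        _, symbol, left, right = node
        if left is None:                    # leaf node
            codes[symbol] = prefix or "0"   # single-symbol input gets "0"
            return
        walk(left, prefix + "0")
        walk(right, prefix + "1")

    walk(root, "")
    return codes


def encode(data, codes):
    """Concatenate the code of every symbol in data."""
    return "".join(codes[s] for s in data)


def decode(bits, codes):
    """Invert encode(); relies on the codes being prefix-free."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out
```

For the string "aaaabbbccd", the most frequent symbol "a" receives a one-bit code and the whole string encodes to 19 bits, against 80 bits at one byte per character. A real file format would also have to store the code table or the symbol frequencies, which is the tree overhead mentioned in Section 2.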
The following are the steps of the Huffman algorithm:

Step 1: Compute or collect the total number of symbols and their relative frequencies.
Step 2: Arrange all the symbols in decreasing order of their frequencies.
Step 3: Construct the Huffman tree from the list of symbols.

Creating the tree:
1. Start with as many leaves as there are symbols.
2. Enqueue all leaf nodes into the first queue (by probability in increasing order, so that the least likely item is at the head of the queue).
3. While there is more than one node in the queues:
   1. Dequeue the two nodes with the lowest weight by examining the fronts of both queues.

   2. Create a new internal node, with the two just-removed nodes as children (either node can be either child) and the sum of their weights as the new weight.
   3. Enqueue the new node at the rear of the second queue.
4. The remaining node is the root node; the tree has now been generated.

Step 4: Assign the codes.

Figure 1: The flowchart of Run-length Encoding (Murray and vanRyper, 1996)

4. Result and Discussion

The lower the compression ratio, the better the algorithm performs. RLE has a very low compression ratio on the image files, except for Hildebrantmed.bmp, on which RLE has a high compression ratio (0.81) and Huffman a much lower one (0.43). From Figure 3, RLE has a better saving percentage than Huffman, although Huffman runs neck and neck with it on the GIF images; both algorithms do well on GIF. Huffman does not perform well on JPEG images compared to RLE. The saving percentage increases when the file size after compression is far smaller than the original file size; that is, the smaller the difference between the file sizes before and after compression, the lower the saving percentage. Hildebrantmed.bmp is still the only image on which Huffman has an advantage over RLE.

Table 4 shows the compressed file sizes. It shows that RLE clearly compresses better than Huffman, except on the large, complicated image with fewer repeating strings, where Huffman does better by compressing a 470 KB file to 204 KB, while RLE compresses the same picture to 381 KB. The compressed file sizes for the two algorithms on GIF images are very close, compared with the differences on the other image file formats.

Table 1: Compression rate for Huffman Coding

S/N  File Name        File Type  File Size (KB)  Huffman Output File Size (KB)  Huffman Compression Ratio  Saving Percentage (%)  Decompression Size (KB)
1    05 TIFF1d        Jpg        147             105                            0.71                       28.6                   147
2    BisonTeton       Jpg        93              75                             0.81                       19.4                   93
3    Yoyin            Jpg        2470            2250                           0.91                       8.9                    2470
4    Sciurusvulgaris  Png        1599            205                            0.13                       87.2                   1599
5    Lady             Png        514             51                             0.1                        90.1                   514
6    latest-1         Png        81              15                             0.19                       81.5                   81
7    Tiger-1          Bmp        655             67                             0.1                        89.8                   655
8    Hildebrantmed    Bmp        470             204                            0.43                       56.6                   470
9    Adafruit         Bmp        226             56                             0.25                       75.2                   226
10   Earth            Gif        1319            15                             0.01                       98.9                   1319
11   PeterPan         Gif        380             11                             0.03                       97.1                   380
12   SpongeBob        Gif        48              9                              0.19                       81.3                   48

Table 2: Compression rate for Run-length Encoding

S/N  File Name        File Type  File Size (KB)  RLE Output File Size (KB)  RLE Compression Ratio  RLE Saving Percentage (%)  RLE Decompression Size (KB)
1    05 TIFF1d        Jpg        147             21                         0.14                   85.7                       147
2    BisonTeton       Jpg        93              24                         0.26                   74.2                       93
3    Yoyin            Jpg        2470            243                        0.1                    90.1                       2470
4    Sciurusvulgaris  Png        1599            33                         0.2                    97.9                       1599
5    Lady             Png        514             15                         0.03                   97.1                       514
6    latest-1         Png        81              7                          0.09                   91.2                       81
7    Tiger-1          Bmp        655             17                         0.03                   97.4                       655
8    Hildebrantmed    Bmp        470             381                        0.81                   18.9                       470
9    Adafruit         Bmp        226             11                         0.05                   95.1                       226
10   Earth            Gif        1319            13                         0.01                   99                         1319
11   PeterPan         Gif        380             9                          0.02                   97.6                       380
12   SpongeBob        Gif        48              8                          0.12                   83.3                       48
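The gap between Hildebrantmed.bmp and the other files is consistent with how run-length encoding depends on repetition. The sketch below illustrates this under the assumption of the two-byte (count, value) run format mentioned in Section 3.1. It is not the authors' C# implementation; the function names and the cap of 255 on run length (so that the count fits in one byte) are choices made for this example.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte repeats and the count fits a byte.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


flat = bytes([7]) * 200      # one long run, like a uniform image region
varied = bytes(range(200))   # no repeated neighbours at all

flat_size = len(rle_encode(flat))      # 2 bytes
varied_size = len(rle_encode(varied))  # 400 bytes, twice the input
```

In this format, 200 identical bytes shrink to a single pair, while 200 distinct bytes double in size because every byte becomes a run of length one. Data with few repeating strings therefore gains little from RLE, which is when a frequency-based coder such as Huffman can do better.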

Figure 2: Compression ratio comparison between RLE and Huffman
[Chart: compression ratio (0 to 1) for RLE and Huffman, plotted for each of the twelve files by file size and file type]

Figure 3: Saving Percentage comparison between RLE and Huffman
[Chart: saving percentage (0 to 120) for RLE and Huffman, plotted for each of the twelve files by file size and file type]

Table 3: Compression time for RLE and Huffman

S/N  File Type  File Size (KB)  RLE Compression Time (secs)  Huffman Compression Time (secs)
1    Jpg        147             7                            25
2    Jpg        93              4                            23
3    Jpg        2470            264                          587
4    Png        1599            6                            125
5    Png        514             3                            22
6    Png        81              2                            5
7    Bmp        655             5                            14
8    Bmp        470             4                            37
9    Bmp        226             2                            7
10   Gif        1319            2                            8
11   Gif        380             2                            7
12   Gif        48              1                            4

Table 3 compares the compression times of the two algorithms. As it shows, RLE compresses faster than Huffman on every image file, regardless of size.

Table 4: Compression Output File size by both RLE and Huffman

S/N  File Type  File Size (KB)  RLE Output File Size (KB)  Huffman Output File Size (KB)
1    Jpg        147             21                         105
2    Jpg        93              24                         75
3    Jpg        2470            243                        2250
4    Png        1599            33                         205
5    Png        514             15                         51
6    Png        81              7                          15
7    Bmp        655             17                         67
8    Bmp        470             381                        204
9    Bmp        226             11                         56
10   Gif        1319            13                         15
11   Gif        380             9                          11
12   Gif        48              8                          9

Figure 4: JPG image after compression and decompression with Huffman

Figure 5: BMP image after compression and decompression with RLE

5. Conclusion

This paper studied and implemented two well-known compression algorithms, Run-length Encoding (RLE) and Huffman Coding (HC). Both algorithms were tested on the following image file types: GIF, BMP, JPG, and PNG. Experimental results showed that RLE performs better than Huffman in compressing GIF, BMP, JPG, and PNG images, with a very low compression ratio and a high saving percentage. The only instance where Huffman performed better was on a BMP file with fewer repeating strings. RLE also compresses in less time than Huffman. All things being equal, it is concluded that RLE performs better than Huffman on the stipulated parameters, which are compression ratio, saving percentage, compressed file size, and compression time. In the future, we intend to implement the two algorithms on various video file formats.

REFERENCES

Arora, S., & Kumar, G. (2018). Review of Image Compression Techniques. International Journal of Recent Research, 5(1), 185-188.
David, S., & Giovanni, M. (2010). Handbook of Data Compression. New York: Springer.
Ibrahim, A. M. A., & Mustafa, M. E. (2015). Comparison between (RLE and Huffman) algorithms for lossless data compression. IJITR, 3(1), 1808-1812.
Joshi, M. A., Raval, M. S., Dandawate, Y. H., Joshi, K. R., & Metkar, S. P. (2014). Image and Video Compression: Fundamentals, Techniques, and Applications. CRC Press.
Kodituwakku, S. R., & Amarasinghe, U. S. (2010). Comparison of lossless data compression algorithms for text data. Indian Journal of Computer Science and Engineering, 1(4), 416-425.
Maan, A. J. (2013). Analysis and comparison of algorithms for lossless data compression. International Journal of Information and Computation Technology, 3(3), 139-146.
Mahmud, S. (2012). An improved data compression method for general data. International Journal of Scientific & Engineering Research, 3(3), 2.
Zakariya, S. M., & Inamullah, M. (2012). Analysis of video compression algorithms on different video files. In Computational Intelligence and Communication Networks (CICN), 2012 Fourth International Conference on (pp. 257-262). IEEE.
Murray, J. D., & vanRyper, W. (1996). Encyclopedia of Graphics File Formats: The Complete Reference on CD-ROM with Links to Internet Resources. O'Reilly.
Run-length Encoding Pseudocode. Retrieved 12 May 2017 from http://www.cs.unca.edu/~reiser/imaging/rle.html
Shankar, U. B. (2010). Image compression techniques. International Journal of Information Technology and Knowledge Management, 2(2), 265-269.
Sharma, M. (2010). Compression using Huffman coding. IJCSNS International Journal of Computer Science and Network Security, 10(5), 133-141.
Shukla, R., & Gupta, N. K. (2015). Image Compression through DCT and Huffman Coding Technique. International Journal of Current Engineering and Technology, 5(3), 1942-1946.
Yadav, D. S. (2006). Foundations of Information Technology. New Delhi: New Age International.
Yuan, H., Guo, K., Sun, X., & Ju, Z. (2016). A Power Efficient Test Data Compression Method for SoC using Alternating Statistical Run-Length Coding. Journal of Electronic Testing, 32(1), 59-68.