# Lossless Compression Algorithms



Multimedia Data Compression, Part I. Chapter 7: Lossless Compression Algorithms.

Chapter 7 Lossless Compression Algorithms — outline:

1. Introduction
2. Basics of Information Theory
3. Lossless Compression Algorithms
   - Fixed-length coding: run-length coding, differential coding, dictionary-based coding
   - Variable-length coding: Shannon-Fano algorithm, Huffman coding algorithm

Introduction. A digital file (video, image, or audio) can easily become very large, so we face huge volumes of multimedia data. It can be useful or necessary to compress these files for ease of storage or delivery: we need more efficient data storage, processing, and transmission. However, while compression can save space or assist delivery, it can slow or delay the opening of a file, since the file must be decompressed before it is displayed. Some forms of compression also compromise the quality of the file.

Introduction. Compression: the process of coding that effectively reduces the total number of bits needed to represent certain information. Fig. 7.1: A General Data Compression Scheme.

Introduction. The compression ratio denotes the relation between the size of the original data before compression and the size of the compressed data; it therefore rates the effectiveness of a compression system in terms of its data-reduction capability:

compression ratio = B0 / B1

where B0 is the number of bits before compression and B1 is the number of bits after compression. In general, we desire any codec (encoder/decoder scheme) to have a compression ratio much larger than 1. The higher the compression ratio, the better the lossless compression scheme.

Introduction. If the compression and decompression processes induce no information loss, the compression scheme is lossless; otherwise, it is lossy. Lossy compression means that the quality of the file is reduced: the right side of the accompanying image was saved using lossy compression and does not look as good as the left.

Introduction: Lossless Compression Algorithms. As the name implies, there is no data loss in this type of compression: the decompressed data are identical to the original. The compression and decompression algorithms are exact inverses of each other. The main mechanism is removing redundant data during compression and restoring it during decompression. Advantages: the original data are recovered exactly. Disadvantages: the reduction in size is small, and sometimes the size can even increase instead of decrease.

Introduction: Lossy Compression Algorithms. This technique loses data from the original source while trying to keep the perceived quality almost the same. The compression ratio can be very high, often around 10. It discards information that is not sensitive to the human eye, so the compressed media are no longer identical to the media that existed before compression. Advantages: it can reduce file size much more than lossless compression. Disadvantages: the original file cannot be recovered after decompression.

Motivating Example. Transmit the data {250, 251, 251, 252, 253, 253, 254, 255} over the network. Rewriting the sequence as 8-bit binary vectors {11111010, 11111011, 11111011, 11111100, 11111101, 11111101, 11111110, 11111111} requires 8 × 8 bits = 64 bits in total. But the available bandwidth is limited: only 16 bits are available for the transmission of the data. Compression is necessary!

Lossy Compression. Encode: drop the six least significant bits of each value, transmitting only the 2 most significant bits. Encoded data: 8 × 2 bits = 16 bits.

Lossless Compression. Encode the differences between successive values: the first value in full (8 bits), then each of the 7 differences in 1 bit. Encoded data: 8 bits + 7 × 1 bit = 15 bits.
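A minimal Python sketch of this difference scheme (it assumes, as in the example above, that every successive difference fits in a single bit):

```python
# Difference coding sketch: send the first value in 8 bits, then each
# successive difference in 1 bit (every difference here is 0 or 1).
data = [250, 251, 251, 252, 253, 253, 254, 255]

first = format(data[0], "08b")                   # 8 bits for the reference value
diffs = [b - a for a, b in zip(data, data[1:])]  # 7 differences, each 0 or 1
bitstream = first + "".join(str(d) for d in diffs)

print(bitstream)       # '11111010' + '1011011'
print(len(bitstream))  # 15 bits, versus 64 bits uncompressed
```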

Lossless Compression Algorithms

Basics of Information Theory: Entropy. The entropy η of an information source with alphabet S = {s1, s2, ..., sn} is

η = H(S) = Σ_{i=1}^{n} p_i log2(1/p_i) = −Σ_{i=1}^{n} p_i log2 p_i    (7.2)

where p_i is the probability that symbol s_i occurs in S. The quantity

log2(1/p_i)    (7.3)

indicates the amount of information (the self-information defined by Shannon) contained in s_i, which corresponds to the number of bits needed to encode s_i.

Basics of Information Theory: Distribution of Gray-Level Intensities. For example, in an image with a uniform distribution of gray-level intensities, p_i = 1/256 for all i, so the number of bits needed to code each gray level is 8, and the entropy of this image is 8 (Eq. 7.4). Fig. 7.2: Histograms for Two Gray-Level Images. Fig. 7.2(a) shows the histogram of an image with a uniform distribution of gray-level intensities, i.e., p_i = 1/256. Fig. 7.2(b) shows the histogram of an image with only two possible intensity values; its entropy is much lower, at most 1 bit per pixel.
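Both cases can be checked directly from the entropy formula; a short sketch:

```python
import math

def entropy(probs):
    """Shannon entropy in bits/symbol: H = -sum(p * log2 p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform 256-level image: every gray level equally likely -> 8 bits.
print(entropy([1 / 256] * 256))  # 8.0

# A two-valued image needs at most 1 bit per pixel.
print(entropy([0.5, 0.5]))       # 1.0
```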

Basics of Information Theory: Entropy and Code Length. The most significant interpretation of entropy in the context of lossless compression is that it represents the average amount of information contained (in bits) per symbol of the source S. Entropy therefore gives a lower bound on the average number of bits per symbol required to represent the source information: no lossless code can use fewer bits per symbol, on average, than the entropy.

Basics of Information Theory: Entropy Example. A. What is the entropy of the image below, where the numbers (0, 20, 50, 99) denote the gray-level intensities?

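The image itself is not reproduced in this transcript, so as an illustration, here is the entropy computation for a hypothetical 64-pixel image with counts 16, 8, 8, and 32 for the four intensities:

```python
import math

# Hypothetical pixel counts for the four intensities (the image itself
# is not reproduced here); a 64-pixel image is assumed.
counts = {0: 16, 20: 8, 50: 8, 99: 32}
total = sum(counts.values())

eta = -sum((c / total) * math.log2(c / total) for c in counts.values())
print(eta)  # 1.75 bits/symbol for this assumed distribution
```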

Lossless Compression Algorithms:

- Fixed-length coding
  - Run-length coding
  - Differential coding
  - Dictionary-based coding
- Variable-length coding
  - Shannon-Fano coding
  - Huffman coding

Fixed-length coding: the length of each codeword is fixed.
1. Run-length coding
2. Differential coding
3. Dictionary-based coding

Fixed-length coding: 1. Run-length coding. Run-length coding is a data compression algorithm that encodes long runs of repeating items by sending only one item from the run together with a counter showing how many times the item is repeated. Unfortunately, this technique is nearly useless for compressing natural-language text, which rarely contains long runs of repeated characters. On the other hand, it is useful for image compression, because images often contain long runs of pixels with identical color.

Fixed-length coding: 1. Run-length coding. For example, the string AAABBCDDDD would be encoded as 3A2B1C4D, and aaaabbaba as 4a2b1a1b1a. A real-life example where run-length encoding is quite effective is the fax machine. Most faxes are white sheets with occasional black text, so a run-length encoding scheme can take each line and transmit a code for white followed by the number of pixels, then the code for black and the number of pixels, and so on. This method must be used carefully: if there is not much repetition in the data, run-length encoding can actually increase the size of the file.

Fixed-length coding: 1. Run-length coding (example). Original data: AAAAABBBAAAAAAAABBBB. Using ASCII, B0 = 20 × 8 bits = 20 × 1 byte = 20 bytes. Run-length coding result: 5A3B8A4B, so the compressed data occupy 8 × 1 byte = 8 bytes < 20 bytes. Compression ratio: 20/8 = 2.5 > 1, which is good.

Fixed-length coding: 1. Run-length coding (example). Extreme cases. Best case: AAAAAAAA → 8A, compression ratio 8/2 = 4. Worst case: ABABABAB → 1A1B1A1B1A1B1A1B, compression ratio 8/16 = 0.5. Negative compression: the resulting compressed file is larger than the original. Example: for the original data ABABBABCABABBA, run-length coding gives 1A1B1A2B1A1B1C1A1B1A2B1A, so the compression ratio is 14/24 ≈ 0.58, i.e., negative compression.
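The examples above can be reproduced with a small sketch (the decoder assumes single-digit run lengths, as in these examples):

```python
from itertools import groupby

def rle_encode(s):
    """Replace each run with its length followed by the symbol."""
    return "".join(f"{len(list(g))}{ch}" for ch, g in groupby(s))

def rle_decode(encoded):
    """Inverse, assuming single-digit run lengths."""
    return "".join(ch * int(count)
                   for count, ch in zip(encoded[::2], encoded[1::2]))

original = "AAAAABBBAAAAAAAABBBB"
packed = rle_encode(original)
print(packed)                       # 5A3B8A4B
print(len(original) / len(packed))  # compression ratio 20/8 = 2.5
assert rle_decode(packed) == original

print(rle_encode("ABABABAB"))       # 1A1B1A1B1A1B1A1B -- negative compression
```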

Fixed-length coding: 2. Differential coding. In this method, a reference symbol is placed first. Then, for each symbol in the data, we store the difference between that symbol and the reference symbol. For example, using A as the reference symbol, the string AAABBCDDDD would be encoded as A0001123333, since A has a difference of 0 from the reference symbol, B a difference of 1, and so on.
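A sketch of this reference-symbol scheme: emit the reference, then each symbol's alphabetic distance from it.

```python
def diff_encode(text, ref="A"):
    """Differential coding against a fixed reference symbol."""
    return ref + "".join(str(ord(ch) - ord(ref)) for ch in text)

print(diff_encode("AAABBCDDDD"))  # A0001123333
```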

Fixed-length coding: 3. Dictionary-based coding. One of the best-known dictionary-based encoding algorithms is the Lempel-Ziv (LZ) compression algorithm. In this method, a dictionary (table) of variable-length strings (common phrases) is built; it contains almost every string expected to occur in the data. When any of these strings occurs in the data, it is replaced with the corresponding index into the dictionary.

Fixed-length coding: 3. Dictionary-based coding. In this method, instead of working with individual characters of text data, we treat each word as a string and output its index in the dictionary. For example, suppose the word "compression" has index 4978 in one particular dictionary. To compress a body of text, each time the string "compression" appears, it would be replaced by 4978.
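A minimal word-dictionary sketch of this idea (the indices other than 4978 are made up for illustration; a real LZ coder builds its table adaptively from the data):

```python
# Hypothetical word dictionary; only 4978 for "compression" comes from
# the example above, the other indices are invented for illustration.
dictionary = {"data": 1021, "compression": 4978, "saves": 2310, "space": 3477}

def dict_encode(text):
    """Replace each known word by its dictionary index; pass others through."""
    return [dictionary.get(word, word) for word in text.split()]

print(dict_encode("data compression saves space"))
# [1021, 4978, 2310, 3477]
```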

Variable-length coding: the length of the codeword is variable.
1. Shannon-Fano Algorithm
2. Huffman Coding Algorithm

Variable-length coding: 1. Shannon-Fano Algorithm, a top-down approach.
1. Sort the symbols according to the frequency counts of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same total count, until every part contains only one symbol.

An example: coding of HELLO. Frequency counts of the symbols in HELLO:

| Symbol | H | E | L | O |
|--------|---|---|---|---|
| Count  | 1 | 1 | 2 | 1 |

Fig. 7.3: Coding Tree for HELLO by Shannon-Fano.

Table 7.1: Result of performing Shannon-Fano on HELLO

| Symbol | Count | log2(1/p_i) | Code | # of bits used |
|--------|-------|-------------|------|----------------|
| L      | 2     | 1.32        | 0    | 2 × 1 = 2      |
| H      | 1     | 2.32        | 10   | 2              |
| E      | 1     | 2.32        | 110  | 3              |
| O      | 1     | 2.32        | 111  | 3              |

Total # of bits: 10. On average the code uses 10/5 = 2 bits per symbol, close to the lower bound of 1.92, so the result is satisfactory. Here p_i is the probability that symbol s_i occurs in S, and log2(1/p_i) indicates the amount of information (the self-information defined by Shannon) contained in s_i, i.e., the number of bits needed to encode s_i. The entropy is

η = p_L log2(1/p_L) + p_H log2(1/p_H) + p_E log2(1/p_E) + p_O log2(1/p_O)
  = 0.4 × 1.32 + 0.2 × 2.32 + 0.2 × 2.32 + 0.2 × 2.32 ≈ 1.92

Fig. 7.4: Another coding tree for HELLO by Shannon-Fano.

| Symbol | Count | log2(1/p_i) | Code | # of bits used |
|--------|-------|-------------|------|----------------|
| L      | 2     | 1.32        | 00   | 4              |
| H      | 1     | 2.32        | 01   | 2              |
| E      | 1     | 2.32        | 10   | 2              |
| O      | 1     | 2.32        | 11   | 2              |

Total # of bits: 10.

Example. Suppose we have a message consisting of [A(4), B(2), C(2), D(1), E(1)]. Draw a Shannon-Fano tree for this distribution.

Steps. First, sort the symbols by frequency. Second, split the collection into two groups whose total frequencies are as nearly equal as possible. Third, repeat steps 1 and 2 within each subtree. Splitting always comes after sorting.

Steps. Only two split combinations are possible at the first level: {A(4)} versus {B(2), C(2), D(1), E(1)}, or {A(4), B(2)} versus {C(2), D(1), E(1)}.

The tree chosen splits {A} from {B, C, D, E}: the root (10) divides into A(4) and (B,C,D,E)(6); then B(2) and (C,D,E)(4); then C(2) and (D,E)(2); finally D(1) and E(1).

| Symbol | Count | p_i | Code | Subtotal (# of bits) |
|--------|-------|-----|------|----------------------|
| A      | 4     | 0.4 | 0    | 4 × 1 = 4            |
| B      | 2     | 0.2 | 10   | 2 × 2 = 4            |
| C      | 2     | 0.2 | 110  | 2 × 3 = 6            |
| D      | 1     | 0.1 | 1110 | 1 × 4 = 4            |
| E      | 1     | 0.1 | 1111 | 1 × 4 = 4            |
| Totals | 10    |     |      | 22 bits              |

Average code length = 22/10 = 2.2 bits/symbol. Compression ratio = 8/2.2 ≈ 3.64.
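A sketch of the top-down procedure, with ties in the split broken toward the earlier position (which reproduces this example's tree):

```python
def shannon_fano(symbols, prefix=""):
    """Top-down Shannon-Fano. symbols: list of (symbol, count),
    pre-sorted by descending count. Returns {symbol: codeword}."""
    if len(symbols) == 1:
        return {symbols[0][0]: prefix or "0"}
    total = sum(c for _, c in symbols)
    # pick the split that makes the two groups' totals most nearly equal
    best = min(range(1, len(symbols)),
               key=lambda k: abs(2 * sum(c for _, c in symbols[:k]) - total))
    codes = {}
    codes.update(shannon_fano(symbols[:best], prefix + "0"))
    codes.update(shannon_fano(symbols[best:], prefix + "1"))
    return codes

freqs = [("A", 4), ("B", 2), ("C", 2), ("D", 1), ("E", 1)]
codes = shannon_fano(freqs)
print(codes)  # {'A': '0', 'B': '10', 'C': '110', 'D': '1110', 'E': '1111'}

total_bits = sum(count * len(codes[s]) for s, count in freqs)
print(total_bits / 10)  # average code length 2.2 bits/symbol
```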

Variable-length coding: 2. Huffman Coding Algorithm, a bottom-up approach.
1. Initialization: put all symbols on a list sorted by their frequency counts.
2. Repeat until the list has only one symbol left:
   a. From the list, pick the two symbols with the lowest frequency counts. Form a Huffman subtree that has these two symbols as child nodes, and create a parent node.
   b. Assign the sum of the children's frequency counts to the parent and insert it into the list so that the order is maintained.
   c. Delete the children from the list.
3. Assign a codeword to each leaf based on the path from the root.

Fig. 7.5: Coding Tree for HELLO using the Huffman Algorithm. In the figure, new symbols P1, P2, P3 are created to refer to the parent nodes in the Huffman coding tree. The contents of the list are illustrated below:

- After initialization: L H E O
- After iteration (a): L P1 H
- After iteration (b): L P2
- After iteration (c): P3

Properties of Huffman Coding. Unique prefix property: no Huffman code is a prefix of any other Huffman code, which precludes any ambiguity in decoding. Optimality: the Huffman code is a minimum-redundancy code, proved optimal for a given data model (i.e., a given, accurate probability distribution): the two least frequent symbols have codes of the same length, differing only in the last bit, and symbols that occur more frequently have Huffman codes no longer than those of symbols that occur less frequently.
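The bottom-up procedure can be sketched with a priority queue. The exact code assignments depend on how ties among equal counts are broken, but the total cost does not: for HELLO it is always 10 bits.

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Bottom-up Huffman coding: repeatedly merge the two nodes with the
    lowest counts. freqs: mapping symbol -> count. Returns symbol -> code."""
    # each heap entry: (count, tie-breaker, {symbol: partial code})
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)   # lowest count
        c2, _, right = heapq.heappop(heap)  # second lowest
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes(Counter("HELLO"))
total = sum(len(codes[ch]) for ch in "HELLO")
print(codes)  # exact codewords depend on tie-breaking among equal counts
print(total)  # 10 bits in total for HELLO, matching the table above
```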

Huffman Example (1). A. What is the entropy (η) of the image below, where the numbers (0, 20, 50, 99) denote the gray-level intensities? B. Show step by step how to construct the Huffman tree to encode the four intensity values of this image, and show the resulting code for each intensity value. C. What is the average number of bits needed per pixel using your Huffman code? How does it compare to η?



Huffman Example (2). The list after each merge:

- A(64), D(16), B(13), C(12), E(9), F(5)
- A(64), D(16), P1(14), B(13), C(12)
- A(64), P2(25), D(16), P1(14)
- A(64), P3(30), P2(25)
- A(64), P4(55)
- P5(119)

Huffman Example (2). Average code length = Σ P_i L_i, where P_i is the probability of a symbol and L_i is its code length. From the tree, the code lengths are A: 1, B: 3, C: 3, D: 3, E: 4, F: 4, so

avg = (64/119) × 1 + (13/119) × 3 + (12/119) × 3 + (16/119) × 3 + (9/119) × 4 + (5/119) × 4 = 243/119 ≈ 2.04 bits/symbol

The entropy is η ≈ 2.03 bits/symbol, so the average code length is very close to this lower bound. Compression ratio = original symbol length before compression / average code length after compression = 8 / 2.04 ≈ 3.92, because ASCII stores each symbol in 1 byte, which is 8 bits.
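These figures can be verified directly from the counts and the code lengths read off the tree:

```python
import math

# Counts from Huffman Example (2) and the code lengths read off its tree
counts = {"A": 64, "D": 16, "B": 13, "C": 12, "E": 9, "F": 5}
lengths = {"A": 1, "D": 3, "B": 3, "C": 3, "E": 4, "F": 4}
total = sum(counts.values())  # 119 symbols

avg = sum(counts[s] * lengths[s] for s in counts) / total
eta = -sum((c / total) * math.log2(c / total) for c in counts.values())

print(round(avg, 3))      # 2.042 bits/symbol (average code length)
print(round(eta, 3))      # 2.027 bits/symbol (entropy, the lower bound)
print(round(8 / avg, 2))  # 3.92 (compression ratio against 8-bit ASCII)
```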

Decompression. Decompression is straightforward using the given tree: read the bit stream and walk from the root, emitting a symbol each time a leaf is reached. Decoding proceeds A, AC, ACE, ACEE, ACEED = ACEED.


### A New Compression Method Strictly for English Textual Data

A New Compression Method Strictly for English Textual Data Sabina Priyadarshini Department of Computer Science and Engineering Birla Institute of Technology Abstract - Data compression is a requirement

### Image coding and compression

Image coding and compression Robin Strand Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Today Information and Data Redundancy Image Quality Compression Coding

### Horn Formulae. CS124 Course Notes 8 Spring 2018

CS124 Course Notes 8 Spring 2018 In today s lecture we will be looking a bit more closely at the Greedy approach to designing algorithms. As we will see, sometimes it works, and sometimes even when it

### Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 26 Source Coding (Part 1) Hello everyone, we will start a new module today

### Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 29 Source Coding (Part-4) We have already had 3 classes on source coding

### Abdullah-Al Mamun. CSE 5095 Yufeng Wu Spring 2013

Abdullah-Al Mamun CSE 5095 Yufeng Wu Spring 2013 Introduction Data compression is the art of reducing the number of bits needed to store or transmit data Compression is closely related to decompression

### CS106B Handout 34 Autumn 2012 November 12 th, 2012 Data Compression and Huffman Encoding

CS6B Handout 34 Autumn 22 November 2 th, 22 Data Compression and Huffman Encoding Handout written by Julie Zelenski. In the early 98s, personal computers had hard disks that were no larger than MB; today,

### Encoding. A thesis submitted to the Graduate School of University of Cincinnati in

Lossless Data Compression for Security Purposes Using Huffman Encoding A thesis submitted to the Graduate School of University of Cincinnati in a partial fulfillment of requirements for the degree of Master

### Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey

Data Compression Media Signal Processing, Presentation 2 Presented By: Jahanzeb Farooq Michael Osadebey What is Data Compression? Definition -Reducing the amount of data required to represent a source

### Analysis of Parallelization Effects on Textual Data Compression

Analysis of Parallelization Effects on Textual Data GORAN MARTINOVIC, CASLAV LIVADA, DRAGO ZAGAR Faculty of Electrical Engineering Josip Juraj Strossmayer University of Osijek Kneza Trpimira 2b, 31000

### Data Compression Scheme of Dynamic Huffman Code for Different Languages

2011 International Conference on Information and Network Technology IPCSIT vol.4 (2011) (2011) IACSIT Press, Singapore Data Compression Scheme of Dynamic Huffman Code for Different Languages Shivani Pathak

### Algorithms Dr. Haim Levkowitz

91.503 Algorithms Dr. Haim Levkowitz Fall 2007 Lecture 4 Tuesday, 25 Sep 2007 Design Patterns for Optimization Problems Greedy Algorithms 1 Greedy Algorithms 2 What is Greedy Algorithm? Similar to dynamic

### CS/COE 1501

CS/COE 1501 www.cs.pitt.edu/~nlf4/cs1501/ Compression What is compression? Represent the same data using less storage space Can get more use out a disk of a given size Can get more use out of memory E.g.,

### Greedy Algorithms and Huffman Coding

Greedy Algorithms and Huffman Coding Henry Z. Lo June 10, 2014 1 Greedy Algorithms 1.1 Change making problem Problem 1. You have quarters, dimes, nickels, and pennies. amount, n, provide the least number

### EE-575 INFORMATION THEORY - SEM 092

EE-575 INFORMATION THEORY - SEM 092 Project Report on Lempel Ziv compression technique. Department of Electrical Engineering Prepared By: Mohammed Akber Ali Student ID # g200806120. ------------------------------------------------------------------------------------------------------------------------------------------

### Research Article Does an Arithmetic Coding Followed by Run-length Coding Enhance the Compression Ratio?

Research Journal of Applied Sciences, Engineering and Technology 10(7): 736-741, 2015 DOI:10.19026/rjaset.10.2425 ISSN: 2040-7459; e-issn: 2040-7467 2015 Maxwell Scientific Publication Corp. Submitted:

### A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm

International Journal of Engineering Research and General Science Volume 3, Issue 4, July-August, 15 ISSN 91-2730 A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm

### Source Coding Basics and Speech Coding. Yao Wang Polytechnic University, Brooklyn, NY11201

Source Coding Basics and Speech Coding Yao Wang Polytechnic University, Brooklyn, NY1121 http://eeweb.poly.edu/~yao Outline Why do we need to compress speech signals Basic components in a source coding

### CSC 310, Fall 2011 Solutions to Theory Assignment #1

CSC 310, Fall 2011 Solutions to Theory Assignment #1 Question 1 (15 marks): Consider a source with an alphabet of three symbols, a 1,a 2,a 3, with probabilities p 1,p 2,p 3. Suppose we use a code in which

### SIGNAL COMPRESSION Lecture Lempel-Ziv Coding

SIGNAL COMPRESSION Lecture 5 11.9.2007 Lempel-Ziv Coding Dictionary methods Ziv-Lempel 77 The gzip variant of Ziv-Lempel 77 Ziv-Lempel 78 The LZW variant of Ziv-Lempel 78 Asymptotic optimality of Ziv-Lempel

### 7.5 Dictionary-based Coding

7.5 Dictionary-based Coding LZW uses fixed-length code words to represent variable-length strings of symbols/characters that commonly occur together, e.g., words in English text LZW encoder and decoder

### Greedy Algorithms. Alexandra Stefan

Greedy Algorithms Alexandra Stefan 1 Greedy Method for Optimization Problems Greedy: take the action that is best now (out of the current options) it may cause you to miss the optimal solution You build

### Information Science 2

Information Science 2 - Path Lengths and Huffman s Algorithm- Week 06 College of Information Science and Engineering Ritsumeikan University Agenda l Review of Weeks 03-05 l Tree traversals and notations

### 6. Finding Efficient Compressions; Huffman and Hu-Tucker Algorithms

6. Finding Efficient Compressions; Huffman and Hu-Tucker Algorithms We now address the question: How do we find a code that uses the frequency information about k length patterns efficiently, to shorten