EE67I Multimedia Communication Systems Lecture 4

Lossless Compression

Basics of Information Theory

Compression is either lossless, in which no information is lost, or lossy, in which some information is lost. The compression ratio is B0/B1, where B0 is the number of bits before compression and B1 the number of bits after; a desirable compression ratio is one where B0/B1 is substantially greater than 1.

Information Theory

The entropy η of an information source with alphabet S = {s1, s2, ..., sn} is

η = Σ_{i=1..n} p_i log2(1/p_i)

where p_i is the probability that symbol s_i occurs in S. The term log2(1/p_i) indicates the amount of information (self-information) contained in s_i and corresponds to the number of bits needed to code s_i. Entropy is a measure of disorder in a system, so negative entropy corresponds to order being added to a system.

For an image whose histogram shows a uniform distribution of gray-level intensities, p_i = 1/256 for every level, and the entropy is

η = Σ_{i=1..256} (1/256) log2(256) = 8 bits.

The average code length l̄ of any lossless code is greater than or equal to the entropy: l̄ ≥ η.

For an image in which ⅔ of the pixels are bright and ⅓ are dark, the entropy is

η = (2/3) log2(3/2) + (1/3) log2(3) ≈ 0.92 bits.
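As a quick check of these two entropy figures, here is a short Python sketch (my own illustration, not part of the lecture):

import math

def entropy(probabilities):
    # Entropy in bits of a discrete probability distribution.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

uniform = [1 / 256] * 256       # flat histogram: every gray level equally likely
two_level = [2 / 3, 1 / 3]      # 2/3 of the pixels bright, 1/3 dark

print(entropy(uniform))         # 8.0 bits per symbol
print(entropy(two_level))       # ~0.918 bits per symbol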

Run-length Coding (RLC)

Coding is performed on groups in which the same symbol repeats continuously. A bi-level image (1-bit black-and-white pixels) contains monotone regions that can be coded with RLC: only the length of each run of a particular color needs to be coded.

Variable-Length Coding (VLC): Shannon-Fano Algorithm (top-down)

Steps:
1. Sort the symbols according to the frequency count of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same total count, until every part contains only one symbol.

The average code length of the resulting code can then be compared against the source entropy. Because the split into equal-count halves is not always unique, more than one valid coding tree can result (see the sketch below).
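Below is a minimal Shannon-Fano sketch, written for illustration only; the symbol counts are those of the string HELLO used later for Huffman coding, and the particular split heuristic is an assumption:

def shannon_fano(symbols):
    # symbols: list of (symbol, count), sorted by count in descending order.
    # Returns a dict mapping each symbol to its code string.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(count for _, count in symbols)
    # Find the split point where the two halves have the most similar totals.
    running, split, best_diff = 0, 1, total
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(total - 2 * running)
        if diff < best_diff:
            best_diff, split = diff, i
    codes = {}
    for sym, code in shannon_fano(symbols[:split]).items():
        codes[sym] = "0" + code          # first half gets prefix 0
    for sym, code in shannon_fano(symbols[split:]).items():
        codes[sym] = "1" + code          # second half gets prefix 1
    return codes

# Example: counts for the string "HELLO" (L occurs twice).
print(shannon_fano([("L", 2), ("H", 1), ("E", 1), ("O", 1)]))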

Huffman Coding (bottom-up)

Algorithm:
1. Initialization: put all symbols on a list sorted according to their frequency counts.
2. Repeat until the list has only one symbol left:
   a. From the list, pick the two symbols with the lowest frequency counts, form a subtree with them as children, and create a parent node for them.
   b. Assign the sum of the children's frequency counts to the parent node and insert it into the list so that the order is maintained.
   c. Delete the children from the list.
3. Assign a code word to each leaf according to its path from the root.

The algorithm is demonstrated using the string HELLO (see the sketch below).

Properties of Huffman coding:
1. Unique prefix property: no code is a prefix of any other code, which precludes ambiguity and makes decoding efficient.
2. Optimality (it is a minimum-redundancy code):
   a. The two least frequent symbols receive code words of the same length, differing only in their last bit.
   b. Frequently occurring symbols have shorter codes.
   c. The average code length is less than η + 1.

Extended Huffman Coding

For a symbol with a large probability, the amount of self-information is almost 0, so it is wasteful to spend a whole bit on it. To counteract this, a single code word is used to code a group of k consecutive symbols, forming an extended alphabet:
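As an illustration (not the lecture's worked figure), a short Python sketch that builds the Huffman tree bottom-up with a min-heap for the example string HELLO:

import heapq
from collections import Counter

def huffman_codes(text):
    # Build the tree bottom-up: repeatedly merge the two lowest-count nodes.
    counts = Counter(text)
    heap = [(c, i, sym) for i, (sym, c) in enumerate(counts.items())]  # (count, tie-break id, node)
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)                        # two lowest-frequency nodes...
        c2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next_id, (left, right)))  # ...get a common parent
        next_id += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse into both children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: assign the accumulated code
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

print(huffman_codes("HELLO"))
# One possible result: {'H': '00', 'E': '01', 'O': '10', 'L': '11'}. Ties may be broken
# differently, but the average code length is 2 bits either way, and η ≈ 1.92.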

The size of this extended alphabet is n^k. If k is relatively large (e.g., k ≥ 3), then for most practical applications where n >> 1, n^k is a very large number, implying a very large symbol table. The entropy of the extended source is k times the entropy of S, so the average number of bits needed per original symbol now satisfies η ≤ l̄ < η + 1/k.

Adaptive Huffman Coding

Regular Huffman coding needs prior statistical information, which is often not available in multimedia applications. This is an order-0 model, in which no contextual information from preceding symbols is maintained; an order-k model looks at the k preceding symbols for contextual information. In adaptive Huffman coding the probability distribution of the received symbols is updated on the fly, changing the probabilities assigned to each symbol. The algorithm is as follows:

Initial_code assigns symbols some initially agreed-upon codes without any prior knowledge of their frequencies.

Update_tree is a procedure for constructing an adaptive Huffman tree by incrementing the frequency counts of the symbols and updating the configuration of the tree.
o The Huffman tree must maintain the sibling property (all nodes are arranged in order of increasing counts); if this is about to be violated, a swap procedure is invoked.
o In the swap procedure, the farthest node with count N is swapped with the node whose count has just been increased to N + 1. If the swapped node is not a leaf node, its entire subtree moves with it during the swap.

The encoder and decoder must use the same Initial_code and Update_tree routines; a simplified sketch of this lock-step operation follows.
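Below is a minimal, runnable sketch of the adaptive idea. It is a deliberate simplification of the lecture's algorithm: instead of incrementally maintaining the sibling property with node swaps, it rebuilds a Huffman code from the updated counts after every symbol. The alphabet, the initial counts of 1, and the helper names are assumptions for illustration only.

import heapq

def build_codes(counts):
    # Static Huffman codes for the given symbol counts (deterministic tie-breaking).
    heap = [(c, sym, sym) for sym, c in sorted(counts.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, k1, left = heapq.heappop(heap)
        c2, k2, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, min(k1, k2), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

def encode(message, alphabet):
    counts = {s: 1 for s in alphabet}        # agreed-upon initial statistics (Initial_code's role)
    bits = ""
    for s in message:
        bits += build_codes(counts)[s]       # code the symbol with the current model
        counts[s] += 1                       # update the model (Update_tree's role)
    return bits

def decode(bits, alphabet, n_symbols):
    counts = {s: 1 for s in alphabet}        # identical initial statistics
    out = ""
    for _ in range(n_symbols):
        inverse = {code: sym for sym, code in build_codes(counts).items()}
        buf = ""
        while buf not in inverse:            # prefix-free, so the first match is the symbol
            buf, bits = buf + bits[0], bits[1:]
        out += inverse[buf]
        counts[inverse[buf]] += 1            # identical update keeps both sides in sync
    return out

message = "AADCCDD"
coded = encode(message, "ABCD")
print(coded, decode(coded, "ABCD", len(message)) == message)   # round-trips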

Example of Coding AADCCDD

Dictionary-based Coding

Lempel-Ziv-Welch (LZW) employs an adaptive, dictionary-based compression technique. It uses fixed-length code words to represent variable-length strings, and it builds the dictionary dynamically as data are received, so that the encoder and decoder construct the same dictionary. Longer and longer repeated patterns are placed in the dictionary, and their indices are transmitted rather than the strings themselves.

Example: compression of the string ABABBABCABABBA (a sketch follows). LZW typically uses 12-bit code lengths, so the dictionary contains 4096 entries.
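A minimal LZW encoder sketch for the example string (illustration only; the initial dictionary here holds the single characters A=0, B=1, C=2, X=3, an assumed numbering, and the codes are plain integers rather than fixed 12-bit words):

def lzw_encode(text, alphabet="ABCX"):
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    s, output = "", []
    for c in text:
        if s + c in dictionary:
            s = s + c                              # keep extending the current match
        else:
            output.append(dictionary[s])           # emit the code of the longest match
            dictionary[s + c] = len(dictionary)    # add the new, longer pattern
            s = c
    if s:
        output.append(dictionary[s])               # flush the final match
    return output

print(lzw_encode("ABABBABCABABBA"))                # -> [0, 1, 4, 5, 1, 2, 4, 6, 0]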

LZW Decompression Algorithm

The decoder reads one code at a time, outputs the corresponding dictionary string, and rebuilds the same dictionary entries that the encoder created.

Example of LZW decompression. Example where LZW encounters difficulty: ABABBABCABBABBAX. With this input, a code can arrive before the corresponding entry has been added to the decoder's dictionary, so the decompression routine must be modified to handle this case, as in the sketch below.
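A matching decoder sketch, including the standard handling of the special case above, where a code arrives that is not yet in the decoder's dictionary (it can only be the entry the encoder has just created: the previous string plus its own first character). The concrete code sequence below was produced by the encoder sketch above and is an assumption of this illustration, not the lecture's table:

def lzw_decode(codes, alphabet="ABCX"):
    dictionary = {i: ch for i, ch in enumerate(alphabet)}
    prev = dictionary[codes[0]]
    output = [prev]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                                        # special case: code not yet in dictionary
            entry = prev + prev[0]                   # must be previous string + its first character
        output.append(entry)
        dictionary[len(dictionary)] = prev + entry[0]   # mirror the entry the encoder added
        prev = entry
    return "".join(output)

codes = [0, 1, 4, 5, 1, 2, 6, 10, 3]                 # lzw_encode("ABABBABCABBABBAX")
print(lzw_decode(codes))                             # -> ABABBABCABBABBAX (code 10 triggers the special case)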

The LZW code words (e.g., 12 bits) are longer than the 8-bit ASCII characters they replace, so if the data contain little redundancy, data expansion occurs instead of data reduction. The V.42bis modem standard compensates by having two modes, transparent and compressed, and switches to transparent (uncompressed) mode when data expansion is detected. The dictionary size is fixed, so adaptation stops once the dictionary is full; when this happens, dictionary entries above a threshold are flushed.

Arithmetic Coding

Arithmetic coding treats the whole message as one unit; in practice, the input data is broken into chunks to avoid error propagation. A message is represented by a half-open interval [a, b), where a and b are real numbers between 0 and 1. The interval is initially [0, 1); it shortens as the message gets longer, and the number of bits needed to represent the interval increases. The encoder algorithm is illustrated below for the alphabet {A, B, C, D, E, F, $}, where $ is the terminator symbol and each symbol is assigned a fixed probability.
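A sketch of the interval-narrowing encoder. The probability table is an assumption chosen to be consistent with the CAEE$ code value quoted below; the lecture's own table may differ:

PROBS = {"A": 0.2, "B": 0.1, "C": 0.2, "D": 0.05, "E": 0.3, "F": 0.05, "$": 0.1}

def ranges(probs):
    # Cumulative sub-range [range_low, range_high) of each symbol within [0, 1).
    out, acc = {}, 0.0
    for sym, p in probs.items():
        out[sym] = (acc, acc + p)
        acc += p
    return out

def arithmetic_encode(message, probs=PROBS):
    low, high = 0.0, 1.0
    cum = ranges(probs)
    for sym in message:
        rng = high - low
        sym_low, sym_high = cum[sym]
        high = low + rng * sym_high     # shrink the interval to the symbol's sub-range
        low = low + rng * sym_low
    return low, high                    # any number in [low, high) identifies the message

print(arithmetic_encode("CAEE$"))       # ~(0.33184, 0.3322), final range 0.00036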

The encoding of the symbol sequence CAEE$ narrows the interval step by step as described above. Because each symbol scales the interval by its own probability, the final range equals the product of the probabilities of the coded symbols, i.e., the probability of that symbol sequence occurring in succession. The code word is then generated from the final interval (see the sketch following this paragraph). Here value(code) denotes the value of the binary fraction code; for example, 0.1 in binary equals 0.5 in decimal. For this example the code generator yields 0.01010101, which equals 0.33203125 in decimal and lies inside the final interval. In the worst case, the shortest code word in arithmetic coding requires k bits to encode a sequence of symbols, where k is the ceiling of log2(1/range), range = P_1 · P_2 · ... · P_n is the final range generated by the encoder, and P_i is the probability of the i-th coded symbol. When the message is long, the difference between log2(1/range) and the ceiling of log2(1/range) is negligible.
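A Python rendering of this code-word generation step, for illustration: starting from the binary fraction 0., bits are appended one at a time; a bit is kept at 1 only while the value stays below the upper end of the interval, and generation stops once the value reaches the lower end.

def generate_codeword(low, high):
    # Emit binary fraction bits until value(code) lands inside [low, high).
    bits, value, weight = "", 0.0, 0.5      # weight of the next fraction bit: 2^-1, 2^-2, ...
    while value < low:
        if value + weight < high:           # a 1 here keeps value(code) below the upper bound
            bits += "1"
            value += weight
        else:
            bits += "0"
        weight /= 2
    return bits, value

print(generate_codeword(0.33184, 0.3322))   # final interval from the encoder sketch above
# -> ('01010101', 0.33203125)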

The arithmetic coding decoder reverses this process: at each step it finds the symbol whose sub-range contains the current value, outputs that symbol, and removes its effect from the value; applied to the example above, it recovers CAEE$. It is possible to rescale the intervals and use only integer arithmetic, which makes implementation more practical. If the channel or network is noisy, the terminator can be corrupted, leading to the encoder and decoder going out of sync.

Lossless Image Compression

Differential Coding of Images

For images we deal with values in two dimensions (x, y). Because of spatial continuity, the gray-level intensities of background and foreground objects tend to change relatively slowly across the image frame. Given an original image I(x, y), the difference operator is defined as

d(x, y) = I(x, y) - I(x - 1, y),

a simple approximation of the partial differential operator ∂/∂x applied to an image defined in terms of x and y. Using the 2D Laplacian operator instead, the difference image d(x, y) is

d(x, y) = 4 I(x, y) - I(x, y - 1) - I(x, y + 1) - I(x + 1, y) - I(x - 1, y).

The histogram of an original image is broader than the histogram of its difference image, indicating higher entropy in the original; thus compression works better on a difference image.
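For illustration, a small NumPy sketch that forms the horizontal difference image and compares histogram entropies; the synthetic slowly-varying image is an assumed stand-in for a real photograph:

import numpy as np

def entropy(values):
    # Entropy in bits of the histogram of `values`.
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic image with slowly varying intensities.
x = np.arange(256)
image = ((x[None, :] + x[:, None]) // 4 % 256).astype(np.int16)

# d(x, y) = I(x, y) - I(x - 1, y) along each row.
diff = np.diff(image, axis=1)

print(entropy(image), entropy(diff))   # the difference image has a much narrower histogram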

Lossless JPEG

Lossless JPEG is invoked by choosing a 100% quality factor in JPEG. Two steps are involved: forming a differential prediction and encoding.

1. Prediction: a predicted value is formed from up to three neighboring pixels (the pixel to the left, the pixel above, and the pixel diagonally above-left) using one of seven schemes from the table below.
2. Encoding: the encoder compares the prediction with the actual pixel value and encodes the difference using one of the lossless techniques described above, e.g., Huffman coding.

Lossless JPEG yields a relatively low compression ratio, which makes it impractical for most multimedia applications. The table below compares the performance of different lossless compression techniques on different images.
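For reference, the seven standard Lossless JPEG predictors expressed as a small Python sketch; the neighbours are A (left), B (above) and C (above-left), and the pixel values in the usage example are made up:

PREDICTORS = {
    1: lambda A, B, C: A,
    2: lambda A, B, C: B,
    3: lambda A, B, C: C,
    4: lambda A, B, C: A + B - C,
    5: lambda A, B, C: A + (B - C) // 2,
    6: lambda A, B, C: B + (A - C) // 2,
    7: lambda A, B, C: (A + B) // 2,
}

def residual(actual, A, B, C, scheme=4):
    # Prediction residual for one pixel under the chosen predictor;
    # this residual is what gets entropy-coded.
    return actual - PREDICTORS[scheme](A, B, C)

# Example: a pixel of value 100 with neighbours A=98, B=101, C=99.
print(residual(100, 98, 101, 99))    # P4 predicts 100, so the residual is 0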