ENSC 424 - Multimedia Communications Engineering
Huffman Coding (1)
Jie Liang
Engineering Science, Simon Fraser University
JieL@sfu.ca

Outline
- Entropy Coding
- Prefix code
- Kraft-McMillan inequality
- Huffman Encoding
- Minimum Variance Huffman Coding
- Extended Huffman Coding

Entropy Coding
- Design the mapping from source symbols to codewords.
- The mapping must be lossless.
- Goal: minimize the average codeword length, approaching the entropy of the source.

Example: Morse Code
- Represents English characters and numbers by different combinations of dots and dashes (codewords).
- Examples: E: .   I: ..   A: .-   T: -   O: ---   S: ...   Z: --..
- Problem: letters have to be separated by spaces, or by pauses when transmitting over radio. SOS is sent as ... (pause) --- (pause) ...
- Without the pauses, the code is not uniquely decodable!
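
Not from the slides: a small Python sketch that enumerates every way a pause-free dot/dash string can be parsed with the codes above, making the ambiguity concrete.

```python
# Subset of Morse code used in the slide's example.
MORSE = {"E": ".", "I": "..", "A": ".-", "T": "-",
         "O": "---", "S": "...", "Z": "--.."}

def parses(seq):
    """Return all letter sequences whose concatenated codes equal seq."""
    if not seq:
        return [""]
    out = []
    for letter, code in MORSE.items():
        if seq.startswith(code):
            out += [letter + rest for rest in parses(seq[len(code):])]
    return out

# Without pauses, "...---..." has many parses besides "SOS".
print(parses("...---..."))
print(len(parses("...---...")))
```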

Entropy Coding: Prefix-free Code
- No codeword is a prefix of another one, so the code can be uniquely decoded. Also called a prefix code.
- Example: {0, 10, 110, 111}
- Binary code tree: root node, internal nodes, leaf nodes. A prefix-free code occupies leaves only.
- How do we express this requirement mathematically?
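
Not from the slides: a direct check of the prefix condition for a small code.

```python
def is_prefix_free(codes):
    """True if no codeword is a prefix of another (distinct) codeword."""
    for c in codes:
        for d in codes:
            if c != d and d.startswith(c):
                return False
    return True

print(is_prefix_free(["0", "10", "110", "111"]))  # True
print(is_prefix_free(["0", "01", "11"]))          # False: "0" prefixes "01"
```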

Kraft-McMillan Inequality
- Let C be a code with N codewords of lengths l_i, i = 1, ..., N. If C is uniquely decodable, then

      ∑_{i=1}^{N} 2^(−l_i) ≤ 1.

- Conversely, if a set of lengths l_i satisfies the inequality above, then there exists a prefix-free code with codeword lengths l_i, i = 1, ..., N.
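
Not from the slides: a sketch that evaluates the Kraft sum and, when it is at most 1, builds one prefix-free code with the given lengths. The construction used here is the standard canonical-code assignment, not necessarily the one the course uses.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Sum of 2^(-l_i); uniquely decodable codes satisfy sum <= 1."""
    return sum(Fraction(1, 2**l) for l in lengths)

def code_from_lengths(lengths):
    """Construct one prefix-free code with the given codeword lengths
    (canonical construction; requires the Kraft sum to be <= 1)."""
    assert kraft_sum(lengths) <= 1, "lengths violate Kraft-McMillan"
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codes = [None] * len(lengths)
    val = 0
    for k, i in enumerate(order):
        l = lengths[i]
        codes[i] = format(val, "b").zfill(l)
        if k + 1 < len(order):
            val = (val + 1) << (lengths[order[k + 1]] - l)
    return codes

print(kraft_sum([1, 2, 3, 3]))          # 1 -> a complete code exists
print(code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```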

Kraft-McMillan Inequality (sketch)
To see this, expand the binary code tree to depth L = max(l_i). The last level has 2^L nodes. Each codeword of length l_i is the root of a sub-tree with 2^(L−l_i) offspring at the last level, and for a prefix-free code these sub-trees are disjoint. So the number of L-th level offspring of all codewords is at most 2^L:

      ∑_{i=1}^{N} 2^(L−l_i) ≤ 2^L,   i.e.,   ∑_{i=1}^{N} 2^(−l_i) ≤ 1.

If ∑_{i=1}^{N} 2^(−l_i) > 1, the codewords would lead to more than 2^L offspring at the last level: a contradiction. (Figure: code tree expanded to depth L = 3.)

Outline
- Entropy Coding
- Prefix code
- Kraft-McMillan inequality
- Huffman Encoding
- Minimum Variance Huffman Coding
- Extended Huffman Coding

Huffman Coding
- A procedure to construct an optimal prefix-free code.
- Result of David Huffman's term paper in 1952, when he was a PhD student at MIT. (Photos: Shannon, Fano, Huffman (1925-1999).)
- Observations:
  - Assign short codes to frequent symbols.
  - In an optimum prefix-free code, the two codewords that occur least frequently will have the same length. (Figure: if symbol b's codeword were longer than symbol a's, it could be truncated to the same length without breaking the prefix property.)

Huffman Code Design
- Another property of Huffman coding: the codewords of the two lowest-probability symbols differ only in the last bit.
- Requirement: the source probability distribution (not available in most cases).
- Procedure (see the sketch below):
  1. Sort the probabilities of all source symbols in descending order.
  2. Merge the last two into a new symbol; add up their probabilities.
  3. Repeat Steps 1 and 2 until only one symbol (the root) is left.
  4. Code assignment: traverse the tree from the root to each leaf node, assigning 0 to the top branch and 1 to the bottom branch.
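
A minimal Python sketch of this procedure (not from the slides). It uses a heap instead of explicit re-sorting; each pop returns a lowest-probability node, which is what Steps 1-2 achieve. Which branch gets 0 is a convention, so the exact bits may differ from the slides while the code remains optimal.

```python
import heapq
from itertools import count

def huffman(probs):
    """Huffman code for a {symbol: probability} dict -> {symbol: codeword}.
    Each heap entry carries the partial codewords of the symbols under
    that node; merging two nodes prepends one more bit to each."""
    order = count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(order), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # two lowest-probability nodes
        p2, _, c2 = heapq.heappop(heap)
        node = {s: "0" + w for s, w in c1.items()}
        node.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(order), node))
    return heap[0][2]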

Example 3.2.1
- Source alphabet A = {a1, a2, a3, a4, a5}; probability distribution {0.2, 0.4, 0.2, 0.1, 0.1}.
- Sort: a2(0.4), a1(0.2), a3(0.2), a4(0.1), a5(0.1)
- Merge a4, a5 and sort: a2(0.4), a1(0.2), a3(0.2), a4a5(0.2)
- Merge a3, a4a5 and sort: a2(0.4), a3a4a5(0.4), a1(0.2)
- Merge a1, a3a4a5 and sort: a1a3a4a5(0.6), a2(0.4)
- Merge into the root (1.0), then assign codes from the root to the leaves.
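
Running the huffman() sketch above on this distribution (symbol names assumed). With several 0.2-probability ties, different tie-breaking yields different trees, but every valid Huffman code has the same average length, 2.2 bits/symbol:

```python
probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
code = huffman(probs)
avg = sum(probs[s] * len(w) for s, w in code.items())
print(code)
print(avg)  # 2.2 bits/symbol however the 0.2 ties are broken
```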

Huffman Code Is Prefix-free
- All codewords are leaf nodes.
- No codeword is a prefix of any other codeword (prefix-free).

Average Codeword Length vs. Entropy
- Source alphabet A = {a, b, c, d, e}; probability distribution {0.2, 0.4, 0.2, 0.1, 0.1}; code (one valid assignment): {01, 1, 000, 0010, 0011}.
- Entropy: H(S) = -(0.2*log2(0.2)*2 + 0.4*log2(0.4) + 0.1*log2(0.1)*2) = 2.122 bits/symbol.
- Average Huffman codeword length: L = 0.2*2 + 0.4*1 + 0.2*3 + 0.1*4 + 0.1*4 = 2.2 bits/symbol.
- In general: H(S) ≤ L < H(S) + 1.
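
A quick check of these numbers in plain Python (codeword bits as assumed above; only the lengths matter for L):

```python
from math import log2

p = {"a": 0.2, "b": 0.4, "c": 0.2, "d": 0.1, "e": 0.1}
code = {"a": "01", "b": "1", "c": "000", "d": "0010", "e": "0011"}

H = -sum(pi * log2(pi) for pi in p.values())
L = sum(p[s] * len(code[s]) for s in p)
print(f"H(S) = {H:.3f} bits/symbol")  # 2.122
print(f"L    = {L:.3f} bits/symbol")  # 2.200; H(S) <= L < H(S) + 1
```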

Huffman Code Is Not Unique
- Two choices for each split: 0, 1 or 1, 0.
- Multiple ordering choices for tied probabilities.
(Figures: the same source coded with the 0/1 labels of a split swapped, and with two equal-probability symbols a and b exchanged; all variants are valid Huffman codes.)

Minimum Variance Huffman Code
- Put the combined symbol as high as possible in the sorted list.
- This prevents an unbalanced tree and reduces the memory requirement for decoding (revisited later).
- Exercise: repeat the previous example and compute the average codeword length. (A sketch follows below.)
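
Not from the slides: one common way to realize this rule is the two-queue construction. On probability ties, a leaf is taken before a merged node, which is assumed here to be equivalent to keeping the combined symbol as high as possible in the sorted list.

```python
from collections import deque

def min_variance_huffman(probs):
    """Two-queue Huffman construction: leaves in one sorted queue,
    merged nodes in a FIFO queue (which stays non-decreasing).
    Ties are broken in favor of leaves."""
    leaves = deque(sorted(((p, {s: ""}) for s, p in probs.items()),
                          key=lambda t: t[0]))
    merged = deque()

    def pop_min():
        if not merged or (leaves and leaves[0][0] <= merged[0][0]):
            return leaves.popleft()
        return merged.popleft()

    while len(leaves) + len(merged) > 1:
        p1, c1 = pop_min()
        p2, c2 = pop_min()
        node = {s: "0" + w for s, w in c1.items()}
        node.update({s: "1" + w for s, w in c2.items()})
        merged.append((p1 + p2, node))
    return pop_min()[1]

code = min_variance_huffman({"a1": .2, "a2": .4, "a3": .2, "a4": .1, "a5": .1})
print({s: len(w) for s, w in code.items()})  # lengths {2,2,2,3,3}; L = 2.2
```

The average length is still 2.2 bits/symbol, but the lengths {2, 2, 2, 3, 3} are far less spread out than the {1, 2, 3, 4, 4} of the earlier tree.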

Extended Huffman Code
- Code multiple symbols jointly. Composite symbol: (X1, X2, ..., Xk).
- The alphabet size increases exponentially: N^k.
- Symbols of different meanings can also be coded jointly:
  - JPEG: run-level coding.
  - H.264 CAVLC (context-adaptive variable length coding): the number of non-zero coefficients and the number of trailing ones.
- Revisited later.

Example
- Joint probability P(X_{2i}, X_{2i+1}):

                   X_{2i+1} = 0   X_{2i+1} = 1
      X_{2i} = 0       3/8            1/8
      X_{2i} = 1       1/8            3/8

- Marginals: P(Xj = 0) = P(Xj = 1) = 1/2, so the first-order entropy is H(Xj) = 1 bit/symbol.
- Second-order entropy: H(X_{2i}, X_{2i+1}) = 1.8113 bits / 2 symbols, or 0.9056 bits/symbol.
- Huffman code for Xj: {0, 1}. Average code length: 1 bit/symbol.
- Huffman code for (X_{2i}, X_{2i+1}) (one valid assignment): (0,0): 0, (1,1): 10, (0,1): 110, (1,0): 111. Average code length: 0.9375 bits/symbol. (A verification sketch follows below.)
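
A short sketch verifying the example's numbers (pair symbols written as bit tuples; codeword bits as assumed above):

```python
from math import log2

pair_p = {("0","0"): 3/8, ("0","1"): 1/8, ("1","0"): 1/8, ("1","1"): 3/8}
pair_code = {("0","0"): "0", ("1","1"): "10",
             ("0","1"): "110", ("1","0"): "111"}

H2 = -sum(p * log2(p) for p in pair_p.values())
L2 = sum(pair_p[s] * len(pair_code[s]) for s in pair_p)
print(f"H = {H2:.4f} bits/2 symbols = {H2/2:.4f} bits/symbol")  # 1.8113, 0.9056
print(f"L = {L2:.4f} bits/2 symbols = {L2/2:.4f} bits/symbol")  # 1.8750, 0.9375
```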

Summary
- Goal of entropy coding: reduce the average codeword length (the entropy is the lower bound).
- Prefix-free code: a uniquely decodable code.
- Kraft-McMillan inequality: a characterization of prefix-free codes.
- Huffman code: optimal prefix-free code; minimum variance variant.
- Next: canonical Huffman code; encoding and decoding.