Huffman Coding
(EE 575: Source Coding Project)
Project Report
Submitted By: Raza Umar
ID: g200905090

Algorithm Description

The Huffman encoding algorithm takes as input a string of data symbols and a vector of the corresponding symbol probabilities. It calls two recursive functions to generate the Huffman dictionary and reports the average codeword length of that dictionary as output. The central idea is to use MATLAB structures with cell-array fields to build the Huffman tree while keeping track of child and parent nodes. Once the tree has been built, the codeword for each input data symbol (which is a leaf node of the Huffman tree) is found by traversing the tree from the root until that leaf node is reached.

Each node structure contains fields for the input data symbol, its probability, and its original position in the string of symbols passed to the algorithm. Two additional fields hold the child nodes and the codeword of the current node. One structure is created for each data symbol, so M instances (M = number of input data symbols) are filled with the known information and sorted in ascending order of probability. This results in M leaf nodes, one per data symbol, arranged in ascending order of probability.

The Huffman tree is generated by passing this structure array (with M nodes) to a recursive function, gen_h_tree. This function combines the first two nodes (the two with the lowest probabilities) into one parent node. The parent node stores the two combined nodes as its children, and its probability is the sum of their probabilities. The two child nodes are then removed from the list, and the parent node is inserted at the position that keeps all M-1 remaining nodes in ascending order of probability. Replacing two child nodes with one parent node reduces the number of nodes by 1.
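
The leaf setup and merge step described above can be sketched as follows. This is a minimal Python illustration, not the report's MATLAB code: the field names (symbol, prob, org_order, child, code) follow the report, while make_leaves, the use of dictionaries for nodes, and the example probabilities are assumptions made here. The insertion scan uses the same strict comparison as the report's flowchart, with an added bounds check so the scan stops at the end of the list.

```python
def make_leaves(symbols, probs):
    """Create one leaf per symbol, sorted in ascending order of probability."""
    leaves = [
        {"symbol": s, "prob": p, "org_order": i, "child": [], "code": []}
        for i, (s, p) in enumerate(zip(symbols, probs), start=1)
    ]
    # sorted() is stable, so equal probabilities keep their input order
    return sorted(leaves, key=lambda node: node["prob"])


def gen_h_tree(nodes):
    """Recursively merge the two least probable nodes until one root remains."""
    if len(nodes) == 1:
        return nodes[0]
    first, second = nodes[0], nodes[1]
    parent = {
        "symbol": None,
        "prob": first["prob"] + second["prob"],
        "org_order": None,
        "child": [first, second],
        "code": [],
    }
    rest = nodes[2:]  # the two children are removed from the list
    # Linear scan for the insertion point that keeps ascending order
    index = 0
    while index < len(rest) and parent["prob"] > rest[index]["prob"]:
        index += 1
    return gen_h_tree(rest[:index] + [parent] + rest[index:])


# Illustrative input (assumed, not from the report)
root = gen_h_tree(make_leaves("abcd", [0.5, 0.25, 0.125, 0.125]))
```

Each call shortens the list by one node, so M leaves are reduced to a single root after M-1 merges.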
gen_h_tree calls itself recursively until the Huffman tree consists of a single root node with probability 1. The Huffman dictionary is then generated by traversing this tree recursively down to the leaf nodes. The dictionary is another structure, with fields for the input data symbol, its probability, its codeword, the length of that codeword, and its original position in the input string of data symbols.

Because the Huffman tree is a binary tree, each parent node holds its two child nodes. The child with the lower probability is assigned bit 1 and the child with the higher probability is assigned bit 0. The bits assigned along the path from the root are concatenated into a vector, which becomes the codeword once a node with no children, i.e. a leaf node, is reached. Each time a leaf node is encountered, the product of its probability and its codeword length is added to a variable avglen, which is initialized to 0. Once all leaf nodes have been assigned codewords, avglen holds the average codeword length of the dictionary.
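
The dictionary traversal can be sketched in Python as below. This is an illustration under stated assumptions rather than the report's MATLAB code: the tree is built by hand with the hypothetical helpers leaf and parent, the probabilities are made up, and the function returns the accumulated average length instead of updating a shared avglen variable as the report describes.

```python
def leaf(symbol, prob, org_order):
    """Hypothetical helper: a node with no children."""
    return {"symbol": symbol, "prob": prob, "org_order": org_order, "child": []}


def parent(low, high):
    """Hypothetical helper: lower-probability child first, as in the report."""
    return {"symbol": None, "prob": low["prob"] + high["prob"],
            "org_order": None, "child": [low, high]}


def gen_h_dict(node, h_dict, code=()):
    """Append one dictionary entry per leaf; return the weighted length sum."""
    if not node["child"]:  # leaf: the bits collected so far are its codeword
        h_dict.append({
            "symbol": node["symbol"],
            "prob": node["prob"],
            "code": list(code),
            "length": len(code),
            "org_order": node["org_order"],
        })
        return node["prob"] * len(code)
    avglen = 0.0
    # First child (lower probability) gets bit 1, second child gets bit 0
    for i, child in enumerate(node["child"]):
        avglen += gen_h_dict(child, h_dict, code + (1 - i,))
    return avglen


# Hand-built tree for an assumed four-symbol source
tree = parent(
    parent(parent(leaf("c", 0.125, 3), leaf("d", 0.125, 4)), leaf("b", 0.25, 2)),
    leaf("a", 0.5, 1),
)
h_dict = []
avglen = gen_h_dict(tree, h_dict)
codes = {e["symbol"]: "".join(str(bit) for bit in e["code"]) for e in h_dict}
```

For this assumed tree the most probable symbol receives a one-bit codeword and the two least probable symbols receive three-bit codewords, and no codeword is a prefix of another.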

The codeword dictionary is then arranged according to the desired output format, e.g. in the same order as the input data symbols (original order) or in ascending or descending order of codeword length. Finally, the algorithm outputs each data symbol along with its codeword from the dictionary.
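
The three output orderings mentioned above amount to sorting the dictionary entries on different keys. A small Python sketch follows, assuming each entry carries the org_order and length fields described in the report; the entries themselves are illustrative values, not results from the report.

```python
# Illustrative dictionary entries in the order a tree traversal might emit them
h_dict = [
    {"symbol": "c", "code": "111", "length": 3, "org_order": 3},
    {"symbol": "d", "code": "110", "length": 3, "org_order": 4},
    {"symbol": "b", "code": "10", "length": 2, "org_order": 2},
    {"symbol": "a", "code": "0", "length": 1, "org_order": 1},
]

# Same order as the input string of data symbols
by_original = sorted(h_dict, key=lambda e: e["org_order"])

# Ascending and descending order of codeword length
by_length_asc = sorted(h_dict, key=lambda e: e["length"])
by_length_desc = sorted(h_dict, key=lambda e: e["length"], reverse=True)

# Output each symbol with its codeword
for entry in by_original:
    print(entry["symbol"], entry["code"])
```

Python's sort is stable, so entries with equal codeword lengths keep their relative order from h_dict in both length-based orderings.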

Algorithm Flowchart

Start

Read the inputs
  1. String of input data symbols
  2. Vector of respective probabilities

Fill in the structure h_tree (h_tree.symbol, h_tree.prob, h_tree.org_order) for each input data symbol, then sort the M nodes in ascending order of prob.

Generate h_tree
  Does this structure have only 1 node?
    yes: the tree is complete; continue to Generate h_dict
    no:  Combine the top two nodes to form a parent node
         The combining nodes are the two children of the parent node
         Prob. of the parent node is the sum of the prob. of the child nodes
         Insert new_node and call Generate h_tree again

Insert new_node
  index = 1
  While (new_node.prob > h_tree(index).prob) do index = index + 1
  Place new_node before h_tree(index) in struct h_tree

avglen = 0

Generate h_dict (input: h_tree)
  Is this a leaf node?
    no:  For i = 1:2
           h_tree.child{i}.code = [h_tree.code 2-i]
           call Generate h_dict with h_tree.child{i} as input
         end For
    yes: Copy h_tree to h_dict
         avglen = avglen + h_tree.prob * length(h_tree.code)

Display Output
  Sort h_dict instances according to the original order of the input data symbols
  Output the symbols and their respective codewords
  Output avglen of the codeword dictionary
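
The value accumulated in avglen is the probability-weighted mean of the codeword lengths. Writing p_i for the probability of symbol i and l_i for the length of its codeword (notation introduced here, not in the report), the update avglen = avglen + h_tree.prob * length(h_tree.code) over all M leaves computes the sum below. The numbers in the second line are an assumed example: a source with probabilities 0.5, 0.25, 0.125, 0.125, for which a Huffman code has codeword lengths 1, 2, 3, 3.

```latex
\bar{L} = \sum_{i=1}^{M} p_i \, \ell_i
\qquad\text{e.g.}\qquad
\bar{L} = 0.5 \cdot 1 + 0.25 \cdot 2 + 0.125 \cdot 3 + 0.125 \cdot 3 = 1.75 \ \text{bits per symbol}
```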