14 Data Compression by Huffman Encoding
- Milton Banks
14.1 Introduction

In order to save on disk storage space, it is useful to be able to compress files (or memory blocks) of data so that they take up less room. However, we don't want to lose or corrupt the data, so we want to use lossless data compression. Huffman encoding is probably the simplest method of lossless data compression (although not the most effective method).

We should also note here that not all files can be compressed. A simple explanation for this is that if every file could be compressed, then a compressed file could itself be compressed into a smaller file, and so on, until the original file was reduced down to nothing! Clearly this is not possible.

Before we go any further with Huffman encoding, let's remind ourselves about probability.

14.2 Probability

Probability is the mathematician's way of communicating how likely an event is. There are two ways of calculating probabilities of events:

(i) Using past experience.
e.g. What is the probability that you will live to be 90 years of age? Life insurance companies hold a database of statistics recording how many people live to that age. They can use this to estimate how long you are likely to live.

(ii) By calculation using the formula

p(event) = (No of ways event can happen) / (Total no of possible outcomes)

where p(event) means 'probability of event happening'.

e.g.1 The probability of scoring 5 or more when throwing a normal 'fair' die is

#{5,6} / #{1,2,3,4,5,6} = 2/6 = 1/3

e.g.2 A character is read from a file. Assuming that all 256 8-bit (extended ASCII) character codes are equally likely to be read, what is the probability that the character is alpha-numeric? Alpha-numerics are {'a'...'z'}, {'A'...'Z'} and {'0'...'9'}, therefore the required probability is given by:

p(char is alphanumeric) = (26 + 26 + 10) / 256 = 62/256 ≈ 0.242

14.3 Huffman Encoding

Huffman encoding uses a similar principle to Morse code. In Morse code, the length of the code reflects the frequency of occurrence of that character in English language text. So, for example, 'e' = . (i.e. the shortest code) and 'q' = --.- (i.e. almost the longest code).
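The two worked probabilities in section 14.2 can be checked with a few lines of Python. This is only a sketch: the helper name `probability` is made up for illustration, and the "alphanumeric" set is assumed to be exactly the 62 characters a-z, A-Z and 0-9 out of 256 equally likely byte values.

```python
from fractions import Fraction
import string


def probability(favourable, possible):
    """p(event) = ways the event can happen / total possible outcomes."""
    return Fraction(len(favourable), len(possible))


# e.g.1: scoring 5 or more with a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}
p_five_or_more = probability({n for n in die if n >= 5}, die)

# e.g.2: reading an alphanumeric character when all 256 byte values
# are assumed equally likely.
byte_values = range(256)
alphanumerics = set(string.ascii_letters + string.digits)
p_alnum = probability([b for b in byte_values if chr(b) in alphanumerics],
                      byte_values)

print(p_five_or_more)           # 1/3
print(p_alnum, float(p_alnum))  # 31/128 0.2421875
```

Exact fractions are used so that 2/6 reduces to 1/3 without any rounding.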
14.3.1 Creating the Huffman Codes

In Huffman coding, instead of using ASCII, a new code is devised, which depends on which characters are in the file (or message) and how often they occur. The coded file or message will normally take up less space than the original ASCII representation. Huffman codes are of variable length and are created using a tree structure. The algorithm used for this is:

1. Across the bottom of the page, list each different character that appears in the 'message'.
2. Write against each character the probability of it occurring. (These are the leaf nodes of the tree. At the moment they have no parents.)
3. If there is only one parent-less node, then go to step 6.
4. Find the two parent-less nodes (leaf or internal node) with the lowest sum of probabilities. Join them by adding a common (internal) parent node. Give this parent node a probability equal to their sum.
5. Go to step 3.
6. Assign binary codes to the characters by 'walking' down the tree from root to leaf, giving a '0' for each left branch and a '1' for each right branch. The code for each character is obtained by reading from the root to each leaf node.

14.3.2 Example

Consider the following block of 60 characters:

ABABCBABDF BEBCBDBEBF BDBDBABCBA FABABCCCDE FABCFABBAA FCAAABABCD

The distribution and hence probability of each character is thus:

Character   No of occurrences   Probability
A           15                  0.25
B           21                  0.35
C            9                  0.15
D            6                  0.10
E            3                  0.05
F            6                  0.10

So, steps 1 and 2 of the algorithm give us:

[Figure: the six leaf nodes, each with its probability]

Step 3 of the algorithm allows us to go on to step 4. At step 4, we can choose either (E and F) or (E and D), since both pairs have the lowest sum, 0.15. We'll use (E and F) giving us:

[Figure: E and F joined under a parent node of probability 0.15]
We now go back to step 3 and then do step 4 again. This time we'll link C and D giving:

[Figure: C and D joined under a parent node of probability 0.25]

We keep repeating this procedure until only one node, the root, is left without a parent. At this point, the root node should have the value 1. We now go to step 6 and label all the left branches with the value '0' and all the right branches with the value '1' giving:

[Figure: the completed tree with every branch labelled '0' or '1']
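Steps 3 to 5 of the algorithm can be sketched with a priority queue. The following is an illustration rather than the method used to draw the trees in these notes: the function name `merge_trace` is hypothetical, ties are broken by Python's tuple ordering (so the merges may differ from the choices made in the text), and the message is assumed to be non-empty.

```python
import heapq
from collections import Counter
from fractions import Fraction

MESSAGE = ("ABABCBABDF" "BEBCBDBEBF" "BDBDBABCBA"
           "FABABCCCDE" "FABCFABBAA" "FCAAABABCD")


def merge_trace(message):
    """Repeatedly join the two parent-less nodes of lowest probability.

    Returns the list of merges made and the probability of the root.
    """
    counts = Counter(message)
    total = len(message)
    # Steps 1 and 2: one leaf per character, labelled with its probability.
    heap = [(Fraction(n, total), ch) for ch, n in counts.items()]
    heapq.heapify(heap)
    merges = []
    # Step 3: stop when only one parent-less node (the root) remains.
    while len(heap) > 1:
        # Step 4: take the two lowest-probability nodes and join them.
        p1, a = heapq.heappop(heap)
        p2, b = heapq.heappop(heap)
        merges.append((a, b, p1 + p2))
        heapq.heappush(heap, (p1 + p2, a + b))
    return merges, heap[0][0]


merges, root_probability = merge_trace(MESSAGE)
for a, b, p in merges:
    print(f"join {a} and {b} -> {float(p):.2f}")
print("root:", root_probability)
```

With six different characters there are always five merges, and the root probability comes out as exactly 1 whichever tied pair is chosen at each step.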
We can now read off the codes of each character as follows:

A = 00
B = 01
C = 100
D = 101
E = 110
F = 111

Note: It is only by chance that the codes for A to F go up in a binary sequence. If we had made different decisions during the creation of the tree, we could have got different codes.

The first 10 characters of the encoded message (ABABCBABDF) are thus:

00010001100010001101111

Exercise
Produce another version of this tree and codes. Compare the total number of bits in the encoded message using the two sets of codes. (Think about how to do this. You don't have to encode the message to find out how many bits are needed.)

14.4 Decoding

In order to decode the Huffman codes, we need a copy of the encoding tree at the receiving end. Unfortunately, this means we also have to transmit the frequency data for each character in the character set (or send the codes with separators first). So, if the character set is 8-bit ASCII, we have to send 256 values, which are the counts for each character, as well as the encoded data. This is unfortunate as it increases the size of the compressed file. Fortunately, for large files the overhead involved is acceptably small. (And there are ways to reduce the size of the count data.) We also have to ensure that both the transmitting end and the receiving end use the same rules to create the tree.

14.4.1 Recreating the tree

To recreate the tree, we use the counts to produce the probabilities as before. However, we have already seen that there may be more than one possible tree created from any set of probability data. Hence we have to make sure that both the transmitting end and the receiving end follow the same rules for creating the tree. Rules which should ensure that both ends create the same tree (and which might help to produce a nice neat tree) are:

1. Write the characters along the bottom of the page with the highest probability on the left and descending across the page to the lowest on the right. Any characters with the same probability should be written in 'ASCII' order.
2. When there is a choice of nodes to use, always use the one furthest over to the right, even if it is at a higher level.
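The two rules above can be sketched in Python as follows. This is one possible reading of the rules, not a definitive implementation: the function name `build_codes` is hypothetical, an internal node is assumed to sit at the position of its right-most leaf, and of the two nodes being joined the one further left is assumed to become the left branch.

```python
from collections import Counter

MESSAGE = ("ABABCBABDF" "BEBCBDBEBF" "BDBDBABCBA"
           "FABABCCCDE" "FABCFABBAA" "FCAAABABCD")


def build_codes(message):
    counts = Counter(message)
    # Rule 1: highest count on the left, equal counts in ASCII order.
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    # A parent-less node is (count, position, tree); a tree is either a
    # single character (leaf) or a (left, right) pair (internal node).
    nodes = [(counts[ch], pos, ch) for pos, ch in enumerate(ordered)]
    while len(nodes) > 1:
        # Rule 2: lowest counts first; among equal counts, right-most first.
        nodes.sort(key=lambda n: (n[0], -n[1]))
        left, right = sorted(nodes[:2], key=lambda n: n[1])
        parent = (left[0] + right[0], right[1], (left[2], right[2]))
        nodes = nodes[2:] + [parent]

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, str):
            codes[tree] = prefix or "0"   # lone character: give it one bit
        else:
            walk(tree[0], prefix + "0")   # left branch
            walk(tree[1], prefix + "1")   # right branch

    walk(nodes[0][2], "")
    return codes


codes = build_codes(MESSAGE)
for ch in sorted(codes):
    print(ch, "=", codes[ch])
```

Because every choice is fixed by the counts alone, the sender and the receiver obtain the same code table as long as both run the same procedure on the same counts.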
Following these rules, the data we used in the example would give this tree:

[Figure: the tree whose leaves, from left to right, are B A C D F E]

and the codes would be:

A = 01
B = 00
C = 10
D = 110
E = 1111
F = 1110

The encoded data for the first 10 characters of the message (ABABCBABDF) would now be the frequency values followed by:

01000100100001001101110

14.4.2 Decoding the message

Having recreated the tree, decoding the message is simply a matter of reading the encoded bits in turn, following the tree down from the root node to a leaf node and hence decoding the character. Thus, taking 0100010010... in order we would go:

From the root node: 0 - left, 1 - right = A
Go back to the root: 0 - left, 0 - left = B
Go back to the root: 0 - left, 1 - right = A
Go back to the root: 0 - left, 0 - left = B
Go back to the root: 1 - right, 0 - left = C
etc.

There is no problem about knowing when the next character starts and no conflict between which codes mean which character.

14.5 Instantaneous Codes

Huffman codes are an example of instantaneous codes. These are codes in which it is guaranteed that no code is identical to the first few bits of any other code. e.g. in our codes, there is no character with the code '0' which could be confused with the code for A or B, and no character has the code '111' which could be confused with E or F.
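The instantaneous property and the decoding walk can both be sketched in a few lines of Python. The function names `is_prefix_free` and `decode` are made up for illustration, a dictionary lookup stands in for walking the tree, and the code table is assumed to be the one obtained by applying the rules of section 14.4.1 to the example counts.

```python
CODES = {"A": "01", "B": "00", "C": "10",
         "D": "110", "E": "1111", "F": "1110"}


def is_prefix_free(codes):
    """True if no code is the start of (or equal to) another code."""
    words = sorted(codes.values())
    # After sorting, a code that is a prefix of any other code is also
    # a prefix of its immediate successor, so adjacent pairs suffice.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, codes):
    """Read bits one at a time, emitting a character at each complete code."""
    lookup = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:       # reached a leaf: go back to the root
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


print(is_prefix_free(CODES))                       # True
print(decode("01000100100001001101110", CODES))    # ABABCBABDF
```

The decoder can emit a character the moment its last bit arrives only because the table is prefix-free; with a table such as {A: 0, B: 01} the check fails and the first bit alone would be ambiguous.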