CS15100 Lab 7: File compression
Fall 2006
November 14, 2006

Complete the first 3 chapters (through the build-huffman-tree function) in lab, optionally with a partner. The rest you must do by yourself. Write both your name and your partner's name on the homework when you hand it in. If you are in the 9:30am MWF section, hand in your solution by emailing it to robby@cs.uchicago.edu. If you are in the TTh section, email your solution to bboven@gmail.com and mulmuley@cs.uchicago.edu. It must be in the appropriate mailbox before lab starts in week 8.

1 Introduction

At the lowest level, computers represent data as sequences of bits (0 or 1). The normal way to represent a message as a sequence of bits is to use a table that associates bit patterns with characters and then translate each letter of the message according to the table. The standard table that many computers use is called the ASCII table, which represents every character on the keyboard (and a few more besides) as a sequence of exactly 8 bits. Here is one portion of the ASCII table:

    Character   ASCII encoding
    I           01001001
    M           01001101
    P           01010000
    S           01010011

In ASCII, the message MISSISSIPPI would be represented like this:

    M        I        S        S        I        S        S        I        P        P        I
    01001101 01001001 01010011 01010011 01001001 01010011 01010011 01001001 01010000 01010000 01001001

If you save the word MISSISSIPPI in DrScheme, that sequence of bits is how it will be written out in the saved file.

There are a number of advantages to representing messages with ASCII, but it is not particularly good for generating short encodings of particular messages. In situations where we really need messages to be short (maybe because we want to transmit a message quickly across a network, or save it on a disk that doesn't have much space left) we can often do dramatically better. The message MISSISSIPPI, for instance, doesn't use most of the letters of the alphabet at all, so an encoding scheme that didn't let us write those letters down at all would be fine. Furthermore, it uses I and S four times each, but P only twice and M only once: for that reason, it would be a good trade to use an encoding table that had short representations for I and S and longer representations of P and M.

The following alternative encoding produces a much shorter encoding for the message MISSISSIPPI:

    Character   Alternative encoding
    I           11
    M           100
    P           101
    S           0

While ASCII needs 88 bits (11 characters at 8 bits each), the alternative encoding needs just 21 (1 x 3 for M, 4 x 2 for I, 4 x 1 for S, and 2 x 3 for P).

The goal of this lab is to implement an algorithm called Huffman coding that determines the best encoding table for a particular message, and then encodes or decodes messages according to that table. As a demonstration of the technique's practical application, you will use it to write a program that compresses and decompresses files.

For this lab, you will need to use the following teachpack:

    P P jacobm/ fall/huffman-utils.ss

Huffman coding is named after its inventor, David Huffman (1925-1999). He invented it in 1951 as a final project for a class he was taking; his instructor listed it as a possible paper topic without mentioning that it was a major unsolved problem at the time!

2 Gathering statistics

The first step of the algorithm is to determine the frequencies of each letter in the input.

    ;; A statistics is a (listof frequency)
    ;; A frequency is:
    ;;   (make-frequency character number)
    (define-struct frequency (token count))

Note. The frequency structure is provided by the teachpack. Do not define it yourself.

Characters are a built-in category of primitive values, each representing one letter (or numeral, or punctuation mark, et cetera). They can be written down directly with the syntax #\x (for the character corresponding to a lower-case x). Characters can be tested for equality using char=?. The main advantage of characters is that we can get them out of strings: for instance, given the string "MISSISSIPPI" we can use the built-in function string->list:

    (string->list "MISSISSIPPI")

produces

    (list #\M #\I #\S #\S #\I #\S #\S #\I #\P #\P #\I)

Write a function frequencies : (listof character) -> statistics, which takes a message represented as a list of characters and produces statistics containing the frequency with which each token appears in the message. For instance,

    (frequencies (list #\M #\I #\S #\S #\I #\S #\S #\I #\P #\P #\I))

should be

    (list (make-frequency #\M 1)
          (make-frequency #\I 4)
          (make-frequency #\S 4)
          (make-frequency #\P 2))
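One way to approach the counting step is sketched below. This is an illustration under stated assumptions, not the lab's reference solution: it is written for plain #lang racket so that it runs outside DrScheme, it defines its own stand-in for the teachpack's frequency structure, and the helper name bump is hypothetical.

```racket
#lang racket
;; Stand-in for the teachpack's structure (assumption: needed only because
;; this sketch runs without the teachpack; in the lab, do not define it).
(define-struct frequency (token count) #:transparent)

;; bump : character statistics -> statistics   (hypothetical helper)
;; Adds one to the count for c, appending a new entry if c has not been seen.
(define (bump c stats)
  (cond
    [(empty? stats) (list (make-frequency c 1))]
    [(char=? c (frequency-token (first stats)))
     (cons (make-frequency c (+ 1 (frequency-count (first stats))))
           (rest stats))]
    [else (cons (first stats) (bump c (rest stats)))]))

;; frequencies : (listof character) -> statistics
;; Walks the message left to right, so entries appear in order of first occurrence.
(define (frequencies message)
  (foldl bump empty message))
```

Because new tokens are appended at the end, (frequencies (string->list "MISSISSIPPI")) lists M, I, S, P in that order, matching the example above. Each character costs a scan of the statistics list, which is fine for short messages but slow for large files.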
3 Building Huffman trees

    ;; A huffman-tree is either:
    ;;  - (make-leaf character number)
    ;;  - (make-branch huffman-tree huffman-tree (listof character) number)
    (define-struct leaf (token count))
    (define-struct branch (l r tokens count))

The key idea behind Huffman coding is the Huffman tree. Given a particular message, a Huffman tree for that message is a binary tree whose leaves are characters, one per distinct character in the message. Additionally, for every subtree, the total frequency of all the tokens on the left side is as nearly equal to the total frequency of all the tokens on the right side as possible.

Huffman's algorithm for building these trees is as follows. It takes as its input the statistics generated in the last section, for instance:

    (list (make-frequency #\M 1)
          (make-frequency #\I 4)
          (make-frequency #\S 4)
          (make-frequency #\P 2))

It turns each of these frequencies into a trivial binary tree consisting of just the input character and its frequency, and sorts them by frequency (lowest to highest):

    (list (make-leaf #\M 1) (make-leaf #\P 2) (make-leaf #\I 4) (make-leaf #\S 4))

From this point on the algorithm works on lists of trees sorted by frequency. It successively removes the first two trees from the list and combines them into a single branch whose character list is the combination of the two subtrees' character lists and whose frequency is the sum of the two subtrees' frequencies. It inserts this new branch into the list (making sure to maintain sorted order) and repeats the process until only one tree is left. That tree is the output.

For instance, here are the successive steps the algorithm would take on the example above, both in code and in picture form:

Stage 0

Code:

    (list (make-leaf #\M 1) (make-leaf #\P 2) (make-leaf #\I 4) (make-leaf #\S 4))

Picture: four separate leaves, M 1, P 2, I 4, and S 4.

Stage 1

Code:

    (list (make-branch (make-leaf #\M 1) (make-leaf #\P 2) (list #\M #\P) 3)
          (make-leaf #\I 4)
          (make-leaf #\S 4))

Picture: a branch (M P) 3 with children M 1 and P 2, followed by the leaves I 4 and S 4.

Stage 2

Code:

    (list (make-leaf #\S 4)
          (make-branch (make-branch (make-leaf #\M 1) (make-leaf #\P 2) (list #\M #\P) 3)
                       (make-leaf #\I 4)
                       (list #\M #\P #\I) 7))

Picture: the leaf S 4, followed by a branch (M P I) 7 whose children are the branch (M P) 3 (with children M 1 and P 2) and the leaf I 4.

Stage 3

Code:

    (list (make-branch (make-leaf #\S 4)
                       (make-branch (make-branch (make-leaf #\M 1) (make-leaf #\P 2) (list #\M #\P) 3)
                                    (make-leaf #\I 4)
                                    (list #\M #\P #\I) 7)
                       (list #\S #\M #\P #\I) 11))

Picture: a single branch (S M P I) 11 whose children are the leaf S 4 and the branch (M P I) 7 from Stage 2.

Write the function build-huffman-tree : statistics -> huffman-tree, which builds the Huffman tree that corresponds to the given frequencies.

4 Encoding a message

The Huffman tree for a message is a representation of the optimal table for encoding that message: the code for each letter is just the path from the root of the tree to that letter, with 0 representing going down the left branch and 1 representing going down the right branch.

Write the function encode-message : (listof character) huffman-tree -> (listof bit), where a bit is either 0 or 1. For instance,

    (define message (string->list "MISSISSIPPI"))
    (define freqs (frequencies message))
    (define tree (build-huffman-tree freqs))
    (encode-message message tree)

should be

    (list 1 0 0 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 1 1 1)

5 Decoding a message

To decode a message, one needs the encoded version of the message and the Huffman tree that was used to encode it. Write the function decode-message : (listof bit) huffman-tree -> (listof character), which decodes a message encoded with encode-message. For instance,

    (list->string
     (decode-message (list 1 0 0 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 1 1 1) tree))

should be

    "MISSISSIPPI"

6 An application: file compression

In the introduction we mentioned that computers store messages as sequences of bits. That is not quite the whole truth: the lengths of those sequences must be exact multiples of 8, since computers arrange memory into 8-bit bytes. When using ASCII you never need to think about this, since every character in ASCII is represented as a whole byte, so you can't end up with a message that doesn't fill some exact number of bytes; but with the encodings that come from Huffman tables it is possible. The problem is this: when you're reading a compressed message off of a disk, you will always read it as a whole number of bytes, but somewhere between 0 and 7 of the last bits were not a part of the encoding of the original message.

The standard way to deal with this is to add a special end-of-message token (eom) to the end of every message when encoding it. With that token added, the encoding process can proceed almost exactly as normal (eom is counted just like a character when computing statistics, generating a Huffman tree, and encoding the message), the only difference being that the encoder must ensure that the length of its final encoding is a multiple of 8 bits by padding the ending (after the encoding of the eom token) with arbitrary bits. With this done, the decoder can take advantage of the fact that eom appears at the end of every message and stop decoding as soon as it decodes an end-of-message token, even if there are more bits available for decoding.
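The padding and early-stop behaviour described above can be tried out on a tiny hand-built tree. The sketch below is illustrative only and rests on several assumptions: it runs in plain #lang racket with its own stand-in structures, it represents eom as the symbol 'eom, it pads with 0 bits (the text allows any bits), and the names pad-to-byte, decode-until-eom, demo-tree, and demo-bits are hypothetical.

```racket
#lang racket
;; Stand-ins for the lab's tree structures (assumption: no teachpack here).
(define-struct leaf (token count) #:transparent)
(define-struct branch (l r tokens count) #:transparent)

;; pad-to-byte : (listof bit) -> (listof bit)
;; Appends 0 bits until the length is a multiple of 8.
(define (pad-to-byte bits)
  (append bits (make-list (modulo (- (length bits)) 8) 0)))

;; decode-until-eom : (listof bit) huffman-tree -> (listof character)
;; Follows 0 = left, 1 = right; stops at the first 'eom leaf, ignoring padding.
(define (decode-until-eom bits tree)
  (let walk ([bits bits] [node tree])
    (cond
      [(leaf? node)
       (if (eq? (leaf-token node) 'eom)
           empty
           (cons (leaf-token node) (walk bits tree)))]
      [(empty? bits) (error 'decode-until-eom "ran out of bits before eom")]
      [(= 0 (first bits)) (walk (rest bits) (branch-l node))]
      [else (walk (rest bits) (branch-r node))])))

;; Hand-built tree for the message "ABA" plus eom: A = 0, B = 10, eom = 11.
(define demo-tree
  (make-branch (make-leaf #\A 2)
               (make-branch (make-leaf #\B 1) (make-leaf 'eom 1) (list #\B 'eom) 2)
               (list #\A #\B 'eom)
               4))

;; "ABA" followed by eom is 0 10 0 11, six bits; padding adds two more.
(define demo-bits (pad-to-byte (list 0 1 0 0 1 1)))
```

Here (list->string (decode-until-eom demo-bits demo-tree)) gives "ABA". Without the eom check, the two padding bits would each decode as an extra A.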
Change the definition of a frequency from section 2 as follows:

    ;; A frequency is:
    ;;   (make-frequency token number)
    ;; a token is either:
    ;;  - a character
    ;;  - eom

Then modify all parts of your program that need to change to make proper use of the eom token.

Once you have done that, you are ready to write the final compression and decompression functions. To help with that, the huffman-util.ss teachpack provides one new data definition and four functions:

    ;; compressed-data is
    ;;   (make-compressed-data statistics (listof bit))
    ;; NOTE: the length of the list of bits must be a multiple of 8
    (define-struct compressed-data (stats bits))

    ;; file->list : string -> (listof character)
    ;; produces a list of characters corresponding to the entire named file

    ;; write-compressed-data-to-file : compressed-data string -> boolean
    ;; writes the contents of the given compressed-data structure into a file.
    ;; Returns true on success, or false if something went wrong
    ;; (for instance the file couldn't be written)

    ;; read-compressed-data-from-file : string -> compressed-data
    ;; reads a compressed data file into a compressed-data structure
    ;; Note: the length of the bits returned is always a multiple of 8

    ;; list->file : (listof character) string -> boolean
    ;; makes a file with the given string as its name and the given list
    ;; of characters as its contents. Returns true on success,
    ;; false if something went wrong.

Note. The compressed-data structure is provided by the teachpack. Do not define it yourself.

Use these helpers to define the following functions:

compress-file : string string -> boolean, which compresses the contents of the file named by the first string and places the compressed version in the file named by the second string.

uncompress-file : string string -> boolean, which expects the contents of the file named by the first string to be compressed data, uncompresses that data, and writes the result to the file named by the second string.

(The provided helpers do a small bit of magic for you: they write out the statistics at the beginning of the file before writing your bit list, and read the statistics back in before reading your bit list. Building this functionality yourself is not particularly difficult, but since it isn't particularly interesting we figured we'd save you the trouble.)
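Before wiring up compress-file, it can help to check the tree-building loop of Section 3 and the path-based codes of Section 4 in memory, without touching any files. The following is a minimal sketch under stated assumptions rather than the lab's reference solution: it runs in plain #lang racket with stand-in structures, it omits the eom token, it breaks ties by inserting a new tree after existing trees of equal weight, and the helper names weight, tokens, insert-sorted, and code-for are hypothetical.

```racket
#lang racket
;; Stand-ins for the lab's structures (assumption: no teachpack here).
(define-struct frequency (token count) #:transparent)
(define-struct leaf (token count) #:transparent)
(define-struct branch (l r tokens count) #:transparent)

;; weight : huffman-tree -> number
(define (weight t) (if (leaf? t) (leaf-count t) (branch-count t)))

;; tokens : huffman-tree -> (listof character)
(define (tokens t) (if (leaf? t) (list (leaf-token t)) (branch-tokens t)))

;; insert-sorted : huffman-tree (listof huffman-tree) -> (listof huffman-tree)
;; Keeps the list sorted by weight; t goes after trees of equal weight.
(define (insert-sorted t trees)
  (cond
    [(empty? trees) (list t)]
    [(< (weight t) (weight (first trees))) (cons t trees)]
    [else (cons (first trees) (insert-sorted t (rest trees)))]))

;; build-huffman-tree : statistics -> huffman-tree
;; Assumes the statistics contain at least one frequency.
(define (build-huffman-tree stats)
  (let loop ([trees (foldl insert-sorted
                           empty
                           (map (lambda (f)
                                  (make-leaf (frequency-token f) (frequency-count f)))
                                stats))])
    (if (empty? (rest trees))
        (first trees)
        (let ([a (first trees)]
              [b (second trees)])
          ;; Combine the two lightest trees and re-insert the result.
          (loop (insert-sorted
                 (make-branch a b
                              (append (tokens a) (tokens b))
                              (+ (weight a) (weight b)))
                 (rest (rest trees))))))))

;; code-for : character huffman-tree -> (listof bit)
;; Path from the root to c's leaf: 0 = left, 1 = right. Assumes c is in the tree.
(define (code-for c tree)
  (cond
    [(leaf? tree) empty]
    [(member c (tokens (branch-l tree))) (cons 0 (code-for c (branch-l tree)))]
    [else (cons 1 (code-for c (branch-r tree)))]))

;; encode-message : (listof character) huffman-tree -> (listof bit)
(define (encode-message message tree)
  (append-map (lambda (c) (code-for c tree)) message))

(define message (string->list "MISSISSIPPI"))
(define tree
  (build-huffman-tree (list (make-frequency #\M 1) (make-frequency #\I 4)
                            (make-frequency #\S 4) (make-frequency #\P 2))))
(length (encode-message message tree)) ; 21 bits, versus 88 in ASCII
```

With this tie-breaking rule the intermediate lists match Stages 0 through 3 of Section 3. A different rule for equal weights can produce a different tree with the same total encoded length.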
Information Retrieval Lecture 3 - Index compression Seminar für Sprachwissenschaft International Studies in Computational Linguistics Wintersemester 2007 1/ 30 Introduction Dictionary and inverted index:
More informationBinary Trees and Huffman Encoding Binary Search Trees
Binary Trees and Huffman Encoding Binary Search Trees Computer Science E-22 Harvard Extension School David G. Sullivan, Ph.D. Motivation: Maintaining a Sorted Collection of Data A data dictionary is a
More informationBinary Search Trees. Carlos Moreno uwaterloo.ca EIT https://ece.uwaterloo.ca/~cmoreno/ece250
Carlos Moreno cmoreno @ uwaterloo.ca EIT-4103 https://ece.uwaterloo.ca/~cmoreno/ece250 Standard reminder to set phones to silent/vibrate mode, please! Previously, on ECE-250... We discussed trees (the
More informationDepiction of program declaring a variable and then assigning it a value
Programming languages I have found, the easiest first computer language to learn is VBA, the macro programming language provided with Microsoft Office. All examples below, will All modern programming languages
More informationLinked Structures Songs, Games, Movies Part IV. Fall 2013 Carola Wenk
Linked Structures Songs, Games, Movies Part IV Fall 23 Carola Wenk Storing Text We ve been focusing on numbers. What about text? Animal, Bird, Cat, Car, Chase, Camp, Canal We can compare the lexicographic
More informationEncoding. A thesis submitted to the Graduate School of University of Cincinnati in
Lossless Data Compression for Security Purposes Using Huffman Encoding A thesis submitted to the Graduate School of University of Cincinnati in a partial fulfillment of requirements for the degree of Master
More informationTREES Lecture 10 CS2110 Spring2014
TREES Lecture 10 CS2110 Spring2014 Readings and Homework 2 Textbook, Chapter 23, 24 Homework: A thought problem (draw pictures!) Suppose you use trees to represent student schedules. For each student there
More informationMusic. Numbers correspond to course weeks EULA ESE150 Spring click OK Based on slides DeHon 1. !
MIC Lecture #7 Digital Logic Music 1 Numbers correspond to course weeks sample EULA D/A 10101001101 click OK Based on slides 2009--2018 speaker MP Player / iphone / Droid DeHon 1 2 A/D domain conversion
More information4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd
4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we
More informationLecture Notes on Binary Decision Diagrams
Lecture Notes on Binary Decision Diagrams 15-122: Principles of Imperative Computation William Lovas Notes by Frank Pfenning Lecture 25 April 21, 2011 1 Introduction In this lecture we revisit the important
More information7: Image Compression
7: Image Compression Mark Handley Image Compression GIF (Graphics Interchange Format) PNG (Portable Network Graphics) MNG (Multiple-image Network Graphics) JPEG (Join Picture Expert Group) 1 GIF (Graphics
More informationIn Java, data type boolean is used to represent Boolean data. Each boolean constant or variable can contain one of two values: true or false.
CS101, Mock Boolean Conditions, If-Then Boolean Expressions and Conditions The physical order of a program is the order in which the statements are listed. The logical order of a program is the order in
More informationLossless Compression Algorithms
Multimedia Data Compression Part I Chapter 7 Lossless Compression Algorithms 1 Chapter 7 Lossless Compression Algorithms 1. Introduction 2. Basics of Information Theory 3. Lossless Compression Algorithms
More informationFinal Examination CSE 100 UCSD (Practice)
Final Examination UCSD (Practice) RULES: 1. Don t start the exam until the instructor says to. 2. This is a closed-book, closed-notes, no-calculator exam. Don t refer to any materials other than the exam
More informationBoolean Logic & Branching Lab Conditional Tests
I. Boolean (Logical) Operations Boolean Logic & Branching Lab Conditional Tests 1. Review of Binary logic Three basic logical operations are commonly used in binary logic: and, or, and not. Table 1 lists
More informationSelec%on and Decision Structures in Java: If Statements and Switch Statements CSC 121 Spring 2016 Howard Rosenthal
Selec%on and Decision Structures in Java: If Statements and Switch Statements CSC 121 Spring 2016 Howard Rosenthal Lesson Goals Understand Control Structures Understand how to control the flow of a program
More informationDigital Image Processing
Digital Image Processing Image Compression Caution: The PDF version of this presentation will appear to have errors due to heavy use of animations Material in this presentation is largely based on/derived
More informationData compression.
Data compression anhtt-fit@mail.hut.edu.vn dungct@it-hut.edu.vn Data Compression Data in memory have used fixed length for representation For data transfer (in particular), this method is inefficient.
More informationBits, Words, and Integers
Computer Science 52 Bits, Words, and Integers Spring Semester, 2017 In this document, we look at how bits are organized into meaningful data. In particular, we will see the details of how integers are
More information(Refer Slide Time: 00:23)
In this session, we will learn about one more fundamental data type in C. So, far we have seen ints and floats. Ints are supposed to represent integers and floats are supposed to represent real numbers.
More informationDiscussion 2C Notes (Week 9, March 4) TA: Brian Choi Section Webpage:
Discussion 2C Notes (Week 9, March 4) TA: Brian Choi (schoi@cs.ucla.edu) Section Webpage: http://www.cs.ucla.edu/~schoi/cs32 Heaps A heap is a tree with special properties. In this class we will only consider
More information