CS02b Project 2 String compression with Huffman trees


PROJECT OVERVIEW

We've discussed how characters can be encoded into bits for storage in a computer. ASCII (7-8 bits per character) and Unicode (16+ bits per character) are two related standardized encodings used by Java and other computer systems. However, an encoding can use fewer bits (less memory) when the "alphabet" (the set of all possible characters) is small: fewer characters need representation, so fewer unique bit combinations (character codes) are required.

Huffman trees are a method of producing an encoding system that uses few bits for the most commonly used characters and more bits for rarely used characters, saving bits overall. This is one form of "compression": minimizing the memory required to store data.

For this project, you will complete a class whose primary functions are to:

1. build a Huffman tree based on a text passage,
2. encode a text passage into a "bit String" (a String of 1s and 0s) using an encoding from the tree, and
3. decode a bit String using the encoding represented by the tree.

Your starter code also comes with a function that builds a "standard" Huffman tree for you. Note that even small variations in Huffman trees will result in different encodings for some characters, which is why this standardized tree is provided: it lets you test your decode function in a way that PRECISELY matches the original encoding.

This project also gives you the opportunity to see and manage a larger amount of code than most of our assignments, as well as to practice working within and understanding a code base produced by another person. This is a valuable skill in programming: most advanced work is the result of multiple complex pieces of code produced by multiple people, working together!

Required Questions

Once your program is functioning, you should use it, and write some additional output statements in key locations in your code, to help you answer the following questions:

1. Consider encoding the passage from I Know Why the Caged Bird Sings (cagedbirdpassage.txt).
   a. How many unique characters are present in that passage? (In this case we mean text characters, not story characters!)
   b. If all characters were assigned unique codes using the same number of bits, how many bits would be required for each? (See our class example where we assigned unique 3-bit codes to the characters from "a man, a plan, a canal, panama". Why did we need 3 bits?)
   c. How many bits total would be required for the entire passage, using the number of bits from part (b) for each character?
   d. How many bits do you save by instead encoding the passage using your Huffman tree generated from the same text?
2. Consider encoding the passage from The Hobbit (thehobbitpassage.txt).
   a. How many bits does it take to encode it using a Huffman tree generated from its own text?
   b. How many bits does it take to encode it using the provided "standard" Huffman tree, which was generated from a different text passage?
   c. How do you explain this difference?
3. Using the standard Huffman tree provided, what's encoded in mysterypassage.txt?

DETAILED INSTRUCTIONS

Drag and drop the provided file, HuffmanTree.java, into the src folder for your CS02b project within the Eclipse package explorer. Or, if you like, you can make a brand new Java Project for this assignment and put your file in that src folder. If Eclipse asks, choose "copy", which allows Eclipse to manage the file in your workspace folder without worrying about what you do with the original.

Save the following text files in your project folder (but outside your src folder):

thehobbitpassage.txt
cagedbirdpassage.txt
mysterypassage.txt

This does not need to be done through the Eclipse package explorer, although if you want the files to show up there, you'll need to select your project folder and choose "refresh" from the menu (which is likely hotkeyed to F5).

Get to know the provided source code. An overview is provided below. Look at the different sections and make sure you have a clear picture of the different options, variables and functions.
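As a warm-up for question 1(b) above, here is a minimal sketch of the fixed-width calculation, applied to the class example rather than to the assignment passages. The class and method names (FixedWidthBits, uniqueChars, bitsNeeded) are made up for illustration and are not part of the starter code.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative helper only; none of these names come from HuffmanTree.java.
class FixedWidthBits {
    /** Counts the distinct characters that appear in the text. */
    static int uniqueChars(String text) {
        Set<Character> seen = new HashSet<>();
        for (char c : text.toCharArray()) {
            seen.add(c);
        }
        return seen.size();
    }

    /** Smallest number of bits b such that 2^b >= n (gives 0 when n is 1). */
    static int bitsNeeded(int n) {
        int bits = 0;
        while ((1 << bits) < n) {
            bits++;
        }
        return bits;
    }

    public static void main(String[] args) {
        String text = "a man, a plan, a canal, panama";
        int unique = uniqueChars(text);   // 8 distinct characters in this phrase
        int bits = bitsNeeded(unique);    // 3 bits, because 2^3 = 8
        System.out.println(unique + " unique characters need " + bits
                + " bits each, " + (bits * text.length()) + " bits in total");
    }
}
```

The same two helpers could be pointed at the contents of a passage file to check your answers to parts (a) through (c).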

Complete the code responsible for the three major tasks: tree building, encoding, and decoding. You can do these in any order and test them individually if you like. The standard tree can be used for both encoding and decoding without building your own, and the mysterypassage.txt file is already encoded using that standard tree if you would like to decode it without first encoding your own files.

Thoroughly test your code. You may un-comment or add your own println statements to give feedback throughout your program. One thing you might like to do is create a very small text file, for instance containing a single line with just a few characters, to test the basics before trying it on real English text. With a small enough file, you can manually check a generated tree or encoding for correctness. And once your encoding and decoding are working, you should be able to encode a file, decode it using the same tree, and confirm that the original text comes back out.

When your code is working (or perhaps in the process of testing), activate the standard tree and decode the mystery passage. If the result is not intelligible, your decode functions are probably not correct!

Answer questions 1 and 2 above in a multi-line comment at the end of your source code. You may run your program multiple times for this, changing which files are encoded/decoded as necessary. You may also switch between using the provided standard tree and your own built tree. Feel free to add any extra output statements to give you additional information as your code runs. There is no specific console output required as long as it's clear what your code is doing, the three primary operations work correctly, and you've answered the questions.

Submit the following via e-mail:

1. your HuffmanTree.java source code, with a comment at the bottom answering the two questions
2. your decoded.txt file showing the decoded version of mysterypassage.txt (you may rename the decoded file if you like)

Optionally, feel free to experiment and make additions to the code. My complete version of this project includes two extensions that are unfinished in your starter code:

1. the gapcheck function, which "fills in gaps" in a frequency map so the resulting tree is more flexible.
2. the buildbitrep function of the HuffmanParent class, which is part of a set of functions that can translate the tree itself to a bit String of 0s and 1s.

CODE OVERVIEW

As mentioned, there are a few distinct sections and processes going on in the code. This section describes the overall plan, but you should look at the code itself to get a sense of how it all fits together; it's thoroughly commented. By the way, this may be a good time to re-open that "outline" panel in Eclipse we've kept closed/minimized this whole time!

Terminology note: The term "prefix" is used in several places in this code, meaning "the first part of a String". In this program, prefixes are often added to one or two characters at a time, for instance while working your way recursively down a Huffman tree. Upon reaching a leaf, the "prefix" represents the path taken through the parent nodes to get there.

Tree node classes

These only appear at the end, but understanding them is key to working with the tree structure, so you may want to look at them first. The tree is represented by interconnected HuffmanNode objects. Since HuffmanNode is abstract, each node must be one of the two subclasses, HuffmanParent or HuffmanLeaf. HuffmanParent nodes keep track of the structure of the tree, while HuffmanLeaf nodes store the actual characters at the end of each series of branches. The two subclasses have appropriate (different) definitions for the same recursive methods to enable tree operations (see below). Because HuffmanNode declares these methods, polymorphism is used to call them recursively from parent nodes, regardless of whether a parent's children are HuffmanLeaf nodes or are themselves HuffmanParent nodes with even more nodes below them. Thus, the recursive calls travel easily down the tree and into the leaves.

Constants

These static final variables at the top of the HuffmanTree class cannot be changed once the program has begun. In an application produced for the general public, these variables would probably be replaced with configuration files, a user interface, or some other way to specify exactly what the program is supposed to be doing without having to change the source code. In our case, these variables serve our needs just fine. Take a look at what each is supposed to do and where each is used. One variable you might like to add to this section is a file name of your own so that you can easily test your trees on any piece of text you like. You can create a new file in Eclipse through the "new" options.

To answer the questions, you will definitely need to change which files are encoded and decoded by changing the file selections represented in these constants. One thing you might like to do while testing is set your program to decode the very same file that was just generated during encoding (ENCODE_OUT_F).

Main method

This long method is already finished, although there are a few lines where System.out.println function calls may be commented in or out, and you may add your own println statements as well. This may be useful while testing and answering questions. For the most part you shouldn't have to change existing code in this method; instead, focus on making the individual methods work correctly when called FROM the main method.

Tree building methods

You will need to complete the genfrequencymap and maptotree methods for a tree to be built correctly. The first creates a Map from Characters to their frequencies based on a char[], and the second uses that Map to build the actual Huffman tree. These are both used by the gentree method, which is already complete but may be a useful place to put some additional output statements.

Two (overloaded) genstdtree methods are provided and already completed. They generate a standard tree from the included bit String instead of building a brand new one. You don't need to change these, although it may be interesting to see how they work (see the last section of this document for how trees are encoded as bits).

Encoding methods

As we discussed, there are two main approaches to encoding a sequence of characters: mapping each possible character to its node within the tree ahead of time, and following the chain of parent nodes up to the top to determine the bit String each time a character is encoded, OR pre-generating a mapping directly from each possible character to its bit String. This project uses the latter approach.
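The tree-building and code-table steps described above can be sketched roughly as follows. This is a minimal illustration, not the starter code: the class names (HuffmanSketch, Node, Leaf, Parent) and method names (frequencies, buildTree, collectCodes, codeTable, encode) are all hypothetical, and the real signatures of genfrequencymap, maptotree and setbitstrings in HuffmanTree.java may differ. It assumes the usual priority-queue construction, in which the two least frequent nodes are repeatedly merged under a new parent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch; the starter code's HuffmanNode classes may look different.
class HuffmanSketch {
    static abstract class Node {
        final int freq;
        Node(int freq) { this.freq = freq; }
        /** Records the bit String of every leaf at or below this node. */
        abstract void collectCodes(String prefix, Map<Character, String> codes);
    }

    static final class Leaf extends Node {
        final char ch;
        Leaf(char ch, int freq) { super(freq); this.ch = ch; }
        @Override void collectCodes(String prefix, Map<Character, String> codes) {
            codes.put(ch, prefix);  // the prefix is the path taken to reach this leaf
        }
    }

    static final class Parent extends Node {
        final Node left, right;
        Parent(Node left, Node right) {
            super(left.freq + right.freq);
            this.left = left;
            this.right = right;
        }
        @Override void collectCodes(String prefix, Map<Character, String> codes) {
            left.collectCodes(prefix + "0", codes);   // left branch adds a 0
            right.collectCodes(prefix + "1", codes);  // right branch adds a 1
        }
    }

    /** Counts how often each character occurs. */
    static Map<Character, Integer> frequencies(char[] text) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : text) {
            freq.merge(c, 1, Integer::sum);
        }
        return freq;
    }

    /** Merges the two least frequent nodes until one root remains (null if the map is empty). */
    static Node buildTree(Map<Character, Integer> freq) {
        PriorityQueue<Node> queue =
                new PriorityQueue<Node>((a, b) -> Integer.compare(a.freq, b.freq));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new Leaf(e.getKey(), e.getValue()));
        }
        while (queue.size() > 1) {
            Node first = queue.poll();
            Node second = queue.poll();
            queue.add(new Parent(first, second));
        }
        return queue.poll();
    }

    /** Builds the character-to-bit-String map by walking the whole tree once. */
    static Map<Character, String> codeTable(Node root) {
        Map<Character, String> codes = new HashMap<>();
        root.collectCodes("", codes);
        return codes;
    }

    /** Encodes text by looking each character up in the pre-generated map. */
    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            bits.append(codes.get(c));
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        String text = "aaaabbc";
        Map<Character, String> codes = codeTable(buildTree(frequencies(text.toCharArray())));
        System.out.println(codes + " -> " + encode(text, codes));
    }
}
```

Two caveats about this sketch: ties between equal frequencies are broken arbitrarily, which is one reason two correct trees can assign different codes, and a passage containing only one distinct character would give that character an empty bit String.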
A third approach, searching the entire tree from the top down every time a character needs encoding, would be extremely inefficient.

You'll have to finish the recursive setbitstrings methods of the Huffman node subclasses. A parent node is responsible for propagating the recursive call to its children so that they too can be added to the Map, providing those children with the correct bit Strings representing the branches taken down to that point. A leaf node is responsible for adding itself to the Map.

Once the Map from characters to their bit Strings is finished, encoding is quite straightforward, so the encode function has been completed for you. The hard part is recursively traversing the tree and getting each character into the Map in the first place.

Decoding methods

You must complete the decode method of the Huffman node subclasses. Decoding works perfectly well without a Map, instead starting at the top of the tree and reading input bits one by one to decide which branch to take at each point. That's exactly what this method should do for the parent nodes, using the provided CharArrayIterator to advance through the different bit characters. Once a leaf is reached, no further bits are required; the character has been found. Again, once the recursive tree methods are complete, the rest is very straightforward, so the non-recursive static decode method has been completed for you already.

A note about chars

If we were writing professional compression software, instead of converting each character to multiple '0' and '1' chars (which take up the same amount of memory as any other character, after all), we'd be converting them to "raw binary" 0s and 1s (which really do only take one bit to store) before writing them to a file, creating an actual reduction in file size. However, the purpose of this project is to demonstrate the compression techniques, so to make it as easy as possible to see the results of your encoding, we'll keep them as chars. Remember to treat them this way in the code!

OPTIONAL ADDITIONS

That concludes the sections of code you are required to complete. If you'd like some more challenges, here are two additional features you might try. They look more complicated than they are, they shouldn't take too much code, and they allow you to do some pretty interesting things!
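Returning to the decoding procedure described under "Decoding methods" above, the idea can be sketched as below. Everything here is hypothetical: BitCursor is only a stand-in for the provided CharArrayIterator (whose actual methods are not shown in this document), and the node classes are simplified versions written for this example.

```java
// Hypothetical sketch; not the starter code's classes or method signatures.
class DecodeSketch {
    /** Stand-in for the starter code's CharArrayIterator. */
    static final class BitCursor {
        private final char[] bits;
        private int pos = 0;
        BitCursor(String bitString) { this.bits = bitString.toCharArray(); }
        boolean hasNext() { return pos < bits.length; }
        char next() { return bits[pos++]; }
    }

    static abstract class Node {
        /** Consumes as many bits as needed to reach a leaf and returns its character. */
        abstract char decode(BitCursor bits);
    }

    static final class Leaf extends Node {
        final char ch;
        Leaf(char ch) { this.ch = ch; }
        @Override char decode(BitCursor bits) {
            return ch;  // a leaf needs no further bits
        }
    }

    static final class Parent extends Node {
        final Node left, right;
        Parent(Node left, Node right) { this.left = left; this.right = right; }
        @Override char decode(BitCursor bits) {
            // '0' means take the left branch, anything else the right branch.
            Node branch = (bits.next() == '0') ? left : right;
            return branch.decode(bits);
        }
    }

    /** Decodes one character at a time, restarting at the root, until the bits run out. */
    static String decodeAll(Parent root, String bitString) {
        BitCursor cursor = new BitCursor(bitString);
        StringBuilder text = new StringBuilder();
        while (cursor.hasNext()) {
            text.append(root.decode(cursor));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Tree in which 'a' is 0, 'b' is 10 and 'c' is 11.
        Parent root = new Parent(new Leaf('a'), new Parent(new Leaf('b'), new Leaf('c')));
        System.out.println(decodeAll(root, "0101100"));
    }
}
```

The root is typed as Parent here so that every decoded character consumes at least one bit. This sketch does no error handling: a bit String that ends in the middle of a character would make next() run past the end of the array.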
Gap check

Certain characters appear in some passages and not others. A properly generated Huffman tree can always be used to encode the passage it was generated from, but a tree can't be used to fully encode a passage for which it's missing characters (although it could just skip those characters). One way to handle this is to have your tree-generating function, after generating the "frequency count", add any missing characters to the queue/tree with a frequency of 0. This way, they'll be at the bottom of the tree (the way generation works, they should all end up in one big sub-tree "hanging off" the bottom), but they will still get encodings, even if they're really long ones.

This could conceivably allow two people to agree on a "standard passage" to always use for tree generation. They could then encode ANY messages they like using that tree and send them to each other in compressed form. Decoding the messages would use that same standard tree.

In our project, some letters may not be present in some passages. This is especially true of the less common capital letters. Additionally, the following non-alphabetic characters appear in some of the provided passages and not others: '!' '"' '(' ')' '/' ':' ';' '?'. The Angelou passage contains four of these, the Tolkien passage five, and the mystery passage four.

The standard tree was generated by looping through the following character groups and adding any characters from them to the tree-generating queue if necessary:

All capital letters
All lowercase letters
Characters with ASCII values 32-34: ' ' '!' and '"'
Characters with ASCII values 39-41: '\'' '(' and ')'
Characters with ASCII values 44-47: ',' '-' '.' and '/'
These characters with inconvenient ASCII values: '\n' ':' ';' '?'

The standard tree is therefore capable of encoding and decoding ANY of the three passages provided, although it might use more bits to do so than a tree built specifically for a given passage. By the way, we could have also included the digits and certain other punctuation chars in this group, but none of our sample passages include those characters, so we've chosen not to worry about them.
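The character groups listed above could be filled into a frequency map along these lines. This is only one possible sketch: GapCheckSketch, addRange and fillGaps are made-up names, and the real gapcheck method in the starter code may take different parameters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a gap check; not the starter code's gapcheck signature.
class GapCheckSketch {
    /** Adds every character from first to last (inclusive) with frequency 0 if it is missing. */
    static void addRange(Map<Character, Integer> freq, char first, char last) {
        for (char c = first; c <= last; c++) {
            freq.putIfAbsent(c, 0);  // leaves existing counts untouched
        }
    }

    /** Fills in the character groups used for the standard tree. */
    static void fillGaps(Map<Character, Integer> freq) {
        addRange(freq, 'A', 'Z');
        addRange(freq, 'a', 'z');
        addRange(freq, (char) 32, (char) 34);  // ' ' '!' '"'
        addRange(freq, (char) 39, (char) 41);  // '\'' '(' ')'
        addRange(freq, (char) 44, (char) 47);  // ',' '-' '.' '/'
        for (char c : new char[] {'\n', ':', ';', '?'}) {
            freq.putIfAbsent(c, 0);
        }
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq = new HashMap<>();
        freq.put('e', 5);
        fillGaps(freq);
        System.out.println(freq.size() + " characters, 'e' still has count " + freq.get('e'));
    }
}
```

Because a char is a number behind the scenes, a contiguous group only needs its first and last character, so the letters never have to be listed one by one.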
Your mission, should you choose to accept it, is to complete the gapcheck method to fill in any missing characters in the frequency map so that the generated tree will include those characters too. The most obvious way to do this is to create a long array of all the characters you want to double-check for, then loop through them all. However, there are cleverer ways to set up your loops to avoid having to list out every single character to check in your code. Remember, each character is represented by a number behind the scenes, so you can loop through them if you know what order they occur in.

Bit representation of the tree itself

When files are sent or stored in encoded form, they are very hard to decode unless the decoder knows exactly which tree was used in the first place. Some sort of standard passage could be used to always build the same tree (as described above), but this can be inconvenient to communicate and may result in inefficient compression, since the standard tree isn't custom-built for each different encoded file.

One solution is to encode the tree itself in binary and include it at the beginning of the file in a standardized form. One straightforward standard is to start at the top and write a 0 for each parent node (which always has 2 children following it), and a 1 for each leaf, immediately followed by the 8-digit ASCII character code (in binary) for that leaf. In fact, that is precisely the standard used to encode the tree provided to you in this project, and the genstdtree methods decode any bit String using that standard (although it obviously reads the bit String from the constant variable rather than from a file).

For example, a small tree with only three leaves could be encoded as follows (spaces added for clarity):

0            root
1 01100001   left/"0" child of root
0            right/"1" child of root (itself an entire sub-tree): parent node in sub-tree
1 01100010   left/"0" child in sub-tree
1 01100011   right/"1" child in sub-tree

Note that following each 1 identifying a leaf node is the 8-bit ASCII code for 'a', 'b', or 'c'. So this represents a Huffman tree where 'a' is encoded as 0 (the only leaf on that side of the root), 'b' is encoded as 10, and 'c' as 11 (the two leaves on the other side of the root).
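The standard described above (a 0 for each parent, a 1 plus an 8-bit ASCII code for each leaf) can be sketched in both directions as follows. The names here (TreeBitsSketch, bitRep, parse) are hypothetical and the classes are simplified stand-ins, so this is not the starter code's bitrep, buildbitrep or genstdtree; it also assumes every character code fits in 8 bits.

```java
// Hypothetical sketch of writing a tree as bits and reading it back.
class TreeBitsSketch {
    static abstract class Node {
        /** Appends this node's bit representation to out. */
        abstract void bitRep(StringBuilder out);
    }

    static final class Leaf extends Node {
        final char ch;
        Leaf(char ch) { this.ch = ch; }
        @Override void bitRep(StringBuilder out) {
            String binary = Integer.toBinaryString(ch);  // has no leading zeros
            out.append('1');
            for (int i = binary.length(); i < 8; i++) {
                out.append('0');                         // pad to 8 digits
            }
            out.append(binary);
        }
    }

    static final class Parent extends Node {
        final Node left, right;
        Parent(Node left, Node right) { this.left = left; this.right = right; }
        @Override void bitRep(StringBuilder out) {
            out.append('0');     // marks a parent, whose two children follow
            left.bitRep(out);
            right.bitRep(out);
        }
    }

    /** Reads one node starting at pos[0] and advances pos[0] past it. */
    static Node parse(String bits, int[] pos) {
        if (bits.charAt(pos[0]++) == '0') {
            Node left = parse(bits, pos);
            Node right = parse(bits, pos);
            return new Parent(left, right);
        }
        char ch = (char) Integer.parseInt(bits.substring(pos[0], pos[0] + 8), 2);
        pos[0] += 8;
        return new Leaf(ch);
    }

    static String toBits(Node root) {
        StringBuilder out = new StringBuilder();
        root.bitRep(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Tree in which 'a' is 0, 'b' is 10 and 'c' is 11.
        Node root = new Parent(new Leaf('a'), new Parent(new Leaf('b'), new Leaf('c')));
        String bits = toBits(root);
        System.out.println(bits);
        System.out.println(toBits(parse(bits, new int[] {0})).equals(bits));
    }
}
```

The one-element int array is just a simple way to let the recursive calls share a read position; an iterator object would serve the same purpose.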

So, this technique allows ANY file using ANY tree to be encoded and communicated, as long as the receiving program is using the same standard to encode and decode its trees and files. Fortunately, this is much easier to standardize than having to settle on a single tree to use for every communication!

The project code for converting a tree to and from a bit String is actually nearly complete already. Obviously, the decoding part is already finished, or else the provided standard tree could not be built by the genstdtree method. And the translation of a character to its 8-bit ASCII code uses enough functions and operations you haven't used before that it has been provided for you in the HuffmanLeaf class. The only thing left to do is fill in the buildbitrep function in the HuffmanParent class, call bitrep() on the root of the tree from the main method, and print the result. (Actually, you could go even further and write it to the beginning of each encoded file. But you'll have to do that part on your own, and also, mysterypassage.txt was not encoded this way, so make sure you don't try to decode it using that format!)

Display (already finished)

The recursive display function has already been completed in the node classes, but you may learn something from looking at the definition and working out how it operates!


More information

In our first lecture on sets and set theory, we introduced a bunch of new symbols and terminology.

In our first lecture on sets and set theory, we introduced a bunch of new symbols and terminology. Guide to and Hi everybody! In our first lecture on sets and set theory, we introduced a bunch of new symbols and terminology. This guide focuses on two of those symbols: and. These symbols represent concepts

More information

Binary, Hexadecimal and Octal number system

Binary, Hexadecimal and Octal number system Binary, Hexadecimal and Octal number system Binary, hexadecimal, and octal refer to different number systems. The one that we typically use is called decimal. These number systems refer to the number of

More information

Using Eclipse and Karel

Using Eclipse and Karel Alisha Adam and Rohit Talreja CS 106A Summer 2016 Using Eclipse and Karel Based on a similar handout written by Eric Roberts, Mehran Sahami, Keith Schwarz, and Marty Stepp If you have not already installed

More information

New to the Mac? Then start with this lesson to learn the basics.

New to the Mac? Then start with this lesson to learn the basics. Mac 101: Mac essentials If you're brand new to the world of computers and are just starting to get up and running, relax using a Mac is simple. This lesson introduces you to the basic principles of interacting

More information

CS 170 Java Tools. Step 1: Got Java?

CS 170 Java Tools. Step 1: Got Java? CS 170 Java Tools This summer in CS 170 we'll be using the DrJava Integrated Development Environment. You're free to use other tools but this is what you'll use on your programming exams, so you'll need

More information

Hi everyone. I hope everyone had a good Fourth of July. Today we're going to be covering graph search. Now, whenever we bring up graph algorithms, we

Hi everyone. I hope everyone had a good Fourth of July. Today we're going to be covering graph search. Now, whenever we bring up graph algorithms, we Hi everyone. I hope everyone had a good Fourth of July. Today we're going to be covering graph search. Now, whenever we bring up graph algorithms, we have to talk about the way in which we represent the

More information

A PROGRAM IS A SEQUENCE of instructions that a computer can execute to

A PROGRAM IS A SEQUENCE of instructions that a computer can execute to A PROGRAM IS A SEQUENCE of instructions that a computer can execute to perform some task. A simple enough idea, but for the computer to make any use of the instructions, they must be written in a form

More information

Building Java Programs. Priority Queues, Huffman Encoding

Building Java Programs. Priority Queues, Huffman Encoding Building Java Programs Priority Queues, Huffman Encoding Prioritization problems ER scheduling: You are in charge of scheduling patients for treatment in the ER. A gunshot victim should probably get treatment

More information

COMP-202 Unit 4: Programming with Iterations

COMP-202 Unit 4: Programming with Iterations COMP-202 Unit 4: Programming with Iterations Doing the same thing again and again and again and again and again and again and again and again and again... CONTENTS: While loops Class (static) variables

More information

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi.

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi. Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi Lecture 18 Tries Today we are going to be talking about another data

More information

CS106B Handout 34 Autumn 2012 November 12 th, 2012 Data Compression and Huffman Encoding

CS106B Handout 34 Autumn 2012 November 12 th, 2012 Data Compression and Huffman Encoding CS6B Handout 34 Autumn 22 November 2 th, 22 Data Compression and Huffman Encoding Handout written by Julie Zelenski. In the early 98s, personal computers had hard disks that were no larger than MB; today,

More information

CSE 100 Advanced Data Structures

CSE 100 Advanced Data Structures CSE 100 Advanced Data Structures Overview of course requirements Outline of CSE 100 topics Review of trees Helpful hints for team programming Information about computer accounts Page 1 of 25 CSE 100 web

More information

UNIT III BALANCED SEARCH TREES AND INDEXING

UNIT III BALANCED SEARCH TREES AND INDEXING UNIT III BALANCED SEARCH TREES AND INDEXING OBJECTIVE The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions and finds in constant

More information

printf( Please enter another number: ); scanf( %d, &num2);

printf( Please enter another number: ); scanf( %d, &num2); CIT 593 Intro to Computer Systems Lecture #13 (11/1/12) Now that we've looked at how an assembly language program runs on a computer, we're ready to move up a level and start working with more powerful

More information

Binary Trees

Binary Trees Binary Trees 4-7-2005 Opening Discussion What did we talk about last class? Do you have any code to show? Do you have any questions about the assignment? What is a Tree? You are all familiar with what

More information

Greedy Algorithms CHAPTER 16

Greedy Algorithms CHAPTER 16 CHAPTER 16 Greedy Algorithms In dynamic programming, the optimal solution is described in a recursive manner, and then is computed ``bottom up''. Dynamic programming is a powerful technique, but it often

More information

1 Getting used to Python

1 Getting used to Python 1 Getting used to Python We assume you know how to program in some language, but are new to Python. We'll use Java as an informal running comparative example. Here are what we think are the most important

More information

6.001 Notes: Section 8.1

6.001 Notes: Section 8.1 6.001 Notes: Section 8.1 Slide 8.1.1 In this lecture we are going to introduce a new data type, specifically to deal with symbols. This may sound a bit odd, but if you step back, you may realize that everything

More information

TourMaker Reference Manual. Intro

TourMaker Reference Manual. Intro TourMaker Reference Manual Intro Getting Started Tutorial: Edit An Existing Tour Key Features & Tips Tutorial: Create A New Tour Posting A Tour Run Tours From Your Hard Drive Intro The World Wide Web is

More information

CSC148 Week 6. Larry Zhang

CSC148 Week 6. Larry Zhang CSC148 Week 6 Larry Zhang 1 Announcements Test 1 coverage: trees (topic of today and Wednesday) are not covered Assignment 1 slides posted on the course website. 2 Data Structures 3 Data Structures A data

More information

Graduate-Credit Programming Project

Graduate-Credit Programming Project Graduate-Credit Programming Project Due by 11:59 p.m. on December 14 Overview For this project, you will: develop the data structures associated with Huffman encoding use these data structures and the

More information

Huffman Coding. Version of October 13, Version of October 13, 2014 Huffman Coding 1 / 27

Huffman Coding. Version of October 13, Version of October 13, 2014 Huffman Coding 1 / 27 Huffman Coding Version of October 13, 2014 Version of October 13, 2014 Huffman Coding 1 / 27 Outline Outline Coding and Decoding The optimal source coding problem Huffman coding: A greedy algorithm Correctness

More information

Ruby on Rails Welcome. Using the exercise files

Ruby on Rails Welcome. Using the exercise files Ruby on Rails Welcome Welcome to Ruby on Rails Essential Training. In this course, we're going to learn the popular open source web development framework. We will walk through each part of the framework,

More information

Java Programming Constructs Java Programming 2 Lesson 1

Java Programming Constructs Java Programming 2 Lesson 1 Java Programming Constructs Java Programming 2 Lesson 1 Course Objectives Welcome to OST's Java 2 course! In this course, you'll learn more in-depth concepts and syntax of the Java Programming language.

More information

B-Trees. Introduction. Definitions

B-Trees. Introduction. Definitions 1 of 10 B-Trees Introduction A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node may contain a large number of keys. The number of subtrees of each node,

More information

Slide 1 Side Effects Duration: 00:00:53 Advance mode: Auto

Slide 1 Side Effects Duration: 00:00:53 Advance mode: Auto Side Effects The 5 numeric operators don't modify their operands Consider this example: int sum = num1 + num2; num1 and num2 are unchanged after this The variable sum is changed This change is called a

More information

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd 4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we

More information

An Overview 1 / 10. CS106B Winter Handout #21 March 3, 2017 Huffman Encoding and Data Compression

An Overview 1 / 10. CS106B Winter Handout #21 March 3, 2017 Huffman Encoding and Data Compression CS106B Winter 2017 Handout #21 March 3, 2017 Huffman Encoding and Data Compression Handout by Julie Zelenski with minor edits by Keith Schwarz In the early 1980s, personal computers had hard disks that

More information

Designing a Database -- Understanding Relational Design

Designing a Database -- Understanding Relational Design Designing a Database -- Understanding Relational Design Contents Overview The Database Design Process Steps in Designing a Database Common Design Problems Determining the Purpose Determining the Tables

More information

Notice on Access to Advanced Lists...2 Database Overview...2 Example: Real-life concept of a database... 2

Notice on Access to Advanced Lists...2 Database Overview...2 Example: Real-life concept of a database... 2 Table of Contents Notice on Access to Advanced Lists...2 Database Overview...2 Example: Real-life concept of a database... 2 Queries...2 Example: Real-life concept of a query... 2 Database terminology...3

More information

Analysis of Algorithms

Analysis of Algorithms Algorithm An algorithm is a procedure or formula for solving a problem, based on conducting a sequence of specified actions. A computer program can be viewed as an elaborate algorithm. In mathematics and

More information

Naming Things in Adafruit IO

Naming Things in Adafruit IO Naming Things in Adafruit IO Created by Adam Bachman Last updated on 2016-07-27 09:29:53 PM UTC Guide Contents Guide Contents Introduction The Two Feed Identifiers Name Key Aside: Naming things in MQTT

More information

static CS106L Spring 2009 Handout #21 May 12, 2009 Introduction

static CS106L Spring 2009 Handout #21 May 12, 2009 Introduction CS106L Spring 2009 Handout #21 May 12, 2009 static Introduction Most of the time, you'll design classes so that any two instances of that class are independent. That is, if you have two objects one and

More information

15-122: Principles of Imperative Computation, Spring 2013

15-122: Principles of Imperative Computation, Spring 2013 15-122 Homework 6 Page 1 of 13 15-122: Principles of Imperative Computation, Spring 2013 Homework 6 Programming: Huffmanlab Due: Thursday, April 4, 2013 by 23:59 For the programming portion of this week

More information

Formal Methods of Software Design, Eric Hehner, segment 24 page 1 out of 5

Formal Methods of Software Design, Eric Hehner, segment 24 page 1 out of 5 Formal Methods of Software Design, Eric Hehner, segment 24 page 1 out of 5 [talking head] This lecture we study theory design and implementation. Programmers have two roles to play here. In one role, they

More information

[key, Left subtree, Right subtree]

[key, Left subtree, Right subtree] Project: Binary Search Trees A binary search tree is a method to organize data, together with operations on these data (i.e., it is a data structure). In particular, the operation that this organization

More information

There are many other applications like constructing the expression tree from the postorder expression. I leave you with an idea as how to do it.

There are many other applications like constructing the expression tree from the postorder expression. I leave you with an idea as how to do it. Programming, Data Structures and Algorithms Prof. Hema Murthy Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture 49 Module 09 Other applications: expression tree

More information

MITOCW watch?v=flgjisf3l78

MITOCW watch?v=flgjisf3l78 MITOCW watch?v=flgjisf3l78 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To

More information

Tree Structures. A hierarchical data structure whose point of entry is the root node

Tree Structures. A hierarchical data structure whose point of entry is the root node Binary Trees 1 Tree Structures A tree is A hierarchical data structure whose point of entry is the root node This structure can be partitioned into disjoint subsets These subsets are themselves trees and

More information

SQL - Tables. SQL - Create a SQL Table. SQL Create Table Query:

SQL - Tables. SQL - Create a SQL Table. SQL Create Table Query: SQL - Tables Data is stored inside SQL tables which are contained within SQL databases. A single database can house hundreds of tables, each playing its own unique role in th+e database schema. While database

More information

CONTENTS: While loops Class (static) variables and constants Top Down Programming For loops Nested Loops

CONTENTS: While loops Class (static) variables and constants Top Down Programming For loops Nested Loops COMP-202 Unit 4: Programming with Iterations Doing the same thing again and again and again and again and again and again and again and again and again... CONTENTS: While loops Class (static) variables

More information

The Stack, Free Store, and Global Namespace

The Stack, Free Store, and Global Namespace Pointers This tutorial is my attempt at clarifying pointers for anyone still confused about them. Pointers are notoriously hard to grasp, so I thought I'd take a shot at explaining them. The more information

More information

CSE100. Advanced Data Structures. Lecture 13. (Based on Paul Kube course materials)

CSE100. Advanced Data Structures. Lecture 13. (Based on Paul Kube course materials) CSE100 Advanced Data Structures Lecture 13 (Based on Paul Kube course materials) CSE 100 Priority Queues in Huffman s algorithm Heaps and Priority Queues Time and space costs of coding with Huffman codes

More information

Android Programming Family Fun Day using AppInventor

Android Programming Family Fun Day using AppInventor Android Programming Family Fun Day using AppInventor Table of Contents A step-by-step guide to making a simple app...2 Getting your app running on the emulator...9 Getting your app onto your phone or tablet...10

More information

CS2112 Fall Assignment 4 Parsing and Fault Injection. Due: March 18, 2014 Overview draft due: March 14, 2014

CS2112 Fall Assignment 4 Parsing and Fault Injection. Due: March 18, 2014 Overview draft due: March 14, 2014 CS2112 Fall 2014 Assignment 4 Parsing and Fault Injection Due: March 18, 2014 Overview draft due: March 14, 2014 Compilers and bug-finding systems operate on source code to produce compiled code and lists

More information

2. INSTALLATION OF SUSE

2. INSTALLATION OF SUSE 2. INSTALLATION OF SUSE 2.1. PREINSTALLATION STEPS 2.1.1. Overview Installing any kind of operating system is a big move and can come as something of a shock to our PC. However, SUSE Linux makes this complicated

More information

Linked lists. Yet another Abstract Data Type Provides another method for providing space-efficient storage of data

Linked lists. Yet another Abstract Data Type Provides another method for providing space-efficient storage of data Linked lists One of the classic "linear structures" What are linked lists? Yet another Abstract Data Type Provides another method for providing space-efficient storage of data What do they look like? Linked

More information

MITOCW watch?v=v3omvlzi0we

MITOCW watch?v=v3omvlzi0we MITOCW watch?v=v3omvlzi0we The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To

More information

Post Experiment Interview Questions

Post Experiment Interview Questions Post Experiment Interview Questions Questions about the Maximum Problem 1. What is this problem statement asking? 2. What is meant by positive integers? 3. What does it mean by the user entering valid

More information

EECS 311: Data Structures and Data Management Program 1 Assigned: 10/21/10 Checkpoint: 11/2/10; Due: 11/9/10

EECS 311: Data Structures and Data Management Program 1 Assigned: 10/21/10 Checkpoint: 11/2/10; Due: 11/9/10 EECS 311: Data Structures and Data Management Program 1 Assigned: 10/21/10 Checkpoint: 11/2/10; Due: 11/9/10 1 Project: Scheme Parser. In many respects, the ultimate program is an interpreter. Why? Because

More information

The IBM I A Different Roadmap

The IBM I A Different Roadmap The IBM I A Different Roadmap Not long ago I was reading an article about a session Steve Will gave on how to make the IBM i "sexy". Those who know me know that that would immediately start me thinking

More information

CS103 Handout 50 Fall 2018 November 30, 2018 Problem Set 9

CS103 Handout 50 Fall 2018 November 30, 2018 Problem Set 9 CS103 Handout 50 Fall 2018 November 30, 2018 Problem Set 9 What problems are beyond our capacity to solve? Why are they so hard? And why is anything that we've discussed this quarter at all practically

More information

CIS 121 Data Structures and Algorithms with Java Spring 2018

CIS 121 Data Structures and Algorithms with Java Spring 2018 CIS 121 Data Structures and Algorithms with Java Spring 2018 Homework 6 Compression Due: Monday, March 12, 11:59pm online 2 Required Problems (45 points), Qualitative Questions (10 points), and Style and

More information

Assignment #6: Markov-Chain Language Learner CSCI E-220 Artificial Intelligence Due: Thursday, October 27, 2011

Assignment #6: Markov-Chain Language Learner CSCI E-220 Artificial Intelligence Due: Thursday, October 27, 2011 1. General Idea Assignment #6: Markov-Chain Language Learner CSCI E-220 Artificial Intelligence Due: Thursday, October 27, 2011 You will write a program that determines (as best as possible) the language

More information

MITOCW watch?v=0jljzrnhwoi

MITOCW watch?v=0jljzrnhwoi MITOCW watch?v=0jljzrnhwoi The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To

More information

Physical Level of Databases: B+-Trees

Physical Level of Databases: B+-Trees Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,

More information