CS02b Project 2 String compression with Huffman trees
- Harvey Chapman
- 5 years ago
PROJECT OVERVIEW

We've discussed how characters can be encoded into bits for storage in a computer. ASCII (7-8 bits per character) and Unicode (16+ bits per character) are two related standardized encodings used by Java and other computer systems. However, an encoding can use fewer bits (less memory) when the "alphabet" (set of all possible characters) is small, requiring fewer unique bit combinations (character codes) since fewer characters need representation.

Huffman trees are a method of producing an encoding system that uses few bits for the most commonly used characters, and more bits for rarely used characters, saving bits overall. This is one form of "compression", minimizing the memory required to store data.

For this project, you will complete a class whose primary functions are to:
1. build a Huffman tree based on a text passage,
2. encode a text passage into a "bit String" (String of 1s and 0s) using an encoding from the tree, and
3. decode a bit String using the encoding represented by the tree.

Your starter code also comes with a function that builds a "standard" Huffman Tree for you. Note that even small variations in Huffman Trees will result in different encodings for some characters, which is why this standardized tree is provided for you to test your decode function in a way that PRECISELY matches the original encoding.

This project also gives you the opportunity to see and manage a larger amount of code than most of our assignments, as well as practice working within and understanding a code base produced by another person. This is a valuable skill in programming: most advanced work is the result of multiple complex pieces of code, produced by multiple people, working together!

Required Questions

Once your program is functioning, you should use it and write some additional output statements in key locations in your code to help you answer the following questions:

1. Consider encoding the passage from I Know Why the Caged Bird Sings (cagedbirdpassage.txt).
   a. How many unique characters are present in that passage? (In this case we mean text characters, not story characters!)
   b. If all characters were assigned unique codes using the same number of bits, how many bits would be required for each? (See our class example where we assigned unique 3-bit codes to the characters from "a man, a plan, a canal, panama". Why did we need 3 bits?)
   c. How many bits total would be required for the entire passage, using the number of bits from part (b) for each character?
   d. How many bits do you save by instead encoding the passage using your Huffman tree generated from the same text?
2. Consider encoding the passage from The Hobbit (thehobbitpassage.txt).
   a. How many bits does it take to encode it using a Huffman tree generated using its own text?
   b. How many bits does it take to encode using the provided "standard" Huffman tree, which was generated from a different text passage?
   c. How do you explain this difference?
3. Using the standard Huffman tree provided, what's encoded in mysterypassage.txt?

DETAILED INSTRUCTIONS

Drag and drop the provided file, HuffmanTree.java, into the src folder for your CS02b project within the Eclipse package explorer, or if you like, you can make a brand new Java Project for this assignment and put your file in that src folder. If Eclipse asks, choose "copy", which allows Eclipse to manage the file in your workspace folder without worrying about what you do with the original.

Save the following text files in your project folder (but outside your src folder):
thehobbitpassage.txt
cagedbirdpassage.txt
mysterypassage.txt
This does not need to be done through the Eclipse package explorer, although if you want them to show up there, you'll need to select your project folder and choose "refresh" from the menu (which is likely hotkeyed to F5).

Get to know the provided source code. An overview is provided below.
Look at the different sections and make sure you have a clear picture of the different options, variables and functions.

Complete the code responsible for the three major tasks: tree building, encoding, and decoding. You can do these in any order and test them individually if you like; the standard tree can be used for both encoding and decoding without building your own, and the mysterypassage.txt file is already encoded using that standard tree if you would like to decode it without first encoding your own files.

Thoroughly test your code. You may uncomment or add your own println statements to give feedback throughout your program. One thing you might like to do is create a very small text file, for instance containing a single line with just a few characters, to test the basics before trying it on real English text. With a small enough file, you can manually check a generated tree or encoding for correctness. And once your encoding and decoding are working, you should be able to encode a file and then decode it using the same tree, and confirm that the original text comes back out.

When your code is working (or perhaps in the process of testing), activate the standard tree and decode the mystery passage. If the result is not intelligible, your decode functions are probably not correct!

Answer questions 1 and 2 above in a multi-line comment at the end of your source code. You may run your program multiple times for this, changing which files are encoded/decoded as necessary. You may also switch between using the provided standard tree or your own built tree. Feel free to add any extra output statements to give you additional information as your code runs. There is no specific console output required as long as it's clear what your code is doing, the three primary operations work correctly, and you've answered the questions.

Submit the following via e-mail:
1. your HuffmanTree.java source code, with a comment at the bottom answering the two questions
2. your decoded.txt file showing the decoded version of mysterypassage.txt (you may rename the decoded file if you like)

Optionally, feel free to experiment and make additions to the code. My complete version of this project includes two extensions that are unfinished in your starter code:
1. the gapcheck function, which "fills in gaps" in a frequency map so the resulting tree is more flexible.
2. the buildbitrep function of the HuffmanParent class, which is part of a set of functions that can translate the tree itself to a bit String of 0s and 1s.
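
For the fixed-width comparison asked about in Required Question 1, the arithmetic can be sketched in a few lines of Java. This is only an illustration, not part of the starter code: the class and method names (FixedWidthBits, uniqueChars, bitsPerChar) are made up for this sketch, and it assumes the whole passage is already in a String.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of HuffmanTree.java.
class FixedWidthBits {
    // Count the distinct characters in a passage.
    static int uniqueChars(String text) {
        Set<Character> seen = new HashSet<>();
        for (char c : text.toCharArray()) {
            seen.add(c);
        }
        return seen.size();
    }

    // Smallest number of bits that gives each of n symbols its own code,
    // i.e. the ceiling of log2(n).
    static int bitsPerChar(int n) {
        if (n <= 1) {
            return 1; // assumption: a one-symbol alphabet still uses one bit
        }
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        String text = "a man, a plan, a canal, panama";
        int unique = uniqueChars(text);
        int bits = bitsPerChar(unique);
        System.out.println(unique + " unique characters, " + bits
                + " bits each, " + (bits * text.length()) + " bits total");
    }
}
```

Run on the class example, this reproduces the 3-bit figure mentioned in question 1(b). The same two calls can be pointed at the text of any passage once it has been read into a String.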

CODE OVERVIEW

As mentioned, there are a few distinct sections and processes going on in the code. This section describes the overall plan, but you should look at the code itself to get a sense of how it all fits together; it's thoroughly commented. By the way, this may be a good time to re-open that "outline" panel in Eclipse we've kept closed/minimized this whole time!

Terminology note: The term "prefix" is used in several places in this code, meaning "the first part of a String". In this program, prefixes are often added to one or two characters at a time, for instance while working your way recursively down a Huffman tree. Upon reaching a leaf, the "prefix" represents the path taken through the parent nodes to get there.

Tree node classes

These only appear at the end, but understanding them is key to working with the tree structure, so you may want to look at them first. The tree is represented by interconnected HuffmanNode objects. Since HuffmanNode is abstract, each node must be one of the two subclasses, HuffmanParent or HuffmanLeaf. HuffmanParents keep track of the structure of the tree, while HuffmanLeafs store the actual characters at the end of each series of branches. The two subclasses have appropriate (different) definitions for the same recursive methods to enable tree operations (see below). Because HuffmanNode declares these methods, polymorphism is used to call them recursively from parent nodes, regardless of whether their children are HuffmanLeafs or are themselves HuffmanParents of even more nodes. Thus, the recursive calls travel easily down the tree and into the leaves.

Constants

These static final variables at the top of the HuffmanTree class cannot be changed once the program has begun.
In an application produced for the general public, these variables would probably be replaced with configuration files, a user interface, or some other way to specify exactly what the program is supposed to be doing without having to change the source code. In our case, these variables serve our needs just fine. Take a look at what each is supposed to do and where each is used. One variable you might like to add to this section is a file name of your own so that you can easily test your trees on any piece of text you like. You can create a new file in Eclipse through the "new" options.

To answer the questions, you will definitely need to change which files are encoded and decoded by changing the file selections represented in these constants. One thing you might like to do while testing is set your program to decode the very same file that was just generated during encoding (ENCODE_OUT_F).

Main method

This long method is already finished, although there are a few lines where System.out.println function calls may be commented in or out, and you may add your own println statements as well. This may be useful while testing and answering questions. For the most part you shouldn't have to change existing code in this method; instead, focus on making the individual methods work correctly when called FROM the main method.

Tree building methods

You will need to complete the genfrequencymap and maptotree methods for a tree to be built correctly. The first creates a Map from Characters to their frequencies based on a char[], and the second uses that Map to build the actual Huffman tree. These are both used by the gentree method, which is already complete but may be a useful place to put some additional output statements.

Two (overloaded) genstdtree methods are provided and already completed, which generate a standard tree from the included bit String instead of building a brand new one. You don't need to change these, although it may be interesting to see how they work (see the last section of this document for how trees are encoded as bits).

Encoding methods

As we discussed, there are two main approaches to encoding a sequence of characters: mapping each possible character to its node within the tree ahead of time, and following the chain of parent nodes up to the top to determine the bit String each time a character is encoded, OR pre-generating a mapping directly from each possible character to its bit String. This project uses the latter approach.
A third approach, searching the entire tree from the top down every time a character needs encoding, would be extremely inefficient.

You'll have to finish the recursive setbitstrings methods of the Huffman node subclasses. A parent node is responsible for propagating the recursive call to its children so that they too can be added to the Map, providing those children with the correct bit Strings representing the branches taken down to that point. A leaf node is responsible for adding itself to the Map.

Once the Map from characters to their bit Strings is finished, encoding is quite straightforward, so the encode function has been completed for you. The hard part is recursively traversing the tree and getting each character into the map in the first place.

Decoding methods

You must complete the decode method of the Huffman node subclasses. Decoding works perfectly well without a Map, instead starting at the top of the tree and reading input bits one by one to decide which branch to take at each point. That's exactly what this method should do for the parent nodes, using the provided CharArrayIterator to advance through the different bit characters. Once a leaf is reached, no further bits are required; the character has been found.

Again, once the recursive tree methods are complete, the rest is very straightforward, so the non-recursive static decode method has been completed for you already.

A note about chars

If we were writing professional compression software, instead of converting each character to multiple '0' and '1' chars (which take up the same amount of memory as any other character, after all), we'd be converting them to "raw binary" 0s and 1s (which really do only take one bit to store) before writing them to a file, creating an actual reduction in file size. However, the purpose of this project is to demonstrate the compression techniques, so to make it as easy as possible to see the results of your encoding, we'll keep them as chars. Remember to treat them this way in the code!

OPTIONAL ADDITIONS

That concludes the sections of code you are required to complete. If you'd like some more challenges, here are two additional features you might try. They actually look more complicated than they are, they shouldn't take too much code, and they allow you to do some pretty interesting things!
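
Before turning to the extras, the required flow described above (frequency count, tree building, a character-to-bit-String map, then decoding by walking the tree) can be sketched as one small standalone program. This is a rough illustration under stated assumptions, not the starter code: the HuffmanSketch class, its Node type, and all method names are invented here, a java.util.PriorityQueue is assumed for tree building, decoding is written as a loop rather than the recursive methods the project asks for, and the input is assumed to contain at least two distinct characters.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical standalone sketch; the project's own classes are different.
class HuffmanSketch {
    static class Node {
        final char ch;          // meaningful only for leaves
        final int freq;
        final Node left, right; // both null for leaves
        final int order;        // creation order, used only to break ties

        Node(char ch, int freq, int order) { this(ch, freq, null, null, order); }
        Node(Node left, Node right, int order) {
            this('\0', left.freq + right.freq, left, right, order);
        }
        private Node(char ch, int freq, Node left, Node right, int order) {
            this.ch = ch; this.freq = freq;
            this.left = left; this.right = right; this.order = order;
        }
        boolean isLeaf() { return left == null; }
    }

    // Count characters, then repeatedly join the two least frequent nodes.
    static Node buildTree(String text) {
        Map<Character, Integer> freq = new TreeMap<>();
        for (char c : text.toCharArray()) {
            freq.merge(c, 1, Integer::sum);
        }
        PriorityQueue<Node> queue = new PriorityQueue<>(
                Comparator.comparingInt((Node n) -> n.freq)
                          .thenComparingInt(n -> n.order));
        int order = 0;
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue(), order++));
        }
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(a, b, order++));
        }
        return queue.poll();
    }

    // Walk down the tree, extending the prefix with "0" (left) or "1" (right);
    // each leaf records the prefix that led to it.
    static void fillCodes(Node node, String prefix, Map<Character, String> codes) {
        if (node.isLeaf()) {
            codes.put(node.ch, prefix);
        } else {
            fillCodes(node.left, prefix + "0", codes);
            fillCodes(node.right, prefix + "1", codes);
        }
    }

    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            bits.append(codes.get(c));
        }
        return bits.toString();
    }

    // Follow one branch per bit; on reaching a leaf, emit it and restart at the root.
    static String decode(String bits, Node root) {
        StringBuilder out = new StringBuilder();
        Node node = root;
        for (char bit : bits.toCharArray()) {
            node = (bit == '0') ? node.left : node.right;
            if (node.isLeaf()) {
                out.append(node.ch);
                node = root;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "a man, a plan, a canal, panama";
        Node root = buildTree(text);
        Map<Character, String> codes = new HashMap<>();
        fillCodes(root, "", codes);
        String bits = encode(text, codes);
        System.out.println(bits.length() + " bits; round trip ok: "
                + decode(bits, root).equals(text));
    }
}
```

On the class example this encoding takes 81 bits, compared with 90 for fixed 3-bit codes. The tie-breaking field is a choice made for this sketch so the tree is reproducible; a different tie-break gives different codes for some characters but the same total length.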

Gap check

Certain characters appear in some passages and not others. A properly generated Huffman tree can always be used to encode the passage it was generated from, but a tree can't be used to fully encode a passage for which it's missing characters (although it could just skip those characters). One way to handle this is to have your tree-generating function, after generating the "frequency count", add any missing characters to the queue/tree with a frequency of 0. This way, they'll be at the bottom of the tree (the way generation works, they should all end up in one big sub-tree "hanging off" the bottom), but they will still get encodings, even if they're really long ones.

This could conceivably allow two people to agree on a "standard passage" to always use for tree generation. They could then encode ANY messages they like using that tree and send them to each other in compressed form. Decoding the messages would use that same standard tree.

In our project, some letters may not be present in some passages. This is especially true of the less common capital letters. Additionally, the following non-alphabetic characters appear in some of the provided passages and not others: '!' '"' '(' ')' '/' ':' ';' '?' Angelou Y Y Y Y Tolkien Y Y Y Y Y Mystery Y Y Y Y

The standard tree was generated by looping through the following character groups and adding any characters from them to the tree-generating queue if necessary:
All capital letters
All lowercase letters
Characters with ASCII values 32 to 34: ' ' '!' and '"'
Characters with ASCII values 39 to 41: '\'' '(' and ')'
Characters with ASCII values 44 to 47: ',' '-' '.' and '/'
These characters with inconvenient ASCII values: '\n' ':' ';' '?'

The standard tree is therefore capable of encoding and decoding ANY of the three passages provided, although it might use more bits to do so than a tree built specifically for a given passage. By the way, we could have also included the digits and certain other punctuation chars in this group, but none of our sample passages include those characters, so we've chosen not to worry about them.
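
One possible shape for filling those gaps is sketched below. It is an assumption-laden illustration rather than the project's gapcheck method: the real method's signature is not shown in this handout, so GapCheckSketch, fillRange, and gapCheck are hypothetical names, and the frequency map is assumed to be a Map from Character to Integer. The character groups are the ones listed above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the starter code's gapcheck may look different.
class GapCheckSketch {
    // Give every character in [first, last] a count of 0 unless it already has one.
    static void fillRange(Map<Character, Integer> freq, char first, char last) {
        for (char c = first; c <= last; c++) {
            freq.putIfAbsent(c, 0);
        }
    }

    static void gapCheck(Map<Character, Integer> freq) {
        fillRange(freq, 'A', 'Z');
        fillRange(freq, 'a', 'z');
        fillRange(freq, (char) 32, (char) 34); // ' '  '!'  '"'
        fillRange(freq, (char) 39, (char) 41); // '\'' '('  ')'
        fillRange(freq, (char) 44, (char) 47); // ','  '-'  '.'  '/'
        for (char c : new char[] {'\n', ':', ';', '?'}) {
            freq.putIfAbsent(c, 0);
        }
    }

    public static void main(String[] args) {
        Map<Character, Integer> freq = new TreeMap<>();
        freq.put('e', 5); // pretend 'e' was counted five times
        gapCheck(freq);
        System.out.println(freq.size() + " characters in the map; 'e' -> " + freq.get('e'));
    }
}
```

Because a char is a number behind the scenes, each contiguous group needs only its first and last character. Using putIfAbsent keeps any count that was already recorded.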

Your mission, should you choose to accept it, is to complete the gapcheck method to fill in any missing characters in the frequency map so that the generated tree will include those characters too. The most obvious way to do this is to create a long array of all the characters you want to double check for, then loop through them all. However, there are cleverer ways to set up your loops to avoid having to list out every single character to check in your code. Remember, each character is represented by a number behind the scenes, so you can loop through them if you know what order they occur in.

Bit representation of the tree itself

When files are sent or stored in encoded form, they are very hard to decode unless the decoder knows exactly which tree was used in the first place. Some sort of standard passage could be used to always build the same tree (as described above), but this can be inconvenient to communicate and may result in inefficient compression, since the standard tree isn't custom built for each different encoded file.

One solution is to encode the tree itself in binary and include it at the beginning of the file in a standardized form. One straightforward standard is to start at the top and write a 0 for each parent node (which always has 2 children following it), and a 1 for each leaf, immediately followed by the 8-digit ASCII character code (in binary) for that leaf. In fact, that is precisely the standard used to encode the tree provided to you in this project, and the genstdtree methods decode any bit String using that standard (although it obviously reads the bit String from the constant variable rather than a file).

For example, a small tree with only three children could be encoded as follows (spaces added for clarity):

0 1 01100001 0 1 01100010 1 01100011

Reading left to right: the first 0 is the root; 1 01100001 is the left/"0" child of the root; the second 0 is the right/"1" child of the root (itself an entire sub-tree), that is, the parent node in the sub-tree; 1 01100010 is the left/"0" child in the sub-tree; and 1 01100011 is the right/"1" child in the sub-tree.

Note that following each 1 identifying a leaf node is the 8-bit ASCII code for 'a', 'b', or 'c'. So this represents a Huffman tree where 'a' is encoded as 0 (the only leaf on that side of the root), 'b' is encoded as 10, and 'c' as 11 (the two leaves on the other side of the root).
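
The standard just described (0 for a parent, 1 plus an 8-bit ASCII code for a leaf, written from the top down) can be sketched as a pair of small recursive functions. This is a minimal illustration with made-up names (TreeBitsSketch, toBits, fromBits) and its own tiny Node type; it is not the project's buildbitrep or genstdtree code, and it assumes every leaf character has an ASCII value below 256.

```java
// Hypothetical sketch of the tree-to-bits standard described above.
class TreeBitsSketch {
    static class Node {
        final char ch;          // meaningful only for leaves
        final Node left, right; // both null for leaves

        Node(char ch) { this.ch = ch; this.left = null; this.right = null; }
        Node(Node left, Node right) { this.ch = '\0'; this.left = left; this.right = right; }
        boolean isLeaf() { return left == null; }
    }

    // Parent: "0" followed by both children. Leaf: "1" followed by 8 bits of ASCII.
    static String toBits(Node node) {
        if (node.isLeaf()) {
            String ascii = Integer.toBinaryString(node.ch);
            return "1" + String.format("%8s", ascii).replace(' ', '0');
        }
        return "0" + toBits(node.left) + toBits(node.right);
    }

    // Rebuild a tree from the bit String; pos[0] tracks how far we have read.
    static Node fromBits(String bits, int[] pos) {
        if (bits.charAt(pos[0]++) == '1') {
            char ch = (char) Integer.parseInt(bits.substring(pos[0], pos[0] + 8), 2);
            pos[0] += 8;
            return new Node(ch);
        }
        Node left = fromBits(bits, pos);
        Node right = fromBits(bits, pos);
        return new Node(left, right);
    }

    public static void main(String[] args) {
        // 'a' on the left of the root; 'b' and 'c' in a sub-tree on the right.
        Node tree = new Node(new Node('a'), new Node(new Node('b'), new Node('c')));
        String bits = toBits(tree);
        System.out.println(bits);
        System.out.println(toBits(fromBits(bits, new int[] {0})).equals(bits));
    }
}
```

For the three-leaf tree in main, toBits produces the 29-character String 01011000010101100010101100011, which is the example above with the spaces removed. The one-element int array is simply a way to share a read position across recursive calls in this sketch.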

So, this technique allows ANY file using ANY tree to be encoded and communicated, as long as the receiving program is using the same standard to encode and decode its trees and files. Fortunately, this is much easier to standardize than having to settle on a single tree to use for every communication!

The project code for converting a tree to and from a bit String is actually nearly complete already. Obviously, the decoding part is already finished, or else the provided standard tree could not be built by the genstdtree method. And the translation of a character to its 8-bit ASCII code uses enough functions and operations you haven't used before that it has been provided for you in the HuffmanLeaf class. The only thing left to do is fill in the buildbitrep function in the HuffmanParent class, call bitrep() on the root of the tree from the main method, and print the result.

(Actually, you could go even further and write it to the beginning of each encoded file. But you'll have to do that part on your own, and also, mysterypassage.txt was not encoded this way, so make sure you don't try to decode it using that format!)

Display (already finished)

The recursive display function has already been completed in the node classes, but you may learn something from looking at the definition and working out how it operates!
More informationCS106B Handout 34 Autumn 2012 November 12 th, 2012 Data Compression and Huffman Encoding
CS6B Handout 34 Autumn 22 November 2 th, 22 Data Compression and Huffman Encoding Handout written by Julie Zelenski. In the early 98s, personal computers had hard disks that were no larger than MB; today,
More informationCSE 100 Advanced Data Structures
CSE 100 Advanced Data Structures Overview of course requirements Outline of CSE 100 topics Review of trees Helpful hints for team programming Information about computer accounts Page 1 of 25 CSE 100 web
More informationUNIT III BALANCED SEARCH TREES AND INDEXING
UNIT III BALANCED SEARCH TREES AND INDEXING OBJECTIVE The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions and finds in constant
More informationprintf( Please enter another number: ); scanf( %d, &num2);
CIT 593 Intro to Computer Systems Lecture #13 (11/1/12) Now that we've looked at how an assembly language program runs on a computer, we're ready to move up a level and start working with more powerful
More informationBinary Trees
Binary Trees 4-7-2005 Opening Discussion What did we talk about last class? Do you have any code to show? Do you have any questions about the assignment? What is a Tree? You are all familiar with what
More informationGreedy Algorithms CHAPTER 16
CHAPTER 16 Greedy Algorithms In dynamic programming, the optimal solution is described in a recursive manner, and then is computed ``bottom up''. Dynamic programming is a powerful technique, but it often
More information1 Getting used to Python
1 Getting used to Python We assume you know how to program in some language, but are new to Python. We'll use Java as an informal running comparative example. Here are what we think are the most important
More information6.001 Notes: Section 8.1
6.001 Notes: Section 8.1 Slide 8.1.1 In this lecture we are going to introduce a new data type, specifically to deal with symbols. This may sound a bit odd, but if you step back, you may realize that everything
More informationTourMaker Reference Manual. Intro
TourMaker Reference Manual Intro Getting Started Tutorial: Edit An Existing Tour Key Features & Tips Tutorial: Create A New Tour Posting A Tour Run Tours From Your Hard Drive Intro The World Wide Web is
More informationCSC148 Week 6. Larry Zhang
CSC148 Week 6 Larry Zhang 1 Announcements Test 1 coverage: trees (topic of today and Wednesday) are not covered Assignment 1 slides posted on the course website. 2 Data Structures 3 Data Structures A data
More informationGraduate-Credit Programming Project
Graduate-Credit Programming Project Due by 11:59 p.m. on December 14 Overview For this project, you will: develop the data structures associated with Huffman encoding use these data structures and the
More informationHuffman Coding. Version of October 13, Version of October 13, 2014 Huffman Coding 1 / 27
Huffman Coding Version of October 13, 2014 Version of October 13, 2014 Huffman Coding 1 / 27 Outline Outline Coding and Decoding The optimal source coding problem Huffman coding: A greedy algorithm Correctness
More informationRuby on Rails Welcome. Using the exercise files
Ruby on Rails Welcome Welcome to Ruby on Rails Essential Training. In this course, we're going to learn the popular open source web development framework. We will walk through each part of the framework,
More informationJava Programming Constructs Java Programming 2 Lesson 1
Java Programming Constructs Java Programming 2 Lesson 1 Course Objectives Welcome to OST's Java 2 course! In this course, you'll learn more in-depth concepts and syntax of the Java Programming language.
More informationB-Trees. Introduction. Definitions
1 of 10 B-Trees Introduction A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node may contain a large number of keys. The number of subtrees of each node,
More informationSlide 1 Side Effects Duration: 00:00:53 Advance mode: Auto
Side Effects The 5 numeric operators don't modify their operands Consider this example: int sum = num1 + num2; num1 and num2 are unchanged after this The variable sum is changed This change is called a
More information4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd
4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we
More informationAn Overview 1 / 10. CS106B Winter Handout #21 March 3, 2017 Huffman Encoding and Data Compression
CS106B Winter 2017 Handout #21 March 3, 2017 Huffman Encoding and Data Compression Handout by Julie Zelenski with minor edits by Keith Schwarz In the early 1980s, personal computers had hard disks that
More informationDesigning a Database -- Understanding Relational Design
Designing a Database -- Understanding Relational Design Contents Overview The Database Design Process Steps in Designing a Database Common Design Problems Determining the Purpose Determining the Tables
More informationNotice on Access to Advanced Lists...2 Database Overview...2 Example: Real-life concept of a database... 2
Table of Contents Notice on Access to Advanced Lists...2 Database Overview...2 Example: Real-life concept of a database... 2 Queries...2 Example: Real-life concept of a query... 2 Database terminology...3
More informationAnalysis of Algorithms
Algorithm An algorithm is a procedure or formula for solving a problem, based on conducting a sequence of specified actions. A computer program can be viewed as an elaborate algorithm. In mathematics and
More informationNaming Things in Adafruit IO
Naming Things in Adafruit IO Created by Adam Bachman Last updated on 2016-07-27 09:29:53 PM UTC Guide Contents Guide Contents Introduction The Two Feed Identifiers Name Key Aside: Naming things in MQTT
More informationstatic CS106L Spring 2009 Handout #21 May 12, 2009 Introduction
CS106L Spring 2009 Handout #21 May 12, 2009 static Introduction Most of the time, you'll design classes so that any two instances of that class are independent. That is, if you have two objects one and
More information15-122: Principles of Imperative Computation, Spring 2013
15-122 Homework 6 Page 1 of 13 15-122: Principles of Imperative Computation, Spring 2013 Homework 6 Programming: Huffmanlab Due: Thursday, April 4, 2013 by 23:59 For the programming portion of this week
More informationFormal Methods of Software Design, Eric Hehner, segment 24 page 1 out of 5
Formal Methods of Software Design, Eric Hehner, segment 24 page 1 out of 5 [talking head] This lecture we study theory design and implementation. Programmers have two roles to play here. In one role, they
More information[key, Left subtree, Right subtree]
Project: Binary Search Trees A binary search tree is a method to organize data, together with operations on these data (i.e., it is a data structure). In particular, the operation that this organization
More informationThere are many other applications like constructing the expression tree from the postorder expression. I leave you with an idea as how to do it.
Programming, Data Structures and Algorithms Prof. Hema Murthy Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture 49 Module 09 Other applications: expression tree
More informationMITOCW watch?v=flgjisf3l78
MITOCW watch?v=flgjisf3l78 The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To
More informationTree Structures. A hierarchical data structure whose point of entry is the root node
Binary Trees 1 Tree Structures A tree is A hierarchical data structure whose point of entry is the root node This structure can be partitioned into disjoint subsets These subsets are themselves trees and
More informationSQL - Tables. SQL - Create a SQL Table. SQL Create Table Query:
SQL - Tables Data is stored inside SQL tables which are contained within SQL databases. A single database can house hundreds of tables, each playing its own unique role in th+e database schema. While database
More informationCONTENTS: While loops Class (static) variables and constants Top Down Programming For loops Nested Loops
COMP-202 Unit 4: Programming with Iterations Doing the same thing again and again and again and again and again and again and again and again and again... CONTENTS: While loops Class (static) variables
More informationThe Stack, Free Store, and Global Namespace
Pointers This tutorial is my attempt at clarifying pointers for anyone still confused about them. Pointers are notoriously hard to grasp, so I thought I'd take a shot at explaining them. The more information
More informationCSE100. Advanced Data Structures. Lecture 13. (Based on Paul Kube course materials)
CSE100 Advanced Data Structures Lecture 13 (Based on Paul Kube course materials) CSE 100 Priority Queues in Huffman s algorithm Heaps and Priority Queues Time and space costs of coding with Huffman codes
More informationAndroid Programming Family Fun Day using AppInventor
Android Programming Family Fun Day using AppInventor Table of Contents A step-by-step guide to making a simple app...2 Getting your app running on the emulator...9 Getting your app onto your phone or tablet...10
More informationCS2112 Fall Assignment 4 Parsing and Fault Injection. Due: March 18, 2014 Overview draft due: March 14, 2014
CS2112 Fall 2014 Assignment 4 Parsing and Fault Injection Due: March 18, 2014 Overview draft due: March 14, 2014 Compilers and bug-finding systems operate on source code to produce compiled code and lists
More information2. INSTALLATION OF SUSE
2. INSTALLATION OF SUSE 2.1. PREINSTALLATION STEPS 2.1.1. Overview Installing any kind of operating system is a big move and can come as something of a shock to our PC. However, SUSE Linux makes this complicated
More informationLinked lists. Yet another Abstract Data Type Provides another method for providing space-efficient storage of data
Linked lists One of the classic "linear structures" What are linked lists? Yet another Abstract Data Type Provides another method for providing space-efficient storage of data What do they look like? Linked
More informationMITOCW watch?v=v3omvlzi0we
MITOCW watch?v=v3omvlzi0we The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To
More informationPost Experiment Interview Questions
Post Experiment Interview Questions Questions about the Maximum Problem 1. What is this problem statement asking? 2. What is meant by positive integers? 3. What does it mean by the user entering valid
More informationEECS 311: Data Structures and Data Management Program 1 Assigned: 10/21/10 Checkpoint: 11/2/10; Due: 11/9/10
EECS 311: Data Structures and Data Management Program 1 Assigned: 10/21/10 Checkpoint: 11/2/10; Due: 11/9/10 1 Project: Scheme Parser. In many respects, the ultimate program is an interpreter. Why? Because
More informationThe IBM I A Different Roadmap
The IBM I A Different Roadmap Not long ago I was reading an article about a session Steve Will gave on how to make the IBM i "sexy". Those who know me know that that would immediately start me thinking
More informationCS103 Handout 50 Fall 2018 November 30, 2018 Problem Set 9
CS103 Handout 50 Fall 2018 November 30, 2018 Problem Set 9 What problems are beyond our capacity to solve? Why are they so hard? And why is anything that we've discussed this quarter at all practically
More informationCIS 121 Data Structures and Algorithms with Java Spring 2018
CIS 121 Data Structures and Algorithms with Java Spring 2018 Homework 6 Compression Due: Monday, March 12, 11:59pm online 2 Required Problems (45 points), Qualitative Questions (10 points), and Style and
More informationAssignment #6: Markov-Chain Language Learner CSCI E-220 Artificial Intelligence Due: Thursday, October 27, 2011
1. General Idea Assignment #6: Markov-Chain Language Learner CSCI E-220 Artificial Intelligence Due: Thursday, October 27, 2011 You will write a program that determines (as best as possible) the language
More informationMITOCW watch?v=0jljzrnhwoi
MITOCW watch?v=0jljzrnhwoi The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To
More informationPhysical Level of Databases: B+-Trees
Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,
More information