CS 200 Algorithms and Data Structures, Fall 2012 Programming Assignment #4


Building a Word Histogram using a Hashtable

Due Nov. 14 noon

Objectives

In this assignment, you will implement classes for histogramming words using your own hashtable. You will write:

(1) An implementation of the Hashtable. Your software should be able to manage hash space collisions efficiently.
(2) A class to build a histogram based on the frequency of word occurrences in classic literature. This class should use the Hashtable built in (1).

Background: Hashtable

Hashtable is a class mapping keys to values. Any non-null object can be used as a key. To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.

An instance of Hashtable has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The hash table is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hashtable exceeds the product of the load factor and the current capacity, the capacity is increased by calling the rehash method.

Generally, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the time cost to look up an entry (which is reflected in most Hashtable operations, including get and put).

The initial capacity controls a tradeoff between wasted space and the need for rehash operations, which are time-consuming. No rehash operations will ever occur if the initial capacity is greater than the maximum number of entries the Hashtable will contain divided by its load factor. However, setting the initial capacity too high can waste space.

If many entries are to be made into a Hashtable, creating it with a sufficiently large capacity may allow the entries to be inserted more efficiently than letting it perform automatic rehashing as needed to grow the table.

(From the Java Hashtable documentation.)

Task Description

Part 1. Build FastHashtable

The first task is to build a Hashtable. A skeleton file, FastHashtable, is provided. This class uses Java generics. Please note that this class does not implement java.util.Map.
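
The capacity, load factor, and rehash behaviour described in the background can be sketched roughly as follows. This is only one possible design (separate chaining with singly linked buckets), not the provided FastHashtable skeleton: the class name ChainedTableSketch, the helper capacityFor, and the growth rule of doubling the capacity plus one are all assumptions made for illustration.

```java
import java.util.Objects;

/**
 * Illustrative separate-chaining table. Class and method names are
 * assumptions for this sketch, not the FastHashtable skeleton's API.
 */
class ChainedTableSketch<K, V> {
    /** One entry in a bucket's singly linked chain. */
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] buckets;
    private final double loadFactor;
    private int size;

    @SuppressWarnings("unchecked")
    public ChainedTableSketch(int initialCapacity, double loadFactor) {
        if (initialCapacity < 1 || !(loadFactor > 0)) {
            throw new IllegalArgumentException("capacity and load factor must be positive");
        }
        this.buckets = (Node<K, V>[]) new Node[initialCapacity];
        this.loadFactor = loadFactor;
    }

    /** Smallest capacity strictly greater than expectedEntries / loadFactor. */
    public static int capacityFor(int expectedEntries, double loadFactor) {
        return (int) Math.floor(expectedEntries / loadFactor) + 1;
    }

    private static int indexFor(Object key, int capacity) {
        // Mask the sign bit so the remainder is never negative.
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    public V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    /** Inserts or updates a mapping; returns the previous value or null. */
    public V put(K key, V value) {
        Objects.requireNonNull(key, "key");
        int i = indexFor(key, buckets.length);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Grow when the new entry count would exceed loadFactor * capacity.
        if (size + 1 > loadFactor * buckets.length) {
            rehash();
            i = indexFor(key, buckets.length);
        }
        buckets[i] = new Node<>(key, value, buckets[i]);
        size++;
        return null;
    }

    /** Moves every node into a larger bucket array (assumed rule: 2n + 1). */
    @SuppressWarnings("unchecked")
    private void rehash() {
        Node<K, V>[] old = buckets;
        buckets = (Node<K, V>[]) new Node[old.length * 2 + 1];
        for (Node<K, V> head : old) {
            for (Node<K, V> n = head; n != null; ) {
                Node<K, V> next = n.next;
                int i = indexFor(n.key, buckets.length);
                n.next = buckets[i];
                buckets[i] = n;
                n = next;
            }
        }
    }

    public int size() { return size; }

    public int capacity() { return buckets.length; }
}
```

With a load factor of 0.75, capacityFor(7723, 0.75) gives a capacity just above 7723 / 0.75, so inserting 7723 distinct keys into a table created with that capacity should never trigger rehash. Open addressing or a mixed scheme would be equally valid choices for the assignment.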

You can use the hashCode() implemented in each of the Java objects. For example, PA4 uses String objects as keys. Therefore, you can use the hashCode() method provided by java.lang.String.

Your FastHashtable should be able to manage collisions and rehash. You should choose one or more algorithms to manage collisions. You can use a single algorithm or mix multiple algorithms. Of course, you can also come up with your own algorithm.

Use of Java's Hashtable class and HashMap class is NOT ALLOWED for PA4. Please make sure that your software does not include java.util.Hashtable or java.util.HashMap in any case.

The skeleton files include some code for testing.

Part 2. Build a Histogram with your FastHashtable

Download the free e-books listed in the Appendix and build a histogram of each word that appears in the book. Each item (a combination of a word entity and its frequency) should be stored in your FastHashtable. Implement the skeleton file, Histogram.java. This file also includes some code for testing.

To extract and manipulate Strings from a novel:

1) Split words using the space character.
2) Use only the text part: it includes upper case letters, lower case letters, and decimal digits.
3) Ignore all other symbols such as mathematical notation, quotation marks, or underscores. For example, "I'm" should be interpreted as "im".
4) Consider all the words as lowercase. ("My" and "my" are considered the same word, "my".)

To satisfy the above requirements, use the provided method gettextonly(string input) of the Histogram class. Also, use the toLowerCase() method of the String class for requirement (4).

Part 3. Tune your hashtable

For this assignment, your software will compete with software from other teams. The five fastest submissions will get a bonus score. Besides selecting a collision algorithm, you can tune your hashtable by adjusting the initial capacity and the load factor. Your FastHashtable class should provide methods for setting them.

You can specify constant values based on the results of your experiments. You can also add software components to adjust the factors automatically (e.g., based on the input file size).

Part 4. Test your software

Test your software with the included testing program, PA4_Test.java. This test program provides 4 commands:

1) histogram
2) freq
3) mostfreq
4) wordcount

histogram builds a histogram from the input file and prints the histogram. freq finds the number of appearances of the specified word. mostfreq finds the word(s) that appeared most frequently in the input file. wordcount returns the number of unique words that appeared in the input file. Each of the sub-tests also measures the delay.

java PA4_Test histogram test.txt
===============================
0 funnylooking 1
1 more 1
2 is 1
3 are 1
4 tail 1
5 im 1
6 on 2
7 theres 1
8 my 3
9 stegosaurus 1
10 boney 1
11 back 1
12 many 1
13 for 1
14 dinosaur 1
15 a 1
16 plates 1
17 and 1
18 name 1
===============================
Turnaround time for building a hashtable: 2

The test.txt file is included in the file set posted. The output table contains three columns: index, word, and frequency.

java PA4_Test mostfreq test.txt
my
Most frequently appreared word(s) is(are) my
Turnaround time for finding the most frequently appeared word : 1

java PA4_Test freq test.txt my
my has appeared in test.txt 3 times.

java PA4_Test wordcount test.txt
There are 19 valid word(s)

java PA4_Test mostfreq OliverTwist.txt
Turnaround time for finding most frequently appeared word : 10

java PA4_Test freq OliverTwist.txt the
the has appeared in OliverTwist.txt 9771 times.

java PA4_Test mostfreq WarAndPeace.txt
Turnaround time for finding most frequently appeared word : 14

java PA4_Test freq WarAndPeace.txt the
the has appeared in WarAndPeace.txt times.

java PA4_Test mostfreq HuckFinn.txt
Most frequently appreared word(s) is(are) and
Turnaround time for finding most frequently appeared word : 6

java PA4_Test freq HuckFinn.txt and
and has appeared in HuckFinn.txt 6299 times.

java PA4_Test mostfreq Hamlet.txt
Turnaround time for finding most frequently appeared word : 4

java PA4_Test freq Hamlet.txt the
the has appeared in Hamlet.txt 1106 times.

java PA4_Test wordcount Hamlet.txt
There are 5328 valid word(s)

java PA4_Test wordcount HuckFinn.txt
There are 7723 valid word(s)

java PA4_Test wordcount WarAndPeace.txt
There are valid word(s)

java PA4_Test wordcount OliverTwist.txt
There are valid word(s)

Also, you can find a sample output of PA4_Test histogram Hamlet.txt in the skeleton fileset.
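
The word rules from Part 2 and the freq, mostfreq, and wordcount queries can be cross-checked with a small reference sketch like the one below. Everything named here is hypothetical: WordHistogramSketch, textOnly, and SAMPLE are not part of the skeleton, and java.util.TreeMap is used only as a stand-in for counting because the FastHashtable API is not shown in this handout. The SAMPLE sentence is an assumption chosen to be consistent with the sample histogram for test.txt; the actual contents of test.txt are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Reference sketch for the word rules; all names here are hypothetical. */
class WordHistogramSketch {
    /** Assumed input, consistent with the sample histogram (not the real test.txt). */
    static final String SAMPLE =
        "My name is Stegosaurus, I'm a funny-looking dinosaur. "
        + "For on my back are many boney plates, and on my tail there's more.";

    /** Keeps only ASCII letters and decimal digits, then lowercases the result. */
    static String textOnly(String token) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digit = c >= '0' && c <= '9';
            if (letter || digit) {
                sb.append(c);
            }
        }
        return sb.toString().toLowerCase(Locale.ROOT);
    }

    /** Splits on the space character and counts each normalized word. */
    static Map<String, Integer> build(String text) {
        // TreeMap is a stand-in; the assignment expects your own FastHashtable here.
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.split(" ")) {
            String word = textOnly(token);
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Number of appearances of a word (0 if it never appeared). */
    static int freq(Map<String, Integer> histogram, String word) {
        return histogram.getOrDefault(word.toLowerCase(Locale.ROOT), 0);
    }

    /** All words that share the highest count. */
    static List<String> mostFrequent(Map<String, Integer> histogram) {
        int best = 0;
        List<String> words = new ArrayList<>();
        for (Map.Entry<String, Integer> e : histogram.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                words.clear();
            }
            if (e.getValue() == best) {
                words.add(e.getKey());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        Map<String, Integer> h = build(SAMPLE);
        System.out.println("wordcount: " + h.size());
        System.out.println("freq(my): " + freq(h, "my"));
        System.out.println("mostfreq: " + mostFrequent(h));
    }
}
```

For the assumed sentence this yields 19 unique words with "my" appearing three times and "on" twice, which agrees with the sample histogram. Comparing a FastHashtable-based Histogram against a reference like this on small inputs is one way to catch normalization or collision-handling bugs before timing the large novels.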

Deliverables

Submit a tarball of your Java source code including:

FastHashtable.java
Histogram.java

Keep all of your source code in a single flat directory. The skeleton files are provided. Please do not modify PA4_Test.java.

Note: You are required to work as a team on this assignment. You and your teammate should submit only ONE copy of the assignment. Please write the implementers' name(s) at the top of each source file.

Grading

This assignment will account for 5% of your final grade. The grading itself will be done on a 50 point scale. After the test, the five teams who developed the fastest hashtables will get 10 points as a bonus score. We will test with 5 novels of different sizes (listed in the Appendix) and combine the delays for building the histogram and retrieving data.

Late Policy

Please check the late policy available from the course web page.

Appendix

Please download the plain-text ebooks listed below. This data is from Project Gutenberg. These ebooks are free in the United States because their copyright has expired.

[1] Oliver Twist by Charles Dickens (914 kB)
[2] War and Peace by Leo Tolstoy (3.1 MB)
[3] Adventures of Huckleberry Finn by Mark Twain (584 kB)
[4] Hamlet by William Shakespeare (180 kB)

CS 200 Algorithms and Data Structures, Fall 2012 Programming Assignment #3

CS 200 Algorithms and Data Structures, Fall 2012 Programming Assignment #3 Compressing Data using Huffman Coding Due Oct.24 noon Objectives In this assignment, you will implement classes for data compression. You will write: () An implementation of the Huffman Coding using a

More information

HASH TABLE BY AKARSH KUMAR

HASH TABLE BY AKARSH KUMAR HASH TABLE BY AKARSH KUMAR HASH TABLE PURPOSE The purpose of a HashTable is to serve as an ArrayList type object that has O(1) set, get, and lookup times. It also maps keys to values and includes put and

More information

Lecture 18. Collision Resolution

Lecture 18. Collision Resolution Lecture 18 Collision Resolution Introduction In this lesson we will discuss several collision resolution strategies. The key thing in hashing is to find an easy to compute hash function. However, collisions

More information

Announcements. Submit Prelim 2 conflicts by Thursday night A6 is due Nov 7 (tomorrow!)

Announcements. Submit Prelim 2 conflicts by Thursday night A6 is due Nov 7 (tomorrow!) HASHING CS2110 Announcements 2 Submit Prelim 2 conflicts by Thursday night A6 is due Nov 7 (tomorrow!) Ideal Data Structure 3 Data Structure add(val x) get(int i) contains(val x) ArrayList 2 1 3 0!(#)!(1)!(#)

More information

Introduction hashing: a technique used for storing and retrieving information as quickly as possible.

Introduction hashing: a technique used for storing and retrieving information as quickly as possible. Lecture IX: Hashing Introduction hashing: a technique used for storing and retrieving information as quickly as possible. used to perform optimal searches and is useful in implementing symbol tables. Why

More information

Review. CSE 143 Java. A Magical Strategy. Hash Function Example. Want to implement Sets of objects Want fast contains( ), add( )

Review. CSE 143 Java. A Magical Strategy. Hash Function Example. Want to implement Sets of objects Want fast contains( ), add( ) Review CSE 143 Java Hashing Want to implement Sets of objects Want fast contains( ), add( ) One strategy: a sorted list OK contains( ): use binary search Slow add( ): have to maintain list in sorted order

More information

Hash-Based Indexing 1

Hash-Based Indexing 1 Hash-Based Indexing 1 Tree Indexing Summary Static and dynamic data structures ISAM and B+ trees Speed up both range and equality searches B+ trees very widely used in practice ISAM trees can be useful

More information

Data Structures. COMS W1007 Introduction to Computer Science. Christopher Conway 1 July 2003

Data Structures. COMS W1007 Introduction to Computer Science. Christopher Conway 1 July 2003 Data Structures COMS W1007 Introduction to Computer Science Christopher Conway 1 July 2003 Linked Lists An array is a list of elements with a fixed size, accessed by index. A more flexible data structure

More information

Hash-Based Indexes. Chapter 11

Hash-Based Indexes. Chapter 11 Hash-Based Indexes Chapter 11 1 Introduction : Hash-based Indexes Best for equality selections. Cannot support range searches. Static and dynamic hashing techniques exist: Trade-offs similar to ISAM vs.

More information

Module 3: Hashing Lecture 9: Static and Dynamic Hashing. The Lecture Contains: Static hashing. Hashing. Dynamic hashing. Extendible hashing.

Module 3: Hashing Lecture 9: Static and Dynamic Hashing. The Lecture Contains: Static hashing. Hashing. Dynamic hashing. Extendible hashing. The Lecture Contains: Hashing Dynamic hashing Extendible hashing Insertion file:///c /Documents%20and%20Settings/iitkrana1/My%20Documents/Google%20Talk%20Received%20Files/ist_data/lecture9/9_1.htm[6/14/2012

More information

MIDTERM EXAM THURSDAY MARCH

MIDTERM EXAM THURSDAY MARCH Week 6 Assignments: Program 2: is being graded Program 3: available soon and due before 10pm on Thursday 3/14 Homework 5: available soon and due before 10pm on Monday 3/4 X-Team Exercise #2: due before

More information

Introduction to Hashing

Introduction to Hashing Lecture 11 Hashing Introduction to Hashing We have learned that the run-time of the most efficient search in a sorted list can be performed in order O(lg 2 n) and that the most efficient sort by key comparison

More information

CS 310 Advanced Data Structures and Algorithms

CS 310 Advanced Data Structures and Algorithms CS 310 Advanced Data Structures and Algorithms Hashing June 6, 2017 Tong Wang UMass Boston CS 310 June 6, 2017 1 / 28 Hashing Hashing is probably one of the greatest programming ideas ever. It solves one

More information

Hash Tables. Gunnar Gotshalks. Maps 1

Hash Tables. Gunnar Gotshalks. Maps 1 Hash Tables Maps 1 Definition A hash table has the following components» An array called a table of size N» A mathematical function called a hash function that maps keys to valid array indices hash_function:

More information

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40 Lecture 16 Hashing Hash table and hash function design Hash functions for integers and strings Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining Hash table

More information

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved

Introducing Hashing. Chapter 21. Copyright 2012 by Pearson Education, Inc. All rights reserved Introducing Hashing Chapter 21 Contents What Is Hashing? Hash Functions Computing Hash Codes Compressing a Hash Code into an Index for the Hash Table A demo of hashing (after) ARRAY insert hash index =

More information

CS ) PROGRAMMING ASSIGNMENT 11:00 PM 11:00 PM

CS ) PROGRAMMING ASSIGNMENT 11:00 PM 11:00 PM CS3114 (Fall 2017) PROGRAMMING ASSIGNMENT #4 Due Thursday, December 7 th @ 11:00 PM for 100 points Due Tuesday, December 5 th @ 11:00 PM for 10 point bonus Last updated: 11/13/2017 Assignment: Update:

More information

Lecture 16 More on Hashing Collision Resolution

Lecture 16 More on Hashing Collision Resolution Lecture 16 More on Hashing Collision Resolution Introduction In this lesson we will discuss several collision resolution strategies. The key thing in hashing is to find an easy to compute hash function.

More information

Hashing as a Dictionary Implementation

Hashing as a Dictionary Implementation Hashing as a Dictionary Implementation Chapter 22 Contents The Efficiency of Hashing The Load Factor The Cost of Open Addressing The Cost of Separate Chaining Rehashing Comparing Schemes for Collision

More information

Hash Table. Ric Glassey

Hash Table. Ric Glassey Hash Table Ric Glassey glassey@kth.se Overview Hash Table Aim: Describe the map abstract data type with efficient insertion, deletion and search operations Motivation: List data structures are divided

More information

Linked lists. Yet another Abstract Data Type Provides another method for providing space-efficient storage of data

Linked lists. Yet another Abstract Data Type Provides another method for providing space-efficient storage of data Linked lists One of the classic "linear structures" What are linked lists? Yet another Abstract Data Type Provides another method for providing space-efficient storage of data What do they look like? Linked

More information

Syllabus CS 301: Data Structures Spring 2015

Syllabus CS 301: Data Structures Spring 2015 Syllabus CS 301: Data Structures Spring 2015 Meeting Times Instructor Graders Text Lect: 12:00-12:50 M, Tu, Wed, HB 116 Labs: 12:00-12:50 Th, HB 203 Dr. Razvan Andonie, HB 219-B, Office hours Projects

More information

Final assignment: Hash map

Final assignment: Hash map Final assignment: Hash map 1 Introduction In this final assignment you will implement a hash map 1. A hash map is a data structure that associates a key with a value (a chunk of data). Most hash maps are

More information

Logistics. Homework 10 due tomorrow Review on Monday. Final on the following Friday at 3pm in CHEM 102. Come with questions

Logistics. Homework 10 due tomorrow Review on Monday. Final on the following Friday at 3pm in CHEM 102. Come with questions Hashing Logistics Homework 10 due tomorrow Review on Monday Come with questions Final on the following Friday at 3pm in CHEM 102 Preview A hash function is a function that: When applied to an Object, returns

More information

csci 210: Data Structures Maps and Hash Tables

csci 210: Data Structures Maps and Hash Tables csci 210: Data Structures Maps and Hash Tables Summary Topics the Map ADT Map vs Dictionary implementation of Map: hash tables READING: GT textbook chapter 9.1 and 9.2 Map ADT A Map is an abstract data

More information

Chapter 17. Disk Storage, Basic File Structures, and Hashing. Records. Blocking

Chapter 17. Disk Storage, Basic File Structures, and Hashing. Records. Blocking Chapter 17 Disk Storage, Basic File Structures, and Hashing Records Fixed and variable length records Records contain fields which have values of a particular type (e.g., amount, date, time, age) Fields

More information

COMP 250. Lecture 27. hashing. Nov. 10, 2017

COMP 250. Lecture 27. hashing. Nov. 10, 2017 COMP 250 Lecture 27 hashing Nov. 10, 2017 1 RECALL Map keys (type K) values (type V) Each (key, value) pairs is an entry. For each key, there is at most one value. 2 RECALL Special Case keys are unique

More information

Hashing file organization

Hashing file organization Hashing file organization These slides are a modified version of the slides of the book Database System Concepts (Chapter 12), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan. Original slides

More information

Outline. 1 Hashing. 2 Separate-Chaining Symbol Table 2 / 13

Outline. 1 Hashing. 2 Separate-Chaining Symbol Table 2 / 13 Hash Tables 1 / 13 Outline 1 Hashing 2 Separate-Chaining Symbol Table 2 / 13 The basic idea is to save items in a key-indexed array, where the index is a function of the key Hash function provides a method

More information

Hash Tables. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Data Dictionary Revisited

Hash Tables. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Data Dictionary Revisited Unit 9, Part 4 Hash Tables Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Data Dictionary Revisited We've considered several data structures that allow us to store and search for data

More information

20-EECE-4029 Operating Systems Spring, 2013 John Franco

20-EECE-4029 Operating Systems Spring, 2013 John Franco 20-EECE-4029 Operating Systems Spring, 2013 John Franco Second Exam name: Question 1: Translation Look-aside Buffer (a) Describe the TLB. Include its location, why it is located there, its contents, and

More information

Preview. A hash function is a function that:

Preview. A hash function is a function that: Hashing Preview A hash function is a function that: When applied to an Object, returns a number When applied to equal Objects, returns the same number for each When applied to unequal Objects, is very

More information

Standard ADTs. Lecture 19 CS2110 Summer 2009

Standard ADTs. Lecture 19 CS2110 Summer 2009 Standard ADTs Lecture 19 CS2110 Summer 2009 Past Java Collections Framework How to use a few interfaces and implementations of abstract data types: Collection List Set Iterator Comparable Comparator 2

More information

Com S 227 Assignment Submission HOWTO

Com S 227 Assignment Submission HOWTO Com S 227 Assignment Submission HOWTO This document provides detailed instructions on: 1. How to submit an assignment via Canvas and check it 3. How to examine the contents of a zip file 3. How to create

More information

Depth-wise Hashing with Deep Hashing Structures. A two dimensional representation of a Deep Table

Depth-wise Hashing with Deep Hashing Structures. A two dimensional representation of a Deep Table Proceedings of Student Research Day, CSIS, Pace University, May 9th, 2003 Depth-wise Hashing with Deep Hashing Structures Edward Capriolo Abstract The objective of this research is to implement a variation

More information

CS159 - Assignment 2b

CS159 - Assignment 2b CS159 - Assignment 2b Due: Tuesday, Sept. 23 at 2:45pm For the main part of this assignment we will be constructing a number of smoothed versions of a bigram language model and we will be evaluating its

More information

Counting Words Using Hashing

Counting Words Using Hashing Computer Science I Counting Words Using Hashing CSCI-603 Lecture 11/30/2015 1 Problem Suppose that we want to read a text file, count the number of times each word appears, and provide clients with quick

More information

MSWLogo and dynamic link libraries. Matjaž Zaveršnik, Vladimir Batagelj

MSWLogo and dynamic link libraries. Matjaž Zaveršnik, Vladimir Batagelj MSWLogo and dynamic link libraries Matjaž Zaveršnik, Vladimir Batagelj University of Ljubljana, FMF, Department of Mathematics Jadranska 19, 1000 Ljubljana, Slovenia matjaz.zaversnik@fmf.uni-lj.si, vladimir.batagelj@uni-lj.si

More information

CSC 231 DYNAMIC PROGRAMMING HOMEWORK Find the optimal order, and its optimal cost, for evaluating the products A 1 A 2 A 3 A 4

CSC 231 DYNAMIC PROGRAMMING HOMEWORK Find the optimal order, and its optimal cost, for evaluating the products A 1 A 2 A 3 A 4 CSC 231 DYNAMIC PROGRAMMING HOMEWORK 10-1 PROFESSOR GODFREY MUGANDA 1. Find the optimal order, and its optimal cost, for evaluating the products where A 1 A 2 A 3 A 4 A 1 is 10 4 A 2 is 4 5 A 3 is 5 20

More information

Chapter 10: MySQL & PHP. PHP and MySQL CIS 86 Mission College

Chapter 10: MySQL & PHP. PHP and MySQL CIS 86 Mission College Chapter 10: MySQL & PHP PHP and MySQL CIS 86 Mission College Tonight s agenda Drop the class? Login file Connecting to a MySQL database Object-oriented PHP Executing a query Fetching a result Fetching

More information

CSCI 4210 Operating Systems CSCI 6140 Computer Operating Systems Project 1 (document version 1.3) Process Simulation Framework

CSCI 4210 Operating Systems CSCI 6140 Computer Operating Systems Project 1 (document version 1.3) Process Simulation Framework CSCI 4210 Operating Systems CSCI 6140 Computer Operating Systems Project 1 (document version 1.3) Process Simulation Framework Overview This project is due by 11:59:59 PM on Thursday, October 20, 2016.

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

Assignment 4: Hashtables

Assignment 4: Hashtables Assignment 4: Hashtables In this assignment we'll be revisiting the rhyming dictionary from assignment 2. But this time we'll be loading it into a hashtable and using the hashtable ADT to implement a bad

More information

COMP 250. Lecture 26. maps. Nov. 8/9, 2017

COMP 250. Lecture 26. maps. Nov. 8/9, 2017 COMP 250 Lecture 26 maps Nov. 8/9, 2017 1 Map (Mathematics) domain codomain A map is a set of pairs { (x, f(x)) }. Each x in domain maps to exactly one f(x) in codomain, but it can happen that f(x1) =

More information

CSC 321: Data Structures. Fall 2016

CSC 321: Data Structures. Fall 2016 CSC : Data Structures Fall 6 Hash tables HashSet & HashMap hash table, hash function collisions Ø linear probing, lazy deletion, primary clustering Ø quadratic probing, rehashing Ø chaining HashSet & HashMap

More information

Pointer Accesses to Memory and Bitwise Manipulation

Pointer Accesses to Memory and Bitwise Manipulation C Programming Pointer Accesses to Memory and Bitwise Manipulation This assignment consists of two parts, the second extending the solution to the first. Q1 [80%] Accessing Data in Memory Here is a hexdump

More information

Microprocessors & Assembly Language Lab 1 (Introduction to 8086 Programming)

Microprocessors & Assembly Language Lab 1 (Introduction to 8086 Programming) Microprocessors & Assembly Language Lab 1 (Introduction to 8086 Programming) Learning any imperative programming language involves mastering a number of common concepts: Variables: declaration/definition

More information

CSSE 230 Fundamentals of Computing Final Exam

CSSE 230 Fundamentals of Computing Final Exam CSSE 230 Fundamentals of Computing Final Exam Winter term, 2011-2012 Your name: Instructions: This exam is open book and notes. In addition: All the work you turn in must be your own. You must not use

More information

Project 2 - Kernel Memory Allocation

Project 2 - Kernel Memory Allocation Project 2 - Kernel Memory Allocation EECS 343 - Fall 2014 Important Dates Out: Monday October 13, 2014 Due: Tuesday October 28, 2014 (11:59:59 PM CST) Project Overview The kernel generates and destroys

More information

CS 310: Hash Table Collision Resolution Strategies

CS 310: Hash Table Collision Resolution Strategies CS 310: Hash Table Collision Resolution Strategies Chris Kauffman Week 7-1 Logistics HW 2 Due Wednesday night Test Cases Final Questions for group discussion? Goals Today Midterm Exam Next Monday Review

More information

Java Programming Unit 3: Variables and Arithmetic Operations

Java Programming Unit 3: Variables and Arithmetic Operations Java Programming Unit 3: Variables and Arithmetic Operations Bensalem Township School District Standards Link: PA State Standards for Business Education: http://www.pdesas.org/standard/views#114,115,116,117

More information

CS 245 Midterm Exam Winter 2014

CS 245 Midterm Exam Winter 2014 CS 245 Midterm Exam Winter 2014 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have 70 minutes

More information

Problem Set 0. General Instructions

Problem Set 0. General Instructions CS246: Mining Massive Datasets Winter 2014 Problem Set 0 Due 9:30am January 14, 2014 General Instructions This homework is to be completed individually (no collaboration is allowed). Also, you are not

More information

Hands on Assignment 1

Hands on Assignment 1 Hands on Assignment 1 CSci 2021-10, Fall 2018. Released Sept 10, 2018. Due Sept 24, 2018 at 11:55 PM Introduction Your task for this assignment is to build a command-line spell-checking program. You may

More information

Hash Tables. Hashing Probing Separate Chaining Hash Function

Hash Tables. Hashing Probing Separate Chaining Hash Function Hash Tables Hashing Probing Separate Chaining Hash Function Introduction In Chapter 4 we saw: linear search O( n ) binary search O( log n ) Can we improve the search operation to achieve better than O(

More information

Java Collections Framework reloaded

Java Collections Framework reloaded Java Collections Framework reloaded October 1, 2004 Java Collections - 2004-10-01 p. 1/23 Outline Interfaces Implementations Ordering Java 1.5 Java Collections - 2004-10-01 p. 2/23 Components Interfaces:

More information

Lecture 8 Index (B+-Tree and Hash)

Lecture 8 Index (B+-Tree and Hash) CompSci 516 Data Intensive Computing Systems Lecture 8 Index (B+-Tree and Hash) Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 HW1 due tomorrow: Announcements Due on 09/21 (Thurs),

More information

Pointer Accesses to Memory and Bitwise Manipulation

Pointer Accesses to Memory and Bitwise Manipulation C Programming Pointer Accesses to Memory and Bitwise Manipulation This assignment consists of two parts, the second extending the solution to the first. Q1 [80%] Accessing Data in Memory Here is a hexdump

More information

CSC 321: Data Structures. Fall 2017

CSC 321: Data Structures. Fall 2017 CSC : Data Structures Fall 7 Hash tables HashSet & HashMap hash table, hash function collisions Ø linear probing, lazy deletion, clustering, rehashing Ø chaining Java hashcode method HW6: finite state

More information

CIT-590 Final Exam. Name: Penn Key (Not ID number): If you write a number above, you will lose 1 point

CIT-590 Final Exam. Name: Penn Key (Not ID number): If you write a number above, you will lose 1 point 1 CIT-590 Final Exam Name: Penn Key (Not ID number): If you write a number above, you will lose 1 point Instructions: You will have two hours to complete this exam. If you finish in the last 15 minutes,

More information

// class variable that gives the path where my text files are public static final String path = "C:\\java\\sampledir\\PS10"

// class variable that gives the path where my text files are public static final String path = C:\\java\\sampledir\\PS10 Problem Set 10 Due: 4:30PM, Friday May 10, 2002 Problem 1. Files and hashing, preliminary question (30%) This problem focuses on the use of the hashcode() method, and touches on the tostring() and equals()

More information

Part 1: Written Questions (60 marks):

Part 1: Written Questions (60 marks): COMP 352: Data Structure and Algorithms Fall 2016 Department of Computer Science and Software Engineering Concordia University Combined Assignment #3 and #4 Due date and time: Sunday November 27 th 11:59:59

More information

CMPSCI 250: Introduction to Computation. Lecture #1: Things, Sets and Strings David Mix Barrington 22 January 2014

CMPSCI 250: Introduction to Computation. Lecture #1: Things, Sets and Strings David Mix Barrington 22 January 2014 CMPSCI 250: Introduction to Computation Lecture #1: Things, Sets and Strings David Mix Barrington 22 January 2014 Things, Sets, and Strings The Mathematical Method Administrative Stuff The Objects of Mathematics

More information

1.00 Lecture 32. Hashing. Reading for next time: Big Java Motivation

1.00 Lecture 32. Hashing. Reading for next time: Big Java Motivation 1.00 Lecture 32 Hashing Reading for next time: Big Java 18.1-18.3 Motivation Can we search in better than O( lg n ) time, which is what a binary search tree provides? For example, the operation of a computer

More information

HASH TABLES. CSE 332 Data Abstractions: B Trees and Hash Tables Make a Complete Breakfast. An Ideal Hash Functions.

HASH TABLES. CSE 332 Data Abstractions: B Trees and Hash Tables Make a Complete Breakfast. An Ideal Hash Functions. -- CSE Data Abstractions: B Trees and Hash Tables Make a Complete Breakfast Kate Deibel Summer The national data structure of the Netherlands HASH TABLES July, CSE Data Abstractions, Summer July, CSE Data

More information

ECE 242 Data Structures and Algorithms. Hash Tables I. Lecture 24. Prof.

ECE 242 Data Structures and Algorithms.  Hash Tables I. Lecture 24. Prof. ECE 242 Data Structures and Algorithms http//www.ecs.umass.edu/~polizzi/teaching/ece242/ Hash Tables I Lecture 24 Prof. Eric Polizzi Motivations We have to store some records and perform the following

More information

Topic HashTable and Table ADT

Topic HashTable and Table ADT Topic HashTable and Table ADT Hashing, Hash Function & Hashtable Search, Insertion & Deletion of elements based on Keys So far, By comparing keys! Linear data structures Non-linear data structures Time

More information

Java 2 Programmer Exam Cram 2

Java 2 Programmer Exam Cram 2 Java 2 Programmer Exam Cram 2 Copyright 2003 by Que Publishing International Standard Book Number: 0789728613 Warning and Disclaimer Every effort has been made to make this book as complete and as accurate

More information

Habanero Extreme Scale Software Research Project

Habanero Extreme Scale Software Research Project Habanero Extreme Scale Software Research Project Comp215: Performance Zoran Budimlić (Rice University) To suffer the penalty of too much haste, which is too little speed. - Plato Never sacrifice correctness


Problem. Context. Hash table

Problem: In many problems, it is natural to use a hash table as the data structure. How can the hash table be efficiently accessed among multiple units of execution (UEs)? Context: Hash table is used when


COL106: Data Structures, I Semester Assignment 4 A small search engine

COL106: Data Structures, I Semester 2015-16. Assignment 4: A small search engine. September 26, 2015. In this assignment we will build the basic data structure underlying search engines: an inverted index.


CSE 100: HASHING, BOGGLE

CSE 100: HASHING, BOGGLE. Probability of Collisions: If you have a hash table with M slots and N keys to insert in it, then the probability of at least 1 collision is: The Birthday Collision Paradox


ENCE 3241 Data Lab. 60 points Due February 19, 2010, by 11:59 PM

ENCE 3241 Data Lab. 60 points. Due February 19, 2010, by 11:59 PM. 0 Introduction: The purpose of this assignment is for you to become more familiar with bit-level representations and manipulations. You'll


Lecture 16: HashTables 10:00 AM, Mar 2, 2018

CS18 Integrated Introduction to Computer Science, Fisler, Nelson. Lecture 16: HashTables, 10:00 AM, Mar 2, 2018. Contents: 1 Speeding up Lookup; 2 Hashtables; 2.1 Java HashMaps


VARIABLES. Aim Understanding how computer programs store values, and how they are accessed and used in computer programs.

Lesson 2: VARIABLES. Aim: Understanding how computer programs store values, and how they are accessed and used in computer programs. WHAT ARE VARIABLES? When you input data (i.e. information) into a computer


Data Structures - CSCI 102. CS102 Hash Tables. Prof. Tejada. Copyright Sheila Tejada

CS102 Hash Tables. Prof. Tejada. Vectors, Linked Lists, Stack, Queues, Deques: Can't provide fast insertion/removal and fast lookup at the same time. The Limitations of Data Structure. Binary Search Trees,


HASH TABLES

HASH TABLES. TABLE OF CONTENTS: 1. Introduction to Hashing 2. Java Implementation of Linear Probing 3. Maurer's Quadratic Probing 4. Double Hashing 5. Separate Chaining 6. Hash Functions 7. Alphanumeric


Due: Tuesday 29 November by 11:00pm Worth: 8%

CSC 180 H1F Project #3, Fall 2016. General Instructions. Due: Tuesday 29 November by 11:00pm. Worth: 8%. Submitting your assignment: You must hand in your work electronically, using the MarkUs system. Log in


CS2630: Computer Organization Project 2, part 1 Register file and ALU for MIPS processor

CS2630: Computer Organization. Project 2, part 1: Register file and ALU for MIPS processor. Goals for this assignment: Apply knowledge of combinational logic and sequential logic to build two major components


CSE Theory of Computing Fall 2017 Project 3: K-tape Turing Machine

CSE 30151 Theory of Computing, Fall 2017. Project 3: K-tape Turing Machine. Version 1: Oct. 23, 2017. 1 Overview: The goal of this project is to have each student understand at a deep level the functioning


DATA STRUCTURES AND ALGORITHMS

DATA STRUCTURES AND ALGORITHMS, LECTURE 11. Babeş-Bolyai University, Computer Science and Mathematics Faculty, 2017-2018. In Lecture 9-10...: Hash tables, ADT Stack, ADT Queue, ADT Deque, ADT Priority Queue, Hash tables. Today: Hash tables. 1 Hash


Compsci 201, Maps+Hashing

Compsci 201, Maps+Hashing. Owen Astrachan, Jeff Forbes. September 13, 2017. 9/15/17, Compsci 201, Fall 2017, Maps/Hashing. Plan for the Day: Extend Fast ClassScores solution to: Not just values in [0..100],


CS 310: Hash Table Collision Resolution

CS 310: Hash Table Collision Resolution. Chris Kauffman, Week 8-1. Logistics. Reading: Weiss Ch 20: Hash Table; Weiss Ch 6.7-8: Maps/Sets. Homework: HW 1 Due Saturday; Discuss HW 2 next week. Questions? Schedule


COMP 103 RECAP-TODAY. Hashing: collisions. Collisions: open hashing/buckets/chaining. Dealing with Collisions: Two approaches

COMP 103 2017-T1, Lecture 31. Hashing: collisions. Marcus Frean, Lindsay Groves, Peter Andreae and Thomas Kuehne, VUW. Lindsay Groves, School of Engineering and Computer Science, Victoria University of Wellington


CS 2604 Minor Project 1 DRAFT Fall 2000

CS 2604 Minor Project 1 DRAFT, Fall 2000. RPN Calculator: For this project, you will design and implement a simple integer calculator, which interprets reverse Polish notation (RPN) expressions. There is no graphical interface. Calculator input


CMSC 132: Object-Oriented Programming II. Hash Tables

CMSC 132: Object-Oriented Programming II. Hash Tables. CMSC 132 Summer 2017. Key Value Map: Red Black Tree: O(log n); BST: O(n); 2-3-4 Tree: O(log n). Can we do better? Hash Tables a


Lecture 5 Data Structures (DAT037) Ramona Enache (with slides from Nick Smallbone)

Lecture 5, Data Structures (DAT037). Ramona Enache (with slides from Nick Smallbone). Hash Tables: A hash table implements a set or map. The plan: - take an array of size k - define a hash function that maps


Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Field guide to Java collections. Mike Duigou (@mjduigou), Java Core Libraries. Required Reading: Should have used most at some point: List, Vector, ArrayList, LinkedList, Arrays.asList; Set, HashSet, TreeSet


27/04/2012. Objectives. Collection. Collections Framework. "Collection" Interface. Collection algorithm. Legacy collection

27/04/2012. By Võ Văn Hải, Faculty of Information Technologies, Summer 2012. Objectives: Collection, Collections Framework, Concrete collections, Collection algorithm, Legacy collection. Collections Framework: "Collection"


CS 3410 Ch 20 Hash Tables

CS 3410 Ch 20 Hash Tables. Sections 20.1-20.7, Pages 773-82. 20.1 Basic Ideas: 1. A hash table is a data structure that supports insert, remove, and find in constant time, but there is no order to the items stored.


INFSCI 1017 Implementation of Information Systems Spring 2017

INFSCI 1017 Implementation of Information Systems, Spring 2017. Time: Thursdays 6:00-8:30. Location: Information Science Building, Room 406. Instructor: Alexander Nolte. Office Hours: Monday, 1-2PM; Thursdays,


COMP251: Algorithms and Data Structures. Jérôme Waldispühl School of Computer Science McGill University

COMP251: Algorithms and Data Structures. Jérôme Waldispühl, School of Computer Science, McGill University. About Me: Jérôme Waldispühl, Associate Professor of Computer Science. I am conducting research in Bioinformatics


Linked lists (6.5, 16)

Linked lists (6.5, 16). Inserting and removing elements in the middle of a dynamic array takes O(n) time (though inserting at the end takes O(1) time) (and you can also delete from the middle


Module 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo

Module 5: Hashing. CS 240 - Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo. Winter 2010.


(f) Given what we know about linked lists and arrays, when would we choose to use one data structure over the other?

CSM B Hashing & Heaps, Spring 0, Week 0: March 0, 0. Motivation. (a) In the worst case, how long does it take to index into a linked list? Θ(N) (b) In the worst case, how long does it take to index into an


Introduction to Programming Style

Introduction to Programming Style. Thaddeus Aid, The IT Learning Programme, The University of Oxford, UK. 30 July, 2013. Abstract: Programming style is the part of the program that the human reads and the compiler


CITS2200 Data Structures and Algorithms. Topic 15. Hash Tables

CITS2200 Data Structures and Algorithms. Topic 15: Hash Tables. Introduction to hashing: basic ideas. Hash functions: properties, 2-universal functions, hashing non-integers. Collision resolution: bucketing and


Full file at

Java Programming: From Problem Analysis to Program Design, 3rd Edition. Chapter 2: Basic Elements of Java. At a Glance. Instructor's Manual Table of Contents: Overview, Objectives, Quick Quizzes, Class


Hashing. It's not just for breakfast anymore!

Hashing: It's not just for breakfast anymore! Hashing: the facts. Approach that involves both storing and searching for values (search/sort combination). Behavior is linear in the worst case, but


Dictionary Wars CSC 190. March 14. 1 Learning Objectives. 2 Swiper no swiping!... and pre-introduction

Dictionary Wars, CSC 190, March 14, 2013. Contents: 1 Learning Objectives; 2 Swiper no swiping!... and pre-introduction; 3 Introduction; 3.1 The Dictionaries; 4 Dictionary
