Homework 4: Hash Tables Due: 5:00 PM, Mar 9, 2018

CS18 Integrated Introduction to Computer Science
Fisler, Nelson

Contents
1 DIY Grep
2 Chaining Hash Tables
3 Hash Table Iterator

Objectives

By the end of this homework, you will know:
• how to iterate over the elements of a hash table

By the end of this homework, you will be able to:
• implement a hash table
• use hash tables to represent sets (as well as dictionaries)

How to Hand In

For this (and all) homework assignments, you should hand in answers for all the non-practice questions. For this homework specifically, this entails answering the Hash Tables, Hash Table Iterator, and DIY Grep questions.

In order to hand in your solutions to these problems, they must be stored in appropriately-named files with the appropriate package header in an appropriately-named directory. The source code files should comprise the hw04.src package, and your solution code files, the hw04.sol package.

Begin by copying the source code from the course directory to your own personal directory. That is, copy the following files from /course/cs0180/src/hw04/src/*.java to /course/cs0180/workspace/javaproject/src/hw04/src:
• IDictionary.java, containing public interface IDictionary<K,V>
• KeyNotFoundException.java, containing public class KeyNotFoundException
• KeyAlreadyExistsException.java, containing public class KeyAlreadyExistsException
• AbsHashTable.java, containing public abstract class AbsHashTable<K,V>
• IGrep.java, containing public interface IGrep

Do not alter these files!

After completing this assignment, the following solution files should be in your /course/cs0180/workspace/javaproject/sol/hw04/sol directory:
• DIY Grep
  - Grep.java, containing public class Grep, which implements the interface IGrep.
  - GrepTest.txt, which contains documentation of all of your tests for your Grep class.
• Hash Tables
  - Chaining.java, containing public class Chaining<K,V>, which extends AbsHashTable<K,V>.
  - HashTableTester.java, containing public class HashTableTester, which tests your hash table implementation.
• Hash Table Iterator
  - Iterator.tex, which contains your answers to the Hash Table Iterator question.

To hand in your files, navigate to the /course/cs0180/workspace/javaproject/ directory, and run the command cs018 handin hw04. This will automatically hand in all of the above files. Once you have handed in your homework, you should receive an email, more or less immediately, confirming that fact. If you don't receive this email, try handing in again, or ask the TAs what went wrong.

Java's Built-in Hash Tables

In this homework, you will be completing an implementation of a chaining hash table. For the problem that does not involve implementing hash tables, [1] you can (and should) make use of one of Java's built-in hash tables, specifically, HashMap or HashSet. These classes differ in that the former uses a hash table to represent a dictionary, [2] while the latter uses a hash table to represent a set. Observe that a set is a special case of a dictionary; the keys are the elements of the set, and there are no values. [3] Consequently, it is straightforward to use a hash table to represent a set.

You can use Java's HashMap class by importing java.util.HashMap. Likewise, you can use Java's HashSet class by importing java.util.HashSet. Documentation on how to use both classes can be found in the Java API documentation.

[1] DIY Grep
[2] Also called a map; however, we refrain from using this term in CS 18, since it means something entirely different in CS 17/CS 19.
[3] Recall the invariant that dictionaries do not allow duplicate keys. That is why it makes sense to view sets as dictionaries.
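As a rough illustration of the difference between the two built-in classes, the sketch below uses a HashMap as a dictionary (word to count) and a HashSet as a set (distinct words). The class name BuiltInHashTables, its two helper methods, and the sample words are made up for this example and are not part of the assignment's source files.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of the hw04 packages.
class BuiltInHashTables {
    // Dictionary use: each key (a word) is associated with a value (its count).
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.put(w, counts.getOrDefault(w, 0) + 1);
        }
        return counts;
    }

    // Set use: only the keys matter; there are no values.
    static Set<String> distinctWords(String[] words) {
        Set<String> seen = new HashSet<>();
        for (String w : words) {
            seen.add(w); // adding a duplicate leaves the set unchanged
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] words = {"sing", "goddess", "sing"};
        System.out.println(countWords(words).get("sing"));          // 2
        System.out.println(distinctWords(words).size());            // 2
        System.out.println(distinctWords(words).contains("rage"));  // false
    }
}
```

Both classes offer expected constant-time insertion and membership checks, which is why either can back a fast lookup structure.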

Problems

1 DIY Grep

Suppose you want to search a file for a word and retrieve all the line numbers on which that word appears. If you know that you'll have to support lots of queries, you'll probably want to preprocess your file to make it easier to look words up.

Task: Explain how you can preprocess a file to make it easy to look up the line numbers associated with each word in the file.

Hint: Use a dictionary.

Note: Write your answer to this question at the top of the Grep class, which the next task asks you to write.

Task: Write a class Grep with a constructor and a single method, lookup. Your constructor should take as input a filename and perform any necessary preprocessing. The lookup method should take as input a word and return the line numbers on which that word appears in the file. It should operate in expected constant time. As noted above, you can (and should) make use of Java's built-in hash table data structure, which is called HashMap, to solve this problem.

Notes:
• If the same word appears more than once on the same line, you should include the line number only once in your output.
• You should treat words as sequences of characters separated by whitespace; so "flow" and "flow!" are distinct words. Also, you can assume words are case sensitive; so "flower" and "Flower" are distinct words.

Hints:
• As part of the preprocessing step, you may want to use the split method in the String class, which splits up a string into pieces each time it encounters a specific character, and stores those pieces in an array of strings.
• You might find the LineNumberReader class, which extends BufferedReader, useful. It has a method getLineNumber that gets the current line number.
• The Java syntax to declare a HashMap that maps, for example, from a String to a Set of Integers is new HashMap<String, Set<Integer>>(). Likewise, the syntax to declare a HashSet of Integers is new HashSet<Integer>().

Task: Write a main method for the Grep class. The String[] args should correspond to a file name, and then at least one other word to look for, in that order. For example, running

    java hw04.sol.Grep /course/cs018/src/hw04/iliad tree water cats

should print something like:

    tree found on lines: 1353
    water found on lines: 686 7260 9731 15877 17749 20584
    cats is not found

Non-DIY Grep

What you've just written is very similar to one of the functions of the UNIX command grep, an extremely powerful and useful tool. You can use grep to search a file for a given pattern, and report where that pattern appears in the file, as follows:

    grep -n <pattern> <file>

The -n is an optional parameter that tells grep that we want it to report line numbers. It prints out each matching line of text next to its line number. This is but one of many, many grep features, which are fully documented on its man page (which you can access, if you want to learn more, by typing man grep into a terminal).

Here's an example of how you'd use grep with this line number option:

    grep -n polyp myfile

This command would print out something like:

    1: Lyn likes the word polyp.
    4: polyp
    9: Lyn needs to stop using the word polyp all the time.

Task: Test your Grep class. Document all of your tests in a file called GrepTest.txt. This file should have what you input to your Grep class, what the output was, and a short explanation (a sentence or less) of what you were trying to test with that input.

Note: As usual, you should make sure your testing is exhaustive. Test edge cases, and test unexpected program inputs.

You cannot easily test your implementation of grep using the Tester library, as you have become accustomed to doing in CS 18. But you can use grep to test your program, simply by comparing the output of your program with the output of grep.

Hint: We've included several test files (thankfully, none of which is the file shown above) in /course/cs018/src/poems/. But do not be afraid to create your own text files for testing, especially to try to catch edge cases! If you do this, make sure to hand in these files with the rest of your code.

2 Chaining Hash Tables

In this problem, you will implement a hash table using chaining. Recall that the internal data structure of a hash table implemented using chaining is an array of lists. Although you have built your very own implementation of mutable linked lists by now, we recommend (insist, actually) that you use Java's built-in mutable lists, so that you get some practice using Java's mutable list interface. Java's LinkedList class lives in the java.util library.
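To make the "array of lists" picture a little more concrete, here is a minimal sketch of two ingredients: a few of the mutating operations that java.util.LinkedList provides, and one common way to turn a key into a slot index. The class BucketSketch and the helper slotFor are hypothetical, and reducing hashCode modulo the table size is an assumption for illustration; this handout does not say how AbsHashTable maps keys to slots.

```java
import java.util.LinkedList;

// Hypothetical demo class; not part of the hw04 packages.
class BucketSketch {
    // Assumed scheme: reduce the key's hash code to a slot in [0, size).
    // Math.floorMod avoids negative indices when hashCode() is negative.
    static int slotFor(Object key, int size) {
        return Math.floorMod(key.hashCode(), size);
    }

    public static void main(String[] args) {
        // A single bucket, using LinkedList's mutable list interface.
        LinkedList<String> bucket = new LinkedList<>();
        bucket.add("apple");        // appends to the end
        bucket.addFirst("pear");    // prepends to the front
        bucket.remove("apple");     // removes the first matching element
        System.out.println(bucket); // [pear]

        // Which slot would the key "a" land in, in a table of size 10?
        System.out.println(slotFor("a", 10)); // 7, since "a".hashCode() is 97
    }
}
```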

In Java, you cannot create an array of a generic type. In order to circumvent this limitation, you should do the following to create your hash table:

    this.data = (LinkedList<KVPair<K,V>>[]) new LinkedList[size];

This situation is one of a few exceptions (equals is another) when casting is acceptable; in general, however, it's still discouraged.

This line of code will generate a warning that there is an unchecked cast. To get rid of the warning, above any methods that cast like this, you should write:

    @SuppressWarnings("unchecked")

Typically, warnings such as these indicate a problem with your code, so you should not suppress them; you should pay attention to them! In this specific instance, however, we justify its use as we are trying to get around a Java limitation.

Task: Extend AbsHashTable<K,V> with a class Chaining<K,V>. The Chaining constructor should take as input a size variable, which specifies the size of the table.

Hint: The key method to write in Chaining is findKVPair. Write this method first, and use it as a helper when you write insert and delete.

Task: Test your Chaining class thoroughly. Since your class is generic, be sure to use multiple types in your testing. Additionally, since we are working with a mutable data structure, be sure to have setup methods. Put your tests in the HashTableTester.java file.

3 Hash Table Iterator

Note: For this problem, you need not submit any Java code. A high-level description of each algorithm is enough.

Whenever you implement a class, you should always override equals, toString, and hashCode. Whenever you implement a collection, in addition, you should override iterator. We did this successfully for linked lists, and in this problem, you will think about how to do this for hash tables. (You may assume that the hash table does not change while you are iterating; no items are inserted or deleted.)
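As a reminder of what overriding equals, toString, and hashCode looks like in practice, here is a small sketch on a made-up class. Point and its fields are hypothetical and unrelated to the assignment's classes; the cast inside equals is the kind of acceptable cast mentioned earlier.

```java
import java.util.Objects;

// Hypothetical example class; not part of the hw04 packages.
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Point)) {
            return false;
        }
        Point that = (Point) other; // safe: checked by instanceof above
        return this.x == that.x && this.y == that.y;
    }

    @Override
    public int hashCode() {
        // Equal points must produce equal hash codes.
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "Point(" + x + ", " + y + ")";
    }
}
```

Keeping equals and hashCode consistent matters here in particular: a hash table locates a key's bucket with hashCode and then identifies the key within that bucket with equals.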
Note: While not part of the assignment, take a moment to think about how you would use an iterator to write equals and toString for hash tables.

First, let's think about how an iterator for a chaining hash table might work. As a first attempt, you could try iterating over all the slots in the hash table: if a slot is empty, you can skip right over it; but if it is not empty, you would then iterate over the bucket stored at that slot.

Task: The iterator we just described would examine all n slots in the chaining hash table, even though there might not be data stored at all, or even most, of them. Explain how to implement a more efficient chaining iterator that only examines slots that store data. Your iterator should not affect the run time of the hash table's basic operations; that is, the run times of lookup, insert, update, and delete should not change.
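For reference, the naive first attempt described above (not the more efficient design the task asks for) might be sketched roughly as follows. To stay self-contained, this sketch iterates over buckets of plain strings rather than key-value pairs; the class name NaiveSlotIterator and its fields are hypothetical, and empty slots are assumed to be either null or an empty list.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.NoSuchElementException;

// Hypothetical sketch of the naive iterator: it visits every slot in order.
class NaiveSlotIterator implements Iterator<String> {
    private final LinkedList<String>[] slots;
    private int slot = 0;                   // index of the next slot to examine
    private Iterator<String> bucket = null; // iterator over the current bucket

    NaiveSlotIterator(LinkedList<String>[] slots) {
        this.slots = slots;
    }

    @Override
    public boolean hasNext() {
        // Skip over empty slots and exhausted buckets.
        while (bucket == null || !bucket.hasNext()) {
            if (slot >= slots.length) {
                return false; // every slot has been examined
            }
            LinkedList<String> next = slots[slot++];
            bucket = (next == null) ? null : next.iterator();
        }
        return true;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return bucket.next();
    }

    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        LinkedList<String>[] slots = (LinkedList<String>[]) new LinkedList[5];
        slots[1] = new LinkedList<>();
        slots[1].add("tree");
        slots[4] = new LinkedList<>();
        slots[4].add("water");

        Iterator<String> it = new NaiveSlotIterator(slots);
        while (it.hasNext()) {
            System.out.println(it.next()); // visits slot 1, then slot 4
        }
    }
}
```

Note that hasNext still steps through slots 0, 2, and 3 even though they hold no data, which is exactly the cost the task asks you to avoid.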

Hint: Consider augmenting the key-value pairs stored in your dictionary with additional fields.

Task: Discuss the trade-offs between the naive iterator we proposed, and your iterator design.

Please let us know if you find any mistakes, inconsistencies, or confusing language in this or any other CS18 document by filling out the anonymous feedback form: http://cs.brown.edu/courses/cs018/feedback.