Topic 10: The Java Collections Framework (and Iterators)
A set of interfaces and classes to help manage collections of data.
Why study the Collections Framework?
  - very useful in many different kinds of applications
  - good examples of general-purpose OOP
What is an Iterator?
  - an object that helps you loop through the contents of a data structure
  - very useful in combination with the Collections Framework

Goals For This Topic
  - know how to use the Collections interfaces & implementations
  - be able to make intelligent choices of interface & implementation for a particular problem
Required reading: Java tutorial "trail" (link on web site)
Other resources:
  - summary sheet
  - Java API documentation

What is a Collection?
Just what it sounds like! You've already used:
  - arrays
  - linked lists
  - ArrayLists (part of the Collections Framework)
A very common problem in programming: you need to store, modify, and examine a collection of things.
Two basic kinds of collections:
  - sets (no duplicates; any ordering comes from the element values, not from the program)
  - lists (ordered sequences; the program determines positions)
A related concept: maps (dictionaries, lookup tables)

Real-Life Examples
From my first real programming job, writing compilers:
  1. the collection of methods called by a given method
  2. the statements in a method
  3. symbol table: all the local variables in a method, with look-up by name to find type, scope, etc.

Interfaces
(indentation shows sub-interfaces)
    Collection<E>
        Set<E>
            SortedSet<E>
        List<E>
    Map<K,V>
        SortedMap<K,V>
  - All these are interfaces in java.util
  - Each has at least one implementation in java.util
  - Users can write more implementations

Java Collection Framework
Interfaces, implementations of the interfaces, and algorithms to help you use collections in your programs.
When you need a collection of things in a program:
  - select an interface that provides the functionality you need
  - select an implementation that provides good performance in your situation
  - use additional methods from the framework for sorting, searching, etc.

Possibilities For Extensions
If you need something a bit different you can:
  - create your own implementation of an interface
  - create a sub-interface, then implement it
Using a familiar interface => your code may be more useful to others.

The Collection Interface
  - a very general interface: operations common to all collections, details not specified
  - recall: sets and lists are collections, maps are not
Operations include:
  - add element
  - remove element
  - ask if object is in collection
  - "bulk operations"

The Set Interface
  - a sub-interface of Collection:
        public interface Set<E> extends Collection<E> {...
  - no new methods, just an addition to the specifications: sets may not contain duplicate elements
  - duplicates are defined by the equals method, so if a.equals(b) is true:
        Set<String> myset =...;          // empty set
        boolean flag = myset.add(a);     // returns true
        flag = myset.add(b);             // no effect, returns false
Moral: remember to define an equals method in your own classes.
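
The duplicate rule above can be tried out with a small sketch. The class name SetDuplicateDemo and the sample strings are made up for illustration; the only API calls used are Set.add and Set.size.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not one of the course files.
class SetDuplicateDemo {
    // Adds two equal (but distinct) String objects to an empty set and
    // reports: result of first add, result of second add, size == 1.
    static boolean[] addTwice() {
        String a = "paul";
        String b = new String("paul");   // different object, but a.equals(b) is true
        Set<String> myset = new HashSet<String>();
        boolean first = myset.add(a);    // set changed
        boolean second = myset.add(b);   // equal element already present, set unchanged
        return new boolean[] { first, second, myset.size() == 1 };
    }

    public static void main(String[] args) {
        boolean[] r = addTwice();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true false true
    }
}
```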

The HashSet Class
  - implements Set, no new methods
  - uses the "hashing" technique (CISC 235)
  - advantage: very efficient; add, remove, and search take constant time on average
  - price: uses more memory than some other implementations
  - order is arbitrary for iterators and printing
Constructors:
  - HashSet(): creates an empty HashSet
  - HashSet(Collection<? extends E> c): creates a HashSet containing all the elements of c

HashSet Example (1)
    Set<String> cisc121 = new HashSet<String>();
    cisc121.add("john");
    cisc121.add("george");
    cisc121.add("paul");
    cisc121.add("ringo");
    System.out.println("students in cisc 121: " + cisc121);
output:
    students in cisc 121: [paul, john, george, ringo]
Note: elements are printed in arbitrary order.
    Set<String> cisc124 = new HashSet<String>();
    cisc124.add("peter");
    cisc124.add("paul");
    cisc124.add("mary");
    System.out.println("students in cisc 124: " + cisc124);
output:
    students in cisc 124: [paul, mary, peter]

HashSet Example (2)
    // look for a student
    if (cisc121.contains("mary"))
        System.out.println("Mary is taking cisc 121");
    else
        System.out.println("Mary is not taking cisc 121");
output:
    Mary is not taking cisc 121

    // Ringo drops out
    cisc121.remove("ringo");
    System.out.println("new cisc 121: " + cisc121);
output:
    new cisc 121: [paul, john, george]

HashSet Example (3)
    // make a combined student list (union)
    Set<String> combined = new HashSet<String>(cisc121);
    combined.addAll(cisc124);
    System.out.println("combined: " + combined);
output:
    combined: [paul, mary, peter, john, george]

HashSet Example (4)
    // all students must attend tutorial unless they
    // have an A average
    Set<String> topofclass = new HashSet<String>();
    topofclass.add("peter");
    topofclass.add("paul");
    // create copy of combined set
    Set<String> tutorial = new HashSet<String>(combined);
    // remove A students from set (set difference)
    tutorial.removeAll(topofclass);
    System.out.println("tutorial: " + tutorial);
output:
    tutorial: [mary, john, george]

HashSet Example (5)
    // find out who is taking both classes (intersection)
    Set<String> studentsinboth = new HashSet<String>(cisc121);
    studentsinboth.retainAll(cisc124);
    System.out.println("students taking both classes: " + studentsinboth);
output:
    students taking both classes: [paul]

HashSet Example (6)
To write a loop that does something to each set element, you need an Iterator (explained shortly).

Dangers of Hashing
  - HashSet is a very good, efficient set implementation.
  - To use it intelligently under all circumstances you must understand how a hash table works (CISC 235).
  - Very safe to use for sets of String, Integer, and other API classes.
  - If you create a HashSet of objects of a class you've created, you may have some surprises with:
      - duplicate element prevention
      - searching for a set element
  - If time permits: more explanation later.
  - If not: stick to API classes or use a different set implementation.
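
A minimal sketch of the kind of surprise meant here. The classes PlainPoint and GoodPoint are hypothetical, invented for this illustration. It relies on a documented fact about the API: hash-based collections locate elements with both hashCode and equals, and a class that overrides neither inherits identity-based versions from Object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class, not one of the course files.
class HashingSurprise {
    // Does NOT override equals/hashCode: two points with the same
    // coordinates are still treated as different elements.
    static class PlainPoint {
        final int x, y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Overrides equals and hashCode consistently.
    static class GoodPoint {
        final int x, y;
        GoodPoint(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof GoodPoint)) return false;
            GoodPoint p = (GoodPoint) o;
            return x == p.x && y == p.y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    static int plainSize() {
        Set<PlainPoint> s = new HashSet<PlainPoint>();
        s.add(new PlainPoint(1, 2));
        s.add(new PlainPoint(1, 2));   // "duplicate" is accepted
        return s.size();
    }

    static boolean plainContains() {
        Set<PlainPoint> s = new HashSet<PlainPoint>();
        s.add(new PlainPoint(1, 2));
        return s.contains(new PlainPoint(1, 2));   // search fails
    }

    static int goodSize() {
        Set<GoodPoint> s = new HashSet<GoodPoint>();
        s.add(new GoodPoint(1, 2));
        s.add(new GoodPoint(1, 2));    // rejected as a duplicate
        return s.size();
    }

    static boolean goodContains() {
        Set<GoodPoint> s = new HashSet<GoodPoint>();
        s.add(new GoodPoint(1, 2));
        return s.contains(new GoodPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(plainSize() + " " + plainContains()); // 2 false
        System.out.println(goodSize() + " " + goodContains());   // 1 true
    }
}
```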

Two Set Implementations
(indentation shows sub-interfaces and implementing classes)
    Collection<E>
        Set<E>
            HashSet<E>   (class)
            SortedSet<E>
                TreeSet<E>   (class)

Sorted Sets
A sorted set keeps its elements in order.
Advantages of TreeSet:
  - you get elements in order for printing and iterators
  - some extra methods: min & max elements, range subsets
  - less memory than a HashSet
Efficiency comparison:
                     HashSet     TreeSet
    insert           O(1)        O(log n)
    delete           O(1)        O(log n)
    lookup           O(1)        O(log n)
    print/iterate    O(n) *      O(n)
    * Time to iterate through a HashSet is proportional to the size of the table in memory
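
A short sketch of the extra SortedSet methods mentioned above (first, last, subSet, headSet). The class name SortedSetDemo and the sample names are invented for illustration.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class, not one of the course files.
class SortedSetDemo {
    static SortedSet<String> students() {
        SortedSet<String> s = new TreeSet<String>();
        s.add("paul");
        s.add("john");
        s.add("ringo");
        s.add("george");
        return s;
    }

    public static void main(String[] args) {
        SortedSet<String> s = students();
        System.out.println(s);                  // [george, john, paul, ringo]
        System.out.println(s.first());          // george (minimum)
        System.out.println(s.last());           // ringo  (maximum)
        // range subset: elements >= "h" and < "q"
        System.out.println(s.subSet("h", "q")); // [john, paul]
        // elements strictly less than "john"
        System.out.println(s.headSet("john"));  // [george]
    }
}
```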

Ordering
What order does a sorted set use?
  - Normally, the element type must implement Comparable.
    demo: TreeSetExample...
  - Using the element type's compareTo method is called using the "natural ordering".
  - Alternative: pass a Comparator to the constructor.
    back to example...
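
A sketch contrasting the natural ordering with a Comparator passed to the TreeSet constructor. The ByLength comparator and the class name are assumptions made up for this example. The comparator breaks ties alphabetically because a TreeSet treats two elements that compare as 0 as duplicates.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not one of the course files.
class ComparatorDemo {
    // Orders strings by length; ties are broken alphabetically so that
    // different strings of equal length are not treated as duplicates.
    static class ByLength implements Comparator<String> {
        public int compare(String a, String b) {
            if (a.length() != b.length()) {
                return a.length() - b.length();
            }
            return a.compareTo(b);
        }
    }

    static void fill(Set<String> s) {
        s.add("ringo");
        s.add("paul");
        s.add("george");
        s.add("john");
    }

    static Set<String> natural() {
        Set<String> s = new TreeSet<String>();               // uses String.compareTo
        fill(s);
        return s;
    }

    static Set<String> byLength() {
        Set<String> s = new TreeSet<String>(new ByLength()); // uses the Comparator
        fill(s);
        return s;
    }

    public static void main(String[] args) {
        System.out.println(natural());   // [george, john, paul, ringo]
        System.out.println(byLength());  // [john, paul, ringo, george]
    }
}
```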

Question: How Do You Loop Through a Set?
Can't use an index-based for loop: there is no concept of the i-th element of a set.
An Iterator is an object that helps you iterate through a collection.
Goals: learn how to...
  1. Use iterators (very important)
  2. Write your own iterator classes (later, if time permits)

Common Situation
You have some kind of collection of data: array, ArrayList, linked list, stack, queue, tree, etc.
You need to loop through the collection and do something with each element:
  - print
  - add to total
  - change contents of objects
  - search for a value (or set of values)
  - check for errors
  - etc.
How you do this depends on the kind of data structure.

Examples
loop through an array:
    for (int i = 0; i < arr.length; i++) {   // or i < count if array isn't full
        // do something with arr[i]
    }
loop through an ArrayList:
    for (int i = 0; i < list.size(); i++) {
        // do something with list.get(i)
    }
loop through a linked list:
    ListCell ptr = first;
    while (ptr != null) {
        // do something with ptr.data
        ptr = ptr.next;
    } // end while

Four Common Elements
Each example did four things to loop through the collection:
  - initialize
  - check if we're at the end of the collection
  - access the "current" element
  - advance to the next element

Information Hiding
If you know the structure of your data, you know how to walk through it efficiently.
Problems:
  - with information hiding (private data), this is not always possible
  - if you change your representation, you have to change all your loops

Java Iterator Interface
An Iterator is an object that controls a loop through a data structure.
Two methods in the Iterator interface:
  - hasNext(): asks "is there another element left?"
  - next(): advances to the next element and returns it
The Collection interface contains this method:
  - iterator(): returns an Iterator for looping through the collection
Look at SetIteratorExample.java.

Important Restriction
You may not change a collection while an iterator is travelling through it!
Try IteratorChange.java.
Solution: the Iterator interface has a third method, remove().
iter.remove() means "remove the current element (the one most recently returned by next()) from the collection".
Try IteratorChangeFIXED.java.
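
A sketch of the restriction and the fix, written independently of the course files IteratorChange.java and IteratorChangeFIXED.java (whose contents are not shown here). The class and method names are made up. The API documentation describes the ConcurrentModificationException check as best-effort, so the failing version should be read as "typically throws" rather than "is guaranteed to throw".

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical demo class, not one of the course files.
class IteratorRemoveDemo {
    static Set<String> sample() {
        Set<String> s = new HashSet<String>();
        s.add("john");
        s.add("george");
        s.add("paul");
        s.add("ringo");
        return s;
    }

    // Broken: removes through the set itself while a loop is iterating over it.
    // Returns true if the iterator detected the change.
    static boolean removeThroughSetFails() {
        Set<String> s = sample();
        try {
            for (String name : s) {
                if (name.contains("o")) {
                    s.remove(name);      // modifies the set behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Fixed: removes through the iterator.
    static Set<String> removeThroughIterator() {
        Set<String> s = sample();
        Iterator<String> iter = s.iterator();
        while (iter.hasNext()) {
            String name = iter.next();
            if (name.contains("o")) {
                iter.remove();           // removes the element just returned by next()
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(removeThroughSetFails());   // true
        System.out.println(removeThroughIterator());   // [paul]
    }
}
```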

Shortcut: "For Each" Loop
To avoid the Iterator syntax, Java 1.5 added a new kind of loop for simple use of an iterator.
    for (String item : myset) {
        ... do something with item ...
    } // end for
Above is shorthand for:
    Iterator<String> iter = myset.iterator();
    while (iter.hasNext()) {
        String item = iter.next();
        ... do something with item ...
    } // end while
Look at ForEachLoop.java

Limitations of "For Each"
"For each" loops are great for many purposes.
Two limitations:
  - can't modify the collection during the loop
  - only one kind of iterator for a collection type

The List Interface
(indentation shows sub-interfaces and implementing classes)
    Collection<E>
        List<E>
            ArrayList<E>    (class)
            LinkedList<E>   (class)
Lists are different from Sets because:
  - the programmer specifies the order of elements
  - duplicates are OK
Every element has a position: 0, 1, 2,..., size-1

List Methods
List inherits all methods from Collection.
  - add with just one parameter (from Collection): adds to end of list
  - add with two parameters (new): first parameter is the index
More new List methods:
  - get/set/remove by index
  - sublist by index range
  - search returning index, not just boolean
  - fancy iterators
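
The index-based List methods listed above can be sketched as follows. The class name ListMethodsDemo and the sample values are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not one of the course files.
class ListMethodsDemo {
    static List<String> build() {
        List<String> list = new ArrayList<String>();
        list.add("a");            // one-parameter add: appends       -> [a]
        list.add("c");            //                                  -> [a, c]
        list.add(1, "b");         // two-parameter add: insert at 1   -> [a, b, c]
        list.add("b");            // duplicates are OK                -> [a, b, c, b]
        list.set(0, "z");         // replace by index                 -> [z, b, c, b]
        list.remove(1);           // remove by index                  -> [z, c, b]
        return list;
    }

    public static void main(String[] args) {
        List<String> list = build();
        System.out.println(list);                  // [z, c, b]
        System.out.println(list.get(2));           // b
        System.out.println(list.indexOf("c"));     // 1  (search returns an index)
        System.out.println(list.indexOf("q"));     // -1 (not found)
        System.out.println(list.subList(0, 2));    // [z, c] (from index 0 up to, not including, 2)
    }
}
```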

List Implementations
Two implementations: ArrayList, LinkedList
Same methods available; the only difference is performance.
Advantages of ArrayList:
  - random access: get(i) is O(1)
  - adding to / deleting from the end of the list is O(1) (amortized, for adds)
Advantage of LinkedList:
  - adding to / deleting from either end is O(1)
demo: ListExample
Uses ArrayList. You could substitute LinkedList in the declarations and the results wouldn't change.

Collections Class
The Collections class provides several useful static methods.
Some examples (signatures simplified; the real ones are generic):
    static void shuffle(List l);
    static void sort(List l);
    static void sort(List l, Comparator c);
    static Object max(Collection coll)
    static Object max(Collection coll, Comparator comp)
    static Object min(Collection coll)
    static Object min(Collection coll, Comparator comp)
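
A brief sketch of these static methods in use. The class name CollectionsDemo and the sample numbers are made up; Collections.reverseOrder() is a standard library method used here as a ready-made Comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not one of the course files.
class CollectionsDemo {
    static List<Integer> sample() {
        return new ArrayList<Integer>(Arrays.asList(5, 2, 9, 1));
    }

    static List<Integer> ascending() {
        List<Integer> l = sample();
        Collections.sort(l);                                       // natural ordering
        return l;
    }

    static List<Integer> descending() {
        List<Integer> l = sample();
        Collections.sort(l, Collections.<Integer>reverseOrder());  // with a Comparator
        return l;
    }

    static int largest()  { return Collections.max(sample()); }
    static int smallest() { return Collections.min(sample()); }

    // Shuffling changes positions only; sorting afterwards gives the same list.
    static boolean shuffleKeepsElements() {
        List<Integer> l = sample();
        Collections.shuffle(l);
        Collections.sort(l);
        return l.equals(ascending());
    }

    public static void main(String[] args) {
        System.out.println(ascending());    // [1, 2, 5, 9]
        System.out.println(descending());   // [9, 5, 2, 1]
        System.out.println(largest());      // 9
        System.out.println(smallest());     // 1
    }
}
```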

Collections & Aliases
Very important facts to remember:
  - A Collection of Objects is really a Collection of references to Objects (pointers).
  - When you add an Object to a Collection, it does not make a copy.
  - If an Object belongs to several collections, all the collections contain references to the same Object.
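
Example
    Person peter = new Person("Peter", 20);
    Person paul = new Person("Paul", 20);
    Person mary = new Person("Mary", 20);
    Set<Person> cisc121 = new HashSet<Person>();
    cisc121.add(peter);
    cisc121.add(paul);
    Set<Person> cisc124 = new HashSet<Person>();
    cisc124.add(paul);
    cisc124.add(mary);
    for (Person p : cisc121)
        p.age++;
    for (Person p : cisc124)
        p.age++;
    System.out.println(peter);
    System.out.println(paul);
    System.out.println(mary);
Both sets hold a reference to the same paul object, so his age is incremented twice; peter and mary are each incremented once. (Assuming the second constructor argument is the age, paul ends at 22 and the others at 21.)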

Exceptions
  - Many Collection methods can throw exceptions.
  - Most are subclasses of RuntimeException, so you don't have to catch them.
  - Try to prevent them with careful programming.
  - See the Java API descriptions for details.
Example:
    List mylist = new LinkedList(); // or ArrayList
    ...
    // mylist.size() is now 5
    Object o = mylist.get(6); // exception! (IndexOutOfBoundsException: valid indexes are 0 to 4)

Maps
A map is like a "dict" in Python.
A set of <key,value> pairs: a "mapping" from each key to a value.
Examples:
  - phone book
  - student database organized by student number

HashMap Implementation
  - A good, efficient implementation using hashing.
  - Most operations are O(1): independent of the size of the table.
  - Like HashSet, it doesn't maintain entries in any logical order.
  - Often use simple API types for keys (String, Integer); these are OK.
  - Using your own classes as keys: wait for explanation or use a different map implementation.

HashMap Example (1)
Use a simple class for marks:
    class Marks {
        int midterm;    // midterm mark
        int finalexam;  // final exam mark
        // plus obvious constructor & toString
    }
    // Create a map from student name to Marks
    Map<String, Marks> studentmap = new HashMap<String, Marks>();
    studentmap.put("buffy", new Marks(63, 71));
    studentmap.put("willow", new Marks(97, 98));
    studentmap.put("xander", new Marks(42, 37));
    // Fix error in Buffy's marks (put with an existing key replaces the old value)
    studentmap.put("buffy", new Marks(64, 71));

HashMap Example (2)
    // retrieve Buffy's marks
    Marks buffymarks = studentmap.get("buffy");
    if (buffymarks != null)
        System.out.println("Buffy's marks are: " + buffymarks);
    else
        System.out.println("no marks for Buffy");
output:
    Buffy's marks are: (64,71)

    // look for Dawn's marks (not there)
    Marks dawnmarks = studentmap.get("dawn");
    if (dawnmarks != null)
        System.out.println("Dawn's marks are: " + dawnmarks);
    else
        System.out.println("no marks for Dawn");
output:
    no marks for Dawn

Sorted Maps
The distinction between regular maps & sorted maps is just like set vs. sorted set.
In a sorted map:
  - elements will print in order (by key)
  - iterators will return elements in order
  - you get a few additional methods
Implementation: TreeMap
You must either:
  - use a class for keys that implements Comparable, or
  - supply a Comparator
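
A small sketch of a sorted map, using String keys (which implement Comparable) and Integer values to keep it self-contained. The class name SortedMapDemo and the data are invented for illustration; firstKey, lastKey and headMap are some of the additional SortedMap methods.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class, not one of the course files.
class SortedMapDemo {
    static SortedMap<String, Integer> marks() {
        SortedMap<String, Integer> m = new TreeMap<String, Integer>();
        m.put("willow", 97);
        m.put("xander", 42);
        m.put("buffy", 64);
        return m;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> m = marks();
        System.out.println(m);               // {buffy=64, willow=97, xander=42}
        System.out.println(m.firstKey());    // buffy
        System.out.println(m.lastKey());     // xander
        // entries whose keys are strictly less than "x"
        System.out.println(m.headMap("x"));  // {buffy=64, willow=97}
    }
}
```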

Other Ways to View Maps
Common question: how do you iterate over a map?
Instead of iterators, the API provides alternate views of a map; you can iterate over those views.
"key set": the set of all the keys in the map.
  - This is not a copy: if you remove an element from the set, you've removed that key from the map.
  - You are not allowed to add to a key set, just remove.
demo: MapIteratorDemo.java
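
A sketch of the key-set view, written independently of MapIteratorDemo.java (whose contents are not shown here). The class name KeySetDemo and the data are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not one of the course files.
class KeySetDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("buffy", 64);
        map.put("willow", 97);
        map.put("xander", 42);
        return map;
    }

    // Iterate over the map by iterating over its key set.
    static int total(Map<String, Integer> map) {
        int sum = 0;
        for (String key : map.keySet()) {
            sum += map.get(key);
        }
        return sum;
    }

    // Removing from the key set removes the entry from the map itself.
    static Map<String, Integer> removeViaKeySet() {
        Map<String, Integer> map = sample();
        Set<String> keys = map.keySet();   // a view, not a copy
        keys.remove("xander");
        return map;
    }

    // Adding to the key set is not supported.
    static boolean addToKeySetRejected() {
        Set<String> keys = sample().keySet();
        try {
            keys.add("dawn");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(total(sample()));                          // 203
        System.out.println(removeViaKeySet().containsKey("xander"));  // false
        System.out.println(addToKeySetRejected());                    // true
    }
}
```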

Entry Sets
A map may be thought of as a set of pairs: (key, value)
Use the entrySet method to view a map as a set of pairs.
It returns a Set of objects that implement this interface (nested inside Map, so its full name is Map.Entry):
    interface Entry<K,V> {
        K getKey();
        V getValue();
        V setValue(V value);
    }
You may not add elements to an entry set.
You may delete elements and change values.
demo...
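
A sketch of iterating over an entry set, changing one value with setValue and deleting one pair through the iterator. The class name EntrySetDemo, the data, and the thresholds 50 and 90 are made up for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class, not one of the course files.
class EntrySetDemo {
    static Map<String, Integer> adjust() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("buffy", 64);
        map.put("willow", 97);
        map.put("xander", 42);

        Iterator<Map.Entry<String, Integer>> iter = map.entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry<String, Integer> entry = iter.next();
            if (entry.getValue() < 50) {
                entry.setValue(50);   // changes the value stored in the map
            } else if (entry.getValue() > 90) {
                iter.remove();        // deletes this (key, value) pair from the map
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = adjust();
        System.out.println(map.get("xander"));          // 50
        System.out.println(map.get("buffy"));           // 64
        System.out.println(map.containsKey("willow"));  // false
    }
}
```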