What is the Java Collections Framework?

To begin with, what is a collection? I have a collection of comic books. In that collection, I have Tarzan comics, Phantom comics, Superman comics and several more. I hold my collection of emails in an email folder on my computer. In the context of object-oriented programming, a collection is an object that groups multiple objects into a single unit. It is used to store, retrieve and manipulate data, and to transmit that data from one method to another.

In Java, the Collections Framework is an architecture for representing and manipulating collections. All collections frameworks involve three things:

1. Interfaces
2. Implementations
3. Algorithms

Interfaces are abstract data types (ADTs) used to represent collections. They allow collections to be manipulated without being bound to a particular implementation.

Implementations are the concrete classes written against the interfaces to provide the functionality, in this case that of the collections. Typically these implementations can be reused.

Algorithms are methods that perform useful computations on objects that implement the collection interfaces. Some examples are searching and sorting.

The benefits of using a Collections Framework are:

1. It offers rich, reusable data structures and algorithms that are useful in problem solving.
2. The data structures and algorithms are usually high-quality, high-performance ones.
3. The various implementations of each interface are interchangeable.
4. It offers interoperability between unrelated APIs.

Interfaces

Collection interfaces enable a degree of control over collections and make it easy to pass collections to a myriad of methods. In other words, interfaces enable manipulation of collections independent of their implementation.
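As a minimal sketch of what "independent of implementation" means, the hypothetical helper countLongWords below is written only against the Collection interface, so the same method accepts a list or a set. The class name, the method name and the sample data are illustrative assumptions, and the sketch uses generics, which the raw-type listings elsewhere in this document predate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

class InterfaceDemo {
    // Hypothetical helper: it depends only on the Collection interface,
    // never on a concrete class such as ArrayList or HashSet.
    static int countLongWords(Collection<String> words, int minLength) {
        int count = 0;
        for (String word : words) {
            if (word.length() >= minLength) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Collection<String> asList =
                new ArrayList<>(Arrays.asList("Tarzan", "Phantom", "Superman", "Tarzan"));
        Collection<String> asSet = new HashSet<>(asList); // same elements, duplicates dropped

        // The same method works for both implementations.
        System.out.println(countLongWords(asList, 7));
        System.out.println(countLongWords(asSet, 7));
    }
}
```

Swapping ArrayList for another implementation changes only the constructor call; countLongWords itself is untouched.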
In Java, the Collection interfaces are

    Collection
      o Set
          o SortedSet
      o List

We also have the Map interfaces, but they are not really treated as a part of the Collection interface hierarchy. They are

    Map
      o SortedMap

It is important to know that the JDK does not provide separate interfaces for each variant of each collection type. Some salient features are:

1. This keeps the number of core collection interfaces manageable.
2. The variants in question include
   a. immutable
   b. fixed-size
   c. append-only
3. The modification operations in each of the interfaces are optional. This is done to cope with these variants.
4. A given implementation may not support some of these operations. If an unsupported operation is invoked, the collection throws an UnsupportedOperationException.
5. It is the responsibility of each implementation to state, in its documentation, which optional operations it supports.
6. The JDK's general-purpose implementations support all the optional operations.

Links to Good Tutorials on Collection Framework

1. An exhaustive tutorial from NTU, Singapore at http://www.ntu.edu.sg/home/ehchua/programming/java/j5c_collection.html
2. A series of video tutorials from the Cave of Programming. Part 1 of the Java Collection framework is at http://www.youtube.com/watch?v=mkctxtle7xu

Learning to Use the Collection Interface

The Collection interface is at the root of the collection hierarchy.

Elements: The objects that a collection groups together are called its elements.
Duplicates: Some collection implementations allow duplicate elements, while others do not.
Ordering: Some implementations of the Collection interface are ordered, while others are not.
Implementations: The JDK does not provide any direct implementation of the Collection interface itself. It provides implementations of its sub-interfaces, such as Set and List.

The Collection interface is used to pass collections between methods and to manipulate them when maximum generality is required.
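Returning to the optional operations mentioned above, here is a small sketch of how an unsupported operation surfaces at run time. It assumes a read-only view obtained from Collections.unmodifiableCollection(); the class name and the helper tryAdd are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

class OptionalOperationDemo {
    // Hypothetical helper: reports whether the optional add() operation
    // is supported by the given collection, simply by trying it.
    static boolean tryAdd(Collection<String> target, String element) {
        try {
            target.add(element);
            return true;
        } catch (UnsupportedOperationException e) {
            return false; // this implementation does not support add()
        }
    }

    public static void main(String[] args) {
        Collection<String> mutable =
                new ArrayList<>(Arrays.asList("Tarzan", "Phantom"));
        Collection<String> readOnly = Collections.unmodifiableCollection(mutable);

        System.out.println(tryAdd(mutable, "Superman")); // ArrayList supports add()
        System.out.println(tryAdd(readOnly, "Superman")); // the read-only view rejects add()
    }
}
```

In real code it is usually better to consult the documentation of the implementation than to probe with a try/catch; the helper is written this way only to make the exception visible.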
Example: Collection implementations usually have a constructor that takes a Collection as its parameter. This constructor initializes the new Collection to contain all the elements in the specified Collection. By invoking this constructor, we can create a Collection of a desired implementation type, initially containing all the elements of any given Collection, regardless of its sub-interface or implementation type. Thus, if we have a collection myCollection, which may be a List, a Set or any other type of Collection, we can create a new ArrayList containing all the elements of myCollection. A way of achieving this is shown below.

    List myList = new ArrayList(myCollection);

Given below is the Collection interface.

    public interface Collection {
        // These are the Basic Operations
        int size();
        boolean isEmpty();
        boolean contains(Object element);
        boolean add(Object element);                    // Optional
        boolean remove(Object element);                 // Optional
        Iterator iterator();

        // Bulk Operations
        boolean containsAll(Collection myCollection);
        boolean addAll(Collection myCollection);        // Optional
        boolean removeAll(Collection myCollection);     // Optional
        boolean retainAll(Collection myCollection);     // Optional
        void clear();                                   // Optional

        // These are Array Operations
        Object[] toArray();
        Object[] toArray(Object a[]);
    }

The Collection interface has the following methods.

1. To add elements to a collection. The add() method is defined generally enough to make sense both for collections that allow duplicate elements and for those that do not. The method returns true if the collection has changed following the execution of this method.
2. To remove elements from a collection. The remove() method removes a single instance of the specified element from the collection. This method returns true if the collection has changed after execution of the method.
3. To iterate over elements of a collection. The object returned by the iterator() method is similar to an Enumeration, but differs in two ways.
   a. An Iterator allows us to remove an element from the underlying collection during iteration.
   b. When we are moving through an Enumeration, we cannot remove an element from the collection without compromising safety.
The Iterator interface is displayed below.

    public interface Iterator {
        boolean hasNext();
        Object next();
        void remove();      // Optional
    }

4. To check if a next element is present. The hasNext() method is similar to the Enumeration.hasMoreElements() method, while the next() method is similar to the Enumeration.nextElement() method. The remove() method removes from the collection the last element returned by the next() method. The remove() method can be called only once for every call to the next() method. An IllegalStateException is thrown if this rule is violated.

Example: Traversing a collection and removing elements. Suppose we want to traverse a collection and, at the same time, remove those elements that do not satisfy a specific condition. We can use the iterator to perform this task.

    // conditionFails() stands for whatever test the caller needs to apply.
    static void removeElementThatFailsCondition(Collection myCollection) {
        for (Iterator k = myCollection.iterator(); k.hasNext(); )
            if (conditionFails(k.next()))
                k.remove();
    }

Let us spend a minute and examine this segment of code. This method works regardless of what the collection contains and regardless of the collection implementation. Essentially, this method is polymorphic.

Bulk Operations on Collections

Bulk operations can be used to manipulate the contents of an entire collection at a time.

1. containsAll:
   a. This returns true if the target collection contains all the elements in the specified collection.
2. addAll:
   a. This adds all the elements of the specified collection to the target collection.
   b. The method returns true if the target collection was modified on executing the method.
3. removeAll:
   a. This removes from the target collection all those elements that are also present in the specified collection.
   b. The method returns true if the target collection was modified on executing the method.
4. retainAll:
   a. This removes from the target collection all its elements that are not also present in the specified collection.
   b. In other words, it retains only those elements in the target collection that are also present in the specified collection.
   c. The method returns true if the target collection was modified on executing the method.
5. clear:
   a. It removes all elements from the collection.

Example: To remove all instances of a specified element myElement from the collection myCollection.

    myCollection.removeAll(Collections.singleton(myElement));

Example: To remove all null elements from the collection myCollection.

    myCollection.removeAll(Collections.singleton(null));

Do note the use of Collections.singleton(). This is a factory method that returns an immutable Set containing only the specified element.

Array Methods

These act as a bridge between collections and APIs that expect arrays as input.
1. The toArray() method without any arguments creates a new array of Object.
2. The toArray() method can also take an array argument, which either supplies the output array or determines the runtime type of the output array.

Example: To place the contents of the collection myCollection into a newly created array of Object whose length is equal to the number of elements in myCollection.

    Object[] myObject = myCollection.toArray();

Example: To place the contents of myCollection into a newly created array of String whose length is equal to the number of elements in myCollection.

    String[] myStrings = (String[]) myCollection.toArray(new String[0]);

These are just simple examples. One needs to solve many problems, often non-trivial ones, to become competent in using collections.

Learning to Use the Set Interface

We have encountered the mathematical abstraction called a set. In Java, a Set is a special type of Collection. It does not contain duplicate elements, just like its mathematical counterpart. The Set interface is a sub-interface of the Collection interface and contains only those methods that are inherited from the Collection interface. It does not declare any additional method; it only adds the restriction that duplicate elements are prohibited. Set also places a stronger contract on the behavior of the equals and hashCode operations, which allows Set objects with different implementations to be compared meaningfully. Remember that two Set objects containing the same elements are considered equal.

In Java, the Set interface is as displayed below.
    public interface Set extends Collection {
        // Basic Operations
        int size();
        boolean isEmpty();
        boolean contains(Object element);
        boolean add(Object element);                    // Optional
        boolean remove(Object element);                 // Optional
        Iterator iterator();

        // These are Bulk Operations
        boolean containsAll(Collection myCollection);
        boolean addAll(Collection myCollection);        // Optional
        boolean removeAll(Collection myCollection);     // Optional
        boolean retainAll(Collection myCollection);     // Optional
        void clear();                                   // Optional

        // These are Array Operations
        Object[] toArray();
        Object[] toArray(Object a[]);
    }

General Purpose Set Implementations

1. HashSet stores its elements in a hash table. This implementation offers excellent performance, but makes no guarantee about the order of iteration.
2. TreeSet stores its elements in a red-black tree. This implementation guarantees the order of iteration: the elements are visited in sorted order.

Example: Removing duplicates from a collection myCollection. We create a Set initially containing all the elements in myCollection. By definition, the Set cannot contain duplicates.

    Collection distinctCollection = new HashSet(myCollection);
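The one-line idiom above can be tried out with a small, self-contained sketch. The class name DistinctDemo, the two helper methods and the sample data are assumptions made for illustration, and the code uses generics rather than the raw types shown in this document.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class DistinctDemo {
    // Hypothetical helper: copying into a HashSet drops duplicates;
    // the iteration order of the result is unspecified.
    static <T> Collection<T> distinct(Collection<T> source) {
        return new HashSet<>(source);
    }

    // Hypothetical helper: copying into a TreeSet drops duplicates
    // and yields the elements in sorted order.
    static <T extends Comparable<? super T>> Collection<T> distinctSorted(Collection<T> source) {
        return new TreeSet<>(source);
    }

    public static void main(String[] args) {
        List<String> comics =
                Arrays.asList("Tarzan", "Phantom", "Tarzan", "Superman", "Phantom");

        System.out.println(distinct(comics).size());  // prints 3
        System.out.println(distinctSorted(comics));   // prints [Phantom, Superman, Tarzan]
    }
}
```

Both helpers rely on the conversion constructor described earlier; only the choice of implementation differs.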
The following are some of the methods that can be used with Set.

1. size():
   a. This returns the number of elements in a set.
2. isEmpty():
   a. This returns true if the set is empty.
3. add():
   a. This adds a new element to the set if it was not present already. It returns true to indicate that the element was added, or false if not.
4. remove():
   a. This removes the specified element from the set if it is present and returns true. It returns false if the element was not removed.
5. iterator():
   a. This returns an iterator over the elements of the set.

Example: To write a Java program that takes the words in its argument list (which could be long or short) and prints out any duplicate words present, the number of distinct words, as well as a list of the words with duplicates eliminated.

    import java.util.*;

    public class DetectAndRemoveDuplicates {
        public static void main(String args[]) {
            Set mySet = new HashSet();
            for (int k = 0; k < args.length; k++)
                // add() returns false when the word is already in the set
                if (!mySet.add(args[k]))
                    System.out.println("Duplicate Found -- " + args[k]);
            System.out.println("Set Size: " + mySet.size() + " Distinct Words Found -- " + mySet);
        }
    }

I urge you to execute this program and examine the results. This short program performs an otherwise relatively complicated task rather simply.

Example: To write a Java program that takes the words in its argument list (which could be long or short) and prints out any duplicate words present, the number of distinct words, as well as a list of the words with duplicates eliminated, this time in alphabetical order.
Note that this requirement is almost the same as the earlier example, except that we want the list of words to be displayed in alphabetical order. Now don't rush off to write a fresh program from scratch. If we use the power of the Java Collection Framework, all we need to do is implement the set with a TreeSet instead of the earlier HashSet. In the above program, just change the constructor call for the set as follows.

    Set mySet = new TreeSet();

Compile and execute this program to convince yourself that it works as intended. This is really powerful stuff.

Bulk Operations on Set

1. set1.containsAll(set2):
   a. This returns true if set2 is a subset of set1.
2. set1.addAll(set2):
   a. This makes set1 into the union of set1 and set2.
   b. The union of two sets is the set containing all the elements found in either of the two sets.
3. set1.retainAll(set2):
   a. This makes set1 into the intersection of set1 and set2.
   b. The intersection of two sets is the set containing only those elements that are common to both sets.
4. set1.removeAll(set2):
   a. This makes set1 into the asymmetric set difference of set1 and set2.
   b. The set difference set1 - set2 is the set containing all the elements found in set1 but not in set2.

Learning to Use the List Interface

In the Java Collection framework, a list is an ordered collection. It is also called a sequence. Unlike a set, a list allows duplicate elements. We can control the place in a list where a new element is inserted, and we can access elements using an integer index that denotes the position of the element in the list.
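The three properties of a list just described (ordering, duplicates and positional access) can be seen in a short sketch. The class name ListBasicsDemo, the helper buildComicList and the comic titles are illustrative assumptions; the calls used (add, get, indexOf, lastIndexOf) are standard List methods.

```java
import java.util.ArrayList;
import java.util.List;

class ListBasicsDemo {
    // Hypothetical helper: builds a small list to show ordering,
    // duplicates and insertion at a chosen position.
    static List<String> buildComicList() {
        List<String> comics = new ArrayList<>();
        comics.add("Tarzan");       // appended at index 0
        comics.add("Superman");     // appended at index 1
        comics.add("Tarzan");       // duplicates are allowed in a list
        comics.add(1, "Phantom");   // inserted at index 1; later elements shift right
        return comics;
    }

    public static void main(String[] args) {
        List<String> comics = buildComicList();

        System.out.println(comics);                        // prints [Tarzan, Phantom, Superman, Tarzan]
        System.out.println(comics.get(1));                 // prints Phantom
        System.out.println(comics.indexOf("Tarzan"));      // prints 0
        System.out.println(comics.lastIndexOf("Tarzan"));  // prints 3
    }
}
```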
The List interface inherits all the operations of the Collection interface. It also includes the following groups of operations.

1. Positional Access: This is used to manipulate the elements based on their numerical position in the list.
2. Search: This is used to search for a specified object in the list and return its numerical position.
3. List Iteration: This extends the Iterator mechanism to enable sequential traversal of a list.
4. Range-View: This is used to carry out arbitrary range operations on the list.

The List interface is displayed below.

    public interface List extends Collection {
        // Positional Access
        Object get(int index);
        Object set(int index, Object element);                  // Optional
        void add(int index, Object element);                    // Optional
        Object remove(int index);                               // Optional
        boolean addAll(int index, Collection myCollection);     // Optional

        // Search
        int indexOf(Object myObject);
        int lastIndexOf(Object myObject);

        // Iteration
        ListIterator listIterator();
        ListIterator listIterator(int index);

        // Range-View
        List subList(int from, int to);
    }

General Purpose List Implementations

The following are the two general purpose List implementations.

1. ArrayList
2. LinkedList

The following are some methods that can be used with List.
1. remove():
   a. This method removes the first occurrence of the specified element from the list.
2. add():
   a. This appends the specified element to the end of the list.
3. addAll():
   a. This appends all the elements of the specified collection to the end of the list.
4. listIterator():
   a. This returns a ListIterator over the elements of the list in proper sequence.
   b. It allows us to traverse the list in either direction, modify the list during iteration, and obtain the current position of the iterator.
5. hasNext(), hasPrevious(), next(), previous():
   a. hasNext() and next() are inherited by ListIterator from Iterator and work the same way as before. hasPrevious() and previous() are their counterparts for moving backwards through the list.

There are other methods like get(), set(), and remove(int index) that can be used with lists. It would be a good idea to learn how to concatenate two lists, to check if two lists contain the same elements in the same order, and to access elements according to their positions in a list. This interface has too many methods to cover one by one here. A good way of developing competency is to solve problems that require the use of various methods, rather than listing the methods and memorizing what they do.

Tutorials on List Interface

There are a number of good tutorials on YouTube and elsewhere.

1. The Cave of Programming tutorial is at http://www.youtube.com/watch?v=ymdbl_fceii
2. The official one at the Oracle site is at http://docs.oracle.com/javase/tutorial/collections/interfaces/list.html

Other Interfaces in Collection

There are other interfaces that are also very powerful and useful in developing applications. They are:

1. The Map Interface
   I recommend that you go through the tutorial at http://docs.oracle.com/javase/tutorial/collections/interfaces/map.html.
2. The SortedSet Interface
   I recommend that you go through the tutorial at http://docs.oracle.com/javase/tutorial/collections/interfaces/sorted-set.html
3. The SortedMap Interface
   I recommend that you go through the tutorial at http://docs.oracle.com/javase/tutorial/collections/interfaces/sorted-map.html
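As a small taste of the Map and SortedMap interfaces before working through those tutorials, the sketch below counts how often each word occurs. It assumes TreeMap as the SortedMap implementation; the class name WordCountDemo and the helper countWords are hypothetical names chosen for illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

class WordCountDemo {
    // Hypothetical helper: maps each word to the number of times it occurs.
    // A SortedMap keeps its keys in ascending order.
    static SortedMap<String, Integer> countWords(String[] words) {
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            Integer previous = counts.get(word);   // null if the word is new
            counts.put(word, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"Tarzan", "Phantom", "Tarzan", "Superman"};
        SortedMap<String, Integer> counts = countWords(words);

        System.out.println(counts);             // prints {Phantom=1, Superman=1, Tarzan=2}
        System.out.println(counts.firstKey());  // prints Phantom
    }
}
```

A Map differs from the collections discussed earlier in that it stores key-value pairs rather than single elements, which is why it sits outside the Collection hierarchy.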