Try an online Java class for free!
Additional Resources

Collections

In this lesson of the Java tutorial, you will learn...
  1. About the concepts and classes in the Collections API
  2. How to work with Lists, Sets, and Maps
  3. How to use Iterators
  4. How to define and use Comparable and Comparator classes
  5. About generic classes

Collections

The Java Collections API is a set of classes and interfaces designed to store multiple objects

There are a variety of classes that store objects in different ways

  • Lists store objects in a specific order
  • Sets reject duplicates of any objects already in the collection
  • Maps store objects in association with a key, which is later used to look up and retrieve the object (note that if an item with a duplicate key is put into a map, the new item will replace the old item)

The basic distinctions are defined in several interfaces in the java.util package

  • Collection is the most basic generally useful interface that most, but not all, collection classes implement
    • it specifies a number of useful methods, such as those to add and remove elements, ascertain the size of the collection, determine if a specific object is contained in the collection, or return an Iterator that can be used to loop through all elements of the collection (more on this later)
    • methods include: add(Object o), remove(Object o), contains(Objecto), iterator()
    • note that removing an element removes an object that compares as equal to a specified object, rather than by position
    • Collection extends the Iterable interface, which only requires one method, which supplies an object used to iterate through the collection
  • the List interface adds the ability to insert and delete at a specified index within the collection, or retrieve from a specific position
  • the Set interface expects that implementing classes will modify the adding methods to prevent duplicates, and also that any constructors will prevent duplicates (the add method returns a boolean that states whether the operation succeeded or not; i.e., if the object could be added because it was not already there)
    • sets do not support any indexed retrieval
  • the Map interface does not extend Collection, because its adding, removing, and retrieving methods use a key value to identify an element
    • methods include: get(Object key), put(Object key, Object value), remove(Object key), containsKey(Objectkey), containsValue(Objectvalue), keySet(), entrySet()
    • maps do not support any indexed retrieval

In addition to the List, Set, and Map categories, there are also several inferences you can make from some of the collection class names:

  • a name beginning with Hash uses a hashing and mapping strategy internally, although the key values usually have no meaning (the hashing approach attempts to provide an efficient and approximately equal lookup time for any element)
  • a name beginning with Linked uses a linking strategy to preserve the order of insertion, and is optimized for insertion/deletion as opposed to appending or iterating
  • a name beginning with Tree uses a binary tree to impose an ordering scheme - either the natural order of the elements (as specified by the Comparable interface implemented by many classes including String and the numeric wrapper classes), or an order dictated by a special helper class object (a Comparator)

Several additional interfaces are used to define useful helper classes:

  • Enumeration provides a pair of methods that enable you to loop through a collection in a standardized manner, regardless of the type of collection (you test if there are more elements, and, if so, retrieve the next element)
    • this interface is less frequently used now that the following interface has been added to the API
    • but, many of the elements in J2EE (servlets in particular) predate the Iterator concept, and were specified to use enumerations, so they will still appear in new code
  • Iterator is an improved version that allows an object to be removed from the collection through the iterator

The following diagrams show some of the classes and interfaces that are available

Collections API Class Hierarchy - Lists and Sets

Collections API Class Hierarchy - Maps

Using the Collection Classes

The following are several examples of collection classes

Vector - stores objects in a linear list, in order of addition

  • you can insert and delete at any location
  • all methods are synchronized, so that the class is thread-safe
  • predates the collections framework, but was marked to implement List when the collections API was developed
    • as a result, there are many duplicate methods, like add and addElement, since the names chosen for Collection methods didn't all match existing method names in Vector
  • used extensively in the past, and therefore you will see it in a lot of code, but most recent code uses ArrayList instead

ArrayList - like Vector, stores objects in a linear list, in order of addition

  • methods are not synchronized, so that the class is not inherently thread-safe (but there are now tools in the Collections API to provide a thread-safe wrapper for any collection, which is why Vector has fallen into disuse)

TreeSet - stores objects in a linear sequence, sorted by a comparison, with no duplicates

  • stores Comparable items or uses a separate Comparator object to determine ordering
  • uses a balanced tree approach to manage the ordering and retrieval
  • note that there is no tree list collection, since the concepts of insertion order and natural order are incompatible
  • since sets reject duplicates, any comparison algorithm should include a guaranteed tiebreaker (for example, to store employees in last name, first name order: to allow for two Joe Smiths, we should include the employee id as the final level of comparison)

TreeMap - stores objects in a Map, where any subcollection or iterator obtained will be sorted by the key values

  • keys must be Comparable items or have a separate Comparator object
  • note that for maps, an iterator is not directly available
    • you must get either the key set or entry set, from which an iterator is available

Hashtable - stores objects in a Map

  • all methods are synchronized, so that the class is thread-safe
  • predates the collections framework, but was marked to implement Map when the collections API was developed
    • like Vector, has some duplicate methods
  • extends an obsolete class called Dictionary

HashSet - uses hashing strategy to manage a Set

  • rejects duplicate entries
  • entries are managed by an internal HashMap that uses the entry hashCode values as the keys
  • methods are not synchronized

Using the Iterator Interface

Iterators provide a standard way to loop through all items in a collection, regardless of the type of collection

  • all collection classes implement an iterator() method that returns the object's iterator (declared as Iterator iterator())
  • this method is specified by the Iterable interface, which is the base interface for Collection
  • for maps, the collection itself is not Iterable, but the set of keys and the set of entries are
  • you can use a for-each loop for any Iterable object

Without an iterator, you could still use an indexed loop to walk through a list using the size() method to find the upper limit, and the get(int index) method to retrieve an item

  • but, other types of collections may not have the concept of an index, so an iterator is a better choice

There are two key methods: boolean hasNext(), which returns true until there are no more elements, and Object next(), which retrieves the next element and also moves the iterator's internal pointer to it

  • some collections allow the collection to you to remove an element through the iterator
    • the remove method is marked in the docs as optional - by the definition of interfaces it has to be present, but the optional status indicates that it may be implemented to merely throw an UnsupportedOperationException
  • in any case, if the collection is modified from a route other than via the iterator (perhaps by using an index, or even just using the add method), a ConcurrentModificationException is thrown

The Enumeration class is an older approach to this concept; it has methods hasMoreElements() and nextElement()

Code Sample: Java-Collections/Demos/CollectionsTest.java

import java.util.*;

public class CollectionsTest {
 
 public static void main(String[] args) {
  List l = new ArrayList();
  Map m = new TreeMap();
  Set s = new TreeSet();
  
  l.add(new Integer(1));
  l.add(new Integer(4));
  l.add(new Integer(3));
  l.add(new Integer(2));
  l.add(new Integer(3));
  
  m.put(new Integer(1), "A");
  m.put(new Integer(4), "B");
  m.put(new Integer(3), "C");
  m.put(new Integer(2), "D");
  m.put(new Integer(3), "E");
  
  s.add(new Integer(1));
  s.add(new Integer(4));
  s.add(new Integer(3));
  s.add(new Integer(2));
  s.add(new Integer(3));
  
  System.out.println("List");
  Iterator i = l.iterator();
  while (i.hasNext()) System.out.println(i.next());
  
  System.out.println("Map using keys");
  i = m.keySet().iterator();
  while (i.hasNext()) System.out.println(m.get(i.next()));
  
  System.out.println("Map using entries");
  i = m.entrySet().iterator();
  while (i.hasNext()) System.out.println(i.next());
  
  System.out.println("Set");
  i = s.iterator();
  while (i.hasNext()) System.out.println(i.next());
 }
}
Code Explanation

This program demonstrates the three types of collections. As is commonly done, the variables are typed as the most basic interfaces (List, Set, and Map).

We attempt to add the same sequence of values to each: 1, 4, 3, 2, and 3

  • they are deliberately out of sequence and contain a duplicate
  • for the Map, the numbers are the keys, and single-letter strings are used for the associated values

An iterator is then obtained for each and the series printed out (for the Map we try two different approaches: iterating through the keys and retrieving the associated entries, and iterating directly through the set of entries)

Note the order of the values for each, and also which of the duplicates was kept or not.

  • the List object stores all the objects and iterates through them in order of addition
  • the Map object stores by key; since a TreeMap is used, its iterator returns the elements sorted in order by the keys
  • since one key is duplicated, the later of the two values stored under that key is kept
  • the Set object rejects the duplicate, so the first item entered is kept (although in this case it would be hard to tell which one was actually stored)

Creating Collectible Classes

hashCode and equals

Both the equals(Object) method and hashCode() methods are used by methods in the Collections API.

  • the Map classes use them to determine if a key is already present
  • the Set classes use them to determine if an object is already in the set
  • all collections classes use them in contains and related methods

Sun specifies that the behavior of hashCode should be "consistent with equals", meaning that two objects that compare as equal should return the same hash code.

  • the reverse is not required - two objects with the same hash code might be unequal, since hash codes provide "bins" for storing objects

Thsi is critical when writing collectible classes. For example, the implementation of HashSet, which should reject duplicate entries, compares a candidate entry's hash code against that of each object currently in the collection. It will only call equals if it finds a matching hash code. If no hash code matches, it assumes that the candidate object must not match any already present in the set.

Comparable and Comparator

Sorted collections can sort elements in two ways:

  • the natural order of the elements - objects that implement the Comparable interface have a natural order
  • using a third-party class that implements Comparator

Comparable - specifies one method:

  • compareTo(Object o) - returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object

Comparator - specifies two methods:

  • compare(Object a, Object b) - returns a negative integer, zero, or a positive integer as a is less than, equal to, or greater than b
  • equals(Object o) - as in Object, used not to compare the stored values for value, but to determine if two comparator classes can be considered equal
    • but, note that the behavior of the compare method should be consistent with equals, as Sun's documentation advises; that is, compare(a, b) should return 0 when a.equals(b) returns true

As an example:

TreeSet uses a tree structure to store items

  • the tree part of the name is just for identifying the algorithm used for storage - you cannot make use of any of the node-related behaviors from the outside
  • there are several forms of constructor, most notably:
    • TreeSet() - uses the natural order of the items, so they must implement the Comparable interface (and the items should be mutually comparable, to avoid casting exceptions being thrown during a comparison)
    • TreeSet(Comparator c) - uses the specified Comparator to compare the items for ordering (and, again, the mutually comparable caveat applies)

Code Sample: Java-Collections/Demos/UseComparable.java

import java.util.*;

public class UseComparable {
 public static void main(String[] args)
           throws java.io.IOException {
  String[] names = { "Sue", "Bill", "Tom", "Dave", "Andy",
            "Mary", "Beth", "Bill", "Mike" };
  TreeSet sl = new TreeSet(Arrays.asList(names));
  Iterator it = sl.iterator();
  while (it.hasNext()) {
    System.out.println(it.next());
  }
 }
}
Code Explanation

Since String implements Comparable, the names will appear in alphabetical order when we iterate through the set

The Arrays class provides useful methods for working with arrays, some of which will return a collection backed by the array (the Collections classes contain useful methods for working with collections, some of which perform the reverse operation)

Creating a Comparable Class

Classes that implement the Comparable interface may be stored in ordered collections

They must have a int compareTo(Object other) method to implement the interface

  • it is said that these objects have a natural order as determined by this method
  • the function returns a negative value for this object considered less than the object passed as parameter other, a positive value for this object considered greater than the object passed as parameter other, and 0 if they are equal
  • since this capability is built into the object, it is important that it the results of the compareTo method match with the results of the equals and hashCode methods

Creating and Using a ComparatorClass

Classes that implement the Comparator interface may be also stored in ordered collections, but the collection must be constructed with an explicit reference to an instance of the Comparator

  • the Comparator is a separate class that will compare two instances of your class to determine the ordering

The interface specifies int compare(Object a, Object b)

  • the function returns a negative value for object a considered less than the object b, a positive value for b considered greater than a, and 0 if they are equal
  • it is still important that it the results of the compareTo method match with the results of the objects' equals method
    • you should usually implement this method to avoid "ties" for objects that would not be considered equal
    • for example, for two different employees who coincidentally have the same name, they would be returned in an indeterminate order

Code Sample: Java-Collections/Demos/UseComparator.java

import java.util.*;

public class UseComparator {
 
 public static void main(String[] args) throws java.io.IOException {
  String[] names = { "Sue", "Bill", "Tom", "Dave", "Andy",
            "Mary", "Beth", "Bill", "Mike" };

  TreeSet s2 = new TreeSet(new ReverseComparator());
  s2.addAll(Arrays.asList(names));
  
  Iterator it = s2.iterator();
  while (it.hasNext()) {
    System.out.println(it.next());
  }
 }
}

class ReverseComparator implements Comparator {
 public int compare(Object o1, Object o2) {
  if (o1 instanceof String && o2 instanceof String)
   return -((String)o1).compareTo((String)o2);
  else throw new ClassCastException("Objects are not Strings");
 }
}
Code Explanation

The compare method of our comparator makes use of the existing compareTo method in String, and simply inverts the result

Note the tests for both objects being String, and throwing a ClassCastException if one is not

Code Sample: Java-Collections/Demos/UseComparableAndComparator.java

import java.util.*;

public class UseComparableAndComparator {
 
 public static void main(String[] args) throws java.io.IOException {
  String[] names = { "Sue", "Bill", "Tom", "Dave", "Andy",
            "Mary", "Beth", "Bill", "Mike" };

  TreeSet sl = new TreeSet(Arrays.asList(names));
  Iterator it = sl.iterator();
  while (it.hasNext()) {
    System.out.println(it.next());
  }

  TreeSet s2 = new TreeSet(new ReverseComparator());
  s2.addAll(Arrays.asList(names));
  
  it = s2.iterator();
  while (it.hasNext()) {
    System.out.println(it.next());
  }
 }
}
Code Explanation

Since all the collections store is references, it will not use up a lot of memory to store the same references in different collections

This creates an analog to a set of table indexes in a database

Collections in Java 5.0: Generics

Java 5.0 added the concept of Generics, which allow data types to be parameterized for a class

In earlier versions of Java, the methods to store objects all received a parameter whose type was Object

  • therefore, the methods to retrieve elements were typed to return Object
  • to use a retrieved element, you had to typecast the returned object back to whatever it actually was (and somehow you had to know what it actually was)

The collections in Java 5.0 use a special, new syntax where the type of object is stated in angle brackets after the collection class name

Instead of ArrayList, there is now ArrayList<E>, where the E can be replaced by any type

  • within the class, method parameters and return values can be parameterized with the same type
    public class ArrayList<e> {
     . . .
     public void add(int index, E element) { ... }
     . . .
     public E get(int index) {  }
     . . .
    }

For example, an ArrayList of String objects would be ArrayList<String>

Code Sample: Java-Collections/Demos/GenericCollectionsTest.java

import java.util.*;

public class GenericCollectionsTest {
 
 public static void main(String[] args) {

  List<String> ls = new ArrayList<String>();
  
  ls.add("Hello");
  ls.add("how");
  ls.add("are");
  ls.add("you");
  ls.add("today");
  
  // using iterator
  StringBuffer result = new StringBuffer();  
  Iterator<String> is = ls.iterator();
  while (is.hasNext()) result.append(is.next().toUpperCase()).append(' ');
  result.append('?');
  System.out.println(result);
  
  // using for-each loop
  result = new StringBuffer();
  for (String s : ls) result.append(s.toLowerCase()).append(' ');
  System.out.println(result);

  // old way
  List l = new ArrayList();
  
  l.add("Hello");
  l.add("how");
  l.add("are");
  l.add("you");
  l.add("today");
  
  // using iterator
  result = new StringBuffer();
  Iterator i = l.iterator();
  while (i.hasNext()) result.append(((String)i.next()).toUpperCase()).append(' ');
  result.append('?');
  System.out.println(result);
  
  // using for-each loop
  result = new StringBuffer();
  for (Object o : l) result.append(((String)o).toLowerCase()).append(' ');
  System.out.println(result);

 }
}
Code Explanation

As you can see, the objects retrieved from the ArrayList are already typed as being String objects

  • we need to have them as String objects in order to call toUpperCase, whereas our previous examples only printed the objects, so being typed as Object was OK
  • without generics, we must typecast each retrieved element in order to call toUpperCase
  • to use an iterator, we would declare the variable to use the same type of element as the collection we draw from, as in Iterator<String>

Variations on Generics

In the documentation for a collections class, you may see some strange type parameters for methods or constructors, like:

public boolean containsAll(Collection<?> c)
public boolean addAll(Collection<? extends E> c) 
public TreeSet(Comparator<? super E> comparator)

The question mark is a wildcard, indicating that the actual type is unknown. It does no harm for us to see if our collection contains all the elements of some other collection that may contain an entirely unrelated type (but few, if any, of the items would compare as equal). Note that the contains method accepts an Object parameter, not an E, for this same reason.

The extends term means that the unknown class is or extends the listed type. For instance, we could add all the elements of an ArrayList<ExemptEmployee> to an ArrayList<Employee> using the addAll(Collection<? extends E> c) method. That makes sense, since we could add individual exempt employees to a basic employee collection. The reason for the question mark is that the addAll method treats the type as unknown, and therefore cannot call any methods on the incoming object that depend on the type. Imagine if the addAll method used the E type, which in our case would be Employee, and called the add method on the incoming collection, attempting to add an Employee to it. But our incoming collection is ArrayList<ExemptEmployee>, which cannot accept a mere Employee.

The super term is seen less often; it means that the parameterized type of the incoming collection must be of the same type or a more basic type. The TreeSet constructor can accept a Comparator for its actual type, or any type more basic (e.g., a Comparator<Object> can be used for a TreeSet of anything, since its compare method will accept any type of data).

Collections Conclusion

In this lesson of the Java tutorial you have learned how to use the classes and interfaces in the Collections API, and how to work with generic classes

To continue to learn Java go to the top of this page and click on the next lesson in this Java Tutorial's Table of Contents.
Last updated on 2009-03-02

Use of http://www.learn-java-tutorial.com (Website) implies agreement to the following:

Copyright Information

All pages and graphics on Website are the property of Webucator, Inc. unless otherwise specified.

None of the content on Website may be redistributed or reproduced in any way, shape, or form without written permission from Webucator, Inc.

No Printing or saving of pages or content on Website

This content may not be printed or saved. It is for online use only.


Linking to Website

You may link to any of the pages on Website; however, you may not include the content in a frame or iframe without written permission from Webucator, Inc.


Warranties

Website is provided without warranty of any kind. There are no guarantees that use of the site will not be subject to interruptions. All direct or indirect risk related to use of the site is borne entirely by the user. All code and explanations provided on this site are provided without warranties to correctness, performance, fitness, merchantability, and/or any other warranty (whether expressed or implied).


For individual private use only

You agree not to use this online manual to deliver or receive training. If you are delivering or attending a class that is making use of this online manual, you are in violation of our terms of service. Please report any abuse to courseware@webucator.com. If you would like to deliver or receive training using this manual, please fill out the form at http://www.webucator.com/Contact.cfm