Java Reference Types

Published in Java on 6-Feb-2012

The Java language allows the use of four different types of object references. The usage, intricacies and benefits of these four types of references seem to be a mystery even to many experienced Java developers.

The four types of references, in order of 'strength', are Strong References, Soft References, Weak References and Phantom References. Each type of reference has distinct properties that affect how referenced objects interact with the Java VM and garbage collector.

Strong References

Not much needs to be said about Strong References. These are the standard types of reference that Java developers deal with on a daily basis. Whenever you use syntax like:

Object x = new Object();

you are defining the variable x as a strong reference.

Strong References guarantee that as long as they are valid, the object they refer to, called the referent, is not eligible for garbage collection. More formally, an object becomes eligible for garbage collection by the VM when no chain of strong references can be traced from a GC Root to the object. GC Root is a fancy term for static variables, variables on the stack, and the like.

Soft References

Soft References, as the name implies, do not hold on to their referents quite as rigidly as Strong References do. Creating a Soft Reference looks a bit different than creating a Strong Reference:

SoftReference<SomeObject> reference = new SoftReference<>(new SomeObject());

This syntax constructs a SomeObject instance and passes it as a parameter to a new SoftReference object. Exactly when objects that are only referenced by Soft References become eligible for garbage collection depends on the JVM implementation, with one exception: before the VM throws an OutOfMemoryError, it must clear all Soft References to objects that are only softly reachable, in an attempt to free enough memory to avoid the error. Otherwise, in the Java HotSpot VM, a softly referenced object becomes eligible for garbage collection a certain amount of time after it was last accessed through the reference (by default 1 second for every megabyte free in the heap, tunable with -XX:SoftRefLRUPolicyMSPerMB). It is important to note, however, that just because an object is eligible for garbage collection, it will not actually be collected until a garbage collection of the appropriate memory generation is triggered (usually by the heap filling up).
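Because the collector may clear a Soft Reference once its referent is only softly reachable, code that reads one has to be prepared for get() to return null. The following is a minimal sketch of that read pattern, not a definitive implementation; the class SoftReferenceDemo and its load() method are hypothetical names invented for illustration.

```java
import java.lang.ref.SoftReference;

// Hypothetical example class; names are illustrative and not from any library.
class SoftReferenceDemo {
    // Starts out empty: a reference with a null referent simply returns null from get().
    private SoftReference<byte[]> cached = new SoftReference<>(null);

    // Stand-in for any expensive computation or I/O (an assumption for this sketch).
    private static byte[] load() {
        return new byte[1024];
    }

    byte[] data() {
        // Copy the referent into a strong local variable BEFORE testing it,
        // so it cannot be cleared between the null check and the use.
        byte[] value = cached.get();
        if (value == null) {
            // Never loaded, or cleared by the garbage collector: rebuild and re-wrap.
            value = load();
            cached = new SoftReference<>(value);
        }
        return value;
    }

    public static void main(String[] args) {
        SoftReferenceDemo demo = new SoftReferenceDemo();
        System.out.println("loaded " + demo.data().length + " bytes");
    }
}
```

Calling get() twice (once to test, once to use) would be a bug, since the collector could clear the reference between the two calls.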

The behavior of Soft References makes them ideal candidates for use in memory caches containing medium or long-lived data. Objects stored in a memory cache will mostly be long lived, and as such should find their way into the tenured generation of the heap memory. Thus, they will only be garbage collected when the tenured generation of the heap begins filling up.

While it is possible to construct a cache manually using Soft References, the Apache Commons Collections project contains an implementation of the Map interface specifically for this purpose, called the ReferenceMap. Using this class makes it easy to construct garbage collectable Maps that can be used for caching various forms of long-lived data.
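For comparison, a manually constructed cache might look roughly like the sketch below. It uses only JDK classes; SoftCache and its methods are hypothetical names, and this is not how ReferenceMap is implemented.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical minimal cache; keys are held strongly, values softly.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value, or computes and caches it if it is missing or was collected.
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Number of map entries, including any whose values have already been cleared.
    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<>();
        System.out.println(cache.get("greeting", k -> "hello"));
    }
}
```

One limitation of this sketch is that entries whose values have been cleared stay in the map until they are overwritten. A fuller implementation would register each SoftReference with a ReferenceQueue and purge the stale entries.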

SoftReference (and indeed every non-Strong reference type) has an overloaded constructor that lets the application specify a ReferenceQueue in addition to the referent. A ReferenceQueue is a queue onto which the SoftReference object itself will be placed once the garbage collector has cleared it, that is, once its referent has been collected. If you need to invoke a particular action when objects are garbage collected, you can create a thread that observes the ReferenceQueue and takes action when a reference is enqueued.
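An observer thread of that kind could be sketched as follows. QueueObserverDemo, startObserver and demo are hypothetical names. To keep the example independent of garbage collection timing, it calls enqueue() by hand to simulate the step the collector would normally perform.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper; class and method names are illustrative only.
class QueueObserverDemo {

    // Starts a daemon thread that runs 'action' for every reference placed on 'queue'.
    static Thread startObserver(ReferenceQueue<Object> queue, Consumer<Reference<?>> action) {
        Thread observer = new Thread(() -> {
            try {
                while (true) {
                    action.accept(queue.remove()); // blocks until a reference is enqueued
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop observing when interrupted
            }
        }, "reference-queue-observer");
        observer.setDaemon(true);
        observer.start();
        return observer;
    }

    // Returns true if the observer was notified within five seconds.
    static boolean demo() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        CountDownLatch notified = new CountDownLatch(1);
        Thread observer = startObserver(queue, ref -> notified.countDown());

        SoftReference<Object> reference = new SoftReference<>(new Object(), queue);
        // Normally the garbage collector enqueues the reference. Calling enqueue()
        // manually simulates that step so the demo does not depend on GC timing.
        reference.enqueue();
        try {
            return notified.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            observer.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("observer notified: " + demo());
    }
}
```

The observer uses the blocking remove() call rather than polling in a loop, so the thread sleeps until there is work to do.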

Weak References

Objects that are referred to only by Weak References become eligible for garbage collection immediately. The next garbage collection cycle that runs through the appropriate heap generation will collect them, regardless of how much memory is free. This makes Weak References ideal for caching short-lived data in memory.

Creating Weak References looks very similar to the Soft Reference above:

WeakReference<SomeObject> reference = new WeakReference<>(new SomeObject());
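The difference between a weakly referenced object that is also strongly held and one that is not can be observed with a small experiment. This is only a sketch: WeakReferenceDemo and its methods are hypothetical names, and System.gc() is merely a request, so the second method usually returns true on HotSpot but is not guaranteed to.

```java
import java.lang.ref.WeakReference;

// Hypothetical example class; names are illustrative only.
class WeakReferenceDemo {

    // While a strong reference is still in use, the weak reference cannot be cleared.
    static boolean retainedWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a request; the JVM decides whether to collect
        // 'strong' is used in the comparison, so the referent is still strongly reachable here.
        return weak.get() == strong;
    }

    // With no strong reference left, a collection is free to clear the weak reference.
    static boolean clearedAfterRelease() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        for (int i = 0; i < 10 && weak.get() != null; i++) {
            System.gc();
        }
        return weak.get() == null; // likely, but not guaranteed by the specification
    }

    public static void main(String[] args) {
        System.out.println("retained while strongly held: " + retainedWhileStronglyHeld());
        System.out.println("cleared after release: " + clearedAfterRelease());
    }
}
```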

Phantom References

Phantom References, while similar in syntax to Weak and Soft References, are an entirely different breed of reference, and their purpose is often a cause of considerable confusion.

Objects that are only referred to by Phantom References are eligible to be marked for garbage collection immediately. However, the referent itself is not actually garbage collected at that time. Instead, the PhantomReference object is placed on the associated ReferenceQueue. On Java 8 and earlier, the referent's memory is not reclaimed until the clear() method of the PhantomReference is called or the PhantomReference itself becomes unreachable; since Java 9, the collector clears Phantom References automatically when it enqueues them. Either way, you can monitor the ReferenceQueue for collection of the phantom-referenced object and then take action before the referent's memory is actually reclaimed. This functionality is primarily a more reliable alternative to the much maligned finalize() method.

Problems with finalize()

It is commonly known that overriding the finalize() method of an object in order to clean up resources or otherwise 'deconstruct' an object is generally a bad idea. The reason for this is it can, in a very insidious way, lead to objects residing in memory much longer than might be otherwise anticipated by the developer.

If the garbage collector decides that an object is no longer strongly referenced from GC roots and is eligible for garbage collection, it first checks whether the object has an overridden finalize() method. If it does, the object cannot be garbage collected immediately. Instead, the object is marked for finalization.

The JVM has a 'finalizer' thread, which runs continuously but with no guarantee about when it will be scheduled relative to the application's own threads. It iterates over all the objects that are marked for finalization and executes the code in the finalize method. The problem with this should be immediately apparent.

If your application is very CPU intensive or runs with many high priority threads, the 'finalizer' thread will take a back seat to those other high priority threads. The 'finalizer' thread may not get to actually execute the finalize method until a considerable amount of time has passed after the object was initially marked for garbage collection (or in a worst-case scenario, it may never execute).

This means that there will be a considerable delay in deallocating the memory reserved by the object marked for finalization, taking up valuable space that could be used for other objects and potentially leading to more frequent performance-killing garbage collections as the VM tries to free space for the new objects it must allocate.

Additionally, allowing the user to execute code in the finalize method potentially allows the user to create additional strong references to objects marked for finalization. While this almost assuredly would be a coding error, the possibility exists, for example:

@Override
protected void finalize() throws Throwable {
    someObjectCollection.add(this);
}

Assuming that someObjectCollection is a collection that is strongly referenced from a GC root not contained in this class, this would prevent the object from being garbage collected, even after the finalize() method has run.

Phantom References to the Rescue

There are times when it is necessary or desirable to run code immediately before an object's memory is deallocated. The PhantomReference type of reference gives us this ability. As discussed above, we can associate a PhantomReference with a ReferenceQueue object. When all non-phantom references to an object have become invalid (by being popped off the stack, for example), the next time the garbage collector runs on the area of the heap that contains the referent, the PhantomReference will be placed on the ReferenceQueue. At this point, the PhantomReference's referent has not yet been deallocated. By occasionally checking the ReferenceQueue, the application can detect when a PhantomReference has been enqueued and execute code accordingly.

Even once a PhantomReference has been enqueued on the ReferenceQueue, it is not possible to obtain a reference to its referent. The abstract class Reference, which PhantomReference, WeakReference and SoftReference all extend, defines a method T get() which returns the Reference's referent. However, this method is explicitly overridden in the PhantomReference class to always return null. This prevents application code from creating new strong references to the referent and thereby preventing its eventual deallocation, as is possible with the finalize method.
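The behavior of get() is easy to confirm. The sketch below uses the hypothetical names PhantomGetDemo and getReturnsNullForLiveReferent, and shows that a PhantomReference reports null even while its referent is still strongly reachable and nothing has been enqueued.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

// Hypothetical example class; names are illustrative only.
class PhantomGetDemo {

    static boolean getReturnsNullForLiveReferent() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        // get() is overridden to return null, regardless of the referent's state.
        boolean hidden = phantom.get() == null;
        // The referent is still strongly reachable (it is used below), so nothing is enqueued yet.
        boolean notEnqueued = queue.poll() == null;
        return hidden && notEnqueued && referent != null;
    }

    public static void main(String[] args) {
        System.out.println("get() hides a live referent: " + getReturnsNullForLiveReferent());
    }
}
```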

The inability to get a reference to the PhantomReference's referent also presents a challenge to executing code on the object prior to its actual collection, however. The solution here is subclassing the PhantomReference, allowing it to share the resources that are needed for the necessary cleanup:

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

public class ReferenceTest {
    private final ReferenceQueue<SomeObject> refQueue = new ReferenceQueue<SomeObject>();
    private PhantomReference<SomeObject> reference = null;

    public static void main(String[] args) {
        new ReferenceTest().begin();
    }

    public void begin() {
        new Thread(new RefQueueObserver(refQueue)).start();

        this.reference = new CustomPhantomReference(
                new SomeObject(new SomeDependency()), refQueue);

        while(true) {
            new Object();
        }
    }

    public class SomeObject {
        private SomeDependency someDependency;
        private byte[] bytes = new byte[0x1fffff];

        public SomeObject(SomeDependency someDependency) {
            this.someDependency = someDependency;
        }

        public SomeDependency getSomeDependency() {
            return this.someDependency;
        }
    }

    public class SomeDependency {
        public void cleanup() {
            System.out.println("Cleaning up SomeDependency");
        }
    }

    public class CustomPhantomReference extends PhantomReference<SomeObject> {
        private SomeDependency someDependency;

        public CustomPhantomReference(
                SomeObject referent, ReferenceQueue<? super SomeObject> q) {
            super(referent, q);
            this.someDependency = referent.getSomeDependency();
        }

        public SomeDependency getSomeDependency() {
            return this.someDependency;
        }
    }

    public class RefQueueObserver implements Runnable {
        private final ReferenceQueue<SomeObject> refQueue;

        public RefQueueObserver(ReferenceQueue<SomeObject> referenceQueue) {
            this.refQueue = referenceQueue;
        }

        @Override
        public void run() {
            try {
                // remove() blocks until a reference is enqueued, so the thread does not spin on poll()
                CustomPhantomReference queuedReference =
                        (CustomPhantomReference) refQueue.remove();
                queuedReference.getSomeDependency().cleanup();
                queuedReference.clear();
                // Forcing a full collection is for demonstration only
                System.gc();
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.exit(0);
        }
    }
}

Compile and execute the example using the following instructions:

> javac ReferenceTest.java
> java -Xmx16m -verbose:gc -XX:-PrintGCDetails ReferenceTest

Our example contains several inner classes.

The application creates an instance of the RefQueueObserver and starts it watching the reference queue. Then, an instance of the CustomPhantomReference is created. Finally, the main thread of the application loops, creating Object instances in order to fill up the young generation of the heap memory and trigger garbage collections. The RefQueueObserver is set to exit the application when a referenced object is garbage collected.

Executing this application with the JVM parameters specified above leads to output resembling the following (Line numbers added):

1: [GC 4159K->2272K(15744K), 0.0036400 secs]
2: Cleaning up SomeDependency
3: [GC 2787K->2240K(15744K), 0.0003730 secs]
4: [Full GC 2240K->122K(15744K), 0.0069420 secs]
5: [GC 4282K->122K(15744K), 0.0004130 secs]

As we can see, line 1 is a minor garbage collection, after which about 2 MB remain in use (roughly the size of the object our Phantom Reference refers to, which holds a 2 MB byte array). Line 2 indicates that our RefQueueObserver has found a reference on the reference queue and has executed the cleanup method on its dependency. After we clean up the dependency and clear the reference, we force a full garbage collection. While this should never be done in production code, it was the easiest way to force the old generation of the heap to be garbage collected. You can see, in line 4, that the full garbage collection reduces the heap usage to 122K, meaning that the referent has finally been garbage collected and deallocated.

The three lesser-known types of references in Java are useful but often misunderstood. They can be an incredibly valuable tool when used properly, allowing certain data to be cached in a way that won't cause OutOfMemoryErrors in your application and allowing you to work around complications with the use of the finalize() method. Hopefully, I have been able to shed some light on their use and usefulness.

About the Author

Daniel Morton is a Software Developer with Shopify Plus in Waterloo, Ontario and the co-owner of Switch Case Technologies, a software development and consulting company.  Daniel specializes in Enterprise Java Development and has worked and consulted in a variety of fields including WAN Optimization, Healthcare, Telematics, Media Publishing, and the Payment Card Industry.