RE: Generics Microbenchmark

After the popularity of my “How’d this String get into my List<Integer>?!” post I decided to follow up with a few small benchmarks. I got the idea from Geoffrey Wiseman who read my previous post and wondered about Generics performance.

Geoffrey wondered if there was any runtime run-time performance penalty taken by using Java 5 Generic Collections over pre-Java 5 non-generic Collections. At first I thought there was no such run-time penalty incurred because Generics is a compile-time feature, not a run-time feature. The Java Language Specification states that the class implementing a Generic Collection at run-time is the erasure of the Generic compile-time type. That is to say that the same class implements ArrayList<String>, ArrayList<Integer>, ArrayList<MyType> and ArrayList rather than creating a run-time type for each different Generic type. Because all of these types are implemented by the same class I didn’t think there would be much of a performance penalty, if any at all. But being a good computer scientist I decided to find out for myself.

I wrote a Java class called GenericsMicroBenchmark to do a little bit of my own benchmarking. The first thing I did was declare and initialize two lists, one Generic the other not.

private List<Integer> myInts = new ArrayList<Integer>();
private List yourInts = new ArrayList();

The class really gets started in the addXIntegers method. This method takes one argument to determine the number of Integers to add to the List. Each Integer is created by a call to Math.random() which is multiplied by the current value of the loop counter and then cast to an int.

myInts.add((int) (Math.random() * i));

I’m using autoboxing in that line to promote the resulting int to an Integer value which affords that value a place in my list. The whole loop to populate the List<Integer> looks like:

long startTimeGeneric = System.currentTimeMillis();
for (int i = 0; i < x; ++i) {
myInts.add((int) (Math.random() * i));
}
long endTimeGeneric = System.currentTimeMillis();
genericRunTimeSum += (endTimeGeneric - startTimeGeneric);

In addition to the loop there are some very imprecise timing markers. The non-generic loop looks very similar to the Generic one:

long startTime = System.currentTimeMillis();
for (int i = 0; i < x; ++i) {
yourInts.add((int) (Math.random() * i));
}
long endTime = System.currentTimeMillis();
uncheckedRunTimeSum += (endTime - startTime);

The genericRunTimeSum and uncheckedRunTimeSum variables contain a running total of execution time because the addXIntegers method contains a loop that runs these blocks 1000 times each. The running total is then divided by 1000 to give us an average run time for each type of list. You can take a look at the class linked above for more details about what the addXIntegers method does.

Now that we have a basic driver that is doing slightly more than trivial object creation let’s run it. A quick javac later and we have a class file to do our bidding. I’ve run this class a few different ways to get some interesting metrics out of it. The first thing I did was just run it with javac and no JVM arguments. I passed in 100,000 as a command line argument to the GenericsMicroBenchmark class in order to create 100,000 Integers to add to each list.

java net.anthonychaves.sandbox.generics.GenericsMicroBenchmark 100000
Exception in thread “main” java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at net.anthonychaves.sandbox.generics.GenericsMicroBenchmark.addXIntegers(GenericsMicroBench
mark.java:18)
at net.anthonychaves.sandbox.generics.GenericsMicroBenchmark.main(GenericsMicroBenchmark.jav
a:47)

That’s weird. I commented out the lines of code that print out the results from each loop in order to get only the average times at the end of the program execution. I guess the garbage collector didn’t get to run while the stats were printing so we ran into this exception. I can fix that by passing a few JVM arguments.

java -Xms1536M -Xmx1536M net.anthonychaves.sandbox.generics.
GenericsMicroBenchmark 100000
Avg Generics: 74 ms
Avg unchecked: 48 ms

I ran the program a few times just to confirm the results and so far it looks like Generics are slower than their non-generic counterparts. I can’t believe that’s possible since the same class implements both ArrayList and ArrayList<Integer> at run-time. Let’s see what happens when I add the -server JVM argument.

java -server -Xms1536M -Xmx1536M net.anthonychaves.sandbox.generics.GenericsMicroBenchmark 100000
Avg Generics: 63 ms
Avg unchecked: 41 ms

That’s a little closer, but not quite “almost equal” like I’m looking for. Maybe the JVM is just getting warmed up when the Generic version of the loop finishes and it’s ready to rock when the non-generic version runs. If I change the order of the loops in the addXIntegers method and run it again we get:

java -server -Xms1536M -Xmx1536M net.anthonychaves.sandbox.generics.GenericsMicroBenchmark 100000
Avg Generics: 39 ms
Avg unchecked: 64 ms

It looks like we have a little role-reversal going on here. Again, my guess is that the JVM is just getting warmed up by the time the first loop ends and the second loop beings. Let’s see if we can even that out a little bit. Because the JVM is doing certain helpful, unknown-to-me things let’s give each list a clean slate when it starts. We can do this by suggesting to the JVM to run the garbage collector before we begin creating our 100,000 Integers for each list.

java -server -Xms1536M -Xmx1536M net.anthonychaves.sandbox.generics.GenericsMicroBenchmark 100000
Avg Generics: 27 ms
Avg unchecked: 27 ms

Ah, the great equalizer. Adding a call to System.gc() runs the garbage collector at the end of each list population pair and gives each a fresh start when it begins again. A few runs with the System.gc() call show that the performance of Generic and non-generic Collections in Java actually are just about equal in run-time performance. Again, Generics are a compile-time feature and are implemented by the erasure of the compile-time type which is the same for all objects of that type (generic or not). I have to admit I was getting a little worried for a few minutes there, but all things being equal the run-time performance is pretty much equal.

I have to note that even though the algorithm performance was vastly improved by adding the call to System.gc() the total program time slowed significantly due to some 100,000 suggested runs of the garbage collector. You really shouldn’t do this in practice as the JVM knows what is best for you most of the time. I did this just to give the List implementation a fair shot at performance for Generic and non-generic code. It was nothing more than give each algorithm equal footing.

You can find the full code listing at GenericsMicroBenchmark.

Thanks again for reading. Feel free to add any comments or corrections below. Special thanks to Geoffrey Wiseman for giving me the idea for this post.

3 comments ↓

#1 Geoffrey Wiseman on 03.14.07 at 5:01 am

Just to clarify: in my case, I was comparing these two:
List = new ArrayList();
List = Collections.checkedList( new ArrayList(), String.class );

The latter performs more slowly for me.

#2 Kirk on 03.15.07 at 9:35 am

The problem with Microbenchmarks is that it is very difficult to understand what you are really benchmarking. One of the tools that is absolutely necessary in helping you to understand if you’re benchmark has any validity to it is statistics. Sorry if this sounds harsh but one has to automatically reject outright the results of any benchmark that only includes an average.

You need to include variance at a minimum. But I would suggest that max, min and medium are also very useful when evaluating the results of an MBM.

IMHO (I’ve not run this benchmark myself yet), it looks biased against generics (GC issues). Also there is no consideration of what hotspot is doing. I would suggest that these issues need to be taken care of as well as the two test be separated into different tests that are run separately.

Once these changes are in place I would be surprised if there is a difference between the two test cases.

Kind regards,
Kirk Pepperdine

#3 Kirk on 03.15.07 at 10:54 am

ooops, should have said, glad to see that you finally came to a good conclusion.

– Kirk

Leave a Comment