Optimizing memory consumption of Radix Trees in Java

On the Relevancy team at zulily, we are often required to load a large number of large strings into memory. This often causes memory issues. After looking at multiple ways to reduce memory pressure, we settled on Radix Trees to store these strings. Radix Trees provide very fast prefix searching and are great for auto-complete services and similar uses. This post focuses entirely on memory consumption.

What Is A Radix Tree?

Radix Trees take sequences of data and organize them in a tree structure. Strings with common prefixes end up sharing nodes toward the top of this structure, which is how memory savings is realized. Consider the following example, where we store “antidisestablishmentarian” and “antidisestablishmentarianism” in a Radix Tree:

+- antidisestablishmentarian (node 1)
                           +- ism (node 2)

Two strings, totaling 53 characters, can be stored as two nodes in a tree. The first node stores the common prefix (25 characters) between it and its children. The second stores the rest (3 characters). In terms of character data stored, the Radix Tree stores the same information in approximately 53% of the space (not counting the additional overhead introduced by the tree structure itself).

If you add the string “antibacterial” to the tree, you need to break apart node 1 and shuffle things around. You end with:

+- anti                             (node 3)
      |- disestablishmentarian      (node 4)
      |                      +- ism (node 2)
      +- bacterial                  (node 5)

Real-World Performance

We run a lot of software in the JVM, where memory performance can be tricky to measure. In order to validate our Radix Tree implementation and measure the impact, I pumped a bunch of pseudo-realistic data into various collections and captured memory snapshots with YourKit Java Profiler.

Input Data

It didn’t take long to hack together some real-looking data in Ruby with Faker. I created four input files of approximately 1,000,000 strings that included a random selection of 12-digit numbers, bitcoin addresses, email addresses and ISBNs.

sreed:src/ $ head zulily-oss/radix-tree/12-digit-numbers.txt

sreed:src/ $ head zulily-oss/radix-tree/bitcoins.txt

sreed:src/ $ head zulily-oss/radix-tree/emails.txt

sreed:src/ $ head zulily-oss/radix-tree/isbns.txt

Measuring Memory with YourKit

YourKit provides a measurement of “retained size” in its memory snapshots which is helpful when trying to understand how your code is impacting the heap. What isn’t necessarily intuitive about it, though, is what objects it excludes from this “retained size” measurement. Their documentation is very helpful here: only object references that are exclusively held by the object you’re measuring will be included. Instead of telling you “this is how much memory usage your object imposes on the VM,” retained size instead tells you “this is how much memory the VM would be able to garbage-collect if it were gone.” This is a subtle, but very real, difference if you wish to optimize memory consumption.

Thus, my memory testing needed to ensure that each collection held complete copies of the objects I wished to measure. In this case, each string key needed to be duplicated (I decided to intern and share every value I stored in order to measure only the memory gains from different key storage techniques).

// Results in shared reference, and inaccurate measurement
map1.put(key, value);
map2.put(key, value);

// Results in shared char[] reference, and better but
// still inaccurate measurement
map1.put(new String(key), value);
map2.put(new String(key), value);

// Results in complete copy of keys, and accurate measurement
map1.put(new String(key.toCharArray()), value);
map2.put(new String(key.toCharArray()), value);

Collections Tested

I tested our own Radix Tree implementation, ConcurrentRadixTree from https://code.google.com/p/concurrent-trees/, a string array, Guava’s ImmutableMap and Java’s HashMap, TreeMap, Hashtable and LinkedHashMap. Each collection stored the same values for each key.

Both zulily’s Radix Tree and the ConcurrentRadixTree from concurrent-trees were configured to store string data as UTF-8-encoded byte arrays.

ConcurrentRadixTree was included simply to ensure that our own version (to be open-sourced soon) was worth the effort. The others were measured simply to highlight the benefits of Radix Tree storage for different input types. Each collection has its own merits and in most ways they are all superior to the Radix Tree for storage (put/get performance, concurrency and other features).



First of all, Guava’s ImmutableMap is pretty good. It stored the same key and value data as java.util.HashMap in 92-95% of the space. The Radix Tree breaks keys into byte array sequences and stores them in a tree structure based on common prefixes. This resulted in a best case of 62% the size of the ImmutableMap for bitcoin addresses (strings which have many common prefixes) and a worst case 88% for random 12-digit numbers. We see that the memory used by this data structure is largely dependent on the type of data put into it. Large strings with many large common prefixes are stored very efficiently in a narrow tree structure. Unique strings create a lot of branches in the underlying tree, making it very wide and adding a lot of overhead.

Converting Java Strings to byte arrays accounts for most of the memory savings, but not all. Byte array storage was anywhere from 90% (bitcoin addresses) to 99% (ISBNs) in the tests I ran.

For us, storing byte-encoded representations of string data in a radix tree allowed us to reclaim valuable memory in our services. However it wasn’t until validating the implementation in an accurate manner with realistic data and trustworthy tools that we rested easy knowing we had set out what we wished to accomplish.