Trie Memory and Performance Experiments

In my personal blockchain project (see repo, related post: Building a Blockchain: Introduction) I added a command line entry point to execute some experiments related to my trie implementation.

You can build the project with

./gradlew customFatJar

Then, execute:

java -cp build\libs\all-in-one-jar-1.0-SNAPSHOT.jar -Xms4g com.ajlopez.blockchain.TrieMemoryPerformance 1048576 32 20

The numeric arguments are: number of value to insert in the new trie, key size (in bytes), value size (bytes)

Output:

No values: 1048576
Key size: 32
Value size: 20
MB before: 40.96002960205078
MB after: 488.32274627685547
MB after GC: 182.94497680664062
Creation time (ms): 19027
Hash time (ms): 13839
Trie size (nodes): 1414795

The numbers could vary depending on the machine and other processes. I use -Xms4g Java option to specify a minimum 4 gigas for heap size.

There is also a --csv flag to output to specify comma-delimited ouput. I executed commands to run the experiments with 1 mega of values, but with different value size (from 10 to 100), and also I run experiments with 1 to 8 mega of values, using 20 bytes of value size. I collected the output in files, and imported to a Google worksheet, adding graphics. These is are the results (again, these are informal experiments, the output values could vary from machine to machine).

Generating 1 mega of values

These are the megabytes used in memory (after creating the trie, and do garbage collector):

Megabytes used by the trie nodes depending on the value size

Notably, the relation is linear to the number of key values inserted, but with a low factor. Maybe having the arity (number of subnodes in each trie node) in 16 induces that the non-leaf nodes are the “heavy” memory consumers, proportional to the number of values BUT NOT to leaf node value sizes.

Times (in millisecond) to create trie and calculate its root hash, depending on the value size (in bytes)

Again, the times are relative constants, according to the variation of value size. The hash time could be linear with the memory occupied by the trie, and that amount of memory is mainly due to internal hashes, not leaf node values.

Generating from 1 to 10 megas of values

In this experiment series I vary the number of nodes, keeping the key size fixed to 20 bytes.

The memory usage:

Trie memory usage

This memory usage is AT THE END of the trie creation BUT BEFORE garbage collector. Having arity 16, each insertion could generate 5, 6 new nodes, and 5, 6 nodes could be released. So, at the end, if the Java virtual machine does not execute a garbage collector (maybe the heap is still with space), then the memory could be polluted by unreferenced nodes.

The memory usage when garbage collector is invoked after trie creation:

Trie memory usage after executing garbage collector

There is a linear relation. The linear factor is around 1.5. I would have expected a lower factor, having arity 16, but I guess the non-leaf nodes are the “heavy memory consumers” in this case.

The relation between the number of nodes and the number of values, has a better linear factor, around 1.35:

Number of nodes vs number of values inserted

Conclusions

These are only experiments, and the results are informal. They are too dependent of the execution machine. But I could write some preliminary conclusions:

  • There are many relations that are linear
  • At least with the experimental values used and 4 gigas of minimum size heap, Java garbage collector don’t alter the execution time (at least, not a lot)
  • Root hash calculation time is not small, it have the same order of magnitude than creation time
  • Apparently, the non-leaf nodes occupy more memory, relative to leaf nodes memory, as expected having arity 16. So many results with many key values are less dependent on value size.

These are only experiments

Yes, the main production case is: block execution. My trie implementation is prepared to have a good performance: when using an store, during the execution process, the used nodes are loaded in memory, and kept in it, so the more used nodes are easily queried. Practically, there is no production use case to have a 10 million key values trie in memory. But these experiments could give us a first glimpse about the trie memory usage and process performance.

Angel “Java” Lopez