Every day, I’m having fun implementing a blockchain in Java (see project, see posts), based on Ethereum, like others, like RSK, my daily job. One of the interesting part to code is the representation of the world state. In the last years I implemented the Trie, many times, in many languages (like Java and C#), and one thing it is for sure: it is important in the most popular use case a blockchain node has: the execution of block.
(If you don’t know what a Trie is and how it is used in Ethereum blockchains, read Understanding the Ethereum Trie, mentioned in my blog post Programming for RSK; also read Ethereum State Trie Architecture Explained).
I say that execution of block is the most popular use case because if you monitor the execution of an Ethereum-like node, the execution of blocks occupy most of the time and resources (memory, storage access, etc) of the node. Specially if the blockchain has Proof of Work, and the miners produce many sibling blocks per candidate height: it’s not only the execution of the block with height, ie, 10000, but it also involves the execution of other competing blocks with the same height.
In this kind of blockchains (like my implementation) the trie is used to store the world state, for example, the Account states, and the Smart Contract Storage Cells (accessed by key value). The world state could be stored using simply a key-value store, BUT the trie allows to have something additional: the world state has a hash root, the root hash of the accounts world state, that includes, indirectly, also the storage world state. Practically, it’s the only reason to have a trie: to use its hash root as part of the distributed consensus algorithm. The nodes in the blockchain MUST obtain the same WORLD STATE HASH ROOT at the end of each BLOCK execution.
Since 2016, I experimented and implemented many tries, but with performance and efficiency as the guide to choose implementation. One thing I preferred in my implementations it’s to to have trie nodes, where the subnodes ARE REFERRED in-memory (it’s an usual way to implement a Trie, but also I implemented it without reading any other implementation code, as usual, following the TDD (Test-Driven Development) workflow, and pushing for simplicity along the way). This way of doing things, implementing code without reading other implementation, to explore alternative solutions, gave me many satisfactions these years.
Executing a Block
Using this approach, the tries to use DURING the execution of a block, lives in memory: only some nodes are retrieved from key-value storage: when the node is needed.
Also, when a value should be retrieve using a key, but the trie nodes that represent the key are partially in memory, no access to disk is needed. So, the tries in memory (accounts, storages) started to grow DINAMYCALLY, with adaptation to the CURRENT block execution: the nodes that are needed usually are retrieved only once, and visited many times in-memory during the execution of the block.
This way of doing things has a profound impact in performance. For example, RSK implementation is similar (but not so simple, you know, I prefer simplicity), and then the use of this approach (tries growing in-memory, with an additional cache, in RSK case), give us some interesting numbers:
The first column is the number of block. Second and third column are the number of transactions and uncles in the block (these blocks spent more than 6 million gas units each, so, they are filled, with “DeFi-like” transactions). The fourth column is the block processing time in milli-seconds. NOTICE that some blocks (with similar transactions) ARE EXECUTED in 20x/40x times than others. If you have the detailed logs produced by this node, you will realize that the performance is due to the fact that the accounts and storage cells used during the execution of a block ARE ALREADY in memory, in an internal cache. The bottleneck is the storage access.
(The above data was obtained in some internal stress and performance test using production nodes running in testnet; they are many of this tests, run during the last year, that shows a similar pattern).
In RSK implementation, there is a cache in memory for key-values, so the values are not retrieved EVERY time from storage (in RSK, from leveldb). To show the impact of the retrieve of values from storage, look this log lines:
2021-05-07-23:31:12.253 TRACE [datasourcewithcache] Activity: No. Gets: 968. No. Puts: 469. No. Gets from Store: 966
2021-05-07-23:31:32.838 TRACE [datasourcewithcache] Activity: No. Gets: 724. No. Puts: 384. No. Gets from Store: 426
2021-05-07-23:32:19.899 TRACE [datasourcewithcache] Activity: No. Gets: 1789. No. Puts: 858. No. Gets from Store: 1199
2021-05-07-23:32:45.941 TRACE [datasourcewithcache] Activity: No. Gets: 1418. No. Puts: 649. No. Gets from Store: 757
2021-05-07-23:33:11.732 TRACE [datasourcewithcache] Activity: No. Gets: 1629. No. Puts: 659. No
. Gets from Store: 740
Each line shows some results of the execution of A BLOCK. Notice that the time internval between the first and second line IS 20 seconds aprox. And that the execution of the second blocks invoved the retrieve of 426 values FROM LEVELDB (missing from cache).
Look now at these log files, when the cache is more populated with the most used data, some time after the above blocks execution:
2021-05-08-00:11:13.557 TRACE [datasourcewithcache] Activity: No. Gets: 1438. No. Puts: 602. No. Gets from Store: 10
2021-05-08-00:11:15.201 TRACE [datasourcewithcache] Activity: No. Gets: 1800. No. Puts: 789. No. Gets from Store: 46
2021-05-08-00:11:15.881 TRACE [datasourcewithcache] Activity: No. Gets: 1393. No. Puts: 639. No. Gets from Store: 12
2021-05-08-00:11:17.476 TRACE [datasourcewithcache] Activity: No. Gets: 1385. No. Puts: 654. No. Gets from Store: 42
2021-05-08-00:11:18.626 TRACE [datasourcewithcache] Activity: No. Gets: 962. No. Puts: 462. No. Gets from Store: 28
Now, the second block (in this sequence) processing time was 2 seconds, and the number of missing values was only 46. And look at the third block: execution of less than 0.7 seconds, only 12 missed values.
The transactions are similar in the above blocks: simple invocations to DeFi-like smart contracts. But the influence of data cache is important to performance.
Separation of Concern
To me, RSK code trie and execution code is not “the best”. There are some current issues to improve and fix trie usage (see https://github.com/rsksmart/rskj/issues/1510 and https://github.com/rsksmart/rskj/issues/1511). But also the use of the retrieved value is executed by not so simple code.
More than a year, I proposed a pull request to really start separate the use of the values in block execution (using internal key-value caches) from the trie storages: read Decoupling Storage from Execution in RSK Blockchain. In this way, the internal representation follows the usages: key-values, instead of trie access. It could be a “baby step” to have in the future a journal of key values. According to my pull request comments the performance should be also improved: less convoluted code to retrieve values from trie path, and a more direct key-value access to emergent caches adapted to the current execution pattern (the most used contracts already in-memory).
Dynamic World State
Some years ago, I wrote (in an Spanish post): “the disk is the modern tape”, pointing to have all in-memory to improve performance execution of modern systems. The use of all the above approach (in-memory trie, trie nodes in cache, separation of trie path-value from key-value), allows the emergence of a dynamic world state in memory ADAPTED to the current blocks under execution (the most used values are already in memory), an approach that have its impact, as the numbers from stress and performance test in testnet shown. I think it could even solve more of the problems described in Dodging a bullet: Ethereum State Problems.
Angel “Java” Lopez