Memory Utilisation - accounts processed

I’ve been looking at algod’s memory utilisation on version 2.1.3.stable, since my node became very sluggish while running mainnet.

I started a new mainnet participation node and observed its memory utilisation. After it had processed a few thousand blocks of normal catchup, all looked good. However, when I started a Fast Catchup it quickly began to consume memory.

Before it got close to using all available RAM I aborted the Fast Catchup, and the node continued to process blocks as before. At this point the algod process did not free the memory it had grabbed for the Fast Catchup.

I gracefully stopped the node and, as expected, the algod process exited and freed the memory.

I started the mainnet node again and it continued to process blocks in order to catch up but did not grab anywhere near as much memory as it had when it was in Fast Catchup mode.

If I leave the node in Fast Catchup mode long enough, it consumes all available memory and thrashes the swap partition, and the Fast Catchup eventually aborts due to timeouts.

This post is not about the timeouts (I’ll look at those separately) but about memory utilisation and the number of “accounts”. During Fast Catchup, goal node status shows 4313466 total accounts. As the “Catchpoint accounts processed” number increases, so does the memory utilisation.

$ goal node status -d /var/lib/algorand -w 1000
Last committed block: 14465
Sync Time: 7.5s
Catchpoint: 8560000#JXTHWW5BRW7MDSPVHITBLYORCNUDTHIXZRIYOZYKHDHGU2WUXLZQ
Catchpoint total accounts: 4313466
Catchpoint accounts processed: 20480
Genesis ID: mainnet-v1.0
Genesis hash: wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=
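For anyone wanting to track this themselves, the status output above can be scraped with standard shell tools. This is just a sketch, not an official tool: the $status variable below holds the snapshot shown above; on a live node you would pipe the output of goal node status -d /var/lib/algorand instead.

```shell
# Extract the "accounts processed" count from goal's status output so
# catchup progress can be logged alongside algod's memory usage.
status='Catchpoint total accounts: 4313466
Catchpoint accounts processed: 20480'
processed=$(printf '%s\n' "$status" | awk -F': *' '/accounts processed/ {print $2}')
echo "processed=$processed"
# To pair it with memory on a live node, something like:
#   ps -o rss= -C algod
# gives algod's resident set size in KiB.
```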

I’ve been trying out Fast Catchup on the testnet but have not seen it consume anywhere near as much memory.

It seems probable that mainnet has many more accounts than testnet.

I was testing a mainnet archival node a few days ago without Fast Catchup, and by the time it was within a few hundred thousand blocks of being caught up it had consumed all available memory. This suggests that the amount of memory consumed by algod is proportional to the number of accounts on the network.

If the number of accounts on mainnet continues to grow, we might expect RAM utilisation to continue to increase across the whole installed node base. If unchecked, this could have a significant effect on the performance of the network.
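To illustrate why that proportionality would matter, here is a back-of-envelope estimate. The per-account byte cost below is purely illustrative (I have not measured algod's actual per-account overhead); only the account count comes from the status output above.

```shell
# Hypothetical scaling estimate: if algod held k bytes of in-memory
# state per account during Fast Catchup, total memory would be roughly
# k * total_accounts. k=1024 is a made-up figure for illustration.
total_accounts=4313466   # from "Catchpoint total accounts" above
k_bytes=1024             # illustrative assumption, not measured
awk -v n="$total_accounts" -v k="$k_bytes" \
  'BEGIN { printf "~%.1f GiB\n", n * k / (1024^3) }'
```

Even a modest per-account overhead lands in the gigabytes at 4.3 million accounts, which is consistent with the behaviour I’m seeing.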

Please tell me I’m wrong and that I’m missing something straightforward (like a configuration parameter to limit the DB’s ability to consume memory).

At this time there is no configuration parameter you could effectively use to lower the memory requirements of this feature. We’re looking into a solution for this issue.

Thanks - you’ll perhaps have already seen my observation in my “testnet fast catchup” post.