Our node was seeing memory issues, so we migrated to a new machine, installed 2.1.3, and tried Fast Catchup. The node is non-archival. The catchup process consumes a large amount of memory (over 7GB on our machine) and fails after downloading the Catchpoint blocks.
Here is a snapshot of the catchup in flight:
$ algod -v
8590000131
2.1.3.stable [rel/stable] (commit #30c8dd68)
$ cat /var/lib/algorand/config.json
{
"Version": 11,
"EnableMetricReporting": true
}
$ goal node catchup 8550000#MYSKTQ7KYLYLPOU275ZUBPMOBAUMI7OYZY2AZSSAJGIQQIWO4OAQ -d /var/lib/algorand
$ goal node status -w 1000
Last committed block: 11130
Sync Time: 2883.3s
Catchpoint: 8550000#MYSKTQ7KYLYLPOU275ZUBPMOBAUMI7OYZY2AZSSAJGIQQIWO4OAQ
Catchpoint total accounts: 4174171
Catchpoint accounts processed: 4174171
Catchpoint total blocks: 1000
Catchpoint downloaded blocks: 653
Genesis ID: mainnet-v1.0
Genesis hash: wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=
We don’t see any obvious errors when searching through the logs, and the node appears to fall back gracefully to regular sync (evident from the different CPU and memory profile). We’ve already re-imaged the machine and restarted the process 4 times, increasing resources each time.
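In case it helps anyone reproduce this, here is a minimal sketch of how the status output above could be turned into progress numbers, so the point where catchup stops can be logged over time. It only parses the text fields shown in the snapshot. The function names `parse_status` and `catchup_progress` are hypothetical helpers, not part of any Algorand tooling, and treating a missing "Catchpoint" field as "no longer in catchup" is an assumption on our side.

```python
# Sketch only: parses the plain-text output of `goal node status`
# as pasted above. Helper names are hypothetical, not an Algorand API.

SAMPLE = """Last committed block: 11130
Sync Time: 2883.3s
Catchpoint: 8550000#MYSKTQ7KYLYLPOU275ZUBPMOBAUMI7OYZY2AZSSAJGIQQIWO4OAQ
Catchpoint total accounts: 4174171
Catchpoint accounts processed: 4174171
Catchpoint total blocks: 1000
Catchpoint downloaded blocks: 653
Genesis ID: mainnet-v1.0
"""


def parse_status(text):
    """Split each 'Key: value' line into a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def catchup_progress(fields):
    """Return completion fractions for accounts and blocks, or None if absent."""

    def ratio(done_key, total_key):
        if done_key not in fields or total_key not in fields:
            return None
        total = int(fields[total_key])
        return int(fields[done_key]) / total if total else None

    return {
        # Assumption: the "Catchpoint" line is only present during catchup.
        "in_catchup": "Catchpoint" in fields,
        "accounts": ratio("Catchpoint accounts processed",
                          "Catchpoint total accounts"),
        "blocks": ratio("Catchpoint downloaded blocks",
                        "Catchpoint total blocks"),
    }


if __name__ == "__main__":
    print(catchup_progress(parse_status(SAMPLE)))
```

Running something like this against each status refresh and keeping the last values seen before `in_catchup` turns false would show how far the block download got before the fallback.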
Any insights or assistance would be appreciated.
-jimmy