Archival Node Sync Time

Wondering what other people’s experience has been syncing an archival node.

Hardware specs:
2.2 GHz Quad-Core Intel Core i7, 16 GB RAM, 2TB SSD

Internet speed:
download 150 Mbps, upload 160 Mbps

The first ~6 million blocks were syncing at a rate of 20-50 blocks/second. Since then, the rate has been steadily decreasing and is currently averaging ~5-10 blocks/second past block 9,500,000. At this rate I am looking at 10-14 days to sync the archival node up to the present state of the network.

My theories on why sync rate is slowing down:

  1. More transactions per block toward late 2020 (block 8,000,000 and above), which increases the amount of information that needs to be communicated
  2. My hardware is slow to process the information being sent by relay nodes
  3. Stress on relay nodes from increased network activity, leading to a decreased flow of information to archival nodes that are in the process of syncing

Would like to hear other people’s experiences and theories why syncing is protracted.

Which network? Testnet originally had a ping-pong script holding a steady rate of transactions for the first few million blocks, and then had only developers testing things out (with a few bursts of transactions when folks did performance testing). So syncing starts slow and speeds up.

Mainnet is the opposite, very few transactions to start then ramping up to a steady pace with a few spikes when projects onboard. Syncing starts fast, then slows down.

Hi Tim,

I am syncing mainnet.

That does make sense regarding projects onboarding. More or less my first theory.

What you’ve described sounds pretty much accurate for mainnet.

The first 6M rounds didn’t have much activity going on, and therefore the catchup was pretty fast.
Mainnet started to see steady growth between rounds 6M-8M, and catchup became notably slower starting around round 8M.
Around round 10M (approx.) there was an additional “bump”, which made it even slower.

The main reason it’s getting slower is the number of accounts being updated. To accelerate the catchup, I would try setting “CatchpointTracking” to -1 in the config.json file. This should boost catchup performance a bit. One note: you should not do that if you’re running a relay. Relays need to have this enabled for the network to support catchpoint catchup.
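For reference, a minimal config.json fragment reflecting that suggestion might look like the following sketch (the `Archival` flag is shown only because this thread is about an archival node; merge these keys into your existing config rather than replacing it):

```json
{
  "Archival": true,
  "CatchpointTracking": -1
}
```

The node only reads config.json at startup, so restart it after editing the file for the change to take effect.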

Okay, it is reassuring to hear this is expected behavior. It will be interesting to see how this evolves over time as archival nodes become prohibitively large for individuals to run, and even for enterprises to catch up with for archival purposes.

What does CatchpointTracking mean? Looking at the node configuration documentation, I cannot find this parameter. Furthermore, how does setting it to -1 speed up catchup performance for an archival node?

The documentation sometimes lags behind the code. You can look at go-algorand/config.go at master · algorand/go-algorand · GitHub for the parameter description.

Please let me know if you have any other questions. (I think the in-code documentation answers your question above.)

As a one-liner: setting CatchpointTracking to -1 prevents an archival node from tracking the catchpoint label. Depending on your usage, you might want to restore this once catchup is complete. Disabling this feature reduces the amount of disk I/O being performed. If the bottleneck during catchup on your machine happens to be the disk (which I believe is the case), then setting this to -1 would improve your node’s catchup performance.

Thank you for the link and detailed explanation, I will test this out.

What is the purpose of catchpoint labels? It seems like they are not needed to sync an archival node, but you allude to the fact that I may want to restore catchpoints after catchup is complete. Would like to understand their use beyond using a recent catchpoint to bootstrap a non-archival node.

When running a full catchup (what you’re doing), you’re validating each block against its predecessor. That’s quite a CPU- and disk-intensive process on a long-running network.

When you want to start off with a non-archival node, you don’t really need the entire history. You only need the most recent 1000 blocks as well as a copy of the accounts state.

Catchpoint labels are the hash of that state. They allow you to use catchpoint catchup (sometimes referred to as “fast catchup”), which does the exact opposite: it rebuilds the hash from the data and makes sure it matches your expected hash.
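As a concrete sketch of what that looks like in practice: a catchpoint label combines a round number and the base32-encoded hash of the accounts state at that round. The label below is a made-up placeholder, not a real catchpoint, and `$ALGORAND_DATA` is assumed to point at the node's data directory.

```shell
# A catchpoint label has the form "<round>#<base32 state hash>".
# Placeholder label for illustration only:
CATCHPOINT='14000000#EXAMPLEB32STATEHASHPLACEHOLDERAAAAAAAAAAAAAAAAAAAAAA'

# On a non-archival node, fast catchup would be invoked as:
#   goal node catchup "$CATCHPOINT" -d "$ALGORAND_DATA"

# The two parts of the label:
ROUND="${CATCHPOINT%%#*}"   # the round the state snapshot was taken at
HASH="${CATCHPOINT#*#}"     # the hash of the accounts state at that round
echo "round=$ROUND"
```

The node then downloads the state for that round and recomputes the hash locally, so the label is the only thing you need to trust.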

If you ever need to catch up another node, you don’t need to trust anyone: you can use your own archival node, which has already calculated the “correct” catchpoint label.
