How big can the Algorand Blockchain get before problems start occurring?

I’m quite new to the crypto space and only recently discovered ALGO. I feel like it offers the right incentives for the payment/contract system of an app that I’m working on. My only other experience with DeFi was with Stellar and the more limited SDK that it uses, so that’s the only comparison I can make.

This article, posted 8 months ago, gave a 10-year, 2TB timeline for the blockchain, and the author suggested that this was a large overestimate.

However, according to the current figures (3.5-year period), an indexer node is already at 1.5TB. I’m not sure if a full node needs both indexer and archival node data, but it seems like growth may not be linear.

https://developer.algoscan.app/analytics

I’m curious how big the blockchain can get before its size gets out of hand, i.e. 10TB? 1000TB? 1000000TB? What might be some potential issues with a huge blockchain, and are there any measures in place to address them?

Welcome to Algorand!

Here are some explanations for the discrepancies you noticed:

  • The indexer does not just store raw blocks but also a lot of other data it computes, including indexes that facilitate searching for information in the blockchain. That is why it is much bigger than an archival node: https://developer.algoscan.app/analytics (compare archival node = 820GB, indexer = 1448GB). Future versions of the indexer may store only the data required by a specific dApp and hence significantly reduce the need for storage. (Note that an indexer is not at all required to run an archival node, and many blockchains do not even provide an indexer themselves.)
  • Archival node size is closer to the actual size of the blockchain. It is currently 820GB (see link above). This is much more than estimated in the reddit comment for a couple of reasons:
    • Actual TPS has increased since launch (max TPS has been around 1000 TPS since the start, but there wasn’t much usage at the beginning, so actual TPS was very low). You can see the current TPS here: Algorand Developer Portal
    • The reddit comment assumes that the blockchain is just composed of the transactions. But actually it also contains certificates that take most of the space right now. Certificate size depends on the exact distribution of Algos participating in consensus, but is capped thanks to the sortition mechanism.
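
To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It takes the sizes quoted in this thread (820GB archival, 1448GB indexer) and the 3.5-year chain age mentioned in the question, and computes the historical average growth rate plus a naive linear projection. The function names are made up for illustration, and the linear model is only an assumption: as explained above, growth depends on actual TPS and certificate size, so it is not expected to be linear.

```python
# Figures quoted in this thread (sizes in GB, chain age in years).
ARCHIVAL_GB = 820
INDEXER_GB = 1448
CHAIN_AGE_YEARS = 3.5


def average_growth_gb_per_year(size_gb: float, age_years: float) -> float:
    """Average yearly growth, assuming the size grew from 0 over age_years."""
    return size_gb / age_years


def linear_projection_gb(size_gb: float, age_years: float, target_years: float) -> float:
    """Naive projection: apply the historical average rate up to target_years.

    This is an illustration only; real growth follows usage, not a straight line.
    """
    return average_growth_gb_per_year(size_gb, age_years) * target_years


if __name__ == "__main__":
    for name, size in (("archival", ARCHIVAL_GB), ("indexer", INDEXER_GB)):
        rate = average_growth_gb_per_year(size, CHAIN_AGE_YEARS)
        at_10y = linear_projection_gb(size, CHAIN_AGE_YEARS, 10)
        print(f"{name}: ~{rate:.0f} GB/year, ~{at_10y / 1000:.1f} TB at 10 years")
```

If usage keeps rising, the real numbers would end up above this straight-line estimate, which is why the averages should be read as a lower-bound style sanity check and not as a forecast.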

One important design property of Algorand is this: while the total size of the blockchain necessarily increases as more transactions are made (this is unavoidable and true for all blockchains), the use of a “minimum balance” and a capped circulating supply ensures that the state of the blockchain (that is, the list of the balances of all accounts and the state of smart contracts) is actually bounded. It cannot grow too much. See also Asset Limit Account and Minimum Balance - #2 by fabrice
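
The bound can be illustrated with simple arithmetic. The sketch below assumes a capped supply of 10 billion Algos and a base minimum balance of 0.1 Algo (100,000 microAlgos) per account. Neither value is stated in this thread, so please check them against the current protocol parameters. It ignores the extra minimum balance required for assets and applications, which would only make the bound tighter.

```python
MICROALGOS_PER_ALGO = 1_000_000

# Assumed parameters (not stated in this thread; verify against the protocol).
TOTAL_SUPPLY_ALGOS = 10_000_000_000   # assumed capped supply
MIN_BALANCE_MICROALGOS = 100_000      # assumed base minimum balance (0.1 Algo)


def max_funded_accounts(total_supply_algos: int, min_balance_microalgos: int) -> int:
    """Upper bound on accounts that can hold the minimum balance at the same time."""
    total_microalgos = total_supply_algos * MICROALGOS_PER_ALGO
    return total_microalgos // min_balance_microalgos


if __name__ == "__main__":
    bound = max_funded_accounts(TOTAL_SUPPLY_ALGOS, MIN_BALANCE_MICROALGOS)
    print(f"At most {bound:,} accounts can exist at once under these assumptions")
```

The point is not the exact number but that a finite supply divided by a non-zero minimum balance always gives a finite number of accounts, so the state cannot grow without limit.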

In particular, to participate in consensus, you only need to keep the state and the last 1000 blocks, which currently takes less than 20GB.
So participating in consensus will never be an issue, even on computers with limited disk space.

There is definitely a cost for storage which may be an issue, as well as a cost for bandwidth.
That being said, we can imagine many mechanisms to ensure that most users don’t have to suffer from this (and only the few users that really want to have all the blockchain would have to pay for those costs): for example, we could have nodes and indexers only store the data the user is interested in (specific dApps or assets).
State proofs should also enable ways to catch up to the current state of the blockchain without having to download the full blockchain, as well as ways to easily check whether a transaction happened on chain.

Hi Fabrice, Thanks for the explanation and the links. I have a couple more questions about the chain.

Are there any benefits to running a consensus node/full node alongside a payment service, or would it be good enough for the payment service to just use the main Algorand servers?

That would be really cool and very useful for apps expecting their data to scale slowly (i.e. not FIFA). I was both excited and concerned that 2.4M FIFA players may soon be storing all their info, not to mention pictures, team info, and special edition NFT collectables, on ALGO.

I’d like to be running a node but I am worried about costs and I don’t necessarily want to be storing data/providing the computational resource for big and powerful entities that would consume the bulk of chain transactions. As I currently am not staking ALGO, any node I am running would essentially be charity for those that need it the least.

Is this already implemented or something for the future?

What do you mean by “main Algorand servers”?
You need a node to access the network/API. You may use a free API service for your experiments (see Ecosystem Tools & Projects | Algorand Developer Portal), but you need to either run your own node or pay one of these services for anything in production. Free nodes do not provide an SLA and are rate-limited.
If you don’t need the transaction history, a non-archival node works.
You don’t need the node to be participating in consensus.
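
As a minimal sketch of what "accessing the API" looks like, the snippet below builds a request for the algod REST endpoint `/v2/status`, authenticated with the `X-Algo-API-Token` header, and reads the `last-round` field from the response. It uses only the Python standard library. The base URL and token in the usage example are placeholders, and third-party providers may expect a different header or URL, so check their documentation.

```python
import json
import urllib.request


def build_status_request(base_url: str, api_token: str) -> urllib.request.Request:
    """Build a GET request for algod's /v2/status endpoint (no network access)."""
    url = base_url.rstrip("/") + "/v2/status"
    return urllib.request.Request(
        url, headers={"X-Algo-API-Token": api_token}, method="GET"
    )


def fetch_last_round(base_url: str, api_token: str, timeout: float = 10.0) -> int:
    """Return the latest round known to the node (performs a network call)."""
    request = build_status_request(base_url, api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        status = json.load(response)
    return status["last-round"]


if __name__ == "__main__":
    # Placeholder values: replace with your own node's address and API token.
    print(fetch_last_round("http://localhost:8080", "your-algod-token"))
```

The same call works against a non-archival, non-participating node, which is all a payment service needs if it does not have to look up old transactions.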

It is in progress. The first stage (new post-quantum Falcon keys) is already out: How to Participate in Algorand State Proof Generation (Register State Proof Keys) | Algorand Developer Portal

Coming from Stellar, they have https://horizon.stellar.org/, which is a public-facing API server for its blockchain.

Thanks for that. This looks really interesting.

I see.
Algorand Inc/Foundation itself does not provide such services.
However, partners/third parties provide such services. A list is there:

These API services are free with rate-limiting and no SLA is provided.

This is similar to what Stellar Horizon seems to propose.
I see they also have rate limiting:

And I don’t see any SLA / customer service (but I did not look very long, so I may be wrong).

I’ve never developed on Stellar, but I would guess that if I had to use a Stellar API on a big project, I would want to pay for an API tier with a strong SLA, 24/7 customer support, and less restrictive rate limits.
That is the same thing I strongly recommend on Algorand.
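
For experiments against a free, rate-limited tier, a client usually needs to back off when it is throttled. Here is a small sketch of exponential backoff with jitter. It assumes the service signals throttling with HTTP status 429, which is the standard "Too Many Requests" code but is not confirmed for any specific provider here. `call_with_backoff` and its parameters are hypothetical names, and `fetch` stands for any function you supply that returns a `(status, body)` pair.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch(), retrying with exponential backoff while it returns 429.

    Returns the first non-429 result, or the last 429 result if every
    attempt was throttled.
    """
    status, body = 429, ""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt)
            # Add random jitter so that many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

This only smooths over occasional throttling. It does not replace an SLA, so the recommendation to run your own node or use a paid tier in production still stands.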