How big can the Algorand Blockchain get before problems start occurring?

Welcome to Algorand!

Here are some explanations for the discrepancies you noticed:

  • The indexer does not just store raw blocks; it also stores a lot of derived data, including indexes that make it possible to search the blockchain. That is why it is much bigger than an archival node: https://developer.algoscan.app/analytics (compare archival node = 820GB, indexer = 1448GB). Future versions of the indexer may store only the data required by a specific dApp and hence significantly reduce the storage needed. (Note that an indexer is not required to run an archival node, and many blockchains do not even provide an indexer themselves.)
  • Archival node size is closer to the actual size of the blockchain. It is currently 820GB (see link above). This is much more than estimated in the reddit comment for a couple of reasons:
    • Actual TPS has increased since launch (max TPS has been around 1000 TPS since the start, but there was not much usage at the beginning, so actual TPS was very low). You can see current TPS here: Algorand Developer Portal
    • The reddit comment assumes that the blockchain is composed only of transactions. In fact, it also contains certificates, which take up most of the space right now. Certificate size depends on the exact distribution of Algos participating in consensus, but is capped thanks to the sortition mechanism.
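
To make the two points above concrete, here is a rough sketch of how daily growth splits between transactions and certificates. Everything in it is an assumption for illustration: the function name, the 4-second block time, the average transaction size and the certificate size are placeholders, not measured values.

```python
def daily_growth_bytes(avg_tps, avg_tx_bytes, cert_bytes_per_block, block_time_s):
    """Return (transaction_bytes, certificate_bytes) added to the chain per day.

    All arguments are hypothetical inputs; plug in measured values if you have them.
    """
    seconds_per_day = 24 * 60 * 60
    blocks_per_day = seconds_per_day / block_time_s
    # Transaction data scales with actual usage (TPS).
    tx_bytes = avg_tps * seconds_per_day * avg_tx_bytes
    # One certificate is stored per block, however full the block is.
    cert_bytes = blocks_per_day * cert_bytes_per_block
    return tx_bytes, cert_bytes


# Placeholder numbers, chosen only to show the shape of the calculation.
tx, cert = daily_growth_bytes(avg_tps=10, avg_tx_bytes=200,
                              cert_bytes_per_block=500_000, block_time_s=4)
print(f"transactions: {tx / 1e9:.2f} GB/day, certificates: {cert / 1e9:.2f} GB/day")
```

In this model the certificate term does not depend on TPS, so it dominates when usage is low, and the transaction term takes over as actual TPS grows.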

One important design property of Algorand is that, while the total size of the blockchain necessarily increases as more transactions are made (this is unavoidable and true for all blockchains), the use of a “minimum balance” and a capped circulating supply ensures that the state of the blockchain (that is, the list of all account balances and the state of smart contracts) is actually bounded. It cannot grow too much. See also Asset Limit Account and Minimum Balance - #2 by fabrice
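
As a back-of-the-envelope illustration of why the state is bounded, the sketch below computes the largest number of accounts that could exist at once. It assumes a supply cap of 10 billion Algos and a base minimum balance of 0.1 Algo per account; please check the current protocol parameters before relying on these figures. The function name is hypothetical.

```python
MICROALGOS_PER_ALGO = 1_000_000


def max_funded_accounts(total_supply_microalgos, min_balance_microalgos):
    """Upper bound on accounts that can hold the minimum balance at the same time."""
    return total_supply_microalgos // min_balance_microalgos


# Assumed parameters (not taken from this thread): 10 billion Algos, 0.1 Algo minimum.
supply = 10_000_000_000 * MICROALGOS_PER_ALGO
min_balance = 100_000  # 0.1 Algo expressed in microAlgos

print(max_funded_accounts(supply, min_balance))
```

The bound is very loose in practice, since it would require every Algo in existence to sit in accounts holding exactly the minimum, but it shows that the number of accounts (and hence the state) cannot grow without limit.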

In particular, to participate in consensus, you only need to keep the state and the last 1000 blocks, which currently take less than 20GB.
So participating in consensus will never be an issue, even on a computer with limited disk space.

There is definitely a cost for storage, which may be an issue, as well as a cost for bandwidth.
That being said, we can imagine many mechanisms to ensure that most users don’t have to bear these costs (only the few users who really want the whole blockchain would have to pay for them): for example, we could have nodes and indexers store only the data the user is interested in (specific dApps or assets).
State proofs should also enable ways to catch up to the current state of the blockchain without having to download the full blockchain, as well as easy ways to check whether a transaction happened on chain.