Maybe I’m misunderstanding something, but it seems that fast catchup essentially makes Algorand a centralized blockchain, because state snapshots are saved on Algorand’s servers…
Ahh…great point @aybehrouz.
But no, it’s not - and I’ll explain:
Each change to the account database (whether locally on a non-relay, or on any of the relays) is recorded, check-summed, etc. That allows all nodes that have the CatchpointTracking option enabled to report the expected catchpoint label for any catchpoint round.
All archival servers (which include relays) generate catchpoint files by default. If you run an archival node locally, you’ll see the catchpoint files created in the genesis directory.
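As a rough illustration of the idea above - purely a toy sketch, not the node’s real algorithm, which derives the digest incrementally from the account database - a catchpoint label can be thought of as a checksum of the full account state at a given round:

```python
import hashlib

def toy_catchpoint_label(round_num: int, balances: dict[str, int]) -> str:
    """Toy stand-in for a catchpoint label: a digest of the account state.

    The real node maintains this digest as the account database changes;
    hashing a sorted dump of balances here is just for illustration.
    """
    h = hashlib.sha256()
    for addr in sorted(balances):          # fixed order => deterministic digest
        h.update(addr.encode())
        h.update(balances[addr].to_bytes(8, "big"))
    return f"{round_num}#{h.hexdigest().upper()}"
```

The point is that any node tracking the same state computes the same label, so labels reported by independent nodes can be compared against one another.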
If you want to confirm that, you can run two nodes on your computer: configure the first one as a relay, and have the second one “point” to the first. You’ll be able to fast catchup the second node from the first node.
One thing that you’re correct about is that currently there is no way to “import” a catchpoint file directly into a data directory. But that doesn’t make it centralized. I think that a step prior to that would be the creation of a public archive of catchpoint files.
In addition to what I wrote above - the reason the catchpoint labels aren’t generated by default on non-relays is performance: non-relays (in some cases) have lower machine specs. For these, it’s important to consume as little disk/CPU/memory as possible.
At this time, catchpoint files cannot be generated on non-relays. This is a limitation that might be removed in the future (along with some code changes to support that). So far, I have not seen much demand for that on this forum.
The digest of these snapshot files is stored on Algorand’s servers:
Therefore, when a node syncs itself using fast catchup, it will see Algorand’s state data (including account balances) as the Foundation wants. This essentially makes Algorand a centralized blockchain. If someone corrupts these snapshot files and their digest on Algorand’s servers, some nodes will start to have corrupted state data, and likely no one will notice it. After some time, as the number of corrupted nodes increases, the corrupted state will dominate the valid state that the archival nodes have, because these nodes are participating in the consensus protocol.
Even if you produce this digest on all nodes, or even if you include it in the blockchain, nodes that use fast catchup are still vulnerable to several phishing-like attacks. The only way you can protect nodes against phishing attacks is to save those digests on trusted HTTPS servers, which essentially makes things centralized.
But you missed a point - the node would never go to the above location to get the catchpoint label on its own.
Algorand currently publishes the latest.catchpoint to help the community.
If the service that maintains these catchpoint labels were to stop working, the Algorand network would still work perfectly fine. There is no component in the system that uses fast catchup automatically.
At the moment, Algorand waits for additional individuals (like yourself!) to run their own node and create a nice user interface with that information. Would you like to volunteer and run this service?
( I would envision a crypto exchange letting their customers know what the latest catchpoint label is, for instance )
Last - using fast catchup requires the user to have a trusted catchpoint label. This was said over and over: if you don’t trust the source from which you received the catchpoint label, then please don’t use it.
btw - please don’t confuse catchpoint files with catchpoint labels. A catchpoint file contains the entire accounts database. A catchpoint label is just the “round#hash” string.
Note that by default the node syncs from the start and does not use fast catchup.
You need to manually get the catchpoint label and call goal node catchup to use fast catchup.
Anyone can publish those catchpoint labels, and you can check that multiple independent copies are equal before using them.
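A minimal sketch of that cross-check (the `round#digest` label shape and the helper names here are assumptions for illustration, not the node’s actual validation code):

```python
import re

# Assumed label shape: "<round>#<base32 digest>", e.g. "4420000#Q7T2...".
LABEL_RE = re.compile(r"^(\d+)#([A-Z2-7]+)$")

def parse_label(label: str) -> tuple[int, str]:
    """Split a catchpoint label into (round, digest); reject malformed input."""
    m = LABEL_RE.match(label)
    if not m:
        raise ValueError(f"malformed catchpoint label: {label!r}")
    return int(m.group(1)), m.group(2)

def agreed_label(labels: list[str]) -> str:
    """Return the label only if every independent source reports the same one."""
    unique = {parse_label(l) for l in labels}  # parse to validate each copy
    if len(unique) != 1:
        raise ValueError(f"sources disagree: {sorted(unique)}")
    return labels[0]
```

Only once every source you consulted agrees would you feed the label to goal node catchup.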
As I mentioned in my post, one of the problems with this approach is that because you don’t save this digest (a cryptographic hash) in the block headers, a node that uses a corrupted snapshot cannot detect that its database is corrupted. It will work like a normal node. So if an attacker finds a way to make people use his corrupted snapshot, he can build a hidden community of corrupted nodes over time. Once he has infected enough nodes, his corrupted state, in which he is likely a billionaire, would dominate the real state, because his nodes are voting in the consensus.
That’s why I call this approach centralized… we have a single point of failure.
I didn’t see any warnings about this feature in the documents, like “please don’t use this if you have time for a normal sync” or “make sure you have the correct hash”. People are lazy: if they can sync their node in 5 minutes instead of 4 days, they will do it in 5 minutes. And I can also guess that for a node that has been offline for more than a week, fast catchup would be faster than a normal re-sync with the network.
At least adding the digest of the snapshots (or, as you call it, the label) to the blockchain’s block headers could improve security.
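A toy sketch of what that proposal could look like - the field names are invented for illustration, and this is the poster’s suggestion, not Algorand’s actual block format:

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class ToyHeader:
    round_num: int
    prev_header_hash: str   # chains the blocks together, as today
    state_digest: str       # proposed addition: digest of the state snapshot

    def digest(self) -> str:
        """Hash of this header, to be referenced by the next block."""
        payload = f"{self.round_num}|{self.prev_header_hash}|{self.state_digest}"
        return hashlib.sha256(payload.encode()).hexdigest()
```

With the state digest committed in the header, a node that fast-caught-up could compare its snapshot’s digest against the value agreed on by consensus, instead of trusting an out-of-band label.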
Anyway, in my opinion, this method for syncing nodes is not compatible with the blockchain philosophy.
Is this true? I thought the latest block contains the hash of the previous one? From there it’s possible to detect that the catchup data was altered, right? What is the worst-case scenario here? Even if he changed historical balances to own billions of Algos, he won’t be able to spend them, ya?
No, it’s not possible.
But as long as you retrieve the catchpoint label from a trusted source, like Algorand’s HTTPS servers, your catchpoint file is, with very high probability, not altered. The validity of a catchpoint file can be verified using catchpoint labels.
Let’s say a lot of nodes are “corrupted” due to this, and the person who corrupted them changes his historical balances to own billions of Algos. Will he be able to spend them? If he is able to spend them, then due to the unforkable nature of Algorand, the whole chain loses its integrity, right?
The possibility is very, very low, right? I feel it is very unlikely that this can happen.
If the corrupted nodes own more than a certain threshold of all participating Algos, the attacker would be able to spend that money.
And yes, the probability of successfully carrying out this attack is really, really low.
If you’re going to imply a concrete vulnerability, you’ll need to explicitly specify how to get there. A core assumption in security is that you don’t want to get compromised to start with. Once your system is compromised, all bets are off.
When dealing with a networked database (i.e. a blockchain), you can’t trust anything received from the network, and you have to verify everything. However, the above assumption still holds - once you’ve verified the correctness of something, it can be assumed to hold true forever.
Therefore, if a local node has a different “view” of the account balances, it might work correctly for a while… but at some point, it’s likely to start failing. (There are several other cryptographic elements that depend on account balances; messing with these is likely to fail.)
At this time, there is no continuous integrity checking for the entire accounts database - that would not be feasible. However, if you have concerns that your account database doesn’t have the correct values, you’re welcome to compare your locally generated catchpoint labels against a public source that you trust.
( I’m saying “public source that you trust”, since it doesn’t have to be Algorand. In fact, I would prefer a non-Algorand source )
As a database administrator for a bank, I don’t understand why the community would risk this integrity vulnerability with fast catchup. All nodes should check all blocks from the beginning; if not, sorry, you can’t be a node. We are too used to instant gratification in this society, but as currency is the backbone of the planet, there should be no shortcuts. If they have to wait 52 days, then good - there are already plenty of nodes completely caught up. It separates real people from people who are just trying to stand up a quick node to trick the system.
“At this time, there is no continuous integrity checking for the entire accounts database.” There should be. Integrity checking is the backbone of a blockchain. A real DBA is primarily concerned with integrity and backups (what the programmers don’t care about). You got rid of the need for backups thanks to the number of replicated nodes online; however, integrity checking can never go away!
The problem is that the above 52 days are going to get longer and longer. At some point, it would become impractical to catch up using sequential block validation.
Using fast catchup is like downloading a gzip file along with an md5 checksum. If the checksum you received is correct, you can verify the validity of the gzip.
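In code, that verification step looks roughly like this - a generic checksum check of a downloaded file, not the node’s actual catchpoint validation:

```python
import hashlib

def verify_download(path: str, expected_hex: str, algo: str = "md5") -> bool:
    """Recompute the file's digest and compare it to the published checksum."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large downloads don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex
```

The small digest is the only thing you need from a trusted source; the integrity of the arbitrarily large download then follows from the comparison.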
Trusting the catchpoint labels isn’t that different from trusting the genesis file. In both cases, you need to get it from a trusted source. Once you do, the starting point for your node is secured.
Great explanation, I second the need for better integrity checking on Algorand.
The difference is that the genesis file can be hardcoded into the algo code, while catchup is always this dynamic vulnerability. Maybe if the catchup file were on GitHub it would be OK, but the safest option is to hardcode it into the algorithm. I don’t really trust git-anything. Actually, that’s a cool use case for the blockchain: to save a copy of its own code inside itself, and get rid of any git vulnerabilities. Am I being overly paranoid here or what?
I think that moving the genesis.json into the algod binary would not make it any better from a security perspective, since one would question the validity of the algod binary instead of just questioning the validity of the algod binary + genesis.json; i.e. the “problem” of trust is not solved. If Algorand is to come up with a solution, it should (ideally) answer both: you would be able to retrieve a binary, a genesis file and a catchpoint in a way that you won’t need to question their validity.
This issue sounds ( to me ) like it’s not really a blockchain issue, don’t you agree ?
Btw - regarding your source-code-on-blockchain idea - while it’s definitely doable, blockchains wouldn’t perform optimally storing large quantities of data. I would suggest storing the hash of a gzip file that contains all the source code, along with a release URL pointing to GitHub. That would allow you to download the code and verify that it is what it was supposed to be.
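A sketch of such a release record - the field layout is invented; only the hash-plus-pointer idea comes from the suggestion above:

```python
import hashlib
import json

def release_record(tarball: bytes, release_url: str) -> str:
    """Build the small payload one would store on-chain: a digest plus a
    pointer to where the (large) source archive actually lives."""
    return json.dumps({
        "sha256": hashlib.sha256(tarball).hexdigest(),
        "url": release_url,
    }, sort_keys=True)

def verify_release(tarball: bytes, record: str) -> bool:
    """Re-hash the downloaded archive and compare against the stored record."""
    return hashlib.sha256(tarball).hexdigest() == json.loads(record)["sha256"]
```

The chain only commits to a few dozen bytes, yet anyone can later fetch the archive from the URL and prove it matches the release.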
Great idea just storing the hash of the code rather than the entire source code in the blockchain. Long-term a decentralized file protocol like filecoin built into algo would be cool though.
As far as moving genesis.json into algod goes, it’s funny you see it that way. I would think it is easier to trust a single file rather than two files; even if they are separate, you would have to question the validity of both. But I get what you are saying - either way it’s almost the same thing, and not much is gained.
“This issue sounds (to me) like it’s not really a blockchain issue, don’t you agree?” How about using zk-rollups instead of catchpoints? It seems to me that catchpoints are the programmatic solution, whereas zk-rollups would be the cryptographic solution. Is it too early for zk-rollups?
This is not true. The genesis is a block, not a state snapshot. So when you alter the genesis block, you have to make a whole new blockchain from the start: a node with an invalid genesis cannot work in the same blockchain as other nodes. The same is not true about state snapshots - unless, as I suggested before, you save snapshot hashes in the block headers (and you don’t need to have those hashes in every block).
I think the ultimate solution is to have a ZK-EDB for the persistence layer. The commitment of the DB is contained in the block headers, and other nodes do not save the entire state data (i.e. account balances). They just get the needed information and its proof from the available ZK-EDBs, verify the new block using that information, and update the commitment when needed.
This is a good idea, though I don’t like the idea of having a GitHub link in the blockchain. Why should we give GitHub such big credibility?
You should note that, currently, the integrity of a node’s software is a big security issue in Algorand. This issue exists in other blockchains too, but it’s not as severe as in Algorand. For example, in Bitcoin, if a lot of nodes have invalid software, then as long as the miners and block-explorer services have the right software, the blockchain will remain valid and will work fine. But in Algorand, if many nodes have altered software, the whole blockchain will be compromised, because all nodes participate in the consensus.
Malware attacks also impose a serious security risk for Algorand. Again, Bitcoin does not have that problem, because miners run secured servers that are not easy to infect; the same is not true of the ordinary people who run Algorand’s nodes. Infecting them is easier, and if someone infects them with malware, he will be able to control the whole Algorand blockchain.