TestNet stalled?

Hi,

It looks like there’s a problem with TestNet. The last block is 429967, and it has been stalled for the last 15 minutes or so (Time since last block: 1093.7s). Is a network reset in progress, or is it something else? Thanks in advance!

Regards,
Ivica

OK, it seems that it started again after more than an hour. Thanks for fixing!

Regards,
Ivica

Thanks for checking on this, Ivica, and I’m sorry nobody responded sooner. There have been two stalls since the last update on Dec 12. Both were caused by one of the VMs running out of disk space (old builds and databases had not been cleared out), and both were corrected once the stall was detected. The network recovered nicely, as expected.

Thanks again for reporting this and for participating in TestNet!

-david

Hey David!

First of all, thanks for your response, I really appreciate it! I have a couple of questions about TestNet and this incident…

Network stalls are not a big deal; after all, I guess test networks exist to surface all possible problems before main net. What I don’t understand is how one node going down can cause the whole network to stop. I see that there are about 20 nodes in the network, but most of those are probably just “nodes” that sync the blockchain from relayers and do not participate in consensus?

Regarding that, is there a way to make my node participate in consensus (get elected)? It’s been up for a while and I have never received any block rewards. I couldn’t find anything in the available docs on how to stake some algos and run my node as a validator. Is that even possible right now?

And yes, what does account “online/offline” status mean? I’ve generated a participation key and made one of the accounts online, but I’m not sure whether that is enough for my node to be elected, or whether it has nothing to do with it.

Looking forward to your answers, and thanks in advance!!!

Regards,
Ivica

Thanks for the follow-up questions, Ivica -

TestNet currently has its initial stake divided up into 21 wallets. The goal was to disperse the stake sufficiently across nodes that we could reasonably ensure would stay online, to avoid a single point of failure. This goal was compromised out of necessity, which resulted in a slightly-too-large stake sitting on a node that I directly control.

In the coming weeks, we’ll be releasing a new build and increasing the distribution across more nodes, so this single-point-of-failure will be eliminated.

is there a way to make my node participate in consensus

Yes, by having more stake you will be selected more often. With 1% of the stake you can expect to be selected about 1% of the time. If you are interested in having a larger stake and expect to have your node generally online, send me your public key (account ID) and I’ll send you a big enough chunk to be selected at least once an hour.
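To put rough numbers on "selected about 1% of the time", here is a minimal Python sketch. It assumes a simplified model in which the chance of being selected in any round equals your fraction of the online stake; it is not the actual selection code. The round time used below is a hypothetical placeholder, not a TestNet parameter.

```python
def expected_selections(stake_fraction: float, rounds: int) -> float:
    """Expected number of rounds in which the account is selected,
    assuming the per-round chance equals the stake fraction."""
    if not 0.0 <= stake_fraction <= 1.0:
        raise ValueError("stake_fraction must be between 0 and 1")
    return stake_fraction * rounds


def chance_of_at_least_one(stake_fraction: float, rounds: int) -> float:
    """Probability of being selected at least once in `rounds` rounds,
    treating rounds as independent (a simplification)."""
    if not 0.0 <= stake_fraction <= 1.0:
        raise ValueError("stake_fraction must be between 0 and 1")
    return 1.0 - (1.0 - stake_fraction) ** rounds


def stake_fraction_for(target_selections: float, rounds: int) -> float:
    """Stake fraction needed to expect `target_selections` in `rounds` rounds."""
    return target_selections / rounds


if __name__ == "__main__":
    ASSUMED_SECONDS_PER_ROUND = 20.0  # hypothetical value, for illustration only
    rounds_per_hour = int(3600 / ASSUMED_SECONDS_PER_ROUND)

    print("expected selections per hour at 1% stake:",
          expected_selections(0.01, rounds_per_hour))
    print("stake fraction for one expected selection per hour:",
          stake_fraction_for(1.0, rounds_per_hour))
    print("chance of at least one selection per hour at 1% stake:",
          chance_of_at_least_one(0.01, rounds_per_hour))
```

Under this model, a very small stake fraction simply means a long expected wait between selections, which matches the "you just need more stake" advice.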

And yes, what does account “online/offline” status mean

Online means your account will be considered as available to contribute toward consensus - your stake is counted as part of the entire voting stake and will be included when selection is done for proposers/committees. To actually participate, your account needs to be running on a node connected to the network, and you need to have a participation key that is valid. Offline means you don’t intend to host your account on a connected node and it shouldn’t be counted as part of the voting stake.
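The distinction between "counted as online" and "actually able to participate" can be sketched as follows. This is an illustrative Python model only: the `Account` class and its field names are hypothetical and do not correspond to the node software's data structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    stake: int             # balance held by the account
    online: bool           # account has been marked online
    key_first_round: int   # first round covered by the participation key
    key_last_round: int    # last round covered by the participation key
    node_connected: bool   # account is hosted on a node connected to the network


def counts_toward_voting_stake(account: Account) -> bool:
    """An online account's stake is counted in the voting stake,
    whether or not its node is actually running."""
    return account.online


def can_participate(account: Account, current_round: int) -> bool:
    """Actually participating needs all three: online status,
    a connected node, and a participation key valid for this round."""
    key_valid = account.key_first_round <= current_round <= account.key_last_round
    return account.online and account.node_connected and key_valid


def voting_stake(accounts: List[Account]) -> int:
    """Total stake of all accounts marked online."""
    return sum(a.stake for a in accounts if counts_toward_voting_stake(a))


if __name__ == "__main__":
    accounts = [
        Account(100, True, 0, 1000, True),    # online and participating
        Account(300, True, 0, 1000, False),   # online, but node is not connected
        Account(600, False, 0, 0, False),     # offline: not counted at all
    ]
    print("voting stake:", voting_stake(accounts))
    print("able to participate at round 500:",
          [can_participate(a, 500) for a in accounts])
```

The second account is the problematic case: its stake is counted in the voting stake, but it cannot contribute when selected.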

Note that Rewards are currently disabled and the implementation is being finalized, so there are no ‘block rewards’ right now. You’ll need to check the blockchain history to know if you proposed any winning blocks.

If you marked an account online with valid participation keys, then you just need more stake to be selected in a reasonable amount of time.

Hope this helps.

-david

There is a single point of failure because one (or more?) node has too much stake? What if this happens in the real world / mainnet?

Besides, how can a decentralized, scalable, secure infrastructure have a single point of failure?

This happened because we haven’t built out our TestNet network enough to spread the stake out, and circumstances led me to put too much stake on one node. In ‘the real world’ we will not have significant stake on a single node. Our stake will be widely distributed to ensure there is no single point of failure, and we’ll depend on sufficient distribution of the bulk of the stake to keep the network running.

Our infrastructure has a single point of failure while we’re still in development. TestNet is not representative of the MainNet infrastructure, at least not yet.

In the coming days and weeks we will be building out the network; it may even be entirely replaced with a new topology and stake distribution.

“too much stake on one node”

  • was this more than 1/3rd of the total stake?

It was enough of the total online stake to stall the network when combined with the other nodes that held online stake but weren’t running at the time.
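A rough way to picture this in Python: stake that is marked online but hosted on nodes that are not running still counts toward the total, so the share that can actually respond shrinks. The threshold below (`required_fraction`) is a placeholder assumption, since the exact figure is not stated in this thread, and the stake numbers are made up for illustration.

```python
from typing import List, Tuple

# Each entry is (online_stake, node_is_running). Values are illustrative only.
Node = Tuple[int, bool]


def reachable_fraction(nodes: List[Node]) -> float:
    """Fraction of the total online stake hosted on nodes that are running."""
    total = sum(stake for stake, _ in nodes)
    if total == 0:
        return 0.0
    running = sum(stake for stake, is_running in nodes if is_running)
    return running / total


def would_stall(nodes: List[Node], required_fraction: float) -> bool:
    """True if the running nodes hold less online stake than the
    (assumed) fraction needed to keep making progress."""
    return reachable_fraction(nodes) < required_fraction


if __name__ == "__main__":
    ASSUMED_REQUIRED_FRACTION = 0.7  # hypothetical threshold, not a protocol constant

    nodes = [
        (30, False),  # large-stake node that ran out of disk
        (10, False),  # online stake on a node that was not running
        (10, False),  # another idle node with online stake
        (50, True),   # everything else, still running
    ]
    print("reachable fraction:", reachable_fraction(nodes))
    print("would stall:", would_stall(nodes, ASSUMED_REQUIRED_FRACTION))
```

In this toy example the large node alone holds well under half of the online stake, yet losing it is enough to stall because other online stake was already idle.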

Thanks @David

Unfortunately, this is a bug in Algorand, which I described in this Medium post: https://medium.com/@rajeshbhaskar/algorand-may-be-broken-d1d2c2542064. This bug cannot be fixed within the current design of Algorand and needs a new approach, which I have outlined in the article.

Can you take this up as a bug submission? Do you have a bounty program?

Thanks/Rajesh

I don’t appreciate you misrepresenting a test environment configuration as indicative of, and proof for, some fundamental flaw in our approach. TestNet is not expected to be resilient yet and is not intended to represent even an early incarnation of MainNet. To call the ‘single point of failure’ any sort of proof is ludicrous and not helpful.

Hi @David, I diagnosed this issue with Algorand months ago, and it is not specific to your testnet, so there is no misrepresentation here. I reached out to your team back then as well.

It certainly appears that the issue outlined in that post is cropping up on the testnet. The flaw is clearly with nodes that hold sufficient stake going offline.

Do you believe that this is not a bug with Algorand?

No, this is pie in the sky. 3 is very unlikely. 2 is unlikely because they won’t know who is online if they were to attack, plus you’re assuming everyone in that group knows each other, which is highly unlikely. 8: the attacks are infeasible. For the 1st attack, the attacker would have to know the locations of 990000 users and whether they’re online or not. For the 2nd attack, where would he buy those tokens from if they’ve all been acquired already? Supply is fixed. Even 50 million isn’t enough for an attack: if 990000 have 5% of the stake, that’s 500 million + the 250 million from 200. That’s significantly larger than 50 million USD. All this assumes everyone knows everyone else. Algorand prefers safety over liveness, so if it halts that’s fine, but that’s rare and it recovers quickly from halts. Micali was right: the probability makes it unlikely. And you’re the founder of a chain; I’m sorry for you. At least you got the clicks to shill your chain.

https://link.springer.com/chapter/10.1007/978-3-319-70697-9_14

It’s still a problem, not fixed in most of the blockchains.

Try again. In the white paper, Algorand addressed sleepy consensus. Algorand isn’t sleepy consensus. 2017? Bruh.

Not addressed. Anyway all the best with your investment in ALGO. Logging off.

You can say it, but so far you haven’t given any convincing argument. Btw, aside from addressing sleepy consensus and its shortcomings in the whitepaper, Micali also provides an analysis of nodes being offline and online. As a parting gift: even in cryptography (RSA, quantum security, zk-SNARKs), attacks are possible; it just takes so much to pull them off that we dismiss them on the basis of negligible probability. Have a good one.


An Additional Property/Technique: Lazy Honesty

A honest user follows his prescribed instructions, which include being online and run the protocol. Since, Algorand has only modest computation and communication requirement, being online and running the protocol “in the background” is not a major sacrifice. Of course, a few “absences” among honest players, as those due to sudden loss of connectivity or the need of rebooting, are automatically tolerated (because we can always consider such few players to be temporarily malicious). Let us point out, however, that Algorand can be simply adapted so as to work in a new model, in which honest users to be offline most of the time. Our new model can be informally introduced as follows.

Lazy Honesty. Roughly speaking, a user i is lazy-but-honest if (1) he follows all his prescribed instructions, when he is asked to participate to the protocol, and (2) he is asked to participate to the protocol only rarely, and with a suitable advance notice.

With such a relaxed notion of honesty, we may be even more confident that honest people will be at hand when we need them, and Algorand guarantee that, when this is the case,

The system operates securely even if, at a given point in time, the majority of the participating players are malicious.

- Algorand whitepaper