Brief Randomness Beacon availability incident

Earlier today we had a brief availability incident with the Randomness Beacon we run at the Algorand Foundation.

The backend service that submits VRF proofs was unavailable for about 23 minutes. Once the service was restarted, it caught up on the missed proofs within 11 minutes.

Data here:

Background

This service provides on-chain randomness for smart contracts to use, for example in lotteries. It was brought in-house in 2024 as a cost-saving measure.

It is generally very reliable, and we have several precautions against something like this happening, but today we hit an edge case.

Cause

We submit these proofs through 3 threads of a single process, via 3 independent node providers. A configuration change caused a fatal error before the independent threads were spawned, so the whole service halted.

Remediation

We will remediate this at the code level by adding extra guards to the process bootstrap, to avoid a repeat of this failure.
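For illustration only, here is a minimal Python sketch of what a bootstrap guard of this kind could look like. It is not the Foundation's actual code: the provider names, `optional_step`, `submit_loop`, and `bootstrap` are all hypothetical, and the worker is a placeholder that does not talk to any node. The idea is simply that a failure in a non-essential setup step is logged rather than allowed to stop the submitter threads from spawning.

```python
import logging
import threading

log = logging.getLogger("beacon")

# Hypothetical provider names; the real providers are not named in this post.
PROVIDERS = ["provider-a", "provider-b", "provider-c"]


def optional_step(name, fn):
    """Run a non-essential bootstrap step; log and continue if it fails."""
    try:
        fn()
        return True
    except Exception:
        log.exception("optional bootstrap step %r failed; continuing", name)
        return False


def submit_loop(provider, stop, started):
    """Placeholder worker: a real one would submit VRF proofs via `provider`."""
    started.append(provider)
    stop.wait()  # stand-in for the submit loop; returns once `stop` is set


def bootstrap(setup_observability):
    """Spawn one submitter thread per provider, even if optional setup fails."""
    stop = threading.Event()
    started = []
    # Guard: observability setup must never prevent the workers from starting.
    optional_step("observability", setup_observability)
    workers = [
        threading.Thread(target=submit_loop, args=(p, stop, started), daemon=True)
        for p in PROVIDERS
    ]
    for w in workers:
        w.start()
    return workers, stop, started
```

With a guard like this, a bad observability setting would show up as a logged exception while the three submitter threads still start. Which steps count as optional and which are genuinely required is a judgment call that depends on the real service.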

The delicious irony is that the configuration change was part of work we are doing to improve observability and alerting for potential issues.

Thanks for the report! I have to admit that when I first saw your tweet, for a nanosecond I thought the consensus protocol had issues.

Hi D13, is there an example anywhere of a smart contract using the Randomness Beacon in Python or TEALScript, please?