VRF and Binomial splits

Hi,

First, thanks for sharing your work. :slightly_smiling_face:

I’m reading this paper. Under section 5.1 it mentions how VRF handles the Sybils Attack problem.

It mentions that the binomial distribution is indeed a good choice since splitting the total of sub-users of a user won’t give an advantage over a non-split.

In particular it mentions the following equality (1):

B(k1;n1,p) + B(k2;n2,p) = B(k1 + k2;n1 + n2,p)

Which at first seemed a bit odd to me.

As far as I understand the B(a,b,c) notation, it refers to a probability and not a random variable.

If that’s the case I think that the that formula should be:

B(n1,p) + B(n2,p) = B(n1 + n2,p)

In the B(a,b,c) notation would be something like:

B(z;n1 + n2,p) = convolution(B(k1, n1, p), B(k2, n2,p)) (in such way that k1+k2=z)

Am I looking this correctly?

Thanks,
Ignacio

6 Likes

Great find, Ignacio. You’re correct!

From the team (and definitely not me :sweat_smile: ):

The formula in the paper is slightly wrong, and the right way to think about it is in terms of random variables – if X comes from Binomial(n1,p) and Y comes from Binomial(n2,p), and if X and Y are independent, then X+Y is distributed as Binomial(n1+n2, p). A subtle point that is that the output of sortition on different accounts must be uncorrelated (so that X andY are independent). In the paper accounts are referred to by VRF public keys, and the VRF ensures this property. In our system now (where two different accounts could register the same VRF key) the address gets hashed into the VRF output before binomial sampling, ensuring independence.

6 Likes

Great!, thanks for the confirmation and having a direct reply from the team.

Regards,
Ignacio

1 Like