An Exact Beta Generator
Peter Occil - 13/Jul/2020
Peter Occil - 13/Jul/2020
[SHOWTOGROUPS=4,20]
A sampler in Python for beta-distributed random numbers without floating-point arithmetic.
This article introduces a new sampler for beta-distributed random numbers. Here we discuss the Beta Distribution, and look at the Python code for my beta sampler, which includes class I wrote called "bernoulli.py", which collects a number of Bernoulli factories, and kthsmallest, which generates the 'k'th smallest 'bitcount'-bit uniform random number out of 'n' of them, is implemented in "randomgen.py." We also discuss the exact simulation of continuous distributions, and look at an example: The Continuous Bernoulli Distribution.
Introduction
This page introduces a new sampler for beta-distributed random numbers. Unlike any other specially-designed beta sampler I am aware of, this sampler—
This page shows Python code for my new sampler.
About the Beta Distribution
The Как увидеть ссылки? | How to see hidden links? is a bounded-domain probability distribution; its two parameters, alpha and beta, are both greater than 0 and describe the distribution's shape. Depending on alpha and beta, the shape can be a smooth peak or a smooth valley. The beta distribution can take on values in the interval [0, 1]. Any value in this interval (x) can occur with a probability proportional to—
pow(x, alpha - 1) * pow(1 - x, beta - 1). (1)
Although alpha and beta can each be greater than 0, this sampler only works if both parameters are 1 or greater.
Sampler Code
The following Python code relies on a class I wrote called "Как увидеть ссылки? | How to see hidden links?", which collects a number of Bernoulli factories, some of which are relied on by the code below. This includes the "geometric bag" mentioned earlier, as well as a Bernoulli factory that transforms a coin that produces heads with probability p into a coin that produces heads with probability pow(p, y). The case where y is in (0, 1) is due to recent work by Mendo (2019)(4).
The Python code also relies on a method I wrote called kthsmallest, which generates the kth smallest number in an arbitrary precision. This will be described later in this page.
This code is far from fast, though, at least in Python.
The kthsmallest Method
kthsmallest, which generates the 'k'th smallest 'bitcount'-bit uniform random number out of 'n' of them, is implemented in "Как увидеть ссылки? | How to see hidden links?" and relied on by this beta sampler. It is used when both a and b are integers, based on the known property that a beta random variable in this case is the ath smallest uniform (0, 1) random number out of a + b - 1 of them (Devroye 1986, p. 431)(5).
kthsmallest, however, doesn't simply generate 'n' 'bitcount'-bit numbers and then sort them. Rather, it builds up their binary expansions bit by bit, via the concept of "u-rands" (Karney 2014)(1). It uses the observation that each uniform (0, 1) random number is equally likely to be less than half or greater than half; thus, the number of uniform numbers that are less than half vs. greater than half follows a binomial(n, 1/2) distribution (and of the numbers less than half, say, the less-than-one-quarter vs. greater-than-one-quarter numbers follows the same distribution, and so on). Thanks to this observation, the algorithm can generate a sorted sample "on the fly".
The algorithm is as follows:
[/SHOWTOGROUPS]
A sampler in Python for beta-distributed random numbers without floating-point arithmetic.
This article introduces a new sampler for beta-distributed random numbers. Here we discuss the Beta Distribution, and look at the Python code for my beta sampler, which includes class I wrote called "bernoulli.py", which collects a number of Bernoulli factories, and kthsmallest, which generates the 'k'th smallest 'bitcount'-bit uniform random number out of 'n' of them, is implemented in "randomgen.py." We also discuss the exact simulation of continuous distributions, and look at an example: The Continuous Bernoulli Distribution.
Introduction
This page introduces a new sampler for beta-distributed random numbers. Unlike any other specially-designed beta sampler I am aware of, this sampler—
- avoids floating-point arithmetic, and
- samples from the beta distribution (with both parameters 1 or greater) to an arbitrary precision and with user-specified error bounds (and thus is "exact" in the sense defined in (Karney 2014)(1)).
This page shows Python code for my new sampler.
About the Beta Distribution
The Как увидеть ссылки? | How to see hidden links? is a bounded-domain probability distribution; its two parameters, alpha and beta, are both greater than 0 and describe the distribution's shape. Depending on alpha and beta, the shape can be a smooth peak or a smooth valley. The beta distribution can take on values in the interval [0, 1]. Any value in this interval (x) can occur with a probability proportional to—
pow(x, alpha - 1) * pow(1 - x, beta - 1). (1)
Although alpha and beta can each be greater than 0, this sampler only works if both parameters are 1 or greater.
Sampler Code
The following Python code relies on a class I wrote called "Как увидеть ссылки? | How to see hidden links?", which collects a number of Bernoulli factories, some of which are relied on by the code below. This includes the "geometric bag" mentioned earlier, as well as a Bernoulli factory that transforms a coin that produces heads with probability p into a coin that produces heads with probability pow(p, y). The case where y is in (0, 1) is due to recent work by Mendo (2019)(4).
The Python code also relies on a method I wrote called kthsmallest, which generates the kth smallest number in an arbitrary precision. This will be described later in this page.
This code is far from fast, though, at least in Python.
Код:
import math
import random
import bernoulli
from randomgen import RandomGen
from fractions import Fraction
def _toreal(ret, precision):
# NOTE: Although we convert to a floating-point
# number here, this is not strictly necessary and
# is merely for convenience.
return ret*1.0/(1<<precision)
def betadist(b, ax, ay, bx, by, precision=53):
# Beta distribution for alpha>=1 and beta>=1
bag=[]
bpower=Fraction(bx, by)-1
apower=Fraction(ax, ay)-1
# Special case for a=b=1
if bpower==0 and apower==0:
return random.randint(0, (1<<precision)-1)*1.0/(1<<precision)
# Special case if a and b are integers
if int(bpower) == bpower and int(apower) == apower:
a=int(Fraction(ax, ay))
b=int(Fraction(bx, by))
return _toreal(RandomGen().kthsmallest(a+b-1,a, \
precision), precision)
# Create a "geometric bag" to hold a uniform random
# number (U), described by Flajolet et al. 2010
gb=lambda: b.geometric_bag(bag)
# Complement of "geometric bag"
gbcomp=lambda: b.geometric_bag(bag)^1
bPowerBigger=(bpower > apower)
while True:
# Create a uniform random number (U) bit-by-bit, and
# accept it with probability U^(a-1)*(1-U)^(b-1), which
# is the unnormalized PDF of the beta distribution
bag.clear()
r=1
if bPowerBigger:
# Produce 1 with probability (1-U)^(b-1)
r=b.power(gbcomp, bpower)
# Produce 1 with probability U^(a-1)
if r==1: r=b.power(gb, apower)
else:
# Produce 1 with probability U^(a-1)
r=b.power(gb, apower)
# Produce 1 with probability (1-U)^(b-1)
if r==1: r=b.power(gbcomp, bpower)
if r == 1:
# Accepted, so fill up the "bag" and return the
# uniform number
ret=_fill_geometric_bag(b, bag, precision)
return ret
def _fill_geometric_bag(b, bag, precision):
ret=0
lb=min(len(bag), precision)
for i in range(lb):
if i>=len(bag) or bag[i]==None:
ret=(ret<<1)|b.randbit()
else:
ret=(ret<<1)|bag[i]
if len(bag) < precision:
diff=precision-len(bag)
ret=(ret << diff)|random.randint(0,(1 << diff)-1)
# Now we have a number that is a multiple of
# 2^-precision.
return _toreal(ret, precision)
The kthsmallest Method
kthsmallest, which generates the 'k'th smallest 'bitcount'-bit uniform random number out of 'n' of them, is implemented in "Как увидеть ссылки? | How to see hidden links?" and relied on by this beta sampler. It is used when both a and b are integers, based on the known property that a beta random variable in this case is the ath smallest uniform (0, 1) random number out of a + b - 1 of them (Devroye 1986, p. 431)(5).
kthsmallest, however, doesn't simply generate 'n' 'bitcount'-bit numbers and then sort them. Rather, it builds up their binary expansions bit by bit, via the concept of "u-rands" (Karney 2014)(1). It uses the observation that each uniform (0, 1) random number is equally likely to be less than half or greater than half; thus, the number of uniform numbers that are less than half vs. greater than half follows a binomial(n, 1/2) distribution (and of the numbers less than half, say, the less-than-one-quarter vs. greater-than-one-quarter numbers follows the same distribution, and so on). Thanks to this observation, the algorithm can generate a sorted sample "on the fly".
The algorithm is as follows:
- Create n empty u-rands.
- Set index to 1.
- If index <= k:
- Generate LC, a binomial(n, 0.5) random number.
- Append a 0 bit to the first LC u-rands (starting at index) and a 1 bit to the next n - LC u-rands.
- If LC > 1, repeat step 3 and these substeps with the same index and n = LC.
- If n - LC > 1, repeat step 3 and these substeps with index = index+LC, and n = n - LC.
- Take the kth u-rand (starting at 1) and fill it with uniform random bits as necessary to make a bitcount-bit number. Return that u-rand.
[/SHOWTOGROUPS]