13. Mai 2011

# Sampling from a Poisson distribution - a benchmark

Having a poisson-distributed random variable X, how can we efficiently generate samples (or say realizations) of that random variable? This article shows how Knuth’s technique can be derived and it also indicates that the Ratio-of-Uniforms method is quite efficient.

X is the number of arrivals in a poisson process within a fixed period of time. The inter-arrival times follow an exponential distribution. The following small benchmark makes an attempt to compare the time necessary to compute samples using three different algorithms:

- Algorithm
**exp**, do it yourself - Algorithm
**uni**by D. Knuth - Algorithm
**rou**, from Numerical Recipes

(Let’s peek at the results up-front: *rou* seems to win this challenge!)

## exp – counting inter-arrival times

We want a random sample of the number of arrivals within a fixed period of time. We know the distribution of the inter-arrival times. By summing up inter-arrival times until the sum exceeds the fixed period of time (use the unit interval for convenience) we generate a sample number of arrivals. Formalizing this idea, we have a sequence of exponentially distributed random values:

The i-th interval represents the time until the i-th arrival. We then find a number k such that

This number k is the sought-after random sample.

double exponential(Ran *ran, double lambda) { double p; do { p = ran->doub(); } while (p == 0.); return -log(p) / lambda; } int poisson_exp(Ran *ran, double lambda) { double t = 0; int k = 0; while (1) { t += exponential(ran, lambda); if (t > 1) { return k; } else { k++; } } }

## uni – counting inter-arrival times, improved

Usually, the inversion method is used for sampling from an exponential distribution.

Note the logarithm. It is the computationally most expensive operation when generating a sample inter-arrival time. Knuth’s variant gets rid of this nuisance. Here’s how it works

Finding a k that satisfies the last inequation only requires λ+1 uniform samples on average, reflecting in shortened computation times.

int poisson_uni(Ran *ran, double lambda) { int k = -1; double p = 1.; double lambda_exp = exp(-lambda); do { k++; p *= ran->doub(); } while (p > lambda_exp); return k; }

## rou – using Ratio-of-Uniforms method

No explanation here (unfortunately…). Just take the algorithm from Numerical Recipes and see what it’s worth. Actually, I modified it to use Ratio-of-Uniforms for all values of λ for the sake of comparison. Sources are licensed and therefore cannot be listed here.

## Benchmark

Taking different values for λ (1, 2, 4, 16, 32, 64) and generating a million random samples using each of the three algorithms on a notebook from 2008, here’s the results (interpolated).

g++ -O2 benchmark.cpp -I . -o benchmark ./benchmark

Numbers generated per algorithm: 1000000 Expected value: 1 method mean time [s] uni 1.000 0.130 exp 0.998 0.300 rou 1.036 0.310

Numbers generated per algorithm: 1000000 Expected value: 2 method mean time [s] uni 2.001 0.140 exp 2.000 0.460 rou 2.008 0.310

Numbers generated per algorithm: 1000000 Expected value: 4 method mean time [s] uni 3.999 0.190 exp 3.999 0.750 rou 4.000 0.300

Numbers generated per algorithm: 1000000 Expected value: 8 method mean time [s] uni 8.000 0.290 exp 8.000 1.340 rou 8.001 0.300

Numbers generated per algorithm: 1000000 Expected value: 16 method mean time [s] uni 16.002 0.470 exp 15.996 2.440 rou 16.000 0.220

Numbers generated per algorithm: 1000000 Expected value: 32 method mean time [s] uni 32.007 0.820 exp 32.001 4.710 rou 31.996 0.220

Numbers generated per algorithm: 1000000 Expected value: 64 method mean time [s] uni 63.982 1.530 exp 64.009 9.270 rou 63.995 0.210

## Sources

benchmark.zip : In order to compile, you need to obtain files *nr3.h, ran.h, deviates.h, gamma.h* from Numerical Recipes and put them into the folder with the other source files.