13. Mai 2011

# Sampling from a Poisson distribution - a benchmark

Having a poisson-distributed random variable X, how can we efficiently generate samples (or say realizations) of that random variable? This article shows how Knuth’s technique can be derived and it also indicates that the Ratio-of-Uniforms method is quite efficient.

X is the number of arrivals in a poisson process within a fixed period of time. The inter-arrival times follow an exponential distribution. The following small benchmark makes an attempt to compare the time necessary to compute samples using three different algorithms:

1. Algorithm exp, do it yourself
2. Algorithm uni by D. Knuth
3. Algorithm rou, from Numerical Recipes

(Let’s peek at the results up-front: rou seems to win this challenge!)

## exp – counting inter-arrival times

We want a random sample of the number of arrivals within a fixed period of time. We know the distribution of the inter-arrival times. By summing up inter-arrival times until the sum exceeds the fixed period of time (use the unit interval for convenience) we generate a sample number of arrivals. Formalizing this idea, we have a sequence of exponentially distributed random values: The i-th interval represents the time until the i-th arrival. We then find a number k such that This number k is the sought-after random sample.

```double exponential(Ran *ran, double lambda) {
double p;
do {
p = ran->doub();
} while (p == 0.);
return -log(p) / lambda;
}

int poisson_exp(Ran *ran, double lambda) {
double t = 0;
int k = 0;
while (1) {
t += exponential(ran, lambda);
if (t > 1) {
return k;
} else {
k++;
}
}
}
```

## uni – counting inter-arrival times, improved

Usually, the inversion method is used for sampling from an exponential distribution. Note the logarithm. It is the computationally most expensive operation when generating a sample inter-arrival time. Knuth’s variant gets rid of this nuisance. Here’s how it works Finding a k that satisfies the last inequation only requires λ+1 uniform samples on average, reflecting in shortened computation times.

```int poisson_uni(Ran *ran, double lambda) {
int k = -1;
double p = 1.;
double lambda_exp = exp(-lambda);
do {
k++;
p *= ran->doub();
} while (p > lambda_exp);
return k;
}
```

## rou – using Ratio-of-Uniforms method

No explanation here (unfortunately…). Just take the algorithm from Numerical Recipes and see what it’s worth. Actually, I modified it to use Ratio-of-Uniforms for all values of λ for the sake of comparison. Sources are licensed and therefore cannot be listed here.

## Benchmark

Taking different values for λ (1, 2, 4, 16, 32, 64) and generating a million random samples using each of the three algorithms on a notebook from 2008, here’s the results (interpolated). ``` g++ -O2 benchmark.cpp -I . -o benchmark
./benchmark
```
``` Numbers generated per algorithm: 1000000
Expected value: 1
method       mean   time [s]
uni      1.000      0.130
exp      0.998      0.300
rou      1.036      0.310
```
``` Numbers generated per algorithm: 1000000
Expected value: 2
method       mean   time [s]
uni      2.001      0.140
exp      2.000      0.460
rou      2.008      0.310
```
``` Numbers generated per algorithm: 1000000
Expected value: 4
method       mean   time [s]
uni      3.999      0.190
exp      3.999      0.750
rou      4.000      0.300
```
``` Numbers generated per algorithm: 1000000
Expected value: 8
method       mean   time [s]
uni      8.000      0.290
exp      8.000      1.340
rou      8.001      0.300
```
``` Numbers generated per algorithm: 1000000
Expected value: 16
method       mean   time [s]
uni     16.002      0.470
exp     15.996      2.440
rou     16.000      0.220
```
``` Numbers generated per algorithm: 1000000
Expected value: 32
method       mean   time [s]
uni     32.007      0.820
exp     32.001      4.710
rou     31.996      0.220
```
``` Numbers generated per algorithm: 1000000
Expected value: 64
method       mean   time [s]
uni     63.982      1.530
exp     64.009      9.270
rou     63.995      0.210
```

## Sources

benchmark.zip : In order to compile, you need to obtain files nr3.h, ran.h, deviates.h, gamma.h from Numerical Recipes and put them into the folder with the other source files.