reservoir sampling
create ~random_state desired_sample_size creates an empty sample of 'a values. The sample will grow no larger than desired_sample_size when presented with more values by calling maybe_add.
desired_sample_size t returns the desired sample size of t.
maybe_add t x will randomly either add x to t or ignore it. If adding x would grow the sample larger than desired_sample_size t, some previously selected value will be discarded.
The current selection from values previously seen by t. Of all previously seen values, each subset of size desired_sample_size t is equally likely to have been selected.
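The behaviour described above can be sketched with the classic "Algorithm R" form of reservoir sampling. This is a minimal standard-library illustration, not the module's actual implementation: the record layout, the name select for the function that returns the current selection, and the use of a required ~random_state argument are all assumptions made for the example.

```ocaml
(* A sketch of reservoir sampling (Algorithm R). The internals below are
   assumptions for illustration, not the documented module's own code. *)
type 'a t = {
  random_state : Random.State.t;
  desired_sample_size : int;
  mutable num_seen : int;     (* how many values have been offered so far *)
  mutable sample : 'a array;  (* current selection, at most desired_sample_size long *)
}

let create ~random_state desired_sample_size =
  if desired_sample_size < 0 then invalid_arg "create: negative sample size";
  { random_state; desired_sample_size; num_seen = 0; sample = [||] }

let desired_sample_size t = t.desired_sample_size

let maybe_add t x =
  t.num_seen <- t.num_seen + 1;
  if Array.length t.sample < t.desired_sample_size then begin
    (* Still filling up: every value is kept. *)
    t.sample <- Array.append t.sample [| x |]
  end else begin
    (* x is the n-th value seen. Draw j uniformly from [0, n); x is kept only
       if j names a slot in the sample, and then it evicts the value there.
       Random.State.int requires a bound below 2^30, which limits this sketch
       to streams shorter than that. *)
    let j = Random.State.int t.random_state t.num_seen in
    if j < t.desired_sample_size then t.sample.(j) <- x
  end

(* Hypothetical name for "the current selection from values previously seen". *)
let select t = Array.to_list t.sample

let () =
  let t = create ~random_state:(Random.State.make [| 42 |]) 3 in
  for i = 1 to 100 do
    maybe_add t i
  done;
  List.iter (Printf.printf "%d ") (select t);
  print_newline ()
```

Growing the array with Array.append keeps the sketch short; a real implementation would more likely preallocate the storage.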
Randomly select a subset of size sample_size from a stream of unknown length.
Each possible subset is chosen with equal probability.
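The equal-probability claim can be checked empirically on a tiny stream. The sketch below is an illustration under stated assumptions rather than part of the documented interface: sample_seq and subset_counts are hypothetical helpers, the stream is modelled as a standard-library Seq.t, and the seed and trial count are arbitrary. It repeatedly samples 2 values from the stream 1, 2, 3, 4 and tallies how often each of the 6 possible subsets is selected; the tallies should come out roughly equal.

```ocaml
(* Hypothetical helper: sample up to [sample_size] values from a sequence
   whose length is not known up front. The result is sorted so that the same
   subset always compares equal, whatever slot order it was stored in. *)
let sample_seq ~random_state sample_size stream =
  let reservoir = ref [||] in
  let num_seen = ref 0 in
  Seq.iter
    (fun x ->
      incr num_seen;
      if Array.length !reservoir < sample_size then begin
        reservoir := Array.append !reservoir [| x |]
      end else begin
        (* Keep the n-th value only if a uniform draw from [0, n) names a slot. *)
        let j = Random.State.int random_state !num_seen in
        if j < sample_size then (!reservoir).(j) <- x
      end)
    stream;
  List.sort compare (Array.to_list !reservoir)

(* Tally how often each 2-element subset of 1..4 is selected. *)
let subset_counts ~trials =
  let random_state = Random.State.make [| 2024 |] in
  let counts = Hashtbl.create 8 in
  for _trial = 1 to trials do
    let subset = sample_seq ~random_state 2 (List.to_seq [ 1; 2; 3; 4 ]) in
    let seen = Option.value (Hashtbl.find_opt counts subset) ~default:0 in
    Hashtbl.replace counts subset (seen + 1)
  done;
  counts

let () =
  Hashtbl.iter
    (fun subset n ->
      Printf.printf "{%s}: %d\n"
        (String.concat ", " (List.map string_of_int subset))
        n)
    (subset_counts ~trials:60_000)
```

With 60,000 trials each subset is expected about 10,000 times; exact counts depend on the seed and on the OCaml version's random number generator.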