Module type Hasher.S
val hash_fold_t : Base.Hash.state -> t -> Base.Hash.state
hash_fold_t state x
mixes the content ofx
into thestate
.By default, all our
hash_fold_t
functions (derived or not) should satisfy the following properties.1.
hash_fold_t state x
should mix all the information present inx
in the state. That is, by default,hash_fold_t
will traverse the full termx
(this is a significant change for Hashtbl.hash which by default stops traversing the term after after considering a small number of "significant values").hash_fold_t
must not discard thestate
.2.
hash_fold_t
must be compatible with the associatedcompare
function: that is, for allx
y
ands
,compare x y = 0
must implyhash_fold_t s x = hash_fold_t s y
.3. To avoid avoid systematic collisions,
hash_fold_t
should expand to different sequences of built-in mixing functions for different values ofx
. No such sequence is allowed to be a prefix of another.A common mistake is to implement
hash_fold_t
of a collection by just folding all the elements. This makes the folding sequence ofa
be a prefix ofa @ b
, thereby violating the requirement. This creates large families of collisions: all of the following collections would hash the same:[[]; [1;2;3]] [[1]; [2;3]] [[1; 2]; [3]] [[1; 2; 3]; []] [[1]; [2]; []; [3];] ...
A good way to avoid this is to mix in the size of the collection to the beginning (
fold ~init:(hash_fold_int state length) ~f:hash_fold_elem
). The default in our libraries is to mix the length of the structure before folding. To prevent the aforementioned collisions, one should respect this ordering.