val get_matching_blocks : transform:('a ‑> elt) ‑> ?big_enough:int ‑> mine:'a array ‑> other:'a array ‑> Matching_block.t list
get_matching_blocks not only aggregates the data from matches a b but also attempts to remove random, semantically meaningless matches ("semantic cleanup"). The value of big_enough governs how aggressively we do so. See get_hunks below for more details.
matches a b returns a list of pairs (i,j) such that a.(i) = b.(j) and such that the list is strictly increasing in both its first and second coordinates. This is essentially an "unfolded" version of what get_matching_blocks returns: instead of grouping consecutive matches into blocks with a length, this function returns every individual pair (mine_start * other_start).
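The relationship between the unfolded pairs and the grouped blocks can be sketched as follows. This is only an illustration, not the library's implementation: the record type block is a hypothetical stand-in for Matching_block.t (its field names are assumed here), and blocks_of_pairs is a helper invented for this example.

```ocaml
(* Hypothetical stand-in for Matching_block.t; the field names are assumed. *)
type block = { mine_start : int; other_start : int; length : int }

(* Fold a strictly increasing list of (i, j) match pairs into maximal runs
   of consecutive matches. A pair extends the current run only when it
   continues it on both sides at once. *)
let blocks_of_pairs (pairs : (int * int) list) : block list =
  let rec go acc current = function
    | [] ->
      List.rev (match current with None -> acc | Some b -> b :: acc)
    | (i, j) :: rest ->
      (match current with
       | Some b
         when i = b.mine_start + b.length && j = b.other_start + b.length ->
         go acc (Some { b with length = b.length + 1 }) rest
       | Some b ->
         go (b :: acc) (Some { mine_start = i; other_start = j; length = 1 }) rest
       | None ->
         go acc (Some { mine_start = i; other_start = j; length = 1 }) rest)
  in
  go [] None pairs
```

For example, the pairs (0,0), (1,1), (3,2) fold into one block of length 2 starting at (0,0) followed by one block of length 1 starting at (3,2).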
match_ratio ~compare a b computes the ratio defined as:
2 * len (matches a b) / (len a + len b)
It indicates how alike a and b are. A ratio close to 1.0 means the number of matches is close to the number of elements that could potentially match, so a and b are very much alike. A low ratio means there are very few matches.
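As a worked instance of the formula, the sketch below computes the ratio from a match count and the two array lengths. The function ratio and its labels are hypothetical names chosen for this example; it does not call the library. Note that the division has to be done in floating point, since integer division would truncate every ratio below 1 to 0.

```ocaml
(* Hypothetical helper: 2 * matches / (len_a + len_b), computed in floats.
   The result is not meaningful when both lengths are 0 (0.0 /. 0.0 is nan). *)
let ratio ~matches ~len_a ~len_b =
  2.0 *. float_of_int matches /. float_of_int (len_a + len_b)

(* 3 matches between arrays of length 4 and 6: 2 * 3 / (4 + 6) = 0.6 *)
let example = ratio ~matches:3 ~len_a:4 ~len_b:6
```

Two identical arrays of length n have n matches, giving 2n / 2n = 1.0, while two arrays with nothing in common give 0.0.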
val get_hunks : transform:('a ‑> elt) ‑> context:int ‑> ?big_enough:int ‑> mine:'a array ‑> other:'a array ‑> 'a Hunk.t list
get_hunks ~transform ~context ~mine ~other will compare the arrays mine and other and produce a list of hunks. (The hunks will contain Same ranges of at most context elements.) context defaults to infinity (producing a singleton hunk list). The value of big_enough governs how aggressively we try to clean up spurious matches, by restricting our attention to only matches of length less than big_enough. Thus, setting big_enough to a higher value results in more aggressive cleanup, and the default value of 1 results in no cleanup at all, because every match has length at least 1. When this function is called by Patdiff_core, the value of big_enough is 3 at the line level, and 7 at the word level.
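The effect of big_enough on which matches are even considered for cleanup can be illustrated with a small filter over block lengths. This is a minimal sketch of the eligibility rule described above only; cleanup_candidates is a hypothetical name, and the heuristic the library then applies to the eligible matches is not shown here.

```ocaml
(* Hypothetical helper: keep only the match lengths that are short enough
   to be considered for cleanup, i.e. strictly less than big_enough. *)
let cleanup_candidates ~big_enough (lengths : int list) : int list =
  List.filter (fun len -> len < big_enough) lengths

(* Lengths of three matching blocks found by some diff. *)
let lengths = [ 1; 2; 5 ]

let with_default = cleanup_candidates ~big_enough:1 lengths (* nothing eligible *)
let line_level = cleanup_candidates ~big_enough:3 lengths   (* lengths 1 and 2 *)
let word_level = cleanup_candidates ~big_enough:7 lengths   (* all three *)
```

With the default of 1 the candidate list is always empty, which is why no cleanup happens; raising the threshold to 3 or 7 makes progressively longer matches eligible.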
val merge : elt array array ‑> elt merged_array