String type based on Bigarray, for use in I/O and C-bindings.
create length returns a new bigstring of the given length. Content is undefined.
As the total memory allocated by calls to create approaches max_mem_waiting_gc, the pressure on the garbage collector to be more aggressive increases.
init n ~f creates a bigstring t of length n, with t.{i} = f i.
of_string ?pos ?len str returns a new bigstring equivalent to the substring of length len in str starting at position pos. len defaults to String.length str - pos.
to_string ?pos ?len bstr returns a new string equivalent to the sub-bigstring of length len in bstr starting at position pos. len defaults to length bstr - pos.
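The construction and conversion functions above can be sketched using only the standard library's Bigarray module. This is a minimal illustration under the assumption that a bigstring is a one-dimensional char Bigarray with C layout; it is not the library's actual implementation (which may use C stubs), and the error messages are made up for the example.

```ocaml
open Bigarray

(* Assumption: a bigstring is a 1-D char Bigarray in C layout. *)
type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t

(* Allocate [n] bytes; the content is undefined until written. *)
let create n : bigstring = Array1.create char c_layout n

(* Fill position [i] with [f i]. *)
let init n ~f : bigstring =
  let t = create n in
  for i = 0 to n - 1 do t.{i} <- f i done;
  t

(* Copy [len] characters of [str], starting at [pos], into a new bigstring. *)
let of_string ?(pos = 0) ?len str : bigstring =
  let len = match len with Some l -> l | None -> String.length str - pos in
  if pos < 0 || len < 0 || pos > String.length str - len then
    invalid_arg "of_string";
  init len ~f:(fun i -> str.[pos + i])

(* Copy [len] characters of [bstr], starting at [pos], into a new string. *)
let to_string ?(pos = 0) ?len (bstr : bigstring) =
  let len = match len with Some l -> l | None -> Array1.dim bstr - pos in
  if pos < 0 || len < 0 || pos > Array1.dim bstr - len then
    invalid_arg "to_string";
  String.init len (fun i -> bstr.{pos + i})
```

For instance, under these definitions to_string (of_string ~pos:1 ~len:3 "hello") evaluates to "ell".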
check_args ~loc ~pos ~len bstr checks the position and length arguments pos and len for bigstrings bstr. It raises Invalid_argument if these arguments are illegal for the given bigstring, using loc to indicate the calling context.
get_opt_len bstr ~pos opt_len returns the length of a sub-bigstring in bstr starting at position pos and given optional length opt_len. This function does not check the validity of its arguments. Use check_args for that purpose.
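The division of labour between check_args and get_opt_len might look like the following sketch, written against the standard library's Bigarray. The exact error messages and the order of the checks are assumptions made for illustration.

```ocaml
open Bigarray

(* Validate [pos] and [len] against [bstr]; [loc] names the caller in errors. *)
let check_args ~loc ~pos ~len bstr =
  if pos < 0 then invalid_arg (loc ^ ": pos < 0");
  if len < 0 then invalid_arg (loc ^ ": len < 0");
  (* Written as a subtraction so that [pos + len] cannot overflow. *)
  if pos > Array1.dim bstr - len then
    invalid_arg (loc ^ ": pos + len > length bstr")

(* Resolve an optional length; performs no validation by design. *)
let get_opt_len bstr ~pos = function
  | Some len -> len
  | None -> Array1.dim bstr - pos
```

A caller would typically resolve the length first with get_opt_len and then pass the result to check_args before touching the data.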
length bstr returns the length of bigstring bstr.
get t pos returns the character at pos.
set t pos sets the character at pos.
is_mmapped bstr returns whether the bigstring bstr is memory-mapped.
blit ~src ?src_pos ?src_len ~dst ?dst_pos () blits src_len characters from src starting at position src_pos to dst at position dst_pos.
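As an illustration of these blit semantics, here is a small sketch built on the standard library's Bigarray.Array1.sub and Array1.blit. The defaults shown (positions default to 0, src_len defaults to the rest of src) are assumptions for the example, and the real function may be implemented differently.

```ocaml
open Bigarray

let blit ~src ?(src_pos = 0) ?src_len ~dst ?(dst_pos = 0) () =
  let src_len =
    match src_len with
    | Some l -> l
    | None -> Array1.dim src - src_pos
  in
  (* Array1.sub creates views without copying and raises Invalid_argument
     if a view would fall outside its array; Array1.blit then copies the
     source view into the destination view. *)
  Array1.blit (Array1.sub src src_pos src_len) (Array1.sub dst dst_pos src_len)
```

Because the slices are views that share storage with src and dst, the only copy made is the blit itself.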
These functions write the "size-prefixed" bin-prot format that is used by, e.g., async's Writer.write_bin_prot, Reader.read_bin_prot and Unpack_buffer.Unpack_one.create_bin_prot.
write_bin_prot t writer a writes a to t starting at pos, and returns the index in t immediately after the last byte written. It raises if pos < 0 or if a doesn't fit in t.
The read_bin_prot* functions read from the region of t starting at pos of length len. They return the index in t immediately after the last byte read. They raise if pos and len don't describe a region of t.
map_file shared fd n memory-maps n characters of the data associated with descriptor fd to a bigstring. Iff shared is true, all changes to the bigstring will be reflected in the file.
Users must keep in mind that operations on the resulting bigstring may result in disk operations which block the runtime. This is true for pure OCaml operations (such as an element assignment t.{i} <- c), and for calls to blit. While some I/O operations may release the OCaml lock, users should not expect this to be done for all operations on a bigstring returned from map_file.
find ?pos ?len char t returns Some i for the smallest i >= pos such that t.{i} = char, or None if there is no such i. len defaults to length bstr - pos.
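The search described for find can be sketched as a simple linear scan over a standard-library char Bigarray. This is an illustrative version only: the parameter is named chr here to keep it distinct from the type name, the default pos of 0 is an assumption, and the real function may delegate to a C routine.

```ocaml
open Bigarray

let find ?(pos = 0) ?len chr
    (t : (char, int8_unsigned_elt, c_layout) Array1.t) =
  let len = match len with Some l -> l | None -> Array1.dim t - pos in
  (* Bounds are checked once, before scanning. *)
  if pos < 0 || len < 0 || pos > Array1.dim t - len then invalid_arg "find";
  let stop = pos + len in
  let rec loop i =
    if i >= stop then None
    else if t.{i} = chr then Some i
    else loop (i + 1)
  in
  loop pos
```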
Same as find, but does no bounds checking, and returns a negative value instead of None if char is not found.
unsafe_destroy bstr destroys the bigstring by deallocating its associated data or, if memory-mapped, unmapping the corresponding file, and setting all dimensions to zero. This effectively frees the associated memory or address-space resources instantaneously. This feature helps work around a bug in the current OCaml runtime, which does not correctly estimate how aggressively to reclaim such resources.
This operation is safe unless you have passed the bigstring to another thread that is performing operations on it at the same time. Access to the bigstring after this operation will yield array bounds exceptions.
Accessors for parsing binary values, analogous to binary_packing. These are in Bigstring rather than a separate module because:
1) Existing binary_packing requires copies and does not work with bigstrings.
2) The accessors rely on the implementation of bigstring, and hence should change should the implementation of bigstring move away from Bigarray.
3) Bigstring already has some external C functions, so it didn't require many changes to the OMakefile ^_^.
In a departure from Binary_packing, the naming conventions are chosen to be close to C99 stdint types, as that is a more standard description and is somewhat useful in making compact macros for the implementations. The accessor names contain the endianness to allow for branch-free implementations.
<accessor>  ::= <unsafe><operation><type><endian><int>
<unsafe>    ::= unsafe_ | ''
<operation> ::= get_ | set_
<type>      ::= int16 | uint16 | int32 | int64
<endian>    ::= _le | _be | ''
<int>       ::= _int | ''
The "unsafe_" prefix indicates that these functions do no bounds checking. Performance testing demonstrated that the bounds-checked accessors were 2-3 times slower, because Bigstring.length is a C call, and not even a noalloc one. In practice, message parsers can check the size of an outer message once and use the unsafe accessors for individual fields, so many bounds checks would be redundant anyway. The situation could be improved by having bigarray cache the length/dimensions.
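To make the naming scheme and the "check once, then use unsafe accessors" pattern concrete, here is a sketch of a few 16-bit accessors in pure OCaml over a standard-library char Bigarray. The function bodies are assumptions for illustration; the text above indicates the real accessors are implemented with C macros.

```ocaml
open Bigarray

type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t

(* Read one byte as an int in [0, 255] without a bounds check. *)
let byte (t : bigstring) i = Char.code (Array1.unsafe_get t i)

(* The endianness is fixed by the name, so no branch on byte order is needed. *)
let unsafe_get_uint16_le t ~pos = byte t pos lor (byte t (pos + 1) lsl 8)
let unsafe_get_uint16_be t ~pos = (byte t pos lsl 8) lor byte t (pos + 1)

(* Signed variant: sign-extend the 16-bit value. *)
let unsafe_get_int16_le t ~pos =
  let u = unsafe_get_uint16_le t ~pos in
  if u land 0x8000 <> 0 then u - 0x10000 else u

(* Safe variant: one bounds check, then the unsafe accessor. *)
let get_uint16_le t ~pos =
  if pos < 0 || pos > Array1.dim t - 2 then invalid_arg "get_uint16_le";
  unsafe_get_uint16_le t ~pos

(* Hypothetical parser for a 4-byte message: the size is checked once,
   after which the individual fields use the unsafe accessors. *)
let parse_header t =
  if Array1.dim t < 4 then invalid_arg "parse_header: message too short";
  (unsafe_get_uint16_le t ~pos:0, unsafe_get_int16_le t ~pos:2)
```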
Similar to the usage in binary_packing, the functions below treat the value being read (or written) as an OCaml immediate integer, which is actually 63 bits wide. If the user is confident that the range of values used in practice will not require 64-bit precision (i.e. less than Max_Long), then allocation can be avoided by using an immediate. If the user is wrong, an exception will be raised (for get).
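The 63-bit restriction can be illustrated with a small checked conversion from a boxed int64 to an immediate int. The function name int64_to_int_exn and the choice of Failure as the exception are hypothetical; the source does not say which exception the real accessors raise.

```ocaml
(* Convert a boxed int64 to an immediate OCaml int, raising if the value
   does not survive the round trip (i.e. it needs more bits than an
   immediate integer provides). *)
let int64_to_int_exn (x : int64) : int =
  let n = Int64.to_int x in
  if Int64.equal (Int64.of_int n) x then n
  else failwith "int64_to_int_exn: value does not fit in an OCaml int"
```

Values such as 42L convert without allocation on the result side, while Int64.max_int is rejected because it cannot be represented as an immediate.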
Similar to Binary_packing.unpack_tail_padded_fixed_string and .pack_tail_padded_fixed_string.