String type based on Bigarray, for use in I/O and C-bindings
create length returns a new bigstring of the given length. Its content is undefined.
As the total allocation of calls to create approaches max_mem_waiting_gc,
the pressure on the garbage collector to be more aggressive increases.
init n ~f creates a bigstring t of length n, with t.{i} = f i
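As a rough illustration of create and init, here is a minimal sketch built only on the standard Bigarray module. The bigstring type alias and both function bodies are assumptions made for this example, not the library's actual implementation, and max_mem_waiting_gc is not modelled.

```ocaml
(* Assumed representation: a one-dimensional char Bigarray in C layout. *)
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Allocate a bigstring of length n; its contents are undefined until written. *)
let create (n : int) : bigstring =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout n

(* Illustrative stand-in for init: position i receives f i. *)
let init (n : int) ~(f : int -> char) : bigstring =
  let t = create n in
  for i = 0 to n - 1 do
    t.{i} <- f i
  done;
  t

(* Usage: a 26-character bigstring holding the lowercase alphabet. *)
let alphabet = init 26 ~f:(fun i -> Char.chr (Char.code 'a' + i))
```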
of_string ?pos ?len str returns a new bigstring equivalent to the substring of length
len in str starting at position pos. len defaults to String.length str - pos.
to_string ?pos ?len bstr returns a new string equivalent to the sub-bigstring of length
len in bstr starting at position pos. len defaults to length bstr - pos.
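The two conversions can be sketched as below, using only the standard library. This is an illustrative sketch, not the library's code: the default of 0 for pos and the use of Invalid_argument for bad arguments are assumptions of the example.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Copy len characters of str, starting at pos, into a fresh bigstring. *)
let of_string ?(pos = 0) ?len (str : string) : bigstring =
  let len = match len with Some l -> l | None -> String.length str - pos in
  if pos < 0 || len < 0 || pos > String.length str - len then
    invalid_arg "of_string: invalid pos or len";
  let t = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
  for i = 0 to len - 1 do
    t.{i} <- str.[pos + i]
  done;
  t

(* Copy len characters of bstr, starting at pos, into a fresh string. *)
let to_string ?(pos = 0) ?len (bstr : bigstring) : string =
  let total = Bigarray.Array1.dim bstr in
  let len = match len with Some l -> l | None -> total - pos in
  if pos < 0 || len < 0 || pos > total - len then
    invalid_arg "to_string: invalid pos or len";
  String.init len (fun i -> bstr.{pos + i})

(* Usage: omitting len takes everything from pos to the end. *)
let world = to_string (of_string ~pos:6 "hello world")
```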
check_args ~loc ~pos ~len bstr checks the position and length
arguments pos and len for bigstring bstr. It raises if these arguments are
invalid for bstr, using loc to indicate the calling context.
get_opt_len bstr ~pos opt_len returns the length of a sub-bigstring of
bstr starting at position pos and given optional length
opt_len. This function does not check the validity of its
arguments. Use check_args for that purpose.
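The division of labour between the two helpers might look like the following sketch. The bodies are assumptions for illustration only; in particular, raising Invalid_argument with loc as a message prefix is a choice made for this example.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Validate that [pos, pos + len) lies inside bstr; loc names the caller. *)
let check_args ~loc ~pos ~len (bstr : bigstring) =
  if pos < 0 then invalid_arg (loc ^ ": pos < 0");
  if len < 0 then invalid_arg (loc ^ ": len < 0");
  if pos > Bigarray.Array1.dim bstr - len then
    invalid_arg (loc ^ ": pos + len > length bstr")

(* Resolve an optional length without validating anything. *)
let get_opt_len (bstr : bigstring) ~pos opt_len =
  match opt_len with
  | Some len -> len
  | None -> Bigarray.Array1.dim bstr - pos

(* Typical pairing: resolve the length first, then validate it. *)
let checked_len ~loc ?(pos = 0) ?len (bstr : bigstring) =
  let len = get_opt_len bstr ~pos len in
  check_args ~loc ~pos ~len bstr;
  len
```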
length bstr returns the length of bigstring bstr.
get t pos returns the character at pos
set t pos sets the character at pos
is_mmapped bstr returns whether the bigstring bstr is memory-mapped.
blit ~src ?src_pos ?src_len ~dst ?dst_pos () blits src_len characters
from src starting at position src_pos to dst at position dst_pos.
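A blit with this argument shape can be approximated with Bigarray.Array1.sub and Bigarray.Array1.blit from the standard library. This is a sketch under assumptions: the defaults of 0 for the positions and of "rest of src" for src_len are choices of the example, and the of_string and to_string helpers exist only to make it self-contained.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Copy src_len characters from src at src_pos into dst at dst_pos.
   Array1.sub raises Invalid_argument if either region is out of bounds. *)
let blit ~(src : bigstring) ?(src_pos = 0) ?src_len ~(dst : bigstring)
    ?(dst_pos = 0) () =
  let src_len =
    match src_len with
    | Some l -> l
    | None -> Bigarray.Array1.dim src - src_pos
  in
  Bigarray.Array1.blit
    (Bigarray.Array1.sub src src_pos src_len)
    (Bigarray.Array1.sub dst dst_pos src_len)

(* Example-only helpers for building and inspecting bigstrings. *)
let of_string (s : string) : bigstring =
  let t =
    Bigarray.Array1.create Bigarray.char Bigarray.c_layout (String.length s)
  in
  String.iteri (fun i c -> t.{i} <- c) s;
  t

let to_string (t : bigstring) : string =
  String.init (Bigarray.Array1.dim t) (fun i -> t.{i})

(* Usage: copy "cde" out of the source into the middle of the destination. *)
let dst = of_string "------"
let () = blit ~src:(of_string "abcdef") ~src_pos:2 ~src_len:3 ~dst ~dst_pos:1 ()
```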
These functions write the "size-prefixed" bin-prot format that is used by, e.g.,
async's Writer.write_bin_prot, Reader.read_bin_prot and
Unpack_buffer.Unpack_one.create_bin_prot.
write_bin_prot t writer a writes a to t starting at pos, and returns the index
in t immediately after the last byte written. It raises if pos < 0 or if a
doesn't fit in t.
The read_bin_prot* functions read from the region of t starting at pos of length
len. They return the index in t immediately after the last byte read. They raise
if pos and len don't describe a region of t.
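To make the index conventions concrete, here is a sketch of size-prefixed framing over a bigstring. It is not bin-prot itself: the payload is a raw string rather than a bin-prot encoded value, the 8-byte little-endian length header is an assumption of this example, and write_framed and read_framed are hypothetical names. It also assumes a 64-bit OCaml.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Assumed header size for this sketch: 8 bytes, little-endian. *)
let header_len = 8

(* Write the header and payload at pos; return the index after the last byte. *)
let write_framed (t : bigstring) ~pos (payload : string) : int =
  let n = String.length payload in
  if pos < 0 then invalid_arg "write_framed: pos < 0";
  if pos > Bigarray.Array1.dim t - header_len - n then
    invalid_arg "write_framed: payload does not fit";
  for i = 0 to header_len - 1 do
    t.{pos + i} <- Char.chr ((n lsr (8 * i)) land 0xff)
  done;
  for i = 0 to n - 1 do
    t.{pos + header_len + i} <- payload.[i]
  done;
  pos + header_len + n

(* Read one framed payload from the region [pos, pos + len); return it
   together with the index after the last byte read. *)
let read_framed (t : bigstring) ~pos ~len : string * int =
  if pos < 0 || len < 0 || pos > Bigarray.Array1.dim t - len then
    invalid_arg "read_framed: pos and len do not describe a region of t";
  if len < header_len then invalid_arg "read_framed: region too short";
  let n = ref 0 in
  for i = header_len - 1 downto 0 do
    n := (!n lsl 8) lor Char.code t.{pos + i}
  done;
  if !n < 0 || !n > len - header_len then
    invalid_arg "read_framed: payload exceeds region";
  let payload = String.init !n (fun i -> t.{pos + header_len + i}) in
  (payload, pos + header_len + !n)
```

Returning the next index from both functions lets a caller chain several writes or reads over the same buffer without tracking sizes separately.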
map_file shared fd n memory-maps n characters of the data associated with
descriptor fd to a bigstring. Iff shared is true, all changes to the bigstring
will be reflected in the file.
Users must keep in mind that operations on the resulting bigstring may result in
disk operations which block the runtime. This is true for pure OCaml operations
(such as t.{i} <- c), and for calls to blit. While some I/O operations may release the OCaml
lock, users should not expect this to be done for all operations on a bigstring
returned from map_file.
find ?pos ?len char t returns Some i for the smallest i >= pos such that
t.{i} = char, or None if there is no such i.
len defaults to length t - pos.
Same as find, but does no bounds checking, and returns a negative value instead of
None if char is not found.
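The two search variants can be sketched as follows. The bodies are assumptions for illustration: find_raw is a hypothetical name for the unchecked variant, -1 is one possible negative result, and the default of 0 for pos is a choice of the example.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Unchecked search over [pos, pos + len): returns the first matching index,
   or -1 if there is none. The caller must guarantee the region is valid. *)
let find_raw (t : bigstring) (c : char) ~pos ~len : int =
  let stop = pos + len in
  let rec go i =
    if i >= stop then -1
    else if Bigarray.Array1.unsafe_get t i = c then i
    else go (i + 1)
  in
  go pos

(* Checked search: validates the region, then wraps the result in an option. *)
let find ?(pos = 0) ?len (c : char) (t : bigstring) : int option =
  let total = Bigarray.Array1.dim t in
  let len = match len with Some l -> l | None -> total - pos in
  if pos < 0 || len < 0 || pos > total - len then
    invalid_arg "find: invalid pos or len";
  let i = find_raw t c ~pos ~len in
  if i < 0 then None else Some i
```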
unsafe_destroy bstr destroys the bigstring by deallocating its associated data or,
if memory-mapped, unmapping the corresponding file, and setting all dimensions to
zero. This effectively frees the associated memory or address-space resources
instantaneously. This feature helps work around a bug in the current OCaml
runtime, which does not correctly estimate how aggressively to reclaim such resources.
This operation is safe unless you have passed the bigstring to another thread that is performing operations on it at the same time. Access to the bigstring after this operation will yield array bounds exceptions.
Accessors for parsing binary values, analogous to binary_packing. These are in Bigstring rather than a separate module because:
1) Existing binary_packing requires copies and does not work with bigstrings.
2) The accessors rely on the implementation of bigstring, and hence should change should the implementation of bigstring move away from Bigarray.
3) Bigstring already has some external C functions, so it didn't require many changes to the OMakefile ^_^.
In a departure from Binary_packing, the naming conventions are chosen to be close to C99 stdint types, as it's a more standard description and it is somewhat useful in making compact macros for the implementations. The accessor names contain endian-ness to allow for branch-free implementations.
<accessor>  ::= <unsafe><operation><type><endian><int>
<unsafe>    ::= unsafe_ | ''
<operation> ::= get_ | set_
<type>      ::= int16 | uint16 | int32 | int64
<endian>    ::= _le | _be | ''
<int>       ::= _int | ''
The "unsafe_" prefix indicates that these functions do no bounds checking. Performance testing demonstrated that the bounds-checked versions were 2-3 times slower, because Bigstring.length is a C call, and not even a noalloc one. In practice, message parsers can check the size of an outer message once and use the unsafe accessors for individual fields, so many bounds checks end up being redundant anyway. The situation could be improved by having bigarray cache the length/dimensions.
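The "check once, then read unchecked" pattern could look like the sketch below. The accessors are written in plain OCaml for illustration rather than as the external C functions the text describes, and parse_header with its 4-byte record layout is a hypothetical example.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Unchecked accessors: the caller guarantees pos and pos + 1 are in range. *)
let unsafe_get_uint16_le (t : bigstring) ~pos : int =
  Char.code (Bigarray.Array1.unsafe_get t pos)
  lor (Char.code (Bigarray.Array1.unsafe_get t (pos + 1)) lsl 8)

let unsafe_get_uint16_be (t : bigstring) ~pos : int =
  (Char.code (Bigarray.Array1.unsafe_get t pos) lsl 8)
  lor Char.code (Bigarray.Array1.unsafe_get t (pos + 1))

(* Checked accessor: one bounds check per call, then the unchecked read. *)
let get_uint16_le (t : bigstring) ~pos : int =
  if pos < 0 || pos > Bigarray.Array1.dim t - 2 then
    invalid_arg "get_uint16_le: out of bounds";
  unsafe_get_uint16_le t ~pos

(* Hypothetical 4-byte record: a little-endian kind, then a big-endian size.
   The whole record is bounds-checked once; each field is read unchecked. *)
let parse_header (t : bigstring) ~pos : int * int =
  if pos < 0 || pos > Bigarray.Array1.dim t - 4 then
    invalid_arg "parse_header: out of bounds";
  let kind = unsafe_get_uint16_le t ~pos in
  let size = unsafe_get_uint16_be t ~pos:(pos + 2) in
  (kind, size)
```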
Similar to the usage in binary_packing, the functions below treat the value being read (or written) as an OCaml immediate integer, which actually has only 63 bits. If the user is confident that the range of values used in practice will not require 64-bit precision (i.e. less than Max_Long), then we can avoid allocation and use an immediate. If the user is wrong, an exception will be thrown (for get).
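The 63-bit restriction can be illustrated with a round-trip check. This is only a sketch: get_int64_le and get_int64_le_exn are illustrative names, the choice of Failure as the exception is an assumption, and unlike the C implementations this plain OCaml version may still allocate an intermediate Int64.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Read 8 little-endian bytes as a full 64-bit value (a boxed Int64). *)
let get_int64_le (t : bigstring) ~pos : int64 =
  if pos < 0 || pos > Bigarray.Array1.dim t - 8 then
    invalid_arg "get_int64_le: out of bounds";
  let acc = ref 0L in
  for i = 7 downto 0 do
    acc :=
      Int64.logor (Int64.shift_left !acc 8)
        (Int64.of_int (Char.code t.{pos + i}))
  done;
  !acc

(* Return the value as an immediate int, raising if it does not survive
   the conversion, i.e. if it needs more bits than a native int has. *)
let get_int64_le_exn (t : bigstring) ~pos : int =
  let v = get_int64_le t ~pos in
  let n = Int64.to_int v in
  if Int64.equal (Int64.of_int n) v then n
  else failwith "get_int64_le_exn: value does not fit in an OCaml int"
```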
Similar to Binary_packing.unpack_tail_padded_fixed_string and
Binary_packing.pack_tail_padded_fixed_string.
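A tail-padded fixed-width string field can be sketched as follows. The names set_padded and get_padded are hypothetical, and the behaviour shown (pad the unused tail with a padding character, NUL by default, and strip trailing padding on read) is an assumption of this example rather than a statement of the library's exact semantics.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Write s into the len-byte field at pos, filling the tail with padding. *)
let set_padded ?(padding = '\x00') (t : bigstring) ~pos ~len (s : string) =
  let n = String.length s in
  if pos < 0 || len < 0 || pos > Bigarray.Array1.dim t - len then
    invalid_arg "set_padded: field out of bounds";
  if n > len then invalid_arg "set_padded: string longer than field";
  for i = 0 to len - 1 do
    t.{pos + i} <- (if i < n then s.[i] else padding)
  done

(* Read the len-byte field at pos and drop any trailing padding. *)
let get_padded ?(padding = '\x00') (t : bigstring) ~pos ~len : string =
  if pos < 0 || len < 0 || pos > Bigarray.Array1.dim t - len then
    invalid_arg "get_padded: field out of bounds";
  let last = ref (len - 1) in
  while !last >= 0 && t.{pos + !last} = padding do
    decr last
  done;
  String.init (!last + 1) (fun i -> t.{pos + i})
```

One consequence of this scheme is that a string which itself ends in the padding character does not round-trip, since those trailing characters are indistinguishable from padding.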