Module Base.Bytes
OCaml's byte sequence type, semantically similar to a char array, but taking less space in memory.
A byte sequence is a mutable data structure that contains a fixed-length sequence of bytes (of type char). Each byte can be indexed in constant time for reading or writing.
include Sexpable.S with type t := t
val t_of_sexp : Sexplib0.Sexp.t -> t
val sexp_of_t : t -> Sexplib0.Sexp.t
val t_sexp_grammar : Sexp.Private.Raw_grammar.t
Common Interfaces
include Blit.S with type t := t
val blit : (t, t) Base__.Blit_intf.blit
val blito : (t, t) Base__.Blit_intf.blito
val unsafe_blit : (t, t) Base__.Blit_intf.blit
val sub : (t, t) Base__.Blit_intf.sub
val subo : (t, t) Base__.Blit_intf.subo
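As a rough illustration of the blit interface, the sketch below copies a slice between two byte sequences and extracts sub-sequences. It assumes the base library is installed; the names src, dst, middle and tail are made up for the example.

```ocaml
open Base

(* Source bytes and a destination buffer pre-filled with '.'. *)
let src = Bytes.of_string "abcdef"
let dst = Bytes.make 6 '.'

(* Copy the 3 bytes starting at index 0 of [src] into [dst] at index 2. *)
let () = Bytes.blit ~src ~src_pos:0 ~dst ~dst_pos:2 ~len:3

(* [sub] returns a fresh copy of a slice; [src] is left untouched. *)
let middle = Bytes.sub src ~pos:1 ~len:3

(* [subo] takes [pos] and [len] as optional arguments; omitting [len]
   here takes everything from [pos] to the end. *)
let tail = Bytes.subo ~pos:4 src
```

After these definitions, dst holds "..abc.", middle holds "bcd" and tail holds "ef".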
include Comparable.S with type t := t
include Base__.Comparable_intf.Polymorphic_compare
val ascending : t -> t -> int
ascending is identical to compare. descending x y = ascending y x. These are intended to be mnemonic when used like List.sort ~compare:ascending and List.sort ~compare:descending, since they cause the list to be sorted in ascending or descending order, respectively.
val descending : t -> t -> int
val between : t -> low:t -> high:t -> bool
between t ~low ~high means low <= t <= high.
val clamp_exn : t -> min:t -> max:t -> t
clamp_exn t ~min ~max returns t', the closest value to t such that between t' ~low:min ~high:max is true.
Raises if not (min <= max).
val clamp : t -> min:t -> max:t -> t Or_error.t
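The comparison functions can be sketched as follows. This assumes the base library is installed and that byte sequences compare lexicographically by byte value; the helper b and the other names are hypothetical.

```ocaml
open Base

(* Hypothetical shorthand for building byte sequences from literals. *)
let b s = Bytes.of_string s

(* "a" <= "m" <= "z", so this is true. *)
let in_range = Bytes.between (b "m") ~low:(b "a") ~high:(b "z")

(* "zz" is above the upper bound, so it is clamped down to "m". *)
let clamped = Bytes.clamp_exn (b "zz") ~min:(b "a") ~max:(b "m")

(* [clamp] reports an inverted range as an error instead of raising. *)
let inverted = Bytes.clamp (b "x") ~min:(b "z") ~max:(b "a")

(* Sorting with [ascending] and rendering the result as one string. *)
let sorted =
  List.sort ~compare:Bytes.ascending [ b "pear"; b "apple"; b "fig" ]
  |> List.map ~f:Bytes.to_string
  |> String.concat ~sep:","
```

Here sorted evaluates to "apple,fig,pear", and inverted is an Error value because min is greater than max.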
include Comparator.S with type t := t
val comparator : (t, comparator_witness) Comparator.comparator
include Base__.Comparable_intf.Validate with type t := t
val validate_lbound : min:t Maybe_bound.t -> t Validate.check
val validate_ubound : max:t Maybe_bound.t -> t Validate.check
val validate_bound : min:t Maybe_bound.t -> max:t Maybe_bound.t -> t Validate.check
Note that pp allocates in order to preserve the state of the byte sequence it was initially called with.
module To_string : sig ... end
module From_string : Blit.S_distinct with type src := string and type dst := t
val create : int -> t
create len returns a newly-allocated and uninitialized byte sequence of length len. No guarantees are made about the contents of the return value.
val make : int -> char -> t
make len c returns a newly-allocated byte sequence of length len filled with the byte c.
val map : t -> f:(char -> char) -> t
map t ~f applies function f to every byte, in order, and builds a new byte sequence from the results returned by f.
val mapi : t -> f:(int -> char -> char) -> t
Like map, but passes each character's index to f along with the char.
val init : int -> f:(int -> char) -> t
init len ~f returns a newly-allocated byte sequence of length len with index i in the sequence being initialized with the result of f i.
val of_char_list : char list -> t
of_char_list l returns a newly-allocated byte sequence where each byte in the sequence corresponds to the byte in l at the same index.
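The constructors and mapping functions above might be used as in this sketch. It assumes the base library is installed; all value names are made up for the example.

```ocaml
open Base

(* Three copies of '-'. *)
let dashes = Bytes.make 3 '-'

(* Byte i is the i-th lowercase letter, giving "abcde". *)
let alphabet =
  Bytes.init 5 ~f:(fun i -> Char.of_int_exn (Char.to_int 'a' + i))

(* One byte per list element, in list order. *)
let from_list = Bytes.of_char_list [ 'o'; 'k' ]

(* [map] builds a new sequence; [alphabet] itself is not modified. *)
let upper = Bytes.map alphabet ~f:Char.uppercase

(* [mapi] also receives the index: uppercase the even positions only. *)
let alternating =
  Bytes.mapi alphabet ~f:(fun i c ->
    if i land 1 = 0 then Char.uppercase c else c)
```

With these definitions, upper holds "ABCDE" and alternating holds "AbCdE", while alphabet still holds "abcde".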
val length : t -> int
length t returns the number of bytes in t.
val get : t -> int -> char
get t i returns the ith byte of t.
val unsafe_get : t -> int -> char
val set : t -> int -> char -> unit
set t i c sets the ith byte of t to c.
val unsafe_set : t -> int -> char -> unit
val fill : t -> pos:int -> len:int -> char -> unit
fill t ~pos ~len c modifies t in place, replacing the len bytes starting at index pos with c.
val tr : target:char -> replacement:char -> t -> unit
tr ~target ~replacement t modifies t in place, replacing every instance of target in t with replacement.
val tr_multi : target:string -> replacement:string -> (t -> unit) Staged.t
tr_multi ~target ~replacement returns an in-place function that replaces every instance of a character in target with the corresponding character in replacement.
If replacement is shorter than target, it is lengthened by repeating its last character. Empty replacement is illegal unless target also is.
If target contains multiple copies of the same character, the last corresponding replacement character is used. Note that character ranges are not supported, so ~target:"a-z" means the literal characters 'a', '-', and 'z'.
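A small sketch of the in-place operations, assuming the base library is installed. The buffer names are hypothetical, and Staged.unstage is used to obtain the function that tr_multi returns in staged form.

```ocaml
open Base

(* Overwrite the first five bytes with '*'. *)
let buf = Bytes.of_string "hello world"
let () = Bytes.fill buf ~pos:0 ~len:5 '*'

(* Replace every '/' with '.'. *)
let path = Bytes.of_string "a/b/c"
let () = Bytes.tr ~target:'/' ~replacement:'.' path

(* [replacement] is shorter than [target], so its last character is
   repeated: 'a' maps to 'x', while 'b' and 'c' both map to 'y'. *)
let translate = Staged.unstage (Bytes.tr_multi ~target:"abc" ~replacement:"xy")
let word = Bytes.of_string "aabbcc-z"
let () = translate word
```

Afterwards buf holds "***** world", path holds "a.b.c" and word holds "xxyyyy-z". Staging lets the translation table be built once and reused across many calls to translate.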
val to_list : t -> char list
to_list t returns the bytes in t as a list of chars.
val to_array : t -> char array
to_array t returns the bytes in t as an array of chars.
val fold : t -> init:'a -> f:('a -> char -> 'a) -> 'a
fold t ~init ~f is f (... (f (f init c1) c2) ...) cn, where c1 ... cn are the bytes of t in order.
val foldi : t -> init:'a -> f:(int -> 'a -> char -> 'a) -> 'a
foldi works similarly to fold, but also passes the index of each character to f.
val contains : ?pos:int -> ?len:int -> t -> char -> bool
contains ?pos ?len t c returns true iff c appears in the len bytes of t starting at index pos.
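The traversal functions can be illustrated with the following sketch, which assumes the base library is installed. The value names are hypothetical.

```ocaml
open Base

let t = Bytes.of_string "a1b22"

(* Sum the decimal digits, visiting bytes from left to right. *)
let digit_sum =
  Bytes.fold t ~init:0 ~f:(fun acc c ->
    if Char.is_digit c then acc + (Char.to_int c - Char.to_int '0') else acc)

(* Consing onto the accumulator reverses the bytes, which shows that
   the fold runs from the first byte to the last. *)
let reversed =
  Bytes.fold t ~init:[] ~f:(fun acc c -> c :: acc) |> String.of_char_list

(* [foldi] also receives the index: remember the last position of '2'. *)
let last_two =
  Bytes.foldi t ~init:(-1) ~f:(fun i acc c ->
    if Char.equal c '2' then i else acc)

(* Search the whole sequence, then only the first two bytes. *)
let has_b = Bytes.contains t 'b'
let b_in_prefix = Bytes.contains ~pos:0 ~len:2 t 'b'
```

Here digit_sum is 5, reversed is "22b1a", last_two is 4, has_b is true and b_in_prefix is false.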
val max_length : int
Maximum length of a byte sequence, which is architecture-dependent. Attempting to create a Bytes larger than this will raise an exception.
Unsafe conversions (for advanced users)
This section describes unsafe, low-level conversion functions between bytes and string. They might not copy the internal data; used improperly, they can break the immutability invariant on strings provided by the -safe-string option. They are available for expert library authors, but for most purposes you should use the always-correct Bytes.to_string and Bytes.of_string instead.
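For contrast with the unsafe functions, here is a minimal sketch of the copying conversions, assuming the base library is installed. Because Bytes.to_string copies, a later mutation of the byte sequence is not visible through the string. The names buf and snapshot are made up for the example.

```ocaml
open Base

let buf = Bytes.of_string "abc"

(* [to_string] copies the data, so [snapshot] is independent of [buf]. *)
let snapshot = Bytes.to_string buf

(* Mutating [buf] afterwards leaves [snapshot] unchanged. *)
let () = Bytes.set buf 0 'X'
```

After the mutation, buf holds "Xbc" while snapshot is still "abc".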
val unsafe_to_string : no_mutation_while_string_reachable:t -> string
Unsafely convert a byte sequence into a string.
To reason about the use of unsafe_to_string, it is convenient to consider an "ownership" discipline. A piece of code that manipulates some data "owns" it; there are several disjoint ownership modes, including:
- Unique ownership: the data may be accessed and mutated.
- Shared ownership: the data has several owners, that may only access it, not mutate it.
Unique ownership is linear: passing the data to another piece of code means giving up ownership (we cannot access the data again). A unique owner may decide to make the data shared (giving up mutation rights on it), but shared data may not become uniquely-owned again.
unsafe_to_string s can only be used when the caller owns the byte sequence s -- either uniquely or as shared immutable data. The caller gives up ownership of s, and gains (the same mode of) ownership of the returned string. There are two valid use-cases that respect this ownership discipline:
The first is creating a string by initializing and mutating a byte sequence that is never changed after initialization is performed.
    let string_init len f : string =
      let s = Bytes.create len in
      for i = 0 to len - 1 do
        Bytes.set s i (f i)
      done;
      Bytes.unsafe_to_string ~no_mutation_while_string_reachable:s
This function is safe because the byte sequence s will never be accessed or mutated after unsafe_to_string is called. The string_init code gives up ownership of s, and returns the ownership of the resulting string to its caller.
Note that it would be unsafe if s was passed as an additional parameter to the function f as it could escape this way and be mutated in the future -- string_init would give up ownership of s to pass it to f, and could not call unsafe_to_string safely.
We have provided the String.init, String.map and String.mapi functions to cover most cases of building new strings. You should prefer those over to_string or unsafe_to_string whenever applicable.
The second is temporarily giving ownership of a byte sequence to a function that expects a uniquely owned string and returns ownership back, so that we can mutate the sequence again after the call ended.
    let bytes_length (s : bytes) =
      String.length (Bytes.unsafe_to_string ~no_mutation_while_string_reachable:s)
In this use-case, we do not promise that s will never be mutated after the call to bytes_length s. The String.length function temporarily borrows unique ownership of the byte sequence (and sees it as a string), but returns this ownership back to the caller, which may assume that s is still a valid byte sequence after the call. Note that this is only correct because we know that String.length does not capture its argument -- it could escape by a side-channel such as a memoization combinator. The caller may not mutate s while the string is borrowed (it has temporarily given up ownership). This affects concurrent programs, but also higher-order functions: if String.length returned a closure to be called later, s should not be mutated until this closure is fully applied and returns ownership.
val unsafe_of_string_promise_no_mutation : string -> t
Unsafely convert a shared string to a byte sequence that should not be mutated.
The same ownership discipline that makes unsafe_to_string correct applies to unsafe_of_string_promise_no_mutation; however, unique ownership of string values is extremely difficult to reason about correctly in practice. As such, one should always assume strings are shared, never uniquely owned. For example, string literals are implicitly shared by the compiler, so you never uniquely own them.
The only case we have reasonable confidence is safe is if the produced bytes is shared -- used as an immutable byte sequence. This is possibly useful for incremental migration of low-level programs that manipulate immutable sequences of bytes (for example Marshal.from_bytes) and previously used the string type for this purpose.
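A read-only use of the resulting byte sequence might look like the sketch below, assuming the base library is installed. The helper count_byte is hypothetical; it only reads from the bytes view and never lets it escape, which is what keeps the string's immutability intact.

```ocaml
open Base

(* Hypothetical helper: count occurrences of [c] in [s] through a
   bytes view of the string, without copying the data. *)
let count_byte (s : string) (c : char) : int =
  (* The view must never be mutated or stored anywhere: [s] is shared. *)
  let view = Bytes.unsafe_of_string_promise_no_mutation s in
  Bytes.fold view ~init:0 ~f:(fun n b -> if Char.equal b c then n + 1 else n)

let a_count = count_byte "banana" 'a'
```

In this sketch a_count is 3. For ordinary code the same result is available from the String module directly, so the conversion is only worth considering when an existing bytes-based reader has to be reused.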