## 113.33.01

### core

- Fix build problem on BSD related to `endian.h`.

### core_kernel

- Fix segfault in `Stack`.
- Fix BSD build problem related to `endian.h`.

## 113.33.00

### async

- Keep up to date with interface changes in `Async_kernel`, `Async_extra`
  and `Async_unix`.

### async_extended

- Create a library to fork and do calculations in a child process.
- `Command_rpc.with_close` returns an `Or_error.t Deferred.t`, and indeed
  if it fails to execute the binary, it returns an `Error _` rather than
  raising.  However, if it successfully executes the binary but the
  handshake then fails, it raises an exception.

  A handshake failure after successfully executing the binary is a
  reasonable thing to want to catch: it's what you get if you are
  starting the command-rpc slave via ssh and there's some problem on the
  remote side.

  I haven't thought very hard about what happens if the subprocess dies
  after a successful handshake, or about the implications of the
  `Writer`'s monitor being the monitor in effect when `with_close` is
  called.  I think heartbeats will kill the connection before this
  becomes an issue.
- Remove `Async_extended.Deprecated_async_bench`.

  It was causing problems for the open-source release on platforms that
  don't have `Posix_clock.gettime`:
  https://github.com/janestreet/async_extended/issues/1

  A module with `Deprecated` in its name and no uses should just be
  binned.
- Add a simple TCP proxy, useful in testing for simulating network
  issues.

### async_extra

- Rename `Async.Std.Schedule.every_{enter,tag_change}` to
  `Async.Std.Schedule.every_{enter,tag_change}_without_pushback` and
  introduce `Async.Std.Schedule.every_{enter,tag_change}` with pushback
  behavior.  The resulting functions are analogous to
  `Pipe.iter_without_pushback` and `Pipe.iter`.
- Replaced `Persistent_rpc_client` with a successor,
  `Persistent_connection`, for maintaining persistent connections to all
  manner of services, not just RPC servers.
- Make `Bus.pipe1_exn` take a `Source_position.t`, to be more consistent
  with `first_exn` and `iter_exn`.  This also gives better debug sexps on
  `Bus` when multiple pipes are subscribed to the bus.

### async_kernel

- In `Pipe`, deprecated `read_at_most` and `read_now_at_most`, and
  replaced them with `read'` and `read_now'`, respectively.
- Reverted a recent feature that changed `Pipe`'s default
  `max_queue_length` from `Int.max_value` to `100`.  It looks like 100 is
  too small to keep `Pipe` overhead to a minimum -- we saw a noticeable
  slowdown in `Async.Log`.  We're going to do some more benchmarking and
  try to reduce the overhead and choose a better number.
- Added `Async.Std.Let_syntax = Deferred.Let_syntax`, which makes it
  possible to use `%bind` and `%map` after `open Async.Std`.  This
  required fixing a few places where the new `Let_syntax` in scope
  shadowed `Command.Param.Let_syntax`.
- Improve the stack trace for an unhandled async exception in
  js_of_ocaml.  Wrapping the unhandled exception with `Error.create`
  serializes the exception and prevents good error reporting in
  js_of_ocaml.
- Reworked `Async_kernel`'s clock handling to make it possible to use a
  notion of time distinct from the wall-clock time used by `Clock_ns` and
  maintained by the Async scheduler.

  There is now a self-contained `Time_source` module, a data structure
  that holds a timing wheel.  All of the code that was in `Clock_ns` and
  implicitly used the Async scheduler and its timing wheel is now in
  `Time_source`, suitably parameterized to work on an arbitrary time
  source with an arbitrary timing wheel.  `Clock_ns` is now a wrapper
  around `Time_source` that implicitly uses the Async scheduler's time
  source.  `Time_source` still uses the Async scheduler for running jobs
  that fire -- the new power comes from the user being able to advance
  time distinctly from the wall clock.
- Changed `Time_source.sexp_of_t` to display "" for the wall-clock time
  source.
  We don't want to display the events because they can be ridiculously
  large.  And for non-wall-clock time sources, don't show the `Job.t`s in
  events, because they are uninformative pool pointers.
- Added `Time_source.advance_by`.
- Added `Time_source.alarm_precision`.
- Added to `Async.Scheduler` `Time_ns` analogs of `cycle_start` and
  `cycle_times`:

      val cycle_start_ns : unit -> Time_ns.t
      val cycle_times_ns : unit -> Time_ns.Span.t Stream.t

- Add `Deferred.map_choice` for transforming the output of
  `Deferred.choice`.
- Remove an avoidable call to `is_determined` in `Deferred.value_exn`.

  The patch is meant to preserve the current error messages without
  adding any allocation overhead compared to the existing code.

  Context: an implementation of `Eager_deferred` is in the works that
  essentially wants to redefine `map`, `bind`, etc. in the following way:

      let map t ~f =
        if Deferred.is_determined t
        then return (f (Deferred.value_exn t))
        else Deferred.map t ~f
      ;;

- Add `Pipe.read_choice` to encode the appropriate way of combining
  `values_available` and `read_now` to conditionally read from a pipe.

### async_parallel

- Add an option `~close_stdout_and_stderr` to
  `Async_parallel_deprecated.Std.Parallel.init` to close the `stdout` and
  `stderr` fds.

  This is needed when using `Async_parallel_deprecated` in a daemonized
  process, such as the in-development jenga server.  Without this option,
  calling:

      Process.run ~prog:"jenga" ~args:["server"; "start"]

  from build-manager is problematic because the resulting deferred never
  becomes determined.

### async_rpc_kernel

- Cleans up the implementation-side interface for aborting `Pipe_rpc`s.

  Summary

  The `aborted` `Deferred.t` that got passed to `Pipe_rpc`
  implementations is gone.  The situations where it would have been
  determined now close the reading end of the user-supplied pipe instead.
  Details

  Previously, when an RPC dispatcher decided to abort a query, the RPC
  implementation would get its `aborted` `Deferred.t` filled in, but
  would remain free to write some final elements to the pipe.

  This is a little bit more flexible than the new interface, but it's
  also problematic in that the implementer could easily just not pay
  attention to `aborted`.  (They're not obligated to pay attention to
  when the pipe is closed, either, but at least they can't keep writing
  to it.)  We don't think the extra flexibility was used at all.

  In the future, we may also simplify the client side to remove the
  `abort` function on the dispatch side (at least when not using
  `dispatch_iter`).  For the time being it remains, but closing the
  received pipe is the preferred way of aborting the query.

  There are a couple of known ways this might have changed behavior from
  before.  Both of these appear not to cause problems in the jane repo.

  - In the past, an implementation could write to a pipe as long as the
    client didn't disconnect, even if it closed its pipe.  Now writes
    will raise after a client closes its pipe (or calls `abort`), since
    the implementor's pipe will also be closed in this case.  Doing this
    was already unsafe, though, since the pipe *was* closed if the RPC
    connection was closed.
  - `aborted` was only determined if a client aborted the query or the
    connection was closed.  The new alternative, `Pipe.closed` called on
    the returned pipe, will also be determined if the implementation
    closes the pipe itself.  This is unlikely to cause serious issues but
    could potentially cause some confusing logging.
- Blocking RPC implementations (i.e., ones made with `Rpc.implement'`)
  now capture the backtrace when they raise.  This is consistent with
  what happens in the deferred implementation case, since in that case
  the implementation is run inside a `Monitor.try_with`, which captures
  backtraces as well.
  Here's an example of what a new error looks like:

      ((rpc_error
        (Uncaught_exn
         ((location "server-side blocking rpc computation")
          (exn (Invalid_argument
                "Float.iround_up_exn: argument (100000000000000000000.000000) is too large"))
          (backtrace
           "Raised at file \"pervasives.ml\", line 31, characters 25-45\
            \nCalled from file \"result.ml\", line 43, characters 17-22\
            \nCalled from file \"monad.ml\", line 17, characters 20-28\
            \n"))))
       (connection_description ("Client connected via TCP" (localhost 3333)))
       (rpc_tag foo)
       (rpc_version 1))

- Actually deprecated the `deprecated_dispatch_multi` functions in
  `Versioned_rpc`.

### async_smtp

- Allow an smtp_client to connect to a local unix socket.
- Better logging.

### async_ssl

- Make sure to close the `Pipe.Writer.t` that goes back to the
  application; otherwise the application will never get an `Eof` if the
  connection is closed.

### async_unix

- Expose a function to get the number of jobs that async has run since
  `Scheduler.go`.
- In `Log`, change `sexp` to be immediate rather than lazy.  This changes
  the behavior of the `sexp` function.

  I've gone through the entire tree rewriting `sexp` to `create_s`,
  attempting to follow these guidelines:

  - If the logging is clearly intended to always happen (error logging,
    transaction log logging, etc.), I translated the call unguarded.
  - If the logging was already guarded with an if statement, I translated
    the call unguarded.
  - Otherwise, I added a call to `Log.would_log` as a guarding if.  This
    duplicates the behavior of `sexp`, which would avoid the sexp
    conversion if the log message was never going to be logged.
- `Writer.with_file_atomic` checked whether the destination file exists
  and then stat'd it to get the existing permissions.  The file can go
  away before the call to stat, which would make `with_file_atomic` error
  out, which seems harsh.  It is better to continue on in this case.
  Besides, `Async.file_exists` is essentially just another stat, so we
  might as well just do the stat without checking whether the file exists
  or not.
- Moved various `let%test`s in the `Async_unix` library into their own
  module.

  I'd like the `let%test`s to be in files that depend on `Std`, so that
  they can pick up a top-level side effect that `Std` does, specifically
  the assignment of `Monitor.try_with_log_exn`.  The child feature of
  this feature makes the change to do that assignment.
- Changed the `Async_unix` library's assignment of
  `Monitor.try_with_log_exn` to be a top-level side effect that happens
  if `Async_unix.Std` is used.

  This fixes a bug in which `Thread_safe.block_on_async` (and other bits
  of `Async_unix`) failed to do the assignment.  The assignment was part
  of `Scheduler.go`, which isn't called by `Thread_safe.block_on_async`.
  This in turn caused programs that use `Thread_safe.block_on_async` and
  do not use `Scheduler.go`, like inline tests, to raise if they needed
  to report an exception raised to a `Monitor.try_with` that already
  returned.
- Deprecate the (blocking) `Sexp.save_*` functions in `Async_unix.Std`.

  Please do not switch from `Sexp.save*` to `Writer.save_sexp*` in this
  feature -- this feature is just for the deprecation.  You can do the
  switch in a subfeature or later feature.

  A note about `Core_extended.Sexp`
  ---------------------------------

  This feature causes surprising behavior in `Core_extended.Sexp`, where
  opening `Async.Std` undoes any previous open of `Core_extended.Std`:

      open Core.Std
      open Core_extended.Std

      type t = int Sexp.Comprehension.t [@@deriving sexp]

      open Async.Std

      type u = int Sexp.Comprehension.t [@@deriving sexp]

  The type `t` is fine, but `u` fails to compile because at that point
  the module `Sexp` has no submodule `Comprehension`.  But we already
  have the same problem with module `Unix` if you open `Async.Std` after
  `Core_extended.Std`, so I left the `Sexp` module as is.
- Added `Async.Unix.Stats.compare`.
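The `Writer.with_file_atomic` entry above boils down to "just stat, and treat a vanished file like a nonexistent one".  A minimal sketch of that idea, using the stdlib `Unix` module (a hypothetical helper for illustration, not Async's actual implementation):

```ocaml
(* If the destination disappears between our decision to look and the
   stat itself, fall back to default permissions instead of erroring. *)
let existing_perm path ~default =
  match (Unix.stat path).Unix.st_perm with
  | perm -> perm
  | exception Unix.Unix_error (Unix.ENOENT, _, _) -> default
```

The point is that there is no separate existence check to race against: the single `stat` either succeeds or tells us the file is gone.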
- Changed Async's logging of exceptions raised to a `Monitor.try_with`
  that already returned to use `Log.Global.sexp` rather than
  `Log.Global.error`.  For logs that use single-line sexps, that makes
  them much nicer.
- Added to `Async.Scheduler` `Time_ns` analogs of `cycle_start` and
  `cycle_times`:

      val cycle_start_ns : unit -> Time_ns.t
      val cycle_times_ns : unit -> Time_ns.Span.t Stream.t

- I'm tired of having to write `>>| ok_exn` after most calls to
  `Process`; let's just make `Process` provide these functions directly.
  Also, you can't write `>>| ok_exn` nicely anyway when using `let%bind`
  and friends.
- Added a test demonstrating that `Async.Process.create` returns `Ok`
  when `exec` fails, and updated `create`'s documentation.

### bignum

- This release improves the slow path of bignum-of-string conversion.
  The previous version used a split on `'_'` followed by a concat, which
  allocated a bunch of intermediate strings.

### core

- Updated to follow core_kernel's evolution.
- Add variance annotations to a few types in `Command`.
- `Command.Param.choice` represents a sum type.
- Added `Command.Spec.of_param` for conversion from new-style to
  old-style command line specifications.  This is expected to be useful
  for gradually converting Spec-based commands into Param-based ones.
- `flag_names` and `anon_names` give the flags and anons used by a
  `Command.Param.t`, while `mnemonic` combines them to give a string
  that's good for referring to that param in error messages.  This
  feature improves the interface of `Command.Param.one_of` by saving the
  caller from passing the flag names separately.
- Remove the closure allocation from `Date0.create_exn`.
- Many apps take schedules as command line parameters (to determine when
  they'll be up, or when they should raise issues, etc.).  Let's add a
  `Schedule.Stable.V4.flag` function to generate the command line param,
  to make writing this common pattern easier.
- Add `Ofday.Option` to `time_ns.ml`.
  Exports `Time_ns.Span.Option.unchecked_value` and
  `Time_ns.Ofday.Option.unchecked_value`!
- Finally fix `Time_ns.Ofday.of_local_time` w.r.t. DST transitions, after
  regression test failures at the DST transition.  As a result, clarified
  slightly the role of `Time_ns.Ofday`, like `Time.Ofday`, as 24h
  wall-clock times, not linear time offsets -- the latter would have to
  be up to 25h for DST.
- Adds `Command.Spec.Arg_type.percent`.
- Add `Stable.V1` to `Time_ns.Ofday.Option.t`, add it to `Core.Stable`,
  and standardize the interface to `Option` modules in `Time_ns`.
- Add `Time_ns.to_ofday`.

### core_extended

- Update to follow `core[_kernel]` evolution.

### core_kernel

- Rename `Tuple.T2.map1` and `Tuple.T2.map2` to `map_fst` and `map_snd`,
  following the convention of the accessors.  Usually, numeric suffixes
  mean higher arities, not different fields.
- After discussion, rather than checking for overflow in `Time_ns`
  arithmetic, clarify that it's silently ignored.  (Subsequent
  conversions may or may not notice.)  We did identify the set of
  functions to document:

      Time_ns.Span.((+), (-), scale_int, scale_int63, create, of_parts)
      Time_ns.(add, sub, diff, abs_diff, next_multiple)

  Added `Core_int63.(add_with_overflow_exn, abs_with_overflow_exn,
  neg_with_overflow_exn)` in the course of abandoned work on overflow
  detection in `Time_ns`.  These may be useful.

  `mul_with_overflow_exn` was abandoned because:

  1. it's a lot of work to review
  2. there's a better approach: go to the C level and check the high word
     of the product from the IMUL instruction, which is both simpler and
     faster.
- Changes related to `Float.sign`.

  `Float.sign` currently maps -1e-8 and nan to `Zero`.  Some users don't
  expect this.  This feature addresses that via the following changes:

  - `Float.sign` renamed to `Float.robust_sign`.  (`Float.sign` is still
    present, but deprecated.)
  - `Float.sign_exn` introduced, which does not use robust comparison,
    and which raises on `nan`.
  - `Sign` pulled out of `Float` and made its own module, and
    `sign : t -> Sign.t` added to `Comparable.With_zero`.  In particular,
    `Int.sign : int -> Sign.t` now exists.  (In looking at existing uses
    of `Float.sign`, there was at least one case where the user was
    converting an int to a float to get its sign.  That ended up being
    deleted, but it still seems desirable to have `Int.sign`.  There were
    also record types with a field `sign : Float.Sign.t` where logically
    the type has no real connection to `Float`.)
  - Uses of `Float.robust_sign` revisited to make sure that's the
    behavior we want.
- Added Quickcheck tests for `Hashtbl` functions, using `Map` as a point
  of comparison.  A lot of `Hashtbl` functions did not have tests, so I
  used an interface trick to require every function to show up in the
  test module.  The easiest way to write a readable test for every
  function was to compare it to a similar datatype, so I went with `Map`.
- Allow clients of `Hashtbl.incr` and `Hashtbl.decr` to specify that
  entries should be removed if the value is 0.
- Fix a typo in the type of `symmetric_diff`.
- Switched `Timing_wheel_ns` to use `%message` and `%sexp`.
- Improved the error message raised by `Timing_wheel.add` and
  `reschedule` if the supplied time is before the start of the current
  interval.  Previously, the error message was something like:

      ("Timing_wheel.Priority_queue got invalid key"
       (key ...) (timing_wheel ...))

  Now, it will be like:

      ("Timing_wheel cannot schedule alarm before start of current interval"
       (at ...) (now_interval_num_start ...))

  The old message was confusing because `key` is less understandable than
  `at`, and because it didn't make clear that this is a usage error, as
  opposed to a `Timing_wheel` bug.  Implementing this efficiently
  required adding a field to the timing wheel:

      mutable now_interval_num_start : Time_ns.t

  so that the check done by `add` is fast.
- Add `Comparable.Make_using_comparator`.
  Since `Map` and `Set` already have `Make_using_comparator` functors,
  there's no particular reason for `Comparable` not to have one.  More
  concretely, this will be useful when (if) we add stable containers to
  core: we can add a stable version of `Comparable.Make`, then pass the
  resulting comparator into the unstable functor to get equal types.
- Add a `Total_map.Make_using_comparator` functor to allow the creation
  of total maps which are type-equivalent to regular maps.
- Change default major heap increments to be a percentage of the heap
  size instead of a constant increment.  Also changed the type of
  overhead parameters to `Percent.t`.
- Remove modules `Core.Std.Sexp.Sexp_{option,list,array,opaque}`, which
  used to allow binability for types `sexp_option`, `sexp_list`, etc.,
  but now serve no purpose.
- Change the signature of the output of
  `Comparable.Make{,_binable}_using_comparator` to not include
  `comparator_witness`.  This is so that code like this will compile:

      include T
      include Comparable.Make_using_comparator (T)

- Changed `Timing_wheel_ns` so that it only supports times at or after
  the epoch, i.e. only non-negative `Time_ns.t` values.  Times before the
  epoch aren't needed, and supporting negative times unnecessarily
  complicates the implementation.  Removed fields from
  `Timing_wheel_ns.t` that are now constants:

      ; min_time         : Time_ns.t
      ; max_time         : Time_ns.t
      ; min_interval_num : Interval_num.t

- In `Timing_wheel.t`, cache `alarm_upper_bound`, which allows us to give
  an improved error message when `Timing_wheel.add` is called with a time
  beyond `alarm_upper_bound`.
- Add `Sequence.merge_with_duplicates`:

      (** [merge_with_duplicates t1 t2 ~cmp] interleaves the elements of
          [t1] and [t2].  Where the two next available elements of [t1]
          and [t2] are unequal according to [cmp], the smaller is
          produced first.
          *)
      val merge_with_duplicates
        :  'a t
        -> 'a t
        -> cmp:('a -> 'a -> int)
        -> 'a Merge_with_duplicates_element.t t

      module Merge_with_duplicates_element : sig
        type 'a t =
          | Left  of 'a
          | Right of 'a
          | Both  of 'a * 'a
        [@@deriving bin_io, compare, sexp]
      end

- Add `Set.merge_to_sequence`:

      (** Produces the elements of the two sets between
          [greater_or_equal_to] and [less_or_equal_to] in [order], noting
          whether each element appears in the left set, the right set, or
          both.  In the both case, both elements are returned, in case
          the caller can distinguish between elements that are equal to
          the sets' comparator.  Runs in O(length t + length t'). *)
      val merge_to_sequence
        :  ?order:[ `Increasing (** default *) | `Decreasing ]
        -> ?greater_or_equal_to:'a
        -> ?less_or_equal_to:'a
        -> ('a, 'cmp) t
        -> ('a, 'cmp) t
        -> 'a Merge_to_sequence_element.t Sequence.t

      module Merge_to_sequence_element : sig
        type 'a t = 'a Sequence.Merge_with_duplicates_element.t =
          | Left  of 'a
          | Right of 'a
          | Both  of 'a * 'a
        [@@deriving bin_io, compare, sexp]
      end

- Make `Hashtbl.merge_into` take an explicit variant type:

      type 'a merge_into_action = Remove | Set_to of 'a

      val merge_into
        :  f:(key:'k key -> 'a1 -> 'a2 option -> 'a2 merge_into_action)
        -> src:('k, 'a1) t
        -> dst:('k, 'a2) t
        -> unit

  `f` used to return `'a2 option`, and it was unclear whether `None`
  meant do nothing or remove.  (It meant do nothing.)
- Some red-black tree code showed a 15-20% speedup by inlining the
  comparison, rather than relying on the `caml_int_compare` external
  call.  I've tried to cleanly apply it to `Core_int` (though it can't
  really be done without an `Obj.magic`); this may be a better fit for a
  compiler patch to treat int comparisons as an intrinsic.

  Testing
  -------

  Added an inline test for boundary cases.  Presently it returns values
  identical to `caml_int_compare`, though probably it should only be held
  to same-sign results.
- Rename `Hashtbl.[filter_]replace_all[i]` to
  `Hashtbl.[filter_]map[i]_inplace`.
- Import `debug.ml` from incremental.
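The ambiguity that motivated the `Hashtbl.merge_into` change above is easy to see in a self-contained sketch.  Here is a toy re-implementation over the stdlib `Hashtbl` (a hypothetical stand-in for illustration, not Core's actual code), merging counters and dropping entries whose total reaches zero:

```ocaml
(* Returning an explicit action makes "remove the key" and "leave dst
   unchanged" impossible to confuse, unlike the old ['a2 option]. *)
type 'a merge_into_action = Remove | Set_to of 'a

let merge_into ~f ~src ~dst =
  Hashtbl.iter
    (fun key a1 ->
       match f ~key a1 (Hashtbl.find_opt dst key) with
       | Remove -> Hashtbl.remove dst key
       | Set_to a2 -> Hashtbl.replace dst key a2)
    src

let () =
  let src = Hashtbl.create 8 and dst = Hashtbl.create 8 in
  Hashtbl.replace src "a" 1;
  Hashtbl.replace src "b" 2;
  Hashtbl.replace dst "a" (-1);
  (* Sum counts; a zero total means the entry should disappear. *)
  merge_into ~src ~dst ~f:(fun ~key:_ a1 a2 ->
    match a1 + Option.value a2 ~default:0 with
    | 0 -> Remove
    | n -> Set_to n);
  assert (Hashtbl.find_opt dst "a" = None);
  assert (Hashtbl.find_opt dst "b" = Some 2)
```

With the old interface, the "a" entry above would silently have been left in `dst` if the caller returned `None` expecting a removal.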
- `Float_intf` used `float` in some places where it meant `t`.
- Added `Identifiable.Make_with_comparator`, for situations where you
  want to preserve an already-known comparator, for example to define
  stable sets and maps that are type-equivalent to unstable sets and
  maps.

### incremental

- Made it possible to use Incremental without invalidation -- i.e. a
  change in the left-hand side of a bind does *not* invalidate nodes
  created on the right-hand side.  This is configurable via the argument
  to `Incremental.Make_with_config`, which now takes:

      val bind_lhs_change_should_invalidate_rhs : bool

  So, one can now build an Incremental without invalidation using:

      Incremental.Make_with_config (struct
        include Incremental.Config.Default ()
        let bind_lhs_change_should_invalidate_rhs = false
      end) ()

  Implementation
  --------------

  The implementation is simple: when a bind rhs changes, instead of
  `invalidate_nodes_created_on_rhs`, we `rescope_nodes_created_on_rhs`,
  which moves the nodes up to the bind's parent.

  Testing
  -------

  Turned the unit tests into a functor parameterized on
  `bind_lhs_change_should_invalidate_rhs`, and ran them with both `true`
  and `false`.  Modified tests where necessary to skip tests of
  invalidity when `bind_lhs_change_should_invalidate_rhs = false`.
  Added a unit test of `bind_lhs_change_should_invalidate_rhs = true`
  that makes sure a node created on a bind rhs whose lhs subsequently
  changes continues to stabilize correctly.
- Split incremental into a part that can run in javascript,
  `incremental_kernel`, and the rest.

### jenga

- Jenga's `Api` is extended with `printf` and `printf_verbose`, and
  `verbose` is removed.

  These changes are in preparation for jenga/server.  In jenga/server,
  builds will run on a daemonized server; the build trace is displayed on
  any clients which are `watch`ing.  `Api.printf` and
  `Api.printf_verbose` allow messages from the `jenga/root.ml`.
  The command line flag `-verbose` will be a client flag, not a server
  flag, allowing the verbosity to be selected per-client.  It therefore
  won't make sense to allow `jenga/root.ml` to ask whether `verbose` is
  true, so we remove `verbose` from the jenga `Api`.  Instead,
  `jenga/root.ml` can use `printf_verbose` for messages which are to be
  displayed only by a client that is watching with `-verbose`.
- Make the benchmark analysis script compute heap stats in addition to
  time.
- Give more information about why we build `Dep.action` and
  `Dep.action_stdout`, because right now the output of jenga sometimes
  doesn't contain that information (the summary line contains it, but
  it's not always printed, at least when using `-stop-on-error`).

  For instance, old jenga:

      *** jenga: ERROR: (summary) a/.DEFAULT: External command failed
      - build a stdout
      + bash -e -u -o pipefail -c false
      - exit a stdout, 3ms, exited with code 1
      *** jenga: ERROR: (summary) a/foo.ml.d: External command failed
      - build a stdout
      + .../ocamldep.opt -modules -impl foo.ml
      foo.ml:
      File "foo.ml", line 2, characters 0-0:
      Error: Syntax error
      - exit a stdout, 4ms, exited with code 2

  New jenga:

      *** jenga: ERROR: (summary) a/.DEFAULT: External command failed
      - build a .DEFAULT
      + bash -e -u -o pipefail -c false
      - exit a .DEFAULT, 5ms, exited with code 1
      *** jenga: ERROR: (summary) a/foo.ml.d: External command failed
      - build a rule of foo.ml.d
      + .../ocamldep.opt -modules -impl foo.ml
      foo.ml:
      File "foo.ml", line 2, characters 0-0:
      Error: Syntax error
      - exit a rule of foo.ml.d, 6ms, exited with code 2

- When:

  - a command run by jenga forks,
  - and that fork inherits the stdout/stderr pipes to jenga,
  - and the fork doesn't terminate,
  - but the initial process does terminate,

  we are in a situation where jenga, before this feature, would consider
  the command still running.
  This makes for some annoying debugging: which command started the stray
  process is hard to tell, the stdout and stderr of the initial process
  are not visible anywhere (they are in jenga's memory), etc.  Also,
  jenga appears stuck.

  So instead, we expect the stdout/stderr to close shortly after the
  initial process dies, and if not, we close stdout/stderr ourselves and
  consider that the command failed.

  To test, I put this in a jbuild:

      (alias (
        (name DEFAULT)
        (deps ())
        (action "(echo false >&2; sleep 4) & ")))

  and `JENGA_OPTIONS='((fd_close_timeout 2s))' jenga -P` complained:

      *** jenga: ERROR: (summary) lib/sexplib/src/.DEFAULT: External command failed
      - build lib/sexplib/src stdout
      + bash -e -u -o pipefail -c '(echo false >&2; sleep 4) & '
      false
      - exit lib/sexplib/src stdout, 2.004s, stdout or stderr wasn't closed 2s after process exited (due to a stray process perhaps?)

  Also, I had hydra do a full tree build (both in jane-continuous-release
  and jane-release) and they both succeeded.
- Remove the jenga command line flag `-delay N`.

  This controlled `Config.delay_for_dev`, which inserted a delay of N
  seconds before running any job in `Job.run`.  I have heard this
  described as the most pointless command line flag of all time!

  Reason to fix now: another feature, `jane/jenga/api.printf`, removes
  `Api.verbose`, which exposes `Config.verbose` to jenga/root.ml.
  Following both features, we can remove the `Run.For_user` hackery.
- Avoid `nan` from divide-by-0 in jenga monitor's
  `finish_time_estimator`.

  Without this fix, we may get `nan` as the value of a `Time.t`, which
  breaks an invariant of the `Time` module.  This may result in a crash
  in `Time.to_ofday`:
      ("unhandled exception"
       ((monitor.ml.Error_
         ((exn "Option.value_exn None")
          (backtrace
           ("Raised by primitive operation at file \"option.ml\", line 59, characters 4-21"
            "Called from file \"zone.ml\", line 751, characters 8-104"
            "Called from file \"zone.ml\", line 761, characters 10-48"
            "Called from file \"time0.ml\", line 275, characters 31-54"
            "Called from file \"finish_time_estimator.ml\", line 38, characters 14-54"
            "Called from file \"message.ml\", line 384, characters 30-66"
            "Called from file \"list.ml\", line 73, characters 12-15"
            "Called from file \"list.ml\", line 73, characters 12-15"
            "Called from file \"message.ml\", line 158, characters 2-42"
            "Called from file \"deferred0.ml\", line 65, characters 64-69"
            "Called from file \"job_queue.ml\", line 160, characters 6-47"
            ""))
          (monitor
           (((name try_with) (here ()) (id 4) (has_seen_error true)
             (is_detached true))))))
        ((pid 17688) (thread_id 4

- Code refactoring to avoid circular module dependencies in the
  `jenga/server` feature.  These changes have no semantic effect; it's
  just code movement.

  1. Extract `Message.Job_summary` as a top-level module.
  2. Extract `Build.Run_reason` as a top-level module.
  3. Create all `Effort.Counter` instances in `Progress`.
- The current cycle detection in build.ml is very hard to use correctly,
  so implement robust cycle detection in Tenacious.

  Additionally, this feature lets the user see what jenga is currently
  working on by running `jenga diagnostics dump-tenacious-graph`.

  Along the way, fixed `jenga monitor` to exit non-zero when it can't
  discover the jengaroot.
- Change `jenga -time` to prefix lines with absolute times.  More
  standard and helpful.
- Extend the jenga Api with support for registering environment
  variables.  These can be accessed and manipulated via RPC to the
  running jenga.

  This feature is necessary to support env-var discovery in build-manager
  via RPC, rather than the current hack of parsing the output from jenga.
  Environment variable manipulation is exposed on the command line with:

      jenga env get NAME
      jenga env set NAME VALUE
      jenga env unset NAME
      jenga env print

  The Api supports access to either the current value of a variable, or
  to a dependency which responds dynamically to changes made by `setenv`:

      Var.peek   : 'a Var.t -> 'a
      Dep.getenv : 'a Var.t -> 'a Dep.t

- Removed the `update-stream` RPC.  It was never used by anything, and it
  wouldn't have worked anyway because it contained interned paths.

  Removed `app/jenga/lib/rpc_intf.mli` -- just unhelpful code duplication
  of the descriptions in `rpc_intf.ml`.
- The error message when running on nfs was super-confusing; improve it.
- Avoid double-reporting build errors originating from the action of a
  multi-target rule.  (And, in future, avoid double-reporting the error
  to the `error-pipe` of jenga/server.)

  1. Attribute an error to the head target ONLY.
  2. Additional targets are treated as dependants of the `head_target`,
     and so (if demanded) are regarded as having `failure` rather than
     `error` status.
  3. Always use the `head_target` in the "-build" line message, instead
     of whatever target happens (non-deterministically) to get demanded
     first.
  4. Allow user control over which target is regarded as the
     `head_target`, by taking the first target listed as the
     `head_target`.
- Add a couple of extra messages bracketing when GC is performed and
  finished.
- Stop taking nfs locks; just rely on local locks, so we're finally free
  from the bug in `Nfs_lock` where it can't remove a half-created lock.
  Also clean things up while here; no change intended in
  special_paths.ml.
- Refactor code to remove the distinction between the types
  `Progress.Need.t` and `Goal.t`, removing the `Progress.Need.t` type by
  supporting the one additional case of `Need.jengaroot` as a `Goal.t`.
  Allow the user to demand a build of (just) the jengaroot.
- Add support for unsetting environment variables prior to running
  actions.

### ocaml_plugin

- Improve the check-plugin command that comes with ocaml-plugin:

  1. Improve documentation; add `readme` to include more info about what
     is being checked exactly.
  2. Avoid the switch `-code-style _` for applications that have made a
     static choice of code style.  Having the switch available at runtime
     is just confusing, since only one style is going to work anyway.

### ppx_bench

- Add an attribute `@name_suffix` to `let%bench_module`.  This is an
  arbitrary expression that gets concatenated to the name of the bench
  module.  It's useful when using `ppx_bench` inside a functor, to
  distinguish each functor application in the output.

### ppx_expect

- Don't remove trailing semicolons when producing a correction.
- Corrected `%expect`s with double-quoted strings don't have the single
  space padding.
- In the ppx_expect runtime, flush stdout before redirecting it.  This is
  to avoid capturing leftovers of the stdout buffer.
- Make sure the expect-test runtime doesn't generate
  `%collector_never_triggered`, which is not accepted by ppx_expect.
  Instead generate:

      [%expect {| DID NOT REACH THIS PROGRAM POINT |}]

- Make expect tests pass the user description to the inline test runtime.
- Fix a race condition in the ppx_expect runtime.
- Change ppx_expect to be more permissive when matching whitespace in
  actual output.  See `ppx/ppx_expect/README.org` for details.
  Changes to the implementation of ppx_expect (including some
  refactoring):

  - factorized the common bits between the runtime and the ppx rewriter
    into one library, expect_test_common
  - factorized different structures representing the same thing, using
    polymorphism
  - communicate data structures between the ppx rewriter and runtime
    using a generated lifter instead of hand-written lifters
  - split the matching and correction-writing code: the .corrected file
    is now only created when needed, instead of all the time
  - added a concrete syntax tree to represent both the actual output and
    the expectation in non-exact mode.  This allows keeping the user's
    formatting as much as possible
  - made various bits more reusable
- Change the default style of multi-line expectations to:

      [%expect {|
        abc
        def
      |}]

  More generally, try to preserve the formatting a bit more when
  correcting from an empty or single-line expectation to a multi-line
  one.
- Arrange things so that when `Async.Std` is opened, `%expect ...`
  expressions are of type `unit Deferred.t` and flush stdout before
  capturing the output.

### ppx_fields_conv

- Fix errors in the `ppx_fields_conv` documentation.
- Add unit tests for `ppx_fields_conv` functions.
- Fix some idiosyncrasies where the implementations in
  `ppx_fields_conv.ml` differed (e.g. a variable would be called one
  thing when implementing one function but something different when
  implementing every other function).

### ppx_inline_test

- Allow configuring hooks for inline tests by redefining a module
  `Inline_test_config`.

### ppx_optcomp

- Install a standalone ppx-optcomp program that can be run through `-pp`.

### ppx_sexp_conv

- Clean up the documentation for sexplib, modernizing it to include
  `ppx_sexp_conv`, and splitting the documentation between sexplib and
  `ppx_sexp_conv`.  Also changed the formatting to use org-mode, so it
  will render properly on github.  Markdown doesn't render well by
  default, unless you use quite different conventions about linebreaks.
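As a minimal illustration of the workflow the refreshed docs describe, here is a small `ppx_sexp_conv` fragment.  It is not self-contained: it assumes the `ppx_sexp_conv` preprocessor and the sexplib library are available at build time.

```ocaml
(* Requires preprocessing with ppx_sexp_conv and linking sexplib. *)
open Sexplib.Std  (* brings sexp_of_int etc. into scope *)

type point = { x : int; y : int } [@@deriving sexp]

let () =
  (* [@@deriving sexp] generates [sexp_of_point] and [point_of_sexp]. *)
  print_endline (Sexplib.Sexp.to_string (sexp_of_point { x = 1; y = 2 }))
```

The deriver writes the converters you would otherwise write by hand, which is the main point the modernized documentation walks through.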
### rpc_parallel

- Add the mandatory arguments `redirect_stderr` and `redirect_stdout` to
  `Map_reduce` so it is easier to debug your workers.  Also removed
  `spawn_exn` in favor of only exposing `spawn_config_exn`.  If you want to
  spawn a single worker, make a config of one worker.

- Clean up the implementation-side interface for aborting `Pipe_rpc`s.

  Summary

  The `aborted` `Deferred.t` that got passed to `Pipe_rpc` implementations
  is gone.  The situations where it would have been determined now close
  the reading end of the user-supplied pipe instead.

  Details

  Previously, when an RPC dispatcher decided to abort a query, the RPC
  implementation would get its `aborted` `Deferred.t` filled in, but would
  remain free to write some final elements to the pipe.  This is a little
  bit more flexible than the new interface, but it's also problematic in
  that the implementer could easily just not pay attention to `aborted`.
  (They're not obligated to pay attention to when the pipe is closed,
  either, but at least they can't keep writing to it.)  We don't think the
  extra flexibility was used at all.

  In the future, we may also simplify the client side to remove the `abort`
  function on the dispatch side (at least when not using `dispatch_iter`).
  For the time being it remains, but closing the received pipe is the
  preferred way of aborting the query.

  There are a couple of known ways this might have changed behavior from
  before.  Both of these appear not to cause problems in the jane repo.

  - In the past, an implementation could write to a pipe as long as the
    client didn't disconnect, even if it closed its pipe.  Now writes will
    raise after a client closes its pipe (or calls `abort`), since the
    implementor's pipe will also be closed in this case.  Doing this was
    already unsafe, though, since the pipe *was* closed if the RPC
    connection was closed.

  - `aborted` was only determined if a client aborted the query or the
    connection was closed.
    The new alternative, `Pipe.closed` called on the returned pipe, will
    also be determined if the implementation closes the pipe itself.  This
    is unlikely to cause serious issues but could potentially cause some
    confusing logging.

- Deal with an fd leak in Managed worker connections.

- This feature completely restructured `rpc_parallel` to make it more
  transparent in what it does and doesn't do.  I offloaded all the
  connection management and cleanup responsibilities to the user.

  What `Rpc_parallel` used to do for you:
  ------------------------------------

  - Internally there was a special (behind-the-scenes) Rpc connection
    maintained between the worker and its master upon spawn.  Upon this
    connection closing, the worker would shut itself down and the master
    would be notified of a lost connection.

  - We would keep a connection table for you.  `run` was called on the
    worker type (i.e. a host and port) because its implementation would get
    a connection (using an existing one or doing a `Rpc.Connection.client`).

  What it now does:
  -----------------

  - No special Rpc connection is maintained, so all cleanup is the job of
    the user.

  - No internal connection table.  `run` is called on a connection type, so
    you have to manage connections yourself.

  - Exposes connection states (unique to each connection) along with worker
    states (shared between connections).

  - There is an explicit `Managed` module that does connection management
    for you.

  - There is an explicit `Heartbeater` type that can be used in spawned
    workers to connect to their master and handle cleanup.

- There was a race condition where `spawn` would return a `Worker.t` even
  though the worker had not daemonized yet.  This manifested itself in a
  spurious test-case failure.  This also allowed for running the worker
  initialization code before we daemonize.

- Reuse the master rpc settings when the worker connects using the
  `Heartbeater` module.

- Make all global state and rpc servers lazy.
  Namely:
  1) Toplevel hashtables are initialized with size 1 to reduce overhead due
     to linking with `Rpc_parallel`
  2) The master rpc server is only started upon the first call to spawn

- Fixed a `Rpc_parallel` bug that caused a Master-to-Worker call to never
  return.

- Add new functions `spawn_and_connect` and `spawn_and_connect_exn`, which
  return the worker and a connection to the worker.  Also, move all the
  example code off of the `Managed` module.

- Some refactoring:
  1) Take `serve` out of the `Connection` module.  It shouldn't be in there
     because it does nothing with the `Connection.t` type.
  2) Remove some unnecessary functions defined on `Connection.t`'s, e.g. no
     longer expose `close_reason`, because all the exception handling will
     be done with the registered exn handlers.
  3) Move the `Rpc_settings` module.
  4) Separate `global_state` into `as_worker` and `as_master`.
  5) Some cleanup with how we are managing worker ids.
  6) Create a record type for `Init_connection_state_rpc.query`.

### sexplib

- Change `Sexp.to_string` to escape all non-ASCII characters.

  Previously, chars >= 127 were escaped or not depending on:
  1. other characters in the string
  2. the system
  3. environment variable settings

  (2) and (3) are because `String.escaped` from the stdlib uses the C
  function `isprint`, which is locale and OS dependent.

  This could cause invalid UTF-8 sequences to be printed by sexplib, which
  is annoying: https://github.com/janestreet/sexplib/issues/18

  Starting with this release, sexplib:
  1. copies the `String.escaped` function of OCaml 4.03, which escapes all
     non-ASCII characters
  2. makes sure to escape the string when it contains characters >= 127

- Clean up the documentation for sexplib, modernizing it to include
  `ppx_sexp_conv`, and breaking up the documentation between sexplib and
  `ppx_sexp_conv`.  Also changed the formatting to use org-mode, so it will
  render properly on github.  Markdown doesn't render well by default,
  unless you use quite different conventions about linebreaks.
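The escaping change above can be seen with an atom containing bytes >= 127; under the new behavior sexplib escapes such bytes unconditionally rather than deferring to the locale-dependent `isprint` (the string contents here are just an illustration):

```ocaml
(* An atom containing the bytes \xc3 \xa9 (UTF-8 for the character é). *)
let () =
  let s = Sexplib.Sexp.Atom "caf\xc3\xa9" in
  (* With the new behavior, bytes >= 127 are always escaped in the printed
     output (e.g. as decimal escapes like \195\169), independent of the
     locale the program runs under. *)
  print_endline (Sexplib.Sexp.to_string s)
```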
- In the sexp macro library, avoid returning success when there is any
  error reading a sexp.  In particular, this prevents
  `sexp resolve <(echo '(:use x)')` from silently succeeding.

  Also, we no longer read an included file multiple times.  This lets even
  crazy stuff like this work:

      $ echo 'hi ' | sexp resolve <(echo '((:include /dev/stdin) (:include /dev/stdin))')