csv.mli - comma separated values parser
$Id: csv.mli,v 1.5 2005/05/24 13:52:50 rich Exp $
val columns : t -> int
Work out the (maximum) number of columns in a CSV file. Note that each line may be a different length, so this finds the one with the most columns.
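The "widest row" rule can be illustrated with a few lines of plain OCaml. This is only a sketch, not the library's source: it assumes that t is a string list list, and the name columns_sketch is hypothetical.

```ocaml
(* Sketch only, not the library's implementation.
   Assumption: the CSV type t is a string list list. *)
let columns_sketch (csv : string list list) : int =
  (* Keep the largest row length seen so far; an empty CSV gives 0. *)
  List.fold_left (fun widest row -> max widest (List.length row)) 0 csv

let () =
  let csv = [ ["a"; "b"]; ["c"; "d"; "e"]; [] ] in
  Printf.printf "widest row has %d cells\n" (columns_sketch csv)
```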
val load_in : ?separator:char -> Pervasives.in_channel -> t
Load a CSV file from an input channel.
val load_string : ?separator:char -> string -> t
Load a CSV from a string.
val load_rows : ?separator:char -> (string list -> unit) -> Pervasives.in_channel -> unit
For very large CSV files which cannot be processed in memory at once, this function is appropriate. It parses the input one row at a time and calls your function once for each row.
Note that if your CSV file contains cells which have embedded line feeds, then it is non-trivial to parse these lines and pass them correctly to load_rows.
val load_rows_inchar : ?separator:char -> (string list -> unit) -> (unit -> char) -> unit
Works exactly like load_rows except that instead of reading from a channel, it calls a read_char function of type unit -> char. This function raises End_of_file at the end of input.
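As an illustration of the read_char protocol, here is a minimal sketch of a reader built over an in-memory string. The helpers reader_of_string and drain are hypothetical names introduced for this example and are not part of the library. The only behaviour taken from the text above is that the reader returns one character per call and raises End_of_file when the input is exhausted.

```ocaml
(* Hypothetical helper: wrap a string as a [unit -> char] reader that
   raises End_of_file once every character has been returned. *)
let reader_of_string (s : string) : unit -> char =
  let pos = ref 0 in
  fun () ->
    if !pos >= String.length s then raise End_of_file
    else begin
      let c = s.[!pos] in
      incr pos;
      c
    end

(* Hypothetical consumer: pull characters until End_of_file, to show the
   protocol from the caller's side. *)
let drain (read_char : unit -> char) : string =
  let buf = Buffer.create 16 in
  (try
     while true do
       Buffer.add_char buf (read_char ())
     done
   with End_of_file -> ());
  Buffer.contents buf

let () = print_string (drain (reader_of_string "a,b\nc,d\n"))
```

Going by the signature above, such a reader would be passed as the last argument, along the lines of Csv.load_rows_inchar f (reader_of_string s).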
This takes a CSV file and trims empty cells.
All four of the optional arguments (~top, ~left, ~right, ~bottom) default to true.
The exact behaviour is:
~right: If true, remove any empty cells at the right hand end of any row. The number of columns in the resulting CSV structure will not necessarily be the same for each row.
~top: If true, remove any empty rows (no cells, or containing just empty cells) from the top of the CSV structure.
~bottom: If true, remove any empty rows from the bottom of the CSV structure.
~left: If true, remove any empty columns from the left of the CSV structure. Note that ~left and ~right are quite different: ~left considers the whole CSV structure, whereas ~right considers each row in isolation.
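The difference between the four options may be easier to see in code. The following is a rough sketch of each rule as a separate function over a string list list. It is not the library's implementation, all function names are hypothetical, and it assumes that a row with no cell in a given column counts as empty there.

```ocaml
(* Sketch only. Assumptions: t is a string list list, and a missing
   cell is treated as empty. *)
let is_empty_row row = List.for_all (fun cell -> cell = "") row

(* ~right: strip trailing empty cells from one row, in isolation. *)
let trim_right_row row =
  let rec drop = function
    | "" :: rest -> drop rest
    | cells -> cells
  in
  List.rev (drop (List.rev row))

(* ~top: drop empty rows from the start of the structure. *)
let rec trim_top = function
  | row :: rest when is_empty_row row -> trim_top rest
  | csv -> csv

(* ~bottom: the same rule applied from the other end. *)
let trim_bottom csv = List.rev (trim_top (List.rev csv))

(* ~left: drop the first column only while it is empty in every row. *)
let rec trim_left csv =
  let first_is_empty = function
    | [] | "" :: _ -> true
    | _ -> false
  in
  let has_cells = List.exists (fun row -> row <> []) csv in
  if has_cells && List.for_all first_is_empty csv then
    trim_left (List.map (function [] -> [] | _ :: rest -> rest) csv)
  else csv
```

Note how trim_left has to inspect every row before removing anything, while trim_right_row can simply be mapped over the rows one at a time.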
Make the CSV data "square" (actually rectangular). This pads out each row with empty cells so that all rows are the same length as the longest row. After this operation, every row will have the length reported by columns.
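A minimal sketch of the padding step follows, again assuming that t is a string list list. The name square_sketch is hypothetical and the code is not taken from the library. It uses List.init, which needs OCaml 4.06 or later.

```ocaml
(* Sketch only, not the library's implementation.
   Assumption: t is a string list list. *)
let square_sketch (csv : string list list) : string list list =
  (* Width of the longest row, i.e. the value the text calls "columns". *)
  let width = List.fold_left (fun w row -> max w (List.length row)) 0 csv in
  (* Append as many empty cells as the row is short of [width]. *)
  let pad row = row @ List.init (width - List.length row) (fun _ -> "") in
  List.map pad csv

let () =
  square_sketch [ ["a"]; ["b"; "c"; "d"]; [] ]
  |> List.iter (fun row -> print_endline (String.concat "," row))
```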
val associate : string list -> t -> (string * string) list list
associate header data takes a block of data and converts each row in turn into an assoc list which maps column header to data cell.
Typically a spreadsheet will have the format:
header1 header2 header3
data11  data12  data13
data21  data22  data23
...
This function arranges the data into a more usable form which is robust against changes in column ordering. The output of the function is:
[ ["header1", "data11"; "header2", "data12"; "header3", "data13"];
  ["header1", "data21"; "header2", "data22"; "header3", "data23"];
  etc. ]
Each row is turned into an assoc list (see List.assoc).
If a row is too short, it is padded with empty cells (""). If a row is too long, it is truncated.
You would typically call this function as:
let header, data = match csv with h :: d -> h, d | [] -> assert false;;
let data = Csv.associate header data;;
The header strings are shared, so the actual space in memory consumed by the spreadsheet is not much larger.
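The padding and truncation rules can be sketched in a few lines. This is an illustrative reimplementation under the assumption that t is a string list list, not the library's code. The name associate_sketch and the sample data are hypothetical.

```ocaml
(* Sketch only, not the library's implementation. *)
let associate_sketch (header : string list) (data : string list list)
    : (string * string) list list =
  let rec pair hs cells =
    match hs, cells with
    | [], _ -> []                              (* row too long: truncate *)
    | h :: hs', [] -> (h, "") :: pair hs' []   (* row too short: pad with "" *)
    | h :: hs', c :: cells' -> (h, c) :: pair hs' cells'
  in
  List.map (pair header) data

let () =
  let rows =
    associate_sketch ["name"; "age"] [ ["alice"; "30"; "extra"]; ["bob"] ]
  in
  (* Look cells up by header name rather than by position. *)
  List.iter
    (fun row ->
      Printf.printf "name=%s age=%s\n"
        (List.assoc "name" row) (List.assoc "age" row))
    rows
```

Because each pair reuses the header string from the header list instead of copying it, this sketch shares header strings in the same way the text describes.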
val save_fn : ?separator:char -> (string -> unit) -> t -> unit
Save the string list list, writing the strings by calling the given function.
val save_out : ?separator:char -> Pervasives.out_channel -> t -> unit
Save the string list list to a channel.
val print_readable : t -> unit
Print the CSV data to stdout in a human-readable format. Not much is guaranteed about how the CSV is printed, except that it will be easier to follow than a "raw" output done with print. This is a one-way operation. There is no easy way to parse the output of this command back into CSV data.
val save_out_readable : Pervasives.out_channel -> t -> unit
As for print_readable, allowing the output to be sent to a channel.
val save_fn_readable : (string -> unit) -> t -> unit