tabularInputFileSummary

fun ReportBuilder.tabularInputFileSummary(file: TabularInputFile, caption: String? = null, maxRows: Int = 500, confidenceLevel: Double = 0.95, detail: Boolean = false)

Appends a "Summary" section for a TabularInputFile containing the column schema, a compact across-column statistics table for all numeric columns, and a distinct/missing count table for all text columns.

At most maxRows rows are read from the file per column. Pass 0 to read all rows (subject to available memory).
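The row cap can be pictured with a small sketch. It only illustrates the documented convention that 0 means "no limit". `capRows` is a hypothetical helper, not part of the library, and the actual reader may apply the limit differently.

```kotlin
// Hypothetical helper illustrating the maxRows convention; not part of the library API.
// Assumption: negative values are rejected. The documentation does not say how the
// library itself treats them.
fun <T> Sequence<T>.capRows(maxRows: Int): Sequence<T> {
    require(maxRows >= 0) { "maxRows must be non-negative" }
    // 0 is the "read everything" sentinel; any positive value is a hard cap.
    return if (maxRows == 0) this else take(maxRows)
}
```

With a cap of 2, a four-row column yields its first two rows; with a cap of 0, all four rows pass through.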

Produces (inside a section titled caption or "Summary: <filename>"):

  1. tabularFileSchema sub-section.

  2. A ReportNode.Paragraph stating the total row count and how many rows are reported.

  3. (if numeric columns exist) A ReportNode.StatTable ("Numeric Column Statistics") — one row per numeric column, built from a Statistic over the fetched values. NaN values in the file are preserved in the array and counted as Missing by Statistic without affecting the other summary fields.

  4. (if text columns exist) A ReportNode.DataTable ("Text Column Summary") — columns: Column | Count | Distinct | Missing.
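The per-column figures in items 3 and 4 can be sketched roughly as follows. All names here are hypothetical and are not the library's types. The numeric helper only mirrors the documented rule that NaN counts as missing without affecting the other fields. For text columns, the sketch assumes that a cell is missing when it is null or blank, that Count is the number of non-missing cells, and that Distinct counts unique non-missing values. The library's actual definitions may differ.

```kotlin
// Hypothetical types and helpers for illustration; none of these names come from the library.
data class NumericSummary(val count: Int, val missing: Int, val mean: Double)
data class TextSummary(val count: Int, val distinct: Int, val missing: Int)

// NaN entries are tallied as missing and excluded from the mean,
// so they do not distort the remaining fields.
fun summariseNumeric(values: DoubleArray): NumericSummary {
    val present = values.filterNot { it.isNaN() }
    val mean = if (present.isEmpty()) Double.NaN else present.average()
    return NumericSummary(present.size, values.size - present.size, mean)
}

// Assumption: a text cell is "missing" when it is null or blank,
// Count is the number of non-missing cells, and Distinct counts unique non-missing values.
fun summariseText(values: List<String?>): TextSummary {
    val present = values.filterNot { it.isNullOrBlank() }
    return TextSummary(present.size, present.toSet().size, values.size - present.size)
}
```

Keeping the missing tally separate from the observation count makes it visible when a column is mostly empty, which a bare mean would hide.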

Usage:

val doc = report("Sales Data") {
    tabularInputFileSummary(tif, confidenceLevel = 0.90, detail = true)
}

Parameters

file

the tabular input file to summarise

caption

optional section title; defaults to "Summary: <filename>"

maxRows

maximum rows read per column; 0 = all rows; defaults to 500

confidenceLevel

confidence level for the ReportNode.StatTable half-width and CI; must be in (0, 1); defaults to 0.95

detail

false (default) = compact summary only; true = compact summary + diagnostic table
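For confidenceLevel, the half-width of a two-sided interval for a mean is conventionally a critical value multiplied by the standard error. The sketch below assumes that conventional construction. The documentation does not state which distribution the library uses for the critical value, so it is passed in by the caller. `upperQuantileProbability` and `halfWidth` are hypothetical names.

```kotlin
import kotlin.math.sqrt

// Hypothetical helpers; not the library's implementation.

// Validates the level and returns the probability at which the critical value
// would be looked up for a two-sided interval, i.e. 1 - alpha / 2.
fun upperQuantileProbability(confidenceLevel: Double): Double {
    require(confidenceLevel > 0.0 && confidenceLevel < 1.0) {
        "confidenceLevel must be in (0, 1)"
    }
    val alpha = 1.0 - confidenceLevel
    return 1.0 - alpha / 2.0
}

// Half-width = criticalValue * sqrt(sampleVariance / n).
// criticalValue is supplied by the caller, for example a Student-t quantile
// with n - 1 degrees of freedom taken from a statistics library or table.
fun halfWidth(values: DoubleArray, criticalValue: Double): Double {
    require(values.size >= 2) { "need at least two observations" }
    val n = values.size
    val mean = values.average()
    val sampleVariance = values.sumOf { (it - mean) * (it - mean) } / (n - 1)
    return criticalValue * sqrt(sampleVariance / n)
}
```

A higher confidence level pushes the lookup probability toward 1, which gives a larger critical value and therefore a wider interval.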