D.5 The DataFrameUtil Object
A data frame is an in-memory data structure that holds tabular data, that is, data organized in rows and columns. Kotlin has a library that supports this type of functionality, and the DataFrameUtil object has been designed to facilitate the use of data frames within the KSL.
Documentation, examples, and the basic functionality of Kotlin data frames can be found at this repository. Kotlin data frames provide functionality similar to that found in other data frame libraries, such as those in R. Figure D.7 illustrates the functions and properties of the DataFrameUtil object.
The main functionality added by DataFrameUtil is sampling from the rows and columns of a data frame and computing some basic statistics. The functions for sampling without replacement return a new data frame containing the sampled rows. The functions for permutation return a new data frame containing the permuted rows. The functions for random selection select either an element from a column of the data frame or an entire row. The element or row can be selected with equal probability or via an empirical distribution over the elements (by row). KSL statistics, histograms, box plot statistics, and frequencies can all be computed over the columns.
- sampleWithoutReplacement(DataFrame<T>, Int, RNStreamIfc): DataFrame<T>
- sampleWithoutReplacement(DataFrame<T>, Int, Int): DataFrame<T>
- permute(DataFrame<T>, Int): DataFrame<T>
- permute(DataFrame<T>, RNStreamIfc): DataFrame<T>
- randomlySelect(DataColumn<T>, RNStreamIfc): T
- randomlySelect(DataColumn<T>, Int): T
- randomlySelect(DataColumn<T>, Double[], RNStreamIfc): T
- randomlySelect(DataColumn<T>, Double[], Int): T
- randomlySelect(DataFrame<T>, Int): DataRow<T>
- randomlySelect(DataFrame<T>, RNStreamIfc): DataRow<T>
- randomlySelect(DataFrame<T>, Double[], Int): DataRow<T>
- randomlySelect(DataFrame<T>, Double[], RNStreamIfc): DataRow<T>
- buildMarkDown(DataFrame<T>, Appendable): Unit
- histogram(DataColumn<Double>, Double[]): Histogram
- statistics(DataColumn<Double>): Statistic
- frequencies(DataColumn<Int>): IntegerFrequency
- boxPlotSummary(DataColumn<Double>): BoxPlotSummary
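To make the row-sampling operations concrete, the following is a minimal sketch of the underlying ideas using only the Kotlin standard library. It does not call the KSL or the Kotlin data frame library. The `Row` type alias, the use of `kotlin.random.Random` in place of an `RNStreamIfc` stream, and the function bodies are all assumptions made for illustration. In particular, the `Double[]` argument is assumed here to be a cumulative distribution over the rows, which the text above does not confirm.

```kotlin
import kotlin.random.Random

// Hypothetical stand-in for a data frame row: column name -> value.
typealias Row = Map<String, Any>

// Returns a new list holding the same rows in a random order.
fun permute(rows: List<Row>, rng: Random): List<Row> = rows.shuffled(rng)

// Returns a new list of sampleSize distinct rows drawn without replacement.
fun sampleWithoutReplacement(rows: List<Row>, sampleSize: Int, rng: Random): List<Row> {
    require(sampleSize in 0..rows.size) { "sampleSize must be in 0..${rows.size}" }
    return rows.shuffled(rng).take(sampleSize)
}

// Selects a single row, each row having equal probability.
fun randomlySelect(rows: List<Row>, rng: Random): Row {
    require(rows.isNotEmpty()) { "rows must not be empty" }
    return rows[rng.nextInt(rows.size)]
}

// Selects a single row from an empirical distribution.
// Assumption: cdf[i] is the probability of selecting a row with index <= i.
fun randomlySelect(rows: List<Row>, cdf: DoubleArray, rng: Random): Row {
    require(rows.isNotEmpty() && cdf.size == rows.size) { "cdf needs one entry per row" }
    val u = rng.nextDouble() // uniform on [0, 1)
    val index = cdf.indexOfFirst { u <= it }
    // Fall back to the last row if rounding leaves the cdf slightly below 1.0.
    return rows[if (index >= 0) index else rows.lastIndex]
}

fun main() {
    val rows: List<Row> = listOf(
        mapOf("id" to 1, "time" to 2.5),
        mapOf("id" to 2, "time" to 4.0),
        mapOf("id" to 3, "time" to 1.2),
        mapOf("id" to 4, "time" to 3.3)
    )
    val rng = Random(42)
    println("Permuted: ${permute(rows, rng)}")
    println("Sample of 2: ${sampleWithoutReplacement(rows, 2, rng)}")
    println("Uniform pick: ${randomlySelect(rows, rng)}")
    // Rows selected with probabilities 0.1, 0.2, 0.3, and 0.4.
    val cdf = doubleArrayOf(0.1, 0.3, 0.6, 1.0)
    println("Empirical pick: ${randomlySelect(rows, cdf, rng)}")
}
```

Each function returns a new list and leaves its input untouched, mirroring the description above of sampling and permutation returning a new data frame.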
Extension functions are also available for a data frame and its columns. The Kotlin data frame library is included in the KSL as part of its API, so clients also have access to the full set of features in that library. The main usage within the KSL is in capturing simulation output data. The easiest way to do this is by using the KSLDatabase class. Data frame instances can be requested as part of the database functionality of the KSL.