3.3 Review of Statistical Concepts

The simulation models illustrated so far have estimated the quantity of interest with a point estimate (i.e. \(\bar{X}\)). For example, in the previous section, the true area under the curve is easily computed as \(4.6\bar{6}\); however, the point estimate returned by the simulation was 4.781, based on 100 samples. Thus, there is sampling error in our estimate. The key question examined in this section is how to control the sampling error in a simulation experiment. The approach that we will take is to determine the number of samples needed so that we can have high confidence in our point estimate. To address these issues, we first review some basic statistical concepts needed to understand the meaning of sampling error.

3.3.1 Point Estimates and Confidence Intervals

Let \(x_{i}\) represent the \(i^{th}\) observation in a sample, \(x_{1}, x_{2},...x_{n}\), of size \(n\). Represent the random variables in the sample as \(X_{1}, X_{2},...X_{n}\). The random variables form a random sample if 1) the \(X_{i}\) are independent random variables and 2) every \(X_{i}\) has the same probability distribution. Let’s assume that these two assumptions have been established. Denote the unknown cumulative distribution function of \(X\) as \(F(x)\) and define the unknown expected value and variance of \(X\) as \(E[X] = \mu\) and \(Var[X] = \sigma^2\), respectively.

A statistic is any function of the random variables in a sample. Any statistic that is used to estimate an unknown quantity based on the sample is called an estimator. What would be a good estimator for the quantity \(E[X] = \mu\)? Without going into the details of the meaning of statistical goodness, one should remember that the sample average is generally a good estimator for \(E[X] = \mu\). Define the sample average as follows:

\[\begin{equation} \bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i \tag{3.3} \end{equation}\]

Notice that \(\bar{X}\) is a function of the random variables in the sample; therefore, it is both a statistic and itself a random variable.

Any random variable has a corresponding probability distribution. The probability distribution associated with a statistic is called its sampling distribution. The sampling distribution of a statistic can be used to form a confidence interval on the point estimate associated with the statistic. The point estimate is simply the value obtained from the statistic once the data has been realized. The point estimate for the sample average is computed from the sample:

\[\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i\]
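As a small sketch (using only the Kotlin standard library, not the KSL), the point estimate \(\bar{x}\) can be computed directly from realized data; the data values below are hypothetical:

```kotlin
// Compute the point estimate x-bar from a realized sample.
fun sampleAverage(x: DoubleArray): Double {
    require(x.isNotEmpty()) { "need at least one observation" }
    return x.sum() / x.size
}

fun main() {
    // hypothetical realized sample of size n = 5
    val data = doubleArrayOf(4.2, 5.1, 3.8, 6.0, 4.9)
    println("x-bar = ${sampleAverage(data)}")  // prints x-bar = 4.8
}
```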

A confidence interval expresses a degree of certainty associated with a point estimate. A specific confidence interval does not imply that the parameter \(\mu\) is inside the interval. In fact, the true parameter is either in the interval or it is not. Instead, you should think about the confidence level \(1-\alpha\) as an assurance about the procedure used to compute the interval. That is, a confidence interval procedure ensures that if a large number of confidence intervals are computed, each based on \(n\) samples, then the proportion of the confidence intervals that actually contain the true value of \(\mu\) should be close to \(1-\alpha\). The value \(\alpha\) represents the risk that the confidence interval procedure will produce a specific interval that does not contain the true parameter value. Any one particular confidence interval will either contain the true parameter of interest or it will not. Since you do not know the true value, you can use the confidence interval to assess the risk of making a bad decision based on the point estimate. You can be confident in your decision making, or conversely know that you are taking a risk of \(\alpha\) of making a bad decision. Think of \(\alpha\) as the risk that using the confidence interval procedure will get you fired. If we know the sampling distribution of the statistic, then we can form a confidence interval procedure.
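This procedure interpretation can be checked empirically with a small Monte Carlo experiment: repeatedly sample from a population with a known mean, form a 95% interval from each sample using the t-based interval of Equation (3.5) below, and count how often the interval covers the true mean. The following self-contained Kotlin sketch (not KSL code) hardcodes the t-value \(t_{0.975, 19} = 2.093\) and uses hypothetical population parameters:

```kotlin
import java.util.Random
import kotlin.math.sqrt

// Empirical check of confidence interval coverage: repeatedly sample n = 20
// observations from a Normal(mu = 10, sigma = 4) population, form the 95%
// t-based confidence interval, and count how often it contains mu.
fun estimateCoverage(trials: Int, seed: Long): Double {
    val rng = Random(seed)
    val mu = 10.0
    val sigma = 4.0
    val n = 20
    val t = 2.093  // t(0.975, 19), hardcoded for simplicity
    var covered = 0
    repeat(trials) {
        val x = DoubleArray(n) { mu + sigma * rng.nextGaussian() }
        val xbar = x.sum() / n
        val s2 = x.sumOf { (it - xbar) * (it - xbar) } / (n - 1)
        val hw = t * sqrt(s2 / n)
        if (mu in (xbar - hw)..(xbar + hw)) covered++
    }
    return covered.toDouble() / trials
}

fun main() {
    println("estimated coverage = ${estimateCoverage(10_000, 12345L)}")
}
```

With 10,000 trials the estimated coverage lands close to 0.95, illustrating that the risk \(\alpha\) is a property of the procedure, not of any one interval.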

Under the assumption that the sample size is large enough that the distribution of \(\bar{X}\) is approximately normal, you can form an approximate confidence interval on the point estimate, \(\bar{x}\). Assuming that the sample variance:

\[\begin{equation} S^{2}(n) = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2 \tag{3.4} \end{equation}\]

is a good estimator for \(Var[X] = \sigma^2\), then a \((1-\alpha)100\%\) two-sided confidence interval estimate for \(\mu\) is:

\[\begin{equation} \bar{x} \pm t_{1-(\alpha/2), n-1} \dfrac{s}{\sqrt{n}} \tag{3.5} \end{equation}\]

where \(t_{p, \nu}\) is the \(100p\) percentage point of the Student-t distribution with \(\nu\) degrees of freedom. The spreadsheet function T.INV(p, degrees of freedom) computes the left-tailed Student-t distribution value for a given probability and degrees of freedom. That is, \(t_{p, \nu} = T.INV(p,\nu)\). The spreadsheet tab labeled Student-t in the spreadsheet that accompanies this chapter illustrates functions related to the Student-t distribution. Thus, in order to compute \(t_{1-(\alpha/2), n-1}\), use the following formula:

\[t_{1-(\alpha/2), n-1} = T.INV(1-(\alpha/2),n-1)\]

The T.INV.2T(\(\alpha\), \(\nu\)) spreadsheet function directly computes the upper two-tailed value. That is,

\[ t_{1-(\alpha/2), n-1} = T.INV.2T(\alpha, n-1) \]

Within the statistical package R, this computation is straightforward using the qt() function:

\[ t_{p, n} = qt(p,n) \]

#'qt(0.975, 5)
qt(0.975,5)
## [1] 2.570582
#' set alpha
alpha = 0.05
#'qt(1-(alpha/2), 5)
qt(1-(alpha/2),5)
## [1] 2.570582

Using the KSL, we can compute the quantile of the Student-T distribution via the following:

fun main(){
    val level = 0.95
    val dof = 5.0
    val alpha = 1.0 - level
    val p = 1.0 - alpha / 2.0
    val t: Double = StudentT.invCDF(dof, p)
    println("p = $p dof = $dof t-value = $t")
}

With results:

p = 0.975 dof = 5.0 t-value = 2.5705818445939186
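Putting the pieces together, the two-sided confidence interval of Equation (3.5) can be computed directly. The following sketch uses only the Kotlin standard library, a hypothetical sample of size \(n = 6\), and the t-value computed above, \(t_{0.975, 5} = 2.5705818\):

```kotlin
import kotlin.math.sqrt

// Compute a two-sided confidence interval via Equation (3.5).
// The appropriate t-value is passed in directly.
fun confidenceInterval(x: DoubleArray, t: Double): Pair<Double, Double> {
    val n = x.size
    val xbar = x.sum() / n
    val s = sqrt(x.sumOf { (it - xbar) * (it - xbar) } / (n - 1))
    val hw = t * s / sqrt(n.toDouble())
    return Pair(xbar - hw, xbar + hw)
}

fun main() {
    // hypothetical sample of size n = 6, so dof = 5
    val data = doubleArrayOf(9.2, 10.5, 11.1, 9.8, 10.2, 10.9)
    val (lo, hi) = confidenceInterval(data, 2.5705818)
    println("95% CI = [$lo, $hi]")  // approximately [9.54, 11.03]
}
```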

3.3.2 Sample Size Determination

The confidence interval for a point estimator can serve as the basis for determining how many observations to have in the sample. From Equation (3.5), the quantity:

\[\begin{equation} h = t_{1-(\alpha/2), n-1} \dfrac{s}{\sqrt{n}} \tag{3.6} \end{equation}\]

is called the half-width of the confidence interval. You can place a bound, \(\epsilon\), on the half-width by picking a sample size that satisfies:

\[\begin{equation} h = t_{1-(\alpha/2), n-1} \dfrac{s}{\sqrt{n}} \leq \epsilon \tag{3.7} \end{equation}\]

We call \(\epsilon\) the margin of error for the bound. Unfortunately, \(t_{1-(\alpha/2), n-1}\) depends on \(n\), and thus Equation (3.7) is an iterative equation. That is, you must try different values of \(n\) until the condition is satisfied. Within this text, we call this method for determining the sample size the Student-T iterative method.
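The Student-T iterative method can be sketched as follows. Computing exact t-quantiles requires a statistics library; for illustration only, this sketch substitutes a crude large-degrees-of-freedom approximation of the t-quantile in terms of the normal quantile \(z\), namely \(t_{p,\nu} \approx z + (z^3 + z)/(4\nu)\), and the input values are hypothetical:

```kotlin
import kotlin.math.sqrt

// Crude large-dof approximation of the t-quantile from the normal quantile z.
// This is an illustrative assumption, not an exact quantile function.
fun tApprox(z: Double, dof: Double): Double = z + (z * z * z + z) / (4.0 * dof)

// Student-T iterative method: increase n until the projected half-width
// t(1-alpha/2, n-1) * s / sqrt(n) drops below the margin of error eps.
fun studentTIterativeSampleSize(s: Double, eps: Double, z: Double): Long {
    var n = 2L
    while (tApprox(z, (n - 1).toDouble()) * s / sqrt(n.toDouble()) > eps) {
        n++
    }
    return n
}

fun main() {
    // hypothetical pilot estimate s = 6.0, margin of error 0.1,
    // z(0.995) = 2.5758293 for a 99% confidence level
    val n = studentTIterativeSampleSize(s = 6.0, eps = 0.1, z = 2.5758293)
    println("Student-T iterative sample size = $n")
}
```

Because \(t_{1-(\alpha/2), n-1}\) shrinks toward \(z_{1-(\alpha/2)}\) as \(n\) grows, the iteration recommends slightly more samples than the normal approximation method described next.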

Alternatively, the required sample size can be approximated using the normal distribution. Solving Equation (3.7) for \(n\) yields:

\[n \geq \left(\dfrac{t_{1-(\alpha/2), n-1} \; s}{\epsilon}\right)^2\]

As \(n\) gets large, \(t_{1-(\alpha/2), n-1}\) converges to the \(100(1-(\alpha/2))\) percentage point of the standard normal distribution \(z_{1-(\alpha/2)}\). This yields the following approximation:

\[\begin{equation} n \geq \left(\dfrac{z_{1-(\alpha/2)} s}{\epsilon}\right)^2 \tag{3.8} \end{equation}\]

Within this text, we refer to this method for determining the sample size as the normal approximation method. This approximation generally works well for large \(n\), say \(n > 50\). Both of these methods require an initial value for the standard deviation. This computation is implemented in the estimateSampleSize function of the Statistic class’s companion object.

        /**
         * Estimate the sample size based on a normal approximation
         *
         * @param desiredHW the desired half-width (must be bigger than 0)
         * @param stdDev    the standard deviation (must be bigger than or equal to 0)
         * @param level     the confidence level (must be between 0 and 1)
         * @return the estimated sample size
         */
        fun estimateSampleSize(desiredHW: Double, stdDev: Double, level: Double): Long {
            require(desiredHW > 0.0) { "The desired half-width must be > 0" }
            require(stdDev >= 0.0) { "The desired std. dev. must be >= 0" }
            require(!(level <= 0.0 || level >= 1.0)) { "Confidence Level must be (0,1)" }
            val a = 1.0 - level
            val a2 = a / 2.0
            val z = Normal.stdNormalInvCDF(1.0 - a2)
            val m = z * stdDev / desiredHW * (z * stdDev / desiredHW)
            return (m + .5).roundToLong()
        }

In order to use either the normal approximation method or the Student-T iterative method, you must have an initial value for \(s\), the sample standard deviation. The simplest way to get an initial estimate of \(s\) is to make a small initial pilot sample (e.g. \(n_0=5\)). Given a value for \(s\), you can then set a desired bound and use the formulas. The bound is problem and performance measure dependent and is under your subjective control. You must determine what bound is reasonable for your given situation. One thing to remember is that the bound is squared in the denominator when evaluating \(n\). Thus, very small values of \(\epsilon\) can result in very large sample sizes.


Example 3.7 (Determining the Sample Size) Suppose we are required to estimate the output from a simulation so that we are 99% confident that we are within \(\pm 0.1\) of the true population mean. After taking a pilot sample of size \(n=10\), we have estimated \(s=6\). What is the required sample size?


We will use Equation (3.8) to determine the sample size requirement. For a 99% confidence interval, we have \(\alpha = 0.01\) and \(\alpha/2 = 0.005\). Thus, \(z_{1-0.005} = z_{0.995} = 2.5758293064439264\). Because the margin of error, \(\epsilon\) is \(0.1\), we have that,

\[n \geq \left(\dfrac{z_{1-(\alpha/2)}\; s}{\epsilon}\right)^2 = \left(\dfrac{2.5758293064439264 \times 6}{0.1}\right)^2 = 23885.63 \approx 23886\]

The following KSL code will easily compute the sample size.

fun main() {
    val desiredHW = 0.1
    val s0 = 6.0
    val level = 0.99
    val n = Statistic.estimateSampleSize(
        desiredHW = desiredHW,
        stdDev = s0,
        level = level
    )
    println("Sample Size Determination")
    println("desiredHW = $desiredHW")
    println("stdDev = $s0")
    println("Level = $level")
    println("recommended sample size = $n")
}
Sample Size Determination
desiredHW = 0.1
stdDev = 6.0
Level = 0.99
recommended sample size = 23886

If the quantity of interest is a proportion, then a different method can be used. In particular, a \(100\times(1 - \alpha)\%\) large-sample two-sided confidence interval for a proportion, \(p\), has the following form:

\[\begin{equation} \hat{p} \pm z_{1-(\alpha/2)} \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}} \tag{3.9} \end{equation}\]

where \(\hat{p}\) is the estimate for \(p\). From this, you can determine the sample size via the following equation:

\[\begin{equation} n = \left(\dfrac{z_{1-(\alpha/2)}}{\epsilon}\right)^{2} \hat{p}(1 - \hat{p}) \tag{3.10} \end{equation}\]

Again, a pilot run is necessary for obtaining an initial estimate of \(\hat{p}\) for use in determining the sample size. If no pilot run is available, then \(\hat{p} = 0.5\) is often assumed as a worst-case approximation.
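Equation (3.10) is straightforward to sketch in code. The function below is a hypothetical helper (not the KSL’s estimateProportionSampleSize); the normal quantile is passed in directly, e.g. \(z_{0.975} = 1.96\) for 95% confidence:

```kotlin
import kotlin.math.ceil

// Sample size for estimating a proportion, Equation (3.10), rounded up.
fun proportionSampleSize(pHat: Double, eps: Double, z: Double): Long {
    require(pHat in 0.0..1.0) { "pHat must be a proportion" }
    require(eps > 0.0) { "margin of error must be > 0" }
    val n = (z / eps) * (z / eps) * pHat * (1.0 - pHat)
    return ceil(n).toLong()
}

fun main() {
    // pilot estimate pHat = 0.8, margin of error 0.05, 95% confidence
    println(proportionSampleSize(0.8, 0.05, 1.96))  // prints 246
    // worst-case pHat = 0.5 when no pilot estimate is available
    println(proportionSampleSize(0.5, 0.05, 1.96))  // prints 385
}
```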

If you have more than one performance measure of interest, you can use these sample size techniques for each of your performance measures and then use the maximum sample size required across the performance measures. Now, let’s illustrate these methods based on a small simulation.

3.3.3 Determining the Sample Size for a Monte Carlo Simulation Experiment

To facilitate some of the calculations related to determining the sample size for a simulation experiment, I have constructed a spreadsheet called SampleSizeDetermination.xlsx, which is found in the book support files for this chapter. You may want to utilize that spreadsheet as you go through this and subsequent sections.

Using a simple example, we will illustrate how to determine the sample size necessary to estimate a quantity of interest with a high level of confidence.


Example 3.8 (Sample Size Example) Suppose that we want to simulate a normally distributed random variable, \(X\), with \(E[X] = 10\) and \(Var[X] = 16\). From the simulation, we want to estimate the true population mean with 99% confidence and a margin of error of \(\pm 0.50\). In addition to estimating the population mean, we want to estimate the probability that the random variable exceeds 8. That is, we want to estimate \(p= P(X>8)\). We want to be 95% confident that our estimate of \(P(X>8)\) has a margin of error of \(\pm 0.05\). What are the sample size requirements needed to meet the desired margins of error?


Using simulation for this example is for illustrative purposes. Naturally, we actually know the true population mean and can easily compute \(p = P(X>8)\) for this situation. However, the example illustrates sample size determination, which is an important planning requirement for simulation experiments.

Let \(X\) represent the unknown random variable of interest. Then, we are interested in estimating \(E[X]=\mu\) and \(p = P(X > 8)\). To estimate these quantities, we generate a random sample, \((X_1,X_2,...,X_n)\). \(E[X]\) can be estimated using \(\bar{X}\). To estimate a probability it is useful to define an indicator variable. An indicator variable has the value 1 if the condition associated with the probability is true and has the value 0 if it is false. To estimate \(p\), define the following indicator variable:

\[ Y_i = \begin{cases} 1 & X_i > 8\\ 0 & X_i \leq 8 \\ \end{cases} \]

This definition of the indicator variable \(Y_i\) allows \(\bar{Y}\) to be used to estimate \(p\). Recall the definition of the sample average \(\bar{Y}\). Since \(Y_{i}\) is \(1\) only if \(X_i > 8\), the sum \(\sum \nolimits_{i=1}^{n} Y_{i}\) simply adds up the number of \(1\)’s in the sample. Thus, \(\sum\nolimits_{i=1}^{n} Y_{i}\) represents the count of the number of times the event \(X_i > 8\) occurred in the sample. Call this \(\#\lbrace X_i > 8\rbrace\). The ratio of the number of times an event occurred to the total number of possible occurrences is the proportion. Thus, an estimator for \(p\) is:

\[ \hat{p} = \bar{Y} = \frac{1}{n}\sum_{i=1}^{n}Y_i = \frac{\#\lbrace X_i > 8\rbrace}{n} \]

Therefore, computing the average of an indicator variable will estimate the desired probability. The code for this situation is quite simple:

fun main() {
    val rv = NormalRV(10.0, 16.0)
    val estimateX = Statistic("Estimated X")
    val estOfProb = Statistic("Pr(X>8)")
    val r = StatisticReporter(mutableListOf(estOfProb, estimateX))
    val n = 20 // sample size
    for (i in 1..n) {
        val x = rv.value
        estimateX.collect(x)
        estOfProb.collect(x > 8)
    }
    println(r.halfWidthSummaryReport())
}

In the code, an instance of Statistic is used to observe values of the quantity \((x > 8)\). This quantity is a logical condition that evaluates to true (1) or false (0) given the value of \(x\). Thus, the Statistic will be recording observations of 1’s and 0’s. Because the Statistic computes the average of the values that it “collects”, we get an estimate of the probability. That is, the quantity \((x > 8)\) serves as an indicator variable for the desired event. Using a Statistic to estimate a probability in this manner is quite effective.

Now, in more realistic simulation situations, we would not know the true population mean and variance. Thus, in solving this problem, we will ignore that information. To apply the previously discussed sample size determination methods, we need estimates of \(\sigma = \sqrt{Var[X]}\) and \(p = P(X>8)\). In order to get estimates of these quantities, we need to run our simulation experiment; however, in order to run our simulation experiment, we need to know how many observations to make. This is quite a catch-22 situation!

To proceed, we need to run our simulation model for a pilot experiment. From a small number of samples (called a pilot experiment, pilot run or pilot study), we will estimate \(\sigma\) and \(p = P(X>8)\) and then use those estimates to determine how many samples we need to meet the desired margins of error. We will then re-run the simulation using the recommended sample size.

The natural question to ask is how big should your pilot sample be? In general, this depends on how much it costs for you to collect the sample. Within a simulation experiment context, generally, cost translates into how long your simulation model takes to execute. For this small problem, the execution time is trivial, but for large and complex simulation models the execution time may be in hours or even days. A general rule of thumb is between 10 and 30 observations. For this example, we will use an initial pilot sample size, \(n_0 = 20\).

Running the model for the 20 replications yields the following results.

Table 3.1: Simulation results for \(n_0 = 20\) replications
Performance Measures    Average    95% Half-Width
Pr(X>8)                  0.80         0.1921
Estimated X             12.2484       2.2248

Notice that the confidence interval for the true mean is \((10.0236, 14.4732)\) with a half-width of \(2.2248\), which exceeds the desired margin of error, \(\epsilon = 0.5\). Notice also that this particular confidence interval happens to not contain the true mean \(E[X] = 10\). To determine the sample size so that we get a half-width that is less than \(\epsilon = 0.5\) with 99% confidence, we can use Equation (3.8). In order to do this, we must first have an estimate of the sample standard deviation, \(s\). Since Table 3.1 reports a half-width for a 95% confidence interval, we need to use Equation (3.6) and solve for \(s\) in terms of \(h\). Rearranging Equation (3.6) yields,

\[\begin{equation} s = \dfrac{h\sqrt{n}}{t_{1-(\alpha/2), n-1}} \tag{3.11} \end{equation}\]

Using the results from Table 3.1 and \(\alpha = 0.05\) in Equation (3.11) yields,

\[ s_0 = \dfrac{h\sqrt{n_0}}{t_{1-(\alpha/2), n_0-1}} = \dfrac{2.2248\sqrt{20}}{t_{0.975, 19}}= \dfrac{2.2248\sqrt{20}}{2.093024}=4.7536998 \]
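This back-calculation of \(s_0\) from a reported half-width can be sketched as follows; the helper function is hypothetical and the t-value \(t_{0.975, 19} = 2.093024\) is hardcoded:

```kotlin
import kotlin.math.sqrt

// Recover the sample standard deviation from a reported half-width
// via Equation (3.11): s = h * sqrt(n) / t.
fun stdDevFromHalfWidth(h: Double, n: Int, t: Double): Double =
    h * sqrt(n.toDouble()) / t

fun main() {
    // pilot: h = 2.2248 from a 95% CI with n = 20, so t(0.975, 19) = 2.093024
    val s0 = stdDevFromHalfWidth(h = 2.2248, n = 20, t = 2.093024)
    println("s0 = $s0")  // approximately 4.7537
}
```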

Now, we can use Equation (3.8) to determine the sample size requirement. For a 99% confidence interval, we have \(\alpha = 0.01\) and \(\alpha/2 = 0.005\). Thus, \(z_{0.995} = 2.576\). Because the margin of error, \(\epsilon\) is \(0.5,\) we have that,

\[n \geq \left(\dfrac{z_{1-(\alpha/2)}\; s}{\epsilon}\right)^2 = \left(\dfrac{2.5758293064439264 \times 4.7536998}{0.5}\right)^2 = 599.73 \approx 600\]

To determine the sample size for estimating \(p=P(X>8)\) with 95% confidence to \(\pm 0.05\), we can use Equation (3.10):

\[ n = \left(\dfrac{z_{1-(\alpha/2)}}{\epsilon}\right)^{2} \hat{p}(1 - \hat{p})=\left(\dfrac{z_{0.975}}{0.05}\right)^{2} (0.80)(1 - 0.80)=\left(\dfrac{1.96}{0.05}\right)^{2} (0.80)(0.20) = 245.86 \approx 246 \]

All of these calculations can be easily accomplished using Kotlin code as follows:

fun main() {
    val rv = NormalRV(10.0, 16.0)
    val estimateX = Statistic("Estimated X")
    val estOfProb = Statistic("Pr(X>8)")
    val r = StatisticReporter(mutableListOf(estOfProb, estimateX))
    val n0 = 20 // sample size
    for (i in 1..n0) {
        val x = rv.value
        estimateX.collect(x)
        estOfProb.collect(x > 8)
    }
    println(r.halfWidthSummaryReport())
    val desiredHW = 0.5
    val s0 = estimateX.standardDeviation
    val level = 0.99
    val n = Statistic.estimateSampleSize(
        desiredHW = desiredHW,
        stdDev = s0,
        level = level
    )
    println("Sample Size Determination")
    println("desiredHW = $desiredHW")
    println("stdDev = $s0")
    println("Level = $level")
    println("recommended sample size = $n")
    println()
    val m = Statistic.estimateProportionSampleSize(
        desiredHW = 0.05,
        pEst = estOfProb.average,
        level = 0.95
    )
    println("Recommended sample size for proportion = $m")
}

The results match the values computed from the equations.

Sample Size Determination
desiredHW = 0.5
stdDev = 4.753707020199372
Level = 0.99
recommended sample size = 600
Recommended sample size for proportion = 246

Using the maximum, \(\max(246, 600) = 600\), we can re-run the simulation for this number of replications. Doing so yields:

Table 3.2: Simulation Results for \(n=600\) Replications
Performance Measures    Average    95% Half-Width
Pr(X>8)                  0.7117       0.0363
Estimated X             10.2712       0.3284

As can be seen in Table 3.2, the half-width values meet the desired margins of error. It is possible that the margins of error might not have been met; this would suggest that more than \(n = 600\) observations are needed to meet the margin of error criteria. Equations (3.8) and (3.10) are only approximations based on a pilot sample. Thus, if there was considerable sampling error associated with the pilot sample, the approximations may be inadequate.

As this example illustrates, in order to apply the normal approximation method based on a pilot run, we must first compute the initial sample standard deviation, \(s_0\), from the initially reported half-width, \(h\), using Equation (3.11). We can avoid this calculation by using the half-width ratio method for determining the sample size.

Let \(h_0\) be the initial value for the half-width from the pilot run of \(n_0\) replications. Then, rewriting Equation (3.6) in terms of the pilot data yields:

\[ h_0 = t_{1-(\alpha/2), n_0 - 1} \dfrac{s_0}{\sqrt{n_0}} \]

Solving for \(n_0\) yields:

\[\begin{equation} n_0 = t_{1-(\alpha/2), n_0 -1}^{2} \dfrac{s_{0}^{2}}{h_{0}^{2}} \tag{3.12} \end{equation}\]

Similarly for any \(n\), we have:

\[\begin{equation} n = t_{1-(\alpha/2), n-1}^{2} \dfrac{s^{2}}{h^{2}} \tag{3.13} \end{equation}\]

Taking the ratio of \(n_0\) to \(n\) (Equations (3.12) and (3.13)) and assuming that \(t_{1-(\alpha/2), n-1}\) is approximately equal to \(t_{1-(\alpha/2), n_0 - 1}\) and \(s^2\) is approximately equal to \(s_0^2\), yields,

\[n \cong n_0 \dfrac{h_0^2}{h^2} = n_0 \left(\frac{h_0}{h}\right)^2\]

This is the half-width ratio equation.

\[\begin{equation} n \geq n_0 \left(\frac{h_0}{h}\right)^2 \tag{3.14} \end{equation}\]

The issue that we face when applying Equation (3.14) is that Table 3.1 only reports the 95% confidence interval half-width. Thus, to apply Equation (3.14), the value of \(h_0\) will be based on \(\alpha = 0.05\). If the desired half-width, \(h\), is required for a different confidence level, e.g. a 99% confidence interval, then you must first translate \(h_0\) to the desired confidence level or set the confidence level of the StatisticReporter to 99%. To translate the 95% half-width from Table 3.1, you can compute \(s\) from Equation (3.11) and then recompute the half-width, \(h_0\), for the desired confidence level.

Using the results from Table 3.1 and \(\alpha = 0.05\) in Equation (3.11) we already know that \(s_0 = 4.7536998\) for a 95% confidence interval. Thus, for a 99% confidence interval, where \(\alpha = 0.01\) and \(\alpha/2 = 0.005\), we have:

\[ h_0 = t_{1-(\alpha/2), n_0-1} \dfrac{s_0}{\sqrt{n_0}}= t_{0.995, 19}\dfrac{4.7536998}{\sqrt{20}}=2.861\dfrac{4.7536998}{\sqrt{20}}=3.041127 \]

Now, we can apply Equation (3.14) to this problem with the desired half-width bound \(\epsilon = 0.5 = h\):

\[ n \geq n_0 \left(\frac{h_0}{h}\right)^2=20\left(\frac{3.041127}{h}\right)^2=20\left(\frac{3.041127}{0.5}\right)^2=739.87 \cong 740 \]
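The half-width ratio calculation is simple enough to sketch directly; the helper below is hypothetical:

```kotlin
import kotlin.math.ceil

// Half-width ratio method, Equation (3.14): n >= n0 * (h0 / h)^2, rounded up.
fun halfWidthRatioSampleSize(n0: Int, h0: Double, h: Double): Long =
    ceil(n0 * (h0 / h) * (h0 / h)).toLong()

fun main() {
    // pilot: n0 = 20 replications, 99% half-width h0 = 3.041127, target h = 0.5
    println(halfWidthRatioSampleSize(20, 3.041127, 0.5))  // prints 740
}
```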

We see that for this pilot sample, the half-width ratio method recommends a substantially larger sample size, \(740\) versus \(600\). In practice, the half-width ratio method tends to be conservative by recommending a larger sample size. We will see another example of these methods within the next section.

Based on these examples, you should have a basic understanding of how a simulation experiment can be performed to meet a desired margin of error. In general, simulation models are much more interesting than this simple example.