B.6 Testing Uniform (0,1) Pseudo-Random Numbers
Now that we have seen the general process for fitting continuous distributions, this section discusses the special case of testing for uniform (0,1) random variables. This case is important because these methods serve as the basis for testing whether pseudo-random numbers can reasonably be expected to behave as \(U(0,1)\) random variates. Thus, this section provides an overview of what is involved in testing the statistical properties of random number generators. Essentially, a random number generator is supposed to produce sequences of numbers that appear to be independent and identically distributed (IID) \(U(0,1)\) random variables. We hypothesize that a sample from the generator is IID \(U(0,1)\) and then subject this hypothesis to various statistical tests. There are standard batteries of tests, see for example (Soto 1999), that are utilized. Typical tests examine:
Distributional properties: e.g. Chi-Squared and Kolmogorov-Smirnov tests
Independence: e.g. Correlation tests, runs tests
Patterns: e.g. Poker test, gap test
When considering the quality of random number generators, the higher dimensional properties of the sequence also need to be considered. For example, the serial test is a higher dimensional Chi-Squared test. Just like for general continuous distributions, the two major tests that are utilized to examine the distributional properties of sequences of pseudo-random numbers are the Chi-Squared Goodness of Fit test and the Kolmogorov-Smirnov test. In the case of testing for \(U(0,1)\) random variates some simplifications can be made in computing the test statistics.
B.6.1 Chi-Squared Goodness of Fit Tests for Pseudo-Random Numbers
When applying the Chi-Squared goodness of fit to test if the data are \(U(0,1)\), the following is the typical procedure:
Divide the interval \((0,1)\) into \(k\) classes of equal width \(\Delta b = b_{j} - b_{j-1} = \frac{1}{k}\), so that \(p_{j} = \frac{1}{k}\) for \(j=1, 2, \cdots, k\). The expected number in each interval is then \(np_{j} = n \times \frac{1}{k} = \frac{n}{k}\)
As a practical rule, the expected number in each interval, \(np_{j}\), should be at least 5. In this case, that requires \(\frac{n}{k} \geq 5\), or equivalently \(k \leq \frac{n}{5}\). Thus, for a given value of \(n\), you should choose the number of intervals so that \(k \leq \frac{n}{5}\).
Since the parameters of the distribution are known \(a=0\) and \(b=1\), then \(s=0\). Therefore, we reject \(H_{0}: U_{i} \sim U(0,1)\) if \(\chi^{2}_{0} > \chi^{2}_{\alpha, k-1}\) or if the p-value is less than \(\alpha\)
If the \(p_{j}\) values are chosen as \(\frac{1}{k}\), then the chi-squared test statistic can be rewritten as:
\[\begin{equation} \chi^{2}_{0} = \frac{k}{n}\sum\limits_{j=1}^{k} \left( c_{j} - \frac{n}{k} \right)^{2} \tag{B.15} \end{equation}\]
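This simplified form is straightforward to compute directly in R. The following short function is a minimal sketch (the name chisq_u01 is our own hypothetical helper, not a built-in function) that computes the statistic from a vector of interval counts; applying it to the interval counts of the example below reproduces the manual result.

chisq_u01 = function(counts) {
  k = length(counts)              # number of equal-width intervals
  n = sum(counts)                 # total number of observations
  (k / n) * sum((counts - n / k)^2) # simplified chi-squared test statistic
}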
Let’s apply these concepts to a small example.
Example B.3 (Testing 100 Pseudo-Random Numbers) Suppose we have 100 observations from a pseudo-random number generator. Perform a \(\chi^{2}\) test that the numbers are distributed \(U(0,1)\).
0.971 | 0.668 | 0.742 | 0.171 | 0.350 | 0.931 | 0.803 | 0.848 | 0.160 | 0.085 |
0.687 | 0.799 | 0.530 | 0.933 | 0.105 | 0.783 | 0.828 | 0.177 | 0.535 | 0.601 |
0.314 | 0.345 | 0.034 | 0.472 | 0.607 | 0.501 | 0.818 | 0.506 | 0.407 | 0.675 |
0.752 | 0.771 | 0.006 | 0.749 | 0.116 | 0.849 | 0.016 | 0.605 | 0.920 | 0.856 |
0.830 | 0.746 | 0.531 | 0.686 | 0.254 | 0.139 | 0.911 | 0.493 | 0.684 | 0.938 |
0.040 | 0.798 | 0.845 | 0.461 | 0.385 | 0.099 | 0.724 | 0.636 | 0.846 | 0.897 |
0.468 | 0.339 | 0.079 | 0.902 | 0.866 | 0.054 | 0.265 | 0.586 | 0.638 | 0.869 |
0.951 | 0.842 | 0.241 | 0.251 | 0.548 | 0.952 | 0.017 | 0.544 | 0.316 | 0.710 |
0.074 | 0.730 | 0.285 | 0.940 | 0.214 | 0.679 | 0.087 | 0.700 | 0.332 | 0.610 |
0.061 | 0.164 | 0.775 | 0.015 | 0.224 | 0.474 | 0.521 | 0.777 | 0.764 | 0.144 |
Since we have \(n = 100\) observations, the number of intervals should be less than or equal to 20. Let’s choose \(k=10\). This means that \(p_{j} = 0.1\) for all \(j\). The following table summarizes the computations for each interval for computing the chi-squared test statistic.
\(j\) | \(b_{j-1}\) | \(b_{j}\) | \(p_j\) | \(c(\vec{x} \leq b_{j-1})\) | \(c(\vec{x} \leq b_{j})\) | \(c_{j}\) | \(np_{j}\) | \(\frac{\left( c_{j} - np_{j} \right)^{2}}{np_{j}}\) |
---|---|---|---|---|---|---|---|---|
1 | 0 | 0.1 | 0.1 | 0 | 13 | 13 | 10 | 0.9 |
2 | 0.1 | 0.2 | 0.1 | 13 | 21 | 8 | 10 | 0.4 |
3 | 0.2 | 0.3 | 0.1 | 21 | 28 | 7 | 10 | 0.9 |
4 | 0.3 | 0.4 | 0.1 | 28 | 35 | 7 | 10 | 0.9 |
5 | 0.4 | 0.5 | 0.1 | 35 | 41 | 6 | 10 | 1.6 |
6 | 0.5 | 0.6 | 0.1 | 41 | 50 | 9 | 10 | 0.1 |
7 | 0.6 | 0.7 | 0.1 | 50 | 63 | 13 | 10 | 0.9 |
8 | 0.7 | 0.8 | 0.1 | 63 | 77 | 14 | 10 | 1.6 |
9 | 0.8 | 0.9 | 0.1 | 77 | 90 | 13 | 10 | 0.9 |
10 | 0.9 | 1 | 0.1 | 90 | 100 | 10 | 10 | 0 |
Summing the last column yields:
\[\chi^{2}_{0} = \sum\limits_{j=1}^{k} \frac{\left( c_{j} - np_{j} \right)^{2}}{np_{j}} = 8.2\]
Computing the p-value for \(k-s-1=10-0-1=9\) degrees of freedom yields
\(P\{\chi^{2}_{9} > 8.2\} = 0.514\). Thus, given such a high p-value, we
would not reject the hypothesis that the observed data is \(U(0,1)\). This process can be readily implemented within a spreadsheet or performed using R. Assuming that the file u01data.txt contains the pseudo-random numbers for this example, the following R commands will perform the test:
data = scan(file="data/AppDistFitting/u01data.txt") # read in the file
b = seq(0,1, by = 0.1) # set up the break points
h = hist(data, b, right = FALSE, plot = FALSE) # tabulate the counts
chisq.test(h$counts) # perform the test
##
## Chi-squared test for given probabilities
##
## data: h$counts
## X-squared = 8.2, df = 9, p-value = 0.5141
Because we are testing whether the data are \(U(0,1)\), the call to the chisq.test() function within R is simplified, because its default null hypothesis is that all categories are equally likely. Notice that we get exactly the same result as we computed manually.
B.6.2 Higher Dimensional Chi-Squared Test
The pseudo-random numbers should not only be uniformly distributed on the interval \((0,1)\), they should also be uniformly distributed within the unit square, \(\lbrace (x,y): x\in (0,1), y \in (0,1) \rbrace\), the unit cube, and so forth for higher numbers of dimensions \(d\).
The serial test, described in (Law 2007) can be used to assess the quality of the higher dimensional properties of pseudo-random numbers. Suppose the sequence of pseudo-random numbers can be formed into non-overlapping vectors each of size \(d\). The vectors should be independent and identically distributed random vectors uniformly distributed on the d-dimensional unit hyper-cube. This motivates the development of the serial test for uniformity in higher dimensions. To perform the serial test:
Divide (0,1) into \(k\) sub-intervals of equal size.
Generate \(n\) vectors of pseudo-random numbers each of size \(d\),
\[ \begin{aligned} \vec{U}_{1} & = (U_{1}, U_{2}, \cdots, U_{d})\\ \vec{U}_{2} & = (U_{d+1}, U_{d+2}, \cdots, U_{2d})\\ \vdots \\ \vec{U}_{n} & = (U_{(n-1)d+1}, U_{(n-1)d+2}, \cdots, U_{nd}) \end{aligned} \]
Let \(c_{j_{1}, j_{2}, \cdots, j_{d}}\) be the count of the number of \(\vec{U}_{i}\)’s having the first component in subinterval \(j_{1}\), second component in subinterval \(j_{2}\) etc.
Compute
\[\begin{equation} \begin{aligned} \chi^{2}_{0}(d) = \frac{k^{d}}{n}\sum\limits_{j_{1}=1}^{k} \sum\limits_{j_{2}=1}^{k} \dotso \sum\limits_{j_{d}=1}^{k} \left( c_{j_{1}, j_{2}, \cdots, j_{d}} - \frac{n}{k^{d}} \right)^{2} \end{aligned} \tag{B.16} \end{equation}\]
Reject the hypothesis that the \(\vec{U}_{i}\)’s are uniformly distributed in the d-dimensional unit hyper-cube if \(\chi^{2}_{0}(d) > \chi^{2}_{\alpha, k^{d}-1}\) or if the p-value \(P\{\chi^{2}_{k^{d}-1} > \chi^{2}_{0}(d)\}\) is less than \(\alpha\).
(Law 2007) provides an algorithm for computing \(c_{j_{1}, j_{2}, \cdots, j_{d}}\). This test examines how uniformly the random numbers fill up the multi-dimensional space. This is a very important property when applying simulation to the evaluation of multi-dimensional integrals, such as those often encountered in the physical sciences.
Example B.4 (2-D Chi-Squared Test in R for U(0,1)) Using the same data as in the previous example, perform a 2-Dimensional \(\chi^{2}\) Test using the statistical package R. Use 4 intervals for each of the dimensions.
The following code listing is liberally commented to explain the commands; it essentially utilizes some of the classification and tabulation functionality in R to compute Equation (B.16). The code displayed here is available in the files associated with this chapter in the file 2dchisq.R.
nd = 100 #number of data points
data <- read.table("u01data.txt") # read in the data
d = 2 # dimensions to test
n = nd/d # number of vectors
m = t(matrix(data$V1,nrow=d)) # convert to matrix and transpose
b = seq(0,1, by = 0.25) # setup the cut points
xg = cut(m[,1],b,right=FALSE) # classify the x dimension
yg = cut(m[,2],b,right=FALSE) # classify the y dimension
xy = table(xg,yg) # tabulate the classifications
k = length(b) - 1 # the number of intervals
en = n/(k^d) # the expected number in an interval
vxy = c(xy) # convert matrix to vector for easier summing
vxymen = vxy-en # subtract expected number from each element
vxymen2 = vxymen*vxymen # square each element
schi = sum(vxymen2) # compute sum of squares
chi = schi/en # compute the chi-square test statistic
dof = (k^d) - 1 # compute the degrees of freedom
pv = pchisq(chi,dof, lower.tail=FALSE) # compute the p-value
# print out the results
cat("#observations = ", nd,"\n")
cat("#vectors = ", n, "\n")
cat("size of vectors, d = ", d, "\n")
cat("#intervals =", k, "\n")
cat("cut points = ", b, "\n")
cat("expected # in each interval, n/k^d = ", en, "\n")
cat("interval tabulation = \n")
print(xy)
cat("\n")
cat("chisq value =", chi,"\n")
cat("dof =", dof,"\n")
cat("p-value = ",pv,"\n")
The results shown in the following listing are for demonstration purposes only, since the expected number in each interval is less than 5. However, you can see from the interval tabulation that the counts for the 2-D intervals are close to the expected count. The p-value suggests that we would not reject the hypothesis that the pairs of pseudo-random numbers are uniformly distributed. The R code can be generalized for a larger sample or for performing a higher dimensional test. The reader is encouraged to run the code for a larger data set.
Output:
#observations = 100
#vectors = 50
size of vectors, d = 2
#intervals = 4
cut points = 0 0.25 0.5 0.75 1
expected # in each interval, n/k^d = 3.125
interval tabulation =
yg
xg [0,0.25) [0.25,0.5) [0.5,0.75) [0.75,1)
[0,0.25) 5 0 3 2
[0.25,0.5) 2 1 2 7
[0.5,0.75) 3 3 3 7
[0.75,1) 4 1 4 3
chisq value = 18.48
dof = 15
p-value = 0.2382715
B.6.3 Kolmogorov-Smirnov Test for Pseudo-Random Numbers
When applying the K-S test to testing pseudo-random numbers, the hypothesized distribution is the uniform distribution on 0 to 1. The CDF for a uniform distribution on the interval \((a, b)\) is given by:
\[F(x) = P \lbrace X \leq x \rbrace = \frac{x-a}{b-a} \; \; \text{for} \; a < x < b\]
Thus, for \(a=0\) and \(b=1\), we have that \(F(x) = x\) for the \(U(0,1)\) distribution. This simplifies the calculation of \(D^{+}_{n}\) and \(D^{-}_{n}\) to the following:
\[ \begin{aligned} D^{+}_{n} & = \underset{1 \leq i \leq n}{\max} \Bigl\lbrace \frac{i}{n} - \hat{F}(x_{(i)}) \Bigr\rbrace \\ & = \underset{1 \leq i \leq n}{\max} \Bigl\lbrace \frac{i}{n} - x_{(i)}\Bigr\rbrace \end{aligned} \]
\[ \begin{aligned} D^{-}_{n} & = \underset{1 \leq i \leq n}{\max} \Bigl\lbrace \hat{F}(x_{(i)}) - \frac{i-1}{n} \Bigr\rbrace \\ & = \underset{1 \leq i \leq n}{\max} \Bigl\lbrace x_{(i)} - \frac{i-1}{n} \Bigr\rbrace \end{aligned} \]
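These simplified formulas translate directly into a few lines of R. The following is a minimal sketch, assuming the data file from Example B.3 (the variable names are our own):

u = sort(scan(file="data/AppDistFitting/u01data.txt")) # ordered pseudo-random numbers
n = length(u)                # sample size
i = 1:n                      # ranks of the order statistics
Dplus = max(i/n - u)         # D+ since F(x) = x for U(0,1)
Dminus = max(u - (i - 1)/n)  # D-
max(Dplus, Dminus)           # the K-S test statistic, Dn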
Example B.5 (K-S Test for U(0,1)) Given the data from Example B.3 test the hypothesis that the data appears \(U(0,1)\) versus that it is not \(U(0,1)\) using the Kolmogorov-Smirnov goodness of fit test at the \(\alpha = 0.05\) significance level.
A good way to organize the computations is in a tabular form, which also
facilitates the use of a spreadsheet. The second column is constructed
by sorting the data. Recall that because we are testing the \(U(0,1)\)
distribution, \(F(x) = x\), and thus the third column is simply
\(F(x_{(i)}) = x_{(i)}\). The rest of the columns follow accordingly.
\(i\) | \(x_{(i)}\) | \(i/n\) | \(\dfrac{i-1}{n}\) | \(F(x_{(i)})\) | \(\dfrac{i}{n}-F(x_{(i)})\) | \(F(x_{(i)})-\dfrac{i-1}{n}\) |
---|---|---|---|---|---|---|
1 | 0.006 | 0.010 | 0.000 | 0.006 | 0.004 | 0.006 |
2 | 0.015 | 0.020 | 0.010 | 0.015 | 0.005 | 0.005 |
3 | 0.016 | 0.030 | 0.020 | 0.016 | 0.014 | -0.004 |
4 | 0.017 | 0.040 | 0.030 | 0.017 | 0.023 | -0.013 |
5 | 0.034 | 0.050 | 0.040 | 0.034 | 0.016 | -0.006 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
95 | 0.933 | 0.950 | 0.940 | 0.933 | 0.017 | -0.007 |
96 | 0.938 | 0.960 | 0.950 | 0.938 | 0.022 | -0.012 |
97 | 0.940 | 0.970 | 0.960 | 0.940 | 0.030 | -0.020 |
98 | 0.951 | 0.980 | 0.970 | 0.951 | 0.029 | -0.019 |
99 | 0.952 | 0.990 | 0.980 | 0.952 | 0.038 | -0.028 |
100 | 0.971 | 1.000 | 0.990 | 0.971 | 0.029 | -0.019 |
Computing \(D^{+}_{n}\) and \(D^{-}_{n}\) yields
\[\begin{aligned} D^{+}_{n} & = \underset{1 \leq i \leq n}{\max} \Bigl\lbrace \frac{i}{n} - x_{(i)}\Bigr\rbrace \\ & = 0.038 \end{aligned} \] \[ \begin{aligned} D^{-}_{n} & = \underset{1 \leq i \leq n}{\max} \Bigl\lbrace x_{(i)} - \frac{i-1}{n} \Bigr\rbrace \\ & = 0.108 \end{aligned} \] Thus, we have that
\[D_{n} = \max \lbrace D^{+}_{n}, D^{-}_{n} \rbrace = \max \lbrace 0.038, 0.108 \rbrace = 0.108\] Referring to Table F.5 and using the approximation for sample sizes greater than 35, we have that \(D_{0.05} \approx 1.36/\sqrt{n}\). Thus, \(D_{0.05} \approx 1.36/\sqrt{100} = 0.136\). Since \(D_{n} < D_{0.05}\), we would not reject the hypothesis that the data are uniformly distributed over the range from 0 to 1.
The K-S test performed in the solution to
Example B.5 can also be readily performed using the
statistical software \(R\). Assuming that the file, u01data.txt,
contains the data for this example, then the following R commands will
perform the test:
data = scan(file="data/AppDistFitting/u01data.txt") # read in the file
ks.test(data,"punif",0,1) # perform the test
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: data
## D = 0.10809, p-value = 0.1932
## alternative hypothesis: two-sided
Since the p-value for the test is greater than \(\alpha = 0.05\), we would not reject the hypothesis that the data is \(U(0,1)\).
B.6.4 Testing for Independence and Patterns in Pseudo-Random Numbers
A full treatment for testing for independence and patterns within sequences of pseudo-random numbers is beyond the scope of this text. However, we will illustrate some of the basic concepts in this section.
As previously noted, an analysis of the autocorrelation structure associated with the sequence of pseudo-random numbers forms a basis for testing dependence. A sample autocorrelation plot can be easily developed once the autocorrelation values have been estimated. Generally, the maximum lag should be set to no larger than one tenth of the size of the data set, because estimates at higher lags are generally unreliable.
An autocorrelation plot for the data in Example B.3 can be easily produced using the statistical software \(R\). The acf() function in R makes it easy to obtain estimates of the autocorrelations. The resulting plot is shown in Figure B.28. The \(r_{0}\) value is always equal to 1 because it is the correlation of the series with itself at lag 0. As can be seen in the figure, the acf() function of \(R\) automatically places a confidence band within the plot. In this instance, since none of the \(r_{k}, k \geq 1\), fall outside the confidence band, we can conclude that the data are likely independent observations.
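The exact commands used to produce Figure B.28 and the printed autocorrelations are not shown in the text; the following is a plausible sketch, assuming the data vector from Example B.3 and a maximum lag of 9 to match the output below:

data = scan(file="data/AppDistFitting/u01data.txt") # read in the file
r = acf(data, lag.max = 9) # estimate and plot the sample ACF with confidence band
print(r) # display the estimated autocorrelations by lag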
##
## Autocorrelations of series 'data', by lag
##
## 0 1 2 3 4 5 6 7 8 9
## 1.000 0.015 -0.068 0.008 0.179 -0.197 -0.187 0.066 -0.095 -0.120
The autocorrelation test examines serial dependence; however, sequences should also be free of other kinds of patterns. For example, the runs test looks for ‘upward’ or ‘downward’ patterns. The runs up and down test counts the number of runs up, the number of runs down, or sometimes just the total number of runs, and compares the count with the expected number of runs. A run is a succession of similar events preceded and followed by a different event. The length of a run is the number of events in the run. Table B.9 illustrates how to compute runs up and runs down for a simple sequence of numbers; a short R sketch for counting such runs appears after the table. In the table, a sequence of ‘-’ indicates a run down and a sequence of ‘+’ indicates a run up. In the table, there are 8 runs (4 runs up and 4 runs down).
0.90 | 0.13 | 0.27 | 0.41 | 0.71 | 0.28 | 0.18 | 0.22 | 0.26 | 0.19 | 0.61 | 0.87 | 0.95 | 0.21 | 0.79 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
- | + | + | + | - | - | - | + | - | + | + | + | - | + |
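Counting runs up and down is straightforward in R. The following is a minimal sketch using the sequence from the table:

x = c(0.90, 0.13, 0.27, 0.41, 0.71, 0.28, 0.18, 0.22, 0.26, 0.19,
      0.61, 0.87, 0.95, 0.21, 0.79)
s = sign(diff(x))    # +1 for a step up, -1 for a step down
runs = rle(s)        # run-length encoding of the up/down pattern
length(runs$lengths) # total number of runs, which is 8 for this sequence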
Digit patterns can also be examined. The gap test counts the number of digits that appear between repetitions of a particular digit and then uses a K-S test statistic to compare with the expected number of gaps. The poker test checks for independence based on the frequency with which certain digits are repeated within a series of numbers (e.g., pairs, three of a kind, etc.). Banks et al. (2005) discusses how to perform these tests.
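To give a flavor of the gap test, the following sketch (our own illustration, not the complete procedure described in Banks et al. (2005)) tabulates the gaps between successive occurrences of a chosen digit, assuming data holds the pseudo-random numbers from Example B.3:

digits = floor(10 * data) # the leading decimal digit of each number
pos = which(digits == 3)  # positions at which the digit 3 occurs (an arbitrary choice)
gaps = diff(pos) - 1      # number of other digits between successive occurrences
table(gaps)               # tabulate the observed gap lengths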
Does all this testing really matter? Yes! You should always know which pseudo-random number generator you are using within your simulation models, and you should always know whether the generator has passed a battery of statistical tests. The development and testing of random number generators is serious business. You should stick with well-researched generators and reliable, well-tested software.