# Data Quality Issues

While it may seem a contradiction in terms to discuss the "quality" of two files of random numbers, there are some areas of concern beyond the randomness issues discussed in the Introduction.

Our intent is to have the numbers presented here match those in the printed volume. However, cross-checks have revealed some discrepancies. We would like to express our appreciation to Bob Clements — an engineer at BBN by day and a student of random numbers by night — who first brought the discrepancies to our attention in November 1994.

## Recalculation of Table 1

We used the random-digits data to recalculate the numbers in Table 1, which presents a tally of the ones, twos, threes, etc., that appear in the whole file and in each of 20 blocks of 50,000 consecutive digits.
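The tally described above can be sketched in a few lines. This is a minimal illustration, not the procedure actually used: it assumes the digits have already been extracted from the data file into a single string of the characters 0 through 9, and the function name `tally_blocks` is hypothetical.

```python
from collections import Counter


def tally_blocks(digits: str, block_size: int = 50_000) -> list[Counter]:
    """Count how often each digit occurs in consecutive blocks of digits.

    Assumes `digits` holds only the characters 0-9, with any line numbers
    or whitespace from the data file already stripped out.
    """
    if not set(digits) <= set("0123456789"):
        raise ValueError("input must contain only the digits 0-9")
    return [
        Counter(digits[start:start + block_size])
        for start in range(0, len(digits), block_size)
    ]


def tally_whole_file(blocks: list[Counter]) -> Counter:
    """Sum the per-block tallies to get the tally for the whole file."""
    return sum(blocks, Counter())
```

For a file of one million digits and the default block size, `tally_blocks` returns 20 counters, one per block, which can then be compared entry by entry with the published table.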

Our calculated numbers did not agree with those in Table 1 for the second and eighth blocks of 50,000 digits. Mr. Clements had compared a printout of the second block of data with the corresponding lines in the book and located several discrepancies between them. For the eighth block, we used scanning and optical character recognition to input the 20 pages of the book comprising that block and electronically compared the resulting file to the corresponding portion of the data file; the two were in complete agreement.

We corrected the second block of the data file to agree with the book and again recalculated Table 1. Only the discrepancies for the eighth block of digits remained: Table 1 shows one fewer zero and one more two in the eighth block than does our recalculation. We believe this discrepancy is the result of an error made in reading the digits data cards during the original calculation of Table 1. Because the tallies count digits without regard to their position, however, the results of this test do not tell us anything about transposed digits or other self-canceling errors.
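A small sketch may help show why a digit tally cannot detect self-canceling errors. The digit strings below are made up for illustration and are not taken from the data, and `tally_difference` is a hypothetical helper: a single misread digit shifts two counts, while a transposition leaves every count unchanged.

```python
from collections import Counter


def tally_difference(reference: str, candidate: str) -> dict[str, int]:
    """Return count-in-candidate minus count-in-reference for each digit,
    omitting digits whose counts are equal."""
    ref, cand = Counter(reference), Counter(candidate)
    return {d: cand[d] - ref[d] for d in "0123456789" if cand[d] != ref[d]}


# Illustrative strings only (assumed, not real data).
reference = "1009732533"
misread = "1209732533"      # one 0 read as a 2
transposed = "1007932533"   # adjacent 9 and 7 swapped

print(tally_difference(reference, misread))     # one fewer 0, one more 2
print(tally_difference(reference, transposed))  # empty: tally is unchanged
```

Only a position-by-position comparison against the printed digits, as was done for the second and eighth blocks, can catch the transposed case.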

## Recalculation of Random Deviates

We used the random-digits data to recreate the random deviates and electronically compared the recreated and original deviates data. There were five instances in which the recreated deviate did not match the corresponding item in the deviates data file.
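The comparison step might look like the sketch below. It covers only the item-by-item comparison; the transformation from digits to deviates is not reproduced here. The function name `find_mismatches` is hypothetical, and the sketch assumes both sets of deviates are held as text in the same fixed format so that they can be compared exactly.

```python
from typing import Sequence


def find_mismatches(
    recreated: Sequence[str], original: Sequence[str]
) -> list[tuple[int, str, str]]:
    """List (position, recreated, original) wherever the two items differ.

    Items are compared as text, which assumes both sources use the same
    number of decimal places and the same sign convention.
    """
    if len(recreated) != len(original):
        raise ValueError("the two sequences must have the same length")
    return [
        (i, r, o)
        for i, (r, o) in enumerate(zip(recreated, original))
        if r != o
    ]
```

Each reported position can then be checked by hand against the printed volume, for both the deviate and the digits from which it was generated.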

However, for those five discrepancies, we compared the deviates data item and the corresponding digits to those in the book, and each of these ten comparisons matched. Therefore, the discrepancies exist in the printed volume as well.

We believe that at the time the original deviates were generated, errors made in reading the digits data led to "incorrect" deviates (i.e., deviates that do not correspond to the application of formula 1 to the digits from which they were generated).

## Recalculation of Table 3

Mr. Clements ran the poker-hands test on the data with the discrepancies in the second block and compared the results with the data in Table 3. He found several discrepancies there as well.
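For readers unfamiliar with the test, a poker tally can be sketched as follows. This assumes the conventional form of the test, in which the digits are taken in consecutive groups of five and each group is classified by its pattern of repeated digits. The grouping, the category names, and the function names are assumptions made for illustration and may not match the layout of Table 3 exactly.

```python
from collections import Counter

# Pattern of repeat counts (largest first) mapped to an assumed hand name.
HAND_NAMES = {
    (1, 1, 1, 1, 1): "bust",
    (2, 1, 1, 1): "pair",
    (2, 2, 1): "two pairs",
    (3, 1, 1): "three of a kind",
    (3, 2): "full house",
    (4, 1): "four of a kind",
    (5,): "five of a kind",
}


def classify_hand(hand: str) -> str:
    """Classify a group of five digits by its pattern of repeats."""
    if len(hand) != 5:
        raise ValueError("a hand must contain exactly five digits")
    pattern = tuple(sorted(Counter(hand).values(), reverse=True))
    return HAND_NAMES[pattern]


def poker_tally(digits: str) -> Counter:
    """Tally hand types over consecutive, non-overlapping groups of five.

    Any leftover digits at the end (fewer than five) are ignored.
    """
    usable = len(digits) - len(digits) % 5
    return Counter(
        classify_hand(digits[i:i + 5]) for i in range(0, usable, 5)
    )
```

Because each hand depends on only five digits, a single misread digit can move one hand from one category to another, which is consistent with small differences in such a tally.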

We did not reproduce this analysis. However, on the basis of our experience, we believe that these discrepancies are due to errors made in reading the digits data when the original tests were run.