A Random Number Generator Test Suite for the C++ Standard


Institute for Theoretical Physics
Winter
ETH Zürich

Diploma Thesis

A Random Number Generator Test Suite for the C++ Standard

Mario Rütti
March 10, 2004

Supervisor: Prof. M. Troyer

maruetti@comp-phys.org
troyer@phys.ethz.ch

I am grateful to my diploma professor, Prof. Matthias Troyer, for giving me the opportunity to write this instructive and inspiring diploma thesis, to say nothing of the time he spent helping me to resolve my (and my computer's) problems and his effort to find new and unconventional solutions.

My special thanks also go to my office co-worker Manuel Gil for the motivating and amusing discussions about our work and for his pleasant companionship.

I am grateful to Frank Moser, who acted as editor and assisted me in correcting and polishing my English sentences.

I want to apologize to Ariana for lackluster evenings with a friend lost in thought. Thank you for your support and understanding during this time.

Finally, I am grateful to my parents for the tremendous support they gave me during my years of study, which enabled me to achieve my goals.

To my parents Urs and Heidi

Abstract

The heart of every Monte Carlo simulation is a source of high-quality random numbers, and the generator has to be picked carefully. Since the Ferrenberg affair it is known to a broad community that statistical tests alone do not suffice to determine the quality of a generator; application-based tests are needed as well. With the inclusion of an extensible random number library and the definition of a generic interface in the revised C++ standard, it will be important to have access to an extensive C++ random number test suite. Most currently available test suites are limited to a subset of tests, are written in Fortran or C, and cannot easily be used with the C++ random number generator library.

In this paper we present a generic random number test suite written in C++. The framework is based on the Boost reference implementation of the forthcoming C++ standard random number generator library. The Boost implementation so far contains most modern random number generators. By employing generic programming techniques, the test suite is flexible, easily extensible and can be used with any random number generator library, including those written in C and Fortran. Test results are produced in an XML format, which through the use of XSLT transformations allows extraction of summaries or detailed reports, and conversion to HTML, PDF, PostScript or any other format.

At this time, the test suite contains a wide range of different tests, including the standard tests described by Knuth, Vattulainen's physical tests, parts of Marsaglia's Diehard test suite, and a number of newer tests.

Contents

1. Introduction
2. What are random numbers?
  Types of random numbers
3. Analyzing Statistics
  χ² test (Chi-square test)
  Kolmogorov-Smirnov test (KS test)
  Gaussian Test
4. Using the Random Number Generator Test Suite
  How to run a test
  The rng_test_suite environment
    Template Parameter
    Confidence Level
    Seeds
    Random Number Generators
    Tests
    Running...
  Testing Parallel Random Number Generators
  Iterating a test
  Count failings
  Bit Tests
  Bit extract test
  The XML output
5. Tests for Studying Random Data
  Equidistribution test
  Run test
    Runs up and down
    Runs above and below mean
    Length of runs
  Gap test
  Poker test
  Coupon-collectors test
  Permutation test
  Maximum of t test
  Birthday Spacings test
  Collision test (Hash test)
  Serial correlation

  5.11. Serial test
  Blocking test
  Repeating Time Test
  gcd test (greatest common divisor)
  Gorilla test
  Ising-model test
  Random-walk test
  n-block test
  Random Walker on a line (S n test)
  D Intersection test
  D Height Correlation test
  Sum of independent distributions test
  Fourier transform test
  Universal statistical test
  The Diehard Test Suite
    Birthday Spacings test
    The overlapping 5-permutation test
    Ranks of binary matrices
    The bitstream test
    The OPSO, OQSO and DNA tests
    The count-the-1's test
    The parking lot test
    The overlapping sums test
    Squeeze test
    The Minimum Distance test
    Random Sphere test
    The runs test
    Craps test
6. Extending the Random Number Generator Test Suite
  How to implement a test
  Implementing a χ², Kolmogorov-Smirnov or a Gaussian test
    χ² test
    Kolmogorov-Smirnov test
    Gaussian test
  The multiple_test wrapper
  Useful sequence diagrams
  Demands on Random Number Generators
  Foreign Random Number Generators
  The XML Schema
A. Collection of Test Parameters
B. Examples
C. Compiling the Test Suite

1. Introduction

How random is random?

In this diploma thesis a generic random number test suite (RNGTS) is developed. The test suite framework is written in C++ with attention to the modern generic programming paradigm. It is based on the Boost reference implementation of the forthcoming C++ standard random number generator library.

The aim of RNGTS is to assist in finding a suitable random number generator for a specific purpose and in distinguishing good random number generators from bad ones. Through a generic interface, RNGTS makes a variety of different tests available and makes it possible to extend the suite with user-defined tests. The test results are produced in XML format, which allows their transformation into summaries or detailed reports through the use of XSLT style sheets. The main purpose is to support the user in choosing a random number generator and in judging how random the numbers it produces are.

The second part of this paper contains a short discussion of the different types of random numbers and their applications. The third part presents the statistical methods involved and their programming interfaces. The fourth part covers the handling of RNGTS. This part is a must for the user who wants to perform any tests, and it also describes the core of the whole test suite. The fifth part presents the most popular random number generator tests, their parameters and programming interfaces. These tests are collected from different sources and authors. The sixth part is for advanced users who want to extend RNGTS with new tests or other extensions. Finally, the appendix contains a collection of lists with test parameters and other useful material.

Download

The RNGTS framework is located on the web server and may be downloaded there. There are also some installation hints, some examples and the full documentation with additional interface descriptions and the XSL schema.

2. What are random numbers?

Random numbers are characterised by the fact that their value cannot be predicted. In other words, if one constructs a sequence of random numbers, the probability distribution of each following random number has to be completely independent of all the other generated numbers. A more sophisticated mathematical definition and discussion can be found in [6].

2.1. Types of random numbers

There are three types of random numbers: quasi-, pseudo- and true random numbers. These different types of random numbers have different applications. (It is a philosophical question what we can call random or not, but here we use the following descriptions, which is simpler.)

True Random Number

The most often used example for truly random numbers is the decay of a radioactive material. If a Geiger counter is put in front of such a radioactive source, the intervals between the decay events are truly random. True random numbers are gained from physical processes like radioactive decay or rolling a die. But rolling a die is problematic: perhaps someone could control the die well enough to determine the outcome.

Pseudo Random Number

These numbers are generated by a computer, that is to say by an algorithm, and are therefore not truly random. Every new number is generated from the previous ones by an algorithm. This means that the new value is fully determined by the previous ones. But, depending on the algorithm, they often have properties that make them very suitable for simulations.

Quasi Random Number

A good description is quoted from [25], Chapter 7.7:

"Sequences of n-tuples that fill n-space more uniformly than uncorrelated random points are called quasi-random sequences. That term is somewhat of a misnomer, since there is nothing random about quasi-random sequences: They are cleverly crafted to be, in fact, sub-random. The sample points in a quasi-random sequence are, in a precise sense, maximally avoiding each other."

Quasi random numbers are not designed to appear random, but rather to be uniformly distributed. One aim of such numbers is to reduce and control errors in Monte Carlo simulations.

A picture is always a good way to illustrate the difference between these two types. Figures 2.1 and 2.2 show plots with different numbers of pseudo- and quasi-random points. This is a good demonstration of the structure of quasi-random numbers, but it is also possible to see that quasi-random numbers fill the whole plane evenly, while pseudo-random numbers may build clusters and holes. Whenever we talk about random numbers in the following parts, we mean pseudo random numbers.

Figure 2.1.: Pseudo Random Numbers (four panels with different numbers of points)

Figure 2.2.: Quasi Random Numbers (four panels with different numbers of points)

Footnote 1: This plot was generated with the Matlab 6 rand generator, a combination of a lagged Fibonacci generator with a cache of 32 floating point numbers and a shift register random integer generator.

Footnote 2: This plot was generated with the sobol.m routine for Matlab from ~burkardt/m_src/sobol/sobol.html. This web site also includes a variety of references for Sobol sequences and some implementations in different programming languages.

3. Analyzing Statistics

In this section we describe the χ² test and the Kolmogorov-Smirnov test. Both are designed to check whether a measured distribution is similar to the expected distribution, so that different distributions can be compared. Later on we describe the Gaussian test, which is based on the Gaussian normal distribution. A detailed description of the outlined C++ classes can be found in the section about implementing additional tests.

3.1. χ² test (Chi-square test)

The χ² test is perhaps the best-known statistical test. It is based on a comparison between the empirical distribution function and the theoretically expected distribution. The empirical distribution is based on the results of the random process. The $n$ measured random values must be divided into $k$ classes $I_1, I_2, \ldots, I_k$. The classes contain $n_1, n_2, \ldots, n_k$ values, with $\sum_i n_i = n$. For each class, the expected number of values $n p_i$ must be calculated from the expected distribution function for a given $p_i$ (with $\sum_i p_i = 1$). Considering the squares of the differences between the measured values and the expected values gives the χ² value

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i} = \frac{1}{n} \sum_{i=1}^{k} \frac{n_i^2}{p_i} - n \qquad (3.1)$$

With $k$ classes, there are $\nu = k - 1$ degrees of freedom in the χ² distribution. By looking up χ² and ν in χ² distribution tables, which can be found in [16] and [3], the probability of being above or below the given χ² can be found. Calculating the probability of a χ² value is not an easy task, but there is an algorithm published by Hill and Pike which can be used, see [11], [12], [14].

Example: Throwing a die

After throwing a die 120 times we get the following results:

[Table: value, # observed; the entries are not recoverable]

Footnote 1: Sometimes the χ² test stands for the Equidistribution test.

There is no reason to change the $k = 6$ natural classes $I_1, I_2, \ldots, I_6$. The number of values is $n = 120$. For a true die we expect a probability of $p_i = \frac{1}{6}$ for each die number. The expected number of values is $n p_i = 20$. The χ² value is calculated by the following sum:

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i}$$

Here we have $k = 6$ classes. This means that the number of degrees of freedom is $\nu = 5$. Looking up the resulting χ² in a table, the value lies between the 50% and 75% points. This means that a χ² at least this large occurs between 25% and 50% of the time. The randomness observed in this experiment is satisfactory in this test.

Available code

To handle the χ² statistics there is the chisquare_test class, which provides different methods used for the calculation. Some important methods are listed in the declaration below. The class is defined in the chisquare_test.h file.

    class chisquare_test {
    public:
        void prepare_statistics(std::size_t count_size, uint64_t runs,
                                std::size_t degoffreedom = 0);

        template<class ForwardIterator>
        void calculate_chisquare_value(ForwardIterator first, ForwardIterator last,
                                       std::size_t degoffreedom);

        template<class ForwardIterator>
        void calculate_chisquare_value(ForwardIterator first, ForwardIterator last);

        void set_chisquare_value(double chisquarevalue, std::size_t degoffreedom);

        chisqr_stat_type get_chisquare_value();
        double get_chisquare_prob();
    };

In the same file there is also a free function that calculates the χ² value without the class.

    template<class ForwardIterator, class UnaryFunction>
    double calc_chisquare_value(ForwardIterator first, ForwardIterator last,
                                UnaryFunction probability, std::size_t degoffreedom)

To calculate the probability from a χ² value there is a function in the file chisqr_prob.h.

    double chi_probability(double chisqr, int dof)

3.2. Kolmogorov-Smirnov test (KS test)

As we have seen, the χ² test can be applied when observations fall into a finite number of categories. But normally one will consider random quantities which may assume infinitely many values. In this test, the empirical distribution function $F_n(x)$ of the random number generator is compared to the expected distribution function $F(x)$. In [16], Knuth defines these functions as follows:

$$F(x) = \text{probability that } X \le x$$

$$F_n(x) = \frac{\text{number of } X_1, X_2, \ldots, X_n \text{ which are } \le x}{n}$$

The $n$ measured random values must be sorted in ascending order, $X_1 \le X_2 \le \ldots \le X_n$. To make the test, we form the following statistics:

$$K_n^+ = \sqrt{n} \max_{-\infty < x < +\infty} \bigl(F_n(x) - F(x)\bigr) = \sqrt{n} \max_{1 \le i \le n} \Bigl(\frac{i}{n} - F(X_i)\Bigr)$$

$$K_n^- = \sqrt{n} \max_{-\infty < x < +\infty} \bigl(F(x) - F_n(x)\bigr) = \sqrt{n} \max_{1 \le i \le n} \Bigl(F(X_i) - \frac{i-1}{n}\Bigr)$$

As in the χ² test, we may now look up the values $K_n^+$ and $K_n^-$ in a table [16] to determine whether they are significantly high or low. Another way is to calculate the probabilities with the algorithm given in [1] and in chapter 3.3.1, C. History, bibliography, and theory of [16]. In [16] there is also a formula to calculate the probability exactly:

$$\Pr\Bigl(K_n^+ \le \frac{t}{\sqrt{n}}\Bigr) = \frac{t}{n^n} \sum_{k=0}^{\lfloor t \rfloor} \binom{n}{k} (k - t)^k (t + n - k)^{n-k-1} \qquad (3.2)$$

Example: 10 random numbers

We got 10 numbers from a random number generator. These are

{0.809, 0.465, 0.151, 0.628, 0.318, 0.824, 0.394, 0.968, 0.179, 0.458}

First we sort the random numbers $X_i$ in ascending order. Then we calculate the quantities $K_i^+$ and $K_i^-$ and find the maximum of each.

With these values we calculate $K_{10}^+$ and $K_{10}^-$ as follows:

$$K_{10}^+ = \sqrt{n} \max_{1 \le i \le n} K_i^+ \qquad K_{10}^- = \sqrt{n} \max_{1 \le i \le n} K_i^-$$

[Table: $i$, $X_i$, $K_i^+$, $K_i^-$; the entries are not recoverable]

If we look up these values in an appropriate table for $n = 10$, we find that the chance of getting a $K_{10}^+$ or $K_{10}^-$ greater than these values lies between 50% and 75%.

Available code

To calculate the Kolmogorov-Smirnov statistics there is a class which supports the required routines. The definition of this class, called ks_test, is found in ks_test.h. Some important methods are listed below.

    class ks_test {
    public:
        void prepare_statistics(uint64_t runs);

        template<class ForwardIterator>
        void calculate_ks_value(ForwardIterator first, ForwardIterator last);

        template<class ForwardIterator, class UnaryFunction>
        void ks_value(ForwardIterator first, ForwardIterator last,
                      UnaryFunction integratedprobdistr);

        ks_stat_type get_ks_value();
        ks_prob_type get_ks_prob();
    };

There is also a free function to calculate the KS values.

    template<class ForwardIterator, class UnaryFunction>
    std::pair<double, double> calc_ks_value(ForwardIterator first, ForwardIterator last,
                                            UnaryFunction integratedprobdistr)

To calculate the probability for a KS value the following function is defined in the file ks_prob.h.

    boost::tuple<double, double> ks_probability(int n, std::pair<double, double> kspair)
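The calculation of the example can be sketched in standalone C++ as follows. This is not the ks_test class of the suite: the function name ks_uniform is hypothetical, and the sketch assumes that the ten numbers are tested against the uniform distribution on [0, 1), so that $F(x) = x$.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the pair (K+, K-) for a sample tested against the uniform
// distribution on [0, 1), whose distribution function is F(x) = x.
// Assumption for this sketch: F is hard-coded; a general version would
// take F as a function object.
std::pair<double, double> ks_uniform(std::vector<double> sample)
{
    std::sort(sample.begin(), sample.end());   // X_1 <= X_2 <= ... <= X_n
    const double n = static_cast<double>(sample.size());

    double max_plus = 0.0;    // max over i of  i/n - F(X_i)
    double max_minus = 0.0;   // max over i of  F(X_i) - (i-1)/n
    for (std::size_t i = 0; i < sample.size(); ++i) {
        const double f = sample[i];             // F(X_i) = X_i
        max_plus = std::max(max_plus, (i + 1) / n - f);
        max_minus = std::max(max_minus, f - i / n);
    }
    return {std::sqrt(n) * max_plus, std::sqrt(n) * max_minus};
}
```

Called with the ten numbers of the example, the largest difference $i/n - F(X_i)$ is 0.135 (at $i = 6$) and the largest difference $F(X_i) - (i-1)/n$ is 0.151 (at $i = 1$), so the function returns approximately $K_{10}^+ = 0.427$ and $K_{10}^- = 0.478$.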

Figure 3.1.: Gaussian distribution (marked: mean and the ±σ interval)

Figure 3.2.: Percentage function (axes: percent against deviation factor)

3.3. Gaussian Test

The Gaussian test is a little different from the χ² or the Kolmogorov-Smirnov test. In these two tests the expected distribution function is compared with the measured distribution function, and some indicators are calculated from the difference. The Gaussian test takes a physical view. If a measurement is done, it is known that, even if the best tools are used, the result depends on a number of irregular and uncontrolled parameters. These measurement errors are random and are a combination of different single errors. The central limit theorem states that the measured value behaves like a normally distributed random variable (this is valid in the normal case). The normalized density function is written as

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} \qquad (3.3)$$

where µ is the mean or expected value and σ the standard deviation. To classify measured values one can compare the deviation from the expected value with the standard deviation. It can be calculated that 68.3% of all measured values are expected in the interval $[\mu - \sigma, \mu + \sigma]$. If the interval is expanded to $[\mu - 3\sigma, \mu + 3\sigma]$, we expect 99.7% of all measured values in this range.

Based on this theory it is possible to give a probability for a measured value. The assumption is that the expected value and the deviation are known. The deviation factor $x$, the deviation from the expected value in units of σ, is converted into a percentage with the following formula:

$$\mathrm{perc}(x) = \frac{1}{2} \left[ 1 + \mathrm{erf}\!\left( \frac{x}{\sqrt{2}} \right) \right] \qquad (3.4)$$

where erf denotes the error function $\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} \, dt$. In this formula the mean value (expected value) corresponds to 50%. If the deviation is positive, a percentage value bigger than 50% results; if the deviation is negative, a percentage value smaller than 50% results. The function is shown in figure 3.2.

Example: Ising model test statistic

We run the Ising model test described later on and check the result. From the simulation we get a specific energy of [...] whereas a value of [...] is expected. The standard deviation is calculated as [...]. This results in a deviation from the mean of [...] σ. This result can be converted into percent, and one gets [...] % from the mean value. That means that only [...] % of the measured values will be smaller than this value.

Available code

To support the calculation of the Gaussian statistics there is a class called gaussian_test in the file gaussian_test.h. The declarations of the most important methods are listed below.

    class gaussian_test {
    public:
        void prepare_statistics(double deviation, double stat_value, double mean);
        void calc_gaussian_value();
        double get_gaussian_prob();
    };
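The conversion from a measured value to a percentage can be sketched with the error function of the C++ standard library. This is a minimal illustration and not the gaussian_test class: the function names deviation_factor and perc are chosen here, and the sketch assumes that the percentage function is the cumulative distribution function of the standard normal distribution, which maps the mean to 50%.

```cpp
#include <cmath>

// Deviation of a measured value from the expected value in units of the
// standard deviation sigma.
double deviation_factor(double value, double mean, double sigma)
{
    return (value - mean) / sigma;
}

// Percentage function, assumed here to be the standard normal CDF:
// perc(x) = (1 + erf(x / sqrt(2))) / 2, returned as a fraction in [0, 1].
double perc(double x)
{
    return 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
}
```

Under this assumption perc(0.0) is 0.5, and the differences perc(1.0) - perc(-1.0) and perc(3.0) - perc(-3.0) reproduce the 68.3% and 99.7% intervals quoted for the normal distribution.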

4. Using the Random Number Generator Test Suite

This section describes how to use the Random Number Generator Test Suite (RNGTS) with the available wrappers and helpers. The aim was to supply a simple but sufficiently powerful interface for building a flexible system that tests different types of random number generators with different tests, and that allows various kinds of result representations to be generated from a universal XML output format.

4.1. How to run a test

Testing a random number generator is simple. The only requirement is that the generator fulfils the Boost pseudo-random number engine requirements. These can be found in http: // written by Jens Maurer. The listing below shows an exemplary test program.

    // include Boost's random number generators
    #include <boost/random.hpp>

    // standard headers used below
    #include <exception>
    #include <fstream>
    #include <iostream>

    // definition to show progress during the test
    #define PRINT_STATUS

    // include the test suite environment
    #include "rng_test_suite.h"

    // include the headers of all tests used
    #include "poker_test.h"
    #include "ising_model_test.h"

    int main()
    {
        // import random number generators from Boost
        using boost::lagged_fibonacci44497;
        using boost::mt19937;

        // create a test suite using uint32_t seeds
        rng_test_suite<> testsuite;

        // add the desired confidence levels
        testsuite.add_confidence_level(0.05);
        testsuite.add_confidence_level(0.95);
        testsuite.add_confidence_level(0.1);
        testsuite.add_confidence_level(0.9);

        // add the desired seeds
        testsuite.add_seed(/* ... */);
        testsuite.add_seed(236598);
        testsuite.add_seed(1237);

        // register the random number generators to test
        testsuite.register_rng<lagged_fibonacci44497>("lagged Fibonacci 44497");
        testsuite.register_rng<mt19937>("mt19937, Mersenne Twister", 10000);

        // create the test objects
        poker_test pokertest(100000, 5);
        ising_model_test isingtest(/* ... */, 16);

        // register the tests
        testsuite.register_test<ising_model_test>(isingtest);
        testsuite.register_test<poker_test>(pokertest);

        // run tests...
        // specify the destination of the XML output, here a file
        std::ofstream file_out("test_output.xml");

        // run all tests and catch possible exceptions
        try {
            // true: logic_error exceptions thrown by tests are caught by the suite
            testsuite.run_test(file_out, true);
        } catch (std::exception& e) {
            std::cout << "exception occurred, program terminated: " << e.what();
        }
        file_out.close();

        return 0;
    }

4.2. The rng_test_suite environment

In the following sections we describe the use and the possibilities of the rng_test_suite environment.

Template Parameter

The template parameter used to construct the class specifies the type of the seed values. As noted in the wg21 proposal for Boost random number generators, the seed values have to be of an unsigned integral type. The default is uint32_t.

    template <typename seedtype=uint32_t>
    class rng_test_suite {... }

Confidence Level

To add a confidence level, which specifies the limits of the statistical calculations, the following method has to be used. The confidence level must lie between 0 and 1. If nothing is added, the test suite uses 5% and 95% as defaults.

    void add_confidence_level(double cl);

Seeds

Adding seeds is not an easy task, because the pseudo-random number engine requirements specify only iterator-based seeding and nothing else. But most generators also support a seed(seedtype) method, so it is possible to add multiple seeds for use with the generators. If a generator does not support the seed(seedtype) method, the test suite uses a pseudo-DES algorithm (see [25], sec. 7.5) to create a set of numbers and feeds these numbers into the generator through the mandatory iterator-based seed method.

    void add_seed(uint32_t seed)

The other way to seed the generator is to fill its buffer with values. This is done with the seed(iterator, iterator) method, which must be supported by all generators from Boost. The user has to check that there are enough values between the two iterators to fill the buffer. If there are insufficient values, an exception might be thrown.

    template <typename seediter>
    void add_seed_iterators(const seediter begin, const seediter end)

    <seediter>  type of iterator
    begin       iterator to the beginning of the buffer with seeds
    end         iterator to the end of the buffer with seeds

Before a random number generator is seeded, it is reset to its initial state, that is, to the state it had when it was added to the test suite. This guarantees repeatability for different seedings. If no seeds are added, the tests run with the initial state of the generator. If a generator has to be tested in a special state, e.g. with a specially seeded buffer, the method register_seeded_rng handles this case.

Random Number Generators

To register the random number generators which have to be tested, the test suite provides the following two methods. (The requirements on a random number generator are described in section 6.4.)

    template <class T>
    void register_rng(std::string rng_name, uint64_t warmup = 0)

    <class T>   type of the random number generator to test
    rng_name    name of the generator, should be unique
    warmup      number of random numbers to produce with the generator before starting the test

This method takes the type of the random number generator as a template parameter. The concrete generator object is created inside the test suite with the default constructor. The seed calls are done in this initial state.

    template <class T>
    void register_seeded_rng(T mrng, std::string rng_name, std::string description, uint64_t warmup = 0)

    <class T>    type of the random number generator to test
    mrng         object of the random number generator of type T
    rng_name     name of the generator, should be unique
    description  a description of the seed state
    warmup       number of random numbers to produce with the generator before starting the test

This method takes an object of a random number generator as a parameter, so it is possible to use pre-seeded generators. For most generators, all further operations are done on this state of the generator. This is valid if and only if the generator class does not have external links, e.g. function pointers. If a foreign random number generator (see 6.5) is used, the generator will not be seeded before the test; it remains in its previous state.

Tests

Adding a test is really simple: just create an object of the test class and add it to the test suite. This is done with the following method.

    template <class T>
    void register_test(T test)

    <class T>   type of the test
    test        test to add to the collection of tests to perform

It is important to note that the test must be in a ready-to-run state when it is added to the test suite, because the test suite calls only the run method and nothing before it.

Running...

Once all desired generators, seeds and tests are added to the test suite, the tests can be run by calling the run_test method. One has to specify where to write the XML output. Writing to the terminal is as simple as writing to a file. The second argument specifies whether logic errors thrown by a test are caught or not. As an example, an exception may be thrown if one tries to run a binary rank test for matrices bigger than the number of bits of the random number generator. If the exception is not caught, the test suite stops and does not finish the other tests. If the exception is caught, the test is omitted and the test suite continues its work.

    void run_test(std::ostream& out, bool catch_logic_errors = true)

    out                  ostream to write the XML output to
    catch_logic_errors   specifies whether logic_errors thrown by tests are caught or not

The call to run_test should be placed in a try block, since there are other sources which throw exceptions. The seeds are tested in the following order:

    user-seeded generators
    generators seeded with seed(s)
    generators seeded with seed(it, it)

22 4. Using the Random Number Generator Test Suite 4.3. Testing Parallel Random Number Generators To test a parallel application using different random number generators in different threads, there is a class called parallel_rng_imitator(from parallel_rng_imitator.h) which simulates such an application. The class contains a collection of definable generators and calls one after another. This generator fulfils the Boost specification and can be used in a normal way. There are some preconditions to keep in mind when using such a random number generator. All random number generators used in this parallel random number generator must have the same result_type. Unfortunately the boost::uniform_01 type does not support an default constructor, so it is not possible to map the result type to an other type. To do this, a converter which fulfils the specified interface for generators has to be written. All random number generators must have the same maximum and minimum value. Pre-seeded random number generators should be favoured because of a better control over seeding the particular generators. 
// include Boosts RNGs #include <boost/random.hpp> // include parallel generator #include "parallel_rng_imitator.h" // import RNGs from Boost using boost::minstd_rand0; using boost::lagged_fibonacci19937; using boost::lagged_fibonacci23209; using boost::lagged_fibonacci44497; using boost::mt19937; using boost::ecuyer1988; // make a RNG from two different Lagged Fibonacci RNGs parallel_rng_imitator< boost::tuple< lagged_fibonacci23209, lagged_fibonacci44497> > parallelrng; // does not compile, because the enlisted generators do not // have same result_type parallel_rng_imitator< boost::tuple< minstd_rand0, // result_type int32_t lagged_fibonacci23209, // result_type double mt19937> // result_type uint32_t > parallelrng_error_compile; // does compile, but throws an exception because the RNGs // does not have same min() or max() value parallel_rng_imitator< boost::tuple< ecuyer1988, // max =

23 4.4. Iterating a test minstd_rand // max = > parallelrng_error_runtime; 4.4. Iterating a test The idiom says that Once doesn t count. So, we have to repeat a test multiple times and make a statistic over all results. (Probably we also like to repeat this repetition... ) This class iterates a given test n times and calculates a Kolmogorov-Smirnov statistic over all results. This is only possible if the test to iterate is derived from the chisquare_test, ks_test or gaussian_test base class. The iteration of a χ 2 or a gaussian test give a normal K-S statistic. But if we have to do this for a K-S test itself, we get four values, K and K for the original K and the same for the original K. The iterate_test fulfils the test interface and acts like a normal test. template< class Test > iterate_test(test test, std::size_t iterations) <class Test> type of the test test test to iterate iterations number of times to iterate the test 4.5. Count failings Another way to decide about success or failure is to count the failings of each test and compare with a maximal number of failures. This class iterates a given test n times and count the number of failings. If the test fails more than the faillimit allows, then it will fail, else the test is passed. (Mathematically, failings faillimit) This is only possible if the test to iterate is derived from the chisquare_test, ks_test or gaussian_test base class. The iteration of a χ 2 or a gaussian test gives one value for failings, the K-S test variation results in two values, one for K and one for K. The count_fails_test fulfils the test interface and acts like a normal test. template< class Test > count_fails_test(test test, std::size_t iterations, std::size_t faillimit) <class Test> type of the test test test to iterate iterations number of times to iterate the test faillimit Limit deciding between failure or success 4.6. 
Bit Tests In some kind of tests, like in the count-the-1 s test from the Diehard test suite, overlapping ranges of bits are tested. From each random number some new particular numbers are 15

24 4. Using the Random Number Generator Test Suite built. This is done by masking the bit representation of the number with a specific mask which is shift from the least significant bit to the most significant bit. An example of splitting up a number in overlapping sub-numbers is given in figure 4.1. This class is called original = 180 Bits 3..0 = 4 Bits 4..1 = 10 Bits 5..2 = 13 Bits 6..3 = 6 Bits 7..4 = Figure 4.1.: Bit Test, Example of Bit Concatenation rng_bit_test and is located in the same denominated header file. This wrapper can only be used if the test is derived from one of the given base classes (chisquare_test, ks_test or gaussian_test). The interface is template<class TEST, int no_bits> rng_bit_test(test test) <class TEST> type of the test <int no_bits> number of bits for each random number test test to use for bit test Example As an example we want to know if a sequence of each 10 bits is uniformly distributed in a χ 2 sense. We have to create a test object, pass this to the wrapper an register the test. chisqr_uniformity_test chi_uni_test(200000, 10); rng_bit_test<chisqr_uniformity_test, 10> bit_chi_uni_test(chi_uni_test); rngtest.register_test<rng_bit_test<chisqr_uniformity_test, 10> >(bit_chi_uni_test); 4.7. Bit extract test Another way to test a generator is to extract only a specific range of bits from each generated random number and interpret this bits as a new number. In figure 4.2 bits 2 5 are used to make a new number. Or, we take a specific bit of a number of random numbers and interpret this bits as a new number. In figure 4.3 this is done with bit 5. To build a new random number bit five of six consecutive random numbers are used. This tests are supported by two wrappers in rng_bit_extract.h. template<typename RNG, int start_bit, int no_bits> bit_extract(std::size_t b=10240) <typename RNG> type of random number generator 16

    <int start_bit>  first bit of new random number
    <int no_bits>    number of bits of new random number
    b                buffer size of random number generator

Figure 4.2.: Extracting subsequences as next random numbers (labels: Original, Bit 5, Bit 2, Selected)

Figure 4.3.: Concatenating single bits to the next random number (labels: Original, Bit, Selected; selected value 36)

    template<typename RNG, int bit_no, int seqlength> bit_sequence(std::size_t b=10240)

    <typename RNG>   type of random number generator
    <int bit_no>     bit to use for random number
    <int seqlength>  number of bits for each random number
    b                buffer size of random number generator

4.8. The XML output

The result of every test is written to a specific stream. This stream is passed to the run_test(std::ostream, bool catch_logic_errors = true) method. The output may be written to the console via std::cout or, better for further processing, to a file. To write the output to a file, one has to create a file stream like this:

    #include <fstream>
    std::ofstream fileout("results.xml");

For a more detailed description of the XML schema see
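The bit manipulations behind the wrappers of the Bit Tests and Bit extract test sections (figures 4.1 to 4.3) can be illustrated with a minimal sketch. The function names overlapping_windows, extract_bits and concat_single_bits are hypothetical helpers chosen for this illustration; they are not part of rng_bit_test or rng_bit_extract.h. The sketch also assumes that, when single bits are concatenated, the first random number supplies the most significant bit, which the text does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build a mask with the lowest `no_bits` bits set (hypothetical helper).
static std::uint32_t low_mask(int no_bits) {
    return (no_bits == 32) ? 0xFFFFFFFFu : ((1u << no_bits) - 1u);
}

// Split `value` into overlapping windows of `no_bits` bits, shifting the mask
// from the least significant bit upwards (cf. figure 4.1).
// `total_bits` is the number of bits of `value` that are considered.
std::vector<std::uint32_t> overlapping_windows(std::uint32_t value,
                                               int total_bits, int no_bits) {
    assert(no_bits > 0 && no_bits <= total_bits && total_bits <= 32);
    std::vector<std::uint32_t> windows;
    for (int shift = 0; shift + no_bits <= total_bits; ++shift)
        windows.push_back((value >> shift) & low_mask(no_bits));
    return windows;
}

// Extract `no_bits` bits starting at `start_bit` (bit 0 is the least
// significant bit) and interpret them as a new number (cf. figure 4.2).
std::uint32_t extract_bits(std::uint32_t value, int start_bit, int no_bits) {
    assert(start_bit >= 0 && no_bits > 0 && start_bit + no_bits <= 32);
    return (value >> start_bit) & low_mask(no_bits);
}

// Concatenate bit `bit_no` of consecutive numbers into one new number
// (cf. figure 4.3). Assumption: the first number supplies the highest bit.
std::uint32_t concat_single_bits(const std::vector<std::uint32_t>& values,
                                 int bit_no) {
    assert(bit_no >= 0 && bit_no < 32 && values.size() <= 32);
    std::uint32_t result = 0;
    for (std::uint32_t v : values)
        result = (result << 1) | ((v >> bit_no) & 1u);
    return result;
}
```

Called as overlapping_windows(180, 8, 4), the first helper reproduces the five 4-bit windows of figure 4.1.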

5. Tests for Studying Random Data

In this section we present different tests to study the behavior of random number generators. We can distinguish two different sorts of tests, statistical tests and physical tests¹. The only difference is the motivation for doing the test: in the first case we want to know the behavior of some statistical properties, in the second case we simulate a physical system. (Strictly speaking there are further kinds of tests, such as visual tests or theoretical tests, but we do not consider them because they cannot be automated.) Each of these tests checks a particular property of the generated numbers against the theoretically expected behavior. These tests are not my invention; I only collected them and added examples of their usage. A reference to the source (not source code) is given with each test.

Table 5.1 lists many known random number generator tests and their occurrence in often-cited test benches. It is impossible to list all tests, since there is an infinite number of them, so we mention the most popular ones. A more interesting table for testers is table 5.2. It shows all tests available² in the test suite and their class names. (The name of the header file is the concatenation of the class name and .h.)

5.1. Equidistribution test

In this test we check whether the generated numbers are equally distributed. See [16].

- The N measured random values in the interval $[\alpha, \beta)$ must be divided into k classes $I_1, I_2, \dots, I_k$.
- The classes contain $N_1, N_2, \dots, N_k$ values, with $N_1 + N_2 + \dots + N_k = N$.
- For each class, the expected number $N/k$ is calculated under the assumption that all values appear with the same probability density $p = 1/(\beta - \alpha)$.
- Check the class counts with the χ² test and use the KS test to check the whole data.

¹ A nice description of physical tests is given in [29]: "Passing several tests does not prove the randomness of any sequence, however. This is due to the fact that proving randomness requires that the sequence fulfils an actual definition for randomness. An unfortunate fact is, however, that there is no unique definition for randomness. [...] Therefore, passing many tests is never a sufficient condition for the use of any pseudo random number generator in all applications. In other words, in addition to standard tests, efficient application specific tests of randomness are also needed. This need is emphasized by recent simulations, in which some physical models combined with special algorithms have been found which are very sensitive to the quality of random numbers."

² I hope that by the time this paper is published the list will already be updated with further implementations.
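The χ² part of the equidistribution test of section 5.1 can be sketched as follows. This is a minimal illustration under the assumption of k equally wide classes on $[\alpha, \beta)$; the function name chisqr_uniformity is hypothetical and is not the interface of chisqr_uniformity_test. Comparing the statistic with a χ² distribution with k - 1 degrees of freedom is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Chi-square statistic for N samples sorted into k equally wide classes
// on [alpha, beta). Each class is expected to receive N / k samples.
double chisqr_uniformity(const std::vector<double>& samples,
                         double alpha, double beta, std::size_t k) {
    assert(k > 0 && alpha < beta && !samples.empty());
    std::vector<std::size_t> counts(k, 0);
    for (double x : samples) {
        assert(x >= alpha && x < beta);
        // Map the sample to its class index.
        std::size_t idx =
            static_cast<std::size_t>((x - alpha) / (beta - alpha) * k);
        if (idx >= k) idx = k - 1;  // guard against rounding at the upper edge
        ++counts[idx];
    }
    const double expected = static_cast<double>(samples.size()) / k;
    double chi2 = 0.0;
    for (std::size_t n : counts) {
        const double d = static_cast<double>(n) - expected;
        chi2 += d * d / expected;  // (observed - expected)^2 / expected
    }
    return chi2;
}
```

A perfectly balanced sample gives a statistic of zero; the more the class counts deviate from N/k, the larger the value becomes.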

Example: Throwing a die

An example for the χ² part of this test is given in section 3.1 and an example for the KS test can be found in section 3.2.

Constructor in chisqr_uniformity_test.h

    chisqr_uniformity_test(uint64_t n, std::size_t classes)

    n         number of random numbers to count
    classes   number of classes into which the random numbers are sorted

Constructor in ks_uniformity_test.h

    ks_uniformity_test(std::size_t n)

    n         number of random numbers to count

5.2. Run test

In this test we look for monotone subsequences of the original sequence, which are called runs. There are three different sorts of tests: we can count runs up and runs down, runs above and runs below the mean, or the lengths of runs.

As an example of a run, consider the sequence of eleven numbers { }. To show the runs up we put a vertical line at the left and right ends and between $X_i$ and $X_{i+1}$ whenever $X_i > X_{i+1}$. Here we get

5.2.1. Runs up and down

- Split the sequence of random numbers into increasing and decreasing subsequences and count the runs $n_{inc}$ and $n_{dec}$; $a = n_{inc} + n_{dec}$ is their total and N is the number of samples.
- If N has an adequate size, the mean and variance of a are given by

  $\mu_a = \frac{2N - 1}{3}, \qquad \sigma_a^2 = \frac{16N - 29}{90}$

- For N > 20, the distribution of a is reasonably approximated by a normal distribution $N(\mu_a, \sigma_a^2)$.
- Convert to a standardized normal distribution by

  $Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{a - (2N - 1)/3}{\sqrt{(16N - 29)/90}}$

- Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where α is the level of significance.
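The runs up and down statistic can be sketched in a few lines. This is only an illustration of the formulas for $\mu_a$, $\sigma_a^2$ and $Z_0$ above, not the implementation in runs_test.h; the names runs_result and runs_up_down are hypothetical, and equal neighbours are treated as a step down, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct runs_result {
    std::size_t runs;  // a: total number of runs up and runs down
    double z;          // Z_0: standardized test statistic
};

// Count the runs up and down in x and standardize the count with
// mu_a = (2N - 1) / 3 and sigma_a^2 = (16N - 29) / 90.
runs_result runs_up_down(const std::vector<double>& x) {
    const std::size_t N = x.size();
    assert(N >= 2);
    std::size_t a = 1;             // the first step always opens a run
    bool rising = x[1] > x[0];
    for (std::size_t i = 2; i < N; ++i) {
        const bool now_rising = x[i] > x[i - 1];
        if (now_rising != rising) {  // direction changed: a new run starts
            ++a;
            rising = now_rising;
        }
    }
    const double mu = (2.0 * N - 1.0) / 3.0;
    const double sigma = std::sqrt((16.0 * N - 29.0) / 90.0);
    return {a, (a - mu) / sigma};
}
```

For a strictly increasing sequence the function reports a single run, and for a strictly alternating sequence of ten numbers it reports nine runs. The normal approximation behind z is only meaningful for sufficiently long sequences.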

Test (columns: available in test-bench Knuth 1, Helsinki 2, Diehard 3, SPRNG 4)

    Equidistribution Test (Frequency Test)
    Gap Test
    Ising Model Test
    n-block test
    Serial Test
    Poker Test (Partition Test)
    Coupon collector's Test
    Permutation Test
    Run Test
    Maximum of t Test
    Collision Test (Hash Test)
    Serial correlation Test
    Birthday-Spacings Test
    Overlapping Permutations Test
    Ranks of 31×31 and 32×32 Matrices Test
    Ranks of 6×8 Matrices Test
    Monkey Tests on 20-bit Words
    Monkey Tests OPSO, OQSO, DNA
    Count the 1's in a Stream of Bytes
    Count the 1's in Specific Bytes
    Parking Lot Test
    Minimum Distance Test
    Random Spheres Test
    The Squeeze Test
    Overlapping Sums Test
    The Craps Test
    Sum of distributions (for parallel streams)
    FFT
    Blocking Test
    2-d Random Walk
    Random Walkers on a line (S_n Test)
    2D Intersection Test
    2D Height Correlation Test
    Repeating Time Test
    Gorilla Test
    gcd Test
    Maurer's Universal Test

1 [16]   2 [29]   3 [18], [19]   4 [21]

Figure 5.1.: Compilation of known tests

Test                                          Class Name                Description
Equidistribution Test (Frequency Test)        ks_uniformity_test        5.1
                                              chisqr_uniformity_test    5.1
Gap Test                                      gap_test                  5.3
Ising Model Test                              ising_model_test          5.16
n-block test                                  n_block_test              5.18
Serial Test                                   serial_test               5.11
Poker Test (Partition Test)                   poker_test                5.4
Coupon collector's Test                       coupon_collector_test     5.5
Permutation Test                              permutation_test          5.6
Run Test                                      runs_test
Maximum of t Test                             max_of_t_test             5.7
Collision Test (Hash Test)                    collision_test            5.9
Serial correlation Test                       serial_correlation_test   5.10
Birthday-Spacings Test                        birthday_spacing_test     5.8
Overlapping Permutations Test
Ranks of 31×31 and 32×32 Matrices Test        bin_rank_chisqr_test
Ranks of 6×8 Matrices Test                    bin_rank_ks_test
Monkey Tests on 20-bit Words
Monkey Tests OPSO, OQSO, DNA
Count the 1's in a Stream of Bytes
Count the 1's in Specific Bytes
Parking Lot Test
Minimum Distance Test                         minimum_distance_test
Random Spheres Test                           random_sphere_test
The Squeeze Test                              squeeze_test
Overlapping Sums Test
The Craps Test                                craps_test
Sum of distributions (for parallel streams)                             5.22
FFT                                                                     5.23
Blocking Test
2-d Random Walk                               random_walk_test          5.17
Random Walkers on a line (S_n Test)
2D Intersection Test
2D Height Correlation Test                    height_corr2d_test        5.21
Repeating Time Test                                                     5.13
Gorilla Test                                                            5.15
GCD Test                                                                5.14
Maurer's Universal Test                                                 5.24

Figure 5.2.: Available tests in the RNGTS framework

Example: If a sequence of numbers has too few runs, it is unlikely to be a truly random sequence. If we look at the following sequence,

    {0.12, 0.35, 0.38, 0.45, 0.51, 0.69, 0.77, 0.78, 0.90, 0.93}

we can find only one run up. It is not likely to be a random sequence. If a sequence of numbers has too many runs, it is equally unlikely to be a truly random sequence. Look at the sequence

    {0.08, 0.93, 0.15, 0.96, 0.26, 0.84, 0.28, 0.79, 0.36, 0.57}.

If we split this sequence into runs up and runs down, we find five runs up and four runs down, nine runs in total.

5.2.2. Runs above and below mean

This test is an addition to the Runs up and down test (5.2.1). It is easy to build a sequence in which the first 20 numbers are above the mean and the following 20 numbers are below the mean, and which nevertheless does not fail the Runs up and down test. So we have to check the behaviour of the runs above and below the mean.

- Calculate the mean of the sequence of random numbers.
- Split this sequence into subsequences above and below the mean. Count the number of values above the mean, $n_a$, and below the mean, $n_b$; r is the total number of runs.
- The mean and variance of r can be expressed as

  $\mu_r = \frac{2 n_a n_b}{N} + \frac{1}{2}, \qquad \sigma_r^2 = \frac{2 n_a n_b (2 n_a n_b - N)}{N^2 (N - 1)}$

- For either $n_a$ or $n_b$ greater than 20, r is approximately normally distributed:

  $Z_0 = \frac{r - 2 n_a n_b / N - 1/2}{\sqrt{2 n_a n_b (2 n_a n_b - N) / (N^2 (N - 1))}}$

- Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where α is the level of significance.

Example: We have the following sequence of random numbers.

    {0.78, 0.49, 0.41, 0.58, 0.82, 0.26, 0.30, 0.06, 0.36, 0.01}

Calculating the mean gives $\mu = 0.407$.

Splitting the sequence into subsequences above and below the mean gives one run above and one run below the mean. It is not likely to be a random sequence.

5.2.3. Length of runs

This test is an addition to the last two tests. It is still possible to create a sequence of numbers which passes the last two tests although the probability that this sequence is truly random is very small. Such a sequence may consist of a run of two numbers below the mean, then a run of two numbers above the mean, and so on. So we need to test the randomness of the lengths of runs.

- Split the sequence into subsequences in one of the manners given above, where N is the number of samples.
- Store the number of runs of length i in RUN[i].
- Here we should not apply a χ²-test directly to the data stored in RUN, because adjacent runs are not independent: a long run tends to be followed by a short run, and vice versa. So the statistic should be computed as follows:

  $V = \frac{1}{N - 6} \sum_{i,j=1}^{6} (\mathrm{RUN}[i] - N b_i)(\mathrm{RUN}[j] - N b_j)\, a_{ij}$   (5.1)

The coefficients $a_{ij}$ and $b_i$ can be found in [16], which also shows a method to calculate the coefficients for an arbitrary maximal run length.

Example: Length of runs up

We have a random sequence: { }. Marking the runs up in the sequence produces the following statistic:

    1 run of length 3
    3 runs of length 2
    2 runs of length 1

Constructor in runs_test.h

    runs_test(uint64_t n, std::size_t maxrunlength)

    n              number of random numbers to check for runs
    maxrunlength   runs longer than this length are accumulated
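The bookkeeping step of the length of runs test, filling RUN[i], can be sketched as follows. The function name count_run_up_lengths is hypothetical and the sketch only tallies runs up; the statistic (5.1) itself is not computed here, since it needs the coefficients $a_{ij}$ and $b_i$ from [16]. Runs longer than maxrunlength are added to the last entry, mirroring the description of the constructor parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tally the lengths of runs up in x.
// Returns RUN with RUN[i] = number of runs up of length i for
// i = 1..maxrunlength; index 0 is unused. Longer runs are accumulated
// in RUN[maxrunlength].
std::vector<std::size_t> count_run_up_lengths(const std::vector<double>& x,
                                              std::size_t maxrunlength) {
    assert(maxrunlength >= 1);
    std::vector<std::size_t> run(maxrunlength + 1, 0);
    if (x.empty()) return run;
    std::size_t length = 1;  // length of the run currently being extended
    for (std::size_t i = 1; i < x.size(); ++i) {
        if (x[i] > x[i - 1]) {
            ++length;        // still going up
        } else {
            ++run[length < maxrunlength ? length : maxrunlength];
            length = 1;      // a new run starts at x[i]
        }
    }
    ++run[length < maxrunlength ? length : maxrunlength];  // close the last run
    return run;
}
```

For the made-up input 1, 2, 9, 8, 5, 3, 6, 7, 0, 4 the runs up are 1 2 9 | 8 | 5 | 3 6 7 | 0 4, so the tally holds two runs of length 1, one of length 2 and two of length 3.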

Internally, this test has to invert a matrix. This functionality is provided by the LAPACK library, and the matrix handling is covered by routines from BLAS. The Boost interface for these two libraries is not yet in the official release, but is available in the Boost-Sandbox. To use this test, the Boost-Sandbox must be installed, which is also available at [2] at Sandbox CVS.

5.3. Gap test

This test is used to examine the length of gaps between occurrences of samples in a certain range. It determines the length of consecutive subsequences with samples not in a specific range. The algorithm to count the gap length is found in [16].

- Define an interval $[\alpha, \beta)$ with $0 \le \alpha < \beta \le 1$.
- Define a list to save the number of occurrences of gaps with length l, where $0 \le l \le t$. This is easily done with a structure like COUNT[l]. With every occurrence of a gap of length l, do COUNT[l] = COUNT[l]+1. If l is bigger than t, increase COUNT[t].
- Search for a subsequence $X_i, X_{i+1}, \dots, X_{i+l}$ of the random sequence $X_0, X_1, \dots, X_N$ in which $X_{i+l}$ lies in $[\alpha, \beta)$ but the other X's do not. This subsequence of l + 1 numbers represents a gap of length l and increases the number in COUNT[l].
- After enough samples have been tested, the χ²-test is applied to the k = t + 1 values of COUNT[0], COUNT[1], ..., COUNT[t], using the following probabilities:

  $p_0 = p, \quad p_1 = p(1-p), \quad p_2 = p(1-p)^2, \quad \dots, \quad p_{t-1} = p(1-p)^{t-1}, \quad p_t = (1-p)^t$

  Here $p = \beta - \alpha$ is the probability that $\alpha \le X_i < \beta$.

The gap test can be applied with α = 0 or β = 1 to simplify the test procedure. The special cases $(\alpha, \beta) = (0, \frac{1}{2})$ or $(\frac{1}{2}, 1)$ give rise to the runs above mean or runs below mean test.

This is not the same implementation of the test as used in [29]. They use n random numbers and count the number of gaps, whereas this algorithm produces random numbers until n gaps have been counted. An approximate conversion from one test to the other is possible with an estimate for the number of gaps within n random numbers:

  $\text{gaps} \approx n (\beta - \alpha)$

Example: We have the following sequence:

    {0.11, 0.83, 0.56, 0.95, 0.88, 0.73, 0.91, 0.01, 0.75, 0.67, 0.23, 0.38}

In this case we take the first two numbers to determine the interval. This means α = 0.11 and β = 0.83, i.e. the interval [0.11, 0.83). The sequence to check is

    {0.56, 0.95, 0.88, 0.73, 0.91, 0.01, 0.75, 0.67, 0.23, 0.38}

The first value lies in the interval, the next two values do not. This means that the gap length is 2. Marked in the sequence, with bold letters for values in the interval and numbers to count the gap length, the sequence looks as follows:

Calculating the probabilities with p = 0.72 and a total of five gaps:

    t    p_t    expected # of gaps    counted # of gaps

Constructor in gap_test.h

    gap_test(std::size_t n, double lowergaplimit, double uppergaplimit, std::size_t maxgapcount)

    n               number of random numbers to count
    lowergaplimit   start of gap (α)
    uppergaplimit   end of gap (β)
    maxgapcount     number of steps counted until they are accumulated

5.4. Poker test

The original poker test considers n groups of five successive integers, denoted by $X_{5i}, X_{5i+1}, \dots, X_{5i+4}$, $0 \le i < n$. We observe which of the following seven patterns each quintuple matches:

    All different:    abcde
    One pair:         aabcd
    Two pairs:        aabbc
    Three of a kind:  aaabc
    Full house:       aaabb
    Four of a kind:   aaaab
    Five of a kind:   aaaaa

A χ²-test is based on the number of quintuples in each category. To get a simpler version of this test, a good compromise [16] is to simply count the number of distinct values in the set of five. So we would have five categories:

    5 different = all different
    4 different = one pair
    3 different = two pairs, or three of a kind
    2 different = full house, or four of a kind
    1 different = five of a kind

This breakdown is easier to determine systematically, and the test is nearly as good.
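The simplified poker test, which only counts the distinct values in each group of five, can be sketched as follows. The function name poker_categories is hypothetical and not the interface of poker_test; the sketch only produces the category counts, to which a χ²-test would then be applied with the expected probabilities from [16].

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// For each group of five successive integers, count the distinct values.
// counts[r] is the number of quintuples with exactly r distinct values
// (r = 1..5); counts[0] is unused. An incomplete trailing group is ignored.
std::array<std::size_t, 6> poker_categories(const std::vector<int>& x) {
    std::array<std::size_t, 6> counts{};  // zero-initialized
    for (std::size_t i = 0; i + 5 <= x.size(); i += 5) {
        // A set keeps each value once, so its size is the number of
        // distinct values in the quintuple.
        const std::set<int> distinct(x.data() + i, x.data() + i + 5);
        assert(distinct.size() >= 1 && distinct.size() <= 5);
        ++counts[distinct.size()];
    }
    return counts;
}
```

Using a std::set keeps the sketch short; for a production test a small fixed-size comparison would avoid the allocations.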

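In the same spirit, the counting procedure of the gap test (section 5.3) can be sketched as follows. The names count_gaps and gap_probabilities are hypothetical and do not reflect the interface of gap_test.h. As described for this suite, counting stops once n gaps have been found; in this sketch it also stops when the input is exhausted, and a value inside the interval that directly follows another one counts as a gap of length 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill COUNT[0..t] with the number of gaps of each length; gaps longer
// than t are accumulated in COUNT[t]. Stops after n gaps.
std::vector<std::size_t> count_gaps(const std::vector<double>& x,
                                    double alpha, double beta,
                                    std::size_t t, std::size_t n) {
    assert(0.0 <= alpha && alpha < beta && beta <= 1.0);
    std::vector<std::size_t> count(t + 1, 0);
    std::size_t gaps = 0;
    std::size_t length = 0;  // values seen outside [alpha, beta) so far
    for (std::size_t i = 0; i < x.size() && gaps < n; ++i) {
        if (x[i] >= alpha && x[i] < beta) {
            ++count[length < t ? length : t];  // a gap of this length ends
            ++gaps;
            length = 0;
        } else {
            ++length;
        }
    }
    return count;
}

// Expected probabilities p_l = p (1 - p)^l for l < t and p_t = (1 - p)^t.
std::vector<double> gap_probabilities(double p, std::size_t t) {
    assert(p > 0.0 && p < 1.0);
    std::vector<double> prob(t + 1);
    double q = 1.0;  // holds (1 - p)^l
    for (std::size_t l = 0; l < t; ++l) {
        prob[l] = p * q;
        q *= (1.0 - p);
    }
    prob[t] = q;
    return prob;
}
```

The χ²-test then compares count[l] with the number of counted gaps multiplied by prob[l].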