A Random Number Generator Test Suite for the C++ Standard


Institute for Theoretical Physics, ETH Zürich
Winter

Diploma Thesis
A Random Number Generator Test Suite for the C++ Standard

Mario Rütti (maruetti@comp-phys.org)
March 10, 2004

Supervisor: Prof. M. Troyer (troyer@phys.ethz.ch)

I am grateful to my diploma professor, Prof. Matthias Troyer, for giving me the opportunity to write this instructive and inspiring diploma thesis, to say nothing of the time he spent helping me to resolve my (and my computer's) problems and his effort to find new and unconventional solutions. My special thanks also go to my office co-worker Manuel Gil for the motivating and amusing discussions about our work and for his pleasant companionship. I am grateful to Frank Moser, who acted as editor and assisted me in correcting and polishing my English sentences. I want to apologize to Ariana for the lackluster evenings with a friend lost in thought. Thank you for your support and understanding during this time. Finally, I am grateful to my parents for the tremendous support they gave me during my years of study, which enabled me to achieve my goals.

To my parents Urs and Heidi


Abstract

The heart of every Monte Carlo simulation is a source of high-quality random numbers, and the generator has to be picked carefully. Since the Ferrenberg affair it has been known to a broad community that statistical tests alone do not suffice to determine the quality of a generator; application-based tests are needed as well. With the inclusion of an extensible random number library and the definition of a generic interface in the revised C++ standard, it will be important to have access to an extensive C++ random number test suite. Most currently available test suites are limited to a subset of tests, are written in Fortran or C, and cannot easily be used with the C++ random number generator library.

In this paper we present a generic random number test suite written in C++. The framework is based on the Boost reference implementation of the forthcoming C++ standard random number generator library. The Boost implementation so far contains most modern random number generators. By employing generic programming techniques, the test suite is flexible, easily extensible and can be used with any random number generator library, including those written in C and Fortran. Test results are produced in an XML format, which through the use of XSLT transformations allows extraction of summaries or detailed reports, and conversion to HTML, PDF, PostScript or any other format. At this time, the test suite contains a wide range of different tests, including the standard tests described by Knuth, Vattulainen's physical tests, parts of Marsaglia's Diehard test suite, and a number of newer tests.


Contents

1. Introduction
2. What are random numbers?
    Types of random numbers
3. Analyzing Statistics
    χ² test (Chi-square test)
    Kolmogorov-Smirnov test (KS test)
    Gaussian Test
4. Using the Random Number Generator Test Suite
    How to run a test
    The rng_test_suite environment
        Template Parameter
        Confidence Level
        Seeds
        Random Number Generators
        Tests
        Running...
    Testing Parallel Random Number Generators
    Iterating a test
    Count failings
    Bit Tests
    Bit extract test
    The XML output
5. Tests for Studying Random Data
    Equidistribution test
    Run test
        Runs up and down
        Runs above and below mean
        Length of runs
    Gap test
    Poker test
    Coupon-collectors test
    Permutation test
    Maximum of t test
    Birthday Spacings test
    Collision test (Hash test)
    Serial correlation
    Serial test
    Blocking test
    Repeating Time Test
    gcd test (greatest common divisor)
    Gorilla test
    Ising-model test
    Random-walk test
    n-block test
    Random Walker on a line (S_n test)
    2D Intersection test
    2D Height Correlation test
    Sum of independent distributions test
    Fourier transform test
    Universal statistical test
    The Diehard Test Suite
        Birthday Spacings test
        The overlapping 5-permutation test
        Ranks of binary matrices
        The bitstream test
        The OPSO, OQSO and DNA tests
        The count-the-1's test
        The parking lot test
        The overlapping sums test
        Squeeze test
        The Minimum Distance test
        Random Sphere test
        The runs test
        Craps test
6. Extending the Random Number Generator Test Suite
    How to implement a test
    Implementing a χ², Kolmogorov-Smirnov or a Gaussian test
        χ² test
        Kolmogorov-Smirnov test
        Gaussian test
    The multiple_test wrapper
    Useful sequence diagrams
    Demands on Random Number Generators
    Foreign Random Number Generators
    The XML Schema
A. Collection of Test Parameters
B. Examples
C. Compiling the Test Suite

1. Introduction

How random is random?

In this diploma thesis a generic random number test suite (RNGTS) is developed. The test suite framework is written in C++ with attention to the modern generic programming paradigm. It is based on the Boost reference implementation of the forthcoming C++ standard random number generator library. The aim of RNGTS is to assist in finding a suitable random number generator for a specific purpose and in deciding between good and bad random number generators. Through a generic interface, RNGTS makes a variety of different tests available and provides the possibility of extending the suite with user-defined tests. The test results are produced in XML format, which allows the transformation into summaries or detailed reports through the use of XSLT style sheets. The main purpose is to support the user in his decision about a random number generator, and in the question of how random the numbers produced by the random number generators are.

In the second part of this paper there is a short discussion of the different types of random numbers and their applications. Then, in the third part, the statistical methods involved and their programming interfaces are presented. The fourth part covers the handling of RNGTS. This part is a must for the user who wants to perform any tests. It also describes the core of the whole test suite. The fifth part presents the most popular random number generator tests, their parameters and programming interfaces. These tests are collected from different sources and authors. The sixth part is for advanced users who want to extend RNGTS with new tests or other extensions. Finally, the appendix contains a collection of lists with test parameters and other useful material.

Download

The RNGTS framework is located on the web server and may be downloaded there. There are also some installation hints, some examples and the full documentation with additional interface descriptions and the XSL schema.

2. What are random numbers?

Random numbers are characterised by the fact that their value cannot be predicted. In other words, if one constructs a sequence of random numbers, the probability distribution of the following random numbers has to be completely independent of all the other generated numbers. A more sophisticated mathematical definition and discussion can be found in [6].

2.1. Types of random numbers

There are three types of random numbers: quasi-, pseudo- and true random numbers. These different types of random numbers have different applications. (It is a philosophical question what we can call random or not, but here we use the following descriptions; it is simpler.)

True Random Number

The most often used example of truly random numbers is the decay of a radioactive material. If a Geiger counter is put in front of such a radioactive source, the intervals between the decay events are truly random. True random numbers are gained from physical processes like radioactive decay or rolling a die. But rolling a die is difficult: perhaps someone could control the die well enough to determine the outcome.

Pseudo Random Number

These numbers are generated by a computer, that is to say by an algorithm, and are therefore not truly random. Every new number is generated from the previous ones by an algorithm. This means that the new value is fully determined by the previous ones. But, depending on the algorithm, they often have properties that make them very suitable for simulations.

Quasi Random Number

A good description, quoted from [25], Chapter 7.7:

"Sequences of n-tuples that fill n-space more uniformly than uncorrelated random points are called quasi-random sequences. That term is somewhat of a misnomer, since there is nothing random about quasi-random sequences: They are cleverly crafted to be, in fact, sub-random. The sample points in a quasi-random sequence are, in a precise sense, maximally avoiding each other."

Quasi random numbers are not designed to appear random, but rather to be uniformly distributed. One aim of such numbers is to reduce and control errors in Monte Carlo simulations.

A picture is always a good way to illustrate the difference between these two types. In figures 2.1¹ and 2.2² we have plots with different numbers of pseudo- and quasi-random numbers. This is a good demonstration of the structure of quasi-random numbers, but it is also possible to see that quasi-random numbers fill the whole plane evenly, while pseudo-random numbers may build clusters and holes.

If we talk about random numbers in the following parts, we mean pseudo random numbers.

Figure 2.1.: Pseudo Random Numbers (panels with different numbers of points)

Figure 2.2.: Quasi Random Numbers (panels with different numbers of points)

¹ This plot was generated with the Matlab 6 rand generator, a combination of a lagged Fibonacci generator with a cache of 32 floating point numbers and a shift register random integer generator.
² This plot was generated with the sobol.m routine for Matlab from ~burkardt/m_src/sobol/sobol.html. This web site also includes a variety of references for Sobol sequences and some implementations in different programming languages.

3. Analyzing Statistics

In this section we describe the χ² test and the Kolmogorov-Smirnov test. Both are designed to check whether the measured distribution is similar to the expected distribution, so we can compare different distributions. Later on we describe the Gaussian test, which is based on the Gaussian normal distribution. A detailed description of the outlined C++ classes can be found in the section about implementing additional tests.

3.1. χ² test (Chi-square test¹)

The χ² test is perhaps the best-known statistical test. It is based on a comparison between the empirical distribution function and the theoretically expected distribution. The empirical distribution is based on the results of the random process.

The $n$ measured random values must be divided into $k$ classes $I_1, I_2, \dots, I_k$. The classes contain $n_1, n_2, \dots, n_k$ values, with $n_1 + n_2 + \dots + n_k = n$. For each class, the expected number of values $n p_i$ must be calculated from the expected distribution function, for a given probability $p_i$ of class $I_i$. Considering the squares of the differences between the measured values and the expected values gives the χ² value

\[ \chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i} = \frac{1}{n} \sum_{i=1}^{k} \frac{n_i^2}{p_i} - n \tag{3.1} \]

With $k$ classes, there are $\nu = k - 1$ degrees of freedom in the χ² distribution. By looking up χ² and $\nu$ in χ² distribution tables, which can be found in [16], [3], the probability of being above or below the given χ² can be found. Calculating the probability of a χ² value is not such an easy task, but there is an algorithm published by Hill and Pike which can be used; see [11], [12], [14].

Example: Throwing a die

After throwing a die 120 times we get the following results.

[Table: value, # observed]

There is no reason to change the $k = 6$ natural classes $I_1, I_2, \dots, I_6$. The number of values is $n = 120$. For a true die we expect a probability of $p_i = \frac{1}{6}$ for each die number. The expected number of values per class is $n p_i = 20$. The χ² value is calculated by the following sum.

\[ \chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i} \]

Here we have $k = 6$ classes. This means that the number of degrees of freedom is $\nu = 5$. Looking up the χ² value in a table, it lies between the 50% and 75% points. This means that a larger χ² value occurs between 25% and 50% of the time. The randomness observed in this experiment is satisfactory in this test.

Available code

To handle the χ² statistics there is the chisquare_test class, which provides different methods used for the calculation. Some important methods are listed in the declaration below. The class is defined in the chisquare_test.h file.

    class chisquare_test {
    public:
        // set up the statistics before a run
        void prepare_statistics(std::size_t count_size, uint64_t runs,
                                std::size_t degoffreedom = 0);

        // calculate the chi-square value from a range of counts
        template<class ForwardIterator>
        void calculate_chisquare_value(ForwardIterator first, ForwardIterator last,
                                       std::size_t degoffreedom);
        template<class ForwardIterator>
        void calculate_chisquare_value(ForwardIterator first, ForwardIterator last);

        // set a chi-square value that was calculated elsewhere
        void set_chisquare_value(double chisquarevalue, std::size_t degoffreedom);

        chisqr_stat_type get_chisquare_value();
        double get_chisquare_prob();
    };

In the same file there is also a free function to calculate the χ² value without using the class.

    template<class ForwardIterator, class UnaryFunction>
    double calc_chisquare_value(ForwardIterator first, ForwardIterator last,
                                UnaryFunction probability, std::size_t degoffreedom)

To calculate the probability from a χ² value, there is a function in the file chisqr_prob.h that manages this task.

    double chi_probability(double chisqr, int dof)

¹ Sometimes the χ² test stands for the Equidistribution test.
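The sum in equation (3.1) can be sketched in a few lines of standard C++. This is only an illustration and not the chisquare_test class of the test suite: the names chi_square and die_example are hypothetical, and the six counts are made-up values for 120 throws, not the data of the die example above.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Chi-square statistic for observed class counts against given class
// probabilities: sum over (n_i - n*p_i)^2 / (n*p_i), as in equation (3.1).
template <std::size_t K>
double chi_square(const std::array<unsigned, K>& observed,
                  const std::array<double, K>& prob) {
    unsigned n = 0;
    for (unsigned c : observed) n += c;            // total number of values
    double chi2 = 0.0;
    for (std::size_t i = 0; i < K; ++i) {
        const double expected = n * prob[i];       // n * p_i
        const double diff = observed[i] - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}

// Hypothetical counts for 120 throws of a die (NOT the thesis data).
double die_example() {
    const std::array<unsigned, 6> counts = {18, 22, 21, 17, 23, 19};
    const double p = 1.0 / 6.0;                    // fair die
    const std::array<double, 6> prob = {p, p, p, p, p, p};
    return chi_square(counts, prob);
}
```

With these assumed counts the squared deviations from 20 add up to 28, so the statistic is 28/20 = 1.4 with 5 degrees of freedom. Turning this value into a probability still requires a table or a routine such as chi_probability.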

3.2. Kolmogorov-Smirnov test (KS test)

As we have seen, the χ² test can be applied when observations fall into a finite number of categories. But normally one will consider random quantities which may assume an infinite number of values. In this test, the empirical distribution function $F_n(x)$ of the random number generator is compared to the expected distribution function $F(x)$. In [16], Knuth defined these functions as follows:

\[ F(x) = \text{probability that } (X \le x) \]
\[ F_n(x) = \frac{\text{number of } X_1, X_2, \dots, X_n \text{ which are } \le x}{n} \]

The $n$ measured random values must be sorted in ascending order, $X_1 \le X_2 \le \dots \le X_n$. To make the test, we form the following statistics:

\[ K_n^+ = \sqrt{n} \max_{x} \bigl(F_n(x) - F(x)\bigr) = \sqrt{n} \max_{1 \le i \le n} \left(\frac{i}{n} - F(X_i)\right) \]
\[ K_n^- = \sqrt{n} \max_{x} \bigl(F(x) - F_n(x)\bigr) = \sqrt{n} \max_{1 \le i \le n} \left(F(X_i) - \frac{i-1}{n}\right) \]

As in the χ² test, we may now look up the values $K_n^+$, $K_n^-$ in a table [16] to determine whether they are significantly high or low. Another way is to calculate the probabilities by the algorithm given in [1] and in chapter 3.3.1, C. History, bibliography, and theory of [16]. In [16] there is also a formula to calculate the probability exactly:

\[ \mathrm{prob}\left(K_n^+ \le \frac{t}{\sqrt{n}}\right) = \frac{t}{n^n} \sum_{0 \le k \le t} \binom{n}{k} (k - t)^k (t + n - k)^{n-k-1} \tag{3.2} \]

Example: 10 random numbers

We got 10 numbers from a random number generator. These are

{0.809, 0.465, 0.151, 0.628, 0.318, 0.824, 0.394, 0.968, 0.179, 0.458}

First we sort the random numbers $X_i$ in ascending order. Then we calculate the quantities $K_i^+$ and $K_i^-$ and find the maximum of these quantities.

[Table: $i$, $X_i$, $K_i^+$, $K_i^-$]

With these values we calculate $K_{10}^+$ and $K_{10}^-$ as follows:

\[ K_{10}^+ = \sqrt{n} \max_{1 \le i \le n} K_i^+ \qquad K_{10}^- = \sqrt{n} \max_{1 \le i \le n} K_i^- \]

If we look up these values in an appropriate table for $n = 10$, we find that the chance of getting a $K_{10}^+$ or $K_{10}^-$ greater than the observed value lies between 50% and 75%.

Available code

To calculate the Kolmogorov-Smirnov statistics there is a class which supports the required routines. The definition of this class, called ks_test, is found in ks_test.h. Some important methods are listed below.

    class ks_test {
    public:
        // set up the statistics before a run
        void prepare_statistics(uint64_t runs);

        // calculate the KS values from a range of measured values
        template<class ForwardIterator>
        void calculate_ks_value(ForwardIterator first, ForwardIterator last);
        template<class ForwardIterator, class UnaryFunction>
        void ks_value(ForwardIterator first, ForwardIterator last,
                      UnaryFunction integratedprobdistr);

        ks_stat_type get_ks_value();
        ks_prob_type get_ks_prob();
    };

There is also a free function to calculate the KS values.

    template<class ForwardIterator, class UnaryFunction>
    std::pair<double, double> calc_ks_value(ForwardIterator first, ForwardIterator last,
                                            UnaryFunction integratedprobdistr)

To calculate the probability for a KS value, the following function is defined in the file ks_prob.h.

    boost::tuple<double, double> ks_probability(int n, std::pair<double, double> kspair)
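The two Kolmogorov-Smirnov statistics of the ten-number example can be checked with a short sketch. It assumes that the numbers are tested against the uniform distribution on [0, 1), so that F(x) = x; the text does not state this explicitly. The function names ks_uniform and ks_example are hypothetical and are not part of ks_test.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Returns (K+, K-) for a sample tested against F(x) = x (assumed uniform case).
std::pair<double, double> ks_uniform(std::vector<double> x) {
    std::sort(x.begin(), x.end());                   // X_1 <= X_2 <= ... <= X_n
    const double n = static_cast<double>(x.size());
    double dplus = 0.0;
    double dminus = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double F = x[i];                       // F(X_i) = X_i
        dplus = std::max(dplus, (i + 1) / n - F);    // i/n - F(X_i), i counted from 1
        dminus = std::max(dminus, F - i / n);        // F(X_i) - (i-1)/n
    }
    return std::make_pair(std::sqrt(n) * dplus, std::sqrt(n) * dminus);
}

// The ten numbers from the example in section 3.2.
std::pair<double, double> ks_example() {
    const double data[] = {0.809, 0.465, 0.151, 0.628, 0.318,
                           0.824, 0.394, 0.968, 0.179, 0.458};
    return ks_uniform(std::vector<double>(data, data + 10));
}
```

Under the uniform assumption the largest positive deviation is 0.135 (at the sixth sorted value) and the largest negative one is 0.151 (at the first), which gives approximately 0.427 and 0.478 after multiplying by the square root of 10.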

Figure 3.1.: Gaussian distribution

Figure 3.2.: Percentage function (percent as a function of the deviation factor)

3.3. Gaussian Test

The Gaussian test is a little different from the χ² or the Kolmogorov-Smirnov test. In those two tests the expected distribution function is compared with the measured distribution function, and some indicators are calculated from the difference. In the Gaussian test a physical view is used. When a measurement is made, it is known that, even if the best tools are used, the result depends on a number of irregular and uncontrolled parameters. These measurement errors are random and a combination of different single errors. The central limit theorem states that the measured value behaves like a normally distributed random variable (this is valid in the normal case). The normalized density function is written as

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} \tag{3.3} \]

where $\mu$ is the mean or expected value and $\sigma$ the standard deviation. To classify measured values one can compare the deviation from the expected value with the standard deviation. It can be calculated that 68.3% of all measured values are expected in the interval $[\mu - \sigma, \mu + \sigma]$. If the interval is expanded to $[\mu - 3\sigma, \mu + 3\sigma]$, we expect 99.7% of all measured values in this range.

Based on this theory it is possible to give a probability for a measured value. The assumption is that the expected value and the deviation are known. The deviation factor $x$, the deviation from the expected value in units of $\sigma$, is converted into a percentage with the following formula:

\[ \mathrm{perc}(x) = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right) \tag{3.4} \]

where erf denotes the error function $\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt$. In this formula we define the mean value (expected value) as 50%. If the deviation is positive, a percentage value bigger than 50% results; if the deviation is negative, a percentage value smaller than 50% results. The function is shown in figure 3.2.

Example: Ising model test statistic

We run the Ising model test described later on and check the result. From the simulation we get a specific energy, which is compared with the expected value. Dividing the difference by the calculated standard deviation gives the deviation from the mean in units of $\sigma$. This result can be converted into a percentage, which states what fraction of the measured values will be smaller than this value.

Available code

To support the calculation of the Gaussian statistics there is a class called gaussian_test in the file gaussian_test.h. The declarations of the most important methods are listed below.

    class gaussian_test {
    public:
        // store the standard deviation, the measured value and the expected value
        void prepare_statistics(double deviation, double stat_value, double mean);
        void calc_gaussian_value();
        double get_gaussian_prob();
    };
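The conversion from a deviation factor to a percentage can be sketched with std::erf from the standard library. The sketch assumes that the percentage function of section 3.3 is the cumulative normal distribution, which is consistent with the statement that the mean corresponds to 50%. The names perc and deviation_factor are illustrative and are not taken from gaussian_test.h.

```cpp
#include <cassert>
#include <cmath>

// Deviation of a measured value from the expected value in units of sigma.
double deviation_factor(double measured, double mean, double sigma) {
    return (measured - mean) / sigma;
}

// Assumed percentage function: fraction of a normal distribution that lies
// below mean + x * sigma, i.e. (1 + erf(x / sqrt(2))) / 2.
double perc(double x) {
    return 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
}
```

As a consistency check, perc(1) - perc(-1) is about 0.683 and perc(3) - perc(-3) is about 0.997, which matches the 68.3% and 99.7% intervals quoted for one and three standard deviations.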

4. Using the Random Number Generator Test Suite

This section describes how to use the Random Number Generator Test Suite (RNGTS) with the available wrappers and helpers. The aim was to supply a simple but sufficiently powerful interface for building a flexible system that tests different types of random number generators with different tests, and also to allow various kinds of result representation through a universal XML output format.

4.1. How to run a test

Testing a random number generator is simple. The only requirement is that the generator fulfils the Boost pseudo-random number engine requirements. These can be found in http: // written by Jens Maurer. The listing below shows an exemplary test program.

    // include Boost's random number generators
    #include <boost/random.hpp>
    // standard headers for file and console output
    #include <fstream>
    #include <iostream>
    // definition to show progress during the test
    #define PRINT_STATUS
    // include the test suite environment
    #include "rng_test_suite.h"
    // include the headers of all used tests
    #include "poker_test.h"
    #include "ising_model_test.h"

    int main() {
        // import random number generators from Boost
        using boost::lagged_fibonacci44497;
        using boost::mt19937;

        // create a test suite using uint32_t seeds
        rng_test_suite<> testsuite;

        // add desired confidence levels
        testsuite.add_confidence_level(0.05);
        testsuite.add_confidence_level(0.95);
        testsuite.add_confidence_level(0.1);
        testsuite.add_confidence_level(0.9);

        // add desired seeds
        testsuite.add_seed( );
        testsuite.add_seed(236598);
        testsuite.add_seed(1237);

        // register the random number generators to test
        testsuite.register_rng<lagged_fibonacci44497>("lagged Fibonacci 44497");
        testsuite.register_rng<mt19937>("mt19937, Mersenne Twister", 10000);

        // create the test objects
        poker_test pokertest(100000, 5);
        ising_model_test isingtest( , 16);

        // register the tests
        testsuite.register_test<ising_model_test>(isingtest);
        testsuite.register_test<poker_test>(pokertest);

        // run tests...
        // specify destination for writing the XML output, here a file
        std::ofstream file_out("test_output.xml");

        // run all tests and catch possible exceptions
        try {
            // true: catch possible logic_error exceptions thrown by a test
            testsuite.run_test(file_out, true);
        } catch (std::exception& e) {
            std::cout << "exception occurred, program terminated : " << e.what();
        }
        file_out.close();
        return 0;
    }

4.2. The rng_test_suite environment

In the following sections we describe the use and the possibilities of the rng_test_suite environment.

Template Parameter

The template parameter used to construct the class specifies the type of the seed values. As noted in the wg21 proposal for Boost random number generators, the seed values have to be of an unsigned integral type. The default is uint32_t.

    template <typename seedtype=uint32_t> class rng_test_suite {... }

Confidence Level

To add a confidence level that specifies the limits of the statistical calculations, the following method has to be used. The confidence level has to lie between 0 and 1. If no level is added, the test suite uses 5% and 95% by default.

    void add_confidence_level(double cl);
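How a pair of confidence levels can be turned into a verdict is sketched below. This is an assumption about the intended use, not the implementation of rng_test_suite: the enum verdict and the function classify are hypothetical names, and the defaults 0.05 and 0.95 are simply the standard limits mentioned above.

```cpp
#include <cassert>

// Hypothetical verdict for a single probability value.
enum class verdict { too_low, passed, too_high };

// Assumed rule: a probability p from a statistical test is suspicious if it
// falls below the lower limit or above the upper limit.
verdict classify(double p, double lower = 0.05, double upper = 0.95) {
    if (p < lower) return verdict::too_low;
    if (p > upper) return verdict::too_high;
    return verdict::passed;
}
```

For example, classify(0.5) yields verdict::passed under this rule, while classify(0.97, 0.1, 0.9) yields verdict::too_high. Adding a second, stricter pair of limits simply applies the same rule again with other bounds.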

Seeds

Adding seeds is not such an easy task, because the pseudo-random number engine requirements only specify iterator-based seeding and nothing else. But most generators also support a seed(seedtype) method, so it is possible to add multiple seeds to use with the generators. If a generator does not support the seed(seedtype) method, the test suite uses a pseudo-DES algorithm (see [25], sec. 7.5) to create a set of numbers and feeds these numbers into the generator with the mandatory iterator-based seed method.

    void add_seed(uint32_t seed)

The other way to seed the generator is to fill its buffer with values. To do this there is the seed(iterator, iterator) method. This method must be supported by all generators from Boost. The user has to check himself whether there are enough values between the two iterators to fill the buffer. If there are insufficient values, an exception might be thrown.

    template <typename seediter>
    void add_seed_iterators(const seediter begin, const seediter end)

    <seediter>   type of iterator
    begin        iterator to the begin of the buffer with seeds
    end          iterator to the end of the buffer with seeds

Starting the test

Before a random number generator is seeded, it is reset to its initial state, that is, to the state it was in when it was added to the test suite. This guarantees repeatability for different seedings. If no seeds are added, the tests run with the initial state of the generator. If a generator has to be tested in a special state, e.g. with a specially seeded buffer, there is the method register_seeded_rng to handle this case.

Random Number Generators

To register the random number generators which have to be tested, the test suite provides the following two methods. (The requirements of a random number generator are described in section 6.4.)

    template <class T>
    void register_rng(std::string rng_name, uint64_t warmup = 0)

    <class T>    type of the random number generator to test
    rng_name     name of the generator, should be unique
    warmup       number of random numbers to produce with the generator before starting the test

This method takes the type of the random number generator as a template parameter. The concrete generator object is created inside the test suite with the default constructor. The seed calls are done in this initial state.

    template <class T>
    void register_seeded_rng(T mrng, std::string rng_name, std::string description,
                             uint64_t warmup = 0)

    <class T>    type of the random number generator to test
    mrng         object of the random number generator of type T
    rng_name     name of the generator, should be unique
    description  a description of the seed state
    warmup       number of random numbers to produce with the generator before starting the test

This method takes an object of a random number generator as a parameter, so it is possible to use pre-seeded generators. For most generators, all further operations are done on this state of the generator. This is valid if and only if the generator class does not have external links, e.g. function pointers. If a foreign random number generator (see 6.5) is used, the generator will not be seeded before the test; it remains in its previous state.

Tests

Adding a test is really simple: just create an object of the test class and add it to the test suite. This is done with the following method.

    template <class T>
    void register_test(T test)

    <class T>    type of the test
    test         test to add to the collection of tests to perform

It is important to note that the test must be in a ready-to-run state when it is added to the test suite, because the test suite calls only the run method and nothing before it.

Running...

Once all desired generators, seeds and tests are added to the test suite, the tests can be run by calling the run_test method. One has to specify where to write the XML output. Writing to the terminal is as simple as using a file as the target. The second argument specifies whether logic errors thrown by a test are caught or not. As an example, an exception may be thrown if one tries to make a binary rank test for matrices bigger than the number of bits of the random number generator. If the exception is not caught, the test suite stops and does not finish the other tests. If the exception is caught, the test is omitted and the test suite continues its work.

    void run_test(std::ostream& out, bool catch_logic_errors = true)

    out                  ostream to write the XML output to
    catch_logic_errors   specifies if logic_errors thrown by tests are caught or not

The call to run_test should be placed in a try block, because there are several sources which may throw exceptions. The seeds are tested in the following order:

    1. user-seeded generators
    2. generators seeded with seed(s)
    3. generators seeded with seed(it, it)

4.3. Testing Parallel Random Number Generators

To test a parallel application that uses different random number generators in different threads, there is a class called parallel_rng_imitator (from parallel_rng_imitator.h) which simulates such an application. The class contains a collection of definable generators and calls them one after another. This generator fulfils the Boost specification and can be used in the normal way. There are some preconditions to keep in mind when using such a random number generator.

- All random number generators used in this parallel random number generator must have the same result_type. Unfortunately the boost::uniform_01 type does not support a default constructor, so it is not possible to map the result type to another type. To do this, a converter which fulfils the specified interface for generators has to be written.
- All random number generators must have the same maximum and minimum value.
- Pre-seeded random number generators should be favoured because they give better control over the seeding of the particular generators.
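The idea of calling several generators one after another can be sketched with the engines of the standard <random> header. This is not the parallel_rng_imitator class: round_robin_rng and make_two_twisters are hypothetical names, and the sketch is restricted to engines of a single type, so the requirements on result_type, min() and max() are fulfilled automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical wrapper that draws from several engines of one type in turn.
template <class Engine>
class round_robin_rng {
public:
    typedef typename Engine::result_type result_type;

    explicit round_robin_rng(std::vector<Engine> engines)
        : engines_(std::move(engines)), next_(0) {
        assert(!engines_.empty());                 // at least one engine is needed
    }

    static constexpr result_type min() { return Engine::min(); }
    static constexpr result_type max() { return Engine::max(); }

    // Return a number from the current engine, then advance to the next one.
    result_type operator()() {
        const result_type r = engines_[next_]();
        next_ = (next_ + 1) % engines_.size();
        return r;
    }

private:
    std::vector<Engine> engines_;
    std::size_t next_;
};

// Two pre-seeded Mersenne Twisters standing in for two threads.
round_robin_rng<std::mt19937> make_two_twisters() {
    std::vector<std::mt19937> engines;
    engines.push_back(std::mt19937(1u));
    engines.push_back(std::mt19937(2u));
    return round_robin_rng<std::mt19937>(engines);
}
```

The output of such a wrapper is the interleaved stream of its engines, which is what a test of correlations between "parallel" streams needs to see. Combining engines of different types, as the thesis class does with a boost::tuple, requires compile-time and run-time checks that this sketch avoids.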
// include Boosts RNGs #include <boost/random.hpp> // include parallel generator #include "parallel_rng_imitator.h" // import RNGs from Boost using boost::minstd_rand0; using boost::lagged_fibonacci19937; using boost::lagged_fibonacci23209; using boost::lagged_fibonacci44497; using boost::mt19937; using boost::ecuyer1988; // make a RNG from two different Lagged Fibonacci RNGs parallel_rng_imitator< boost::tuple< lagged_fibonacci23209, lagged_fibonacci44497> > parallelrng; // does not compile, because the enlisted generators do not // have same result_type parallel_rng_imitator< boost::tuple< minstd_rand0, // result_type int32_t lagged_fibonacci23209, // result_type double mt19937> // result_type uint32_t > parallelrng_error_compile; // does compile, but throws an exception because the RNGs // does not have same min() or max() value parallel_rng_imitator< boost::tuple< ecuyer1988, // max =

23 4.4. Iterating a test minstd_rand // max = > parallelrng_error_runtime; 4.4. Iterating a test The idiom says that Once doesn t count. So, we have to repeat a test multiple times and make a statistic over all results. (Probably we also like to repeat this repetition... ) This class iterates a given test n times and calculates a Kolmogorov-Smirnov statistic over all results. This is only possible if the test to iterate is derived from the chisquare_test, ks_test or gaussian_test base class. The iteration of a χ 2 or a gaussian test give a normal K-S statistic. But if we have to do this for a K-S test itself, we get four values, K and K for the original K and the same for the original K. The iterate_test fulfils the test interface and acts like a normal test. template< class Test > iterate_test(test test, std::size_t iterations) <class Test> type of the test test test to iterate iterations number of times to iterate the test 4.5. Count failings Another way to decide about success or failure is to count the failings of each test and compare with a maximal number of failures. This class iterates a given test n times and count the number of failings. If the test fails more than the faillimit allows, then it will fail, else the test is passed. (Mathematically, failings faillimit) This is only possible if the test to iterate is derived from the chisquare_test, ks_test or gaussian_test base class. The iteration of a χ 2 or a gaussian test gives one value for failings, the K-S test variation results in two values, one for K and one for K. The count_fails_test fulfils the test interface and acts like a normal test. template< class Test > count_fails_test(test test, std::size_t iterations, std::size_t faillimit) <class Test> type of the test test test to iterate iterations number of times to iterate the test faillimit Limit deciding between failure or success 4.6. 
4.6. Bit Tests

Some tests, such as the count-the-1's test from the Diehard test suite, examine overlapping ranges of bits. Several new numbers are built from each random number by masking its bit representation with a mask that is shifted from the least significant bit to the most significant bit. An example of splitting a number into overlapping sub-numbers is given in figure 4.1.

    original  = 180
    Bits 3..0 = 4
    Bits 4..1 = 10
    Bits 5..2 = 13
    Bits 6..3 = 6
    Bits 7..4 = 11

Figure 4.1.: Bit Test, Example of Bit Concatenation

The wrapper class is called rng_bit_test and is located in the header file of the same name. It can only be used if the test is derived from one of the given base classes (chisquare_test, ks_test or gaussian_test). The interface is

    template<class TEST, int no_bits>
    rng_bit_test(TEST test)

    <class TEST>     type of the test
    <int no_bits>    number of bits for each random number
    test             test to use for the bit test

Example

Suppose we want to know whether sequences of 10 bits each are uniformly distributed in a χ² sense. We have to create a test object, pass it to the wrapper and register the test.

    chisqr_uniformity_test chi_uni_test(200000, 10);
    rng_bit_test<chisqr_uniformity_test, 10> bit_chi_uni_test(chi_uni_test);
    rngtest.register_test<rng_bit_test<chisqr_uniformity_test, 10> >(bit_chi_uni_test);

4.7. Bit extract test

Another way to test a generator is to extract only a specific range of bits from each generated random number and interpret these bits as a new number. In figure 4.2, bits 2 to 5 are used to make a new number. Alternatively, we take one specific bit from each of several consecutive random numbers and interpret these bits as a new number. In figure 4.3 this is done with bit 5: bit five of six consecutive random numbers is used to build each new random number. These tests are supported by two wrappers in rng_bit_extract.h.

    template<typename RNG, int start_bit, int no_bits>
    bit_extract(std::size_t b=10240)

    <typename RNG>     type of random number generator
    <int start_bit>    first bit of new random number
    <int no_bits>      number of bits of new random number
    b                  buffer size of random number generator

    template<typename RNG, int bit_no, int seqlength>
    bit_sequence(std::size_t b=10240)

    <typename RNG>     type of random number generator
    <int bit_no>       bit to use for random number
    <int seqlength>    number of bits for each random number
    b                  buffer size of random number generator

Figure 4.2.: Extracting subsequences as next random numbers

Figure 4.3.: Concatenating single bits to the next random number

4.8. The XML output

The result of every test is written to a stream. This stream is passed to the run_test(std::ostream, bool catch_logic_errors = true) method. The output may be written to the console via std::cout or, better for further processing, to a file. To write the output to a file, create an output file stream like this:

    #include <fstream>
    std::ofstream fileout("results.xml");

For a more detailed description of the XML schema see
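The bit manipulations behind the wrappers of sections 4.6 and 4.7 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names overlapping_windows, extract_bits and concat_single_bits are hypothetical and not part of the test suite, 32-bit unsigned values are assumed, and the way rng_bit_test, bit_extract and bit_sequence are actually implemented may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: a mask with the lowest `width` bits set.
std::uint32_t low_mask(unsigned width)
{
    return (width >= 32) ? 0xFFFFFFFFu : ((std::uint32_t{1} << width) - 1u);
}

// Overlapping windows (figure 4.1): slide a mask of `width` bits from the
// least significant bit towards the most significant bit of a value that
// is `total_bits` wide (total_bits <= 32 is assumed).
std::vector<std::uint32_t> overlapping_windows(std::uint32_t value,
                                               unsigned total_bits,
                                               unsigned width)
{
    std::vector<std::uint32_t> out;
    for (unsigned shift = 0; shift + width <= total_bits; ++shift)
        out.push_back((value >> shift) & low_mask(width));
    return out;
}

// Bit extraction (figure 4.2): interpret `no_bits` bits, starting at
// `start_bit` (< 32 is assumed), as a new number.
std::uint32_t extract_bits(std::uint32_t value, unsigned start_bit, unsigned no_bits)
{
    return (value >> start_bit) & low_mask(no_bits);
}

// Bit sequence (figure 4.3): concatenate bit `bit_no` of consecutive
// numbers; the first number contributes the most significant bit.
std::uint32_t concat_single_bits(const std::vector<std::uint32_t>& values, unsigned bit_no)
{
    std::uint32_t result = 0;
    for (std::uint32_t v : values)
        result = (result << 1) | ((v >> bit_no) & 1u);
    return result;
}
```

For the value 180 of figure 4.1, overlapping_windows(180, 8, 4) returns 4, 10, 13, 6 and 11. The bit order used in concat_single_bits (first number first) is an assumption.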

5. Tests for Studying Random Data

In this section we present different tests to study the behavior of random number generators. We distinguish two sorts of tests, statistical tests and physical tests¹. The only difference is the motivation for the test: in the first case we want to know the behavior of some statistical properties, in the second case we simulate a physical system. (Strictly speaking there are more kinds of tests, such as visual tests or theoretical tests, but we do not consider them because they cannot be automated.) Each of these tests checks a special property of the generated numbers against the theoretically expected behavior. These tests are not my invention; I only collected them and added usage examples. A reference to the source (not source code) is given with each test.

Table 5.1 lists many known random number generator tests and their occurrence in often cited test benches. It is impossible to list all tests, since there are infinitely many of them, so we mention the most popular ones. A more interesting table for testers is table 5.2. It shows all tests available² in the test suite and their class names. (The name of the header file is the concatenation of the class name and .h.)

¹ A nice description of physical tests is given in [29]: "Passing several tests does not prove the randomness of any sequence, however. This is due to the fact that proving randomness requires that the sequence fulfils an actual definition for randomness. An unfortunate fact is, however, that there is no unique definition for randomness. [...] Therefore, passing many tests is never a sufficient condition for the use of any pseudo random number generator in all applications. In other words, in addition to standard tests, efficient application specific tests of randomness are also needed. This need is emphasized by recent simulations, in which some physical models combined with special algorithms have been found which are very sensitive to the quality of random numbers."

² I hope that by the time this paper is published the list will already be updated with further implementations.

5.1. Equidistribution test

In this test we check whether the generated numbers are equally distributed. See [16].

- The $N$ measured random values in the interval $[\alpha;\beta]$ are divided into $k$ classes $I_1, I_2, \dots, I_k$. The classes contain $N_1, N_2, \dots, N_k$ values, with $\sum_i N_i = N$.
- For each class, the expected number of values, $N/k$, is calculated under the assumption that all values appear with the same probability density $p = 1/(\beta - \alpha)$.
- Check the class counts with the χ² test and use the KS test to check the whole data.

Example: Throwing a die

An example for the χ² part of this test is given in section 3.1, and an example for the KS test can be found in section 3.2.

Constructor in chisqr_uniformity_test.h

    chisqr_uniformity_test(uint64_t n, std::size_t classes)

    n          number of random numbers to count
    classes    number of classes into which the random numbers are sorted

Constructor in ks_uniformity_test.h

    ks_uniformity_test(std::size_t n)

    n          number of random numbers to count

5.2. Run test

In this test we look for monotone subsequences of the original sequence, which are called runs. There are three different sorts of tests: we can count runs up and runs down, runs above and below the mean, or the length of runs.

As an example of a run, consider the sequence of eleven numbers { }. To show the runs up we put a vertical line at the left and right and between $X_i$ and $X_{i+1}$ whenever $X_i > X_{i+1}$. Here we get

5.2.1. Runs up and down

- Split the sequence of $N$ random numbers into increasing and decreasing subsequences and count them. With $n_{inc}$ increasing and $n_{dec}$ decreasing subsequences, $a = n_{inc} + n_{dec}$ is the total number of runs.
- If $N$ has an adequate size, the mean and variance of $a$ are given by

  $$\mu_a = \frac{2N-1}{3}, \qquad \sigma_a^2 = \frac{16N-29}{90}$$

- For $N > 20$, the distribution of $a$ is reasonably approximated by a normal distribution $\mathcal{N}(\mu_a, \sigma_a^2)$. Convert to a standardized normal variable by

  $$Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{a - (2N-1)/3}{\sqrt{(16N-29)/90}}$$

- Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where $\alpha$ is the level of significance.
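The runs up and down statistic can be sketched as below. The function names are hypothetical and the code is not taken from runs_test.h; ties between neighbouring values are treated as a step down, which is an assumption of this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Count the total number of runs up and down, a = n_inc + n_dec,
// i.e. the number of maximal monotone stretches of the sequence.
std::size_t count_runs_up_down(const std::vector<double>& x)
{
    if (x.size() < 2) return 0;
    std::size_t runs = 1;
    bool up = x[1] > x[0];
    for (std::size_t i = 2; i < x.size(); ++i) {
        const bool now_up = x[i] > x[i - 1];
        if (now_up != up) {      // direction changes: a new run starts
            ++runs;
            up = now_up;
        }
    }
    return runs;
}

// Standardized statistic Z_0 = (a - mu_a) / sigma_a with
// mu_a = (2N - 1)/3 and sigma_a^2 = (16N - 29)/90.
double runs_up_down_z(std::size_t a, std::size_t n)
{
    const double N = static_cast<double>(n);
    const double mean = (2.0 * N - 1.0) / 3.0;
    const double variance = (16.0 * N - 29.0) / 90.0;
    return (static_cast<double>(a) - mean) / std::sqrt(variance);
}
```

A strictly alternating sequence of ten numbers has nine runs, which gives a Z_0 of roughly 2.21. With only ten numbers the normal approximation is crude, so such a value should be read as an illustration only.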

Test benches (columns "Available in Test-Bench"): Knuth [16], Helsinki [29], Diehard [18], [19], SPRNG [21]

Tests (rows):

    Equidistribution Test (Frequency Test)
    Gap Test
    Ising Model Test
    n-block test
    Serial Test
    Poker Test (Partition Test)
    Coupon collector's Test
    Permutation Test
    Run Test
    Maximum of t Test
    Collision Test (Hash Test)
    Serial correlation Test
    Birthday-Spacing's Test
    Overlapping Permutations Test
    Ranks of 31×31 and 32×32 matrices Test
    Ranks of 6×8 Matrices Test
    Monkey Tests on 20-bit Words
    Monkey Tests OPSO, OQSO, DNA
    Count the 1's in a Stream of Bytes
    Count the 1's in Specific Bytes
    Parking Lot Test
    Minimum Distance Test
    Random Spheres Test
    The Squeeze Test
    Overlapping Sums Test
    The Craps Test
    Sum of distributions (for parallel streams)
    FFT
    Blocking Test
    2-d Random Walk
    Random Walkers on a line (S_n Test)
    2D Intersection Test
    2D Height Correlation Test
    Repeating Time Test
    Gorilla Test
    gcd Test
    Maurer's Universal Test

Figure 5.1.: Compilation of known tests

    Test                                          Class Name                 Description
    --------------------------------------------  -------------------------  -----------
    Equidistribution Test (Frequency Test)        ks_uniformity_test         5.1
                                                  chisqr_uniformity_test     5.1
    Gap Test                                      gap_test                   5.3
    Ising Model Test                              ising_model_test           5.16
    n-block test                                  n_block_test               5.18
    Serial Test                                   serial_test                5.11
    Poker Test (Partition Test)                   poker_test                 5.4
    Coupon collector's Test                       coupon_collector_test      5.5
    Permutation Test                              permutation_test           5.6
    Run Test                                      runs_test                  5.2
    Maximum of t Test                             max_of_t_test              5.7
    Collision Test (Hash Test)                    collision_test             5.9
    Serial correlation Test                       serial_correlation_test    5.10
    Birthday-Spacing's Test                       birthday_spacing_test      5.8
    Overlapping Permutations Test
    Ranks of 31×31 and 32×32 matrices Test        bin_rank_chisqr_test
    Ranks of 6×8 Matrices Test                    bin_rank_ks_test
    Monkey Tests on 20-bit Words
    Monkey Tests OPSO, OQSO, DNA
    Count the 1's in a Stream of Bytes
    Count the 1's in Specific Bytes
    Parking Lot Test
    Minimum Distance Test                         minimum_distance_test
    Random Spheres Test                           random_sphere_test
    The Squeeze Test                              squeeze_test
    Overlapping Sums Test
    The Craps Test                                craps_test
    Sum of distributions (for parallel streams)                              5.22
    FFT                                                                      5.23
    Blocking Test
    2-d Random Walk                               random_walk_test           5.17
    Random Walkers on a line (S_n Test)
    2D Intersection Test
    2D Height Correlation Test                    height_corr2d_test         5.21
    Repeating Time Test                                                      5.13
    Gorilla Test                                                             5.15
    GCD Test                                                                 5.14
    Maurer's Universal Test                                                  5.24

Figure 5.2.: Available tests in the RNGTS framework

Example: If a sequence of numbers has too few runs, it is unlikely to be a truly random sequence. In the sequence

    {0.12, 0.35, 0.38, 0.45, 0.51, 0.69, 0.77, 0.78, 0.90, 0.93}

we find only one run up, so it is not likely to be a random sequence. If a sequence of numbers has too many runs, it is equally unlikely to be truly random. Look at the sequence

    {0.08, 0.93, 0.15, 0.96, 0.26, 0.84, 0.28, 0.79, 0.36, 0.57}

If we split this sequence into runs up and runs down, we find five runs up and four runs down, nine runs in total.

5.2.2. Runs above and below mean

This test is an addition to the runs up and down test (5.2.1). It is easy to build a sequence whose first 20 numbers lie above the mean while the following 20 numbers lie below it, and which still does not fail the runs up and down test. So we also have to check the behaviour of the runs above and below the mean.

- Calculate the mean of the sequence of random numbers.
- Split the sequence into subsequences above and below the mean. Let $n_a$ and $n_b$ be the numbers of values above and below the mean ($n_a + n_b = N$) and let $r$ be the total number of runs.
- The mean and variance of $r$ can be expressed as

  $$\mu_r = \frac{2 n_a n_b}{N} + \frac{1}{2}, \qquad \sigma_r^2 = \frac{2 n_a n_b \,(2 n_a n_b - N)}{N^2 (N-1)}$$

- For either $n_a$ or $n_b$ greater than 20, $r$ is approximately normally distributed:

  $$Z_0 = \frac{r - 2 n_a n_b / N - 1/2}{\sqrt{\dfrac{2 n_a n_b \,(2 n_a n_b - N)}{N^2 (N-1)}}}$$

- Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where $\alpha$ is the level of significance.

Example: We have the following sequence of random numbers.

    {0.78, 0.49, 0.41, 0.58, 0.82, 0.26, 0.30, 0.06, 0.36, 0.01}

Calculating the mean gives $\mu = 0.407$. Splitting the sequence into subsequences above and below the mean, the first five values lie above the mean and the last five below it. In this case one run is above and one below the mean. It is not likely to be a random sequence.

5.2.3. Length of runs

This test is an addition to the last two tests. It is still possible to create a sequence of numbers which passes the last two tests, although the probability that this sequence is truly random is very small. Such a sequence may be a run of two numbers below the mean, then a run of two numbers above the mean, and so on. So we need to test the randomness of the length of runs.

- Split the sequence into subsequences in one of the manners given above, where $N$ is the number of samples.
- Store the number of runs of length $i$ in RUN[i].

Here we should not apply a χ²-test to the data stored in RUN, because adjacent runs are not independent: a long run tends to be followed by a short run, and vice versa. So the statistic should be computed as follows:

$$V = \frac{1}{N-6} \sum_{i,j=1}^{6} \left(\mathrm{RUN}[i] - N b_i\right)\left(\mathrm{RUN}[j] - N b_j\right) a_{ij} \qquad (5.1)$$

The coefficients $a_{ij}$ and $b_i$ can be found in [16], which also shows a method to calculate the coefficients for an arbitrary maximal run length.

Example: Length of runs up

We have a random sequence: { }

Marking the runs up in the sequence, we get the following statistic:

    1 run of length 3
    3 runs of length 2
    2 runs of length 1

Constructor in runs_test.h

    runs_test(uint64_t n, std::size_t maxrunlength)

    n               number of random numbers to check for runs
    maxrunlength    runs longer than this length are accumulated
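The bookkeeping of sections 5.2.2 and 5.2.3 can be sketched as follows. The function names are hypothetical and the code is not taken from runs_test.h. The first function only fills the RUN array for runs up; the statistic (5.1) itself is not computed, because the coefficients $a_{ij}$ and $b_i$ have to be taken from [16]. The second function assumes a non-empty sequence with values on both sides of the mean.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// RUN[i] = number of runs up of length i for 1 <= i <= max_len.
// Runs longer than max_len are accumulated in RUN[max_len]; RUN[0] is unused.
std::vector<std::size_t> run_up_lengths(const std::vector<double>& x, std::size_t max_len)
{
    std::vector<std::size_t> run(max_len + 1, 0);
    if (x.empty() || max_len == 0) return run;
    std::size_t len = 1;
    for (std::size_t i = 1; i < x.size(); ++i) {
        if (x[i] > x[i - 1]) {
            ++len;                                   // the run up continues
        } else {
            ++run[len < max_len ? len : max_len];    // the run ends here
            len = 1;
        }
    }
    ++run[len < max_len ? len : max_len];            // close the last run
    return run;
}

// Z_0 of the runs above and below the mean test (section 5.2.2).
double runs_above_below_z(const std::vector<double>& x)
{
    const double N = static_cast<double>(x.size());
    const double mean = std::accumulate(x.begin(), x.end(), 0.0) / N;
    double na = 0.0, nb = 0.0, r = 0.0;
    bool prev_above = false;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const bool above = x[i] > mean;
        (above ? na : nb) += 1.0;                    // count values above / below
        if (i == 0 || above != prev_above) r += 1.0; // a new run starts
        prev_above = above;
    }
    const double mu = 2.0 * na * nb / N + 0.5;
    const double var = 2.0 * na * nb * (2.0 * na * nb - N) / (N * N * (N - 1.0));
    return (r - mu) / std::sqrt(var);
}
```

For the ten-number example of section 5.2.2 (five values above the mean followed by five below, two runs) the second function gives a Z_0 of about -2.35.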

Internally, the runs test has to invert a matrix. This functionality is provided by the LAPACK library, and the matrix handling is covered by routines from BLAS. The Boost interface for these two libraries is not yet in the official release, but it is available in the Boost-Sandbox. To use this test, the Boost-Sandbox must be installed; it is also available at [2] under Sandbox CVS.

5.3. Gap test

This test examines the length of gaps between occurrences of samples in a certain range. It determines the length of consecutive subsequences with samples not in a specific range. The algorithm to count the gap length is found in [16].

- Define an interval $[\alpha;\beta)$ with $0 \le \alpha < \beta \le 1$.
- Define a list to save the number of occurrences of gaps with length $l$, where $0 \le l \le t$. This is easily done with a structure like COUNT[l]. With every occurrence of a gap of length $l$, do COUNT[l] = COUNT[l]+1. If $l$ is bigger than $t$, increase COUNT[t].
- Search a subsequence $X_i, X_{i+1}, \dots, X_{i+l}$ of the random sequence $X_0, X_1, \dots, X_N$ in which $X_{i+l}$ lies in $[\alpha;\beta)$ but the other $X$'s do not. This subsequence of $l+1$ numbers represents a gap of length $l$. This increases the number in COUNT[l].
- After enough samples are tested, the χ²-test is applied to the $k = t+1$ values COUNT[0], COUNT[1], ..., COUNT[t], using the following probabilities:

  $$p_0 = p, \quad p_1 = p(1-p), \quad p_2 = p(1-p)^2, \quad \dots, \quad p_{t-1} = p(1-p)^{t-1}, \quad p_t = (1-p)^t$$

  Here $p = \beta - \alpha$, the probability that $\alpha \le X_i < \beta$.

The gap test can be applied with $\alpha = 0$ or $\beta = 1$ to simplify the test procedure. The special cases $(\alpha, \beta) = (0, 1/2)$ or $(1/2, 1)$ give rise to the runs above mean or runs below mean test.

This is not the same implementation of the test as used in [29]. They use $n$ random numbers and count the number of gaps, whereas this algorithm produces random numbers until $n$ gaps have been counted. An approximate conversion from one test to the other is possible with an estimate for the number of gaps within $n$ random numbers:

$$\text{gaps} \approx n(\beta - \alpha)$$

Example: We have the following sequence:

    {0.11, 0.83, 0.56, 0.95, 0.88, 0.73, 0.91, 0.01, 0.75, 0.67, 0.23, 0.38}

In this case we would take the first two numbers to determine the interval. This means $\alpha = 0.11$, $\beta = 0.83$, or $[0.11; 0.83]$. The sequence to check is

    {0.56, 0.95, 0.88, 0.73, 0.91, 0.01, 0.75, 0.67, 0.23, 0.38}

The first value lies in the interval, the next two values do not. This means that the gap length is 2. Marked in the sequence, with bold letters for values in the interval and numbers to count the gap length, the sequence looks as follows:

Calculating the probabilities with $p = \beta - \alpha = 0.72$ and a total of five gaps:

    t    p_t    expected # of gaps    counted # of gaps

Constructor in gap_test.h

    gap_test(std::size_t n, double lowergaplimit, double uppergaplimit, std::size_t maxgapcount)

    n                number of random numbers to count
    lowergaplimit    start of gap (α)
    uppergaplimit    end of gap (β)
    maxgapcount      number of steps counted until they are accumulated

5.4. Poker test

The original poker test considers $n$ groups of five successive integers, denoted by $X_{5i}, X_{5i+1}, \dots, X_{5i+4}$, $0 \le i < n$. We observe which of the following seven patterns each quintuple matches:

    All different:      abcde
    One pair:           aabcd
    Two pairs:          aabbc
    Three of a kind:    aaabc
    Full house:         aaabb
    Four of a kind:     aaaab
    Five of a kind:     aaaaa

A χ²-test is based on the number of quintuples in each category. To get a simpler version of this test, a good compromise [16] is to simply count the number of distinct values in the set of five. So we have five categories:

    5 different = all different
    4 different = one pair
    3 different = two pairs, or three of a kind
    2 different = full house, or four of a kind
    1 different = five of a kind

This breakdown is easier to determine systematically, and the test is nearly as good.
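The simplified classification can be sketched as follows. The function name poker_distinct_counts is hypothetical and the code is not taken from the poker_test class; it only counts the categories and leaves the χ² comparison with the expected category probabilities aside.

```cpp
#include <array>
#include <cstddef>
#include <set>
#include <vector>

// counts[r] = number of complete quintuples with exactly r distinct values,
// for r = 1..5 (counts[0] is unused). A trailing incomplete group is ignored.
std::array<std::size_t, 6> poker_distinct_counts(const std::vector<int>& x)
{
    std::array<std::size_t, 6> counts{};   // all entries start at zero
    for (std::size_t i = 0; i + 5 <= x.size(); i += 5) {
        // A set keeps each value once, so its size is the number of distinct values.
        const std::set<int> distinct(x.begin() + i, x.begin() + i + 5);
        ++counts[distinct.size()];
    }
    return counts;
}
```

For example, the quintuple 1, 1, 2, 3, 4 falls into the category "4 different" (one pair), and 7, 7, 7, 7, 7 into "1 different" (five of a kind).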


14. Pointers, Algorithms, Iterators and Containers II

14. Pointers, Algorithms, Iterators and Containers II Recall: Pointers running over the Array Beispiel 14. Pointers, Algorithms, Iterators and Containers II Iterations with Pointers, Arrays: Indices vs. Pointers, Arrays and Functions, Pointers and const,

More information

2 Review of Set Theory

2 Review of Set Theory 2 Review of Set Theory Example 2.1. Let Ω = {1, 2, 3, 4, 5, 6} 2.2. Venn diagram is very useful in set theory. It is often used to portray relationships between sets. Many identities can be read out simply

More information

Random Number Generators. Summer Internship Project Report submitted to Institute for Development and. Research in Banking Technology (IDRBT)

Random Number Generators. Summer Internship Project Report submitted to Institute for Development and. Research in Banking Technology (IDRBT) Random Number Generators Summer Internship Project Report submitted to Institute for Development and Research in Banking Technology (IDRBT) Submitted by: Vipin Kumar Singhal Bachelor in Technology, 3 rd

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

SOFTWARE ENGINEERING DESIGN I

SOFTWARE ENGINEERING DESIGN I 2 SOFTWARE ENGINEERING DESIGN I 3. Schemas and Theories The aim of this course is to learn how to write formal specifications of computer systems, using classical logic. The key descriptional technique

More information

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time Today Lecture 4: We examine clustering in a little more detail; we went over it a somewhat quickly last time The CAD data will return and give us an opportunity to work with curves (!) We then examine

More information

Introduction to Linked Lists. Introduction to Recursion Search Algorithms CS 311 Data Structures and Algorithms

Introduction to Linked Lists. Introduction to Recursion Search Algorithms CS 311 Data Structures and Algorithms Introduction to Linked Lists Introduction to Recursion Search Algorithms CS 311 Data Structures and Algorithms Lecture Slides Friday, September 25, 2009 Glenn G. Chappell Department of Computer Science

More information

Predictive Analysis: Evaluation and Experimentation. Heejun Kim

Predictive Analysis: Evaluation and Experimentation. Heejun Kim Predictive Analysis: Evaluation and Experimentation Heejun Kim June 19, 2018 Evaluation and Experimentation Evaluation Metrics Cross-Validation Significance Tests Evaluation Predictive analysis: training

More information

Discrete Mathematics Lecture 4. Harper Langston New York University

Discrete Mathematics Lecture 4. Harper Langston New York University Discrete Mathematics Lecture 4 Harper Langston New York University Sequences Sequence is a set of (usually infinite number of) ordered elements: a 1, a 2,, a n, Each individual element a k is called a

More information

How Random is Random?

How Random is Random? "!$#%!&(' )*!$#+, -/.(#2 cd4me 3%46587:9=?46@A;CBEDGF 7H;>I846=?7H;>JLKM7ONQPRKSJL4T@8KM4SUV7O@8W X 46@A;u+4mg^hb@8ub;>ji;>jk;t"q(cufwvaxay6vaz

More information

Chapter Four: Loops. Slides by Evan Gallagher. C++ for Everyone by Cay Horstmann Copyright 2012 by John Wiley & Sons. All rights reserved

Chapter Four: Loops. Slides by Evan Gallagher. C++ for Everyone by Cay Horstmann Copyright 2012 by John Wiley & Sons. All rights reserved Chapter Four: Loops Slides by Evan Gallagher The Three Loops in C++ C++ has these three looping statements: while for do The while Loop while (condition) { statements } The condition is some kind of test

More information

Notebook Assignments

Notebook Assignments Notebook Assignments These six assignments are a notebook using techniques from class in the single concrete context of graph theory. This is supplemental to your usual assignments, and is designed for

More information

(Refer Slide Time: 00:02:02)

(Refer Slide Time: 00:02:02) Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 20 Clipping: Lines and Polygons Hello and welcome everybody to the lecture

More information

Notation Index 9 (there exists) Fn-4 8 (for all) Fn-4 3 (such that) Fn-4 B n (Bell numbers) CL-25 s ο t (equivalence relation) GT-4 n k (binomial coef

Notation Index 9 (there exists) Fn-4 8 (for all) Fn-4 3 (such that) Fn-4 B n (Bell numbers) CL-25 s ο t (equivalence relation) GT-4 n k (binomial coef Notation 9 (there exists) Fn-4 8 (for all) Fn-4 3 (such that) Fn-4 B n (Bell numbers) CL-25 s ο t (equivalence relation) GT-4 n k (binomial coefficient) CL-14 (multinomial coefficient) CL-18 n m 1 ;m 2

More information

C Functions. 5.2 Program Modules in C

C Functions. 5.2 Program Modules in C 1 5 C Functions 5.2 Program Modules in C 2 Functions Modules in C Programs combine user-defined functions with library functions - C standard library has a wide variety of functions Function calls Invoking

More information

Slide Set 2. for ENCM 335 in Fall Steve Norman, PhD, PEng

Slide Set 2. for ENCM 335 in Fall Steve Norman, PhD, PEng Slide Set 2 for ENCM 335 in Fall 2018 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary September 2018 ENCM 335 Fall 2018 Slide Set 2 slide

More information

(Refer Slide Time: 00:01:30)

(Refer Slide Time: 00:01:30) Digital Circuits and Systems Prof. S. Srinivasan Department of Electrical Engineering Indian Institute of Technology, Madras Lecture - 32 Design using Programmable Logic Devices (Refer Slide Time: 00:01:30)

More information

6.856 Randomized Algorithms

6.856 Randomized Algorithms 6.856 Randomized Algorithms David Karger Handout #4, September 21, 2002 Homework 1 Solutions Problem 1 MR 1.8. (a) The min-cut algorithm given in class works because at each step it is very unlikely (probability

More information

Module 2: Classical Algorithm Design Techniques

Module 2: Classical Algorithm Design Techniques Module 2: Classical Algorithm Design Techniques Dr. Natarajan Meghanathan Associate Professor of Computer Science Jackson State University Jackson, MS 39217 E-mail: natarajan.meghanathan@jsums.edu Module

More information

Section 4 General Factorial Tutorials

Section 4 General Factorial Tutorials Section 4 General Factorial Tutorials General Factorial Part One: Categorical Introduction Design-Ease software version 6 offers a General Factorial option on the Factorial tab. If you completed the One

More information

CVEN Computer Applications in Engineering and Construction. Programming Assignment #2 Random Number Generation and Particle Diffusion

CVEN Computer Applications in Engineering and Construction. Programming Assignment #2 Random Number Generation and Particle Diffusion CVE 0-50 Computer Applications in Engineering and Construction Programming Assignment # Random umber Generation and Particle Diffusion Date distributed: 0/06/09 Date due: 0//09 by :59 pm (submit an electronic

More information

Maciej Sobieraj. Lecture 1

Maciej Sobieraj. Lecture 1 Maciej Sobieraj Lecture 1 Outline 1. Introduction to computer programming 2. Advanced flow control and data aggregates Your first program First we need to define our expectations for the program. They

More information

Clustering and Visualisation of Data

Clustering and Visualisation of Data Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some

More information

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #44. Multidimensional Array and pointers

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #44. Multidimensional Array and pointers Introduction to Programming in C Department of Computer Science and Engineering Lecture No. #44 Multidimensional Array and pointers In this video, we will look at the relation between Multi-dimensional

More information

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display CURRICULUM MAP TEMPLATE Priority Standards = Approximately 70% Supporting Standards = Approximately 20% Additional Standards = Approximately 10% HONORS PROBABILITY AND STATISTICS Essential Questions &

More information

Package CryptRndTest

Package CryptRndTest Type Package Package CryptRndTest February 25, 2016 Title Statistical Tests for Cryptographic Randomness Version 1.2.2 Date 2016-02-24 Author Maintainer Performs cryptographic

More information

15. Pointers, Algorithms, Iterators and Containers II

15. Pointers, Algorithms, Iterators and Containers II 498 Recall: Pointers running over the Array 499 Beispiel 15. Pointers, Algorithms, Iterators and Containers II int a[5] = 3, 4, 6, 1, 2; for (int p = a; p < a+5; ++p) std::cout

More information

10.4 Linear interpolation method Newton s method

10.4 Linear interpolation method Newton s method 10.4 Linear interpolation method The next best thing one can do is the linear interpolation method, also known as the double false position method. This method works similarly to the bisection method by

More information

C++ Basics. Data Processing Course, I. Hrivnacova, IPN Orsay

C++ Basics. Data Processing Course, I. Hrivnacova, IPN Orsay C++ Basics Data Processing Course, I. Hrivnacova, IPN Orsay The First Program Comments Function main() Input and Output Namespaces Variables Fundamental Types Operators Control constructs 1 C++ Programming

More information

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far

Introduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far Chapter 5 Hashing 2 Introduction hashing performs basic operations, such as insertion, deletion, and finds in average time better than other ADTs we ve seen so far 3 Hashing a hash table is merely an hashing

More information

Functions in C++ Problem-Solving Procedure With Modular Design C ++ Function Definition: a single

Functions in C++ Problem-Solving Procedure With Modular Design C ++ Function Definition: a single Functions in C++ Problem-Solving Procedure With Modular Design: Program development steps: Analyze the problem Develop a solution Code the solution Test/Debug the program C ++ Function Definition: A module

More information

Trees, Part 1: Unbalanced Trees

Trees, Part 1: Unbalanced Trees Trees, Part 1: Unbalanced Trees The first part of this chapter takes a look at trees in general and unbalanced binary trees. The second part looks at various schemes to balance trees and/or make them more

More information

Chapter Summary. Mathematical Induction Recursive Definitions Structural Induction Recursive Algorithms

Chapter Summary. Mathematical Induction Recursive Definitions Structural Induction Recursive Algorithms Chapter Summary Mathematical Induction Recursive Definitions Structural Induction Recursive Algorithms Section 5.1 Sec.on Summary Mathematical Induction Examples of Proof by Mathematical Induction Mistaken

More information

(Refer Slide Time: 00:02:00)

(Refer Slide Time: 00:02:00) Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 18 Polyfill - Scan Conversion of a Polygon Today we will discuss the concepts

More information

Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics

Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics Masayo Haneda Peter M.W. Knijnenburg Harry A.G. Wijshoff LIACS, Leiden University Motivation An optimal compiler optimization

More information

CMPSCI 646, Information Retrieval (Fall 2003)

CMPSCI 646, Information Retrieval (Fall 2003) CMPSCI 646, Information Retrieval (Fall 2003) Midterm exam solutions Problem CO (compression) 1. The problem of text classification can be described as follows. Given a set of classes, C = {C i }, where

More information

Image Compression With Haar Discrete Wavelet Transform

Image Compression With Haar Discrete Wavelet Transform Image Compression With Haar Discrete Wavelet Transform Cory Cox ME 535: Computational Techniques in Mech. Eng. Figure 1 : An example of the 2D discrete wavelet transform that is used in JPEG2000. Source:

More information

Multivariate Normal Random Numbers

Multivariate Normal Random Numbers Multivariate Normal Random Numbers Revised: 10/11/2017 Summary... 1 Data Input... 3 Analysis Options... 4 Analysis Summary... 5 Matrix Plot... 6 Save Results... 8 Calculations... 9 Summary This procedure

More information

Testing Primitive Polynomials for Generalized Feedback Shift Register Random Number Generators

Testing Primitive Polynomials for Generalized Feedback Shift Register Random Number Generators Brigham Young University BYU ScholarsArchive All Theses and Dissertations 2005-11-30 Testing Primitive Polynomials for Generalized Feedback Shift Register Random Number Generators Guinan Lian Brigham Young

More information

Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore

Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore (Refer Slide Time: 00:20) Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Lecture - 4 Lexical Analysis-Part-3 Welcome

More information

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40

Lecture 16. Reading: Weiss Ch. 5 CSE 100, UCSD: LEC 16. Page 1 of 40 Lecture 16 Hashing Hash table and hash function design Hash functions for integers and strings Collision resolution strategies: linear probing, double hashing, random hashing, separate chaining Hash table

More information

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google,

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google, 1 1.1 Introduction In the recent past, the World Wide Web has been witnessing an explosive growth. All the leading web search engines, namely, Google, Yahoo, Askjeeves, etc. are vying with each other to

More information

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung POLYTECHNIC UNIVERSITY Department of Computer and Information Science VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung Abstract: Techniques for reducing the variance in Monte Carlo

More information