All the Perl that's Practical to Extract and Report

  • Tests of random functions, that is to say functions specified to act randomly, must be STATISTICAL tests. Run the function a large number of times ($N), and check that the basic and not-so-basic statistics match the model you require of your function under test.

    Pseudocode for an inadequate number of tests against an imaginary model, assumed to be not unlike a standard bell curve (your mileage WILL vary):

    my $fut = \&FunctionUnderTest;
    my @X   = map { $fut->() } 1 .. $N;

    use Acme::Statistics; # Fantasy package that provides what I use below ...

    ok( abs(mean(@X)) < 0.5,       "mean near 0");
    ok( abs(stddev(@X) - 1) < 0.2, "StdDev near 1");
    ok( abs(skew(@X)) < 0.1,       "symmetric");

    # Counting exact values only makes sense for discrete output; bin continuous values first.
    my %Count;  $Count{$_}++ for @X;
    ok( chisq(values %Count) > 0.9,"distribution ok");

    # Pass array refs: two bare slices would flatten into a single list.
    ok( abs(corr([ @X[grep {  $_ % 2 } 0..$#X] ],
                 [ @X[grep {!($_ % 2)} 0..$#X] ]))
             < 0.1, "Not auto correlated even/odd");
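
Since Acme::Statistics is a fantasy package, here is a minimal runnable sketch of the same idea in plain Perl. Everything in it is an assumption made for illustration: the helpers mean, stddev, skew and corr are hand-rolled, gaussian() is a hypothetical stand-in for the function under test (a Box-Muller transform on top of rand), and the fixed srand seed is only there so the test does not fail at random. The chi-squared check is left out because it needs binning and a proper p-value.

```perl
use strict;
use warnings;
use List::Util qw(sum);
use Test::More;

use constant PI => 4 * atan2(1, 1);

# Arithmetic mean of a list of numbers.
sub mean { return sum(@_) / scalar(@_) }

# Population standard deviation (divides by N, not N-1).
sub stddev {
    my $m = mean(@_);
    return sqrt( sum( map { ($_ - $m) ** 2 } @_ ) / scalar(@_) );
}

# Skewness: third central moment over stddev cubed; 0 for symmetric data.
sub skew {
    my $m = mean(@_);
    my $s = stddev(@_);
    return sum( map { ($_ - $m) ** 3 } @_ ) / ( scalar(@_) * $s ** 3 );
}

# Pearson correlation of two equal-length array refs.
sub corr {
    my ($x, $y) = @_;
    my ($mean_x, $mean_y) = ( mean(@$x), mean(@$y) );
    my $cov = sum( map { ($x->[$_] - $mean_x) * ($y->[$_] - $mean_y) } 0 .. $#$x )
            / scalar(@$x);
    return $cov / ( stddev(@$x) * stddev(@$y) );
}

# Hypothetical function under test: standard normal via Box-Muller.
# 1 - rand() lies in (0, 1], so log() never sees zero.
sub gaussian {
    return sqrt( -2 * log( 1 - rand() ) ) * cos( 2 * PI * rand() );
}

srand(42);    # fixed seed (an assumption) so reruns see the same sample
my $N   = 10_000;
my $fut = \&gaussian;
my @X   = map { $fut->() } 1 .. $N;

ok( abs(mean(@X)) < 0.5,       "mean near 0" );
ok( abs(stddev(@X) - 1) < 0.2, "StdDev near 1" );
ok( abs(skew(@X)) < 0.1,       "symmetric" );
ok( abs(corr([ @X[ grep {   $_ % 2  } 0 .. $#X ] ],
             [ @X[ grep { !($_ % 2) } 0 .. $#X ] ])) < 0.1,
    "Not auto correlated even/odd" );

done_testing();
```

With $N at 10,000 the tolerances above sit several standard errors away from the expected values, which is what keeps a correct function from tripping them. Tighten them only if you raise $N to match.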

    The classical references on this are Knuth http://isbn.nu/0201896842 [isbn.nu], and then GG&M JACM 1986 [psu.edu] and B&M SIAM JoC [psu.edu]; ACM CALGO [acm.org] has good stuff buried in it too. Wikipedia [wikipedia.org] has a reference to, and a short summary of, the German standard for rating random functions; and the US NIST standard [nist.gov] is voluminous, with test info.

    Just remember

    Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. -- John von Neumann
    --
    Bill
    # I had a sig when sigs were cool
    use Sig;
    • Interesting. I was just talking to a coworker about that. It's been years since statistics classes and I was struggling to remember the formulae involved in calculating what I was interested in. I wasn't thinking of a distribution under a bell curve so much as I was thinking "if X might not be correct, how many times do I need to calculate X to ensure the odds of it being incorrect are acceptably minimized?" More importantly, I can't have randomly failing tests, but I'm quite happy to have a test which