Arjen Markus (4 april 2019) Monte Carlo methods are often used to integrate (or average) some function over a multidimensional space. The idea is quite simple: you randomly select points in the parameter/coordinate space, evaluate the function and the sum or the mean of the values you get is an approximation to the integral or the average.

The nice thing about Monte Carlo methods is that you need not worry about the number of dimensions - just pick one more coordinate and you are done.

Well, not quite. A significant drawback of Monte Carlo methods is that the error you make is inversely proportional to the square root of the number of points (1/sqrt(N)). In other words, to reduce the error (on average - this is after all a probabilistic method) by a factor of 2, you need to pick four times more points.

Using so-called quasi-random points gives a much better result: you can expect the error to behave as 1/N instead of 1/sqrt(N): to reduce the error by a factor of 2, you pick twice as many points. For a reduction by a factor 10, ten times as many points in stead of 100 times.

To select quasi-random points in a coordinate space is, however, not as simple as it looks and there have been many studies on their properties. And that is where this blog comes in. I heard about it on the Tclers chat and did some experiments. It does look very promising.

The code below is a very simple ad hoc program to compare the new method for quasi-random numbers with a straightforward Monte Carlo method:

- The function is one of three, fairly simple and smooth, functions.
- The goal is to determine the integral over the square of 1 by 1.
- For each method it does ten trials of 100 points and prints the result.

I could have made a more formal comparison, of course, determine the standard deviation of the series of trials, compare to the exact value (if that is - it is for two of these functions), etc. But I just let the numbers speak instead.

For the first function, the integral is exactly (exp(1)-1)**2 or 2.952492440125593. The result of one particular run of the program is:

Monte Carlo: Trial 0: 2.9167670044773537 Trial 1: 2.810709684843801 Trial 2: 2.9047008201644164 Trial 3: 2.976484326431138 Trial 4: 3.0313593590149632 Trial 5: 2.96497151958601 Trial 6: 2.8629509293157156 Trial 7: 3.0411656660841055 Trial 8: 2.8703993994820847 Trial 9: 2.9159757069046632 Quasi-random points Trial 0: 2.978267096620306 Trial 1: 2.955749509583321 Trial 2: 2.9527730154883494 Trial 3: 2.926382581196727 Trial 4: 2.9579869418153515 Trial 5: 2.9618063810319537 Trial 6: 2.9649989225942868 Trial 7: 2.9364314388690826 Trial 8: 2.9293270016224824 Trial 9: 2.950819793096617 Trial 10: 2.9529078493520875

As you can see the spread for the Monte Carlo method is much bigger than that of the quasi-random points method.

For the third function, the product of two cosines, the exact value of the integral is 0. Here is the output from one run:

Monte Carlo: Trial 0: -0.0480207020593826 Trial 1: -0.0603522327671914 Trial 2: 0.05696694135009513 Trial 3: -0.047421832955353314 Trial 4: -0.026318495422166435 Trial 5: 0.05661887158719929 Trial 6: -0.026122597935765293 Trial 7: 0.044876304823986496 Trial 8: 0.0030658863374625036 Trial 9: -0.013536678398721615 Quasi-random points Trial 0: -0.009778893711060707 Trial 1: 0.009266650448147623 Trial 2: -0.008626148945971347 Trial 3: 0.007882773877382872 Trial 4: -0.007065039878311997 Trial 5: 0.006203695521161606 Trial 6: -0.005330757368947462 Trial 7: 0.004478504366989686 Trial 8: -0.0036784640718740703 Trial 9: 0.0029604224726672724 Trial 10: -3.389966518034056e-6

Here is the program:

proc func {x y} { expr {exp($x+$y)} } # Wavy alternative proc func {x y} { expr {cos(20.0*($x**2+$y**2))+$x} } # Product of cosines - integral should approach zero proc func {x y} { set pi [expr {acos(-1.0)}] expr {cos(2.0*$pi*$x) * cos(2.0*$pi*$y)} } # Determine the parameters for the quasi-random points # (in 2D) # set n 3 set rn [expr {1.0/$n}] set x 1.0 for {set i 0} {$i < 10} {incr i} { set x [expr {$x - ($x**$n-$x-1.0) / ($n*$x**($n-1)-1.0)}] } set ax [expr {1.0/$x}] set ay [expr {1.0/$x**2}] puts "Monte Carlo:" for {set trial 0} {$trial < 10} {incr trial} { set sum 0.0 for {set p 0} {$p < 100} {incr p} { set x [expr {rand()}] set y [expr {rand()}] set sum [expr {$sum + [func $x $y]}] } puts "Trial $trial: [expr {$sum/100.0}]" } puts "Quasi-random points" set p 0 ;# Needs to be outside the loop! for {set trial 0} {$trial < 10} {incr trial} { set sum 0.0 for {set p1 0} {$p1 < 100} {incr p1} { incr p set x [expr {fmod($p * $ax, 1.0)}] set y [expr {fmod($p * $ay, 1.0)}] set sum [expr {$sum + [func $x $y]}] } puts "Trial $trial: [expr {$sum/100.0}]" } set sum 0.0 for {set p1 0} {$p1 < 10000} {incr p1} { incr p set x [expr {fmod($p * $ax, 1.0)}] set y [expr {fmod($p * $ay, 1.0)}] set sum [expr {$sum + [func $x $y]}] } puts "Trial $trial: [expr {$sum/10000.0}]"

When you examine the paper, you will see a general method for determining the parameters - it can be extended to an arbitrary number of dimensions. Another attractive feature - besides the apparent accuracy - is that no external tuning parameters are required.

I intend to make a nice little package out of this for Tcllib, but that takes a bit more work than this little example.