Monte Carlo Methods

Tasks

Introduction - The goal of this assignment is to better understand the concepts taught in class, such as Monte Carlo methods and Monte-Carlo integration.

Using a random-number generator

The goal of this problem is to familiarize yourself with the use of a random number generator and routines to automatically seed the generator. As described in class we will rely on the facilities provided by the C++ standard library to generate random numbers and seeding the generator.

In the provided file random_numbers.cpp, write a program that generates an increasing number of uniformly distributed random numbers in the interval \([0, 1]\) and compute the averages of those. A maximum of \(10^N\) with \(N_{\max}=6\) should be enough. Use the C++ random number generator type std::mt19937 and seed it using the an instance of std::random_device. You can use the type std::uniform_real_distribution to generated the desired statistic distribution.

The exact value for the average should be close to the value of 0.5, you can stop generating random numbers once the average becomes close to the expected value. Study how the average changes when N is increased.

Generate a plot using the facilities provided in matplotlibcpp.h that shows the average of the generated numbers for an increasing \(N\). Save the generated plot as results/random_number.png and commit the generated plot as part of your overall submission.

The generated seed for the random number generator is different for each invocation. Therefore, every time you run the code, it should produce a (slightly) different output.

Estimating \(\pi\) using simple-sampling Monte Carlo

The goal of this problem is to illustrate how \(\pi = 3.1415...\) can be computed by random sampling of the unit disk. Starting from the code presented in class, write a program that calculates \(\pi\). Use the file estimate_pi.cpp for your work.

In your simulation run the code multiple times for \(N=10^{i}, i = 1, 2, 3, \cdots\) random numbers. See how the estimate for \(\pi\) improves with increasing N and compute the deviation from the exact result: \(error = \lvert \pi - \pi_{estimate} \rvert\).

Perform a log-log plot (using plt::loglog) of the error as a function of \(N\) and show that the data can be fit to a straight line of slope \(\frac {-1}{2}\).

Save the generated plot as results/estimate_pi.png and commit the generated plot as part of your overall submission.

A Random Walk in the Plane

Code a random walk in the plane where, in addition to a random angle, the step size is variable and follows an exponential distribution. I.e.,

\[P(s) = e^{-s}\]

where \(s\) is the step size to be specified on the command line and

\[P(\theta) = \frac{1}{2\pi}\]

where \(\theta\) is the angle. Use the file random_walk.cpp for your work.

Make a distribution of the distance from the origin by making a histogram and summing the weights in the bins after \(10^5\) steps. Select a reasonable bin-size to be able to plot a smooth distribution curve. You can use the class histogram that is implemented in the file histogram.hpp for generating it.

You will need to generate many random walks each of which takes \(10^5\) steps, extract the distance of the respective endpoint from the origin for each, and plot the resulting distribution that you collected using the histogram class. Study the implementation in histogram.hpp to see how to use the type for your purposes.

You can use the C++ type std::exponential_distribution from the standard library to generate the desired distribution of the generated random step sizes. Use the type std::uniform_real_distribution to generate the next random angle.

Start your random walk at the coordinate origin \([0.0, 0.0]\) and calculate the distance of the endpoint for each walk relative to it.

Experiment with different step sizes \(s\) and plot the distributions for those. Explain your results. Save the generated plot as results/random_walk.png and commit the generated plot as part of your overall submission.

Simple-sampling Monte Carlo estimate of the integral \(f(x) = x^n\)

The goal of the exercise is to apply the concepts learned in the previous problem and apply these to the real function \(f(x) = x^n\) where the integral is known analytically, namely

\[I[f(x)] =\int f(x) dx = \frac{1}{(n+1)}\]

in the interval \([0, 1]\). For now, set \(n = 3\), you can change the value of the exponent \(n\) later. Start from the following pseudo-code:

algorithm simple_integrate
    initialize integral 0
    initialize num_trials 10000
    initialize counter 0
    while(counter < num_trials) do
        x = rand(0, 1)
        integral += x**n
        ++counter
    done
    return integral / num_trials

As in the previous tasks, run the code multiple times for \(N=10^i\), \(i=1,2,3, ...\) random numbers. See how the estimate for \(I[f(x)]\) in the interval \([0, 1]\) improves with increasing \(N\) and compute and plot the deviation from the exact result:

\[error = \lvert I - I_{estimate} \rvert\]

Again, the error should scale \(~N^{\frac {-1}{2}}\).

Use the file simple_integral.cpp for your work. Submit your plot as results/simple_integral.png

Bonus: Probability of two or more People having the same Birthday

This part is optional and is worth 5 points of extra credit on this assignment.

Please keep in mind that the value of the extra credit may not be proportionate to the time it takes to complete it.

Find out the probability that, out of a group of thirty people, at least two people share a birthday. Implement your code in the file birthdays.cpp.

If you like, analytically derive the exact answer (which is ~70.55094192102…%, assume 365 days/year)
Use the Monte-Carlo method to estimate this probability.
Show that with increasing number of attempts, the estimation error is reduced, create a plot showing the relative estimation error for varying numbers of attempts.
Submit your plot as results/birthdays.png
How many attempts did you need to get an estimate that has a relative error of less than 0.0001?