Modelling uncertainty

( previous, home, next )


Further reading

Ancient Roman dice

Simulating events involving chance

People have long struggled to make sense of unexpected events. In some communities, unusual events are a cause for invoking superstition, supernatural spirits, and ancestors. The ancient Greek storyteller Aesop recognized the goddess Tyche (Fortuna) in his fables as the reason for outcomes that had no recognizable cause, and suggested that people could use religious offerings to improve their luck. The 1663 and 1755 New England earthquakes were widely interpreted as signs of divine displeasure and inspired brief religious revivals. This idea also resonates with the Islamic doctrine of occasionalism, promoted by al-Ghazali in the 11th century. Occasionalism says that rather than interpreting predictable cause and effect as laws of nature, all events, like chance events, should be seen as direct acts of God, and hence that the study of theology should replace the study of nature. In popular culture, as in Thomas Hardy’s 1878 novel “The Return of the Native”, fate and irony are invoked as agents of luck. (see Roll the Bones: The History of Gambling for more history)

In our lectures so far, our models have used equations or algorithms to help us summarize data and make predictions. All of these models have been “deterministic”. A deterministic model is one where calculations can be made as precisely as we desire, and for a given set of inputs, the model always gives the same output. As long as we reset the starting conditions to the same thing each time we repeat the experiment, we always end up at the same outcome. In the late 1800’s, after a string of successes, scientists began to grapple with the reality that some phenomena do not seem to behave deterministically. (Poincaré and the 3-body problem, quantum mechanics, Brownian motion)

What follows are three problems and simulation models written to help us understand them. These kinds of simulations are called Monte Carlo simulations.

Gombaud’s bet

In the first book on the analysis of chance events (1657), Dutch scientist Christiaan Huygens tells how a gambling argument led two famous French mathematicians, Blaise Pascal and Pierre de Fermat, to make important leaps forward in the description of chance events as probabilities. As the story goes, Antoine Gombaud, Chevalier de Méré, a French nobleman, had observed a problem with a system for winning a particular dice game. The game was to throw a pair of dice several times and try to get a double-six. Experienced gamblers believed that to win over the long term, one needed at least 25 tries when betting even money. But Gombaud had calculated that one would make a profit eventually even with only 24 rolls. His reasoning was based on an ancient rule of thumb that the critical number of rolls needed to break even was inversely proportional to the chance of success. When the chance of success was 1 in 6, a player needed 4 rolls to win more often than not, so if the chance of success was 1 in 36 (6 times smaller), the number of rolls needed was \(6 \times 4 = 24\). Gombaud wrote to Pascal to see if he could explain the disagreement.

Suppose we try to reproduce a naive gambler’s experience with Gombaud’s bet – we throw 2 dice 24 times and come up without a double-six, so we say to ourselves, “I’m going to lose my money if I pay to play this game”, and we don’t play. But somebody else does that same experiment, rolls a double-six, and thinks, “This is a game I can win, at least most of the time. I’d be happy to pay fifty cents to play it.” And when they actually play, they might win or they might lose. That one experiment didn’t tell them (or us) very well what was going to happen – this is a game of chance, and the outcome can be different each time it is played. And so, we’ve reached a great philosophical bifurcation point – how do we model and make sense of things in the world that can give different outputs each time, even when the inputs are the same?

One way of making sense of the unpredictable is to imagine repeating the game not just once or twice, but many, many times. If we keep repeating this process over and over, what fraction of the time do we expect to win? Antoine had initially calculated that a double-six would occur about once every 36 throws, so if he threw the pair of dice 24 times, he should win about \(24 \times \tfrac{1}{36} = 2/3 \approx 0.66\) of the time. Yet, try as they did, gamblers never seemed to make a consistent profit on bets based on 24 rolls. We can easily repeat Gombaud’s experience with a pair of dice in hand. With a little more work to implement a simulation, we can replicate the bet much faster and many more times than we could ever manage by hand.

[Show code]
from numpy.random import rand

# rand is a procedure provided in many languages that returns a random
# floating point number between 0 and 1, or a full array of such floating
# point numbers.  It can be used to create many kinds of random numbers.
# In this case, by multiplying by 6, adding 1, and then truncating to an
# integer, we can simulate the roll of a cubic die with 6 sides.  The value
# returned by rand() may be zero, but is never 1, so we don't have to worry
# about the rare possibility of actually getting 7 instead of 6.

def dieroll():
    x = int(rand()*6)+1
    return x

def bet():
    numrolls = 24
    outcomes = [ (dieroll(), dieroll()) for i in range(numrolls) ]
    iswin = ((6,6) in outcomes)
    return iswin

for b in range(7):
    numbets = 10**b
    numwins = 0
    for i in range(numbets):
        if bet():
            numwins += 1
    fracwins = float(numwins) / float(numbets)
    print("Of %d bets, you won a fraction %f of them"%(numbets,fracwins))

If we run the script, we get output like this (though maybe not EXACTLY this).

Of 1 bets, you won a fraction 0.000000 of them
Of 10 bets, you won a fraction 0.500000 of them
Of 100 bets, you won a fraction 0.530000 of them
Of 1000 bets, you won a fraction 0.513000 of them
Of 10000 bets, you won a fraction 0.483400 of them
Of 100000 bets, you won a fraction 0.490640 of them
Of 1000000 bets, you won a fraction 0.492301 of them

The numbers are converging very slowly, but seem to be settling around 0.49 after a million bets. While we aren’t very certain about the exact value, it is clear that the chance of winning is smaller than the 0.66 that Gombaud was expecting. It appears that double-sixes are actually a losing bet at even money; the dice-thrower is expected to win just a little less often than they lose over the long term. But you’d have to place a very large number of bets to get a precise estimate – if you made one bet per minute, and played for 18 hours non-stop each day, it would take you about three months to make one hundred thousand bets, and two and a half years to reach a million.
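In fact, this is one case where we can check the simulation by hand. The chance of not throwing a double-six on a single throw of two dice is \(35/36\), so the chance of winning – at least one double-six in 24 independent throws – is \(1 - (35/36)^{24} \approx 0.4914\), in good agreement with the simulation. With 25 throws, the chance rises to \(1 - (35/36)^{25} \approx 0.5055\), which is exactly why the experienced gamblers insisted on 25 rolls.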

This technique we just used is called “simulation”. Simulations are algorithms that allow us to step through possible series of events to get a feel for how our decisions will play out. Many simulations are done with computer programs, but simulations have also often been done with pencil, paper, and dice (GURPS, D&D, board games, …).

Chocolate quality control

You have recently come into possession of a truckload of cacao beans useful for making chocolate (\(4\times 4 \times 8 = 128\) crates). You’d like to export them to the US, but there is a catch. Most of the crates come from plantations that grow high-grade cacao varieties, but about 25 of the crates are CCN-51 cacao, a miserable variety used to make most of the chocolate in the United States (see here).

When your truck goes to market, you know that the testers will not check every crate. Instead, they will pick 5 crates at random (each different), and pay you depending on the quality of those crates. Since 25 crates is about a fifth of 128, we might expect about \(5 \times 25/128 \approx 1\) of the five tested crates to be found to be CCN-51, but you might get lucky and they miss all of them, or you might get unlucky. How often, in a truckload of this size, do the testers find none, one, two, three, four, or five crates of low-quality chocolate? Well, it depends on how they test, and maybe you can game the tests, but if the testing is totally random, we can write a simple Python program to simulate the process and see what happens.

[Show code]
from numpy.random import shuffle

N = 4*4*8 # total number of crates
K = 25 # number of crates with CCN-51 chocolate
n = 5 # number of crates tested

num_checks = 100000

truck = [ 0 ]*(N-K) + [ 1 ]*K
counts = [ 0 ]*(n+1)
for t in range(num_checks):
    # we shuffle the truck's crates, and then check the first n
    shuffle(truck)
    counts[sum(truck[0:n])] += 1

frequency = [ x/float(num_checks) for x in counts ]
print("# count, frequency")
print("# %d checks"%num_checks)
for x in range(n+1):
    print("%d, %f"%(x, frequency[x]))

Your results are likely to be similar to what follows …

# count, frequency
# 100000 checks
0, 0.330680
1, 0.419020
2, 0.201100
3, 0.043980
4, 0.005000
5, 0.000220

Of course, if the selection of crates is not random, but somewhat known ahead of time, you could arrange the crates on the truck to your benefit and do better than this. The numbers above are a worst-case scenario.
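As a check on the simulation, the number of bad crates found under purely random testing follows what probabilists call the hypergeometric distribution, and scipy can compute its probabilities exactly. Here is a minimal sketch; `hypergeom` takes the population size, the number of special items in the population, and the number of draws, in that order.

[Show code]
from scipy.stats import hypergeom

N = 4*4*8 # total number of crates
K = 25    # number of crates with CCN-51 chocolate
n = 5     # number of crates tested

rv = hypergeom(N, K, n)
print("# count, probability")
for k in range(n+1):
    print("%d, %f"%(k, rv.pmf(k)))

The exact probabilities it prints (about 0.33 for finding no bad crates, 0.42 for finding one, and so on) agree with the simulated frequencies above to within sampling noise.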

The Siege of Australia

If you have played the board game “Risk” a few times, you’ve probably noticed that Australia plays an unusually important strategic role – games often pivot in the end on defence of the Australian archipelago, and the Strait of Malacca in particular. If you can break through across the strait into Indonesia, you can often take Australia, but smart defenders can mass forces in Indonesia, making a breakthrough very difficult. The question then becomes: how many armies do you need to break through and conquer the continent?

Question: Given the number of armies you have in Siam and the number your enemy has defending Indonesia, how frequently do you break through the defensive line?

To answer this, let’s perform some simulations of the dice rolls that might play out, given the rules of the game and the number of armies on each side.

[Show code]
from numpy.random import rand

def dieroll():
    r = int(rand()*6)+1
    assert r > 0 and r < 7
    return r

def randbattle(num_mine, num_yours):
    while True:
        if num_mine == 1 or num_yours == 0:
            return num_mine, num_yours
        assert num_mine > 0
        # the attacker rolls up to 3 dice (one army must stay behind),
        # the defender rolls up to 2
        x, y = min(3, num_mine-1), min(2, num_yours)
        me = sorted(dieroll() for i in range(x))
        you = sorted(dieroll() for i in range(y))
        # compare the highest remaining dice; ties go to the defender
        while len(me) > 0 and len(you) > 0:
            if me.pop() > you.pop():
                num_yours -= 1
            else:
                num_mine -= 1

def main():
    numtrials = 10**4
    num_mine = int(input("How many armies do you have to attack? "))
    num_yours = int(input("How many armies are defending? "))
    outcomes = {}
    for x in range(-num_yours, num_mine+1):
        outcomes[x] = 0
    numwins = 0
    for t in range(numtrials):
        result = randbattle(num_mine, num_yours)
        assert result[0] == 1 or result[1] == 0
        result = result[0] - result[1]
        outcomes[result] += 1
        if result > 1:
            numwins += 1

    print("I have run %d battles."%numtrials)
    print("Of those trials, you won %f percent of them."%(100*float(numwins)/numtrials))
    # c = 0.
    # for x in range(num_mine, -num_yours-1, -1):
    #     f = outcomes[x]/float(numtrials)
    #     c += f
    #     print("%d,%.8f,%.4f"%(x, f, c))

main()


It would be helpful to know how the frequency of winning changes as the initial number of armies on each side changes. This can be displayed with an image plot, after some recoding.

[Show code]
from numpy import zeros, flipud
from numpy.random import rand
from matplotlib.pyplot import *

def dieroll():
    r = int(rand()*6)+1
    assert r > 0 and r < 7
    return r

def randbattle(num_mine, num_yours):
    while True:
        if num_mine == 1 or num_yours == 0:
            return num_mine, num_yours
        assert num_mine > 0
        # the attacker rolls up to 3 dice (one army must stay behind),
        # the defender rolls up to 2
        x, y = min(3, num_mine-1), min(2, num_yours)
        me = sorted(dieroll() for i in range(x))
        you = sorted(dieroll() for i in range(y))
        # compare the highest remaining dice; ties go to the defender
        while len(me) > 0 and len(you) > 0:
            if me.pop() > you.pop():
                num_yours -= 1
            else:
                num_mine -= 1

def winchance(num_mine, num_yours, numtrials = 10**4):
    numwins = 0
    for t in range(numtrials):
        result = randbattle(num_mine, num_yours)
        assert result[0] == 1 or result[1] == 0
        result = result[0] - result[1]
        if result > 1:
            numwins += 1

    return numwins/float(numtrials)

def main():
    max_num_armies = 30
    z = zeros( (max_num_armies, max_num_armies) )
    for x in range(1,z.shape[0]):
        for y in range(1,z.shape[1]):
            z[x,y] = winchance(x,y, numtrials=3*10**2)

    z = flipud(z[1:,1:].T)
    rect = [1, max_num_armies, 1, max_num_armies]

    imshow(z, extent=rect, aspect='auto', interpolation='none')
    title("Probability of a Breakthrough",fontsize=18)
    xlabel("Initial number of attacking armies",fontsize=18)
    ylabel("Initial number of defending armies",fontsize=18)
    show()

main()



So, what do we see from these simulations? In each case, the simulations are “direct” implementations of our problem – they require just a little math to translate the original situations into computer algorithms. We don’t actually roll dice, but we do simulate dice rolls when needed.

The first and last examples make use of a library function called rand(). This is a random number generating “function” – every time we call it, it returns a different decimal number between 0 and 1. (It isn’t really a function in the mathematical sense, because mathematical functions always give the same output for the same input.) While each individual call is unpredictable, on average, the numbers returned by rand() are uniformly distributed across the [0,1] interval. Most other random things, like dice rolls and shuffles, can be simulated in terms of these random decimal numbers. We should actually call these numbers pseudo-random rather than random, because the outputs of rand() are generated by a deterministic algorithm, and random numbers, by definition, do not follow any algorithm. Most of the time, this distinction doesn’t matter too much – if it looks like a duck, and quacks like a duck, we might as well call it a duck – but there are a few situations, such as in cryptography applications, where the difference can be very important.
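We can see the pseudo-randomness for ourselves by fixing the generator’s internal state, or “seed”. Below is a quick sketch using numpy’s seed function (assuming numpy’s legacy random interface, as in the scripts above) – resetting the seed reproduces exactly the same stream of “random” numbers.

[Show code]
from numpy.random import seed, rand

seed(42)        # fix the generator's starting state
print(rand(3))  # three apparently random numbers

seed(42)        # reset to the same starting state ...
print(rand(3))  # ... and exactly the same three numbers appear again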

Another thing we may notice is that despite the large number of simulation runs we use, there is still uncertainty about the results. There is a fundamental limitation on all Monte Carlo simulation methods like the ones above – the error in their answers shrinks only in proportion to the inverse square root of the number of trials. Thus, to get an answer that is twice as accurate, we have to quadruple the number of trials. The gamblers of the 1600’s did not understand this precisely, but they knew it was hard to tell exactly how fair a game was by trial and error, and hoped that mathematical calculation would be a better way. Pascal, Fermat, Huygens, and Laplace obliged them.
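To make the convergence rate concrete: the standard error of an estimated win-fraction \(\hat{p}\) after \(N\) independent trials is approximately \(\sqrt{\hat{p}(1-\hat{p})/N}\). For Gombaud’s bet, with \(\hat{p} \approx 0.49\) and \(N = 10^{6}\), that is \(\sqrt{0.49 \times 0.51 / 10^{6}} \approx 0.0005\) – so even after a million bets, we can only trust the first three digits or so.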

The term “Monte Carlo method” was coined by Stanislaw Ulam and John von Neumann during their work on the Manhattan Project. Ulam and von Neumann were working with some of the first programmable computers, but the idea of calculation by simulation can be traced back farther. Edward Molina used human computers to do “throw-down” simulations of telephone traffic at Bell Labs in the 1920’s. And special problems, like Buffon’s needle problem, have histories of experiment-based calculation tracing even further back in time (Hall, 1872; Wolf, 1850).

A Baseball Model

Now, let’s consider a real modelling problem. Can we make a simulation for predicting the outcomes of a baseball game? The sport of baseball is particularly convenient to simulate because the games progress in well-defined discrete time-steps – pitch by pitch. Other popular sports like football and basketball have some discrete time steps (downs and possessions) but also a fair amount of continuous game play, while other sports like soccer are almost entirely continuous in time.

To see how we build the model, let’s imagine we are totally naive of the rules of baseball, learning as we watch our first baseball game. At the most basic level, baseball is a game between a pitcher throwing a ball and batters trying to hit that ball. The progression of the game depends on the pitches – each pitch moves the game forward.

Let’s suppose the game proceeds as follows. The pitcher throws the first pitch. The batter swings and misses. The pitcher throws another pitch. The batter swings and misses. The pitcher throws another pitch. The batter swings and misses. Then the batter sits down and a new batter comes up. Why? Nothing different happened between the first, second, and third pitches – each time, the events seemed to be the same. But everybody agreed that it was time for the first batter to sit down. Of course, the reason is that the rules of the game give you 3 tries to hit the ball, and after 3 tries, somebody else gets a turn.

In order to represent this, we introduce the auxiliary concept of strikes. The count of strikes helps us represent the state of the game right now without having to repeat all the events leading up to now. A player gets 3 swings at the ball – once they’ve had their 3 swings, if they haven’t hit the ball, they’re done.

But what if the batter doesn’t swing at the pitch? Well, then it depends on whether the pitch was a good pitch or not. If it was a good pitch, then the batter should have swung, even though they didn’t – we count that as a strike also. But if it was a bad pitch, that’s the pitcher’s fault, and we also want to penalize the pitcher for throwing bad pitches. Bad pitches are called balls, and we count those too. If a pitcher throws four balls, then the batter gets to walk to first base, even though they didn’t hit anything, because the pitcher isn’t giving them a good chance.

The pair of balls and strikes is called the “count” in baseball, and represents the batter’s state as time progresses. The count is not a physical, tangible part of the process – it is an abstract concept we create as a convenient way of keeping track of the history of the at-bat and the game so far. And it doesn’t keep track of the complete history of the at-bat. For example, the count cannot tell us the order in which the pitches were thrown. A count of 2 balls and 1 strike could have had the strike as the first pitch or the third pitch. Either way, we end up with the same (2,1) count.

This is a trick that is generally very useful – we can enlarge the state-space with extra abstract variables to account for the past history as well as the current situation.

Here is a drawing of the state-transition digraph for the first batter in a half-inning (not very pretty, as drawn here, but hopefully you can still follow the arrows).

Markov transitions for a batter


For our very simple half-inning model, the only source of randomness is how each pitch comes across the plate, and how the batter responds to the pitch. For the way we’ve broken things down, this is represented by two discrete probability distributions. The first distribution gives the chances of the outcomes of a pitch – ball, strike, foul, or the batter making solid contact. The second distribution gives the chances of the various outcomes when the batter makes contact – out, single, double, triple, or home run.

These two distributions are not “conventional” statistics used when describing baseball games, but crude estimates are available with a little searching. I found the following. In the major leagues, about 1/3rd of pitches are balls, 1/4 are strikes, 1/4 are fouls, and 1/6th are hit into the field. Of those pitches hit, about 70% are outs, 20% are singles, 6% are doubles, 1% are triples, and 3% are home runs.

history of hitting in the MLB


[Show code]
#!/usr/bin/env python3

from numpy.random import rand

class RandVariable():
    # a discrete random variable: rand() returns i with chance weights[i]/sum(weights)
    def __init__(self, weights):
        s = float(sum(weights))
        self.w = [ w/s for w in weights]
        self.n = len(self.w)
    def rand(self):
        rv = rand()
        i = self.n-1
        while 0 <= i and rv > self.w[i]:
            rv -= self.w[i]
            i -= 1
        assert 0 <= i
        return i

# ball, strike, contact, foul ball
pitchDistro = RandVariable([0.38, 0.154+0.087, 0.144+0.066, 0.169])

# use 0 to represent getting out
# " 1 is a single
# " 2 is a double
# " 3 is a triple
# " 4 is a home run
x = 0.066/0.21 # fraction of contacted pitches on which the batter reaches base
hitDistro = RandVariable([1-x,0.67*x,0.2*x,0.03*x,0.1*x])

class Batter:
    def __init__(self):
        self.hit = False
        self.walk = False
        self.out = False
        self.strikes = 0
        self.balls = 0
        print("\t\tNew batter")
    def __str__(self):
        if self.out:
            return "Strike-out"
        if self.walk:
            return "Walk"
        if self.hit:
            return "Contact"
        s = str((self.balls, self.strikes))
        return s
    def copy(self):
        o = self.__class__()
        o.hit = self.hit
        o.out = self.out
        o.walk = self.walk
        o.strikes = self.strikes
        o.balls = self.balls
        return o
    def pitchball(self):
        self.balls += 1
        if self.balls == 4:
            self.walk  = True
    def pitchstrike(self):
        self.strikes += 1
        if self.strikes == 3:
            self.out = True
    def pitchfoul(self):
        if self.strikes < 2:
            self.strikes += 1
    def pitchcontact(self):
        self.hit = True

    def neighbors(self):
        # enumerate the possible states after one more pitch
        if self.out or self.hit or self.walk:
            return
        for pitch in [ Batter.pitchball, Batter.pitchstrike, \
                Batter.pitchcontact, Batter.pitchfoul ]:
            o = self.copy()
            pitch(o)
            yield o

    def __next__(self):
        # simulate one pitch, drawn from the pitch-outcome distribution
        pitches = [ self.pitchball, self.pitchstrike, \
            self.pitchcontact, self.pitchfoul ]
        pitches[pitchDistro.rand()]()

    def atbat(self):
        while not (self.out or self.hit or self.walk):
            next(self)

class Field:
    def __init__(self):
        self.bases = [0, 0, 0, 0] # batter's staging slot, then 1st, 2nd, 3rd base
        self.runs = 0
        self.outs = 0
        self.verbose = True

    def __str__(self):
        s = "Runs:%d, Outs:%d, On base:"%(self.runs, self.outs)
        s = s + str(self.bases)
        return s

    def __next__(self):
        assert self.outs < 3
        batter = Batter()
        batter.atbat()
        if self.verbose:
            print("\t\t%s"%batter)
        if batter.out:
            self.outs += 1
        elif batter.walk:
            # the batter advances to the first open base;
            # if the bases are loaded, a run is forced in
            i = 1
            while True:
                if self.bases[i] == 0:
                    self.bases[i] = 1
                    break
                elif i == 3:
                    self.runs += 1
                    break
                i += 1
        elif batter.hit:
            h = hitDistro.rand()
            if h == 0:
                self.outs += 1
            else:
                # every runner advances h bases, and any runner
                # pushed past third base scores
                self.bases[0] = 1
                while h > 0:
                    self.bases = [0] + self.bases
                    self.runs += self.bases.pop()
                    h -= 1
        else:
            assert False, "should never be reached"

    def halfinning(self):
        while self.outs < 3:
            next(self)

def main():
    runs = [0, 0]
    print("Start of game, no score")
    for inning in range(1, 10):
        for team in [0, 1]:
            print("\tTeam taking the field")
            x = Field()
            x.halfinning()
            runs[team] += x.runs
            s = "\nInning: %d, Batting Team: %d"%(inning, team)
            s += "\nScore: %d to %d"%(runs[0], runs[1])
            print(s)

if __name__ == '__main__':
    main()

You might find some of the following interesting…


  1. Using Monte Carlo simulation, estimate the distribution of the maximum number of sequential heads in 100 flips of a fair coin. Perform at least 100,000 simulation runs. Present your results as a plot of the number of runs with fewer than \(n\) heads in a row, as a function of \(n\).

  2. Monte Carlo simulations can also be used to calculate quantities like the area of the unit circle, \(\pi\).

    1. If you draw a square circumscribing a unit circle, and then pick a point \((x,y)\) uniformly at random from that square, then according to the principle of indifference, the chance that the point is in the unit circle is \(\pi/4\). Write a python program that samples 10 million points from this square, counts how many are in the unit circle, and uses this to estimate \(\pi\). The Pythagorean theorem can be used to check whether a point is inside the unit circle. How many digits of accuracy does your estimate achieve?

    2. For comparison, use the formula \[\operatorname{arctan}{\left (\frac{\sqrt{3}}{3} \right )} = \frac{\pi}{6}\] and the Taylor series of \(\operatorname{arctan}\) around \(x=0\) to estimate \(\pi\). How many terms do you need to use to obtain the same accuracy as your Monte Carlo simulation?

    3. Which approach is a better way to calculate \(\pi\)?

  3. Imagine you start with an urn containing 2 stones, one black and one white. Reach into the urn, pick out a stone, then return the stone and a second stone of the same color as the one you picked out (so now there are 3 stones in the urn). Repeat this over and over until there are 1,000 stones in the urn.

    1. What do you expect to have happened to the stones in the urn? Will most of the stones in the urn be the same color? Or do you think it’s more likely that the ratio of white to black stones will be 1 to 1? Or something different?

    2. What should the empirical PDF and CDF of the outcome look like under your prediction for part a?

    3. Simulate 10,000 runs of this process, and plot the empirical PDF and CDF.

    4. Discuss.

  4. Poisson’s apocryphal problem of estimating the number of deer in the King’s forest (references to Paxton, Good, and Fisher). The gamekeeper can recognize and distinguish all the deer after he sees them once.

  5. Use our baseball model to answer the following questions:

    1. Modify this code to draw a scatter-plot of the scores for 100 games.

    2. What is the average margin of victory in these games?

  6. In a previous homework problem, we attempted to model the number of cases of measles in a household using a binomial distribution, assuming each case was independent of the other cases, and got poor fits to the data. Greenwood and Yule proposed a better model in 1920. In this model, disease transmission is broken into discrete generations, and the number of infectious people in each generation is a binomial random sample from the currently susceptible people. People can only get the disease once, so people sick this week are recovered and resistant to re-infection next week.

    1. Suppose a family has 1 sick kid with measles and 2 susceptible kids. What are the 3 possible states next week and their probabilities?

    2. What are the family’s 4 possible states the second week, and their probabilities?

    3. What are the probabilities that when the outbreak ends, the family had a total of 1, 2, or 3 cases of measles?

    4. How do these results compare with Wilson’s data on the observed frequencies of different outbreak sizes? which wilson data?

( previous, home, next )