Further Reading

Radiometric dating

Suppose you want to know how long ago something happened. For example, a forensic analyst might want to know how long ago a dead person ate breakfast. Or an astrophysicist might want to know the age of the solar system. Or an archeologist might want to now how old the dead-sea scrolls are, or how long ago did we evolve a desire for art. How do we tell?

Well, we can build a mathematical model, fit it to some data, and get out an estimate. Our model will consist of some timing process under which a system’s state evolves from some starting point to an end point. The end point will be measured to provide the data. Some secondary source of information will be used to estimate the system’s starting point. Then, based on the timing process, we can estimate how much time between the starting point and the end point.

This basic process applies to dendrochronology (tree rings), chronostratigraphy (dating by rock layers), genetic clocks, magneto and perhaps most famously to radiometric dating. The radioactive decay was discovered by Henri Becquerel and subsequently studied in the research of Marie and Peire Curie, Ernest Rutherford, and Hans Geiger. In 1896, Becquerel discovered that uranium salts could expose photographic paper that was still wrapped in light-proof paper. Further research with Marie and Peire Curie lead to the isolation of radioactive elements thorium, radium, and polonium. The cause of this penetrating radiation was eventually determined by Rutherford and Geiger to be due to the decay of atoms through the fragmenting of their nuclei.

Radioactive decay processes are useful in dating objects because they appear to be very regular. We can measure the amount of isotope left at the end using a Geiger counter or related method. The one challenge is knowing the initial isotope levels when the object of interest was made. For short-term experiments involving \(P^{32}\), initial levels can be measured directly. For dating problems involving carbon-14, we can not go back in time to measure initial isotope levels. But we believe that cosmic radiation processes have kept the atmosphere’s carbon isotope ratios relatively constant over time, and so any departures of these ratios after the creation of an object will be due to radioactive decay. There can, of course, be complications. For example, atmospheric nuclear weapons tests during the 1950’s and 1960’s dramatically increase the amount of carbon 14 in the atmosphere. While such events can complicate some dating processes, they can also be exploited to estimate rates of carbon turnover in parts of the human body.

Continuous-time Markov chains

Before proceeding with our analysis, we need to introduce the idea of a continuous-time Markov chain. Imagine that we have a Markov chain defined in time steps of \(\Delta t\), \[\begin{gather*} p(t+\Delta t) = A(\Delta t) p(t). \end{gather*}\] The individual transition probabilities \(a _ {ji}(\Delta t)\) of \(A\) depend on the time step, but as always, must sum to unity: \[1 = \sum _ j a _ {ji}(\Delta t).\] Now, suppose we start shrinking our time-step. How do we expect the transition-probabilities to change? Well, intuition suggests that as the time-step becomes smaller, it is less likely that the chain will change state, so the probability of staying in the same state will increase, with the probabilities of moving to a different state will decrease, so \[\begin{gather*} \lim_{\Delta t \rightarrow 0} a_{ii}(\Delta t) = 1, \\ \lim_{\Delta t \rightarrow 0} a_{ji}(\Delta t) = 0, \quad j \neq i. \end{gather*}\] Let’s suppose that the off-diagonal entries become linear in the time-step-size as it shrinks, so \(a _ {ji}(\Delta t) \approx m _ {ji} \Delta t\). Since the columns must sum to unity, this also implies \(a _ {ii}(\Delta t) = 1 - \Delta t \sum _ {j} m _ {ji}\). Rewriting this in matrix form, we have for small \(\Delta t\), \[\begin{gather*} p(t + \Delta t) = (I + M \Delta t) p(t) \end{gather*}\] where the off-diagonal entries of \(M\) are non-negative, the diagonal entries of \(M\) are non-positive, and the columns of \(M\) sum to zero. We call \(M\) the transition-rate matrix of the chain. If we now re-arrange and take the limit as \(\Delta t \rightarrow 0\), we recover a first-order linear differential equation system \[\begin{gather*} \frac{dp}{dt} = M p. \end{gather*}\]

Exponential decay equations

How long atoms are around is measured by “half-life” and decay rate. Phosphorus-32 (\(P^{32}\)) has a halflife of about 14.29 days. Carbon-14 (\(C^{14}\)) has a halflife of about 5,730 years. Uranium-235 has a halflife of about 700 million years.

Suppose we start with \(n = 1\) atom of \(P^{32}\). What is the chance it is still here after 28.6 days?

1-atom model and dt dependence

\[\begin{align*} \frac{dp_0}{dt} &= r p_1, \\ \frac{dp_1}{dt} &=-r p_1. \end{align*}\]

\[\begin{align*} p_0(t) &= 1 - e^{-rt} \\ p_1(t) &= e^{-rt} \end{align*}\]

Since \(2 p _ 0 = p _ 0 e^{r t _ {1/2}}\), then \(r = (\ln 2)/ t _ {1/2} \approx 0.0485\)

Show, suppose instead of starting with a single atom, we start with a large number of atoms \(N\).

\[\begin{align*} \frac{d p_0}{d t} &= r p_1 \\ \frac{d p_k}{d t} &= - r k p_k + r (k+1) p_{k+1}, \quad k = 1 \ldots \infty \end{align*}\] This infinite system of linear differential equations is known as the master equation for the process. Generally, closed-form solutions of the the master equation can be hard to find. But if we start with exactly \(N\) atoms at time \(t=0\), then the exact solution of this system of equations is \[p _ k(t) = {{}_{N}C_{k}} e^{-krt} (1-e^{-rt})^{N-k}\] with \(p _ k(t) = 0\) for all \(k > N\). This is a binomial distribution, where the expected value \[\text{E}( k ) = N e^{-rt}\] while the variance \[\text{Var}(k) = N e^{-rt} (1-e^{-rt}).\] We can check this solution by direct substitution back into the master equation.

To find the most likely time at which we might observe \(k\) atoms to remain, we can treat \(p _ k(t)\) as a likelihood function of \(t\). The most likely time will be the one that maximizes \(t\). By differentiation, we find the maximum likelihood estimate (MLE) \[\begin{gather*} t_{MLE} = \frac{1}{r} \ln\left( \frac{N}{k} \right) \end{gather*}\] The precision of this estimate can be approximately expressed in terms of the Fisher information. The Fisher information is \(I(\theta) = - \partial^2 _ {\theta} \ln \mathscr{L}(\theta)\) and 95 % confidence intervals under the Normal approximation are \[\begin{gather*} t _ {MLE} \pm \frac{1.96}{\sqrt{I(t _ {MLE})}} = \frac{1}{r} \left[ \ln\left( \frac{N}{k} \right) \pm c \sqrt{\frac{1}{k} - \frac{1}{N}} \right] \end{gather*}\]

Toy estimate of elapsed time for P^{32} decay
Toy estimate of elapsed time for \(P^{32}\) decay


  1. In 1910, Ernest Rutherford (Nobel Chemist) and Hans Geiger (creator of the Geiger counter) published a short study on radioactive decay based on the observation of \(\alpha\)-particles. \(\alpha\)-particles are pieces of an atomic nucleus released during radioactive decay. They are made of 2 protons and 2 neutrons held together by the strong force – essentially the same as a helium atom’s nucleus. Because the mass of an \(\alpha\)-particle is similar to that of other small atoms, they can have big effects on those atoms when moving at high speed. One of these effects is called “scintillation”, in which an \(\alpha\)-particle causes the molecule it strikes to emit a burst of light. This light can be recorded in a photograph, allowing us to “see” \(\alpha\)-particles. Show that the \(\alpha\)-particle data below collected by Rutherford and Geiger can be well-explained by a Poisson distribution and find the most likely intensity \(\lambda\). What does this imply about the creation of \(\alpha\)-particles?

[ Data : hide , shown as table , shown as CSV shown as QR ]

# Count, Number of times observed
# Counts of numbers of alpha particles occurring
# in 1/8 of a minute intervals
# @article{bib:Rutherford1910,
#   title={LXXVI.The probability variations in the distribution of α particles},
#   volume={20},
#   ISSN={1941-5990},
#   url={},
#   DOI={10.1080/14786441008636955},
#   number={118},
#   journal={Philosophical Magazine Series 6},
#   publisher={Informa UK Limited},
#   author={Rutherford, E. and Geiger, H. and Bateman, H.},
#   year={1910},
#   month={Oct},
#   pages={698-707},
# }

  1. In 1920, Agner Erlang, telephone engineer and pioneer of queing theory, presented a lecture including the following data for duration of calls at the main Copenhagen Telephone exchange as an example of how well practical data was conforming to his conceptual theories. Find the best-fitting exponential distribution to this data using linear least squares and an appropriate transformation of the empirical cumulative distribution function.

[ Data : hide , shown as table , shown as CSV shown as QR ]

# Duration of calls at the Copenhagen Main Exchange in 1916
# binned based on 10-second intervals.
# Data extracted from Table 1 of A. K. Erlang's 1920 lecture paper
# "The application of the theory of probabilities in telephone administration
# Extracted by T. Reluga, March, 2017
# Columns
# 1. duration of call measured in seconds
# 2. fraction of all calls with this duration or up to 10 seconds less
  1. If telephone calls arrive at a rate of 1 per minute and last an average of 8 minutes, how many phone lines would you need from a neighborhood to make sure the system is maxed out no more than 1 % of the time?

  2. Suppose a Poisson process is used to describe passing by a person attempting to cross a one-way street, where \(\lambda\) per second is the expected number of cars per minute, and the person needs \(T\) seconds to cross the street. Find an expression for how long the person should expect to wait before being able to cross.

  3. When the number of subscribers is relatively small, this is not a good approximation because as more calls are placed, the number of subscribers who might place a new call decreases. In 1918, Tore Engset published an analysis of this problem.

    1. Assuming there are \(S\) subscribers and \(N\) lines, that the hazard rate for each individal subscriber to place a call is \(a\) and that the hazard rate for each individual call to terminate is \(r\), and that blocked calls are lost, draw a digraph of the continuous- time Markov chain for the number of lines in use at a given time.

    2. Construct the master equation for this Markov chain.

    3. At steady-state, solve for \(p_1\) in terms of \(p_0\). Then solve for \(p_2\) in terms of \(p_1\), and so on. Find a pattern for the \(p_j\)’s term.

    4. Under what conditions will Engset’s solution and Erlang’s solution approximately agree?

  4. Every day, around 100,000 kilograms of meteoroids strike the earth, mostly in the form of stellar dust. That might seem like a lot, but not when compared to the \(6 \times 10^{24}\) kilogram mass of the earth – it’s adding less than \(10^{-14}\) percent each year. And most of these meteorites are so small, we won’t see them. There are also a few large meteors, technically called “bolides”, that enter our atmosphere and create impressive fireballs (another video compilation). Thanks to a US government program to watch out for intercontinental ballistic missiles, NASA’s Jet Propulsion Laboratory keeps a very nice database of large meteors. We’d like to model this data.

    1. Use this script to retrieve a data set of dates and times of observations of meteor arrivals.

    2. It’s always a good idea to plot a data set, if we can. Our eyes can sometimes help us pick out important features in the data quickly. Make a shot-plot for 1995, where we plot date as the independent variable and mark the date of each meteor arrival with a spike.

    3. Make a rank-arrival plot of the data from 1995 to 2005, where the independent variable is the rank of the arrival time (the first meteor is rank 1, the second to arrive is 2, …) against the actual arrival time. If that curve is a perfectly straight line, that means the meteors are arriving at a constant rate with a regular period in between. If the curve is a little irregular but still mostly straight, that means the time between arrivals has some randomness, but the average arrival rate is constant. If this plot was more of steps or a zig-zag, that would mean meteors were arriving in clusters. If the plot is concave or convex, that would suggest a changing rate of arrival.

    4. Based on the total number of meteor arrivals from 1995 to 2005, what was the average arrival rate (per day)?

    5. Plot a histogram of the number of meteors arriving every 45 days over the same 10 years.

    6. Fit a Poisson distribution to this histogram and estimate the Poisson distribution’s intensity parameter \(\lambda\). Plot your your histogram and your fit Poisson distribution on top of each other so we can compare them.

    7. Make a probability plot comparing your fit to your histogram. (If the fit is good, this should be close to the line \(y=x\).

    8. If the meteors are really arriving according to a Poisson process, the times between meteor arrivals should be exponentially distributed. Are they?

  5. (Open) Find a data set of telephone traffic from the 1908 era. A telegraph data set from then or earlier would also be interesting.