Lecture 38: Decision processes (4 lectures)


Continuous time Markov chain equations

\[p(t+1) = A p(t)\]

\[p(t+dt) = A(dt) p(t)\]

Expanding each component \(a _ {ij}(dt)\) of the Markov matrix \(A(dt)\) as a Maclaurin series in \(dt\), we find \(A(dt) = I + Q dt + O(dt^2)\). Substituting and taking the limit \(dt \to 0\) gives

\[\frac{dp}{dt} = Q p\]

The transition rate matrix \(\mathbf{Q}\) is singular, has columns that sum to zero, non-positive diagonal elements and non-negative off-diagonal elements.
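These properties can be checked numerically on a small example. The sketch below uses a made-up two-state rate matrix (the rates 0.3 and 0.1 are assumptions chosen only for illustration), confirms that its columns sum to zero, and steps \(p(t+dt) = (I + Q dt) p(t)\) forward with a small \(dt\). It is a forward-Euler approximation, not an exact solution.

```python
# Hypothetical 2-state chain: state 0 -> 1 at rate 0.3, state 1 -> 0 at rate 0.1.
# Column j of Q holds the rates out of state j, so each column sums to zero.
Q = [[-0.3, 0.1],
     [0.3, -0.1]]

def matvec(M, x):
    """Multiply matrix M (a list of rows) by the column vector x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def euler_step(p, dt):
    """One step of p(t+dt) = (I + Q dt) p(t)."""
    dp = matvec(Q, p)
    return [p[i] + dt * dp[i] for i in range(len(p))]

column_sums = [sum(Q[i][j] for i in range(2)) for j in range(2)]

p = [1.0, 0.0]              # start in state 0 with certainty
dt = 0.001
for _ in range(100_000):    # integrate out to t = 100
    p = euler_step(p, dt)

print(column_sums)          # both zero
print(p, sum(p))            # total probability stays 1
```

Because the columns of \(Q\) sum to zero, each Euler step preserves total probability, and the iterates settle onto a vector in the null space of \(Q\), which exists because \(Q\) is singular.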

Valuing an uncertain future

In discrete time, \[\vec{v}^T(t-1) = \vec{1}^T ( \mathbf{P} \circ \mathbf{A} ) + \vec{v}^T(t) \mathbf{A} \theta\]

But implicit in this equation is the assumption of a time-step length, conveniently chosen to be 1. In fact, each of the components is implicitly chosen with respect to some time-step length \(dt\):

\[\vec{v}^T(t-dt) = \vec{1}^T ( \mathbf{P}(dt) \circ \mathbf{A}(dt) ) + \vec{v}^T(t) \mathbf{A}(dt) \theta(dt)\]

Now, suppose we take infinitesimally small time steps, and let

\(\theta(dt) \approx 1 - h dt\), \(\mathbf{P}(dt) = \mathbf{F} + \mathbf{P}' dt\), and \(\mathbf{A}(dt) = \mathbf{I} + \mathbf{Q} dt\).

\[\vec{v}^T(t-dt) = \vec{1}^T \left( (\mathbf{F} + \mathbf{P}' dt) \circ (\mathbf{I} + \mathbf{Q} dt) \right) + (1-h dt) \vec{v}^T(t) ( \mathbf{I} + \mathbf{Q} dt)\]

\[\vec{v}^T(t-dt) = \vec{1}^T \left( \mathbf{F} \circ \mathbf{I} + \mathbf{P}' \circ \mathbf{I} dt + \mathbf{F} \circ \mathbf{Q} dt + \mathbf{P}' \circ \mathbf{Q} dt^2 \right) + \vec{v}^T(t) ( \mathbf{I} + \mathbf{Q} dt - h \mathbf{I} dt - h \mathbf{Q} dt^2 )\]

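For the limit \(dt \to 0\) to be well defined, the term that does not scale with \(dt\) must vanish: we must have \(\mathbf{F} \circ \mathbf{I} = \mathbf{0}\), or equivalently \(\operatorname{diag}(\mathbf{F}) = \vec{0}\). In other words, the return per step for staying in the same state must approach 0 as the steps get smaller. Given this, and dropping the \(dt^2\) terms,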

\[\vec{v}^T(t-dt) = \vec{1}^T \left( \mathbf{P}' \circ \mathbf{I} + \mathbf{F} \circ \mathbf{Q} \right) dt + \vec{v}^T(t) + \vec{v}^T(t) (\mathbf{Q} - h \mathbf{I})dt\]

Finally, taking \(\vec{u}^T = \vec{1}^T (\mathbf{P}' \circ \mathbf{I})\), this yields the system of linear inhomogeneous ordinary differential equations \[-\frac{d\vec{v}^T}{dt} = \vec{u}^T + \vec{1}^T \left( \mathbf{F} \circ \mathbf{Q} \right) + \vec{v}^T \left( \mathbf{Q} - h \mathbf{I} \right)\] with terminal condition \(\vec{v}^T(t_f)\) known for some final time \(t _ f\) in the future (or some related terminal-time condition).

When this equation is autonomous and we have a converged terminal condition (\(d\vec{v}/dt = 0\)), we can study the steady state: \[\vec{v}^T = \left( \vec{u}^T + \vec{1}^T \left( \mathbf{F} \circ \mathbf{Q} \right) \right) \left( h \mathbf{I} - \mathbf{Q} \right)^{-1}\]

Application to vaccine decisions

We take our standard valuation equation, with all our parameters time-independent. For a simple immunizing infection, we get the value equations \[[ v_S, v_R ] = \left( [ 1 , 1] + [1,1] \left( \begin{bmatrix} 0 & 0 \\ -c_I & 0 \end{bmatrix} \circ \begin{bmatrix} -\lambda - \mu & 0 \\ \lambda & -\mu \end{bmatrix} \right) \right) \left( \begin{bmatrix} h & 0 \\ 0 & h \end{bmatrix} - \begin{bmatrix} -\lambda - \mu & 0 \\ \lambda & -\mu \end{bmatrix} \right)^{-1}\]

\[\begin{align*}[ v_S, v_R ] &= [ 1 - \lambda c_I, 1] \begin{bmatrix} \lambda + \mu + h & 0 \\ -\lambda & \mu+h \end{bmatrix}^{-1} \\ &= [ 1 - \lambda c_I, 1] \begin{bmatrix}\frac{1}{h + \lambda + \mu} & 0\\\frac{\lambda}{\left(h + \mu\right) \left(h + \lambda + \mu\right)} & \frac{1}{h + \mu}\end{bmatrix} \end{align*}\]

Taking \(h=0\) and finishing our multiplications, \[[v_S, v_R ] = \begin{bmatrix}\frac{1}{\mu} - \frac{c_I \lambda}{\lambda + \mu} & \frac{1}{\mu}\end{bmatrix}\]

The gap \(v_R - v_S = c_I \lambda / (\lambda + \mu)\) is the expected cost of the disease. If the expected cost from the vaccine is less than this, then you should get the vaccine as soon as you can.
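The closed form can be checked numerically against the steady-state formula \(\vec{v}^T = ( \vec{u}^T + \vec{1}^T ( \mathbf{F} \circ \mathbf{Q} ) ) ( h \mathbf{I} - \mathbf{Q} )^{-1}\). The sketch below uses made-up values for \(\lambda\), \(\mu\), and \(c_I\) (assumptions for illustration only) and solves the 2-by-2 lower-triangular system by hand rather than calling a linear algebra library.

```python
# Illustrative parameter values (assumptions, not from the lecture).
lam, mu, c_I, h = 0.2, 0.05, 1.5, 0.0

F = [[0.0, 0.0], [-c_I, 0.0]]        # one-time payoff -c_I on the S -> R transition
Q = [[-lam - mu, 0.0], [lam, -mu]]   # transition rates between S and R
u = [1.0, 1.0]                       # payoff rate in S and in R

# Row vector b = u^T + 1^T (F o Q), where o is the elementwise product.
b = [u[j] + sum(F[i][j] * Q[i][j] for i in range(2)) for j in range(2)]

# M = h I - Q is lower triangular, so v^T M = b can be solved by substitution.
M = [[h - Q[0][0], -Q[0][1]],
     [-Q[1][0], h - Q[1][1]]]
v_R = b[1] / M[1][1]
v_S = (b[0] - v_R * M[1][0]) / M[0][0]

# Closed form for h = 0 from the text above.
v_S_closed = 1.0 / mu - c_I * lam / (lam + mu)
v_R_closed = 1.0 / mu

print(v_S, v_S_closed)
print(v_R, v_R_closed)
```

Changing `h` to a positive value in the first line gives the discounted valuations from the same linear solve; only the closed-form comparison is specific to \(h = 0\).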

What about when \(h > 0\)?
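As a sketch of the answer, carrying out the same row-vector-times-inverse multiplication with \(h\) left general gives

\[\begin{align*} v_S &= \frac{1 - \lambda c_I}{h + \lambda + \mu} + \frac{\lambda}{(h + \mu)(h + \lambda + \mu)} = \frac{1}{h + \mu} - \frac{c_I \lambda}{h + \lambda + \mu}, \\ v_R &= \frac{1}{h + \mu}. \end{align*}\]

So a positive \(h\) shrinks both the value of the remaining lifetime, \(1/(h+\mu)\), and the expected cost of the disease, \(v_R - v_S = c_I \lambda / (h + \lambda + \mu)\). Setting \(h = 0\) recovers the expressions above.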

Zika and age-dependent disease costs

Set up a 3-compartment SIR system for an individual, with costs that depend in a general way on how old you are when you get sick, and terminal conditions \(V _ S(a_{max}) = V _ I(a_{max}) = V _ R(a _ {max})\). Then work back to age-independent costs and constant infection risk for blackboard solutions. Then consider a piecewise jump in infection cost from low to high and construct the solution in piecewise form.

Finally, consider how changes in the risk of infection change these valuations. Greater infection risks reduce expected costs before the critical age, but increase expected costs after the critical age.

\[-\frac{d v_ s}{da} = \lambda ( -c _ i(a) - v _ s) - h v _ s\]

Take \(v _ s(80)=0\), \(c _ i(a) = H(a - 40)\) with \(H\) the Heaviside step function, and \(h=0\), then plot solutions for different values of \(\lambda\).

For a cost that jumps from 0 to a constant \(c_i\) at age \(a_{\text{mid}}\), with age rescaled so that the terminal condition is \(v_s(1) = 0\), the solution is

\[\begin{gather} v_s(a) = \frac{\lambda c_i }{\lambda + h} \left[ e^{(a-1) (\lambda + h)} - e^{(a- a_{\text{mid}}) (\lambda + h)} H(a_{\text{mid}}-a) - H(a- a _ {\text{mid}}) \right] \end{gather}\]

(see code/zika_age_model.py)


Given smooth, time-dependent functions for cost of infection, risk of infection, and a terminal condition, calculate the value function numerically as a function of age.
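One way to approach this is to integrate the value equation backward from the terminal age. The sketch below is a minimal version using a hand-written classical Runge-Kutta (RK4) step with a negative step size; the function name `value_function`, the logistic cost curve, and all parameter values are assumptions made for illustration and are not taken from the lecture's own code.

```python
import math

def value_function(cost, risk, h, a_max, v_end, a_min=0.0, n_steps=2000):
    """Integrate dv/da = risk(a) * (cost(a) + v) + h * v backward from a_max.

    This is the equation -dv/da = risk * (-cost - v) - h * v rearranged.
    Returns (ages, values), both ordered from a_min up to a_max.
    """
    def rhs(a, v):
        return risk(a) * (cost(a) + v) + h * v

    da = (a_max - a_min) / n_steps
    a, v = a_max, v_end
    ages, values = [a], [v]
    for k in range(n_steps):
        # Classical RK4 with step -da (moving toward younger ages).
        k1 = rhs(a, v)
        k2 = rhs(a - da / 2, v - da / 2 * k1)
        k3 = rhs(a - da / 2, v - da / 2 * k2)
        k4 = rhs(a - da, v - da * k3)
        v -= da / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        a = a_max - (k + 1) * da
        ages.append(a)
        values.append(v)
    return ages[::-1], values[::-1]

# Hypothetical smooth inputs: cost rises around age 40, infection risk is constant.
smooth_cost = lambda a: 1.0 / (1.0 + math.exp(-(a - 40.0) / 2.0))
constant_risk = lambda a: 0.05

ages, values = value_function(smooth_cost, constant_risk, h=0.0, a_max=80.0, v_end=0.0)
print(ages[0], values[0])    # value at age 0
print(ages[-1], values[-1])  # terminal condition at age 80
```

With a constant cost and constant risk the result can be compared against the exact solution \(v_s(a) = \frac{\lambda c_i}{\lambda + h} \left[ e^{(a - a_{max})(\lambda + h)} - 1 \right]\), which is a quick sanity check before trying age-dependent inputs.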

European Call Option pricing with Black Scholes

(see paper notes)

Historical example

Stock prices are unknown

They fluctuate randomly up and down over time, so maybe we can describe them with a stochastic process, or more specifically a Markov process? Einstein and Bachelier showed how to connect Brownian motion and financial volatility to the heat equation:

\[\dot{T} = \kappa \nabla^2 T\]

Suppose the price of a stock diffuses with rate \(D\), but also drifts upward at rate \(r\). Then the probability of observing the stock with price \(s\) at time \(t\) is \(p(t,s) ds\) where \(p\) solves the partial differential equation \[\frac{\partial p}{\partial t} = D \frac{\partial^2 p}{\partial s^2} - r \frac{\partial p}{\partial s}.\]
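A quick way to build intuition for this equation is to simulate many random price paths and look at their spread. The sketch below assumes the price follows \(ds = r\,dt + \sqrt{2D}\,dW\), the stochastic process whose density solves the equation above, and steps it with a simple Euler-Maruyama scheme. The function name and all parameter values are made up for illustration.

```python
import math
import random

def simulate_prices(s0, r, D, t_end, n_steps=100, n_paths=5000, seed=0):
    """Simulate ds = r dt + sqrt(2 D) dW and return the prices at t_end."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    dt = t_end / n_steps
    noise_scale = math.sqrt(2.0 * D * dt)
    finals = []
    for _path in range(n_paths):
        s = s0
        for _step in range(n_steps):
            s += r * dt + noise_scale * rng.gauss(0.0, 1.0)
        finals.append(s)
    return finals

finals = simulate_prices(s0=10.0, r=0.3, D=0.5, t_end=1.0)
mean = sum(finals) / len(finals)
var = sum((x - mean) ** 2 for x in finals) / (len(finals) - 1)

# The density p(t, s) is Gaussian with mean s0 + r t and variance 2 D t,
# so the sample statistics should land near 10.3 and 1.0 respectively.
print(mean, var)
```

Note that this constant-coefficient model lets prices go negative, which is one motivation for the multiplicative \(s\) and \(s^2\) factors in the option-pricing equation.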

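The Black-Scholes equation for the option value \(v(t,s)\), with volatility \(\sigma\) and risk-free interest rate \(r\), is \[-\frac{\partial v}{\partial t} = \frac{1}{2}\sigma^2 s^2 \frac{\partial^2 v}{\partial s^2} + r s \frac{\partial v}{\partial s} - r v\]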
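For a European call, the Black-Scholes equation with terminal condition \(v = \max(s - K, 0)\) at expiry has a well-known closed-form solution. The sketch below evaluates it using only the standard library. The strike \(K\) (`strike`) and the time remaining to expiry (`tau`) are symbols introduced here for illustration and do not appear in the notes above.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(s, strike, r, sigma, tau):
    """Black-Scholes value of a European call with time tau left to expiry."""
    if tau <= 0:
        return max(s - strike, 0.0)   # payoff at expiry
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)

# Example: at-the-money call, 5% interest rate, 20% volatility, one year to expiry.
print(call_price(s=100.0, strike=100.0, r=0.05, sigma=0.2, tau=1.0))
```

As `tau` shrinks to zero the price approaches the payoff \(\max(s - K, 0)\), which plays the same role here as the terminal condition \(\vec{v}^T(t_f)\) in the Markov chain valuation equations.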