Introduction to Percolation Theory by Dietrich Stauffer and Amnon Aharony (1994) is an excellent introduction from a theoretical physics perspective.
The concept of "space" is a rather flexible one in mathematics. We call many things spaces: vector spaces, Euclidean spaces, Polish spaces, Banach spaces, Hilbert spaces, ... all of which share some aspect of our intuition about what "space" "is". Colloquially and intuitively, the most prominent aspect of space is our ability to move in it. Space is effectively one-dimensional for a car in a tunnel: it can only move forward or backward. We live most of our lives walking in two dimensions, able to move forward and backward and also to turn left and right. Nature itself is concretely three-dimensional. Many animals like fish and birds exploit this, and we ourselves have invented devices from ramps and stools to roller coasters and rockets that allow us to enjoy and exploit this third dimension.
When we build a model, we have to decide if we want to include space. In our pharmacokinetics models, when we used different compartments to represent different organs like the stomach and liver, we were actually making a spatial model of the body. In that case, the use of "space" was implicit: we didn't use spatial coordinates, and some compartments like blood have a very diffuse spatial position. In our classical mechanics models of ballistic motion and the pendulum, position and velocity were dependent variables of time, so space was again part of the state of the systems we were studying, but again in a rather implicit manner.
Explicit spatial models, or simply "spatial models" for short, are models where spatial location is an independent variable, and system state is a function of spatial location, either discretely or continuously, as well as time. A good example of how we do this is found in cellular automata models. A cellular automata model treats space discretely, like a tiled floor. Suppose you stand on the floor and occupy one tile, or "cell". From your cell, you can take a step into each of the neighboring cells, and by repeatedly stepping from one cell to the next, you can move long distances.
But exactly how you move depends on the layout of tiles on the floor. You will have different numbers of neighbors to choose from, depending on the layout. We do not worry about this in real life, since the idea of moving from tile to tile is only a rough approximation of real movement. But when building a cellular automata model, these details are crucial.
In theory, any graph of nodes and edges can be used as a discrete explicit representation of space, with each node representing a location, and each edge representing the possibility of movement between the nodes it connects. But in practice, cellular automata models almost always represent locations as elements of a rectangular array that can be accessed using Cartesian coordinates. Once we have locations, we next need a rule that defines each location's "neighbors" for movement. There are four conventional ways to define neighborhoods in a rectangular array. Each of these can be described with a "stencil" connecting each location node to each of its "neighboring" nodes.
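As a concrete illustration, a stencil can be encoded as a list of coordinate offsets. This is only a sketch: the names VON_NEUMANN, MOORE, and neighbors below are our own, and only two of the four conventional neighborhoods are shown.

```python
# A sketch of neighborhood "stencils" as coordinate offsets.
# The names here are illustrative, not from any standard library.
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # 4 neighbors
MOORE = VON_NEUMANN + [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # 8 neighbors

def neighbors(x, y, stencil, n):
    """In-bounds neighbors of cell (x, y) in an n-by-n array."""
    return [(x + dx, y + dy) for dx, dy in stencil
            if 0 <= x + dx < n and 0 <= y + dy < n]
```

Note that cells on the edge of the array have fewer in-bounds neighbors than interior cells, a boundary effect every cellular automata model must decide how to handle.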
Some of the mathematical properties of these lattices will be explored in the Exercises below.
During summers as a little kid, my friends and I liked to build sand castles at the beach. We would sometimes decorate the castles by putting little dribble towers on the top. We did this using a plastic funnel that would drip out sand and water from the bottom. However, getting the funnel to work right was a little tricky. If the sand I used was too fine and closely packed, the water would not drain down from the top through the funnel and no sand or water would come out the bottom.
The funnel of sand and water is one small example of a very general question: how does liquid flow through a porous medium? Answers to this question can be useful for fields ranging from the design of sponges and insulation to well water and oil and gas extraction. Answers can even help us understand the spread of malware over the internet and the Ebola virus in Africa. But in order to make sense of flows through porous media, we have to develop new kinds of models.
Percolation is the process by which a fluid or similar substance moves through a porous medium. Hot water, for instance, is percolated through ground beans to make coffee. We call a medium "permeable" if the fluid can move through it, and "impermeable" if it can not. The permeability of the medium depends on the properties of the medium. If the medium is too dense and tightly packed, the fluid cannot get through, but if the medium is loose and light, then the fluid can pass through. When talking about rubber foams, this condition distinguishes closed-cell and open-cell foams respectively. What we'd like to understand is how the small-scale structure of the medium, the number, size, and arrangement of its gaps and fills, relates to the permeability we observe at large scales. In one case, you get a sponge. In the other, you get a life preserver. This may not seem so interesting at first, but the math turns out to be surprising and instructive, and will also give us a chance to exercise our concepts of space described above.
In any real setting, permeability depends on the chemical and physical composition of the medium, the viscosity of the fluid, temperature, pressure, and even erosion and contamination. But the core concepts of percolation can be qualitatively captured in a very simple and illuminating discrete spatial model. Imagine we want to make a simple model of a porous sponge. Instead of dealing with all three dimensions, suppose we take a planar cross-section of the sponge and look at the connectivity of that. Start with a rectangular array of cells. Each cell corresponds to a node connected to other cells according to some neighborhood. Each cell is colored black or white randomly. Let \(p\) be the probability that a given site is a void in the medium (marked white) while \(1-p\) is the probability that a given site is filled and marked black. The variable \(p\) is called the "void fraction", and \(1-p\) the "fill fraction". When the state of each site is independent of every other site's, such an array can be generated with just a few lines of python code.
import numpy as np
import matplotlib.pyplot as plt

p = 0.9  # void fraction
n = 16   # side length of the array
A = np.ceil(np.random.rand(n, n) - p)  # 1 (filled) with probability 1-p
plt.spy(A)
plt.show()
We can use this code to create a menagerie of random media by tuning the void fraction \(p\) and the size \(n\). Let's start by looking at some examples of random arrays. Starting at the top, can we move through the white void space and reach the bottom?
[Figure: sample random arrays for several void fractions \(p\).]
Well, if \(p\) is close to 1, it is easy: there is lots of open white space to move through. And for \(p\) close to zero, the answer is clearly no. But for values of \(p\) in the middle, things are a little trickier. What directions can we move in? Can we move diagonally? At this point, we have to make a modelling assumption about movement and which voids are neighbors. Let us propose we can only travel left, right, up, or down, in a von Neumann neighborhood around each site. Then in the \(p=0.5\) example, we can make it from top to bottom, but in the \(p=0.4\) example, we can not.
We can compute how far down percolation goes (the "depth") by hand, but that can get hard when \(n\) gets large. It is better to compute the depth algorithmically.
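One way to do this is a breadth-first flood fill starting from the void cells in the top row. The sketch below assumes the conventions used so far: entries of 0 are voids, entries of 1 are filled, and movement follows the von Neumann neighborhood; the function name percolation_depth is our own.

```python
from collections import deque

import numpy as np

def percolation_depth(A):
    """Deepest row reachable through void cells (A == 0), entering from
    the top row and moving in a von Neumann neighborhood.  Returns the
    number of rows reached; a return value of n means full percolation."""
    n, m = A.shape
    seen = np.zeros((n, m), dtype=bool)
    queue = deque()
    for j in range(m):
        if A[0, j] == 0:
            seen[0, j] = True
            queue.append((0, j))
    depth = 0
    while queue:
        i, j = queue.popleft()
        depth = max(depth, i + 1)  # row i reached: at least i + 1 rows deep
        for u, v in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= u < n and 0 <= v < m and A[u, v] == 0 and not seen[u, v]:
                seen[u, v] = True
                queue.append((u, v))
    return depth
```

Note that the flood fill can travel back upward and sideways, so the depth it finds can exceed what any single straight-down walk would reach.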
[Figure: a \(12 \times 12\) random matrix with fill fraction \(0.45\).]
As an example application of this algorithm, above we have a \(12 \times 12\) matrix with fill fraction \(0.45\). If we start on the top, how far toward the bottom can we move on blanks?
From column 0, we go 2 steps down
From column 1, we go 2 steps down
From column 2, we go 2 steps down
From column 3, we go 0 steps down
From column 4, we go 0 steps down
From column 5, we go 0 steps down
From column 6, we go 12 steps down
From column 7, we go 12 steps down
From column 8, we go 0 steps down
From column 9, we go 3 steps down
From column 10, we go 3 steps down
From column 11, we go 3 steps down
Max depth: 12
Now, double-check with your own eyes. Are all of these correct?
The examples given so far are particular realizations. There are \(16 \times 16\) arrays with void fraction \(p=0.5\) that do not percolate from top to bottom, and there are \(16 \times 16\) arrays with \(p=0.4\) that do percolate from top to bottom. And in practice, the arrays of interest are much larger. Let's consider the same question for some progressively larger arrays.
Note above that the fluid will not get through for small void fractions \(p\), and that it will for large void fractions. Interestingly, as we look at larger matrices, the chance of getting through becomes less random and more predictable based on \(p\). This raises a very important question: for very large systems, is there a specific value \(p = p_c\) at which the water stops flowing?
To explore this question, we can perform some simulations. Suppose for a given value of the void fraction \(p\), we draw a thousand \(40 \times 40\) sample arrays, and use the search algorithm above to calculate the percolation depth of each. Then, we plot what fraction of those arrays percolate to each possible depth as an estimate of the probability that another randomly drawn array will percolate to that depth. The result we see is given below. When the void fraction is small, only a small fraction of the sampled arrays percolate deeper than 10 steps. But when the void fraction is larger than \(0.6\), the majority of sampled arrays percolate from the top all the way to the bottom of their array. There's a change in the character of the curves somewhere between \(p = 0.5\) and \(p = 0.7\).
To better see what is happening with the percolation process for void fractions between \(0.5\) and \(0.7\), we plot the fraction of arrays that percolate the full array height \(n\). As the array size increases, we see that the curves of fraction percolating versus void fraction become less line-like and more S-shaped. As we approach the limit of infinitely large arrays, these curves converge to a Heaviside step function with a jump around \(p = 0.59\). For very large arrays, very few samples will percolate for void fractions smaller than \(0.59\), but almost all arrays we sample will percolate for void fractions greater than \(0.59\).
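Curves of this kind can be reproduced, in sketch form, by Monte Carlo sampling. The helper names percolates and crossing_fraction below are our own; we again assume entries of 0 are voids, entries of 1 are filled, and movement follows the von Neumann neighborhood.

```python
import numpy as np

def percolates(A):
    """True if the void cells (A == 0) connect the top row to the bottom
    row under von Neumann moves (iterative flood fill)."""
    n, m = A.shape
    seen = np.zeros((n, m), dtype=bool)
    stack = [(0, j) for j in range(m) if A[0, j] == 0]
    for i, j in stack:
        seen[i, j] = True
    while stack:
        i, j = stack.pop()
        if i == n - 1:          # reached the bottom row
            return True
        for u, v in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= u < n and 0 <= v < m and A[u, v] == 0 and not seen[u, v]:
                seen[u, v] = True
                stack.append((u, v))
    return False

def crossing_fraction(p, n, samples=200, seed=0):
    """Estimate the probability that an n-by-n array with void fraction p
    percolates from top to bottom."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(samples):
        A = np.ceil(rng.random((n, n)) - p)  # 0 = void w.p. p, 1 = filled
        hits += percolates(A)
    return hits / samples
```

Sweeping crossing_fraction over a grid of \(p\) values for several array sizes \(n\) produces the family of S-shaped curves described above, though with sampling noise at this modest number of samples.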
This steep change in the depth characteristic of these random arrays is called a "phase transition". Phase transitions are discontinuous changes in the properties of very large systems. In this case, the phase transition is between impermeability and permeability. The most familiar examples may be the boiling of water into steam, or the freezing of water into ice. Steam, water, and ice are three different phases of a system with a very large number of water molecules. The changes in the properties of the system are abrupt, and occur at special "critical" temperatures: \(0^{\circ}\) Celsius for freezing and \(100^{\circ}\) Celsius for boiling (these temperatures depend on the atmospheric pressure).
The permeability phase transition occurs when the void fraction is near a certain percolation threshold \(p_c\). The percolation threshold for the model we've described here, site percolation with von Neumann neighborhoods, is \(p_c \approx 0.5927\). Even though the von Neumann neighborhood is one of the simplest spatial models, no closed-form formula is known for its percolation threshold; it must be estimated using Monte Carlo approaches related to those we've used above. Finding an exact formula for this percolation threshold remains a vexing open problem in statistical mechanics.
But we should also emphasize that the value of the threshold would change if we changed the neighborhood of allowed movement. If we could also move in diagonal directions, as in a Moore neighborhood, then we wouldn't need as much empty space to move through the medium, and the percolation threshold would be smaller.
Bond percolation is a different geometric approach to the same questions, and a great illustration of the merits of alternative models of a phenomenon. In bond percolation, the void fraction \(p\) controls the probability that two adjacent nodes are connected along an edge. So it's not the cells themselves that are empty or filled, but rather the edges connecting the cells.
Instead of each cell being empty or filled, in bond percolation each edge is present with some probability \(p\) or absent with probability \(1-p\). Here is an example of an \(80 \times 80\) lattice with \(p=0.45\). Can you travel from top to bottom along the bonds?
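A bond configuration can be sampled by drawing the horizontal and vertical edges separately. The sketch below is one choice of encoding among many, and the names bond_lattice and bond_percolates are our own.

```python
import numpy as np

def bond_lattice(n, p, seed=0):
    """Sample a bond configuration on an n-by-n grid of nodes.
    horiz[i, j]: the edge from node (i, j) to (i, j+1) is open.
    vert[i, j]:  the edge from node (i, j) to (i+1, j) is open.
    Each edge is open independently with probability p."""
    rng = np.random.default_rng(seed)
    horiz = rng.random((n, n - 1)) < p
    vert = rng.random((n - 1, n)) < p
    return horiz, vert

def bond_percolates(horiz, vert):
    """True if some top-row node connects to the bottom row along open bonds."""
    n = vert.shape[0] + 1
    seen = np.zeros((n, n), dtype=bool)
    stack = [(0, j) for j in range(n)]
    seen[0, :] = True
    while stack:
        i, j = stack.pop()
        if i == n - 1:              # reached the bottom row
            return True
        steps = []
        if j + 1 < n and horiz[i, j]:
            steps.append((i, j + 1))
        if j - 1 >= 0 and horiz[i, j - 1]:
            steps.append((i, j - 1))
        if i + 1 < n and vert[i, j]:
            steps.append((i + 1, j))
        if i - 1 >= 0 and vert[i - 1, j]:
            steps.append((i - 1, j))
        for u, v in steps:
            if not seen[u, v]:
                seen[u, v] = True
                stack.append((u, v))
    return False
```

The search is the same flood fill as in site percolation; only the test for whether a step is allowed has changed, from "is the destination cell void?" to "is the connecting edge open?".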
Despite the similarities to site percolation, the difference in geometry has a huge impact on our ability to answer this question. For bond percolation with von Neumann neighborhoods, we can compute \(p_c\) exactly as the lattice becomes infinitely large: the threshold below which you can't travel from top to bottom is \(p_c = 1/2\).
To explain why \(p_c\) must be \(1/2\), we need to introduce the concept of our bond lattice's dual lattice. Suppose we create a random lattice with von Neumann neighborhoods and void fraction \(p\). We'll call this our principal lattice. Now, create a new set of nodes, shifted a half-step in both directions from the nodes of the principal lattice. We will call this set of new nodes the "dual nodes". The neighbors of the dual nodes are also given by the von Neumann neighborhood. Now, for any two neighboring dual nodes, connect the nodes with an edge if and only if the new edge will not cross any of the edges in our principal lattice. The lattice made up of the dual nodes and their edges is called the dual lattice. Because of the conditions for edge existence, the dual lattice will have edges present with probability \(1-p\). Here is an example.
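Suppose we store a bond configuration in two boolean arrays, horiz for edges to the right and vert for edges downward (an illustrative encoding of our own). Then, ignoring boundary bookkeeping, each dual edge is open exactly when the one principal edge it crosses is closed, so constructing the dual is a single elementwise negation:

```python
import numpy as np

# Principal bond configuration on a grid of nodes (illustrative encoding):
# horiz[i, j] = the edge from node (i, j) to (i, j+1) is open,
# vert[i, j]  = the edge from node (i, j) to (i+1, j) is open.
rng = np.random.default_rng(0)
p = 0.45
horiz = rng.random((4, 3)) < p
vert = rng.random((3, 4)) < p

# Each dual edge crosses exactly one principal edge and is open iff
# that principal edge is closed (boundary bookkeeping omitted):
dual_horiz = ~vert   # dual horizontal edges cross principal vertical edges
dual_vert = ~horiz   # dual vertical edges cross principal horizontal edges

# Open principal edges occur with probability p, so open dual edges
# occur with probability 1 - p.
```

This makes the key symmetry of the proof concrete: the dual of a lattice with edge probability \(p\) is a lattice of exactly the same kind with edge probability \(1-p\).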
Now, let's consider percolation on our lattice from top to bottom, simultaneously with percolation of the dual lattice from left to right. If our principal lattice is permeable, then there will be at least one unbroken path of bonds from top to bottom. This implies that the dual lattice cannot have an unbroken path of bonds from left to right, and hence the dual lattice will be impermeable. On the other hand, if the dual lattice is permeable from left to right, then our principal lattice will be impermeable from top to bottom.
Imagine that \(p _ c < 1/2\) for bond percolation with von Neumann neighborhoods. Pick the void fraction \(p\) so that \(p _ c < p < 1/2\). Then the principal lattice is above threshold, and since its dual has edge probability \(1-p > 1/2 > p _ c\), the dual lattice is above threshold as well, so both lattices must be permeable. But this contradicts our observation that only one lattice can be permeable at a time.
On the other hand, if \(p _ c > 1/2\), pick \(p\) so that \(1/2 < p < p _ c\). The principal lattice is then below the percolation threshold, and thus does not contain a connected path from top to bottom. This implies there is a connected path in the dual lattice separating the top from the bottom, so the dual lattice is permeable. But since \(1/2 < p < p _ c\), the dual edge probability satisfies \(1-p < 1/2 < p _ c\), so the dual lattice is also below the percolation threshold and should not be permeable either. Again we have a contradiction. It follows that the only possibility is \(p _ c = 1/2\).
Asymptotic approximations can have jumps.
The distinction between permeable and impermeable is not one of degree but one of kind: there is a sharp cutoff below which media are impermeable.
The solvability of a model is much more fragile than its general properties. Both site and bond percolation are reasonable idealized models of percolation phenomena. But bond percolation with von Neumann neighborhoods is easy to solve, while site percolation with von Neumann neighborhoods remains unsolved. When trying to solve a modelling problem, be open to alternative perspectives.
Our proof that the critical void fraction for bond percolation on a von Neumann lattice is one half relies on an important fact: the boundary of any simply connected cluster of nodes in our principal lattice is a closed loop of nodes in our dual lattice.
In site percolation, by contrast, the boundary is not necessarily a connected path, so the principal and dual lattices might both fail to percolate.
The boundary has a different character than the connected components in discrete spatial models. For example, in site percolation with von Neumann neighborhoods, the boundary of a cluster is NOT itself a path under von Neumann neighborhoods.
The Jordan curve theorem is a result in the topology of the plane that says any continuous bounded closed curve (like a circle) splits the plane into two disconnected components. The Jordan curve theorem is notorious for being intuitively obvious to the lay observer while being a great headache to prove formally. These percolation examples give some intuition of why it might be hard: boundaries are NOT always the same as clusters, and so they might NOT divide the space in the way you expect.
Moore neighborhoods, being non-planar, allow weird things to happen. For example, a simply connected set can have a boundary set with many holes in it, rather than just one hole.
The "neighborhood" in a cellular automata model defines the geometry of the space. In class, we discussed 4 different kinds of neighborhoods for 2-dimensional cellular automata on a Cartesian grid: the 8-neighbor Moore, the 6-neighbor hexagonal, the 4-neighbor von Neumann, and the 3-neighbor.
Let \(N_k(x,y)\) be the set of neighbors of site \((x,y)\) where \(x\) and \(y\) are integers. If \(k = 3\), then \[ N_3(x,y) = \begin{cases} \{ (x+1,y), (x,y+1), (x,y-1) \} & \text{ if $x+y$ is even}, \\ \{ (x-1,y), (x,y+1), (x,y-1) \} & \text{ if $x+y$ is odd}. \end{cases} \] Find similar formulas for \(N_4(x,y)\), \(N_6(x,y)\), and \(N_8(x,y)\).
In a regular lattice, the atomic loop length is the smallest number of neighboring edges in a loop from \((0,0)\) back to itself where the same edge between two neighbors is never traversed more than once. Each loop with this minimal loop length is called an "atomic loop". For each \(k \in \{ 3,4,6,8 \}\), find the atomic loop length and the number of atomic loops containing \((0,0)\) for lattices with neighborhoods \(N_k(x,y)\).
Given a neighborhood \(N_k(x,y)\), we can define a metric \(d_k( (x,y), (u,v))\) to measure the distance between points \((x,y)\) and \((u,v)\) recursively as follows: \[d_k( (x,y), (u,v) ) = \begin{cases} 0 & \text{if $(x,y) = (u,v)$}, \\ 1 + \min \{ d_k( (w,z), (u,v) ) : (w,z) \in N_k( (x,y) ) \} & \text{otherwise}. \end{cases}\] For each \(k \in \{ 3,4,6,8 \}\), draw the set of points \(\{ (u,v) : d_k( (0,0), (u,v) ) \leq 2\}\) on the appropriate lattice.
For each \(k \in \{ 3,4,6,8 \}\), find \(d_k( ( 0, 0), (3,3) )\).
Our choice of neighborhood also has some unexpected implications for the topology of space. Suppose we define a set of nodes as simply connected if and only if between any two nodes in the set, there is a path of neighboring nodes also entirely in the set, and the interior of every closed loop in the set is also in the set. The outer boundary of a simply connected set is the set of nodes that are neighbors of nodes in the given set, but are themselves not in the set. For which neighborhoods, if any, is the boundary of a simply connected set a closed loop?
In class, we were able to visually calculate percolation depth because of our awesome pattern-analysis wetware. Calculating percolation depth algorithmically is a little more complicated, but can be done recursively. See here
Let \(A(p,N)\) be a random \(0/1\) matrix with shape \(N \times N\) where sites are filled (\(1\)) with probability \(1-p\), and void (\(0\)) with probability \(p\). Recall that in class, we generated example matrices like this with the python code \[ A = \text{ceil(rand(N,N) - p)} \] For each value of \(p \in \{ 0.7, 0.6, 0.55, 0.5, 0.4 \}\), simulate \(1000\) \(40 \times 40\) matrices. Let \(d(x,p)\) be the fraction of these matrices with percolation depth greater than \(x\). Assuming a hexagonal neighborhood, plot \(d(x,p)\) for each value of \(p\), all in one plot. Remember to label your plot axes and include a legend.
What particular feature of your plot changes qualitatively as \(p\) changes? What does this change mean?
Suppose we define \(h(p,N)\) as the fraction of \(N\times N\) \(A(p,N)\) matrices with hexagonal neighborhoods which percolate all the way through. Plot \(h(p,N)\) as a function of \(p \in [0.3,0.6]\) (this is an INTERVAL, so make sure to include values between the extremes) for \(N \in \{ 5, 10, 20, 50, 100 \}\). Use at least 1000 matrices for each. (Warning: this calculation may take a long time.)
Extrapolating from your plot, what do you think will happen to \(\lim_{N\rightarrow \infty} h(p,N)\)?
Reality, of course, is often three-dimensional, rather than two-dimensional. Describe how you think the percolation threshold will change when we switch from two dimensions to three dimensions and why it will change.
Estimate the percolation threshold of the 3-dimensional equivalent of the von Neumann neighborhood.
What is the dual lattice of bond percolation on a hexagonal neighborhood? What does this imply about the percolation threshold of the hexagonal lattice?
Show that our bond percolation proof cannot be generalized to spaces using Moore neighborhoods by constructing paths from top to bottom and left to right that do not intersect each other.
The concept of percolation can be generalized to continuous space. Suppose we drop \(n\) circles of radius \(r\) randomly into a unit square (the locations of their centers are drawn from a two-dimensional Poisson distribution), and that water cannot flow through these circles.
Random graphs are an extreme case, commonly referred to as the mean-field limit, where space exists but is unstructured. Study the percolation threshold in random graphs. (This is Erdős's result, but it is rather uninteresting from a spatial perspective.)