Geometric optics




A laser beam bouncing off and transmitting through a glass lens.



The theory of optics has one of the longest histories of study in the physical sciences. Lenses have been used for more than 6000 years. Over that time, we observed and developed theories to explain how light bounces off a mirror (reflection) and turns when passing into a new medium (refraction).

One of the earliest surviving natural philosophies of light is that of Heron of Alexandria, circa 50 AD. Heron proposed that sight was based on rays emitted by our eyes that traveled instantaneously. Despite being wrong about the origins of light, he was correct about the properties of mirrors. He argued that because light must travel the shortest, most efficient path between two points, the angle at which a ray comes into a mirror must be the same as the angle at which it comes off the mirror. Put concisely, this observed relationship between the incoming and outgoing light is called the Law of Reflection: light reflects off a mirror so that the angle of incidence is always equal to the angle of reflection.

The astronomer Ptolemy performed a set of experiments to study refraction -- how light's direction changes depending on its angle of incidence when entering a new medium. He showed that the change depended on the media themselves. However, it was not until the European Renaissance that natural philosophers found the mathematical formulation we now call Snell's Law or the Law of Refraction.

In modelling problems like Archimedes's trammel and geodesy, geometric considerations were all that was needed. But modelling with optics demands new ideas. The laws of reflection and refraction give us some excellent opportunities to illustrate how a natural law, distilled from systematic observational data, can be incorporated into a mathematical model to answer new and interesting questions. We will first consider what shape of mirror is needed to focus parallel rays to a point. We will then consider what shape a single lens interface must take to focus parallel incident rays. Along the way, we will further illustrate the use of trigonometry, vector-matrix algebra, complex variables, and nonlinear differential equations.

A focusing mirror

Suppose we would like to build a mirror that reflects parallel light rays to a single focal point. What shape should that mirror be? There are a number of ways to successfully approach this problem. The descriptions below illustrate two. The first method relies on vectors and matrix-algebra descriptions of certain linear transformations. The second uses some slick interpretations of complex variables to simplify the calculation.

Ray-tracing diagram of a focusing parabolic mirror


The law of reflection

The law of reflection states that the angle of incidence equals the angle of reflection, or \(\theta_{in} = \theta_{out}\). However, this form can be inconvenient to apply in practice because of its reliance on angles and trigonometry. A more convenient form can be obtained by treating reflection as a linear transformation from a vector representing the propagation of the incident light ray into a different vector representing the propagation of the reflected light ray. Let \(w\) represent the incident light ray, \(t\) represent the reflected light ray, and \(v\) be a vector giving the physical orientation of a flat mirror (a vector lying along the mirror's surface).

The reflection matrix \[\mathrm{R} = 2 \frac{v v^T}{v^T v} - I\] performs a linear transformation of a vector such that the part parallel to \(v\) stays the same, but the part perpendicular to \(v\) changes sign. We can show that this matrix does what we say it will by decomposing any vector as \(w = w_{\parallel} + w_{\perp}\), where \(w_{\parallel}\) is parallel to \(v\) and \(w_{\perp}\) is perpendicular to \(v\). It then follows from matrix algebra that \[\mathrm{R} w = w_{\parallel} - w_{\perp}.\] Thus, in vector form, the law of reflection can be written \[t = \left( 2 \frac{v v^T}{v^T v} - I\right) w.\]
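As a quick numerical sanity check of the reflection matrix, here is a minimal Python sketch that applies \(\mathrm{R}\) to a two-dimensional vector without forming the matrix explicitly, using \(\mathrm{R}w = 2v\,(v^T w)/(v^T v) - w\). The function name `reflect` and the tuple representation of vectors are illustrative choices, not part of the original notes.

```python
def reflect(v, w):
    """Apply R = 2 v v^T / (v^T v) - I to w, for 2D vectors given as tuples."""
    vw = v[0] * w[0] + v[1] * w[1]   # v^T w
    vv = v[0] * v[0] + v[1] * v[1]   # v^T v
    scale = 2 * vw / vv
    return (scale * v[0] - w[0], scale * v[1] - w[1])

# Mirror lying along the x-axis: the parallel part (3) is kept,
# the perpendicular part (4) changes sign.
print(reflect((1, 0), (3, 4)))  # (3.0, -4.0)
```

Applying `reflect` twice with the same `v` should return the original vector, which is one way to confirm that the transformation really is a reflection.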

The mirror equation

Let the mirror be described by a curve \(y(x)\), and suppose we want our mirror to focus incoming light rays to the origin \((0,0)\). This choice means that a light ray coming down and striking the mirror at \((x,y(x))\) should reflect off and travel in direction \((-x,-y(x))\) to reach the focus at the origin.

If we reverse our ray, sending it from the focus and bouncing it off of our mirror, we want the reflected ray to be vertical, hence perpendicular to the vector \((1,0)\). This can be expressed by the matrix equation \[\begin{bmatrix} 1 & 0\end{bmatrix} \mathrm{R} \begin{bmatrix} x\\y\end{bmatrix} = 0,\] where \(\mathrm{R}\) is the appropriate reflection matrix. Note that it is more convenient to work with a perpendicularity condition than with the fact that the reflected vector should be parallel to \((0,1)\): we only care about the directions of our vectors, not their magnitudes, and the perpendicularity condition holds independent of vector magnitudes.

Now, we only need to determine the vector \(v\) for our reflection matrix. At every point, \((x, y(x))\), the law of reflection says that when light bounces off of a mirror, the angle of incidence equals the angle of reflection. This is equivalent to saying rays perpendicular to the mirror will be perfectly reflected while rays tangent to the mirror will continue unchanged. So, for our reflection matrix, we can take \(v =(1,y')\), a tangent vector of the mirror at \((x,y(x))\). Then on substitution, \[\begin{bmatrix} 1 & 0\end{bmatrix} \left( \frac{2}{1 + (y^\prime)^2} \begin{bmatrix} 1 & y' \\ y' & (y^\prime)^2 \end{bmatrix} - \begin{bmatrix} 1&0\\0&1 \end{bmatrix} \right) \begin{bmatrix} x\\y\end{bmatrix} = 0.\]

We can perform the calculation using the SymPy package for symbolic computation in Python.

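from sympy import Function, Matrix, eye, symbols
x, y = symbols('x'), Function('y')
v = Matrix([[1], [y(x).diff(x)]])  # tangent vector (1, y')
R = 2 * (v * v.T) / (v.T * v)[0, 0] - eye(2)  # reflection matrix
(Matrix([[1, 0]]) * R * Matrix([[x], [y(x)]]))[0, 0].factor()  # horizontal part of the reflected ray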

The matrices reduce to the scalar nonlinear ordinary differential equation \[-\frac{ \left(x \left[ \left(y'\right)^{2} - 1\right] - 2 y y' \right) }{\left( y' \right)^{2} + 1} = 0\] or equivalently, just \[x \left(\frac{d y}{d x} \right)^{2} - 2 y \frac{d y}{d x} - x = 0\]

We'll call this the mirror equation. Just from looking at the mirror equation, we can learn things. The mirror equation is first order, nonlinear, and quadratic in the first-derivative term. It is a kind of equation you'll probably never see in the introductory classes on differential equations, and so the solution methods we're most familiar with won't be of much help.

Analysis and solution

Since the mirror equation is quadratic in the first derivative, we can use the quadratic formula to calculate the mirror's slope at any point \((x,y(x))\) through which the mirror passes. But the equation has two solutions for the slope most of the time! Why? A little thought reveals that this makes perfect sense -- there are actually two different mirrors that solve our equation, one that collects light from above and one that collects light from below. So the ambiguity reveals a property of the problem that we had overlooked so far. (Another property the mirror equation reveals is symmetry of solutions -- see the exercises.)
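To make the two-slope ambiguity concrete, the sketch below applies the quadratic formula to \(x (y')^2 - 2 y y' - x = 0\), which gives \(y' = (y \pm \sqrt{x^2 + y^2})/x\). The function name `mirror_slopes` is a hypothetical helper introduced only for illustration, and it assumes \(x \neq 0\).

```python
import math

def mirror_slopes(x, y):
    """Return both roots y' of x*y'**2 - 2*y*y' - x = 0 (requires x != 0)."""
    r = math.hypot(x, y)          # sqrt(x^2 + y^2), the distance to the focus
    return ((y + r) / x, (y - r) / x)

print(mirror_slopes(2.0, 0.0))  # (1.0, -1.0)
```

Because the product of the two roots of the quadratic is \(-x/x = -1\), the two candidate mirrors through any point cross at right angles.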

While the mirror equation lacks unique solutions, it is still very solvable. If you recognize that the mirror equation is a homogeneous equation of the form \[F(dy/dx, y/x) = 0,\] you might discover that a substitution of the form \(u(x):= y(x)/x\) makes the equation separable and integrable in closed form. Another approach that will always give you at least some information is to guess that our solution has a power series \[y(x) = c _ 0 + c _ 1 x + c _ 2 x^2 + c _ 3 x^3 + \ldots.\] We expect our solution to be symmetric across the y-axis, so we can drop the odd terms (\(0 = c _ 1 = c _ 3 = \ldots\)). If we now substitute into the mirror equation and match terms in the power series, we find \(-1/4 = c _ 0 c _ 2\), \(0 = c_4 = c_6 = \ldots\). Thus, solutions are parabolas of the form \[y(x) = \frac{x^2}{4f} - f\] where \(|f|\) is the focal length of the mirror (the distance from the focus to the vertex of the mirror). If the parabola passes below the focus (\(f > 0\)), then it is convex and reflects light from above to the focus. If the parabola passes above the focus (\(f < 0\)), then it is concave and reflects light from below to the focus.
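The parabolic solution can be checked by direct substitution. The following is a small SymPy sketch of that check, not the author's own code; the variable names `residual`, `c0`, and `c2` are introduced here for illustration.

```python
from sympy import diff, simplify, symbols

x, f = symbols('x f', nonzero=True)
y = x**2 / (4 * f) - f                      # candidate parabola
dy = diff(y, x)                             # slope x / (2 f)

# Left-hand side of the mirror equation x y'^2 - 2 y y' - x = 0
residual = simplify(x * dy**2 - 2 * y * dy - x)
print(residual)  # 0

# Power-series coefficients: c0 = y(0), c2 = y''(0) / 2
c0 = y.subs(x, 0)
c2 = diff(y, x, 2) / 2
print(simplify(c0 * c2))  # -1/4
```

The second printout matches the condition \(c_0 c_2 = -1/4\) found by matching terms in the power series.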

Solutions of the mirror equation.


A complex variables approach

The mirror equation is much easier to derive if we have a good understanding of complex variables. The following approach is a special case of techniques generally referred to as geometric algebra, a powerful set of techniques largely ignored in most university curricula.

Let \(w = x+y(x)i\) be the position vector, and \(v = 1+y^\prime i\) be a vector tangent to the mirror at that location, where \(y' = dy/dx\). The tangent vector \(v\) should be such that when we rotate it by the angle of incidence, the resulting vector is vertical. The angle of incidence from \(w\) to \(v\) is the argument of \[v w^{-1} = \frac{r_v e^{i\theta_v}}{r_w e^{i\theta_w}} = \frac{r_v}{r_w} e^{i(\theta_v - \theta_w)}.\] If we then apply that rotation to the vector \(v\), the new vector will point in the direction of the reflected ray. If we want that ray to be vertical, then its real part must be zero, so \[\operatorname{Re}\left( (v w^{-1}) v \right) = 0\] or \[\begin{gather} \operatorname{Re}\left( \frac{(1 + y^\prime i)^2}{x + yi} \right) = 0, \\ x (y^\prime)^2 - 2 y y^\prime -x = 0. \end{gather}\]

This is again the mirror equation obtained previously.
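Python's built-in complex numbers make it easy to try the condition \(\operatorname{Re}\left((v w^{-1}) v\right) = 0\) numerically. The sketch below evaluates \((v w^{-1}) v\) at a few points of the parabola \(y = x^2/(4f) - f\); the function name `reflected_direction` and the value of `f` are assumptions chosen only for illustration.

```python
def reflected_direction(x, y, dy):
    """Direction (v / w) * v of a ray sent from the focus and reflected at (x, y)."""
    w = complex(x, y)      # position vector from the focus
    v = complex(1, dy)     # tangent vector 1 + y' i
    return (v / w) * v

f = 1.5                                  # hypothetical focal length
for x in (0.5, 1.0, 2.0):
    y, dy = x**2 / (4 * f) - f, x / (2 * f)
    # The real part should vanish up to rounding: the reflected ray is vertical.
    print(x, reflected_direction(x, y, dy))
```

The real parts come out at rounding-error size while the imaginary parts stay positive, consistent with rays from the focus leaving the mirror straight up.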

Refraction and Lenses

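Diagram of refraction at an interface between media with refractive indices \(n_1\) and \(n_2\), showing the angles \(\theta_1\) and \(\theta_2\).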

We can further use this complex-variables approach in the design of a lens. The problem of lens design is based on refraction and is more difficult (at least at first) than that of mirror design because refraction is a nonlinear transformation. A mathematical law describing refraction was first published by Descartes as support for the scientific method in 1637, though it is now named for, and likely originated with, Snell, whose development of triangulation we have already mentioned. Snell's law for refraction states that when a light ray hits an interface between two media, \[n_1 \sin \theta_1 = n_2 \sin \theta_2,\] where \(\theta_1\) is the angle of incidence from vertical (the normal, orthogonal to the surface), \(\theta_2\) is the angle of refraction from vertical, and \(n_1\) and \(n_2\) are the refractive indices of the two materials.

So let's use Snell's law to design a simple lens. Assume all rays propagate outward from a focus at the origin to the lens interface at \(z(x) = x + i y(x)\). On hitting the lens, we want them to be refracted to travel up, in direction \(i\). A tangent vector to the lens interface at \(z(x)\) will be \(z' = 1 + i y'\) and normal vector \(i z' = -y' + i\).

To convert Snell's law into a vector form, we use the complex variables vector identity based on Euler's formula. If \(v := r_v e^{i \theta_v}\) and \(w := r_w e^{i \theta_w}\), then \[\begin{gather} r_v r_w \sin(\theta_v - \theta_w) = \operatorname{Im}( v \overline{w} ). \end{gather}\] The sine of the angle between any two complex vectors can be found from this identity. The first sine we need is the angle between the incident vector \(z\) and the normal \(iz'\), while the second sine we need is the angle between the refracted vector \(i\) and the normal \(iz'\). If we rewrite Snell's law using these vectors and the identity above, we obtain a differential equation for the lens surface. We find \[\begin{gather} n_1 \frac{\operatorname{Im}(i z' \overline{z})}{|i z'||z|} = n_2 \frac{\operatorname{Im}(i z' \overline{i})}{|i z'||i|}, \end{gather}\] and after simplifying, we find the lens equation \[\begin{gather} \left(\frac{n_1}{n_2}\right) \left( \frac{1}{y'} + \frac{y}{x}\right) = \sqrt{1 + \left(\frac{y}{x}\right)^2}. \end{gather}\]
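The identity \(r_v r_w \sin(\theta_v - \theta_w) = \operatorname{Im}(v \overline{w})\) and the vector form of Snell's law translate almost directly into Python's complex arithmetic. This is a minimal sketch under that reading; the helper names `sin_between` and `snell_residual` are hypothetical and do not come from the original notes.

```python
import cmath
import math

def sin_between(v, w):
    """Sine of the angle from w to v, via Im(v * conj(w)) / (|v| |w|)."""
    return (v * w.conjugate()).imag / (abs(v) * abs(w))

def snell_residual(z, dz, n1, n2):
    """n1 sin(theta_1) - n2 sin(theta_2) for a ray along z refracted into direction i.

    z is the point x + i y on the interface and dz = 1 + i y' is the tangent there.
    """
    normal = 1j * dz
    return n1 * sin_between(normal, z) - n2 * sin_between(normal, 1j)

v = 2.0 * cmath.exp(1j * math.radians(60))
w = 3.0 * cmath.exp(1j * math.radians(30))
print(sin_between(v, w))   # approximately 0.5 = sin(30 degrees)
```

A lens surface satisfies Snell's law at a point exactly when `snell_residual` vanishes there, which is the condition that simplifies to the lens equation.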

We can see from inspection that solutions of the lens equation are scale-invariant, as expected -- the lens tangent \(z'\) along every point of the line \(y = m x\) depends only on \(m\), not \(x\) or \(y\). Solutions also exhibit reflection symmetry across the y-axis. Interestingly, if \(n_1 > n_2\) and we choose rays where \(m^2 = 1/((n_1/n_2)^2 - 1)\), we find that the slope of the lens curve must be infinite along these rays. No physically meaningful solution to the lens equation exists for shallower rays where \(m^2 < 1/((n_1/n_2)^2 - 1)\).
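The limiting ray slope can be read off by setting \(1/y' = 0\) in the lens equation, which leaves \((n_1/n_2)\, m = \sqrt{1 + m^2}\). The short sketch below checks this numerically for one assumed index ratio; `critical_slope` is a hypothetical helper name, and the value of `k` is only an example.

```python
import math

def critical_slope(k):
    """Ray slope m = y/x at which the lens slope y' becomes infinite (needs k > 1)."""
    return 1 / math.sqrt(k**2 - 1)

k = 1.5                       # hypothetical index ratio n1/n2
m = critical_slope(k)

# With 1/y' = 0 the lens equation reduces to k*m = sqrt(1 + m^2).
print(k * m, math.sqrt(1 + m**2))

# Along this ray the surface normal is horizontal, so the sine of the
# angle of incidence is m / sqrt(1 + m^2), which equals n2/n1.
print(math.sin(math.atan(m)), 1 / k)
```

One way to read the second printout is that the limiting ray meets the surface at the critical angle for total internal reflection, \(\sin\theta_1 = n_2/n_1\), which suggests why shallower rays cannot be refracted into the vertical direction.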

The lens equation is again outside the scope of our standard solution methods for ordinary differential equations, but the solution curves can be shown to be conic sections. The general solution, letting \(k = n_1/n_2\), is given implicitly as \[\frac{k^2}{k^2-1} \left(\frac{x}{C}\right)^{2} + \left(1 + \frac{y}{C}\right)^{2} = k^2\] where \(C\) is the integration constant. We can now see from inspection that if \(k>1\) (equivalently, \(n_1 > n_2\)), then the lens solutions will be ellipses, while if \(k < 1\) (\(n_1 < n_2\)), the solutions will be hyperbolas.
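The implicit solution can be spot-checked numerically against the lens equation. The sketch below does this for the elliptic case under stated assumptions: \(k > 1\), a negative integration constant \(C\) (so the ellipse is centered above the focus), \(x > 0\), and points taken on the upper half of the ellipse. The helper names and the values of `k` and `C` are hypothetical.

```python
import math

def lens_point(x, k, C):
    """Height y and slope y' on the implicit lens curve at horizontal position x.

    Assumes k > 1 and C < 0, and takes the upper half of the ellipse.
    """
    S = k**2 - k**2 / (k**2 - 1) * (x / C)**2     # equals (1 + y/C)^2
    y = -C * (1 + math.sqrt(S))
    dy = -k**2 * x / ((k**2 - 1) * (C + y))       # from implicit differentiation
    return y, dy

def lens_residual(x, y, dy, k):
    """Left side minus right side of the lens equation."""
    return k * (1 / dy + y / x) - math.sqrt(1 + (y / x)**2)

k, C = 1.5, -1.0          # hypothetical index ratio and integration constant
for x in (0.25, 0.5, 1.0):
    y, dy = lens_point(x, k, C)
    print(x, lens_residual(x, y, dy, k))   # residuals should be near zero
```

With these choices the ellipse has its center at \((0, -C)\) and one focus at the origin, which matches the assumption that all rays start from a focus at the origin.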

Ray-tracing diagram of a focusing elliptic lens




A comment on method

These two competing methods -- matrix algebra and complex variables -- both lead to the same mirror equation at the end, but illustrate a common trade-off in applied mathematics between discovery and elegance. When we start in on a new problem, we don't know where to begin, make a few clumsy guesses at first, before the pieces start to fit together, and then, suddenly or not-so-suddenly, we've discovered our solution before we really have a good understanding of what we are doing.

We could just stop there, but we might also continue to study things -- having reached the end, we can look back and see all the false-starts, stumbles, and sidetracks. We can redo everything so it's clean, efficient, and elegant -- the "right" way to solve the problem. It's unlikely we'd ever have come up with this elegant solution the first time through, and at the end, we've only solved a problem to which we already had the solution. On the other hand, we now know how to solve a whole family of problems, and our deeper understanding is something we can share with and explain to others. Is the extra work worth it? Well, that depends on the situation, and is a question we can always keep in the back of our heads.

Further stories

The law of reflection and the law of refraction are special cases of what's called Fermat's principle: "Light follows the path of least time between two points." They can also be derived from Huygens' principle: "Every point reached by a wavefront of light becomes the source of a new spherically symmetric wave of light".

Today, based on our modern understanding of light as part wave, we say instead that light is observed to follow paths of stationary action (which might be a minimum or maximum of the distance traveled).
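Fermat's principle can be tried out numerically for refraction. The sketch below assumes a flat interface along \(y = 0\), a source at \((0, 1)\) in a medium of index \(n_1\), and a target at \((1, -1)\) in a medium of index \(n_2\); it searches for the crossing point of least travel time and then compares the two sides of Snell's law. The geometry, the index values, and the function names are all illustrative assumptions.

```python
import math

def travel_time(s, n1, n2, a=1.0, b=1.0, d=1.0):
    """Optical path length from (0, a) to (d, -b), crossing y = 0 at (s, 0)."""
    return n1 * math.hypot(s, a) + n2 * math.hypot(d - s, b)

def fastest_crossing(n1, n2, d=1.0, iterations=200):
    """Ternary search for the crossing point that minimizes the travel time."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if travel_time(m1, n1, n2, d=d) < travel_time(m2, n1, n2, d=d):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

n1, n2 = 1.0, 1.5                     # hypothetical indices, roughly air and glass
s = fastest_crossing(n1, n2)
sin1 = s / math.hypot(s, 1.0)                 # sine of the angle of incidence
sin2 = (1.0 - s) / math.hypot(1.0 - s, 1.0)   # sine of the angle of refraction
print(n1 * sin1, n2 * sin2)           # the two values should nearly agree
```

Ternary search is used here only because the travel time is a convex function of the crossing point; any one-dimensional minimizer would serve the same purpose.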


  1. A perfect mirror should be symmetric under flips. Let's check if the mirror equation's solutions are symmetric.

    1. Show by substitution that if \(Y(x)\) is a solution to the mirror equation, then so is \(-Y(x)\).

    2. Show by substitution that if \(Y(x)\) is a solution to the mirror equation, then so is \(Y(-x)\). (Hint: First solve the mirror equation for \(y'\), then substitute.)

  2. If \(z = x + iy\), then the conjugate is \(\overline{z} = x - iy\). Using conjugation, the mirror equation can be written in the alternative form \[\operatorname{Re}\left( (v \overline{w}) v \right) = 0,\] where \(v\) is the tangent vector and \(w\) is the position vector as before. This form is slightly more efficient to analyze because it does not require any division. Provide a verbal geometric explanation of why replacing the inverse with the conjugate leads to the same equation.

  3. Polish natural philosopher Vitello measured refraction angles experimentally in the 13th century. A close approximation of Vitello's data relating angle of incidence to angle of refraction for light passing from air to glass is given below.

    1. Use least squares to fit a curve of the form \(\theta_2 = a_1 \theta_1 + a_3 \theta_1^3\) to Vitello's data.
    2. Solve Snell's Law for the angle of refraction \(\theta_2\) as a function of the angle of incidence \(\theta_1\), and calculate the Maclaurin series of this function to fourth order.
    3. Using your previous results, estimate the refractive index of glass relative to air. Explain your reasoning.


# angle-of-incidence (deg),angle-of-refraction (deg)
# Witelo's (aka Vitello) experimental data on the refraction angle at
# the air/glass boundary from *Optica*, extracted from
# by g3data
# all angles are measured in degrees
# Warning: These numbers have not been double-checked against the original manuscript.  Apply with caution.

  1. Kepler's law of refraction, mentioned in our dimensional analysis exercises, is given below. \[\theta_1 = \frac{k \theta_2}{k-(k-1)\sec \theta_2}.\]
    1. Use a series expansion to show that Kepler's law gives the same prediction as Snell's Law when the angles of incidence and refraction are small.
    2. Fit Kepler's law to Vitello's data and discuss.
  2. The parabolic solutions of the mirror equation can be derived directly using coordinate geometry if one assumes Huygens' principle. Suppose a flash of light is emitted from the origin and reflected off a mirror so that all light rays travel straight up. Under Huygens' principle, all light rays travel at the same speed, and all light rays reflected off the mirror will reach a horizontal line above the focus at the same time. If the bottom of the mirror is 12 centimeters below the origin, use coordinate geometry to find an equation for the mirror.

  3. We have shown above that a perfectly focusing mirror should be shaped like a circular paraboloid (a parabola extended to three dimensions by rotation). However, it is much easier to make a spherical mirror (by rotating a circle) than a circular paraboloid (by rotating a parabola). If the aperture radius of the mirror is small relative to the focal length of the mirror, we might make a spherical mirror that does almost as well as a circular paraboloid.

    1. Find the equation for a circle that focuses light to the origin and is the best approximation to the parabola \(y = b - x^2/(4b)\) when \(b\) is large. (Hint: Match the curvature of the parabola at its vertex to that of the circle.)

    2. (Hard) While the spherical mirror is a good approximation when the aperture is small, it is not perfect, and introduces some error known as spherical aberration. Spherical aberration can be removed by including a lens to focus light before it reaches the mirror. Find a differential equation for a lens that will remove spherical aberration.

  4. Find the shape of a lens that focuses light from 1 unit to the left of the lens center to one unit to the right of the lens center.
