## Archive for July, 2013

### The practical use of proofs

July 31, 2013

A question which has come up in this blog from time to time is that of the benefits of mathematical proofs. Here I want to return to it. The idea of a proof is central in mathematics. A key part of research papers in mathematics is generally stating and proving theorems. In many mathematics lectures for students a lot of the time is spent stating and proving theorems. In practical applications of mathematics, to theoretical parts of other sciences or to real life problems, proofs play a much less prominent role. Very often theoretical developments go more in the direction of computer simulations or use heuristic approaches. A scientist used to working in this way may be sceptical about what proofs have to contribute. Thus it is up to us mathematicians to understand what mathematics has to offer and then to explain it to a wider circle of scientists.

Thinking about this subject I remembered some examples from my earlier research area in mathematical relativity. Some years ago, at a time when dark energy and accelerated cosmological expansion were not yet popular research subjects, a paper was published containing numerical simulations of a model of the universe with oscillating volume. In other words in this model the expansion of the universe alternately increased and decreased. This can only happen if there is accelerated expansion at some time. In the model considered there was no cosmological constant and no exotic matter with negative pressure. In this situation it has been well known for a very long time that cosmic acceleration is impossible. Proving it is a simple argument using ordinary differential equations. Apparently this was not known to the authors since they presented numerical results blatantly contradicting this fact. This is an example of the fact that once something has been proved it can sometimes be used to show immediately that certain results must be wrong, although it does not always indicate what has gone wrong. So what was wrong in the example? In the problem under consideration there is an ordinary differential equation to be solved, but that is not all. To be physically relevant an extra algebraic condition (the Hamiltonian constraint) must also be satisfied. There is an analytical result which says that if a solution of the evolution equation satisfies the constraint at some time it satisfies it for ever. Numerically the constraint will only approximately be satisfied by the initial data and the theorem just mentioned does not say that, being non-zero, it cannot grow very fast so that at later times the constraint is far from being satisfied. Thus the situation was probably as follows: the calculation successfully simulated the solution of an ordinary differential equation, even the right ordinary differential equation, but it was not the physically correct solution. Incidentally the error in the constraint tends to act like exotic matter.

A similar problem came up in a more sophisticated type of numerical calculation. This was work by Beverly Berger and Vincent Moncrief on the dynamics of inhomogeneous cosmological models near the big bang. The issue was whether the approach of these solutions to the limit is monotone or oscillatory. The expectation was that it should be oscillatory but at one point the numerical calculations showed monotone behaviour. The authors were suspicious of the conclusion although they did not understand the problem. Being careful and conscientious they avoided publishing these results for a year (if I remember correctly). This was good since they found the explanation for the discrepancy. It was basically the same explanation as in the previous example. They were using a free evolution code where the constraint is used for the initial data and never again. Accumulating violation of the constraint was falsifying the dynamics. In this case the sign of the error was different so that it tended to damp oscillations rather than creating them. Changing the code so as to solve the constraint equation repeatedly at later times restored the correct dynamics.

Another example concerns a different class of homogeneous cosmological models. Here the equations, which are ordinary differential equations, can be formulated as a dynamical system on the interior of a rectangle. This system can be extended smoothly to the rectangle itself. Then the sides of the rectangle are invariant and the corners are stationary solutions. The solutions in the boundary tend to one corner in the past and another corner in the future. They fit together to form a cycle, a heteroclinic cycle. I proved together with Paul Tod that almost all solutions which start in the interior converge to the rectangle in the approach to the singularity, circling infinitely many times around it. The top right hand corner of the rectangle, call it $N$, plays an important role in the story which follows. A relatively straightforward numerical simulation of this system suggested that there were solutions which started close to $N$ and converged to it directly, in contradiction to the theorem. A more sophisticated simulation, due to Beverly Berger, replaced this by the scenario where some solutions starting close to $N$ go once around the rectangle before converging to $N$. Solutions which go around more than once could not be found, despite trying hard. (The results I am talking about here were never published – I heard about them by word of mouth.) So what is the problem? The point $N$ is non-hyperbolic. The linearization of the system there has a zero eigenvalue. The point is what is called a saddle node. On one side of a certain line (corresponding to one side of the rectangle) the solutions converge to $N$. If, due to inaccuracies in the calculation, the solution gets on the wrong side of the line we get the first erroneous result. On the physical side of the line the point $N$ is a saddlle and all physical solutions pass it without converging to it. The problem is that, due the non-hyperbolic nature of $N$ a solution which starts near the rectangle and goes once around it comes back extremely close to it. It fact it is so close that a program using a fixed number of decimal digits can no longer resolve the behaviour. Thus, as was explained to me by Beverly, this type of problem can only be solved using a program which is capable of implementing an arbitrarily high number of digits. At the time it was known how to do this but Beverly did not have experience with that and as far as I know this type of technique has never been applied to the problem. (The reader interested in details of what was proved is referred to the paper in Class. Quantum Grav. 16, 1705 (1999).)

The general conclusions which are drawn here are simple and commonplace. The computer can be a powerful ally but the reliability of what comes out is strongly dependent on how appropriate the formulation of the problem was. This cannot always be known in advance and so continuous vigilance is highly recommended. Sometimes theorems which have been proved can be powerful tools in discovering flaws in the formulation. In the case of arguments which are analytical but heuristic the same is true but in an even stronger sense. The reliability of the conclusions depends crucially on the individual who did the work. At the same time even the best intuition should be subjected to careful scrutiny. The most prominent example of this in general relativity is that of the singularity theorems of Penrose and Hawking starting in the mid 1960’s which led spacetime singularities to be taken seriously on the basis of rigorous proofs which contradicted the earlier heuristic work of Khalatnikov and Lifshitz.

The general questions discussed here are ones I will certainly return to. Here I have highlighted two dangerous situations: simulations which do not exactly implement conservation laws in the system being modelled and non-hyperbolic steady states of a dynamical system.

In a previous post I wrote about the Michaelis-Menten reduction of reactions catalysed by enzymes in which a single equation (effective equation) is the limit of a system of two equations (extended equations) as a parameter $\epsilon$ tends to zero. What I did not talk about is the sense in which solutions of the effective equation approximate those of the extended ones. I was sure that this must be well-known but I did not know a source for it. Now I discovered that what I had been seeking is to be found in a very nice form in a book which had been standing on a shelf in my office for many years. This is the book ‘Asymptotic Expansions for Ordinary Differential Equations’ by Wolfgang Wasow and the part of relevance to Michaelis-Menten reduction starts on p. 249. Michaelis-Menten is not mentioned there but the key mathematical result is exactly what is needed for that application. The theorem is due to Tikhonov but the original paper is in Russian and so not accessible to me. For convenience I repeat the equations from the previous post on this subject. $\dot u=f(u,v),\epsilon\dot v= g(u,v)$. This is the type of system treated in Tikhonov’s theorem, including the possibility that $u$ and $v$ are vector-valued.
The statement of the theorem is as follows. On any finite time interval $[0,T]$ the function $u$ in the extended system converges uniformly to the solution of the reduced system as $\epsilon\to 0$. Given a solution of the reduced system it is possible to compute a corresponding function $v$. On the time interval $(0,T]$ the function $v$ in the extended system converges to the function $v$ coming from the reduced system uniformly on compact subsets. Of course this conclusion  requires some hypotheses on the functions $f$ and $g$. The key thing is that for a fixed value of $u$ we have an asymptotically stable stationary solution of the equation for $v$ (with $\epsilon\ne 0$).
With this result in hand it is possible to compute higher order corrections in $\epsilon$. This was first done by Vasileva and is also explained in the book of Wasow. The result was extended to a statement global in $t$ by Hoppensteadt, Trans. Amer. Math. Soc. 123, 521. I expect that there are more modern treatments of these things in the literature but I find the sources quoted here very helpful for the beginner like myself. There remains the question of the relation to the usual Michaelis-Menten procedure. This is nicely discussed in a paper by Heineken et. al., Math. Biosci. 1, 95.