My spontaneous reaction to a computer-assisted proof is to regard it as having a lesser status than one done by hand. Here I want to consider why I feel this way and if and under what circumstances this point of view is justified. I start by considering the situation of a traditional mathematical proof, done by hand and documented in journal articles or books. In this context it is impossible to write out all details of the proof. Somehow the aim is to bridge the gap between the new result and what experts in the area are already convinced of. This general difficulty becomes more acute when the proof is very long and parts of it are quite repetitive. There is the tendency to say that the next step is strictly analogous to the previous one and if the next step is written out there is the tendency for the reader to think that it is strictly analogous and to gloss over it. Human beings (a class which includes mathematicians) make mistakes and have a limited capacity to concentrate. To sum up, a traditional proof is never perfect and very long and repetitive proofs are likely to be less so than others. So what is it that often makes a traditional proof so convincing? I think that in the end it is its embedding in a certain context. An experienced mathematician has met with countless examples of proofs, his own and those of others, errors large and small in those proofs and how they can often be repaired. This is complemented by experience of the interactions between different mathematicians and their texts. These things give a basis for judging the validity of a proof which is by no means exclusively on the level of explicit logical argumentation.

How is it, by comparison, with computer-assisted proofs? The first point to be raised is what is meant by that phrase. Let me start with a rather trivial example. Suppose I use a computer to calculate the kth digit of n factorial where k and n are quite large. If for given choices of the numbers a well-tested computer programme can give me the answer in one minute then I will not doubt the answer. Why is this? Because I believe that the answer comes from an algorithm which determines a unique answer. No approximations or floating point operations are involved. For me interval arithmetic, which I discussed in a previous post, is on the same level of credibility, which is the same level of credibility as a computer-free proof. There could be an error in the hardware or the software or the programme but this is not essentially different from the uncertainties connected with a traditional proof. So what might be the problem in other cases? One problem is that of transparency. If a computer-assisted proof is to be convincing for me then I must either understand what algorithm the computer is supposed to be implementing or at least have the impression that I could do so if I invested some time and effort. Thus the question arises to what extent this aspect is documented in a given case. There is also the issue of the loss of the context which I mentioned previously. Suppose I believe that showing that the answer to a certain question is ‘yes’ in 1000 cases constitutes a proof of a certain theorem but that checking these cases is so arduous that a human being can hardly do so. Suppose further that I understand an algorithm which, if implemented, can carry out this task on a computer. Will I then be convinced? I think the answer is that I will but I am still likely to be left with an uncomfortable feeling if I do not have the opportunity to see the details in a given case if I want to. In addition to the question of whether the nature of the application is documented there is the question of whether this has been done in a way that is sufficiently palatable that mathematicians will actually carefully study the documentation. Rather than remain on the level of generalities I prefer to now go over to an example.

Perhaps the most famous computer-assisted proof is that of the four colour theorem by Appel and Haken. To fill myself in on the background on this subject I read the book ‘Four colors suffice’ by Robin Wilson. The original problem is to colour a map in such a way that no two countries with a common border have the same colour. The statement of the theorem is that it is always possible with four colours. This statement can be reformulated as a question in graph theory. Here I am not interested in the details of how this reformulation is carried out. The intuitive idea is to associate a vertex to each country and an edge to each common border. Then the problem becomes that of colouring the vertices of a planar graph in such a way that no two adjacent vertices have the same colour. From now on I take this graph-theoretic statement as basic. (Unfortunately it is in fact not just a graph-theoretic, in particular combinatorial, statement since we are talking about planar graphs.) What I am interested in is not so much the problem itself as what it can teach us about computer-assisted proofs in general. I found the book of Wilson very entertaining but I was disappointed by the fact that he consistently avoids going over to the graph-theoretic formulation which I find more transparent (that word again). In an article by Robin Thomas (Notices of the AMS, 44, 848) he uses the graph-theoretic formulation more but I still cannot say I understood the structure of the proof on the coarsest scale. Thomas does write that in his own simplified version of the original proof the contribution of computers only involves integer arithmetic. Thus this proof does seem to belong to the category of things I said above I would tend to accept as a mathematical proof, modulo the fact that I would have to invest the time and effort to understand the algorithm. There is also a ‘computer-checked proof’ of the four colour theorem by Georges Gonthier. I found this text interesting to look at but felt as if I was quickly getting into logical deep waters. I do not really understand what is going on there.

To sum up this discussion, I am afraid that in the end the four colour problem was not the right example for me to start with and I that I need to take some other example which is closer to mathematical topics which I know better and perhaps also further from having been formalized and documented.