Hilbert’s sixteenth problem

September 14, 2021

 

The International Congress of Mathematicians takes place every four years and is the most important mathematical conference. In 1900 the conference took place in Paris and David Hilbert gave a talk where he presented a list of 23 mathematical problems which he considered to be of special importance. This list has been a programme for the development of mathematics ever since. The individual problems are famous and are known by their numbers in the original list. Here I will write about the 16th problem. In fact the problem comes in two parts and I will say nothing about one part, which belongs to the domain of algebraic geometry. Instead I will concentrate exclusively on the other, which concerns dynamical systems in the plane. Consider two equations \frac{dx}{dt}=P(x,y), \frac{dy}{dt}=Q(x,y), where P and Q are polynomial. Roughly speaking, the problem is concerned with the question of how many periodic solutions this system has. The simple example P(x,y)=y, Q(x,y)=-x shows that there can be infinitely many periodic solutions and that the precise question has to be a little different. A periodic solution is called a limit cycle if there is another solution which converges to the image of the first as t\to\infty. The real issue is how many limit cycles the system can have. The first question is whether for a given system the number N of limit cycles is always finite. A second is whether an inequality of the form N\le H(n) holds, where H(n) depends only on the degree of the polynomials. H(n) is called the Hilbert number. Linear systems have no limit cycles so that H(1)=0. Until recently it was not known whether H(2) was finite. A third question is to obtain an explicit bound for H(n). The first question was answered positively by Écalle (1992) and Ilyashenko (1991), independently.

The subject has a long and troubled history. Already before 1900 Poincaré was interested in this problem and gave a partial solution. In 1923 Dulac claimed to have proved that the answer to the first question was yes. In a paper in 1955 Petrovskii and Landis claimed to have proved that H(2)=3 and that H(n)\le P_3(n) for a particular cubic polynomial P_3. Both claims were false. As shown by Ilyashenko in 1982 there was a gap in Dulac’s proof. After 60 years there was almost no progress on this problem. Écalle worked on it intensively for a long time. For this reason he produced few publications and almost lost his job. At this point I pause to give a personal story. Some years ago I was invited to give a talk in Bayrischzell in the context of the elite programme in mathematical physics of the LMU in Munich. The programme of such an event includes two invited talks, one from outside mathematical physics (which in this case was mine) and one in the area of mathematical physics (which in this case was on a topic from string theory). In the second talk the concept of resurgence, which was invented by Écalle in his work on Hilbert’s sixteenth problem, played a central role. I see this as a further proof of the universality of mathematics.

The basic idea of the argument of Dulac was as follows. If there are infinitely many limit cycles then we can expect that they will accumulate somewhere. A point where they accumulate will lie on a periodic solution, a homoclinic orbit or a heteroclinic cycle. Starting at a nearby point and following the solution until it is near to the original point leads to a Poincaré mapping of a transversal. In the case of a periodic limiting solution this mapping is analytic. If there are infinitely many limit cycles the fixed points of the Poincaré mapping accumulate. It follows that this mapping is equal to the identity, contradicting the limit cycle nature of the solutions concerned. Dulac wanted to use a similar argument in the other two cases. Unfortunately in that case the Poincaré mapping is not analytic. What we need is a class of functions which on the one hand is general enough to include the Poincaré mappings in this situation and on the on the other hand cannot have an accumulating set of fixed points without being equal to the identity. This is very difficult and is where Dulac made his mistake.

What about the second question? It has been known since 1980 that P(2)\ge 4. There is an example with three limit cycles which contain a common steady state and another which does not. It is problematic to find (or to portray) these with a computer since three are very small and one very large. More about the history can be found in an excellent article of Iyashenko (Centennial history of Hilbert’s 16th problem. Bull. Amer. Math. Soc. 39, 301) which was the main source for what I have written. In March 2021 a preprint by Pablo Pedregal appeared (http://arxiv.org/pdf/2103.07193.pdf) where he claimed to have answered the second problem. I feel that some caution is necessary in accepting this claim. A first reason is the illustrious history of mistakes in this field. A second is that Pedregal himself produced a related preprint with Llibre in 2014 which seems to have contained a mistake. The new preprint uses techniques which are far away from those usually applied to dynamical systems. On the one hand this gives some plausibility that it might contain a really new idea. On the other hand it makes it relatively difficult for most people coming from dynamical systems (including myself) to check the arguments. Can anyone out there tell me more?

 

My COVID-19 vaccination, part 2

August 5, 2021

Since my last post on this subject a few things have changed. In Germany 53% of people are fully vaccinated against COVID-19, which is good news. We are now in a situation where in this country any adult who wants the vaccination can get it. Of course this percentage is still a lot lower than what is desirable and the number of people being vaccinated per day has dropped to less than half what it was in mid June. I find it sad, and at the same time difficult to understand, that there are so many people who are not motivated enough to go out and get the vaccination.

Yesterday my wife and I got our second vaccination. In the meantime the relevant authority (STIKO) has recommended that those vaccinated once with the product of AstraZeneca should get an mRNA vaccine the second time. The fact that we waited the rather long time suggested to get our second injection meant that the new recommendation had already come out and we were able to get the vaccine of Biontech the second time around. There have not been many studies of the combination vaccination but as far as I have seen those that there are gave very positive results. So we are happy that it turned out this way. This time the arm where I got the injection was sensitive to pressure during the night but this effect was almost gone by this morning. The only other side effect I noticed was an increased production of endorphins. In other words, I was very happy to have reached this point although I know that it takes a couple of weeks before the maximal protection is there.

Every second year there is an event in Mainz devoted to the popularization of science called the Wissenschaftsmarkt. It has been taking place for the last twenty years. Normally it is in the centre of town but due to the pandemic it will be largely digital this year. This year it is on 11th and 12th September and has the title ‘Mensch und Gesundheit’ [rough translation: human beings and their health]. I will contribute a video with the title ‘Gegen COVID-19 mit Mathematik’ [against COVID-19 with mathematics]. The aim of this video is to explain to non-scientists the importance of mathematics in fighting infectious diseases. I talk about what mathematical models can contribute in this domain but also, which is just as important, about what they cannot do. If the public is to trust statements by scientists it is important to take measures against creating false expectations. I do not go into too much detail about COVID-19 itself since at the moment there is too little information available and too much public controversy. Instead I concentrate on an example from long ago where it is easier to see clearly. It also happens to be the example where the basic reproductive number was discovered. This is the work of Ronald Ross on the control of malaria. Ross was the one who demonstrated that malaria is transmitted by mosquito bites and he was rewarded for that discovery with a Nobel Prize in 1902. After that he studied ways of controlling the disease. This was for instance important in the context of the construction of the Panama Canal. There the first attempt failed because so many workers died of infectious diseases, mainly malaria and yellow fever, both transmitted by mosquitos. The question came up, whether killing a certain percentage of mosquitos could lead to a long-term elimination of malaria or whether the disease would simply come back. Ross, a man of many talents, set up a simple mathematical model and used it to show that elimination is possible and was even able to estimate the percentage necessary. This provided him with a powerful argument which he could use against the many people who were sceptical about the idea.

Codimension three bifurcations

July 13, 2021

In our recent work with Lisa Kreusser on Lck we observed that it can happen that to reach a certain goal it is easier to work with a bifurcation of higher codimension although this means working with an object which is intrinsically more complicated. In that specific case we used a Bogdanov-Takens bifurcation (codimension two) instead of a Hopf bifurcation (codimension one). The goal that we reached was to prove the existence of a periodic solution, which turned out to be unstable. We did not prove the existence of a stable periodic solution of the system being considered. I wondered if this strategy could be pushed further, going to a bifurcation of codimension three.

The general idea is as follows. We have a dynamical system \dot x=f(x,\alpha) depending on parameters. We want to compare it with a normal form which is a standard model dynamical system depending on parameters. Here everything is local in a neighbourhood of (0,0), which is a steady state. The number of parameters in the normal form is the codimension. In the end we want to conclude that dynamical features which are present in the normal form are also present in the original system. The normal form for the Hopf bifurcation contains periodic solutions. In fact there are two versions depending on a parameter which can take the value plus or minus one. These are known as supercritical and subcritical. In the supercritical case the periodic solutions are stable, in the subcritical case unstable. The normal form for the Bogdanov-Takens bifurcation also has two cases which may, by analogy with the Hopf case, be called super- and subcritical. The fact we used is that when a Bogdanov-Takens bifurcation exists there is always a Hopf bifurcation close to it. Moreover the super- or subcritical nature of the bifurcation is inherited. What I wanted to investigate is whether the presence of a suitable bifurcation of codimension three could imply the existence of a Bogdanov-Takens bifurcation close to it and whether it might even be the case that both super- and subcritical Bogdanov-Takens bifurcations are obtained. If this were the case then it would indicate an avenue to a statement of the kind that the existence of a generic bifurcation of codimension three of a certain type implies the existence of stable periodic solutions.

The successful comparison of a given system with the normal form relies on certain genericity conditions being satisfied. These are of two types. The first type is a condition on the system defined by setting \alpha=0. It says that the system does not belong to a certain degenerate set defined by the vanishing of particular functions of the derivatives of f(x,0) at x=0. In an abstract way this degenerate set may be thought of a subset of the set of k-jets of functions at the origin. This subset is a real algebraic variety which is a union of smooth submanifolds (strata) of different dimensions which fit together in a nice way. The set defining the bifurcation itself is also a variety of this type and the degenerate set consists of all strata of the bifurcation set except that of highest dimension. The family f(x,\alpha) induces a mapping from the parameter space into the set of k-jets. The second type of genericity condition says that this mapping should be transverse to the bifurcation set. Because of the genericity condition of the first type the image of x=0 lies in the stratum of highest dimension in the bifurcation set, which is a smooth submanifold. The transversality condition says that the sum of the tangent space to this manifold and the image of the linearization at x=0 of the mapping defined by f is the whole space. Writing about these things gives me a strange feeling since they involve concepts which I used in a very different context in my PhD, which I submitted 34 years ago, and not very often since then.

The following refers to a two-dimensional system. In the Bogdanov-Takens case the bifurcation set is defined by the condition that the linearization about (0,0) has a double zero eigenvalue. The genericity condition of the first type involves the 2-jet. The first part BT.0 (cf. this post for the terminology) says that the linearization should not be identically zero. The second part consists of conditions BT.1 and BT.2 involving the second derivatives. The remaining condition BT.3 is the transversality condition (genericity condition of the second type). Now I want to study bifurcations which satisfy the eigenvalue condition for a Bogdanov-Takens bifurcation and the condition BT.0 but which may fail to satisfy BT.1 or BT.2. At one point the literature on the subject appeared to me like a high wall which I did not have the resolve to try to climb. Fortunately Sebastiaan Janssens pointed me to a paper of Kuznetsov (Int. J. Bif. Chaos, 15, 3535) which was a gate through the wall and gave me access to things on the other side which I use in the following discussion. Following Dumortier et al. (Erg. Th. Dyn. Sys. 7, 375) we look at cases where the 2-jet of the right hand side of the equation for \dot y is of the form \alpha x^2+\beta xy. It turns out that we can divide the problem into two cases, in each of which one of the coefficients \alpha and \beta is zero and the other non-zero. We first concentrate on the case \beta=0 which is that studied in the paper just quoted. There it is referred to as the cusp case. It is the case where BT.2 is satisfied and BT.1 fails. As explained in the book of Dumortier et al. (Bifurcations of planar vector fields. Nilpotent singularites and Abelian integrals.) it is useful to further subdivide the case \alpha=0 into three subcases, known as the saddle, focus and elliptic cases. These cases give rise to different normal forms (partially conjectural). A bifurcation diagram for the cusp case can be found in the paper of Dumortier et al. Since we are dealing with a codimension 3 bifurcation the bifurcation diagram should be three-dimensional. It turns out, however, that the bifurcation set has a conical structure near the origin, so that its structure is determined by its intersection with any small sphere near the origin. This gives rise to a two-dimensional object which can be well represented in the plane. Note, however, that passing from the sphere to the plane necessarily involves discarding a ‘point at infinity’. It is this intersection which is represented in Fig. 3 of the paper of Dumortier et al. That this is an accurate representation of the bifurcation diagram in this case is the main content of the paper of Dumortier et al.

In the bifurcation diagram in the paper of Dumortier et al. we see that there are two Bogdanov-Takens points b_1 and b_2 which are joined by a curve of Hopf bifurcations. The points b_1 and b_2 are stated to be codimension 2 and this indicates that they are non-degenerate Bogdanov-Takens points. There is one exceptional point c_2 on the curve of Hopf bifurcations which is stated to be of codimension 2. This indicates that it is a non-degenerte Bautin bifurcation point. This in turn implies that one of the Bogdanov-Takens points is supercritical and the other subcritical. Thus in this case there exist both stable and unstable periodic solutions in each neighbourhood of the bifurcation point. This indicates that the strategy outlined at the beginning of this post can in principle work. Whether it can work in practise in a given system depends on how difficult it is to check the relevant non-degeneracy conditions. I end with a brief comment on the case \alpha=0. The book of Dumortier et al. presents conjectured bifurcation diagrams for all subcases but does not claim complete proofs that they are all correct. In all cases we have two Bogdanov-Takens points joined by a curve of Hopf bifurcations on thich there is precisely one Bautin point. Thus in this sense we have the same configuration as in the case \beta=0. In a discussion of these matters in the paper of Kuznetsov mentioned above (which is from 2005) it is stated that a complete proof is only available in the saddle case. I do not know if there has been any further progress on this since then.

A model for the Calvin cycle with diffusion

June 29, 2021

In the past I have spent a lot of time thinking about mathematical models for the Calvin cycle of photosynthesis. These were almost exclusively systems of ordinary differential equations. One thing which was in the back of my mind was a system involving diffusion written down in a paper by Grimbs et al. and which I will describe in more detail in what follows. Recently Burcu Gürbüz and I have been looking at the dynamics of solutions of this system in detail and we have just produced a preprint on this. The starting point is a system of five ODE with mass action kinetics written down by Grimbs et al. and which I call the MA system (for mass action). That system has only one positive steady state and on the basis of experimental data the authors of that paper were expecting more than one. To get around this discrepancy they suggested a modified system where ATP is introduced as an additional variable and the diffusion of ATP is included in the model. I call this the MAd system (for mass action with diffusion). Diffusion of the other five chemical species is not included in the model. The resulting system can be thought of as a degenerate system of six reaction reaction-diffusion equations or as a system where five ODE are coupled to one (non-degenerate) diffusion equation. The idea was that the MAd system might have more than one steady state, with an inhomogenous steady state in addition to the known homogeneous one. Experiments which measure only spatial averages would not detect the inhomogeneity. To be relevant for experiments the steady states should be stable.

For simplicity we consider the model in one space dimension. The spatial domain is then a finite closed interval and Neumann boundary conditions are imposed for the concentration of ATP. We prove that for suitable values of the parameters in the model there exist infinitely many smooth inhomogeneous steady states. It turns out that all of these are (nonlinearly) unstable. This is not a special feature of this system but in fact, as pointed out in a paper of Marciniak-Czochra et al. (J. Math. Biol. 74, 583), it is a frequent feature of systems where ODE are coupled to a diffusion equation. This can be proved using a method discussed in a previous post which allows nonlinear instability to be concluded from spectral instability. We prove the spectral instability in our example. There may also exist non-smooth inhomogeneous steady states but we did not enter into that theme in our paper. If stable inhomogeneous steady states cannot be used to explain the experimental observations what alternatives are available which still keep the same model? If experiments only measure time averages then an alternative would be limit sets other than steady states. In this context it would be interesting to know whether the system has spatially homogeneous solutions which are periodic or converge to a strange attractor. A preliminary investigation of this question in the paper did not yield definitive results. With the help of computer calculations we were able to show that it can happen that the linearization of the (spatially homogeneous) system about a steady state has non-real eigenvalues, suggesting the presence of at least damped oscillations. We proved global existence for the full system but we were not able to show whether general solutions are bounded in time, not even in L^1. It can be concluded that there are still many issues to be investigated concerning the long-time behaviour of solutions of this system.

From spectral to nonlinear instability, part 2

May 21, 2021

Here I want to extend the discussion of nonlinear instability in the previous post to the infinite-dimensional case. To formulate the problem fix a Banach space X. If the starting point is a system of PDE then finding the right X might be a non-trivial task and the best choice might depend on which application we have in mind. We consider an abstract equation of the form \frac{du}{dt}=Au+F(u). Here t\mapsto u(t) is a mapping from an interval to X. A is a linear operator which may be unbounded. In fact the case I am most interested in here is that where A is the generator of a strongly continuous semigroup of linear operators on X. Hence A may not be globally defined on X. It will be assumed that F is globally defined on X. This is not a restriction in the applications I have in mind although it might be in other interesting cases. In this setting it is necessary to think about how a solution should be defined. In fact we will define a solution of the above equation with initial value v to be a continuous solution of the integral equation u(t)=e^{tA}v+\int_0^t e^{(t-\tau)A}F(u(\tau))d\tau. Here e^{tA} is not to be thought of literally as an exponential but as an element of the semigroup generated by A. When A is bounded then e^{tA} is really an exponential and the solution concept used here is equivalent to usual concept of solution for an ODE in a Banach space. Nonlinear stability can be defined just as in the finite-dimensional case, using the norm topology. In general the fact that a solution remains in a fixed ball as long as it exists may not ensure global existence, in contrast to the case of finite dimension. We are attempting to prove instability, i.e. to prove that solutions leave a certain ball. Hence it is convenient to introduce the convention that a solution which ceases to exist after finite time is deemed to have left the ball. In other words, when proving nonlinear instability we may assume that the solution u being considered exists globally in the future. What we want to show is that there is an \epsilon>0 such that for any \delta>0 there are solutions which start in the ball of radius \delta about x_0 and leave the ball of radius \epsilon about that point.

The discussion which follows is based on a paper called ‘Spectral Condition for Instability’ by Jalal Shatah and Walter Strauss. At first sight their proof looks very different from the proof I presented in the last post and here I want to compare them, with particular attention to what happens in the finite-dimensional case. We want to show that the origin is nonlinearly unstable under the assumption that the spectrum of A intersects the half of the complex plane where the real part is positive. In the finite-dimensional case this means that A has an eigenvalue with positive real part. The spectral mapping theorem relates this to the situation where the spectrum of e^A intersects the exterior of the unit disk. In the finite-dimensional case it means that there is an eigenvalue with modulus \mu greater than one. We now consider the hypotheses of the Theorem of Shatah and Strauss. The first is that the linear operator A occurring in the equation generates a strongly continuous semigroup. The second is that the spectrum of e^A meets the exterior of the unit disk. The third is that in a neighbourhood of the origin the nonlinear term can be estimated by a power of the norm greater than one. The proof of nonlinear instability is based on three lemmas. We take e^\lambda to be a point of the spectrum of e^A whose modulus is equal to the spectral radius. In the finite-dimensional case this would be an eigenvalue and we would consider the corresponding eigenvector v. In general we need a suitable approximate analogue of this. Lemma 1 provides for this by showing that e^\lambda belongs to the approximate point spectrum of e^A. Lemma 2 then shows that there is a v whose norm grows at a rate no larger than e^{{\rm Re}\lambda t} and is such that the norm of the difference between e^Av and e^{{\rm Re}\lambda t}v can be controlled. Lemma 3 shows that the growth rate of the norm of e^A is close to e^{{\rm Re}\lambda t}. In the finite-dimensional case the proofs of Lemma 1 and Lemma 2 are trivial. In Lemma 3 the lower bound is trivial in the finite-dimensional case. To get the upper bound in that case a change of basis can be used to make the off-diagonal entries in the Jordan normal form as small as desired. This argument is similar to that used to treat A_c in the previous post. In the theorem we choose a ball B such that instability corresponds to leaving it. As long as a solution remains in that ball the nonlinearity is under good control. The idea is to show that as long as the norm of the initial condition is sufficiently small the contribution of the nonlinear term on the right hand side of the integral equation will remain small compared to that coming from the linearized equation, which is growing at a known exponential rate. The details are complicated and are of course the essence of the proof. I will not try to explain them here.

My COVID-19 vaccination

May 14, 2021

In Germany vaccination against COVID-19 has been going slowly compared to some other places. One of the leading vaccines was developed by Biontech, a company based in Mainz, which is where I live. Due to this and my impression that that vaccine is very good my first choice would have been to get the Biontech vaccine. In the end things turned out differently. The Biontech vaccine is the one which has been given most often in Germany but the number of vaccinations have been limited by the available supply. The vaccine developed by AstraZenenca has also been used here but has acquired a bad reputation. There were several reasons for that. One is the bad relations of the company to the media. Another is the occurrence of rare side effects involving thromboses. A third is that the publicly quoted efficiency of preventing the illness is much less for AstraZeneca than for Biontech (80% compared to 95%). A fourth is that the waiting time between the first and second injections is longer for AstraZeneca (12 instead of 6 weeks). Despite these things my wife and I decided to get vaccinated with the product from AstraZeneca and had our first injections today. Here I want to explain why we did so and discuss some more general related issues.

I already mentioned that AstraZeneca became unpopular. In connection with the thromboses it was only recommended by the Paul Ehrlich Institute (the official body for such things in Germany) for people over sixty. The argument was that since people over sixty had a higher probability of serious consequences if they got infected the net benefit was positive for them. Doctors were having problems getting rid of their doses. However the situation is very dynamic. Although AstraZeneca is still not recommended for people under 60 (like myself) and it is not administered to them in the official vaccination centres doctors (GP’s and specialists) are allowed to vaccinate people under 60 with that product if they inform them about the risks. They are also allowed to reduce the waiting time between the injections. These political moves have led to many people (particularly young people) wanting to get vaccinated with AstraZeneca. The result is that now instead of being hard to get rid of the AstraZeneca vaccine is scarce. Thus now there is a lot of competition and the whole process is quite chaotic. We were lucky that my wife was offered a vaccination with AstraZeneca at a time when this seemed like asking for a favour rather than offering to do a favour. She asked if I could also be vaccinated by the same doctor and got a positive answer. This was our good fortune.

It was still not clear to us whether we should accept the offer. It is in the nature of the situation that many things concerning the disease and the vaccination are simply not known. On the other hand the amount of information available is huge. All this makes a rational decision difficult. From my point of view the thromboses were so rare that they did not influence my decision at all. The combination of the earlier availability of AstraZeneca for us with the longer waiting time meant that it was not clear which option (taking AstraZeneca now or waiting for Biontech) would lead to being fully immunized sooner. We did not want to shorten the interval for the AstraZeneca vaccination (which our doctor would have accepted) for a reason I will discuss later. A reason for making the choice we did was that this gave the feeling of finally making some progress. In the comparison of the efficiencies of the two products it is important to know that if the criterion considered is the risk of serious illness or hospitalization then according to the official figures AstraZeneca is at least as good as Biontech (both at least 95%). In any case, it was a good feeling to know that we had received one dose of the vaccine. A few hours later I have not noticed any side-effects whatsoever.

Let me return to the question of shortening the time between the injections for AstraZeneca. There is some evidence that this decreases the efficiency of the vaccinations and I have read about two possible mechanisms for this which seem to me at least plausible. The first is connected with the phenomenon of affinity maturation. The first B cells which are activated in response to a certain antigen can undergo somewhat random mutations. This means that the population of B cells becomes rather heterogeneous. The different variants then compete with each other in such a way that those which bind most strongly to the antigen come to dominate the population. In this way the quality of the antibodies is increased. If a second vaccination is given too soon it can interrupt the affinity maturation initiated by the first vaccination before the antibodies have been optimized. The second mechanism is as follows. The immune reponse against the antigen remains active for some time after the vaccination. If second vaccination is given while that is still the case then the antibodies generated in the first vaccination can bind to the vector viruses coming from the second vaccination and prevent them from achieving their intended purpose. These are not established facts but I prefer to have plausible hypotheses than a complete lack of a possible explanation.

From spectral to nonlinear instability

April 28, 2021

An important property of a steady state of a dynamical system is its stability. Let x(t) be the state of the system at time t and let x_0 be a steady state. For a system of ODE these are points in Euclidean space while for more general dynamical systems they are functions which can be thought of as points in suitably chosen function spaces. In general it may be useful to have more than one function space in mind when considering a given dynamical system. First I will concentrate on the ODE case. It is possible to linearize the system about x_0 to get a linear system \dot y=Ay. The steady state x_0 is said to be linearly stable when the origin is stable for the linearized system. Since the linear system is simpler than the nonlinear one we would ideally like to be able to use linear stability as a criterion for nonlinear stability. In general the relation between linear and nonlinear stability is subtle even for ODE. We can go a step further by trying to replace linear stability by spectral stability. There are relations between eigenvalues of A with positive real parts and unstable solutions of the linearized system. Again there are subtleties. Nevertheless there are two simple results about the relation between spectral stability and nonlinear stability which can be proved for ODE. The first is that if there is any eigenvalue of A with positive real part then x_0 is nonlinearly unstable. The second is that if all eigenvalues of A have negative real parts then x_0 is nonlinearly stable, in fact asymptotically stable. These two results are far from covering all situations of interest but at least they do define a comfortable region which is often enough. In what follows I will only consider the first of these two results, the one asserting instability.

Up to this point I have avoided giving precise definitions. So what does nonlinear instability of x_0 mean? It means that there is a neighbourhood U of x_0 such that for each neighbourhood W of x_0 there is a solution satisfying x(0)\in W and x(t)\notin U for some t>0. In other words, there are solutions which start arbitrarily close to x_0 and do not stay in U. How can this be proved? One way of doing so is to use a suitable monotone function V defined on a neighbourhood of x_0. This function should be continuously differentiable and satisfy the conditions that V(x_0)=0, V(x)>0 for x\ne x_0 and \dot V>0 for x\ne x_0. Here \dot V is the rate of change of V along the solution. Let \epsilon be sufficiently small so that the closed ball \overline{B_\epsilon (x_0)} is contained in the domain of definition of V. We will take this ball to be the neighbourood U in the definition of instability. Let M be the maximum of V on \overline{B_\epsilon (x_0)}. Thus in order to show that a solution leaves U it is enough to show that V exceeds M. Consider any solution which starts at a point of V other than x_0 for t=0. The set where V(x)<V(x_0) is open and the solution can never enter it for t>0. The intersection of its complement with U is compact. Thus \dot V has a positive minimum there. As long as the solution does not leave U we have \dot V(x(t))\ge m. Hence V(t)\ge V(0)+mt. This implies that if the solution remains in U for all t>0 then V(x(t)) eventually exceeds M, a contradiction. This result can be generalized as follows. Let Z be an open set such that x_0 is contained in its closure. Suppose that we have a function V which vanishes on the part of the boundary of Z intersecting U and for which \dot V>0 on Z except at x_0. Then x_0 is nonlinearly unstable with a proof similar to that just given.

Now it will be shown that if A has an eigenvalue with positive real part a function V with the desired properties exists. We can choose coordinates so that the steady state is at the origin and that the stable, centre and unstable subspaces at the origin are coordinate subspaces. The solution can be written in the form (x,y,z) where these three variables are the projections on the three subspaces. Then A is a direct sum of matrices A_+, A_{\rm c} and A_-, whose eigenvalues have real parts which are positive, zero and negative respectively. It can be arranged by a choice of basis in the centre subspace that the symmetric part of A_c is as small as desired. It can also be shown that because of the eigenvalue properties of A_+ there exists a positive definite matrix B_+ such that A_+^TB_++B_+A_+=I. For the same reason there exists a positive definite matrix B_- such that A_-^TB_-+B_-A_-=-I. Let V=x^TB_+x-y^Ty-z^TB_-z. Then \dot V=x^Tx+z^Tz-y^T(A_c^T+A)y+o(x^Tx+y^Ty+z^Tz). The set U is defined by the condition V>0. There y^Ty\le Cx^Tx for a positive constant C. On this region \dot V\ge\frac12(x^Tx+z^Tz)+o(|x|^2+|z|^2), where we profit from the special basis of the centre subspace mentioned earlier. The quadratic term in y which does not have a good sign has been absorbed in the quadratic term in x which does. This completes the proof of nonlinear instability. As they stand these arguments do not apply to the infinite-dimensional case since compactness has been used freely. A discussion of the infinite-dimensional case will be postponed to a later post.

Strongly continuous semigroups

March 19, 2021

An evolution equation is an equation for a function u(t,x) depending on a time variable t and some other variables x which can often be thought of as spatial variables. There is also the case where there are no variables x, which is the one relevant for ordinary differential equations. We can reinterpret the function u as being something of the form u(t)(x), a function of x which depends on t. I am thinking of the case where u(t,x) takes its values in a Euclidean space. Then u(t) should be thought of as taking values in a function space. Different regularity assumptions on the solutions naturally lead to different choices of function space. Suppose, for instance, I consider the one-dimensional heat equation u_t=u_{xx}. Then I could choose the function space to be C^0, the space of continuous functions, C^2, the space of twice continuously differentiable functions or L^p. For some choices of function spaces we are forced to consider weak solutions. It is tempting to consider the evolution equation as an ODE in a function space. This can have advantages but also disadvantages. It gives us intuition which can suggest ideas but the analogues of statements about ODE often do not hold for more general evolution equations, in particular due to loss of compactness. (In what follows the function space considered will always be a Banach space.) We can formally write the equation of the example as u_t=Au for a linear operator A. If we choose the function space to be X=L^p then this operator cannot be globally defined, since A does not map from X to itself. This leads to the consideration of unbounded operators from X to X. This is a topic which requires care with the use of language and the ideas which we associate to certain formulations. An unbounded operator from X to X is a linear mapping from a linear subspace D(A), the domain of A, to X. Just as there may be more than one space X which is of interest for a given evolution equation there may be more than one choice of domain which is of interest even after the space has been chosen. To take account of this an unbounded operator is often written as a pair (A, D(A)). In the example we could for instance choose D(A) to be the space of C^\infty functions or the Sobolev space W^{2,p}.

In the finite-dimensional case we know the solution of the equation u_t=Au with initial datum u_0. It is u(t)=e^{tA}u_0. It is tempting to keep this formula even when A is unbounded, but it must then be supplied with a suitable interpretation. There are general ways of defining nonlinear functions of unbounded linear operators using spectral theory but here I want to pursue another direction, which uses a kind of axiomatic approach to the exponential function S(t)=e^{tA}. It should have the property that S(0)=I and it should satisfy the semigroup property S(s+t)=S(s)S(t) for all non-negative s and t. It remains to require some regularity property. One obvious possibility would be to require that S is a continuous function from [0,\infty) into the space of bounded operators L(X) with the topology defined by the operator norm. Unfortunately this is too much. Let us define an operator Ax=\lim_{t\to 0}\frac{S(t)x-x}{t} whenever this limit exists in X and D(A) to be the linear subspace for which it exists. In this way we get an unbounded operator on a definite domain. The problem with the continuity assumption made above is that it implies that D(A)=X. In other words, if the operator A is genuinely unbounded then this definition cannot apply. In particular it cannot apply to our example. It turns out that the right assumption is that \|S(t)x-x\|\to 0 for t\to 0 and any x\in D(X). This leads to what is called a strongly continuous one-parameter semigroup. A is called the infinitesimal generator of S(t). Its domain is dense and it is a closed operator, which means that its graph is a closed subset (in fact linear subspace) of the product X\times X with the topology defined by the product norm. In a case like the example above the problem with continuity is only at t=0. The solution of the heat equation is continuous in t in any reasonable topology on any reasonable Banach space for t>0 but not for t=0. In fact it is even analytic for t>0, something which is typical for linear parabolic equations.

In this discussion we have said how to start with a semigroup S(t) and get its generator (A,D(A)) but what about the converse? What is a criterion which tells us for a given operator (A,D(A)) that it is the generator of a semigroup? A fundamental result of this type is the Hille-Yosida theorem. I do not want to go into detail about this and related results here. I will just mention that it has to do with spectral theory. It is possible to define the spectrum of an unbounded operator as a generalization of the eigenvalues of a matrix. The complement of the spectrum is called the resolvent set and the resolvent is (A-\lambda I)^{-1}, which is a bounded operator. The hypotheses made on the generator of a semigroup concern the position of the resolvent set in the complex plane and estimates on the norm of the resolvent at infinity. In this context the concept of a sectorial operator arises.

My interest in these topics comes from an interest in systems of reaction-diffusion equations of the form u_t+D\Delta u=f(u). Here u is vector-valued, D is a diagonal matrix with non-negative elements and the Laplace operator is to be applied to each component of u. I have not found it easy to extract the results I would like to have from the literature. Part of the reason for this is that I am interested in examples where not all the diagonal elements of D are positive. That situation might be described as a degenerate system of reaction diffusion equations or as a system of reaction-diffusion equations coupled to a system of ODE. In that case a lot of results are not available ‘off the shelf’. Therefore to obtain an understanding it is necessary to penetrate into the underlying theory. One of the best sources I have found is the book ‘Global Solutions of Reaction-Diffusion Systems’ by Franz Rothe.

Dynamics of the activation of Lck

January 26, 2021

The enzyme Lck (lymphocyte-associated tyrosine kinase) is of central importance in the function of immune cells. I hope that mathematics can contribute to the understanding and improvement of immune checkpoint therapies for cancer. For this reason I have looked at the relevant mathematical models in the literature. In doing this I have realized the importance in this context of obtaining a better understanding of the activation of Lck. I was already familiar with the role of Lck in the activation of T cells. There are two tyrosines in Lck, Y394 and Y505, whose phosphorylation state influences its activity. Roughly speaking, phosphorylation of Y394 increases the activity of Lck while phosphorylation of Y505 decreases it. The fact that these are influences going in opposite directions already indicates complications. In fact the kinase Lck catalyses its own phosphorylation, especially on Y394. This is an example of autophosphorylation in trans, i.e. one molecule of Lck catalyses the phosphorylation of another molcule of Lck. It turns out that autophosphorylation tends to favour complicated dynamics. It is already the case that in a protein with a single phosphorylation site the occurrence of autophosphorylation can lead to bistability. Normally bistability in a chemical reaction network means the existence of more than one stable positive steady state and this is the definition I usually adopt. The definition may be weakened to the existence of more than one non-negative stable steady state. That autophosphorylation can produce bistability in this weaker sense was already observed by Lisman in 1985 (PNAS 82, 3055). He was interested in this as a mechanism of information storage in a biochemical system. In 2006 Fuss et al. (Bioinformatics 22, 158) found bistability in the strict sense in a model for the dynamics of Src kinases. Since Lck is a typical member of the family of Src kinases these results are also of relevance for Lck. In that work the phosphorylation processes are embedded in feedback loops. In fact the bistability is present without the feedback, as observed by Kaimachnikov and Kholodenko (FEBS J. 276, 4102). Finally, it was shown by Doherty et al. (J. Theor. Biol. 370, 27) that bistability (in the strict sense) can occur for a protein with only one phosphorylation site. This is in contrast to more commonly considered phosphorylation systems. These authors have also seen more complicated dynamical behaviour such as periodic solutions.

All these results on the properties of solutions of reaction networks are on the level of simulations. Recently Lisa Kreusser and I set out to investigate these phenomena on the level of rigorous mathematics and we have just put a paper on this subject on the archive. The model developed by Doherty et al. is one-dimensional and therefore relatively easy to analyse. The first thing we do is to give a rigorous proof of bistability for this system together with some information on the region of parameter space where this phenomenon occurs. We also show that it can be lifted to the system from which the one-dimensional system is obtained by timescale separation. The latter system has no periodic solutions. To obtain bistability the effect of the phosphorylation must be to activate the enzyme. It does not occur in the case of inhibition. We show that when an external kinase is included (in the case of Lck there is an external kinase Csk which may be relevant) and we do not restrict to the Michaelis-Menten regime bistability is restored.

We then go on to study the dynamics of the model of Kaimachnikov and Kholodenko, which is three-dimensional. These authors mention that it can be reduced to a two-dimensional model by timescale separation. Unfortunately we did not succeed in finding a rigorous framework for their reduction. Instead we used another related reduction which gives a setting which is well-behaved in the sense of geometric singular perturbation theory (normally hyperbolic) and can therefore be used to lift dynamical features from two to three dimensions in a rather straightforward way. It then remains to analyse the two-dimensonal system. It is easy to deduce bistability from the results already mentioned. We go further and show that there exist periodic and homoclinic solutions. This is done by showing the existence of a generic Bogdanov-Takens bifurcation, a procedure described more generally here and here. This system contains an abundance of parameters and the idea is to fix these so as to get the desired result. First we choose the coordinates of the steady state to be fixed simple rational numbers. Then we fix all but four of the parameters in the system. The four conditions for a BT bifurcation are then used to determine the values of the other four parameters. To get the desired positivity for the computed values the choices must be made carefully. This was done by trial and error. Establishing genericity required a calculation which was complicated but doable by hand. When a generic BT bifurcation has been obtained it follows that there are generic Hopf bifurcations nearby in parameter space and the nature of these (sub- or supercritical) can be determined. It turns out that in our case they are subcritical so that the associated periodic solutions are unstable. Having proved that periodic solutions exist we wanted to see what a solution of this type looks like by simulation. We had difficulties in finding parameter values for which this could be done. (We know the parameter values for the BT point and that those for the periodic solution are nearby but they must be chosen in a rather narrow region which we do not know explicitly.) Eventually we succeeded in doing this. In this context I used the program XPPAUT for the first time in my life and I learned to appreciate it. I see this paper as the beginning rather than the end of a path and I am very curious as to where it will lead.

The Bogdanov-Takens bifurcation, part 2

December 30, 2020

This post continues the discussion in the previous post on this subject. It turns out that the (exact) normal form is a system whose dynamics are not easy to analyse. The bifurcation diagram can be found as Figure 8.8 in the book of Kuznetsov. The case presented there is the supercritical one. The diagram can easily be converted into that describing the subcritical case, the most important thing being that the time direction is reversed so that the types of stability are interchanged. The bifurcation point is the origin in the plane of the parameters (\beta_1,\beta_2). The plane can be written as the union of four open regions R_i separated by bifurcation curves. These regions are distinguished by the types of dynamical behaviour which occur for the given values of the parameters. R_1 is simplest, having a trivial dynamics, without steady states or periodic solutions. The boundary of R_1 is a smooth curve T passing through the origin. If the origin is deleted the remainder consists of two connected components T_- and T_+. The curve T_- is the boundary between R_1 and R_2. There a fold bifurcation takes place, producing a saddle point and a sink. The region R_2 is that between T_- and the negative part of \beta_2 axis, which we call H. On that axis a Hopf bifurcation takes place. The sink turns into a source and a stable periodic solution is born. This is a supercritical Hopf bifurcation. For the other sign of the parameter s there is an unstable periodic solution and the Hopf bifurcation is subcritical. This is the reason that we have also adopted the terms sub- and supercritical in the case of a BT bifurcation. (Note that the sign of s can be computed from the parameters a_{ij} and b_{ij}.) On the other side of H we come to R_3. In that region there is a saddle point and the periodic solution already mentioned. It is the unique periodic solution for given values of the parameters. The other boundary of R_3 is a curve C where a non-local bifurcation takes place. On the curve there is an orbit which is homoclinic to the saddle. On the other side of C is R_4 where the homoclinic loop has broken and the qualitative behaviour is similar to that in R_2 except that there is a source instead of a sink. From R_4 we can pass through T_+ and return to R_1 by means of a fold bifurcation. It is perhaps worth mentioning what happens on the upper half of the \beta_2 axis. There is no bifurcation there but the two eigenvalues of the linearization are equal and opposite (a neutral saddle). This type of situation may be dangerous when looking for Hopf bifurcations because it shares certain algebraic properties with a Hopf point without being one. For this reason it could make sense to call it a ‘false Hopf point’, which is my internal name for it.

The proofs of the statements about the homoclinic orbit and the periodic solution are hard and I will just make a few comments on that. There is a subtle rescaling which, in particular, ensures that the two steady states are a fixed distance apart, independent of the parameters. Doing this transformation brings the system into the form of a perturbation of a Hamiltonian system. In the latter case the homoclinic orbit is obtained as a level surface of the Hamiltonian. This perturbation can be analysed by a method due to Pontryagin. Use is made of elliptic functions. These things are sketched in some detail by Kuznetsov in an appendix. He also mentions an alternative method for proving the uniqueness of the periodic orbit due to Dumortier and Rousseau. Concerning the fact that the approximate normal form can be transformed to the exact normal form Kuznetsov only gives a very brief sketch of the proof. Combining what is known about the qualitative behaviour of the solutions in normal form and the fact that any system satisfying the Bogdanov-Takens conditions can be reduced to this normal form provides a method for proving the existence and stability of periodic solutions and the existence of a homoclinic loop in given dynamical systems. The genericity conditions can be easier to check than for a Hopf bifurcation since they are conditions on the second derivatives rather than on the third derivatives. How can it be that it is easier to analyse a more complicated bifurcation than a simpler one? The point is that the hardest work is hidden in the proofs of the theorems about the normal form and does not have to be repeated when analysing a concrete system. The use of a BT bifurcation to help prove the existence of a Hopf bifurcation is connected to the idea of an organizing centre. The idea is to obtain insight into a dynamical system by looking for points with extremely special properties. In a given system a point like this of a given type may not exist but when it exists it may be easier to find than a point with a lower degree of speciality, even if the latter occurs more commonly. For instance we can look for a BT point instead of looking for a Hopf point. This type of strategy may be especially useful in systems where there are many parameters over which there is a lot of control. It gives a way of focussing the search in the parameter space. It is the opposite of the situation where you feel that you are looking for something you expect to be plentiful but do not know where to start.