Archive for the ‘analysis’ Category

Matched asymptotic expansions and the Kaplun-Lagerstrom model

December 13, 2022

In a post I wrote a long time ago I described matched asymptotic expansions as being like black magic. Now I have understood some more about how to get from there to rigorous mathematics. My main guide in doing so has been Chapter 7 of the book ‘Classical Methods in Ordinary Differential Equations’ by Hastings and McLeod. There they give an extensive treatment of a model problem by Kaplun and Lagerstrom. The ultimate source of this is work of Stokes on hydrodynamics around 1850. In his calculations he found some paradoxical phenomena. Roughly speaking, attempting to obtain an asymptotic expansion for a solution led to inconsistencies. These things remained a mystery for many years. A big step forward came in the work of Kaplun and Lagerstrom in 1957. There they introduced an ODE model which, while having no direct physical interpretation, provides a relatively simple mathematical context in which to understand these phenomena. It is this model problem which is treated in detail by Hastings and McLeod. The model is a boundary value problem for the equation y''+\frac{n-1}{r}y'+\epsilon yy'=0. We look for a solution with y(1)=0 and \lim_{r\to\infty}y(r)=1. The first two terms look like the expression for the Laplacian of a spherically symmetric function in n dimensions and for this reason the motivation is strong to look at the cases n=2 and n=3 which are vaguely related to fluid flow around a cylinder and flow around a sphere, respectively. It turns out that the case n=2 is a lot harder to analyse than the case n=3. When n=3 the problem has a unique solution for \epsilon>0. We would like to understand what happens to this solution as \epsilon\to 0. It is possible to find an asymptotic expansion in \epsilon but it is not enough to use powers of \epsilon when building the expansion. There occurs a so-called switchback term containing \log\epsilon. This is a singular limit although the parameter in the equation only occurs in a lower order term. This happens because the equation is defined on a non-compact region.
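
As a concrete illustration, here is a minimal numerical sketch (my own, not from the book) of the boundary value problem for n=3. The condition at infinity is approximated by imposing y(R)=1 at a large truncated radius R, and the explicit solution 1-1/r of the case \epsilon=0 is used as an initial guess; all numerical parameter values are arbitrary choices of the sketch.

```python
import numpy as np
from scipy.integrate import solve_bvp

# Kaplun-Lagerstrom problem for n = 3: y'' + (2/r)y' + eps*y*y' = 0,
# y(1) = 0, y(r) -> 1 as r -> infinity. The condition at infinity is
# approximated here by y(R) = 1 at a large truncated radius R.
n, eps, R = 3, 0.1, 500.0

def rhs(r, y):
    # y[0] = y, y[1] = y'
    return np.vstack([y[1], -(n - 1) / r * y[1] - eps * y[0] * y[1]])

def bc(ya, yb):
    return np.array([ya[0], yb[0] - 1.0])

r = np.geomspace(1.0, R, 2000)
guess = np.vstack([1.0 - 1.0 / r, 1.0 / r**2])  # exact solution for eps = 0
sol = solve_bvp(rhs, bc, r, guess, max_nodes=100000)
print(sol.status, sol.sol(2.0)[0])  # y(2), to compare with 1 - 1/2
```

Comparing such numerical solutions with the leading term 1-1/r for a decreasing sequence of values of \epsilon is one way of seeing numerically that the correction is larger than O(\epsilon).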

Consider the case n=3. In applying matched asymptotic expansions to this problem the first step is to do a straightforward (formal) expansion of the equation in powers of \epsilon. This gives differential equations for the expansion coefficients. At order zero there is no problem solving the equation with the desired boundary conditions. At order one this changes and it is not possible to implement the desired boundary condition at infinity. This has to do with the fact that in the correct asymptotic expansion the second term is not of order \epsilon but of order \epsilon\log\epsilon. This extra term is the switchback term. Up to this point all this is formal. One method of obtaining rigorous proofs for the asymptotics is to use GSPT (geometric singular perturbation theory), as done in two papers of Popovic and Szmolyan (J. Diff. Eq. 199, 290 and Nonlin. Anal. 59, 531). There is an introduction to this work in the book but I felt the need to go deeper and I looked at the original papers as well. To fit the notation of those papers I replace y by u. Reducing the equation to first order by introducing v=u' as a new variable leads to a non-autonomous system of two equations. Introducing \eta=1/r as a new dependent variable and using it to eliminate r from the right hand side of the equations in favour of \eta leads to an autonomous system of three equations, written out explicitly below. This allows the original problem to be reformulated in the following geometric way. The u-axis consists of steady states. The point (1,0,0) is denoted by Q. The aim is to find a solution which starts at a point of the form (0,v,1) and tends to Q as r\to\infty. A solution of this form for \epsilon small and positive is to be found by perturbation of a corresponding solution in the case \epsilon=0. For \epsilon>0 the centre manifold of Q is two-dimensional and given explicitly by v=0. In the case \epsilon=0 it is more degenerate and has an additional zero eigenvalue. To prove the existence of the desired connecting orbit we may note that for \epsilon>0 this is equivalent to showing that the manifold consisting of solutions starting at points of the form (0,v,1) and the manifold consisting of solutions converging to Q intersect. The first of these manifolds is obviously a deformation of a manifold for \epsilon=0. We would like the corresponding statement for the second manifold. This is difficult to get because of the singularity of the limit. To overcome this, \epsilon is introduced as a new dynamical variable and a suitable blow-up is carried out near the u-axis. In this way it is possible to get to a situation where there are two manifolds which exist for all \epsilon\ge 0 and depend smoothly on \epsilon. They intersect for \epsilon=0 and in fact do so transversely. It follows that they also intersect for \epsilon small and positive. What I have said here only scratches the surface of this subject, but it indicates the direction in which progress can be made, and that is a fundamental insight.
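
Written out explicitly, with ' denoting d/dr, the autonomous system just mentioned is u'=v, v'=-(n-1)\eta v-\epsilon uv, \eta'=-\eta^2, where the last equation holds because \eta=1/r implies \eta'=-1/r^2=-\eta^2. In these coordinates the steady states are exactly the points with v=\eta=0, i.e. the u-axis, the condition y(1)=0 becomes the statement about starting points of the form (0,v,1), and the condition at infinity becomes convergence to Q=(1,0,0).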


The shooting method for ordinary differential equations

November 25, 2022

A book which has been sitting on a shelf in my office for some years without getting much attention is ‘Classical Methods in Ordinary Differential Equations’ by Hastings and McLeod. Recently I read the first two chapters of the book and I found the material both very pleasant to read and enlightening. A central theme of those chapters is the shooting method. I have previously used the method more than once in my own research. I used it together with Juan Velázquez to investigate the existence of self-similar solutions of the Einstein-Vlasov system as discussed here. I used it together with Pia Brechmann to prove the existence of unbounded oscillatory solutions of the Selkov system for glycolysis as discussed here. This method is associated with the idea of a certain type of numerical experiment. Suppose we consider a first order ODE with initial datum x(0)=\alpha. Suppose that for some value \alpha_1 the solution tends to +\infty for t\to\infty and for another value \alpha_2>\alpha_1 the solution tends to -\infty. Then we might expect that in between there is a value \alpha^* for which the solution is bounded. We could try to home in on the value of \alpha^* by a bisection method. What I am interested in here is a corresponding analytical procedure which sometimes provides existence theorems.
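
To make the numerical experiment concrete, here is a minimal sketch in Python for the equation x'=x^3+\sin t discussed in the next paragraph. The time horizon T and the escape threshold K are arbitrary choices of the sketch, and 'bounded' only means bounded on [0,T].

```python
import numpy as np
from scipy.integrate import solve_ivp

# Shooting by bisection for x' = x^3 + sin(t): solutions with large
# positive x(0) escape upwards, those with large negative x(0) escape
# downwards; we bisect on x(0) to approximate the bounded solution.
T, K = 20.0, 10.0

def up(t, x): return x[0] - K
def down(t, x): return x[0] + K
up.terminal = down.terminal = True

def classify(alpha):
    sol = solve_ivp(lambda t, x: [x[0]**3 + np.sin(t)], (0.0, T), [alpha],
                    events=[up, down], rtol=1e-8)
    if sol.t_events[0].size > 0: return +1   # escaped upwards
    if sol.t_events[1].size > 0: return -1   # escaped downwards
    return 0                                 # stayed bounded on [0, T]

a, b = -2.0, 2.0                             # classify(a) = -1, classify(b) = +1
for _ in range(60):
    mid = 0.5 * (a + b)
    if classify(mid) >= 0: b = mid           # treat 'bounded' as the upper side
    else: a = mid
print("bounded solution starts near x(0) =", 0.5 * (a + b))
```

The value found approximates the initial datum of a solution bounded on [0,T]. The analytical procedure described in the next paragraph replaces this experiment by a topological argument: the sets of initial data for which the solution escapes upwards or downwards are open and disjoint, so they cannot cover the whole interval.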

In the book the procedure is explained in topological terms. We consider a connected parameter space and a property P. Let A be the subset where P does not hold. If we can show that A=A_1\cup A_2 where A_1 and A_2 are non-empty, open and disjoint (A_1\cap A_2=\emptyset) then A is disconnected and so cannot be the whole of the parameter space. Hence there is at least one point in the complement of A and at that point property P holds. The most common case is where the parameter space is an interval in the real numbers. For some authors this is the only case where the term ‘shooting method’ is used. In the book it is used in a more general sense, which might be called multi-parameter shooting. The book discusses a number of cases where this type of method can be used to get an existence theorem. The first example is to show that x'=-x^3+\sin t has a periodic solution. In fact this is related to the Brouwer fixed point theorem specialised to dimension one (which of course is elementary to prove). The next example is to show that x'=x^3+\sin t has a periodic solution. After that this is generalised to the case where \sin t is replaced by an arbitrary bounded continuous function on [0,\infty) and we look for a bounded solution. The next example is a kind of forced pendulum equation x''+\sin x=f(t) and the aim is to find a solution which is at the origin at two given times. In the second chapter a wide variety of examples is presented, including those just mentioned, and used to illustrate a number of general points. The key point in a given application is to find a good choice for the subsets. There is also a discussion of two-parameter shooting and its relation to the topology of the plane. This has a very different flavour from the arguments I am familiar with. It is related to Wazewski’s theorem (which I never looked at before) and the Conley index. The latter is a subject which has crossed my path a few times in various guises but where I never really developed a warm relationship. I did spend some time looking at Conley’s book. I found it nicely written but so intense as to require more commitment than I was prepared to make at that time. Perhaps the book of Hastings and McLeod can provide me with an easier way to move in that direction.

The centre manifold theorem and its proof

September 2, 2022

In my research I have often used centre manifolds but I have not thoroughly studied the proof of their existence. The standard reference I have quoted for this topic is the book of Carr. The basic statement is the following. Let \dot x=f(x) be a dynamical system on R^n and x_0 a point with f(x_0)=0. Let A=Df(x_0). Then R^n can be written as a direct sum of invariant subspaces of A, V_-\oplus V_c\oplus V_+, such that the real parts of the eigenvalues of the restrictions of A to these subspaces are negative, zero and positive, respectively. V_c is the centre subspace. The centre manifold theorem states that there exists an invariant manifold of the system passing through x_0 whose tangent space at x_0 is equal to V_c. This manifold, which is in general not unique, is called a centre manifold for the system at x_0. Theorem 1 on p. 16 of the book of Carr is a statement of this type. I want to make two comments on this theorem. The first is that Carr states and proves the theorem only in the case that the subspace V_+ is trivial, although he states vaguely that this restriction is not necessary. The other concerns the issue of regularity. Carr assumes that the system is C^2 and states that the centre manifold obtained is also C^2. In the book of Perko on dynamical systems the result is stated in the general case with the regularity C^r for any r\ge 1. No proof is given there. Perko cites a book of Guckenheimer and Holmes and one of Ruelle for this but as far as I can see neither of them contains a proof of this statement. Looking through the literature, the question of what order of differentiability is required to get a result, and of whether the regularity which comes out is equal to that which goes in or a bit less, seems to be in a quite chaotic state. Frustrated by this situation, I finally found on a trip to the library what I now see as the best source. This is a book called ‘Normal forms and bifurcations of planar vector fields’ by Chow, Li and Wang. Despite the title it offers an extensive treatment of the existence theory in any (finite) dimension and proves, among other things, the result stated by Perko. I feel grateful to those authors for their effort.
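
A standard example showing the non-uniqueness is the planar system \dot x=x^2, \dot y=-y. The centre subspace at the origin is the x-axis and along orbits dy/dx=-y/x^2, so for each constant c the function defined by h_c(x)=ce^{1/x} for x<0 and h_c(x)=0 for x\ge 0 is C^\infty and has an invariant graph tangent to the x-axis. Thus there is a one-parameter family of centre manifolds, all of which agree to infinite order at the origin.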

A general approach to proving the existence of a local centre manifold, which is what I am interested in here, is first to do a cut-off of the system and prove the existence of a global centre manifold for the cut-off system. It is unique and can be obtained as a fixed point of a contraction mapping. A suitable restriction of it is then the desired local centre manifold for the original system. Due to the arbitrariness involved in the cut-off the uniqueness gets lost in this process. A mapping whose fixed points correspond to (global) centre manifolds is described by Carr and is defined as follows. We look for the centre manifold as the graph of a function y=h(x). The cut-off is done only in the x variables. If a suitable function h is chosen then setting y=h(x) gives a system of ODE for x which we can solve with a prescribed initial value x_0 at t=0. Substituting the solution into the nonlinearity in the evolution equation for y defines a function of time. If this function were given we could solve the equation for y by variation of constants. A special solution is singled out by requiring that it vanishes sufficiently fast as t\to -\infty. This leads to an integral equation of the general form y=I(h); an explicit form is sketched below. If I(h)=h, i.e. h is a fixed point of the integral operator, then the graph of h is a centre manifold. It is shown that when certain parameters in the problem are chosen correctly (small enough) this mapping is a contraction in a suitable space of Lipschitz functions. Proving higher regularity of the manifold defined by the fixed point requires more work and this is not presented by Carr. As far as I can see the arguments he does present in the existence proof nowhere use that the system is C^2 and it would be enough to assume C^1 for them to work. It is only necessary to replace O(|z|^2) by o(|z|) in some places.
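
To make the operator explicit (in a sketch whose notation is mine rather than Carr's): write the cut-off system as \dot x=A_cx+f(x,y), \dot y=By+g(x,y), where the eigenvalues of A_c have zero real parts, those of B have negative real parts and f and g vanish to higher order at the origin. Given h, let x(s) denote the solution of \dot x=A_cx+f(x,h(x)) with x(0)=x_0. The special solution singled out by the decay condition as t\to-\infty is then given by variation of constants as (I(h))(x_0)=\int_{-\infty}^0 e^{-sB}g(x(s),h(x(s)))ds. The integral converges because e^{-sB} decays as s\to-\infty, the eigenvalues of B having negative real parts, and because the cut-off keeps the nonlinearity bounded. A fixed point h=I(h) has an invariant graph tangent to the centre subspace.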

Lyapunov-Schmidt reduction and stability

August 30, 2022

I discussed the process of Lyapunov-Schmidt reduction in a previous post. Here I give an extension of that to treat the question of stability. I again follow the book of Golubitsky and Schaeffer. My interest in this question has the following source. Suppose we have a system of ODE \dot y+F(y,\alpha)=0 depending on a parameter \alpha. The equation for steady state solutions is then F(y,\alpha)=0. Sometimes we can eliminate all but one of the variables to obtain an equation g(x,\alpha)=0 for a real variable x whose solutions are in one to one correspondence with those of the original equation for steady states. Clearly this situation is closely related to Lyapunov-Schmidt reduction in the case where the linearization has corank one. Often the reduced equation is much easier to treat than the original one and this can be used to obtain information on the number of steady states of the ODE system. This can be used to study multistationarity in systems of ODE arising as models in molecular biology. In that context we would like more refined information related to multistability. In other words, we would like to know something about the stability of the steady states produced by the reduction process. Stability is a dynamical property and so it is not a priori clear that it can be investigated by looking at the equation for steady states on its own. Different ODE systems can have the same set of steady states. Note, however, that in the case of hyperbolic steady states the stability of a steady state is determined by the eigenvalues of the linearization of the function F at that point. Golubitsky and Schaeffer prove the following remarkable result. (It seems clear that results of this kind were previously known in some form but I did not yet find an earlier source with a clear statement of this result free from many auxiliary complications.) Suppose that we have a bifurcation point (y_0,\alpha_0) where the linearization of F has a zero eigenvalue of multiplicity one and all other eigenvalues have negative real part. Let x_0 be the corresponding zero of g. The result is that if g(x)=0 and g'(x)\ne 0 then for x close to x_0 the linearization of F about the corresponding steady state has a unique eigenvalue close to zero and its sign is the same as that of g'(x). Thus the stability of steady states arising at the bifurcation point is determined by the function g.
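
A toy example, where the reduction is trivial because there is only one variable and g can be taken to be F itself: let F(y,\alpha)=y^3-\alpha y. Then g'(x)=3x^2-\alpha, which equals -\alpha at the steady state x=0 and 2\alpha at the steady states x=\pm\sqrt{\alpha} existing for \alpha>0. These values are exactly the eigenvalues of the linearization of F at those points, as the result predicts, and with the sign convention \dot y=-F(y,\alpha) they say that for \alpha>0 the origin is unstable while the two bifurcating steady states are stable, a supercritical pitchfork.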

I found the proof of this theorem hard to follow. I can understand the individual steps but I feel that I am still missing a global intuition for the strategy used. In this post I describe the proof and present the partial intuition I have for it. Close to the bifurcation point the unique eigenvalue close to zero, call it \mu, is a smooth function of x and \alpha because it is of multiplicity one. The derivative g' is also a smooth function. The aim is to show that they have the same sign. This would be enough to prove the desired stability statement. Suppose that the gradient of g' at x_0 is non-zero. Then the zero set of g' is a submanifold in a neighbourhood of x_0. It turns out that \mu vanishes on that manifold. If we could show that the gradient of \mu is non-zero there then it would follow that the sign of \mu off the manifold is determined by that of g'. With suitable sign conventions they are equal and this is the desired conclusion. The statement about the vanishing of \mu is relatively easy to prove. Differentiating the basic equations arising in the Lyapunov-Schmidt reduction shows that the derivative of F applied to the gradient of a function \Omega arising in the reduction process is zero. Thus the derivative of F has a zero eigenvalue and this eigenvalue can only be \mu. For by the continuous dependence of eigenvalues no other eigenvalue can come close to zero in a neighbourhood of the bifurcation point.

After this the argument becomes more complicated since in general the gradients of g' and \mu could be zero. This is got around by introducing a deformation of the original problem depending on an additional parameter \beta and letting \beta tend to zero at the end of the day to recover the original problem. The deformed problem is defined by the function \tilde F(y,\alpha,\beta)=F(y,\alpha)+\beta y. Lyapunov-Schmidt reduction is applied to \tilde F to get a function \tilde g. Let \tilde\mu be the eigenvalue of D\tilde F which is analogous to the eigenvalue \mu of DF. From what was said above it follows that, in a notation which is hopefully clear, \tilde g_x=0 implies \tilde\mu=0. We now want to show that the gradients of these two functions are non-zero. Lyapunov-Schmidt theory includes a formula expressing \tilde g_{x\beta} in terms of F. This formula allows us to prove that \tilde g_{x\beta}(0,0,0)=\langle v_0^*,v_0\rangle>0. Next we turn to the gradient of \tilde \mu, more specifically to the derivative of \tilde \mu with respect to \beta. First it is proved that \tilde\Omega (0,0,\beta)=0 for all \beta. I omit the proof, which is not hard. Differentiating \tilde F and evaluating at the point (0,0,\beta) shows that v_0 is an eigenvector of D\tilde F there with eigenvalue \beta. Hence \tilde\mu (0,0,\beta)=\beta for all \beta. Putting these facts together shows that \tilde\mu (\tilde \Omega,0,\beta)=\beta and the derivative of \tilde\mu (\tilde \Omega,0,\beta) with respect to \beta at the bifurcation point is equal to one.

We now use the following general fact. If f_1 and f_2 are two smooth functions, f_2 vanishes whenever f_1 does and the gradients of both functions are non-zero then f_2/f_1 extends smoothly to the zero set of f_1 and the value of the extension there is given by the ratio of the gradients (which are necessarily proportional to each other). In our example we get \tilde\mu(\tilde\Omega,\alpha,\beta)=\tilde a(x,\alpha,\beta)\tilde g_x(x,\alpha,\beta) with \tilde a(0,0,0)=[\frac{\partial\tilde\mu}{\partial\beta}(\tilde\Omega,0,0)]/[\tilde g_{x\beta}(0,0,0)]>0. Setting \beta=0 in the first equation then gives the desired conclusion.

Persistence and permanence of dynamical systems

February 10, 2022

In mathematical models for chemical or biological systems the unknowns, which might be concentrations or populations, are naturally positive quantities. It is of interest to know whether in the course of time these quantities may approach zero in some sense. For example in the case of ecological models this is related to the question whether some species might become extinct. The notions of persistence and permanence are attempts to formulate precise concepts which could be helpful in discussing these questions. Unfortunately the use of these terms in the literature is not consistent.

I first consider the case of a dynamical system with continuous time defined by a system of ODE \dot x=f(x). The system is defined on the closure \bar E of a set E which is either the positive orthant in Euclidean space (the subset where all coordinates are positive) or the intersection of that set with an affine subspace. The second case is included to allow the imposition of fixed values of some conserved quantities. Let \partial E be the boundary of E, where in the second case we mean the boundary when E is considered as a subset of the affine subspace. Suppose that for any x_0\in \bar E there is a unique solution x(t) with x(0)=x_0. If x_0\in E let c_1(x_0)=\liminf_{t\to\infty} d(x(t),\partial E), where x(t) is the solution with initial datum x_0 and d is the Euclidean distance. Let c_2(x_0)=\limsup_{t\to\infty} d(x(t),\partial E). A first attempt to define persistence (PD1) says that it means that c_1(x_0)>0 for all x_0\in E. Similarly, a first attempt to define uniform persistence (PD2) is to say that it means that there is some m>0 such that c_1(x_0)\ge m for all x_0\in E. The system may be called weakly persistent (PD3) if c_2(x_0)>0 for all x_0\in E but this concept will not be considered further in what follows. A source of difficulties in dealing with definitions of this type is that there can be a mixing between what happens at zero and what happens at infinity. In a system where all solutions are bounded PD1 is equivalent to the condition that no positive solution has an \omega-limit point in \partial E. Given the kind of examples I am interested in I prefer to only define persistence for systems where all solutions are bounded and then use the definition formulated in terms of \omega-limit points. In that context it is equivalent to PD1. The term permanence is often used instead of uniform persistence. I prefer the former term and I prefer to define it only for systems where the solutions are uniformly bounded at late times, i.e. there exists a constant M such that all components of all solutions are bounded by M for times greater than some t_0, where t_0 might depend on the solution considered. Then permanence is equivalent to the condition that there is a compact set K\subset E such that any positive solution satisfies x(t)\in K for t\ge t_0. A criterion for permanence which is sometimes useful will now be stated without proof. If a system whose solutions are uniformly bounded at late times has the property that \overline{\omega(E)}\cap\partial E=\emptyset then it is permanent. Here \omega(E) is the \omega-limit set of E, i.e. the union of the \omega-limit sets of solutions starting at points of E. If there is a point in \overline{\omega(E)}\cap\partial E then there is a solution through that point on an interval (-\epsilon,\epsilon) which is non-negative. For some systems points like this can be ruled out directly and this gives a way of proving that they are permanent.
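
A classical example, not mentioned above, which separates PD1 from PD2 is the Lotka-Volterra predator-prey system \dot x=x(a-by), \dot y=y(cx-d) with positive parameters. The function H(x,y)=cx-d\log x+by-a\log y is constant along solutions, so every positive solution moves on a closed level curve of H around the steady state (d/c,a/b), all solutions are bounded and c_1(x_0)>0 for every x_0\in E, i.e. PD1 holds. On the other hand the level curves come arbitrarily close to the boundary as the value of H increases, so that there is no uniform lower bound m and PD2 fails.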

Let \phi:[0,\infty)\times \bar E\to \bar E be the flow of the dynamical system, i.e. \phi(t,x_0) is the value at time t of the solution with x(0)=x_0. The time-one map F of the system is defined as F(x)=\phi(1,x). Its powers define a discrete semi-dynamical system with flow \psi(n,x)=F^n(x)=\phi(n,x), where n is a non-negative integer. For a discrete system of this type there is an obvious way to define \omega-limit points, persistence and permanence in analogy to what was done in the case with continuous time. For the last two we make suitable boundedness assumptions, as before. It is then clear that if the system of ODE is persistent or permanent its time-one map has the corresponding property. What is more interesting is that the converse statements also hold. Suppose that the time-one map is persistent. For a given x\in E let K be the closure of the set of points F^n(x), n\ge 0. This is a compact set and because of persistence it lies in E. Let K'=\phi (K\times [0,1]). This is also a compact subset of E. If t is sufficiently large then \phi([t],x)\in K, where [t] is the largest integer smaller than t. This implies that \phi(t,x)\in K' and proves that the system of ODE is persistent. A similar argument shows that if the time-one map is permanent the system of ODE is permanent. We only have to start with the compact set K provided by the permanence of the time-one map.

Hopf bifurcations and Lyapunov-Schmidt theory

January 1, 2022

In a previous post I wrote something about the existence proof for Hopf bifurcations. Here I want to explain another proof which uses Lyapunov-Schmidt reduction. This is based on the book ‘Singularities and Groups in Bifurcation Theory’ by Golubitsky and Schaeffer. I like it because of the way it puts the theorem of Hopf into a wider context. The starting point is a system of the form \dot x+f(x,\alpha)=0. We would like to reformulate the problem in terms of periodic functions on an interval. By a suitable normalization of the problem it can be supposed that the linearized problem at the bifurcation point has periodic solutions with period 2\pi. Then the periodic solutions whose existence we wish to prove will have period close to 2\pi. To be able to treat these in a space of functions of period 2\pi we do a rescaling using a parameter \tau. The rescaled equation is (1+\tau)\dot x+f(x,\alpha)=0 where \tau should be thought of as small. A periodic solution of the rescaled equation of period 2\pi corresponds to a periodic solution of the original system of period 2\pi/(1+\tau). Let X_1 be the Banach space of periodic C^1 functions on [0,2\pi] with the usual C^1 norm and X_0 the analogous space of continuous functions. Define a mapping \Phi:X_1\times{\bf R}\times{\bf R}\to X_0 by \Phi(x,\alpha,\tau)=(1+\tau)\dot x+f(x,\alpha). Note that it is equivariant under translations of the argument. The periodic solutions we are looking for correspond to zeroes of \Phi. Let A be the linearization of f at the origin. The linearization of \Phi at (0,0,0) is Ly=\frac{dy}{dt}+Ay. The operator L is a (bounded) Fredholm operator of index zero.

Golubitsky and Schaeffer prove the following facts about this operator. The dimension of its kernel is two. It has a basis such that the equivariant action already mentioned acts on it by rotation. X_0 has an invariant splitting as the direct sum of the kernel and range of L. Moreover X_1 has a splitting as the sum of the kernel of L and the intersection M of the range of L with X_1. These observations provide us with the type of splitting used in Lyapunov-Schmidt reduction. We obtain a reduced mapping \phi:{\rm ker}L\times{\bf R}\times{\bf R}\to {\rm ker}L and it is equivariant with respect to the action by translations. The special basis already mentioned allows this mapping to be written in a concrete form as a mapping {\bf R}^2\times{\bf R}\times{\bf R}\to {\bf R}^2. It can be written as a linear combination of the vectors with components [x,y] and [-y,x] where the coefficients are of the form p(x^2+y^2,\alpha,\tau) and q(x^2+y^2,\alpha,\tau). These functions satisfy the conditions p(0,0,0)=0, q(0,0,0)=0, p_\tau(0,0,0)=0, q_\tau(0,0,0)=-1. It follows that \phi is only zero if either x=y=0 or p=q=0. The first case corresponds to the steady state solution at the bifurcation point. The second case corresponds to 2\pi-periodic solutions which are non-constant if z=x^2+y^2>0. By a rotation we can reduce to the case y=0, x\ge 0. Then the two cases correspond to x=0 and p(x^2,\alpha,\tau)=q(x^2,\alpha,\tau)=0. The equation q(x^2,\alpha,\tau)=0 can be solved in the form \tau=\tau (x^2,\alpha), using the implicit function theorem and the fact that q_\tau(0,0,0)\ne 0. Let r(z,\alpha)=p(z,\alpha,\tau(z,\alpha)) and g(x,\alpha)=r(x^2,\alpha)x. Then \phi(x,y,\alpha,\tau)=0 has solutions with x^2+y^2>0 only if \tau=\tau(x^2+y^2,\alpha). All zeroes of \phi can be obtained from zeroes of g. This means that we have reduced the search for periodic solutions to the search for zeroes of the function g or for those of the function r.
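
For orientation, the outcome of the reduction can be summarized in a single formula: \phi(x,y,\alpha,\tau)=p(x^2+y^2,\alpha,\tau)[x,y]+q(x^2+y^2,\alpha,\tau)[-y,x]. Since the two vectors are orthogonal and have the same length, |\phi|^2=(p^2+q^2)(x^2+y^2), which makes the dichotomy just used transparent: \phi vanishes exactly when x=y=0 or p=q=0.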

If the derivative r_\alpha(0,0) is non-zero then it follows from the implicit function theorem that we can write \alpha=\mu(x^2) and there is a one-parameter family of solutions. There are two eigenvalues of D_xf of the form \sigma(\alpha)\pm i\omega(\alpha) where \sigma and \omega are smooth, \sigma (0)=0 and \omega (0)=1. It turns out that r_\alpha (0,0)=\sigma'(0), which provides the link between the central hypothesis of the theorem of Hopf and the hypothesis needed to apply the implicit function theorem in this situation. The equation r=0 is formally identical to that for a pitchfork bifurcation, i.e. a cusp bifurcation with reflection symmetry. The second non-degeneracy condition is r_z(0,0)\ne 0. It is related to the non-vanishing of the first Lyapunov number.

Hilbert’s sixteenth problem

September 14, 2021


The International Congress of Mathematicians takes place every four years and is the most important mathematical conference. In 1900 the conference took place in Paris and David Hilbert gave a talk where he presented a list of 23 mathematical problems which he considered to be of special importance. This list has been a programme for the development of mathematics ever since. The individual problems are famous and are known by their numbers in the original list. Here I will write about the 16th problem. In fact the problem comes in two parts and I will say nothing about one part, which belongs to the domain of algebraic geometry. Instead I will concentrate exclusively on the other, which concerns dynamical systems in the plane. Consider two equations \frac{dx}{dt}=P(x,y), \frac{dy}{dt}=Q(x,y), where P and Q are polynomial. Roughly speaking, the problem is concerned with the question of how many periodic solutions this system has. The simple example P(x,y)=y, Q(x,y)=-x shows that there can be infinitely many periodic solutions and that the precise question has to be a little different. A periodic solution is called a limit cycle if there is another solution which converges to the image of the first as t\to\infty. The real issue is how many limit cycles the system can have. The first question is whether for a given system the number N of limit cycles is always finite. A second is whether an inequality of the form N\le H(n) holds, where H(n) depends only on the degree n of the polynomials. H(n) is called the Hilbert number. Linear systems have no limit cycles so that H(1)=0. Until recently it was not known whether H(2) was finite. A third question is to obtain an explicit bound for H(n). The first question was answered positively by Écalle (1992) and Ilyashenko (1991), independently.
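
As a concrete example of a limit cycle in this polynomial setting, consider the van der Pol system P(x,y)=y, Q(x,y)=-x+(1-x^2)y, which is of degree three. It has exactly one limit cycle and every non-constant solution other than the cycle itself converges to it, so that N=1 for this system; the question is how large N can become for a given degree.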

The subject has a long and troubled history. Already before 1900 Poincaré was interested in this problem and gave a partial solution. In 1923 Dulac claimed to have proved that the answer to the first question was yes. In a paper in 1955 Petrovskii and Landis claimed to have proved that H(2)=3 and that H(n)\le P_3(n) for a particular cubic polynomial P_3. Both claims were false. As shown by Ilyashenko in 1982 there was a gap in Dulac’s proof. Thus for almost sixty years there was essentially no progress on this problem. Écalle worked on it intensively for a long time. For this reason he produced few publications and almost lost his job. At this point I pause to give a personal story. Some years ago I was invited to give a talk in Bayrischzell in the context of the elite programme in mathematical physics of the LMU in Munich. The programme of such an event includes two invited talks, one from outside mathematical physics (which in this case was mine) and one in the area of mathematical physics (which in this case was on a topic from string theory). In the second talk the concept of resurgence, which was invented by Écalle in his work on Hilbert’s sixteenth problem, played a central role. I see this as a further proof of the universality of mathematics.

The basic idea of the argument of Dulac was as follows. If there are infinitely many limit cycles then we can expect that they will accumulate somewhere. A point where they accumulate will lie on a periodic solution, a homoclinic orbit or a heteroclinic cycle. Starting at a nearby point and following the solution until it is near to the original point leads to a Poincaré mapping of a transversal. In the case of a periodic limiting solution this mapping is analytic. If there are infinitely many limit cycles the fixed points of the Poincaré mapping accumulate. It follows by the identity theorem for analytic functions that this mapping is equal to the identity, contradicting the limit cycle nature of the solutions concerned. Dulac wanted to use a similar argument in the other two cases. Unfortunately in those cases the Poincaré mapping is not analytic. What we need is a class of functions which on the one hand is general enough to include the Poincaré mappings in this situation and on the other hand cannot have an accumulating set of fixed points without being equal to the identity. This is very difficult and is where Dulac made his mistake.

What about the second question? It has been known since 1980 that H(2)\ge 4. There is an example with three limit cycles which surround a common steady state and a fourth which does not. It is problematic to find (or to portray) these with a computer since three of them are very small and one is very large. More about the history can be found in an excellent article of Ilyashenko (Centennial history of Hilbert’s 16th problem. Bull. Amer. Math. Soc. 39, 301) which was the main source for what I have written. In March 2021 a preprint by Pablo Pedregal appeared (http://arxiv.org/pdf/2103.07193.pdf) where he claimed to have answered the second question. I feel that some caution is necessary in accepting this claim. A first reason is the illustrious history of mistakes in this field. A second is that Pedregal himself produced a related preprint with Llibre in 2014 which seems to have contained a mistake. The new preprint uses techniques which are far away from those usually applied to dynamical systems. On the one hand this gives some plausibility that it might contain a really new idea. On the other hand it makes it relatively difficult for most people coming from dynamical systems (including myself) to check the arguments. Can anyone out there tell me more?


A model for the Calvin cycle with diffusion

June 29, 2021

In the past I have spent a lot of time thinking about mathematical models for the Calvin cycle of photosynthesis. These were almost exclusively systems of ordinary differential equations. One thing which was in the back of my mind was a system involving diffusion written down in a paper by Grimbs et al., which I will describe in more detail in what follows. Recently Burcu Gürbüz and I have been looking at the dynamics of solutions of this system in detail and we have just produced a preprint on this. The starting point is a system of five ODE with mass action kinetics written down by Grimbs et al. and which I call the MA system (for mass action). That system has only one positive steady state and on the basis of experimental data the authors of that paper were expecting more than one. To get around this discrepancy they suggested a modified system where ATP is introduced as an additional variable and the diffusion of ATP is included in the model. I call this the MAd system (for mass action with diffusion). Diffusion of the other five chemical species is not included in the model. The resulting system can be thought of as a degenerate system of six reaction-diffusion equations or as a system where five ODE are coupled to one (non-degenerate) diffusion equation. The idea was that the MAd system might have more than one steady state, with an inhomogeneous steady state in addition to the known homogeneous one. Experiments which measure only spatial averages would not detect the inhomogeneity. To be relevant for experiments the steady states should be stable.

For simplicity we consider the model in one space dimension. The spatial domain is then a finite closed interval and Neumann boundary conditions are imposed for the concentration of ATP. We prove that for suitable values of the parameters in the model there exist infinitely many smooth inhomogeneous steady states. It turns out that all of these are (nonlinearly) unstable. This is not a special feature of this system but in fact, as pointed out in a paper of Marciniak-Czochra et al. (J. Math. Biol. 74, 583), it is a frequent feature of systems where ODE are coupled to a diffusion equation. This can be proved using a method discussed in a previous post which allows nonlinear instability to be concluded from spectral instability. We prove the spectral instability in our example. There may also exist non-smooth inhomogeneous steady states but we did not enter into that theme in our paper. If stable inhomogeneous steady states cannot be used to explain the experimental observations what alternatives are available which still keep the same model? If experiments only measure time averages then an alternative would be limit sets other than steady states. In this context it would be interesting to know whether the system has spatially homogeneous solutions which are periodic or converge to a strange attractor. A preliminary investigation of this question in the paper did not yield definitive results. With the help of computer calculations we were able to show that it can happen that the linearization of the (spatially homogeneous) system about a steady state has non-real eigenvalues, suggesting the presence of at least damped oscillations. We proved global existence for the full system but we were not able to show whether general solutions are bounded in time, not even in L^1. It can be concluded that there are still many issues to be investigated concerning the long-time behaviour of solutions of this system.

From spectral to nonlinear instability, part 2

May 21, 2021

Here I want to extend the discussion of nonlinear instability in the previous post to the infinite-dimensional case. To formulate the problem fix a Banach space X. If the starting point is a system of PDE then finding the right X might be a non-trivial task and the best choice might depend on which application we have in mind. We consider an abstract equation of the form \frac{du}{dt}=Au+F(u). Here t\mapsto u(t) is a mapping from an interval to X. A is a linear operator which may be unbounded. In fact the case I am most interested in here is that where A is the generator of a strongly continuous semigroup of linear operators on X. Hence A may not be globally defined on X. It will be assumed that F is globally defined on X. This is not a restriction in the applications I have in mind although it might be in other interesting cases. In this setting it is necessary to think about how a solution should be defined. In fact we will define a solution of the above equation with initial value v to be a continuous solution of the integral equation u(t)=e^{tA}v+\int_0^t e^{(t-\tau)A}F(u(\tau))d\tau. Here e^{tA} is not to be thought of literally as an exponential but as an element of the semigroup generated by A. When A is bounded then e^{tA} is really an exponential and the solution concept used here is equivalent to the usual concept of solution for an ODE in a Banach space. Nonlinear stability can be defined just as in the finite-dimensional case, using the norm topology. In general the fact that a solution remains in a fixed ball as long as it exists may not ensure global existence, in contrast to the case of finite dimension. We are attempting to prove instability, i.e. to prove that solutions leave a certain ball. Hence it is convenient to introduce the convention that a solution which ceases to exist after finite time is deemed to have left the ball. In other words, when proving nonlinear instability we may assume that the solution u being considered exists globally in the future. What we want to show is that there is an \epsilon>0 such that for any \delta>0 there are solutions which start in the ball of radius \delta about x_0 and leave the ball of radius \epsilon about that point.

The discussion which follows is based on a paper called ‘Spectral Condition for Instability’ by Jalal Shatah and Walter Strauss. At first sight their proof looks very different from the proof I presented in the last post and here I want to compare them, with particular attention to what happens in the finite-dimensional case. We want to show that the origin is nonlinearly unstable under the assumption that the spectrum of A intersects the half of the complex plane where the real part is positive. In the finite-dimensional case this means that A has an eigenvalue with positive real part. The spectral mapping theorem relates this to the situation where the spectrum of e^A intersects the exterior of the unit disk. In the finite-dimensional case it means that there is an eigenvalue with modulus greater than one. We now consider the hypotheses of the theorem of Shatah and Strauss. The first is that the linear operator A occurring in the equation generates a strongly continuous semigroup. The second is that the spectrum of e^A meets the exterior of the unit disk. The third is that in a neighbourhood of the origin the nonlinear term can be estimated by a power of the norm greater than one. The proof of nonlinear instability is based on three lemmas. We take e^\lambda to be a point of the spectrum of e^A whose modulus is equal to the spectral radius. In the finite-dimensional case this would be an eigenvalue and we would consider the corresponding eigenvector v. In general we need a suitable approximate analogue of this. Lemma 1 provides for this by showing that e^\lambda belongs to the approximate point spectrum of e^A. Lemma 2 then shows that there is a v for which the norm of e^{tA}v grows at a rate no larger than e^{({\rm Re}\lambda)t} and for which the norm of the difference between e^{tA}v and e^{\lambda t}v can be controlled. Lemma 3 shows that the growth rate of the norm of e^{tA} is close to e^{({\rm Re}\lambda)t}. In the finite-dimensional case the proofs of Lemma 1 and Lemma 2 are trivial. In Lemma 3 the lower bound is trivial in the finite-dimensional case. To get the upper bound in that case a change of basis can be used to make the off-diagonal entries in the Jordan normal form as small as desired. This argument is similar to that used to treat A_c in the previous post. In the theorem we choose a ball B such that instability corresponds to leaving it. As long as a solution remains in that ball the nonlinearity is under good control. The idea is to show that as long as the norm of the initial condition is sufficiently small the contribution of the nonlinear term on the right hand side of the integral equation will remain small compared to that coming from the linearized equation, which is growing at a known exponential rate. The details are complicated and are of course the essence of the proof. I will not try to explain them here.

From spectral to nonlinear instability

April 28, 2021

An important property of a steady state of a dynamical system is its stability. Let x(t) be the state of the system at time t and let x_0 be a steady state. For a system of ODE these are points in Euclidean space while for more general dynamical systems they are functions which can be thought of as points in suitably chosen function spaces. In general it may be useful to have more than one function space in mind when considering a given dynamical system. First I will concentrate on the ODE case. It is possible to linearize the system about x_0 to get a linear system \dot y=Ay. The steady state x_0 is said to be linearly stable when the origin is stable for the linearized system. Since the linear system is simpler than the nonlinear one we would ideally like to be able to use linear stability as a criterion for nonlinear stability. In general the relation between linear and nonlinear stability is subtle even for ODE. We can go a step further by trying to replace linear stability by spectral stability. There are relations between eigenvalues of A with positive real parts and unstable solutions of the linearized system. Again there are subtleties. Nevertheless there are two simple results about the relation between spectral stability and nonlinear stability which can be proved for ODE. The first is that if there is any eigenvalue of A with positive real part then x_0 is nonlinearly unstable. The second is that if all eigenvalues of A have negative real parts then x_0 is nonlinearly stable, in fact asymptotically stable. These two results are far from covering all situations of interest but at least they do define a comfortable region which is often enough. In what follows I will only consider the first of these two results, the one asserting instability.

Up to this point I have avoided giving precise definitions. So what does nonlinear instability of x_0 mean? It means that there is a neighbourhood U of x_0 such that for each neighbourhood W of x_0 there is a solution satisfying x(0)\in W and x(t)\notin U for some t>0. In other words, there are solutions which start arbitrarily close to x_0 and do not stay in U. How can this be proved? One way of doing so is to use a suitable monotone function V defined on a neighbourhood of x_0. This function should be continuously differentiable and satisfy the conditions that V(x_0)=0, V(x)>0 for x\ne x_0 and \dot V>0 for x\ne x_0. Here \dot V is the rate of change of V along the solution. Let \epsilon be sufficiently small so that the closed ball \overline{B_\epsilon (x_0)} is contained in the domain of definition of V. We will take this ball to be the neighbourhood U in the definition of instability. Let M be the maximum of V on \overline{B_\epsilon (x_0)}. Thus in order to show that a solution leaves U it is enough to show that V exceeds M. Consider any solution which starts at a point of U other than x_0 at t=0. The set where V(x)<V(x(0)) is open and the solution can never enter it for t>0. The intersection of its complement with U is compact and does not contain x_0. Thus \dot V has a positive minimum m there. As long as the solution does not leave U we have \dot V(x(t))\ge m. Hence V(x(t))\ge V(x(0))+mt. This implies that if the solution remains in U for all t>0 then V(x(t)) eventually exceeds M, a contradiction. This result can be generalized as follows. Let Z be an open set such that x_0 is contained in its closure. Suppose that we have a function V which vanishes on the part of the boundary of Z intersecting U and for which \dot V>0 on Z except at x_0. Then x_0 is nonlinearly unstable with a proof similar to that just given.

Now it will be shown that if A has an eigenvalue with positive real part a function V with the desired properties exists. We can choose coordinates so that the steady state is at the origin and that the stable, centre and unstable subspaces at the origin are coordinate subspaces. The solution can be written in the form (x,y,z) where these three variables are the projections on the unstable, centre and stable subspaces respectively. Then A is a direct sum of matrices A_+, A_{\rm c} and A_-, whose eigenvalues have real parts which are positive, zero and negative respectively. It can be arranged by a choice of basis in the centre subspace that the symmetric part of A_c is as small as desired. It can also be shown that because of the eigenvalue properties of A_+ there exists a positive definite matrix B_+ such that A_+^TB_++B_+A_+=I. For the same reason there exists a positive definite matrix B_- such that A_-^TB_-+B_-A_-=-I. Let V=x^TB_+x-y^Ty-z^TB_-z. Then \dot V=x^Tx+z^Tz-y^T(A_c^T+A_c)y+o(x^Tx+y^Ty+z^Tz). The set Z of the generalization above is defined by the condition V>0. There y^Ty\le Cx^Tx for a positive constant C. On this region \dot V\ge\frac12(x^Tx+z^Tz)+o(|x|^2+|z|^2), where we profit from the special basis of the centre subspace mentioned earlier. The quadratic term in y which does not have a good sign has been absorbed in the quadratic term in x which does. This completes the proof of nonlinear instability. As they stand these arguments do not apply to the infinite-dimensional case since compactness has been used freely. A discussion of the infinite-dimensional case will be postponed to a later post.
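
The matrices B_\pm can be computed concretely by solving Lyapunov equations. Here is a minimal sketch; the matrix chosen for A_+ is an arbitrary illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Solve A_+^T B_+ + B_+ A_+ = I for B_+. Positive definiteness of B_+
# follows from standard Lyapunov theory because all eigenvalues of A_+
# have positive real parts. The matrix below is an arbitrary example.
A_plus = np.array([[1.0, 2.0],
                   [0.0, 3.0]])            # eigenvalues 1 and 3
B_plus = solve_continuous_lyapunov(A_plus.T, np.eye(2))

residual = A_plus.T @ B_plus + B_plus @ A_plus - np.eye(2)
print(np.linalg.norm(residual))            # ~ 0
print(np.linalg.eigvalsh(B_plus))          # all positive
```

The matrix B_- is obtained in the same way from the equation A_-^TB_-+B_-A_-=-I.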