Archive for the ‘dynamical systems’ Category

Universal deformations of bifurcations

May 3, 2024

In bifurcation theory the fundamental object is an equation of the form \dot x=f(x,\alpha) where x represents one or more unknowns and \alpha one or more parameters. We may also consider the static case f(x,\alpha)=0. In this post I will only be concerned with the latter case. To what extent these ideas can be extended to more genuinely dynamical phenomena I do not know and I will not try to discuss that question here. Here I will be concerned with local phenomena, i.e. those in small regions close to a point in (x,\alpha)-space. The meaning of locality here can be formalized using germs but I will not do so here. The discussion here is based on that in the book ‘Singularities in Groups and Bifurcation Theory’ by Golubitsky and Schaeffer. In the theory of dynamical systems the notion of (local) equivalence is often used. I will avoid the issue of the differentiability of the mappings involved but the aim is to work with smooth mappings whenever possible. If we have two bifurcation problems defined by mappings f(x,\alpha) and g(y,\beta) then equivalence means that there is are invertible transformations y=F(x,\alpha), \beta=G(\alpha) and a positive function p(x,\alpha) such that g(F(x,\alpha),G(\alpha))=p(x,\alpha)f(x,\alpha). Intuitively this means that the two bifurcations are qualitatively the same. A deformation of the bifurcation involves introducing extra parameters. The notational conventions used in the following are not the same as in the book. Consider now an object f(x,\alpha,\lambda) depending on additional parameters \lambda. We think of the previous function f as corresponding to f(x,\alpha,0) in the new notation. This new object is called an unfolding of the original bifurcation. If we have another unfolding we can consider the relation g(F(x,\alpha,\lambda),G(\alpha,\lambda),H(\lambda))=p(x,\alpha,\lambda) f(x,\alpha,\lambda) where now F, G and H are not required to be invertible. In this case it is said that g factors through f. Alternatively it could be said that f induces g. An unfolding f is called versal if every unfolding of the same bifurcation factors through f. If in addition f depends on the minimum number of parameters \lambda it is called universal and this number is called the codimension of the bifurcation. The intuitive idea is that a versal unfolding includes all bifurcations which can be obtained by small deformations of the original one and these can be obtained by fixing the values of the parameters \lambda. A universal deformation is in a sense a parametrization of the space of all bifurcations close to a given one and defines the codimension of the space of bifurcations equivalent to the original one in that bigger space.

In the context of the book of Golubitsky and Schaeffer the ideas related to universal deformations are to be applied to a system which is in general obtained by a reduction of dimension. The relevant reduction process is related to centre manifold reduction but because the static problem is being considered it is natural to do it in the context of Lyapunov-Schmidt reduction.

There is a geometrical way of looking at these things which is related to ideas I applied in my PhD thesis very many years ago in a very different context. I already mentioned this in a previous post. One important source for me at that time was the book ‘Stable Mappings and their Singularities’ by Guillemin and Golubitsky. We see that this book has an author in common with the one I cited above. Different types of bifurcation are characterized by algebraic conditions on the derivatives of the vector field defining them at a point. The collection of derivatives up to order k is the k-jet of the vector field at that point. When the point is varied we get a section of the bundle of k-jets of vector fields. There is a bifurcation subset where the first derivative is not invertible. Let us call it S for singularity. It can be thought of as a subset of the total space of the bundle of k-jets of vector fields which in fact is induced by a subset of the bundle of 1-jets of vector fields. It is a real algebraic variety. Like all such varieties it is a union of strata, smooth submanifolds which fit together in a nice way. These strata correspond to different types of bifurcations. The jet tranversality theorem implies that any k-jet of a vector field can be modified by an arbitrarily small perturbation so as to be transverse to all strata of the variety. In this language a versal deformation of a given bifurcation point is a mapping whose k-jet prolongation is transverse at that point to the stratum corresponding to that bifurcation. The codimension of the bifurcation is the codimension of that stratum.

One-dimensional centre manifolds, part 4

April 30, 2024

In this post I consider the case of the transcritical bifurcation, starting with the case of a one-dimensional dynamical system depending on one parameter \dot x=f(x,\alpha). I consider the situation where for structural reasons x=0 is always a steady state. Hence f(x,\alpha)=xg(x,\alpha) for a smooth function g. The equation g(x,\alpha)=0 could potentially have a bifurcation at the origin but here I specifically consider the case where it does not. The assumptions are g(0,0)=0 and g_x(0,0)\ne 0. Moreover it is assumed that g_\alpha(0,0)\ne 0. By the implicit function theorem the zero set of g is a function of \alpha. By the last assumption made the derivative of this function at zero is non-zero. The graph crosses both axes transversally at the origin. It will be supposed, motivated by intended applications, that g_x(0,0)<0. If g_\alpha<0 there exist positive steady states for \alpha>0 and if g_\alpha>0 there exist positive steady states with \alpha<0. The conditions which have been required of g can be translated to conditions on f. Of course f(0,0)=0 and f_\alpha (0,0)=0. Since g(0,0)=0 we also have f_x(0,0)=0 and the origin is a bifurcation point for f. The other two conditions on g translate to f_{xx}(0,0)\ne 0 and f_{\alpha x}(0,0)\ne 0. It follows from the implicit function theorem, applied to g that the bifurcation can be reduced by a coordinate transformation to the standard form \dot x=x(\alpha\pm x). Thus we see how many steady states there are for each value of \alpha and what their stability properties are.

My primary motivation for discussing this here is to throw light on the concept of backward bifurcation. At a bifurcation point of this kind there is a one-dimensional centre manifold. I want to explain the relation of general centre manifold theory to Theorem 4 of in the paper of van den Driessche and Watmough. Background for this discussion can be found here. It is explicitly assumed in the discussion leading up to the theorem that the centre manifold at the bifurcation point is one-dimensional. All but one of the non-zero eigenvalues of the linearization have negative real parts close to the bifurcation point. The remaining one changes sign as the bifurcation point is passed. The main idea is to reduce the bifurcation to the two-dimensional extended centre manifold at the bifurcation point. In equations (23) and (24) the key diagnostic quantities a and b for the bifurcation are defined. They are expressed in invariant form both in terms of the extrinsic dynamical system and in a form intrinsic to the centre manifold. In the coordinate form used above a=\frac12 f_{xx}(0,0) and b=f_{\alpha x}(0,0). Theorem 4 of the paper makes statements about the existence and stability of positive steady states which are essentially equivalent to those made above when it is taken into account that in the situation of the theorem the centre manifold is asymptotically stable with respect to perturbations in the transverse directions. It does not say more than that. The fact, often seen in pictures of backward bifurcation that the unstable branch undergoes a fold bifurcation is not included. In particular the fact that for some parameter values there exists more than one positive steady state is not included.

One dimensional centre manifolds, part 3

April 24, 2024

I now want to generalize the discussion relating the case that the leading order term in the dynamics on a centre manifold is quadratic and the generic fold bifurcation to a related case where one more derivative has been taken. So the aim is to relate the case that the leading order term in the dynamics on the centre manifold is cubic and the generic cusp bifurcation. To do this we express h_{xxx} in terms of derivatives of f^1. In fact we will only write out those terms explicitly which do not always vanish at the bifurcation point. The result is h_{xxx}(0,0,0)=f^1_{xxx}(0,0,0)+3\psi_{xx}(0,0)f^1_{xy}(0,0,0).
In this way it is possible to express the condition for the non-vanishing of the cubic term in the dynamics on the centre manifold by f^1_{xxx}(0,0,0)+3\psi_{xx}(0,0)f^1_{xy}(0,0,0)\ne 0. Note, however, that the expression f^1_{xy}(0,0,0) is not intrinsic to the centre manifold. Because of the vanishing of the first derivatives the second order derivatives \psi_{xx}(0,0) and f^1_{xy}(0,0) define tensors in the full dimension of the dynamical system. Hence we can write \psi_{xx}(0,0)f^1_{xy}(0,0,0) invariantly in the form \psi^i_{,jk}(0,0)f^m_{,il}(0,0,0)p_mq^jq^kq^l. It remains to compute \psi^i_{,jk}(0,0). The invariance of the centre manifold implies that f^i_{,j}\psi^j=f^i_{,jk}q^jq^k (x_1)^2+\ldots. Hence f^i_{,l}(0,0,0)\psi^l_{,jk}(0,0)q^jq^k=f^i_{,jk}(0,0,0)q^jq^k. This equation has a unique solution for \psi^l_{,jk}(0,0)q^jq^k. In this way one of the conditions for a generic cusp bifurcation can be expressed in an invariant way. This corresponds to (8.128) in the book of Kuznetsov. In trying to understand these things better I was confused by equation (5.28) in the book which should be almost the same as (8.128) but in fact contains a typo. There is a repetition of B(q,. For a generic cusp bifurcation we need two parameters \alpha_1 and \alpha_2. The genericity condition involving derivatives with respect to parameters can easily translated into an invariant form. The result is
p_if^i_{\alpha_1}(0,0,0)p_jf^j_{,k\alpha_2}(0,0,0)q^k-p_if^i_{\alpha_2}(0,0,0)p_jf^j_{,k\alpha_1}(0,0,0)q^k=0. Formulae similar to those just given were used in a paper I wrote with Juliette Hell some years ago (Nonlin. Analysis: RWA. 24, 175). Looking back at that I have the feeling that we must have been doing some sleepwalking when writing that paper. Or maybe I was sleepwalking and Juliette was paying attention. In any case, the final result was correct.

One-dimensional centre manifolds, part 2

March 19, 2024

I continue with the discussion of the previous post. There I mentioned a preliminary transformation of the coordinates. I will discuss this in more detail here. Suppose we have a two-dimensional dynamical system for unknowns (x,y) and a steady state (x_0,y_0). The usual aim is to understand systems up to certain transformations. One type of transformation is of the form \beta=F_1(\alpha). A second is of the form (\tilde x,\tilde y)=F_2(x,y,\alpha). A third is of the form \tau=F_3(t). With the assumptions which have been made we can do a translation to achieve x_0=y_0=0. Assume now that the linearization of the system at the origin is diagonalizable with rank one. Then it has one eigenvalue zero and one non-zero eigenvalue. It can be diagonalized by a linear transformation of the coordinates. After these transformations the system has almost been reduced to the form in the last post, except that there is a non-zero constant in front of the first summand in the evolution equation for y which might not be -1. It can be reduced to the latter case by rescaling the time coordinate. A similar process can be carried out when the centre manifold is of higher codimension. Then the variable y becomes vector-valued and the first summand in the evolution equation for y is if the form Ay for an invertible matrix A. The function \psi defining the centre manifold is then also vector-valued. The centre manifold analysis can be done in a manner very similar to what we have seen already.

The discussion of the fold bifurcation also generalizes in a straighforward way to the case of higher codimension. There is, however, one thing about that discussion which is unsatisfactory. The bifurcation conditions are expressed in terms of the transformed coordinates. It would be more satisfactory, because more invariant, if they could be expressed in terms of the original coordinates. This leads to considering the way in which these conditions are affected by the three types of coordinate transformations previously discussed. The first type of transformation leaves the conditions involving only f and its derivatives with respect to x unchanged. The derivative of f with respect to \alpha is rescaled by the non-zero factor F_1'(0) and so the fact of its being non-zero is unchanged. The third type of transformation just scales all the relevant quantities by the non-zero factor F_3'(0) and so also does not change whether these are zero or not. It remains to consider the second type of transformation. At this point a more geometrical point of view will be adopted. We change the notation for the right hand side of the equations to f^i(x^j,\alpha). These are the components of a parameter-dependent vector field which satisfies f^i(0,0)=0. Because of this condition the derivative f^i_\alpha (0,0) defines a vector at the origin independently of the transformation of the second type. The linearization A^i_j of the vector field at the origin defines a tensor. We suppose as before that it is diagonalizable and of rank one. Let q^i be a vector spanning the kernel of A^i_j, Thus using the Einstein summation convention we have A^i_jq^j=0. Similarly there is a one-form p_i with p_iA^i_j=0. It is unique up to rescaling and we choose it to satisfy p_iq^i=1. With these notations the condition previously written in the form f_\alpha(0,0,0)\ne 0 can be written in the invariant form (f^i_\alpha p_i)(0,0)\ne 0. The condition previously written in the form f_{xx}\ne 0 can be written in the invariant form (p_if^i_{,jk}q^jq^k) (0,0)\ne 0. Here we use the notation f^i_{,jk}=\frac{\partial f^i}{\partial x^j\partial x^k}. It is because the vector field vanishes at the origin and q^i is in the kernel of its first derivative that the expression involving the second derivative is invariant. These considerations may be compared with those in section 5.4 in the book of Kuznetsov.

One-dimensional centre manifolds

March 18, 2024

I have used one-dimensional centre manifolds in my research on several occasions. I now see that I always did this in quite an ad hoc way. I did not exercise due diligence in the sense that I did not take the time to get a general picture of what was going on, so as to be able to use this technique more efficiently in the future. Now I want to do so. I start with the two ODE \dot x=f(x,y) and \dot y=-y+g(x,y). This example is general enough to illustrate several important ideas. Here f and g are supposed to be smooth and vanish at least quadratically near the origin. The linearization of this system has the eigenvalues -1 and zero. Its kernel is spanned by the vector with components (1,0). It follows that there exists a centre manifold of the form y=\psi (x), where \psi has any desired finite degree of differentiability. By definition this manifold is invariant under the flow of the system, it passes through the origin and its tangent space there is the x axis. Consider the Taylor expansion f(x,y)=a_{2,0}x^2+a_{1,1}xy+a_{0,2}y^2+a_{3,0}x^3+\ldots. Substituting this into the evolution equation for x gives \dot x=a_{2,0}x^2+\ldots. This is a leading order approximation to the flow on the centre manifold. If a_{2,0}\ne 0 we can use it to read off the stability of the origin within the centre manifold. This argument uses no information about the way the centre manifold deviates from the centre subspace, i.e. how fast \psi grows near zero. If a_{2,0}=0 we need to go further.

Differentiating the defining equation with respect to time and substituting the evolution equations into the result gives \psi'(x)f(x,\psi(x))=-\psi(x)+g(x,\psi(x)). Call this equation (*). It can be analysed with the help of the Taylor expansions g(x,y)=b_{2,0}x^2+b_{1,1}xy+b_{0,2}y^2+b_{3,0}x^3+\ldots and \psi(x)=c_2x^2+\ldots. Substituting these into (*) we see that the left hand side is of order three. Thus the same must be true of the right hand side and we get \psi(x)=g(x,\psi(x))+\ldots and c_2=b_{2,0}. Substituting this back into the evolution equation for x gives \dot x=a_{2,0}x^2+(a_{1,1}b_{2,0}+a_{3,0})x^3+\ldots. Thus if a_{2,0}=0 the stability of the origin is determined by the sign of a_{1,1}b_{2,0}+a_{3,0}. If it is zero we can do another loop of the same kind. Looking at the third order terms in the equation (*) we get c_3=(-2a_{2,0}+b_{1,1})b_{2,0}+b_{3,0}. This allows the fourth order term in the evolution equation for x to be determined. This procedure can be repeated as often as desired to get higher order approximations for the centre manifold and the restriction of the system to that manifold.

We can now sum up the steps involved in doing a stability analysis. First look at the coefficient of x^2 in the equation for \dot x. If it is non-zero we are done. If it is zero use the equation (*) to determine the leading term in the expansion of the centre manifold. Put this information into the equation for \dot x. If the leading term is non-zero we are done. If it is zero we can repeat the process as long as is necessary to get a case in which the leading order coefficient is non-zero. As long as this point has not been reached we cycle between using (*) and the equation for \dot x. The system I have discussed here was special. The codimension of the centre manifold was one and the system was in a form which usually could only be achieved by a preliminary linear transformation of the coordinates. The special case nevertheless exhibits the essential structure of the general case and can serve as a compass when treating examples.

These ideas can be extended to give information about bifurcations. The equations are replaced by \dot x=f(x,y,\alpha) and \dot y=-y+g(x,y,\alpha), where \alpha is a parameter. This can be made into a three-dimensional extended system by adding the equation \dot\alpha=0. The origin is a steady state of the extended system and the centre manifold at that point is of dimension two. It is of the form y=\psi (x,\alpha). Suppose that we are in the case a_{2,0}\ne 0. Then this looks very much like the case of a generic fold bifurcation. We are just missing one condition on the parameter dependence. The dynamics on the centre manifold is given by \dot x=f(x,\psi(x,\alpha),\alpha)=h(x,\alpha). Of course the equation \dot \alpha=0 remains unchanged. We can now check the conditions for a generic fold bifurcation in the system reduced to the centre manifold. The first is h(0,0)=f(0,0,0)=0. The second is h_x(0,0)=f_x(0,0,0)+\psi_x(0,0)f_y(0,0,0). Hence h_x(0,0)=0 is equivalent to f_x(0,0,0)=0. The third involves h_{xx}(0,0)=f_{xx}(0,0,0)+\psi_{xx}(0,0)f_y(0,0,0)+2\psi_x(0,0)f_{xy}(0,0,0)+\psi_x^2(0,0)f_{yy}(0,0,0). We see that h_{xx}(0,0)\ne 0 is equivalent to f_{xx}(0,0,0)\ne 0. The fourth involves h_\alpha (0,0)=\psi_\alpha(0,0)f_y(0,0,0)+f_\alpha(0,0,0). We see that h_\alpha (0,0)\ne 0 is equivalent to f_\alpha(0,0,0)\ne 0. The first three conditions for a generic fold bifurcation of the system on the centre manifold are already satisfied and the fourth is equivalent to f_\alpha(0,0,0)\ne 0. In this way the bifurcation conditions can be expressed directly in terms of the coefficients of the original system. This is an illustration in a relatively simple example of a relationship discussed in much more general cases in the book of Kuznetsov.

Fomites and backward bifurcations

July 25, 2023

I first met the curious word ‘fomite’ a few months ago. It means a carrier of infection which is not a human being or an animal. It evolved in an interesting way, as recounted in the Wikipedia article on this concept. It started out as the Latin word ‘fomes’. The plural of that is ‘fomites’, which was taken over into English. There it was interpreted as an English plural and gave rise to the corresponding singular ‘fomite’. Let me now get away from the word and consider the concept it designates. During the COVID-19 pandemic there was a lot of discussion about the modes of transmission of infection. In particular people wanted to understand the importance of the disinfection of surfaces (and of hands). These inanimate surfaces, where the virus might be present and act as a source of infection, are then (the surfaces of) fomites. As far as I can judge the consensus developed that this mode of transmission was not the key factor in the case of COVID-19 with attention being concentrated on aerosols. However important they may be in that particular case there is no doubt that there are diseases for which transmission by fomites is very important. The theme of hospital infections is a very important one these days and there fomites are likely to play a central role.

I came across the word fomite in the course of some research I did recently with Aytül Gökce and Burcu Gürbüz on some population models for infectious diseases. We just produced a preprint on that. One aspect of infection which was included in our model and which is not very common in the literature is that of infections coming from virus in the environment and here we can think of fomites. In modelling this phenomenon a choice has to be made of a function which describes the force of infection. We can consider an expression for the rate of infection of the from Sf(C), where S is the number of susceptibles and C is the concentration of virus in the environment which may cause infection. What should we choose for the function f? This type of issue comes up in other modelling settings, in particular in those related to biology, where a choice of response function has to be made. In a biochemical reaction f could describe the reaction rate as a function of the concentration of the substrate. Typical choices are the Michaelis-Menten function f(C)=\frac{V_{\max}C}{K+C} and the Hill function where C is replaced in this expression by a power C^n. The Michaelis-Menten function has a mechanistic basis, introduced by the people who it is now named after. Hill also presented a mechanistic basis for his function. It incorporates the phenomenon of cooperative binding, with the original example being the binding of oxygen to haemoglobin. Another example of the choice of response functions occurs in predator-prey models in ecology. Here f describes the dependence of the rate of uptake of prey as a function of the density of prey. The functions which are mathematically identical to the Michaelis-Menten and Hill functions are called Holling type II and Holling type III. Holling gave mechanistic interpretations of these. Type II is associated with the fact that the predator needs some time to process prey and that that time is lost for the search for prey. Type III is associated with learning by the predator. In this list type I is missing. Type I refers to a linear function, except that to account for the fact that the rate of uptake of prey is in reality limited the linear function is usually cut off at some level, after which it is constant.

It seems that in epidemiology these issues have not been studied so much. There is an extended discussion of the subject in Chapter 10 of the book ‘Mathematical Epidemiology of Infectious Diseases’ by Diekmann and Heesterbeek. I mentioned this book in a previous post but unfortunately forgot to mention the names of the authors. At the beginning of the chapter it is stated clearly that it will not give a definitive answer to the questions it raises. I appreciate this kind of modesty. In that chapter there is a mechanistic derivation of a response function in a particular epidemiological setting. It is more complicated that the conventional ones and is not a rational function. The function corresponding to Holling type II has been been discussed in relation to epidemiology, apparently for the first time by Dietz. His discussion is purely phenomenological as opposed to mechanistic. In other words he chooses the simplest function f satisfying some basic desired properties. In our paper we consider phenomenological models containing a power n as in Holling type II (n=1) and Holling type III (n\ge 2). We found that the case n=2 gives rise to backward bifurcations while the case n=1 does not. Thus here the details of the response function can be of considerable mathematical or even medical importance. (The presence of backward bifurcations has implications for the strategy of therapies.) Another topic of the paper is the different ways in which imperfect vaccination can be described mathematically but I will not go any deeper into this theme here.

The formalism of van den Driessche and Watmough

May 10, 2023

Here I continue the discussion of the paper of van den Driessche and Watmough from the point I left it in the last post. We have a system of ODE with unknowns x_i, 1\le i\le n. The first m of these have the status of infected variables. The right hand side of the evolution equation for x_i is the sum of three terms {\cal F}_i, {\cal V}_i^+ and -{\cal V}_i^- where {\cal F}_i, {\cal V}_i^+ and {\cal V}_i^- are all non-negative. This is condition (A1) of the paper. On the right hand side of this type of system there are positive and negative terms. If we take {\cal V}_i^- to be the total contribution of the negative terms then it is uniquely defined by the system of ODE. On the other hand splitting the positive terms involves a choice. The notation {\cal V}_i={\cal V}_i^--{\cal V}_i^+ is also used. As is usual in reaction networks {\cal V}_i^-=0 when x_i=0. This is condition (A2) of the paper. The {\cal F}_i represent new infections and hence must be zero for i>m. This is condition (A3) of the paper. It restricts the ambiguity in the splitting. In condition (A4) the set of disease-free steady states is supposed to be invariant and this leads to the vanishing of some non-negative terms at those points. In general this condition restricts the choice of the infected compartments. In the fundamental model of virus dynamics the infected variables are a subset of \{y,v\}. However choosing it to be a proper subset would violate condition (A4). Thus for this model there is no choice at this point. There does remain a choice in splitting the positive terms, as already discussed in a previous post. It remains to consider the last condition (A5). Its interpretation in words is that if the infection is stopped the disease-free steady state is asymptotically stable. Mathematically, stopping the infection means setting some parameters in the model to zero.

Consider now the linearization of the system at a disease-free steady state. The choice of infectious compartments induces a block structure. Splitting the RHS of the system into {\cal F}_i and {\cal V}_i also splits the linearization into a sum. It is an elementary consequence of the assumptions made that some of the entries of the matrices occurring in these decompositions are zero. It can also be shown that some entries are non-negative. To describe a new infection we can take an initial datum close to the disease-free steady state. Then we may reasonably suppose (leaving aside questions of proof) that the actual dynamics is well-approximated by the linearized dynamics during an initial period. Under the given assumptions the linearization is block lower triangular. The linearized evolution is partially decoupled and the uninfected variables decay exponentially. So the dynamics essentially reduces to that of the infected variables and is generated by the upper left block, called -V in the paper. We get an exponential phase of decay with populations proportional to e^{-tV}. During this phase the number of susceptibles is approximately constant, equal to its value at the disease-free steady state. Hence we get an explicit expression for the rate of infection in this phase. Integrating this with respect to time gives an expression for the ratio of total number of individuals infected in that phase as a function of the original number of individuals in the different compartments. It is given by the matrix FV^{-1}, which is the NGM previously mentioned.

I have not succeeded in making a connection between the formalism of van den Driessche and Watmough and the discussion on p. 19 of the book of May and Nowak. Towards the bottom of that page it is easy to see the connection with the unique positive eigenvalue of the linearization at the boundary steady state if R_0 is simply interpreted as the known combination of parameters. After I had written this I noticed a source cited by van den Driessche and Watmough, the book ‘Mathematical Epidemiology of Infectious Diseases’ by Diekmann and Heesterbeek and, in particular, Theorem 6.13 of that book. (To avoid any confusion, I am talking about the original edition of this book, not the later extended version.) I had never seen the book before and I borrowed it from the library. I found a text which pays attention to the relations between mathematical rigour and applications in an exemplary way. I feel that if I read the whole book carefully and did all the exercises (which is the recommendation of the authors) I might be able to get rid of all the problems which are the subject of this and the previous related posts. Since this ideal engagement with the book is something I will not achieve soon, if ever, I restrict myself to some limited comments. On p. 105 the authors write ‘Many modellers … even distrust whether the threshold value one of R_0 … corresponds exactly to their stability condition. … The remainder of this section is intended for modellers who recognise themselves in the above description.’ I recognise myself in this sense. The section referred to contains Theorem 6.13.

The basic reproduction number, yet again

May 4, 2023

In this post I come back to the basic reproduction number, a concept which seems to me like a thorn in my flesh. Some of what I write here might seem rather like a repetition of things I have written in previous posts but I feel a strong need to consolidate my knowledge on this issue so as to possibly finally extract the thorn. The immediate reason for doing so is that I have been reading the book ‘Infectious diseases of humans’ by Anderson and May. I find this book very good and pleasant to read but my progress was halted when I came to Section 2.2 on p. 17 whose title is ‘The basic reproductive rate of a parasite’. (The question of whether it is better to use the word ‘rate’ or the word ‘number’ in this context is not the one that interests me.) What is written there is ‘The basic reproductive rate, R_0 is essentially the average number of successful offspring a parasite is capable of producing’. It is clear that this is not a precise mathematical definition, as shown in particular by the use of the word ‘essentially’. In that section there is further discussion of the concept without a definition. As a mathematician I am made uneasy by a discussion using what is apparently a mathematical concept without giving a definition. A possible reaction to this unease would be to say, ‘Be patient, maybe a definition will be given later in the book’. Unfortunately I have not been able to find a definition elsewhere in the book. I also checked whether it might not be included in some supplementary material or appendices. Later in the book this quantity is calculated in some models. This might provide an opportunity for reverse engineering the definition. This book is not aimed at mathematicians, which is an excuse for the lack of a definition. Next I turned to the book ‘Virus dynamics’ by Nowak and May which is related but a bit more mathematical. Here the place I get stuck is on p. 19. Again there is a definition of R_0 in words, ‘the number of newly infected cells that arise from any one infected cell when almost all cells are uninfected’. Just before that a specific model, the ‘basic model of virus dynamics’ has been introduced.

The basic model is clearly defined in the book. It is a system of ordinary differential equations depending on some parameters. Depending on the values of these parameters there is one positive steady state or none. These two cases can be characterized by a certain ratio of the parameters. If this ratio is greater than one there is a positive steady state. If it is less than or equal to one there is none. We can introduce the notation R_0 for this quantity and call it the basic reproduction number. This is a clear and permissible mathematical definition and it is an interesting statement about the model that such a quantity exists and can be written down explicitly in terms of the parameters of the system. This definition does, however, have two disadvantages. There are many models similar to this one in the literature. Thus the question comes up: if I have a specific model does there exist a quantity analogous to R_0 and if so how can I calculate it? The other is the idea that R_0 is not just a mathematical quantity. It potentially contains useful biological insights and that is something I would like to understand.

Is there more to be learned about this issue from the book of Nowak and May? They do have a kind of derivation in words of the expression for R_0. It is obtained as the product of three quantities, each of which comes with a certain interpretation. Thus a strategy would be to try and understand these three quantities individually and then understand why it is relevant to consider their product. At this point I will introduce some insights which I mentioned in a previous post. For many models the existence of a positive steady state is coupled to the stability of a steady state on the boundary of the non-negative orthant (disease-free steady state). I believe that R_0 is really a quantity related to the disease-free steady state and that in a sense it is just a coincidence that it is related to the question of the existence of a positive steady state. This is illustrated by the fact that there are models exhibiting a so-called backward bifurcation where there do exist positive steady states in some cases with R_0<1. Beyond this, I believe that R_0 depends only on the linearization of the system about the disease-free steady state.

As I have discussed elsewhere, the best source I know for understanding R_0 mathematically is a paper of van den Driessche and Watmough. There it is explained how, subject to making some choices, the linearization of the system at the disease-free steady state gives rise to something called the ‘next generation matrix’, let me abbreviate it by NGM. In that paper R_0 is defined to be the spectral radius of the NGM. Thus we obtain a rigorous definition of R_0 provided we have a rigorous definition of the NGM. In addition it is proved in that paper that the linearization at the disease-free steady state has an eigenvalue with positive real part precisely when R_0>1, a fact with obvious consequences for stability of that steady state. It is clearly stated in the paper that the NGM does not only depend on the system of ODE defining the model but also on the choices already mentioned. The first choice is that of the n variables m are supposed to correspond to infected individuals. A disease-free steady state is then by definition one at which these m variables vanish. This means that if we have a specific model and a specific choice of boundary steady state where some unknowns vanish and we want to apply this approach the infected variables must be a subset of the unknowns which vanish there. It is not ruled out that they might be a proper subset. This discussion has made it clear to me what strategy I should pursue to make progress in this area. I should look in great detail at the part of the paper of van den Driessche and Watmough where the method is set up. Since I think this post is already long enough I will do that on another occasion.

Matched asymptotic expansions and the Kaplun-Lagerstrom model

December 13, 2022

In a post I wrote a long time ago I described matched asymptotic expansions as being like black magic. Now I have understood some more about how to get from there to rigorous mathematics. My main guide in doing so has been Chapter 7 of the book ‘Classical Methods in Ordinary Differential Equations’ by Hastings and McLeod. There they give an extensive treatment of a model problem by Kaplun and Lagerstrom. The ultimate source of this is work of Stokes on hydrodynamics around 1850. In his calculations he found some paradoxical phenomena. Roughly speaking, attempting to obtain an asymptotic expansion for a solution led to inconsistencies. These things remained a mystery for many years. A big step forward came in the work of Kaplun and Lagerstrom in 1957. There they introduced an ODE model which, while having no direct physical interpretation, provides a relatively simple mathematical context in which to understand these phenomena. It is this model problem which is treated in detail by Hastings and McLeod. The model is a boundary value problem for the equation y''+\frac{n-1}{r}y'+\epsilon yy'=0. We look for a solution with y(1)=0 and \lim_{r\to\infty}y(r)=1. The first two terms look like the expression for the Laplacian of a spherically symmetric function in n dimensions and for this reason the motivation is strong to look at the cases n=2 and n=3 which are vaguely related to fluid flow around a cylinder and flow around a sphere, respectively. It turns out that the case n=2 is a lot harder to analyse than the case n=3. When n=3 the problem has a unique solution for \epsilon>0. We would like to understand what happens to this solution as \epsilon\to 0. It is possible to find an asymptotic expansion in \epsilon but it is not enough to use powers of \epsilon when building the expansion. There occurs a so-called switchback term containing \log\epsilon. This is a singular limit although the parameter in the equation only occurs in a lower order term. This happens because the equation is defined on a non-compact region.

Consider the case n=3. In applying matched asymptotic expansions to this problem the first step is to do a straightforward (formal) expansion of the equation in powers of \epsilon. This gives differential equations for the expansion coefficients. At order zero there is no problem solving the equation with the desired boundary conditions. At order one this changes and it is not possible to implement the desired boundary condition at infinity. This has to do with the fact that in the correct asymptotic expansion the second term is not of order \epsilon but of order \epsilon\log\epsilon. This extra term is the switchback term. Up to this point all this is formal. One method of obtaining rigorous proofs for the asymptotics is to use GSPT, as done in two papers of Popovic and Szmolyan (J. Diff. Eq. 199, 290 and Nonlin. Anal. 59, 531). There is an introduction to this work in the book but I felt the need to go deeper and I looked at the original papers as well. To fit the notation of those papers I replace y by u. Reducing the equation to first order by introducing v=u' as a new variable leads to a non-autonomous system of two equations. Introducing \eta=1/r as a new dependent variable and using it to eliminate r from the right hand side of the equations in favour of \eta leads to an autonomous system of three equations. This allows the original problem to be reformulated in the following geometric way. The u-axis consists of steady states. The point (1,0,0) is denoted by Q. The aim is to find a solution which starts at a point of the form (0,v,1) and tends to Q as r\to\infty. A solution of this form for \epsilon small and positive is to be found by perturbation of a corresponding solution in the case \epsilon=0. For \epsilon>0 the centre manifold of Q is two-dimensional and given explicitly by v=0. In the case \epsilon=0 it is more degenerate and has an additional zero eigenvalue. To prove the existence of the desired connecting orbit we may note that for \epsilon>0 this is equivalent to showing that the manifold consisting of solutions starting at points of the form (0,v,1) and the manifold consisting of solutions converging to Q intersect. The first of these manifolds is obviously a deformation of a manifold for \epsilon=0. We would like the corresponding statement for the second manifold. This is difficult to get because of the singularity of the limit. To overcome this \epsilon is introduced as a new dynamical variable and a suitable blow-up is carried out near the u-axis. In this way it is possible to get to a situation where there are two manifolds which exist for all \epsilon\ge 0 and depend smoothly on \epsilon. They intersect for \epsilon=0 and in fact do so transversely. It follows that they also intersect for \epsilon small and positive. What I have said here only scratches the surface of this subject but it indicates the direction in which progress could be made and this is a fundamental insight.

The shooting method for ordinary differential equations

November 25, 2022

A book which has been sitting on a shelf in my office for some years without getting much attention is ‘Classical Methods in Ordinary Differential Equations’ by Hastings and McLeod. Recently I read the first two chapters of the book and I found the material both very pleasant to read and enlightening. A central theme of those chapters is the shooting method. I have previously used the method more than once in my own research. I used it together with Juan Velázquez to investigate the existence of self-similar solutions of the Einstein-Vlasov system as discussed here. I used it together with Pia Brechmann to prove the existence of unbounded oscillatory solutions of the Selkov system for glycolysis as discussed here. This method is associated with the idea of a certain type of numerical experiment. Suppose we consider a first order ODE with initial datum x(0)=\alpha. Suppose that for some value \alpha_1 the solution tends to +\infty for t\to\infty and for another value \alpha_2>\alpha_1 the solution tends to -\infty. Then we might expect that in between there is a value \alpha^* for which the solution is bounded. We could try to home in on the value of \alpha^* by a bisection method. What I am interested in here is a corresponding analytical procedure which sometimes provides existence theorems.

In the book the procedure is explained in topological terms. We consider a connected parameter space and a property P. Let A be the subset where P does not hold. If we can show that A=A_1\cup A_2 where A_1 and A_2 are non-empty and open and A=A_1\cap A_2=\emptyset then A is disconnected and so cannot be the whole of the parameter space. Hence there is at least one point in the complement of A and there property P holds. The most common case is where the parameter space is an interval in the real numbers. For some authors this is the only case where the term ‘shooting method’ is used. In the book it is used in a more general sense, which might be called multi-parameter shooting. The book discusses a number of cases where this type of method can be used to get an existence theorem. The first example is to show that x'=-x^3+\sin t has a periodic solution. In fact this is related to the Brouwer fixed point theorem specialised to dimension one (which of course is elementary to prove). The next example is to show that x'=x^3+\sin t has a periodic solution. After that this is generalised to the case where \sin t is replaced by an arbitrary bounded continuous function on [0,\infty) and we look for a bounded solution. The next example is a kind of forced pendulum equation x''+\sin x=f(t) and the aim is to find a solution which is at the origin at two given times. In the second chapter a wide variety of examples is presented, including those just mentioned, and used to illustrate a number of general points. The key point in a given application is to find a good choice for the subsets. There is also a discussion of two-parameter shooting and its relation to the topology of the plane. This has a very different flavour from the arguments I am familiar with. It is related to Wazewski’s theorem (which I never looked at before) and the Conley index. The latter is a subject which has crossed my path a few times in various guises but where I never really developed a warm relationship. I did spend some time looking at Conley’s book. I found it nicely written but so intense as to require more commitment than I was prepared to make at that time. Perhaps the book of Hastings and McLeod can provide me with an easier way to move in that direction.