From spectral to nonlinear instability, part 2

May 21, 2021

Here I want to extend the discussion of nonlinear instability in the previous post to the infinite-dimensional case. To formulate the problem fix a Banach space X. If the starting point is a system of PDE then finding the right X might be a non-trivial task and the best choice might depend on which application we have in mind. We consider an abstract equation of the form \frac{du}{dt}=Au+F(u). Here t\mapsto u(t) is a mapping from an interval to X. A is a linear operator which may be unbounded. In fact the case I am most interested in here is that where A is the generator of a strongly continuous semigroup of linear operators on X. Hence A may not be globally defined on X. It will be assumed that F is globally defined on X. This is not a restriction in the applications I have in mind although it might be in other interesting cases. In this setting it is necessary to think about how a solution should be defined. In fact we will define a solution of the above equation with initial value v to be a continuous solution of the integral equation u(t)=e^{tA}v+\int_0^t e^{(t-\tau)A}F(u(\tau))d\tau. Here e^{tA} is not to be thought of literally as an exponential but as an element of the semigroup generated by A. When A is bounded then e^{tA} is really an exponential and the solution concept used here is equivalent to the usual concept of solution for an ODE in a Banach space. Nonlinear stability can be defined just as in the finite-dimensional case, using the norm topology. In general the fact that a solution remains in a fixed ball as long as it exists may not ensure global existence, in contrast to the case of finite dimension. We are attempting to prove instability, i.e. to prove that solutions leave a certain ball. Hence it is convenient to introduce the convention that a solution which ceases to exist after finite time is deemed to have left the ball. In other words, when proving nonlinear instability we may assume that the solution u being considered exists globally in the future. What we want to show is that there is an \epsilon>0 such that for any \delta>0 there are solutions which start in the ball of radius \delta about the steady state x_0 and leave the ball of radius \epsilon about that point.
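
In the finite-dimensional case the integral (Duhamel) formula can be checked directly against a standard ODE solver. Here is a minimal numerical sketch; the matrix A, the nonlinearity F and all numerical values are arbitrary illustrative choices of mine, not anything taken from the post.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Arbitrary illustrative data: a 2x2 matrix A and a cubic nonlinearity F.
A = np.array([[0.5, 1.0], [0.0, -0.3]])
F = lambda u: -np.array([u[0] ** 3, u[1] ** 3])
v = np.array([0.1, 0.2])   # initial value
T = 1.0

# Solve du/dt = Au + F(u) directly.
sol = solve_ivp(lambda t, u: A @ u + F(u), (0.0, T), v,
                dense_output=True, rtol=1e-10, atol=1e-12)

# Evaluate the right hand side of the integral equation at t = T by quadrature.
ts = np.linspace(0.0, T, 2001)
integrand = np.array([expm((T - tau) * A) @ F(sol.sol(tau)) for tau in ts])
mild = expm(T * A) @ v + np.trapz(integrand, ts, axis=0)

print("direct solve      :", sol.sol(T))
print("integral equation :", mild)   # agrees up to quadrature error
```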

The discussion which follows is based on a paper called ‘Spectral Condition for Instability’ by Jalal Shatah and Walter Strauss. At first sight their proof looks very different from the proof I presented in the last post and here I want to compare them, with particular attention to what happens in the finite-dimensional case. We want to show that the origin is nonlinearly unstable under the assumption that the spectrum of A intersects the half of the complex plane where the real part is positive. In the finite-dimensional case this means that A has an eigenvalue with positive real part. The spectral mapping theorem relates this to the situation where the spectrum of e^A intersects the exterior of the unit disk. In the finite-dimensional case it means that there is an eigenvalue with modulus \mu greater than one. We now consider the hypotheses of the theorem of Shatah and Strauss. The first is that the linear operator A occurring in the equation generates a strongly continuous semigroup. The second is that the spectrum of e^A meets the exterior of the unit disk. The third is that in a neighbourhood of the origin the nonlinear term can be estimated by a power of the norm greater than one. The proof of nonlinear instability is based on three lemmas. We take e^\lambda to be a point of the spectrum of e^A whose modulus is equal to the spectral radius. In the finite-dimensional case this would be an eigenvalue and we would consider the corresponding eigenvector v. In general we need a suitable approximate analogue of this. Lemma 1 provides for this by showing that e^\lambda belongs to the approximate point spectrum of e^A. Lemma 2 then shows that there is a vector v for which the norm of e^{tA}v grows at a rate no larger than e^{{\rm Re}\lambda t} and for which the norm of the difference between e^{tA}v and e^{\lambda t}v can be controlled. Lemma 3 shows that the growth rate of the norm of e^{tA} is close to e^{{\rm Re}\lambda t}. In the finite-dimensional case the proofs of Lemma 1 and Lemma 2 are trivial. In Lemma 3 the lower bound is trivial in the finite-dimensional case. To get the upper bound in that case a change of basis can be used to make the off-diagonal entries in the Jordan normal form as small as desired. This argument is similar to that used to treat A_c in the previous post. In the theorem we choose a ball B such that instability corresponds to leaving it. As long as a solution remains in that ball the nonlinearity is under good control. The idea is to show that as long as the norm of the initial condition is sufficiently small the contribution of the nonlinear term on the right hand side of the integral equation will remain small compared to that coming from the linearized equation, which is growing at a known exponential rate. The details are complicated and are of course the essence of the proof. I will not try to explain them here.
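
The finite-dimensional content of Lemma 3, namely that after making the off-diagonal part of the Jordan form small the norm of e^{tA} is bounded by e^{({\rm Re}\lambda+\epsilon)t}, can be checked numerically. The following is a sketch with arbitrarily chosen values of \lambda, the block size and \epsilon; it only illustrates the rescaling trick, not the infinite-dimensional argument of Shatah and Strauss.

```python
import numpy as np
from scipy.linalg import expm

# Arbitrary illustrative choices: eigenvalue lam > 0, a 3x3 Jordan block, small delta.
lam, n, delta = 0.5, 3, 0.01
J = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)      # Jordan block

# Change of basis making the off-diagonal entries equal to delta.
D = np.diag([delta ** k for k in range(n)])
J_small = np.linalg.inv(D) @ J @ D                     # = lam*I + delta*N

for t in [1.0, 5.0, 10.0]:
    norm = np.linalg.norm(expm(t * J_small), 2)
    bound = np.exp((lam + delta) * t)
    print(t, norm, bound)          # the norm stays below the exponential bound
```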

My COVID-19 vaccination

May 14, 2021

In Germany vaccination against COVID-19 has been going slowly compared to some other places. One of the leading vaccines was developed by Biontech, a company based in Mainz, which is where I live. Due to this and my impression that that vaccine is very good my first choice would have been to get the Biontech vaccine. In the end things turned out differently. The Biontech vaccine is the one which has been given most often in Germany but the number of vaccinations has been limited by the available supply. The vaccine developed by AstraZeneca has also been used here but has acquired a bad reputation. There were several reasons for that. One is the company's bad relations with the media. Another is the occurrence of rare side effects involving thromboses. A third is that the publicly quoted efficiency in preventing the illness is much less for AstraZeneca than for Biontech (80% compared to 95%). A fourth is that the waiting time between the first and second injections is longer for AstraZeneca (12 instead of 6 weeks). Despite these things my wife and I decided to get vaccinated with the product from AstraZeneca and had our first injections today. Here I want to explain why we did so and discuss some more general related issues.

I already mentioned that AstraZeneca became unpopular. In connection with the thromboses it was only recommended by the Paul Ehrlich Institute (the official body for such things in Germany) for people over sixty. The argument was that since people over sixty had a higher probability of serious consequences if they got infected the net benefit was positive for them. Doctors were having problems getting rid of their doses. However the situation is very dynamic. Although AstraZeneca is still not recommended for people under 60 (like myself) and is not administered to them in the official vaccination centres, doctors (GPs and specialists) are allowed to vaccinate people under 60 with that product if they inform them about the risks. They are also allowed to reduce the waiting time between the injections. These political moves have led to many people (particularly young people) wanting to get vaccinated with AstraZeneca. The result is that now, instead of being hard to get rid of, the AstraZeneca vaccine is scarce. Thus now there is a lot of competition and the whole process is quite chaotic. We were lucky that my wife was offered a vaccination with AstraZeneca at a time when this seemed like asking for a favour rather than offering to do a favour. She asked if I could also be vaccinated by the same doctor and got a positive answer. This was our good fortune.

It was still not clear to us whether we should accept the offer. It is in the nature of the situation that many things concerning the disease and the vaccination are simply not known. On the other hand the amount of information available is huge. All this makes a rational decision difficult. From my point of view the thromboses were so rare that they did not influence my decision at all. The combination of the earlier availability of AstraZeneca for us with the longer waiting time meant that it was not clear which option (taking AstraZeneca now or waiting for Biontech) would lead to being fully immunized sooner. We did not want to shorten the interval for the AstraZeneca vaccination (which our doctor would have accepted) for a reason I will discuss later. A reason for making the choice we did was that this gave the feeling of finally making some progress. In the comparison of the efficiencies of the two products it is important to know that if the criterion considered is the risk of serious illness or hospitalization then according to the official figures AstraZeneca is at least as good as Biontech (both at least 95%). In any case, it was a good feeling to know that we had received one dose of the vaccine. A few hours later I have not noticed any side-effects whatsoever.

Let me return to the question of shortening the time between the injections for AstraZeneca. There is some evidence that this decreases the efficiency of the vaccinations and I have read about two possible mechanisms for this which seem to me at least plausible. The first is connected with the phenomenon of affinity maturation. The first B cells which are activated in response to a certain antigen can undergo somewhat random mutations. This means that the population of B cells becomes rather heterogeneous. The different variants then compete with each other in such a way that those which bind most strongly to the antigen come to dominate the population. In this way the quality of the antibodies is increased. If a second vaccination is given too soon it can interrupt the affinity maturation initiated by the first vaccination before the antibodies have been optimized. The second mechanism is as follows. The immune response against the antigen remains active for some time after the vaccination. If a second vaccination is given while that is still the case then the antibodies generated in the first vaccination can bind to the vector viruses coming from the second vaccination and prevent them from achieving their intended purpose. These are not established facts but I prefer to have plausible hypotheses rather than a complete lack of a possible explanation.

From spectral to nonlinear instability

April 28, 2021

An important property of a steady state of a dynamical system is its stability. Let x(t) be the state of the system at time t and let x_0 be a steady state. For a system of ODE these are points in Euclidean space while for more general dynamical systems they are functions which can be thought of as points in suitably chosen function spaces. In general it may be useful to have more than one function space in mind when considering a given dynamical system. First I will concentrate on the ODE case. It is possible to linearize the system about x_0 to get a linear system \dot y=Ay. The steady state x_0 is said to be linearly stable when the origin is stable for the linearized system. Since the linear system is simpler than the nonlinear one we would ideally like to be able to use linear stability as a criterion for nonlinear stability. In general the relation between linear and nonlinear stability is subtle even for ODE. We can go a step further by trying to replace linear stability by spectral stability. There are relations between eigenvalues of A with positive real parts and unstable solutions of the linearized system. Again there are subtleties. Nevertheless there are two simple results about the relation between spectral stability and nonlinear stability which can be proved for ODE. The first is that if there is any eigenvalue of A with positive real part then x_0 is nonlinearly unstable. The second is that if all eigenvalues of A have negative real parts then x_0 is nonlinearly stable, in fact asymptotically stable. These two results are far from covering all situations of interest but at least they do define a comfortable region which is often enough. In what follows I will only consider the first of these two results, the one asserting instability.
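
As a small illustration of the first result, here is a sketch; the vector field is an arbitrary toy example of my own, not one from the post. The Jacobian at the steady state has an eigenvalue with positive real part, and a solution starting very close to the steady state moves far away from it.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy example (arbitrary choice): steady state at the origin.
def f(t, u):
    x, y = u
    return [x + y - x ** 3, -2.0 * y + x * y]

# Jacobian at the origin, computed by hand for this particular f.
J = np.array([[1.0, 1.0], [0.0, -2.0]])
print("eigenvalues:", np.linalg.eigvals(J))   # one eigenvalue with positive real part

# A solution starting very close to the origin...
sol = solve_ivp(f, (0.0, 20.0), [1e-6, 1e-6])
print("norm at t=0 :", np.linalg.norm(sol.y[:, 0]))
print("norm at t=20:", np.linalg.norm(sol.y[:, -1]))   # ...has left any small ball around it
```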

Up to this point I have avoided giving precise definitions. So what does nonlinear instability of x_0 mean? It means that there is a neighbourhood U of x_0 such that for each neighbourhood W of x_0 there is a solution satisfying x(0)\in W and x(t)\notin U for some t>0. In other words, there are solutions which start arbitrarily close to x_0 and do not stay in U. How can this be proved? One way of doing so is to use a suitable monotone function V defined on a neighbourhood of x_0. This function should be continuously differentiable and satisfy the conditions that V(x_0)=0, V(x)>0 for x\ne x_0 and \dot V>0 for x\ne x_0. Here \dot V is the rate of change of V along the solution. Let \epsilon be sufficiently small so that the closed ball \overline{B_\epsilon (x_0)} is contained in the domain of definition of V. We will take this ball to be the neighbourhood U in the definition of instability. Let M be the maximum of V on \overline{B_\epsilon (x_0)}. Thus in order to show that a solution leaves U it is enough to show that V exceeds M along it. Consider any solution which starts at a point of U other than x_0 at t=0. The set where V(x)<V(x(0)) is open and the solution can never enter it for t>0. The intersection of its complement with U is compact and does not contain x_0. Thus \dot V has a positive minimum m there. As long as the solution does not leave U we have \dot V(x(t))\ge m. Hence V(x(t))\ge V(x(0))+mt. This implies that if the solution remains in U for all t>0 then V(x(t)) eventually exceeds M, a contradiction. This result can be generalized as follows. Let Z be an open set such that x_0 is contained in its closure. Suppose that we have a function V which is positive on Z, vanishes on the part of the boundary of Z intersecting U and for which \dot V>0 on Z except at x_0. Then x_0 is nonlinearly unstable with a proof similar to that just given.

Now it will be shown that if A has an eigenvalue with positive real part a function V with the desired properties exists. We can choose coordinates so that the steady state is at the origin and that the stable, centre and unstable subspaces at the origin are coordinate subspaces. The solution can be written in the form (x,y,z) where these three variables are the projections on the three subspaces. Then A is a direct sum of matrices A_+, A_c and A_-, whose eigenvalues have real parts which are positive, zero and negative respectively. It can be arranged by a choice of basis in the centre subspace that the symmetric part of A_c is as small as desired. It can also be shown that because of the eigenvalue properties of A_+ there exists a positive definite matrix B_+ such that A_+^TB_++B_+A_+=I. For the same reason there exists a positive definite matrix B_- such that A_-^TB_-+B_-A_-=-I. Let V=x^TB_+x-y^Ty-z^TB_-z. Then \dot V=x^Tx+z^Tz-y^T(A_c^T+A_c)y+o(x^Tx+y^Ty+z^Tz). The set Z is defined by the condition V>0. There y^Ty\le Cx^Tx for a positive constant C. On this region \dot V\ge\frac12(x^Tx+z^Tz)+o(|x|^2+|z|^2), where we profit from the special basis of the centre subspace mentioned earlier. The quadratic term in y which does not have a good sign has been absorbed in the quadratic term in x which does. This completes the proof of nonlinear instability. As they stand these arguments do not apply to the infinite-dimensional case since compactness has been used freely. A discussion of the infinite-dimensional case will be postponed to a later post.
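
The matrices B_+ and B_- are solutions of Lyapunov equations and can be computed numerically. Here is a minimal sketch, assuming scipy is available and using arbitrary illustrative matrices A_+ and A_- of my own choosing; it solves A_+^TB_++B_+A_+=I and A_-^TB_-+B_-A_-=-I and checks positive definiteness.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Arbitrary illustrative matrices with eigenvalues of positive / negative real part.
A_plus  = np.array([[1.0, 2.0], [0.0, 0.5]])
A_minus = np.array([[-1.0, 1.0], [0.0, -2.0]])

# solve_continuous_lyapunov(a, q) solves a X + X a^T = q.
B_plus  = solve_continuous_lyapunov(A_plus.T,  np.eye(2))    # A_+^T B_+ + B_+ A_+ = I
B_minus = solve_continuous_lyapunov(A_minus.T, -np.eye(2))   # A_-^T B_- + B_- A_- = -I

print("B_+ eigenvalues:", np.linalg.eigvalsh(B_plus))    # all positive
print("B_- eigenvalues:", np.linalg.eigvalsh(B_minus))   # all positive
print("residual:", np.linalg.norm(A_plus.T @ B_plus + B_plus @ A_plus - np.eye(2)))
```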

Strongly continuous semigroups

March 19, 2021

An evolution equation is an equation for a function u(t,x) depending on a time variable t and some other variables x which can often be thought of as spatial variables. There is also the case where there are no variables x, which is the one relevant for ordinary differential equations. We can reinterpret the function u as being something of the form u(t)(x), a function of x which depends on t. I am thinking of the case where u(t,x) takes its values in a Euclidean space. Then u(t) should be thought of as taking values in a function space. Different regularity assumptions on the solutions naturally lead to different choices of function space. Suppose, for instance, I consider the one-dimensional heat equation u_t=u_{xx}. Then I could choose the function space to be C^0, the space of continuous functions, C^2, the space of twice continuously differentiable functions or L^p. For some choices of function spaces we are forced to consider weak solutions. It is tempting to consider the evolution equation as an ODE in a function space. This can have advantages but also disadvantages. It gives us intuition which can suggest ideas but the analogues of statements about ODE often do not hold for more general evolution equations, in particular due to loss of compactness. (In what follows the function space considered will always be a Banach space.) We can formally write the equation of the example as u_t=Au for a linear operator A. If we choose the function space to be X=L^p then this operator cannot be globally defined, since A does not map from X to itself. This leads to the consideration of unbounded operators from X to X. This is a topic which requires care with the use of language and the ideas which we associate to certain formulations. An unbounded operator from X to X is a linear mapping from a linear subspace D(A), the domain of A, to X. Just as there may be more than one space X which is of interest for a given evolution equation there may be more than one choice of domain which is of interest even after the space has been chosen. To take account of this an unbounded operator is often written as a pair (A, D(A)). In the example we could for instance choose D(A) to be the space of C^\infty functions or the Sobolev space W^{2,p}.

In the finite-dimensional case we know the solution of the equation u_t=Au with initial datum u_0. It is u(t)=e^{tA}u_0. It is tempting to keep this formula even when A is unbounded, but it must then be supplied with a suitable interpretation. There are general ways of defining nonlinear functions of unbounded linear operators using spectral theory but here I want to pursue another direction, which uses a kind of axiomatic approach to the exponential function S(t)=e^{tA}. It should have the property that S(0)=I and it should satisfy the semigroup property S(s+t)=S(s)S(t) for all non-negative s and t. It remains to require some regularity property. One obvious possibility would be to require that S is a continuous function from [0,\infty) into the space of bounded operators L(X) with the topology defined by the operator norm. Unfortunately this is too much. Let us define an operator Ax=\lim_{t\to 0}\frac{S(t)x-x}{t} whenever this limit exists in X and D(A) to be the linear subspace for which it exists. In this way we get an unbounded operator on a definite domain. The problem with the continuity assumption made above is that it implies that D(A)=X. In other words, if the operator A is genuinely unbounded then this definition cannot apply. In particular it cannot apply to our example. It turns out that the right assumption is that \|S(t)x-x\|\to 0 for t\to 0 and any x\in X. This leads to what is called a strongly continuous one-parameter semigroup. A is called the infinitesimal generator of S(t). Its domain is dense and it is a closed operator, which means that its graph is a closed subset (in fact linear subspace) of the product X\times X with the topology defined by the product norm. In a case like the example above the problem with continuity is only at t=0. The solution of the heat equation is continuous in t in any reasonable topology on any reasonable Banach space for t>0 but not for t=0. In fact it is even analytic for t>0, something which is typical for linear parabolic equations.
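
The difference between norm continuity and strong continuity at t=0 can be seen on a finite-difference approximation of the heat equation. In the sketch below (the grid sizes, the time t and the test function are arbitrary illustrative choices) the operator norm of S(t)-I does not become small as the grid is refined, while \|S(t)x-x\| for a fixed smooth x does.

```python
import numpy as np
from scipy.linalg import expm

def heat_semigroup(n, t):
    """exp(tA) for the finite-difference Laplacian on n interior grid points of [0,1]."""
    h = 1.0 / (n + 1)
    A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / h ** 2
    x = np.sin(np.pi * h * np.arange(1, n + 1))   # a fixed smooth test vector
    return expm(t * A), x

t = 1e-4
for n in [20, 40, 80]:
    S, x = heat_semigroup(n, t)
    op_norm  = np.linalg.norm(S - np.eye(n), 2)               # approaches 1 as the grid is refined
    vec_norm = np.linalg.norm(S @ x - x) / np.linalg.norm(x)  # stays small for the smooth x
    print(n, op_norm, vec_norm)
```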

In this discussion we have said how to start with a semigroup S(t) and get its generator (A,D(A)) but what about the converse? What is a criterion which tells us for a given operator (A,D(A)) that it is the generator of a semigroup? A fundamental result of this type is the Hille-Yosida theorem. I do not want to go into detail about this and related results here. I will just mention that it has to do with spectral theory. It is possible to define the spectrum of an unbounded operator as a generalization of the eigenvalues of a matrix. The complement of the spectrum is called the resolvent set and the resolvent is (A-\lambda I)^{-1}, which is a bounded operator. The hypotheses made on the generator of a semigroup concern the position of the resolvent set in the complex plane and estimates on the norm of the resolvent at infinity. In this context the concept of a sectorial operator arises.

My interest in these topics comes from an interest in systems of reaction-diffusion equations of the form u_t=D\Delta u+f(u). Here u is vector-valued, D is a diagonal matrix with non-negative elements and the Laplace operator is to be applied to each component of u. I have not found it easy to extract the results I would like to have from the literature. Part of the reason for this is that I am interested in examples where not all the diagonal elements of D are positive. That situation might be described as a degenerate system of reaction-diffusion equations or as a system of reaction-diffusion equations coupled to a system of ODE. In that case a lot of results are not available ‘off the shelf’. Therefore to obtain an understanding it is necessary to penetrate into the underlying theory. One of the best sources I have found is the book ‘Global Solutions of Reaction-Diffusion Systems’ by Franz Rothe.
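
To indicate what such a degenerate system looks like in practice, here is a minimal simulation sketch. The reaction terms, parameters and grid are my own illustrative choices and are not taken from Rothe's book or from any model mentioned above; one component diffuses while the other satisfies a pointwise ODE.

```python
import numpy as np

# Degenerate reaction-diffusion system on [0,1] with Neumann boundary conditions:
# u_t = d * u_xx + f(u, v),   v_t = g(u, v)   (no diffusion in the second component).
n, d, dt, steps = 100, 1e-3, 1e-3, 5000
x = np.linspace(0.0, 1.0, n)
u = 0.5 + 0.1 * np.cos(np.pi * x)          # arbitrary initial data
v = 0.5 * np.ones(n)

f = lambda u, v: u * (1.0 - u) - u * v     # illustrative reaction terms
g = lambda u, v: u * v - 0.3 * v

def laplacian(w):
    lap = np.zeros_like(w)
    lap[1:-1] = w[2:] - 2.0 * w[1:-1] + w[:-2]
    lap[0], lap[-1] = 2.0 * (w[1] - w[0]), 2.0 * (w[-2] - w[-1])   # Neumann conditions
    return lap / (x[1] - x[0]) ** 2

for _ in range(steps):                     # simple explicit Euler time stepping
    u, v = u + dt * (d * laplacian(u) + f(u, v)), v + dt * g(u, v)

print("u range:", u.min(), u.max())
print("v range:", v.min(), v.max())
```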

Dynamics of the activation of Lck

January 26, 2021

The enzyme Lck (lymphocyte-specific protein tyrosine kinase) is of central importance in the function of immune cells. I hope that mathematics can contribute to the understanding and improvement of immune checkpoint therapies for cancer. For this reason I have looked at the relevant mathematical models in the literature. In doing this I have realized the importance in this context of obtaining a better understanding of the activation of Lck. I was already familiar with the role of Lck in the activation of T cells. There are two tyrosines in Lck, Y394 and Y505, whose phosphorylation state influences its activity. Roughly speaking, phosphorylation of Y394 increases the activity of Lck while phosphorylation of Y505 decreases it. The fact that these are influences going in opposite directions already indicates complications. In fact the kinase Lck catalyses its own phosphorylation, especially on Y394. This is an example of autophosphorylation in trans, i.e. one molecule of Lck catalyses the phosphorylation of another molecule of Lck. It turns out that autophosphorylation tends to favour complicated dynamics. It is already the case that in a protein with a single phosphorylation site the occurrence of autophosphorylation can lead to bistability. Normally bistability in a chemical reaction network means the existence of more than one stable positive steady state and this is the definition I usually adopt. The definition may be weakened to the existence of more than one non-negative stable steady state. That autophosphorylation can produce bistability in this weaker sense was already observed by Lisman in 1985 (PNAS 82, 3055). He was interested in this as a mechanism of information storage in a biochemical system. In 2006 Fuss et al. (Bioinformatics 22, 158) found bistability in the strict sense in a model for the dynamics of Src kinases. Since Lck is a typical member of the family of Src kinases these results are also of relevance for Lck. In that work the phosphorylation processes are embedded in feedback loops. In fact the bistability is present without the feedback, as observed by Kaimachnikov and Kholodenko (FEBS J. 276, 4102). Finally, it was shown by Doherty et al. (J. Theor. Biol. 370, 27) that bistability (in the strict sense) can occur for a protein with only one phosphorylation site. This is in contrast to more commonly considered phosphorylation systems. These authors have also seen more complicated dynamical behaviour such as periodic solutions.
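
To give a rough feeling for how autophosphorylation with activation can produce two stable positive steady states, here is a toy one-variable sketch. The rate functions and parameter values are my own illustrative choices and do not reproduce the models of Lisman, Fuss et al., Kaimachnikov and Kholodenko or Doherty et al.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy model: p = fraction of phosphorylated enzyme. Phosphorylation is catalysed by the
# already phosphorylated (more active) form, dephosphorylation is Michaelis-Menten.
a, b, c, K = 0.01, 5.0, 1.0, 0.05      # arbitrary illustrative parameters

def rhs(t, p):
    return (a + b * p) * (1.0 - p) - c * p / (K + p)

for p0 in [0.1, 0.6]:                  # two initial conditions on either side of a threshold
    sol = solve_ivp(rhs, (0.0, 200.0), [p0], rtol=1e-8)
    print("p(0) = %.1f  ->  p(200) = %.4f" % (p0, sol.y[0, -1]))
# The two runs converge to different positive steady states: bistability.
```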

All these results on the properties of solutions of reaction networks are on the level of simulations. Recently Lisa Kreusser and I set out to investigate these phenomena on the level of rigorous mathematics and we have just put a paper on this subject on the archive. The model developed by Doherty et al. is one-dimensional and therefore relatively easy to analyse. The first thing we do is to give a rigorous proof of bistability for this system together with some information on the region of parameter space where this phenomenon occurs. We also show that it can be lifted to the system from which the one-dimensional system is obtained by timescale separation. The latter system has no periodic solutions. To obtain bistability the effect of the phosphorylation must be to activate the enzyme. It does not occur in the case of inhibition. We show that when an external kinase is included (in the case of Lck there is an external kinase Csk which may be relevant) and we do not restrict to the Michaelis-Menten regime, bistability is restored.

We then go on to study the dynamics of the model of Kaimachnikov and Kholodenko, which is three-dimensional. These authors mention that it can be reduced to a two-dimensional model by timescale separation. Unfortunately we did not succeed in finding a rigorous framework for their reduction. Instead we used another related reduction which gives a setting which is well-behaved in the sense of geometric singular perturbation theory (normally hyperbolic) and can therefore be used to lift dynamical features from two to three dimensions in a rather straightforward way. It then remains to analyse the two-dimensional system. It is easy to deduce bistability from the results already mentioned. We go further and show that there exist periodic and homoclinic solutions. This is done by showing the existence of a generic Bogdanov-Takens bifurcation, a procedure described more generally here and here. This system contains an abundance of parameters and the idea is to fix these so as to get the desired result. First we choose the coordinates of the steady state to be fixed simple rational numbers. Then we fix all but four of the parameters in the system. The four conditions for a BT bifurcation are then used to determine the values of the other four parameters. To get the desired positivity for the computed values the choices must be made carefully. This was done by trial and error. Establishing genericity required a calculation which was complicated but doable by hand. When a generic BT bifurcation has been obtained it follows that there are generic Hopf bifurcations nearby in parameter space and the nature of these (sub- or supercritical) can be determined. It turns out that in our case they are subcritical so that the associated periodic solutions are unstable. Having proved that periodic solutions exist we wanted to see what a solution of this type looks like by simulation. We had difficulties in finding parameter values for which this could be done. (We know the parameter values for the BT point and that those for the periodic solution are nearby but they must be chosen in a rather narrow region which we do not know explicitly.) Eventually we succeeded in doing this. In this context I used the program XPPAUT for the first time in my life and I learned to appreciate it. I see this paper as the beginning rather than the end of a path and I am very curious as to where it will lead.

The Bogdanov-Takens bifurcation, part 2

December 30, 2020

This post continues the discussion in the previous post on this subject. It turns out that the (exact) normal form is a system whose dynamics are not easy to analyse. The bifurcation diagram can be found as Figure 8.8 in the book of Kuznetsov. The case presented there is the supercritical one. The diagram can easily be converted into that describing the subcritical case, the most important thing being that the time direction is reversed so that the types of stability are interchanged. The bifurcation point is the origin in the plane of the parameters (\beta_1,\beta_2). The plane can be written as the union of four open regions R_i separated by bifurcation curves. These regions are distinguished by the types of dynamical behaviour which occur for the given values of the parameters. R_1 is simplest, having trivial dynamics, without steady states or periodic solutions. The boundary of R_1 is a smooth curve T passing through the origin. If the origin is deleted the remainder consists of two connected components T_- and T_+. The curve T_- is the boundary between R_1 and R_2. There a fold bifurcation takes place, producing a saddle point and a sink. The region R_2 is that between T_- and the negative part of the \beta_2 axis, which we call H. On that half-axis a Hopf bifurcation takes place. The sink turns into a source and a stable periodic solution is born. This is a supercritical Hopf bifurcation. For the other sign of the parameter s there is an unstable periodic solution and the Hopf bifurcation is subcritical. This is the reason that we have also adopted the terms sub- and supercritical in the case of a BT bifurcation. (Note that the sign of s can be computed from the parameters a_{ij} and b_{ij}.) On the other side of H we come to R_3. In that region there is a saddle point and the periodic solution already mentioned. It is the unique periodic solution for given values of the parameters. The other boundary of R_3 is a curve C where a non-local bifurcation takes place. On the curve there is an orbit which is homoclinic to the saddle. On the other side of C is R_4 where the homoclinic loop has broken and the qualitative behaviour is similar to that in R_2 except that there is a source instead of a sink. From R_4 we can pass through T_+ and return to R_1 by means of a fold bifurcation. It is perhaps worth mentioning what happens on the upper half of the \beta_2 axis. There is no bifurcation there but the two eigenvalues of the linearization are equal and opposite (a neutral saddle). This type of situation may be dangerous when looking for Hopf bifurcations because it shares certain algebraic properties with a Hopf point without being one. For this reason it could make sense to call it a ‘false Hopf point’, which is my internal name for it.

The proofs of the statements about the homoclinic orbit and the periodic solution are hard and I will just make a few comments on that. There is a subtle rescaling which, in particular, ensures that the two steady states are a fixed distance apart, independent of the parameters. Doing this transformation brings the system into the form of a perturbation of a Hamiltonian system. In the latter case the homoclinic orbit is obtained as a level surface of the Hamiltonian. This perturbation can be analysed by a method due to Pontryagin. Use is made of elliptic functions. These things are sketched in some detail by Kuznetsov in an appendix. He also mentions an alternative method for proving the uniqueness of the periodic orbit due to Dumortier and Rousseau. Concerning the fact that the approximate normal form can be transformed to the exact normal form Kuznetsov only gives a very brief sketch of the proof. Combining what is known about the qualitative behaviour of the solutions in normal form and the fact that any system satisfying the Bogdanov-Takens conditions can be reduced to this normal form provides a method for proving the existence and stability of periodic solutions and the existence of a homoclinic loop in given dynamical systems. The genericity conditions can be easier to check than for a Hopf bifurcation since they are conditions on the second derivatives rather than on the third derivatives. How can it be that it is easier to analyse a more complicated bifurcation than a simpler one? The point is that the hardest work is hidden in the proofs of the theorems about the normal form and does not have to be repeated when analysing a concrete system. The use of a BT bifurcation to help prove the existence of a Hopf bifurcation is connected to the idea of an organizing centre. The idea is to obtain insight into a dynamical system by looking for points with extremely special properties. In a given system a point like this of a given type may not exist but when it exists it may be easier to find than a point with a lower degree of speciality, even if the latter occurs more commonly. For instance we can look for a BT point instead of looking for a Hopf point. This type of strategy may be especially useful in systems where there are many parameters over which there is a lot of control. It gives a way of focussing the search in the parameter space. It is the opposite of the situation where you feel that you are looking for something you expect to be plentiful but do not know where to start.

The Bogdanov-Takens bifurcation

December 29, 2020

Consider a system of ODE of the form \dot x=f(x,\alpha) with parameters \alpha. A steady state is a solution of f(x_0,\alpha_0)=0 and it is a bifurcation point if J=D_x f(x_0,\alpha_0) has at least one eigenvalue on the imaginary axis. A common procedure in bifurcation theory is to start with those cases which are least degenerate. Thus we look at those cases with the fewest eigenvalues on the imaginary axis. If there is only one such eigenvalue, which is then necessarily zero, then I call this a spectral fold point. I use the word ‘spectral’ to indicate that this is a condition which only involves the structure of eigenvalues or eigenvectors. The full definition of a fold point also includes conditions which do not only involve the linearized equation \dot y=Jy. If the only imaginary eigenvalues are a non-zero complex conjugate pair then this is a spectral Hopf point. If the only imaginary eigenvalues are two zero eigenvalues and the kernel is only one-dimensional (so that J has a non-trivial Jordan block for the eigenvalue zero) then we have a spectral Bogdanov-Takens point.
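
For concreteness, the simplest example of a linearization satisfying the spectral Bogdanov-Takens condition is a single 2\times 2 Jordan block with eigenvalue zero:

```latex
J=\begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix},\qquad Je_1=0,\qquad Je_2=e_1.
```

Both eigenvalues of J are zero but its kernel is only one-dimensional, spanned by e_1.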

When we have picked a class of bifurcations on the basis of the eigenvalues of J the aim is to show that it can be reduced to a standard form by transformations of the unknowns and the parameters. This is often done in two steps. The first is to reduce it to an approximate normal form, which still contains error terms which are higher order in the Taylor expansion about the bifurcation point than the terms retained. The second is to transform further to eliminate the error terms and reduce the system to normal form. Normal forms for the Bogdanov-Takens bifurcation were derived independently by Bogdanov and Takens in the early 1970’s. In this post I follow the presentation in the book of Kuznetsov, which in turn is based on Bogdanov’s approach. In the notation of Kuznetsov the BT bifurcation is defined by four conditions, denoted by BT.0, BT.1, BT.2 and BT.3. The condition BT.0 is the spectral condition we have already discussed. To be able to have a BT bifurcation the number of unknowns must be at least two. The following discussion concerns the two-dimensional case. This is the essential case since higher dimensions can then be treated by using centre manifold theory to reduce them to two dimensions. The BT bifurcation is codimension two which means that in a suitable sense the set of dynamical systems exhibiting this bifurcation is a subset of codimension two in the set of all dynamical systems. Another way of saying this is that in order to find BT bifurcations which persist under small perturbations it is necessary to have at least two parameters. For these reasons we consider the case where there are two variables x and two parameters \alpha. The conditions BT.1, BT.2 and BT.3 are formulated in terms of the derivatives of f of order up to two at the bifurcation point. We choose coordinates so that the bifurcation point is at the origin.

By a linear transformation from x to new variables y the system can be put into the form \dot y=J_0y+R(y,\alpha) where J_0 is a Jordan block and the remainder term R is of order two in y and of order one in \alpha. After this a sequence of transformations is carried out leading to new unknowns \eta, new parameters \beta and a new time coordinate. This eventually leads to the equations \dot\eta_1=\eta_2 and \dot\eta_2=\beta_1+\beta_2\eta_1+\eta_1^2+s\eta_1\eta_2+O(|\eta|^3), which is the (approximate) normal form of Bogdanov. Takens introduced a somewhat different normal form. The parameter s is plus or minus one. Because of relations to the Hopf bifurcation I call the case s=-1 supercritical and the case s=1 subcritical. Let us denote the coefficients in the quadratic contribution to R by a_{ij}(\alpha) and b_{ij}(\alpha). The condition BT.1 is that a_{20}(0)+b_{11}(0)\ne 0 and it is required to allow an application of the implicit function theorem. The condition BT.2 is that b_{20}\ne 0 and it is required to allow a change of time coordinate. The final transformation involves a rescaling of both the unknowns and the parameters. The existence of the new parameters \beta as a function of the old parameters \alpha is guaranteed by the implicit function theorem and it turns out that the non-degeneracy condition is equivalent to the condition that the derivative of a certain mapping is invertible at the bifurcation point. If the equation is \dot y=g(y,\alpha) then the mapping is (y,\alpha)\mapsto (g,{\rm tr}D_y g,\det D_y g). This condition is BT.3. This equivalence is the subject of a lemma in the book which is not proved there. As far as I can see proving this requires some heavy calculations and I do not have a conceptual explanation as to why this equivalence holds. Carrying out all these steps leads to the approximate normal form. At this point there is still a lot more to be understood about the BT bifurcation. It remains to understand how to convert the approximate normal form to an exact one and how to analyse the qualitative behaviour of the solutions of the system defined by the normal form. I will leave discussion of these things to a later post.
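
For readers who want to experiment, the truncated normal form (with the O(|\eta|^3) terms dropped) is easy to integrate numerically. Here is a minimal sketch; the values of \beta_1, \beta_2 and s below are arbitrary choices and no claim is made about which region of the bifurcation diagram discussed in part 2 they lie in.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Truncated Bogdanov normal form (the O(|eta|^3) remainder is dropped).
def bt_normal_form(t, eta, beta1, beta2, s):
    e1, e2 = eta
    return [e2, beta1 + beta2 * e1 + e1 ** 2 + s * e1 * e2]

# Arbitrary illustrative parameter values near the bifurcation point.
beta1, beta2, s = -0.05, -0.2, 1

sol = solve_ivp(bt_normal_form, (0.0, 100.0), [-0.1, 0.0],
                args=(beta1, beta2, s), rtol=1e-9)

print("final state:", sol.y[:, -1])
# For these particular values the trajectory spirals into a stable steady state;
# varying (beta1, beta2, s) and plotting (eta1, eta2) shows the other behaviours.
```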

Paper on hepatitis C

November 13, 2020

Together with Alexis Nangue and other colleagues from Cameroon we have just completed a paper on mathematical modelling of hepatitis C. What is being modelled here is the dynamics of the amount of the virus and infected cells within a host. The model we study is a modification of one proposed by Guedj and Neumann, J. Theor. Biol. 267, 330. One part of it is a three-dimensional model, which is related to the basic model of virus dynamics. It is coupled with a two-dimensional model which gives a simple description of the way in which the virus replicates inside a host cell. This intracellular process is related to the mode of action of the modern drugs used to treat hepatitis C. I gave some information about the disease itself and its treatment in a previous post. In the end the object of study is a system of five ODE with many parameters.

For this system we first proved global existence and boundedness of the solutions, as well as positivity (positive initial data lead to a positive solution). One twist here is a certain lack of regularity of the coefficients of the system. When some of the unknowns become zero the right hand side of the equations is discontinuous. This means that it is necessary to prove that this singular set is not approached during the evolution of a positive solution. The source of the irregularity is the use of something called the standard incidence function instead of simple mass action kinetics. The former type of kinetics has a long history in epidemiology and I do not want to try to explain the background here. In any case, there are arguments which say that mass action kinetics leads to unrealistic results in within-host models of hepatitis and that the standard incidence function is better.
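
For orientation, and without claiming that this is exactly the form used in our paper, the difference between the two kinds of infection term typically looks as follows, where T denotes uninfected target cells, I infected cells and V virus:

```latex
\text{mass action: }\ \beta T V,\qquad\qquad
\text{standard incidence: }\ \frac{\beta T V}{T+I}.
```

In a term of the second kind the denominator vanishes when both cell populations do, which indicates one way in which a loss of regularity of the type just mentioned can arise.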

We show that the model has up to two virus-free steady states and determine their stability. The study of positive steady states is more difficult and, at the moment, incomplete. We have proved that there cannot be more than three steady states but we do not know if there is ever more than one. Under increasingly restrictive assumptions on the parameters (restrictions which are unfortunately not all biologically motivated) we show that there is at least one positive steady state, that there is exactly one and that that one is asymptotically stable. Under certain other assumptions we can show that every solution converges to a steady state. This last proof uses the method of Li and Muldowney discussed in a previous post. Learning about this method was one of the (positive) side effects of working on this project. Another was an improvement of my understanding of the concept of the basic reproductive number as discussed here and here. During this project I have learned a lot of new things, mathematical and biological, and I feel that I am now in a stronger position to tackle other projects modelling hepatitis C and other infectious diseases.


Lisa Eckhart and her novel Omama

October 2, 2020

Lisa Eckhart is a young Austrian known for a type of comedy which is called Kabarett in German-speaking countries. Characteristic features of this form of entertainment are black humour and political content. In her stage performances she likes to break taboos and flout political correctness. I first happened to see her on TV and I was immediately interested by her performance. Since then I saw her on stage in Mainz, shortly before COVID-19 made going to the theatre impossible for some time. I had also previously seen another Austrian proponent of the same type of performance whom I appreciate a lot, Josef Hader, on stage in Mainz. Recently Eckhart published her first novel, ‘Omama’, and I have now read it. In her stage performances there are some obstacles to understanding. The first is that her humour relies essentially on ambiguity. The second is that the whole thing goes very fast. The third is that she likes to mix in quite a lot of Austrian dialect. I thought that if there is some similarity between the content of the novel and that of the performances it might help me to understand more. After all, a novel can be read at the speed the reader desires and it is possible to take time to investigate anything which is unclear. Recently there has been some public controversy in Germany around Eckhart. She has been accused of antisemitism, which I do not believe is justified. In August she was supposed to give a reading from her novel at a literature festival in Hamburg. Her invitation was withdrawn because the organisers were afraid that her appearance there might lead to violent protests by left-wing groups which the police would not be able to control. I find the fact that such a thing can happen a disgrace. What happened to free speech? Later the organisers renewed the invitation in a modified form but this time Eckhart refused, which I can understand. On Wednesday evening I went to read the last few pages of the book but before I got properly started I was called by my wife, who was watching TV. When I came into the living room I understood why she had called me since Lisa Eckhart was on the screen. She was participating in a program on the channel ARTE on the subject of decadence. The word decadence was one which had not gone through my mind for many years but it interested me a long time ago. At that time I read ‘A Rebours’ by Huysmans, ‘Le Soleil des Morts’ by Mauclair and, of course, ‘Les Fleurs du Mal’. Looking back it is hard to imagine how different my world was then, a fact that has less to do with a change in the world than with a change in myself. In any case, when I think about it, it is clear that we are now living in a period of decadence.

Let me finally come to the novel itself. It struck me as a curate’s egg. Parts of it are very good. There are passages where I appreciate the humour and I find the author’s use of language impressive. On a more global level I do not find the text attractive. It is the story of the narrator’s grandmother. (Here is a marginal note for the mathematical reader. Walter Rudin, known for his analysis textbooks, was born in Austria. In a biographical text about him I read that one of his grandmothers was referred to as ‘Omama’.) The expressions are often very crude, with a large dose of excrement and other unpleasant aspects of the human body, and many elements of the story seem to me pointless. There is no single character in the novel who I find attractive. This is in contrast to the novel of Banine which I previously wrote about, where I find the narrator attractive. That novel also contains plenty of crude expressions but there are more than enough positive things to make up for it. I would like to emphasize that just because I find a novel unpleasant to read it does not mean I judge it negatively. A book which I found very unpleasant was ‘Alexis ou le traite du vain combat’ by Marguerite Yourcenar but in that case my conclusion was that it could only be so unpleasant because it was so well written. I do not have the same feeling about Omama. As to the insight which I hoped I might get for Eckhart’s stage performances I have not seen it yet, but maybe I will notice a benefit the next time I experience a stage performance by her.

Determinant of a block triangular matrix

August 26, 2020

I am sure that the fact I am going to discuss here is well known but I do not know a good source and so I decided to prove it for myself and record the answer here. Suppose we have an n\times n matrix M with complex entries m_{ij} and that it is partitioned into four blocks by considering the index ranges 1\le i\le k and k+1\le i\le n for some positive integer k<n. Suppose that the lower left block is zero so that the matrix is block upper triangular. Denote by A, B and C the upper left, upper right and lower right blocks. Then \det M=\det A\det C. This is already interesting in the block diagonal case B=0. To begin the proof of this I use the fact that over the complex numbers matrices with distinct eigenvalues (which are automatically diagonalizable) are dense in the set of all matrices. Approximate A and C by matrices A_n and C_n with distinct eigenvalues, respectively. Replacing A and C by A_n and C_n gives an approximating sequence M_n for M. If we can show that \det M_n=\det A_n\det C_n for each n then the desired result follows by continuity. The reason I like this approach is that the density statement may be complicated to prove but it is very easy to remember and can be applied over and over again. The conclusion of this is that it suffices to treat the case where A and C are diagonalizable with distinct eigenvalues. Since the determinant of a matrix of this kind is the product of its eigenvalues it is enough to show that every eigenvalue of A or C is an eigenvalue of M. In general the determinant of a matrix is equal to the determinant of its transpose. Thus the matrix and its transpose have the same eigenvalues. Putting it another way, left eigenvectors define the same set of eigenvalues as right eigenvectors. Let \lambda be an eigenvalue of A and x a corresponding eigenvector. Then [x\ 0]^T is an eigenvector of M corresponding to that same eigenvalue. Hence any eigenvalue of A is an eigenvalue of M. Next let \lambda be an eigenvalue of C and y a corresponding left eigenvector. Then [0\ y] is a left eigenvector of M with eigenvalue \lambda. We see that all eigenvalues of A and C are eigenvalues of M. To see that we get all eigenvalues of M in this way it suffices to do the approximation of A and C in such a way that A_n and C_n have no common eigenvalues for any n. Then we just need to compare the numbers of eigenvalues of the matrices A,C and M. A consequence of this result is that the characteristic polynomial of M is the product of those of A and C. An application which interests me is the following. Suppose we have a partially decoupled system of ODE \dot x=f(x,y), \dot y=g(y), where x and y are vectors. Then the derivative of the right hand side of this system at any point has the block triangular form. This is in particular true for the linearization of the system about a steady state and so the above result can be helpful in stability analyses.
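
A quick numerical check of the statement and of the factorization of the characteristic polynomial; the block sizes and entries below are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 7
A = rng.standard_normal((k, k))          # upper left block
B = rng.standard_normal((k, n - k))      # upper right block
C = rng.standard_normal((n - k, n - k))  # lower right block

M = np.block([[A, B], [np.zeros((n - k, k)), C]])   # block upper triangular matrix

print(np.linalg.det(M))
print(np.linalg.det(A) * np.linalg.det(C))          # the two values agree

# The eigenvalues of M are those of A together with those of C.
print(np.sort_complex(np.linalg.eigvals(M)))
print(np.sort_complex(np.concatenate([np.linalg.eigvals(A), np.linalg.eigvals(C)])))
```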