The Mixmaster model

June 24, 2009 by hydrobates

The Mixmaster model arises in the study of solutions of the vacuum Einstein equations which are spatially homogeneous. To say that a solution is spatially homogeneous means that it is invariant under a group of symmetries whose orbits are three-dimensional and spacelike. This implies that it can be described by functions depending only on time and that the Einstein equations reduce to a system of ordinary differential equations. The details of these equations depend on the group defining the symmetry. In all but one case this group can be taken to be three-dimensional. Then classifying the symmetry type essentially comes down to classifying all three-dimensional Lie algebras.This was done by Bianchi in the 1890’s. He introduced types I to IX and this terminology is used in the study of the Einstein equations to this day. Types VI and VII are actually one-parameter families of non-isomorphic Lie algebras and are therefore denoted by VI{}_h and VII{}_h, where h is the parameter. The most general Bianchi types (in the sense that the solutions depend on the largest number of parameters) are types VIII, IX and VI{}_{-\frac19}. All Bianchi types contain solutions with initial singularities and the three types just listed exhibit complicated oscillatory behaviour in the approach to the singularity. The case which has been studied most is Bianchi IX and this was christened the Mixmaster model by Charles Misner in 1969. He named it after a kind of food mixer which was popular at the time. There has been a great deal of heuristic and numerical work on the dynamics of this system of ODE. Rigorous mathematical results have been relatively rare. The most notable results in this direction are those arising from the PhD thesis of Hans Ringström, some of which will now be described.

It is useful in the study of Bianchi models to distinguish between the Lie algebras which are unimodular (trace of the structure constants zero) and the rest. These are called Class A and Class B respectively. All models of Class A can be incorporated into a single dynamical system. The dynamical systems for types VIII and IX occupy open subsets of the state space and the other types are represented on lower dimensional submanifolds on the boundaries of these open sets. Class A consists of the types I, II, VI{}_0, VII{}_0, VIII and IX. It was shown by Ringström that the limit points of solutions of Bianchi type IX in the approach to the singularity (technically the \alpha-limit points) are of type I and II. For almost all initial data it was proved that the approach to the singularity is oscillatory in the sense that there are at least three distinct \alpha-limit points. More details can be found in this paper. The wider context of these results and aspects of the history of the problem are discussed in the paper ‘Mixmaster: Fact and Belief‘ by Mark Heinzle and Claes Uggla.

Recently there has been renewed interest in these problems. A project which I have started in cooperation with the research group of Bernold Fiedler has generated a lot of activity. The central idea is to bring together knowledge about the concrete application with sophisticated techniques from the theory of dynamical systems. What are the problems to be solved? A first question is whether the direct analogues of the Bianchi IX results hold for type VIII. This has not been proved and there are serious reasons for doubting if it is even true. The key issue, at least at our present state of understanding is whether Bianchi VIII solutions may have \alpha-limit points of type VI{}_0. For type VI{}_{-\frac19} almost nothing is known about the corresponding questions on a rigorous level.

Misner’s original motivation for introducing the Mixmaster model was to find an explanation for the observed isotropy of the universe. Whether this explanation can work depends on the nature of the causal structure of the geometry near the singularity. These days it is usual to explain the isotropy of the universe on the basis of inflation and so Misner’s proposal no longer has the same direct physical significance. The mathematical problem remains and has lost none of its intrinsic fascination. In this context it is desirable to understand the detailed structure of the \alpha-limit sets of generic and non-generic solutions. In particular there are three special points in the phase space, the flat Kasner solutions. Whether Misner’s mechanism is effective depends on how much time solutions spend close to these special points. Quantifying this requires a very detailed understanding of the asymptotics of solutions near the singularity. This issue is under intensive investigation. I hope to write about the results here when they are ready to come into the public domain.

These questions are of interest for the understanding of solutions far beyond the homogeneous case. This has to do with the picture of general singularities in solutions of the Einstein equations developed by Belinkii, Khalatnikov and Lifshitz, which is one of the most outstanding challenges in mathematical relativity.

Cosmological perturbation theory, part 2

June 16, 2009 by hydrobates

In a previous post which I wrote several months ago I promised some more information about the work which Paul Allen and I have been doing on cosmological perturbation theory. Next Thursday I will give a talk on the subject at a conference in Lisbon. This work has been delayed by other commitments but now we posted a paper as arXiv:0906.2517. Here I will explain some of the results. In this paper we study the equation which describes so-called scalar perturbations. I will not attempt to explain the name but instead just say that these are the type of perturbations which cosmologists use to describe processes like the formation of the distribution of galaxies. In the simplest case these are described by the equation \Phi''+\frac{6(1+w)}{1+3w}\frac1{\eta}\Phi'=w\Delta\Phi. Here \Phi is a real valued function on {\rm R}\times T^3 where T^3 is the three-dimensional torus. The prime denotes a derivative with respect to a time coordinate \eta and \Delta is the Laplacian. The constant parameter w comes from the equation of state of the fluid, assumed to be of the form p=w\rho.

The aim of the paper is to obtain information about the asymptotics of general solutions of this equation in the limits t\to 0 and t\to\infty. It may be noted that this equation bears a certain resemblance to the polarized Gowdy equation. In fact we are able to import a number of techniques which have been used in the study of the Gowdy equations to understand the solutions of the equation above. The solutions can be parametrized by certain asymptotic data in each asymptotic regime. For the limit t\to 0 this data consists of two free functions which are coefficients in an asymptotic expansion of the form \sum_i\Phi_i(\eta) \zeta_i(t). For the limit t\to\infty it turns out to be useful to distinguish between solutions which are constant on the hypersurfaces of constant \eta and those whose integral over the torus is zero for each fixed \eta. Here I restrict consideration to the latter. It then turns out that after rescaling by a suitable power of \eta the solution looks like a solution of the flat space wave equation plus a remainder term. Thus in fact the asymptotics in each direction bears a strong qualitative resemblance to the asymptotics in the corresponding direction for solutions of the polarized Gowdy equations.

The paper also proves results about more general equations of state. Of course in that case the equation above is replaced by a more complicated one. It is assumed that in the appropriate limit the equation of state is the sum of a linear term and an expression which has an asymptotic expansion in powers of the energy density \epsilon which are negligible with respect to linear terms in the given regime. In many cases the modification does not make much of a difference and the leading order asymptotics is the same as in the corresponding linear case. An exception is the late time behaviour when the coefficient of the linear term vanishes so that the leading term in f(\epsilon) in the limit \epsilon\to 0 is proportional to \epsilon^{1+\sigma} for some positive \sigma. There is a bifurcation at \sigma=\frac13. When \sigma is smaller than this value the asymptotics is similar to that in the linear case. It does however happen that in the leading term the time coordinate in the solution of the flat space wave equation has to be distorted. For \sigma>\frac13 there is a more radical change in the asymptotics. In that case the behaviour looks more like what happens in the limit t\to 0. Waves which continue to propagate for ever are replaced by waves which freeze.In a sense the natural time coordinate for describing the dynamics is one which brings infinity to a finite value. This is reminiscent of the behaviour of the gravitational field in perturbed de Sitter spacetimes.

The Vlasov-Poisson system

June 4, 2009 by hydrobates

The Vlasov-Poisson system is a system of partial differential equations which comes up in mathematical physics. I have been involved quite a bit with these equations and related systems for many years now. In this post I want to reflect a little on what is and is not known about solutions of this system. One of the things which has stimulated me to think more about these questions just now is a lecture course on kinetic equations which I am giving at the Free University in Berlin. Because of the physics motivation the Vlasov-Poisson system is usually studied in three space dimensions. Here I will allow the space dimension n to be general. For convenience I also introduce a parameter \gamma which can take the values +1 and -1. The equations are \partial_t f+v\cdot \nabla_x f+\gamma\nabla_x U\cdot\nabla_v f=0 and \Delta U=\rho where \rho=\int f dv. Here t is time and x and v denote position and velocity variables, each belonging to {\bf R}^n. \Delta is the Laplacian. Because of their most frequent applications the cases \gamma=1 and \gamma=-1 are often called the plasma physics case and the stellar dynamics case respectively. A natural problem here is the initial value problem (Cauchy problem) with data prescribed for f.

Local existence in the Cauchy problem is known. For n\le 3 it is furthermore known that the local solution extends to a global in time solution, independently of the sign of \gamma. (The first proofs by two different methods were given by Pfaffelmoser and Lions/Perthame around 1991.) When n\ge 4 and \gamma=-1 there are large classes of smooth initial data for which global existence fails. More specifically, these equations have a conserved energy and when this energy is negative the corresponding smooth solution breaks down after finite time. The easiest way to realize that n=4 might be important is to look at scaling properties of the equations. For a discussion of the significance of scaling properties in general see Terry Tao’s post on the Navier-Stokes equations. In the case n=3 the potential and kinetic energies satisfy an inequality of the form |{\cal E}_{\rm pot}|\le C{\cal E}_{\rm kin}^{\frac12} and this plays an important role in the global existence proof. The essential feature is that the power on the right hand side is less than one. If similar arguments are carried out in the case n=4 then the power one half is replaced by the power one. Thus in a sense n=4 is the critical case. For n\ge 4 the global existence problem for the Vlasov-Poisson system with \gamma=1 is open. For n=4 it is a critical problem and might be solvable in a not too distant future. Similar remarks might be made about the relativistic Vlasov-Poisson system with massless particles in three space dimensions which is given by \partial_t f+\hat v\cdot \nabla_x f+\nabla_x U\cdot\nabla_v f=0 where \hat v=\frac{v}{|v|}. The analogue of this last system plays an important role in the recent work of Lemou, Méhats and Raphaël on the nature of singularities in solutions of the relativistic Vlasov-Poisson system with \gamma=-1.

Other open questions concern the behaviour of solutions of the Vlasov-Poisson system at late times. There are various results on this but they seem to be far from an exhaustive understanding of the asymptotics. Interesting questions include whether the density \rho is bounded any solution with \gamma=-1 and whether \|\rho\|_{L^\infty}=O(t^{-3}) in the case \gamma=1, as is known to be the case for small initial data.

H1N1 and the influenza pandemic of 1918

May 15, 2009 by hydrobates

A few years ago I read the book ‘Flu. The story of the great influenza pandemic of 1918 and the search for the virus that caused it’ by Gina Kolata. (Actually it happened to be the German translation.) I found it very interesting and a pleasure to read. The recent public interest in the outbreak of influenza of type H1N1 in Mexico led me to take up the book again. I found it just as fascinating as the first time around and I ended up rereading the whole book. Things I had learned in the meantime allowed me to appreciate more aspects of the subject. In what follows I will describe some of the highlights.

The main subject of the book is the influenza pandemic of 1918. According to conservative estimates this killed 50 million people, more than died in combat in the First World War. In view of the magnitude of the catastrophe it is surprising that it is not more widely known. In 1951 Johan Hultin travelled to Brevig in Alaska and obtained samples from the remains of victims from 1918 which had been relatively well preserved in the permafrost.He had hoped that the virus might have survived but this was too optimistic. He could not do much with his samples at that time. Later it was clear to him that with more recent developments in molecular biology there were new possibilities of using this kind of material. Independently of this, in 1995 Jeffery Taubenberger and his colleagues at the Armed Forces Institute of Pathology started to try to obtain information about the 1918 influenza virus from remains of soldiers who had died at that time which were archived at the institute. They were able to determine part of the genetic sequence of the virus but the quantity of material they had was so small as to limit the possibilities. Hultin read their first publication on the subject in Science in 1997 and contacted Taubenberger, offering to return to Brevig to obtain more material. He was successful and this finally allowed Taubenberger’s team to determine the whole genome of the virus. This was completed in 2005. A few months later the virus was reconstructed at the Centers for Disease Control and Prevention (CDC).

There were two parts of the book which I liked less. This was not because the writing was any less good in those parts – what I did not like was aspects of the content. The first of these two parts concerned the fears of a pandemic in 1976 and the reactions to it. At that time a mass vaccination was carried out which was very expensive and probably not necessary. This just shows how difficult decision making can be in this kind of situation. The thing which I find disturbing is that the costs of this were dwarfed by a sum of over three billion dollars which the government had to pay to people who claimed to have suffered adverse consequences from being vaccinated. This is of course likely to affect decisions of this kind in the future. My conclusion
from this is that the worst virus is probably less dangerous than the mechanisms which take place within human society and prevent mankind from reacting to the threat of a virus in the most effective way. The second part concerns another expedition to obtain material from the influenza victims of 1918, this time in Spitzbergen. It was the idea of Kirsty Duncan. Here I want to describe my personal reactions. From the beginning I felt admiration for the efforts of Hultin and of Taubenberger and his team. In contrast, my attitude to Kirsty Duncan was immediately negative. The sequel only strengthened this impression. The expedition set up by Duncan was a failure – that could have happened to anyone. What I dislike most is the combination of a facade of integrity with the refusal to admit the truth. I find it impressive when people achieve something remarkable by working discretely and quietly rather than thriving on and exploiting publicity. I am happy that in this case luck was on the side of those who deserved it.

One interesting effect of the affair concerning the pandemic scare and the vaccination campaign of 1976 was that the idea came up that a rare side effect of the vaccination might be the disease called Guillain-Barre syndrome. This is a demyelinating disease of the peripheral nerves. It seems that at least some types of this disease are related to molecular mimicry, a subject mentioned in a previous post. In this case the disease can be triggered by the bacterium Campylobacter pylori. Perhaps this disease could be a valuable model for autoimmunity in general.

The progress of the present H1N1 epidemic can be followed in daily reports on the web page of the World Health Organization. Today (15th May) the official number of cases is 7520.

Dynamics of dendritic cells

May 11, 2009 by hydrobates

On 23.04 I heard a talk by Michel Nussenzweig from Rockefeller University. His subject was dendritic cells, a class of white blood cells. For the casual observer, such as myself, it might seem that the class is homogeneous. It was emphasized by Nussenzweig that this is far from being the case. According to him, dendritic cells come in various types and the problem of classification may be as complicated as it is in the case of their more prominent relatives, the T-cells. As mentioned in previous posts, for the latter distinctions are made between CD4+, CD8+, Th1, Th2, Th17 etc. In the meantime various classes of dendritic cells have been recognized and it seems that not distinguishing between them sufficiently carefully has been an obstacle to progress. There are, for instance, conventional dendritic cells (cDC) and plasmacytoid dendritic cells (pDC). A related issue is insufficient precision in the use of language. As an aside: a ‘follicular dendritic cell’ is something quite different from a dendritic cell in the strict sense, sharing no more with it than its name and its morphology when seen under the microscope.

Among the themes of the talk were the questions of which cells occur in the course of development of dendritic cells from hematopoietic stem cells, where the different steps of this development take place (in bone marrow, blood, lymphatic organs or other tissues) and the relations between these precursor cells and the corresponding stages in the development of other classes of cells such as macrophages. One tool which can be used to investigate these things is parabiosis, where the bloodstreams of two mice are joined together.

At one point in the talk the speaker said, ‘Here we use a bit of mathematics’. He did not say what kind of mathematics. I sometimes have the impression in talks by biologists for biologists that the feeling is that it would be impolite to the audience, if not indecent, to show any mathematics. In this case I decided to look into what the mathematical content really is. For this I looked at the paper ‘Origin of dendritic cells in peripheral lymphoid organs of mice’ (Nature Immunology 8, 578) where Nussenzweig is one of the authors. The mathematics is not in the main text – to see it is necessary to go to the “Supplementary Methods” available as a separate file in the online version of the paper. From the point of view of a mathematician the presentation of the mathematical formulae is not very convenient. For instance, an ODE like \frac{df}{dt}=g would be written as df=g^*dt. For me the first step in understanding the mathematical part was to convert the equations into TeX. This having been done, equations are:
\frac{dN}{dt}=\left(\frac{V}{C}-D+P\right)N
\frac{dM}{dt}=\left(\frac{2V}{C}+PQ\right)N-\left(\frac{V}{C}+D\right)M
\frac{dL}{dt}=\frac{3PN}{10}+\left(\frac{V}{C}-D\right)L
Here N, M and L are functions of time while the other quantities denoted by capital letters are constant parameters. The first equation is decoupled from the others and in fact plays a different role. N is the number of cDC in the spleen and is taken to be 10^6. Thus the left hand side of the first equation is assumed to vanish and this gives the relation D=\frac{V}{C}+P. Taking account of this, there remains a system of two linear ODE, which can easily be solved explicitly. The resulting solution can be used to determine the parameters in terms of experimental data. A question of interest is how many of the cells have divided in a certain time interval. This can be investigated experimentally by adding the nucleoside analogue bromodeoxyuridine (BrdU). When a cell divides this compound is incorporated into the DNA of the daughter cells. Thus an assay of BrdU can be used to measure the proportion of cells which have divided. At this point it is appropriate to explain the meaning of the variables M and N.The variable M is the proportion of cDC which are labelled with BrdU at a given time. L is the proportion of cDC in the mouse under consideration which come from the other mouse joined to it by parabiosis.

The experimental data is as follows. The proportion of cDC in the spleen which are dividing is 5 per cent. This is the quantity V in the equations. At the initial time there are no cells labelled by BrdU (M(0)=0) and no cells coming from the other mouse (L(0)=0).
After 2.2 days 50% of the cells have taken up BrdU. If ‘days’ is taken as the unit of time then this gives M(2.2)=(0.5)N. On long time scales this proportion approaches 91%, so that \lim_{t\to\infty}M(t)=(0.91)N. After 13 days the proportion of cDC from the other mouse is 22% (L(13)=(0.22)N) while on long time scales it is 30% (\lim_{t\to\infty}L(\infty)=(0.3) N). Putting this information into the explicit solution of the equations and doing some elementary algebra allows all parameters to be computed. In particular P=0.101673. This is the proportion of the cDC in the spleen which enter from the blood in unit time.This can be converted to the number of cDC which enter the spleen each hour by multiplying by 10^6\times (1/24). The result is 4236 cells per hour, a figure which, rounded up to ‘nearly 4300′, is quoted in the abstract of the paper.

The modest goal of this discussion has been to obtain an answer to the question, ‘What is the mathematical content involved in this work’.

Respecting the matter in general relativity

April 25, 2009 by hydrobates

General relativity is a theory which describes space, time and the gravitational field in terms of a Lorentzian metric g_{\alpha\beta}. A complete understanding of the gravitational field requires an understanding of the matter sources which generate it. In the Einstein equations, G_{\alpha\beta}=8\pi T_{\alpha\beta}, the left hand side depends only on g_{\alpha\beta} and is a feature of the geometry alone. On the other hand the right hand side, the energy-momentum tensor, depends not only on the metric but also on some matter fields. The right hand side of Einstein’s equations seems to have suffered from bad press coverage from an early stage. Einstein himself is often quoted as having said that the left hand side of his equations is made of marble while the right hand side is made of wood. I do not have a source for this quote – if anyone reading this does I would be grateful to hear about it. In this post I want to suggest treating that humble right hand side with more respect. If I lived in a palace made of marble with beautiful wooden furniture then I might be more impressed by the marble than by the wood. I would nevertheless do my best to prevent little boys from carving their initials into the furniture with penknives or the cat (much as I love cats) from using it as an accessory for the care of its claws.

The left hand side of the Einstein equations is universal within general relativity – it is always the same, no matter which type of physical situation is to be described. On the other hand the nature of the matter fields depends very much on what physical situation is to be described and what aspects of it are to be included in the description. It is necessary to make a choice of matter model. What is remarkable is that there is a large variety of choices which, in conjunction with the Einstein equations, lead to a consistent closed system of equations which bears no traces of the fact that other physical effects have been omitted. In fact there are three related choices which have to be made to set up the mathematical model in any given case. The first is the matter fields themselves – what kind of geometrical objects are they? The second is the expression which defines the energy-momentum tensor in terms of the matter fields and the metric. The third is the system of equations of motion which describe the dynamics of the matter. Note that in general the energy-momentum tensor depends explicitly on the metric. It is not possible to define an energy-momentum tensor unless the spacetime geometry is given. The same is true in the case of the equations of motion of the matter. They also contain the metric explicitly. Without the metric even the nature of the matter fields themselves can become ambiguous. Which positions should we choose for the indices of a tensor occurring in the description of the matter fields? From a physical point of view it is clear why the metric is necessary in so many ways. The mathematical model must be given a physical interpretation which involves the consideration of measurements. In the absence of a given geometry there is no way to talk about measurements.

When a matter model has been chosen the basic equations which are to be solved are the Einstein-matter equations, i.e the Einstein equations coupled to the equations of motion for the chosen type of matter. The unknowns are the metric and the basic matter fields. For any reasonable choice of matter fields the energy-momentum tensor has zero divergence as a consequence of the equations of motion. However the equations of motion in general contain more information than the divergence-free property of the energy-momentum tensor. For more discussion of these things together with examples see Chapter 3 of my book. I emphasize that solving the equations describing the physical situation within the given model means solving both the Einstein equations and the equations of motion of the matter. This is too often neglected in the literature. A particular danger occurs when the solutions under consideration are of low regularity. If the Einstein equations do not make sense pointwise then it should be checked that they hold in the sense of distributions. For solutions which lack regularity on a hypersurface this is expressed in the junction conditions and it is common in the literature to check that they hold. The equations of motion should also be satisfied in the sense of distributions and this is often ignored. When I use the phrase ‘in the sense of distributions’ here this is just a shorthand since the equations are nonlinear. The correct statement is that it is necessary to think carefully about the sense in which the equations are satisfied.

An example may help to make the importance of the issue clear. At the GR12 conference in Boulder in 1989 there was a heated discussion of the question, whether colliding plane waves can give rise to spontaneous creation of matter. (I emphasize that this discussion was in a purely classical context. Quantum theory was not being taken into account.) This kind of creation of matter sounds ridiculous from a physical point of view. Nevertheless people exhibited ’solutions’ which showed this type of effect. Their mistake was that they had only verified those things which I said above were usual in the literature. They had not considered whether the equations of motion of matter were satisfied. If the equations of motion are ignored, it is not surprising that arbitrary things can happen.

Models for saltatory nerve conduction

April 18, 2009 by hydrobates

As mentioned in a previous post signals in the nerves of human beings, in contrast to those in the squid giant axon, propagate in a very non-uniform way, jumping from one node of Ranvier to the next. The propagation of a signal in the squid axon can be modelled as a travelling wave solution of the Hodgkin-Huxley system. To model the saltatory conduction in human nerves a different type of model is necessary. This kind of conduction is often studied experimentally in amphibians (specifically, frogs). A model of such phenomena was introduced by FitzHugh (Biophys J. 2, 11). The model couples copies of the HH model, one for each node, to a diffusion equation with damping describing the propagation of the voltage along the membrane in each internode (i.e. the section of the axon between two adjacent nodes).The coupling is achieved by imposing transition conditions at the nodes. At one of these, where the axon is stimulated electrically, an inhomogeneous Neumann boundary condition is assumed. At the others it is assumed that the potential is continuous and that its spatial derivative satisfies a transition condition. The central themes of the paper are a numerical solution of the equations and a comparison of the results with experimental data.

In fact the conduction mechanisms at the node of Ranvier seem to somewhat different from those in the squid axon. This was studied in detail in the frog Xenopus by Frankenhaeuser and Huxley. They introduced a modified version of the HH model in order to incorporate these differences (J. Physiol. 171, 302). In the HH model the variables other than the voltage are three gating variables which can be interpreted with hindsight as corresponding to the processes by which voltage-gated sodium channels open and enter the inactive state and the process by which voltage-gated potassium channels open. In the Frankenhaeuser-Huxley model a fourth gating variable is added. I do not know if hindsight has provided it with a clear interpretation. It is referred to in the original paper as a ‘non-specific current’. Goldman and Albus (Biophys. J. 8, 596) presented an improved version of Fitzhugh’s model for saltatory conduction where the HH model was replaced by the Frankenhaeuser-Huxley model. Apart from numerical calculations they did a dimensional analysis, leading to a derivation of the relation between conduction speed and the diameter of the nerve fibre.

Koles and Rasminsky (J. Physiol. 227, 351) investigated the effect of (partial) demyelination on nerve conduction in the framework of a model of the Goldman-Albus type. In particular, they studied numerically the effects of temperature and sodium concentration on the effectiveness of nerve conduction and the occurrence of a conduction block, obtaining results broadly compatible with experimental data. They also compared results of demyelination close to the nodes or further away from them.

In a paper of Bostock and Grafe (J. Physiol. 365, 239) it was argued that the poor functioning of demyelinated nerves after repeated stimulation is due to hyperpolarization of the cell membrane and that this results from the activity of the sodium-potassium pump. This hyperpolarization occurs in normal nerve fibres but has no major effects on the function of the nerve. In general it brings the size of the depolarization required to fire a node closer to that actually produced by the previous node. The problem in the demyelinated case is that these two quanitities can be close to each other to start with so that a small change due to hyperpolarization can cause firing to fail. Note that this type of effect cannot be captured by the mathematical models discussed above since as they stand they do not include the sodium-potassium pump.

Now I have completed the homework I left myself at the end of the post ‘Migrating ion channels‘ and I have come back to my starting point having gained some height in terms of my level of understanding.

When is a dynamical system empty of information?

April 16, 2009 by hydrobates

This post is concerned with dynamical systems in the sense of systems of ordinary differential equations. It is well-known that as soon as the dimension of the system is at least three general dynamical systems can include very complicated behaviour (chaos, strange attractors etc.) which defy detailed understanding in the absence of extra restrictions. In many applications of dynamical systems to the real world (for instance to biology) we encounter the following situation. A dynamical system is postulated which involves a lot of unknowns (in particular more than three) and depends on many parameters. There is only limited information about the parameters which means that in reality we are confronted with a large class of systems. Moreover there is only a limited amount of information about which initial data lead to solutions relevant to the application of interest. This means that in effect a large class of solutions is involved when an attempt is made to study the problem. The door is wide open to the daunting complexity of general solutions of general systems. The standard reaction is to make some choice of parameters and initial data and solve for the evolution on the computer. Since only a finite number of cases can be computed there is no guarantee that what comes out is typical of what happens in general or relevant for the application.

It might be hoped that general properties of dynamical systems which arise in particular wide classes of applications might reduce some of these difficulties. An example of a class of this type is defined by the systems which describe the competition of species in ecology. In 1976 Stephen Smale (J. Math. Biol. 3, 5) showed in a three-page paper that the restriction to this type of system does not, on its own, lead to any improvement whatsoever. In more detail what he did was the following. He considered a system of ODE of the form \dot x_i=x_if_i(x) where the f_i are smooth and \frac{\partial f_i}{\partial x_j}<0 for all i,j. The inequality expresses the competitive property while the factor x_i on the right hand side ensures that if the population of a species is initially zero it stays zero. The region where all x_i are non-negative, which is the biologically interesting region, is invariant. Finally he assumes that the functions f_i are negative outside a large ball so that solutions cannot escape to infinity. Let \Delta_1 be the set where all x_i are non-negative and \sum_i x_i=1. Smale shows that there is a dynamical system in the class under discussion here where \Delta_1 is an invariant subset, all solutions converge to \Delta_1 as t\to\infty and the dynamical system on \Delta_1 is arbitrary. Thus any complication of a general dynamical system of dimension n can be built into a model of the dynamics of n+1 competing species. There is no fundamental simplification if n is at least three. Soon afterwards an analogous result was proved for a class of systems in chemical physics describing continuous flow stirred tank reactors by Perelson and Wallwork (J. Chem. Phys. 66, 4390). A paper of Kaplan and Yorke (American Naturalist 111, 1030) mentions the relevance of Smale’s result to competitive exclusion, a topic mensioned in a previous post.

It is not my intention here to spread pessimism. I hope and believe that dynamical systems can be of immense value in many scientific applications and in biology and medicine in particular. I just want to emphasize that the act of writing down a dynamical system satisfying a few general assumptions is no guarantee that some interesting information about a scientific problem has been implemented. I would add that being able to calculate a few solutions on the computer is not a sufficient reason to be confident that something has been achieved. It should be a matter of priority to develop ways of ensuring that systems proposed do contain information.

The FitzHugh-Nagumo model

March 30, 2009 by hydrobates

As mentioned in a previous post the FitzHugh-Nagumo model is a simplified version of the Hodgkin-Huxley model describing the propagation of signals in nerve cells. In this post only the spatially homogeneous case is discussed. It is possible to consider a corresponding system including a diffusion term for one of the unknowns (x in the system below), a subject touched on in a previous post. A different ODE reduction can be obtained by looking at traveling wave solutions. The equations in the homogeneous case form a system of two ODE. It is supposed to capture the essential qualitative behaviour of the system of four ODE defining the Hodgkin-Huxley model while otherwise being as simple as possible. Here the notation of the original paper of FitzHugh (Biophys. J. 1, 445) will be used.The system is

\dot x=c(y+x-x^3/3+z)

\dot y=-(x-a+by)/c

with constant parameters a, b and c. The quantity z is in general a prescribed function of t. The unknown x corresponds to the voltage in the Hodgkin-Huxley system while y corresponds to the other variables. The parameter z plays the role of the external current in the Hodgkin-Huxley system. The relation to the van der Pol oscillator, as written in the previous post, is easily seen. The parameters a and b and the function z are new. If they are set to zero we get the van der Pol equation in slightly different variables. FitzHugh gives an intuitive description of the dynamics with phase plane pictures. Depending on the values of the parameters there may be a stable stationary solution which is a global attractor, or an unstable stationary solution together with a stable periodic solution. There is more discussion of the qualitative behaviour in Murray’s book ‘Mathematical Biology’.

Theorems about the van der Pol oscillator can be found in many textbooks. Corresponding material on the FitzHugh-Nagumo model seems to be much rarer. There is, nevertheless, a big literature out there. There is a thesis of Matthias Ringqvist online which collects a lot of interesting material on the subject and can serve as a point of entry to the literature for the uninitiated, like myself. He considers the properties of a dynamical system which contains the FitzHugh-Nagumo model as a special case. He motivates this generalization by noting that the more general system includes a number of different systems of interest in different problems in applied mathematics. He discusses the presence of periodic solutions in various parameter regimes and the bifurcations they are involved in. Hopf bifurcations play an important role. Another type of bifurcation which occurs in this context is the Bautin bifurcation. This differs from Hopf case in that one of the non-degeneracy conditions (the one which does not only depend on the linearization at the point of interest) fails. There can be coexistence of more than one periodic solution.

To what extent does the FitzHugh-Nagumo model capture the dynamics present in the Hodgkin-Huxley model? Ringqvist mentions numerical work of Guckenheimer and Oliva from 2002 which suggests that the HH model exhibits chaotic dynamics.The authors do not claim to have proved rigorously that chaos is present but this is a warning that it would be foolish to think that the dynamics of the HH system might be ‘essentially understood’. Chaos is observed in the forced van der Pol oscillator, i.e. the system obtained from the van der Pol oscillator by adding a prescribed function of time. It is more impressive for me to see chaos coming up in a system which is autonomous and which occurs naturally in an application. Of course the Lorenz system, one of the icons of chaos, satisfies the first condition and, at least in a weak sense, the second.

Shilnikov’s theorems on bifurcation from a homoclinic orbit

March 26, 2009 by hydrobates

If p is a stationary solution of a dynamical system a homoclinic orbit based at p is the image of a solution which converges to p both as t\to\infty and t\to -\infty. Homoclinic orbits are typically unstable. In other words, if the given dynamical system is embedded into a family depending smoothly on a parameter \mu then generically the qualitative behaviour of solutions for \mu\ne 0 is different from that of the solutions of the original system. The homoclinic loop breaks. Consider first the case of a two-dimensional dynamical system and a hyperbolic stationary solution. Then it can be shown (Andronov-Leontovich theorem) that generically there exist periodic solutions near \mu=0. Information can also be obtained about the stability of these periodic solutions. One of the genericity assumptions is that the sum of the two eigenvalues of the linearization is non-zero. In higher dimensions generalizations of these phenomena were studied in detail by Leonid Shilnikov in the 1960’s. Here I present some of the ideas involved. My main source for this is chapter 6 of the book ‘Elements of applied bifurcation theory’ by Y. Kuznetzov. Consider a homoclinic orbit based at a hyperbolic stationary solution where all eigenvalues of the linearization are real and of multiplicity one. Suppose that exactly one of them is positive. Label them in decreasing order \lambda_1, \lambda_2, \lambda_3. The eigenvalue \lambda_2 is referred to as the leading eigenvalue. Let \sigma=\lambda_1+\lambda_2. It is called the saddle quantity. Under some genericity assumptions, which include \sigma\ne 0 it can be shown that a periodic orbit bifurcates from the homoclinic orbit. When \sigma<0 the periodic orbit is stable while for \sigma positive it is of saddle type. When all eigenvalues are real higher dimensional systems may be reduced to the three-dimensional case by a kind of centre manifold construction.

Kuznetsov’s book discusses applications of the above general constructions to the example of travelling wave solutions of the FitzHugh-Nagumo system mentioned in a previous post. Different types of bifurcations can be used to classify different patterns of firing of neurons. This was already done by Hodgkin and is discussed in the book ‘Dynamical Systems in Neuroscience’ by E. Ishikevich, of which the first chapter is available on the Internet. The bifurcations which play a central role are different from those discussed above. One of them, called SNIC bifurcation by Ishikevich, arises from a homoclinic orbit based at a non-hyperbolic stationary solution and is discussed (under a different name) in Chapter 7 of Kuznetzov. There is a distinction between Class 1 and Class 2 excitability which reminded me of type I and type II critical collapse (see below) due to the similarity of the corresponding diagrams. I doubt if there is a deeper connection – the quantities plotted in the two cases are of quite a different nature.

The type of bifurcations studied by Shilnikov have been implicated in phenomena observed in a field of study in general relativity known as critical collapse. This concerns the gravitational collapse of matter. A typical scenario is as follows. A small amount of matter (’case of small data’) disperses to infinity since its gravitational field is too weak to hold it together. A very large amount of matter collapses to a black hole since the effect of the gravitational field dominates. What happens for data of intermediate size? This question was investigated intensively using numerical methods by M. Choptuik in the case a self-gravitating scalar field. Since then many authors have investigated many related cases. The results are almost entirely numerical and it would be nice to have some theorems. Choptuik found that there is a transition between the small data and large data behaviour where there is a unique attractor solution giving rise to universal properties of the evolution. There is still no proof of the existence of ‘the Choptuik solution’ but there is strong numerical evidence pointing to an object of this kind. It is discretely self-similar (DSS). Equivalently, it is periodic when expressed in dimensionless variables. With other choices of matter model attractors are encountered which are continuously self-similar (CSS). In terms of dimensionless variables these become stationary solutions. These objects are called critical solutions. Both of these correspond to what is called type II critical collapse. The mass of the black hole formed approaches zero as the threshold between collapse and dispersion is approached. In type I collapse it tends to a non-zero value and the critical solution is stationary in the original variables. For more information see the review article of Gundlach and Martín-García.

There are examples where the existence of CSS critical solutions has been proved, although it has not been proved that they are attractors. In the case of DSS critical solutions, on the other hand, there is not a single existence theorem available. How might a theorem of this kind be proved? One possible idea would be to obtain a DSS solution from a CSS solution by bifurcation in a one-parameter family. The simplest scenario would be a Hopf bifurcation but no example of this kind has yet been found. A more complicated possibility is a bifurcation of the Shilnikov type described above. In work of Aichelburg, Bizon and Tabor (gr-qc/0512136) the authors gave numerical evidence for the occurrence of this type of bifurcation in a model of critical collapse where the matter model is a wave map with values in the three-sphere. Perhaps this could serve as a basis for an existence theorem. There are however major obstacles. It is, for instance, not clear how a rigorous formulation as a dynamical system should be set up. Moreover it is clear that whatever this dynamical system is it must be infinite-dimensional. In the most optimistic case some kind of centre manifold or Lyapunov-Schmidt reduction might be used to get back to a finite-dimensional (even low-dimensional) system. Another problem, and perhaps the most serious one, is that this is a non-local bifurcation which cannot be identified by local calculations at the stationary point.