Poisson brackets

August 9, 2009 by hydrobates

Poisson brackets are very popular in theoretical physics. In the case of a classical mechanical system described in terms of a Hamiltonian the underlying mathematical structure is a finite-dimensional manifold M and a symplectic structure \omega. This object has an inverse \omega^{-1}. The Poisson bracket of two functions f and g on M is defined in terms of their exterior derivatives as \omega^{-1}(df,dg). There is no problem here for the mathematician interested in understanding this. What is more difficult is the definition of the Poisson bracket for field theories. Formally this corresponds to allowing the manifold M to be infinite dimensional. Consider for instance a scalar field \phi, in one space dimension for maximum simplicity. Thus we are considering functions \phi (t,\theta) and I suppose (simplicity again) that they are periodic in \theta. The Lagrangian density is \frac12 (\phi_t^2-\phi_\theta^2). The momentum \pi conjugate to \phi is defined to be the time derivative \phi_t. In this case the phase space M is formally the space of functions (\phi,\pi).I will not try to specify what kind of functions – for example we can think of them as being smooth.Given two functionals F and G (i.e. two functions on the infinite dimensional space M) the Poisson bracket is defined by the formula \int_{S^1} \frac{\delta F}{\delta\phi}\frac{\delta G}{\delta\pi}-\frac{\delta F}{\delta\pi}\frac{\delta G}{\delta\phi}. The derivatives here are functional derivatives. Now, feeling certain that my physicist colleagues will react with amusement or incomprehension, I must admit that I do not understand what a functional derivative is. This formula is a hieroglyph for me.

Let us not be discouraged. The functional derivative looks something like a variational derivative, a concept which is more familiar to me. This basically just means that if I have a functional which is the integral over S^1 of a function of \phi and \pi then it is possible to consider variations (\phi(\lambda),\pi(\lambda)), intuitively curves in the manifold M, and differentiate them with respect to the parameter \lambda. If the space M can be defined as a decent infinite-dimensional manifold (e.g. a Banach manifold) then this can be related to constructions of differential geometry on that manifold. The variational derivative just corresponds to the exterior derivative and is a one-form. If we start with smooth functions then this object takes values in the dual space. In other words we encounter distributions. Now in the presence of a volume form many distributions can be identified with functions. The one-form is then integration against that function, which could be called the functional derivative and used in the definition of the Poisson bracket.If the functionals are integrals of expressions depending pointwise on \phi and \pi then this is nothing other than the partial derivative with respect to the same quantity. Notice that with this definition we can reproduce the fact that the momentum \pi is the functional derivative of the Lagrangian with respect to \phi_t If the functional is allowed to depend on the spatial derivative of \phi then a further complication is added since in that case it is necessary to integrate by parts in space to compute the function corresponding to the distribution.If we can interpret the functional derivatives as functions of \theta then the Poisson bracket can be written as \int_{S^1}\frac{\delta F}{\delta\phi}(\theta)\frac{\delta G}{\delta\pi}(\theta)-\frac{\delta F}{\delta\pi}(\theta)\frac{\delta G}{\delta\phi}(\theta)d\theta.

Metabolic networks

July 24, 2009 by hydrobates

The processes of life involve systems of coupled chemical reactions. Often these are assumed to be homogeneous in space so that the concentrations of the substances involved are functions of time alone. The dynamics of these quantities are described by a system of ODE satisfied by the concentrations. I will call a system of this kind a metabolic system. What I want to do here is to say what is special about metabolic systems compared to general ODE systems. I also want to say something about features of these systems which are typically studied in theoretical work. A feature of metabolic systems which should be mentioned immediately is that they are often very large and moreover depend on a large number of parameters. This makes rigorous analysis difficult and can lead to a strong temptation to move to numerical approaches. On the basis of my personal mathematical preferences I would like to see analytical work taken as far as possible. My main source of information on this topic is the book ‘Systems Biology in Practice’ by E. Klipp et. al.

Consider a system of n substances taking part in r reactions. The system is taken to be of the form \frac{dS}{dt}=N\nu where S is the vector of concentrations taking values in {\bf R}^n, \nu is a vector of reaction rates, taking values in {\bf R}^r and N is a constant matrix, the stoichiometric matrix. It is n by r. It contains information about how many molecules of each type are consumed or produced in each reaction. The whole system depends on m parameters which form a vector in {\bf R}^m. Notice that any system of ODE can be put into this form – simply take n=r and N to be equal to the identity. One obviously interesting question about the system is how many stationary solutions it admits. Actually what is of interest is steady state solutions where \nu is not identically zero. (Call these non-trivial.) The concentrations of all chemicals should be independent of time but there should be non-zero reaction rates so that individual reactions are converting certain chemicals into others. Non-trivial steady states are only possible if the rank of N is less than r. In general the possible reaction rates form a vector space of dimension r minus the rank of N. Notice that the definition of ‘non-trivial’ here does not only use information about the ODE system – it also uses information about a particular splitting of the right hand side into two factors.

Metabolic control analysis is an attempt to understand which changes in a metabolic system result in which changes in particular features of the solutions. The quantity \nu depends on the variables S and p. For certain purposes it may be helpful to consider the derivatives of \nu with respect to these variables. In terms of components this means considering the partial derivatives \frac{\partial\nu_i}{\partial S_j} or \frac{\partial\nu_i}{\partial p_j}. In fact it is common to consider normalized quantities such as \frac{S_j}{\nu_i}\frac{\partial\nu_i}{\partial S_j}, which is possible as long as the denominators do not vanish. The resulting quantities are called elasticities (\epsilon-elasticty for S and \pi-elasticity for p). These relative quantities seem to require giving up the picture of certain quantities as vectors. Maybe they should be thought of as bunches of scalars or as points of a manifold admitting certain preferred transformations. A different type of coefficients are known as control coefficients. They are defined in terms of steady state solutions of the system. Steady states can be changed by changing parameters in the system. It seems to me that the definition of the control coefficients requires being able to define the rates of change of some aspect of the steady state solution (e.g. one of the concentrations) with respect to another.This appears to include the implicit requirement that the steady states are locally isolated for given values of the parameters. This means that both quantities of interest are functions of a common quantity (the parameter being varied) so that we have a chance to define the derivative of one with respect to the other.

There are certain types of nonlinearities which are typically used in modelling metabolic processes. The simplest is the mass action form. Here if p molecules of a species with concentration S_1 and q molecules of a species with concentration S_2 are the inputs for a certain reaction then the reaction rate is taken to be proportional to S_1^pS_2^q. This has a simple intuitive interpretation in terms of the probability that the necessary molecules meet. The other common type of nonlinearity results from the mass action form by a Michaelis-Menten reduction, a procedure described in a previous post. This leads to a non-polynomial nonlinearity but has the important advantage of reducing the number of equations in the system.

Casimir invariants

July 13, 2009 by hydrobates

For some time I have wanted to learn about the concept of Casimir invariants and I was not very satisfied with the information I found. Now I have made new efforts to learn about this topic and I will record some of what I learned here. Let g be a finite-dimensional Lie algebra and let G be the corresponding connected and simply connected Lie group.The Lie algebra g can be identified with T_e G, the tangent space to G at the identity. There is a one-to-one correspondence between elements of this tangent space and left-invariant vector fields on G. Let S(g) be the algebra of symmetric tensors over g. Let U(g) denote the associative algebra which is the quotient of S(g) by elements of the form x\otimes y-y\otimes x-[x,y]. This is called the universal enveloping algebra of g. There is a natural embedding i of g into U(g) and it is a Lie algebra homomorphism into L(U(g)), the Lie algebra obtained from U(g) by using the commutator to define a Lie bracket. Given an associative algebra A and a Lie algebra homomorphism \phi from g to L(A) there exists an algebra homomorphism \psi from U(g) toA such that \phi=\psi\circ i. This is the universal property which appears in the name of the object. Here the fact has been used implicitly that A and L(A) can be identified as sets. An important example is given by a representation \rho of the Lie algebra g on a vector space V, which can be thought of a Lie algebra homomorphism from g to gl(V)=L(GL(V)).

A Casimir invariant, or Casimir element or Casimir operator of g is an element of the centre of U(g). What remains unclear to me is whether these three concepts are supposed to be equivalent, or just related. I am also not sure whether (in any of these cases) any element of the centre is allowed, or just a particular one or a particular type. One definition I have found is the following. Suppose that G semisimple. Then it has a Killing form K which is a non-degenerate bilinear form. Let X^i be a basis of g and let X_i be the basis of one-forms associated to it via K. (I.e. for any vector Y we have X_i(Y)=K(X^i,Y). Then the Casimir invariant is defined to be the element of U(g) given by C=\sum _i X^i X_i. This is independent of the basis chosen. Since K is an invariant bilinear form it follows that C commutes with all elements of g and in fact lies in the centre of U(g). As a consequence of the universal property it is possible to define an object \rho (C). This is an operator on V which commutes with all elements of the image of \rho. If the representation is irreducible then this implies that \rho(C) is a multiple of the identity. The factor of proportionality is a real number which is an invariant of the representation.

There is a theorem on the structure of the centre of the universal enveloping algebra of a semisimple Lie algebra which is associated with the term ‘Harish-Chandra homomorphism’. It can be used to list the number of elements required to generate the centre and their orders as polynomials in basis vectors of the Lie algebra. The number of these generators is the rank of the algebra. For instance for SL(2,R) there is one generator of order two. For SU(3) there are generators of order two and three and the rank is two. The same will be true for SL(3,R) or SU(2,1). My aim at the moment is not to learn the abstract theory in depth but rather to understand enough to do some calculations for a specific application. I plan to say more about the application in a later post.

The Mixmaster model

June 24, 2009 by hydrobates

The Mixmaster model arises in the study of solutions of the vacuum Einstein equations which are spatially homogeneous. To say that a solution is spatially homogeneous means that it is invariant under a group of symmetries whose orbits are three-dimensional and spacelike. This implies that it can be described by functions depending only on time and that the Einstein equations reduce to a system of ordinary differential equations. The details of these equations depend on the group defining the symmetry. In all but one case this group can be taken to be three-dimensional. Then classifying the symmetry type essentially comes down to classifying all three-dimensional Lie algebras.This was done by Bianchi in the 1890’s. He introduced types I to IX and this terminology is used in the study of the Einstein equations to this day. Types VI and VII are actually one-parameter families of non-isomorphic Lie algebras and are therefore denoted by VI{}_h and VII{}_h, where h is the parameter. The most general Bianchi types (in the sense that the solutions depend on the largest number of parameters) are types VIII, IX and VI{}_{-\frac19}. All Bianchi types contain solutions with initial singularities and the three types just listed exhibit complicated oscillatory behaviour in the approach to the singularity. The case which has been studied most is Bianchi IX and this was christened the Mixmaster model by Charles Misner in 1969. He named it after a kind of food mixer which was popular at the time. There has been a great deal of heuristic and numerical work on the dynamics of this system of ODE. Rigorous mathematical results have been relatively rare. The most notable results in this direction are those arising from the PhD thesis of Hans Ringström, some of which will now be described.

It is useful in the study of Bianchi models to distinguish between the Lie algebras which are unimodular (trace of the structure constants zero) and the rest. These are called Class A and Class B respectively. All models of Class A can be incorporated into a single dynamical system. The dynamical systems for types VIII and IX occupy open subsets of the state space and the other types are represented on lower dimensional submanifolds on the boundaries of these open sets. Class A consists of the types I, II, VI{}_0, VII{}_0, VIII and IX. It was shown by Ringström that the limit points of solutions of Bianchi type IX in the approach to the singularity (technically the \alpha-limit points) are of type I and II. For almost all initial data it was proved that the approach to the singularity is oscillatory in the sense that there are at least three distinct \alpha-limit points. More details can be found in this paper. The wider context of these results and aspects of the history of the problem are discussed in the paper ‘Mixmaster: Fact and Belief‘ by Mark Heinzle and Claes Uggla.

Recently there has been renewed interest in these problems. A project which I have started in cooperation with the research group of Bernold Fiedler has generated a lot of activity. The central idea is to bring together knowledge about the concrete application with sophisticated techniques from the theory of dynamical systems. What are the problems to be solved? A first question is whether the direct analogues of the Bianchi IX results hold for type VIII. This has not been proved and there are serious reasons for doubting if it is even true. The key issue, at least at our present state of understanding is whether Bianchi VIII solutions may have \alpha-limit points of type VI{}_0. For type VI{}_{-\frac19} almost nothing is known about the corresponding questions on a rigorous level.

Misner’s original motivation for introducing the Mixmaster model was to find an explanation for the observed isotropy of the universe. Whether this explanation can work depends on the nature of the causal structure of the geometry near the singularity. These days it is usual to explain the isotropy of the universe on the basis of inflation and so Misner’s proposal no longer has the same direct physical significance. The mathematical problem remains and has lost none of its intrinsic fascination. In this context it is desirable to understand the detailed structure of the \alpha-limit sets of generic and non-generic solutions. In particular there are three special points in the phase space, the flat Kasner solutions. Whether Misner’s mechanism is effective depends on how much time solutions spend close to these special points. Quantifying this requires a very detailed understanding of the asymptotics of solutions near the singularity. This issue is under intensive investigation. I hope to write about the results here when they are ready to come into the public domain.

These questions are of interest for the understanding of solutions far beyond the homogeneous case. This has to do with the picture of general singularities in solutions of the Einstein equations developed by Belinkii, Khalatnikov and Lifshitz, which is one of the most outstanding challenges in mathematical relativity.

Cosmological perturbation theory, part 2

June 16, 2009 by hydrobates

In a previous post which I wrote several months ago I promised some more information about the work which Paul Allen and I have been doing on cosmological perturbation theory. Next Thursday I will give a talk on the subject at a conference in Lisbon. This work has been delayed by other commitments but now we posted a paper as arXiv:0906.2517. Here I will explain some of the results. In this paper we study the equation which describes so-called scalar perturbations. I will not attempt to explain the name but instead just say that these are the type of perturbations which cosmologists use to describe processes like the formation of the distribution of galaxies. In the simplest case these are described by the equation \Phi''+\frac{6(1+w)}{1+3w}\frac1{\eta}\Phi'=w\Delta\Phi. Here \Phi is a real valued function on {\rm R}\times T^3 where T^3 is the three-dimensional torus. The prime denotes a derivative with respect to a time coordinate \eta and \Delta is the Laplacian. The constant parameter w comes from the equation of state of the fluid, assumed to be of the form p=w\rho.

The aim of the paper is to obtain information about the asymptotics of general solutions of this equation in the limits t\to 0 and t\to\infty. It may be noted that this equation bears a certain resemblance to the polarized Gowdy equation. In fact we are able to import a number of techniques which have been used in the study of the Gowdy equations to understand the solutions of the equation above. The solutions can be parametrized by certain asymptotic data in each asymptotic regime. For the limit t\to 0 this data consists of two free functions which are coefficients in an asymptotic expansion of the form \sum_i\Phi_i(\eta) \zeta_i(t). For the limit t\to\infty it turns out to be useful to distinguish between solutions which are constant on the hypersurfaces of constant \eta and those whose integral over the torus is zero for each fixed \eta. Here I restrict consideration to the latter. It then turns out that after rescaling by a suitable power of \eta the solution looks like a solution of the flat space wave equation plus a remainder term. Thus in fact the asymptotics in each direction bears a strong qualitative resemblance to the asymptotics in the corresponding direction for solutions of the polarized Gowdy equations.

The paper also proves results about more general equations of state. Of course in that case the equation above is replaced by a more complicated one. It is assumed that in the appropriate limit the equation of state is the sum of a linear term and an expression which has an asymptotic expansion in powers of the energy density \epsilon which are negligible with respect to linear terms in the given regime. In many cases the modification does not make much of a difference and the leading order asymptotics is the same as in the corresponding linear case. An exception is the late time behaviour when the coefficient of the linear term vanishes so that the leading term in f(\epsilon) in the limit \epsilon\to 0 is proportional to \epsilon^{1+\sigma} for some positive \sigma. There is a bifurcation at \sigma=\frac13. When \sigma is smaller than this value the asymptotics is similar to that in the linear case. It does however happen that in the leading term the time coordinate in the solution of the flat space wave equation has to be distorted. For \sigma>\frac13 there is a more radical change in the asymptotics. In that case the behaviour looks more like what happens in the limit t\to 0. Waves which continue to propagate for ever are replaced by waves which freeze.In a sense the natural time coordinate for describing the dynamics is one which brings infinity to a finite value. This is reminiscent of the behaviour of the gravitational field in perturbed de Sitter spacetimes.

The Vlasov-Poisson system

June 4, 2009 by hydrobates

The Vlasov-Poisson system is a system of partial differential equations which comes up in mathematical physics. I have been involved quite a bit with these equations and related systems for many years now. In this post I want to reflect a little on what is and is not known about solutions of this system. One of the things which has stimulated me to think more about these questions just now is a lecture course on kinetic equations which I am giving at the Free University in Berlin. Because of the physics motivation the Vlasov-Poisson system is usually studied in three space dimensions. Here I will allow the space dimension n to be general. For convenience I also introduce a parameter \gamma which can take the values +1 and -1. The equations are \partial_t f+v\cdot \nabla_x f+\gamma\nabla_x U\cdot\nabla_v f=0 and \Delta U=\rho where \rho=\int f dv. Here t is time and x and v denote position and velocity variables, each belonging to {\bf R}^n. \Delta is the Laplacian. Because of their most frequent applications the cases \gamma=1 and \gamma=-1 are often called the plasma physics case and the stellar dynamics case respectively. A natural problem here is the initial value problem (Cauchy problem) with data prescribed for f.

Local existence in the Cauchy problem is known. For n\le 3 it is furthermore known that the local solution extends to a global in time solution, independently of the sign of \gamma. (The first proofs by two different methods were given by Pfaffelmoser and Lions/Perthame around 1991.) When n\ge 4 and \gamma=-1 there are large classes of smooth initial data for which global existence fails. More specifically, these equations have a conserved energy and when this energy is negative the corresponding smooth solution breaks down after finite time. The easiest way to realize that n=4 might be important is to look at scaling properties of the equations. For a discussion of the significance of scaling properties in general see Terry Tao’s post on the Navier-Stokes equations. In the case n=3 the potential and kinetic energies satisfy an inequality of the form |{\cal E}_{\rm pot}|\le C{\cal E}_{\rm kin}^{\frac12} and this plays an important role in the global existence proof. The essential feature is that the power on the right hand side is less than one. If similar arguments are carried out in the case n=4 then the power one half is replaced by the power one. Thus in a sense n=4 is the critical case. For n\ge 4 the global existence problem for the Vlasov-Poisson system with \gamma=1 is open. For n=4 it is a critical problem and might be solvable in a not too distant future. Similar remarks might be made about the relativistic Vlasov-Poisson system with massless particles in three space dimensions which is given by \partial_t f+\hat v\cdot \nabla_x f+\nabla_x U\cdot\nabla_v f=0 where \hat v=\frac{v}{|v|}. The analogue of this last system plays an important role in the recent work of Lemou, Méhats and Raphaël on the nature of singularities in solutions of the relativistic Vlasov-Poisson system with \gamma=-1.

Other open questions concern the behaviour of solutions of the Vlasov-Poisson system at late times. There are various results on this but they seem to be far from an exhaustive understanding of the asymptotics. Interesting questions include whether the density \rho is bounded any solution with \gamma=-1 and whether \|\rho\|_{L^\infty}=O(t^{-3}) in the case \gamma=1, as is known to be the case for small initial data.

H1N1 and the influenza pandemic of 1918

May 15, 2009 by hydrobates

A few years ago I read the book ‘Flu. The story of the great influenza pandemic of 1918 and the search for the virus that caused it’ by Gina Kolata. (Actually it happened to be the German translation.) I found it very interesting and a pleasure to read. The recent public interest in the outbreak of influenza of type H1N1 in Mexico led me to take up the book again. I found it just as fascinating as the first time around and I ended up rereading the whole book. Things I had learned in the meantime allowed me to appreciate more aspects of the subject. In what follows I will describe some of the highlights.

The main subject of the book is the influenza pandemic of 1918. According to conservative estimates this killed 50 million people, more than died in combat in the First World War. In view of the magnitude of the catastrophe it is surprising that it is not more widely known. In 1951 Johan Hultin travelled to Brevig in Alaska and obtained samples from the remains of victims from 1918 which had been relatively well preserved in the permafrost.He had hoped that the virus might have survived but this was too optimistic. He could not do much with his samples at that time. Later it was clear to him that with more recent developments in molecular biology there were new possibilities of using this kind of material. Independently of this, in 1995 Jeffery Taubenberger and his colleagues at the Armed Forces Institute of Pathology started to try to obtain information about the 1918 influenza virus from remains of soldiers who had died at that time which were archived at the institute. They were able to determine part of the genetic sequence of the virus but the quantity of material they had was so small as to limit the possibilities. Hultin read their first publication on the subject in Science in 1997 and contacted Taubenberger, offering to return to Brevig to obtain more material. He was successful and this finally allowed Taubenberger’s team to determine the whole genome of the virus. This was completed in 2005. A few months later the virus was reconstructed at the Centers for Disease Control and Prevention (CDC).

There were two parts of the book which I liked less. This was not because the writing was any less good in those parts – what I did not like was aspects of the content. The first of these two parts concerned the fears of a pandemic in 1976 and the reactions to it. At that time a mass vaccination was carried out which was very expensive and probably not necessary. This just shows how difficult decision making can be in this kind of situation. The thing which I find disturbing is that the costs of this were dwarfed by a sum of over three billion dollars which the government had to pay to people who claimed to have suffered adverse consequences from being vaccinated. This is of course likely to affect decisions of this kind in the future. My conclusion
from this is that the worst virus is probably less dangerous than the mechanisms which take place within human society and prevent mankind from reacting to the threat of a virus in the most effective way. The second part concerns another expedition to obtain material from the influenza victims of 1918, this time in Spitzbergen. It was the idea of Kirsty Duncan. Here I want to describe my personal reactions. From the beginning I felt admiration for the efforts of Hultin and of Taubenberger and his team. In contrast, my attitude to Kirsty Duncan was immediately negative. The sequel only strengthened this impression. The expedition set up by Duncan was a failure – that could have happened to anyone. What I dislike most is the combination of a facade of integrity with the refusal to admit the truth. I find it impressive when people achieve something remarkable by working discretely and quietly rather than thriving on and exploiting publicity. I am happy that in this case luck was on the side of those who deserved it.

One interesting effect of the affair concerning the pandemic scare and the vaccination campaign of 1976 was that the idea came up that a rare side effect of the vaccination might be the disease called Guillain-Barre syndrome. This is a demyelinating disease of the peripheral nerves. It seems that at least some types of this disease are related to molecular mimicry, a subject mentioned in a previous post. In this case the disease can be triggered by the bacterium Campylobacter pylori. Perhaps this disease could be a valuable model for autoimmunity in general.

The progress of the present H1N1 epidemic can be followed in daily reports on the web page of the World Health Organization. Today (15th May) the official number of cases is 7520.

Dynamics of dendritic cells

May 11, 2009 by hydrobates

On 23.04 I heard a talk by Michel Nussenzweig from Rockefeller University. His subject was dendritic cells, a class of white blood cells. For the casual observer, such as myself, it might seem that the class is homogeneous. It was emphasized by Nussenzweig that this is far from being the case. According to him, dendritic cells come in various types and the problem of classification may be as complicated as it is in the case of their more prominent relatives, the T-cells. As mentioned in previous posts, for the latter distinctions are made between CD4+, CD8+, Th1, Th2, Th17 etc. In the meantime various classes of dendritic cells have been recognized and it seems that not distinguishing between them sufficiently carefully has been an obstacle to progress. There are, for instance, conventional dendritic cells (cDC) and plasmacytoid dendritic cells (pDC). A related issue is insufficient precision in the use of language. As an aside: a ‘follicular dendritic cell’ is something quite different from a dendritic cell in the strict sense, sharing no more with it than its name and its morphology when seen under the microscope.

Among the themes of the talk were the questions of which cells occur in the course of development of dendritic cells from hematopoietic stem cells, where the different steps of this development take place (in bone marrow, blood, lymphatic organs or other tissues) and the relations between these precursor cells and the corresponding stages in the development of other classes of cells such as macrophages. One tool which can be used to investigate these things is parabiosis, where the bloodstreams of two mice are joined together.

At one point in the talk the speaker said, ‘Here we use a bit of mathematics’. He did not say what kind of mathematics. I sometimes have the impression in talks by biologists for biologists that the feeling is that it would be impolite to the audience, if not indecent, to show any mathematics. In this case I decided to look into what the mathematical content really is. For this I looked at the paper ‘Origin of dendritic cells in peripheral lymphoid organs of mice’ (Nature Immunology 8, 578) where Nussenzweig is one of the authors. The mathematics is not in the main text – to see it is necessary to go to the “Supplementary Methods” available as a separate file in the online version of the paper. From the point of view of a mathematician the presentation of the mathematical formulae is not very convenient. For instance, an ODE like \frac{df}{dt}=g would be written as df=g^*dt. For me the first step in understanding the mathematical part was to convert the equations into TeX. This having been done, equations are:
\frac{dN}{dt}=\left(\frac{V}{C}-D+P\right)N
\frac{dM}{dt}=\left(\frac{2V}{C}+PQ\right)N-\left(\frac{V}{C}+D\right)M
\frac{dL}{dt}=\frac{3PN}{10}+\left(\frac{V}{C}-D\right)L
Here N, M and L are functions of time while the other quantities denoted by capital letters are constant parameters. The first equation is decoupled from the others and in fact plays a different role. N is the number of cDC in the spleen and is taken to be 10^6. Thus the left hand side of the first equation is assumed to vanish and this gives the relation D=\frac{V}{C}+P. Taking account of this, there remains a system of two linear ODE, which can easily be solved explicitly. The resulting solution can be used to determine the parameters in terms of experimental data. A question of interest is how many of the cells have divided in a certain time interval. This can be investigated experimentally by adding the nucleoside analogue bromodeoxyuridine (BrdU). When a cell divides this compound is incorporated into the DNA of the daughter cells. Thus an assay of BrdU can be used to measure the proportion of cells which have divided. At this point it is appropriate to explain the meaning of the variables M and N.The variable M is the proportion of cDC which are labelled with BrdU at a given time. L is the proportion of cDC in the mouse under consideration which come from the other mouse joined to it by parabiosis.

The experimental data is as follows. The proportion of cDC in the spleen which are dividing is 5 per cent. This is the quantity V in the equations. At the initial time there are no cells labelled by BrdU (M(0)=0) and no cells coming from the other mouse (L(0)=0).
After 2.2 days 50% of the cells have taken up BrdU. If ‘days’ is taken as the unit of time then this gives M(2.2)=(0.5)N. On long time scales this proportion approaches 91%, so that \lim_{t\to\infty}M(t)=(0.91)N. After 13 days the proportion of cDC from the other mouse is 22% (L(13)=(0.22)N) while on long time scales it is 30% (\lim_{t\to\infty}L(\infty)=(0.3) N). Putting this information into the explicit solution of the equations and doing some elementary algebra allows all parameters to be computed. In particular P=0.101673. This is the proportion of the cDC in the spleen which enter from the blood in unit time.This can be converted to the number of cDC which enter the spleen each hour by multiplying by 10^6\times (1/24). The result is 4236 cells per hour, a figure which, rounded up to ‘nearly 4300′, is quoted in the abstract of the paper.

The modest goal of this discussion has been to obtain an answer to the question, ‘What is the mathematical content involved in this work’.

Respecting the matter in general relativity

April 25, 2009 by hydrobates

General relativity is a theory which describes space, time and the gravitational field in terms of a Lorentzian metric g_{\alpha\beta}. A complete understanding of the gravitational field requires an understanding of the matter sources which generate it. In the Einstein equations, G_{\alpha\beta}=8\pi T_{\alpha\beta}, the left hand side depends only on g_{\alpha\beta} and is a feature of the geometry alone. On the other hand the right hand side, the energy-momentum tensor, depends not only on the metric but also on some matter fields. The right hand side of Einstein’s equations seems to have suffered from bad press coverage from an early stage. Einstein himself is often quoted as having said that the left hand side of his equations is made of marble while the right hand side is made of wood. I do not have a source for this quote – if anyone reading this does I would be grateful to hear about it. In this post I want to suggest treating that humble right hand side with more respect. If I lived in a palace made of marble with beautiful wooden furniture then I might be more impressed by the marble than by the wood. I would nevertheless do my best to prevent little boys from carving their initials into the furniture with penknives or the cat (much as I love cats) from using it as an accessory for the care of its claws.

The left hand side of the Einstein equations is universal within general relativity – it is always the same, no matter which type of physical situation is to be described. On the other hand the nature of the matter fields depends very much on what physical situation is to be described and what aspects of it are to be included in the description. It is necessary to make a choice of matter model. What is remarkable is that there is a large variety of choices which, in conjunction with the Einstein equations, lead to a consistent closed system of equations which bears no traces of the fact that other physical effects have been omitted. In fact there are three related choices which have to be made to set up the mathematical model in any given case. The first is the matter fields themselves – what kind of geometrical objects are they? The second is the expression which defines the energy-momentum tensor in terms of the matter fields and the metric. The third is the system of equations of motion which describe the dynamics of the matter. Note that in general the energy-momentum tensor depends explicitly on the metric. It is not possible to define an energy-momentum tensor unless the spacetime geometry is given. The same is true in the case of the equations of motion of the matter. They also contain the metric explicitly. Without the metric even the nature of the matter fields themselves can become ambiguous. Which positions should we choose for the indices of a tensor occurring in the description of the matter fields? From a physical point of view it is clear why the metric is necessary in so many ways. The mathematical model must be given a physical interpretation which involves the consideration of measurements. In the absence of a given geometry there is no way to talk about measurements.

When a matter model has been chosen the basic equations which are to be solved are the Einstein-matter equations, i.e the Einstein equations coupled to the equations of motion for the chosen type of matter. The unknowns are the metric and the basic matter fields. For any reasonable choice of matter fields the energy-momentum tensor has zero divergence as a consequence of the equations of motion. However the equations of motion in general contain more information than the divergence-free property of the energy-momentum tensor. For more discussion of these things together with examples see Chapter 3 of my book. I emphasize that solving the equations describing the physical situation within the given model means solving both the Einstein equations and the equations of motion of the matter. This is too often neglected in the literature. A particular danger occurs when the solutions under consideration are of low regularity. If the Einstein equations do not make sense pointwise then it should be checked that they hold in the sense of distributions. For solutions which lack regularity on a hypersurface this is expressed in the junction conditions and it is common in the literature to check that they hold. The equations of motion should also be satisfied in the sense of distributions and this is often ignored. When I use the phrase ‘in the sense of distributions’ here this is just a shorthand since the equations are nonlinear. The correct statement is that it is necessary to think carefully about the sense in which the equations are satisfied.

An example may help to make the importance of the issue clear. At the GR12 conference in Boulder in 1989 there was a heated discussion of the question, whether colliding plane waves can give rise to spontaneous creation of matter. (I emphasize that this discussion was in a purely classical context. Quantum theory was not being taken into account.) This kind of creation of matter sounds ridiculous from a physical point of view. Nevertheless people exhibited ’solutions’ which showed this type of effect. Their mistake was that they had only verified those things which I said above were usual in the literature. They had not considered whether the equations of motion of matter were satisfied. If the equations of motion are ignored, it is not surprising that arbitrary things can happen.

Models for saltatory nerve conduction

April 18, 2009 by hydrobates

As mentioned in a previous post signals in the nerves of human beings, in contrast to those in the squid giant axon, propagate in a very non-uniform way, jumping from one node of Ranvier to the next. The propagation of a signal in the squid axon can be modelled as a travelling wave solution of the Hodgkin-Huxley system. To model the saltatory conduction in human nerves a different type of model is necessary. This kind of conduction is often studied experimentally in amphibians (specifically, frogs). A model of such phenomena was introduced by FitzHugh (Biophys J. 2, 11). The model couples copies of the HH model, one for each node, to a diffusion equation with damping describing the propagation of the voltage along the membrane in each internode (i.e. the section of the axon between two adjacent nodes).The coupling is achieved by imposing transition conditions at the nodes. At one of these, where the axon is stimulated electrically, an inhomogeneous Neumann boundary condition is assumed. At the others it is assumed that the potential is continuous and that its spatial derivative satisfies a transition condition. The central themes of the paper are a numerical solution of the equations and a comparison of the results with experimental data.

In fact the conduction mechanisms at the node of Ranvier seem to somewhat different from those in the squid axon. This was studied in detail in the frog Xenopus by Frankenhaeuser and Huxley. They introduced a modified version of the HH model in order to incorporate these differences (J. Physiol. 171, 302). In the HH model the variables other than the voltage are three gating variables which can be interpreted with hindsight as corresponding to the processes by which voltage-gated sodium channels open and enter the inactive state and the process by which voltage-gated potassium channels open. In the Frankenhaeuser-Huxley model a fourth gating variable is added. I do not know if hindsight has provided it with a clear interpretation. It is referred to in the original paper as a ‘non-specific current’. Goldman and Albus (Biophys. J. 8, 596) presented an improved version of Fitzhugh’s model for saltatory conduction where the HH model was replaced by the Frankenhaeuser-Huxley model. Apart from numerical calculations they did a dimensional analysis, leading to a derivation of the relation between conduction speed and the diameter of the nerve fibre.

Koles and Rasminsky (J. Physiol. 227, 351) investigated the effect of (partial) demyelination on nerve conduction in the framework of a model of the Goldman-Albus type. In particular, they studied numerically the effects of temperature and sodium concentration on the effectiveness of nerve conduction and the occurrence of a conduction block, obtaining results broadly compatible with experimental data. They also compared results of demyelination close to the nodes or further away from them.

In a paper of Bostock and Grafe (J. Physiol. 365, 239) it was argued that the poor functioning of demyelinated nerves after repeated stimulation is due to hyperpolarization of the cell membrane and that this results from the activity of the sodium-potassium pump. This hyperpolarization occurs in normal nerve fibres but has no major effects on the function of the nerve. In general it brings the size of the depolarization required to fire a node closer to that actually produced by the previous node. The problem in the demyelinated case is that these two quanitities can be close to each other to start with so that a small change due to hyperpolarization can cause firing to fail. Note that this type of effect cannot be captured by the mathematical models discussed above since as they stand they do not include the sodium-potassium pump.

Now I have completed the homework I left myself at the end of the post ‘Migrating ion channels‘ and I have come back to my starting point having gained some height in terms of my level of understanding.