Archive for the ‘partial differential equations’ Category

A model for the Calvin cycle with diffusion

June 29, 2021

In the past I have spent a lot of time thinking about mathematical models for the Calvin cycle of photosynthesis. These were almost exclusively systems of ordinary differential equations. One thing which was in the back of my mind was a system involving diffusion written down in a paper by Grimbs et al. and which I will describe in more detail in what follows. Recently Burcu Gürbüz and I have been looking at the dynamics of solutions of this system in detail and we have just produced a preprint on this. The starting point is a system of five ODE with mass action kinetics written down by Grimbs et al. and which I call the MA system (for mass action). That system has only one positive steady state and on the basis of experimental data the authors of that paper were expecting more than one. To get around this discrepancy they suggested a modified system where ATP is introduced as an additional variable and the diffusion of ATP is included in the model. I call this the MAd system (for mass action with diffusion). Diffusion of the other five chemical species is not included in the model. The resulting system can be thought of as a degenerate system of six reaction reaction-diffusion equations or as a system where five ODE are coupled to one (non-degenerate) diffusion equation. The idea was that the MAd system might have more than one steady state, with an inhomogenous steady state in addition to the known homogeneous one. Experiments which measure only spatial averages would not detect the inhomogeneity. To be relevant for experiments the steady states should be stable.

For simplicity we consider the model in one space dimension. The spatial domain is then a finite closed interval and Neumann boundary conditions are imposed for the concentration of ATP. We prove that for suitable values of the parameters in the model there exist infinitely many smooth inhomogeneous steady states. It turns out that all of these are (nonlinearly) unstable. This is not a special feature of this system but in fact, as pointed out in a paper of Marciniak-Czochra et al. (J. Math. Biol. 74, 583), it is a frequent feature of systems where ODE are coupled to a diffusion equation. This can be proved using a method discussed in a previous post which allows nonlinear instability to be concluded from spectral instability. We prove the spectral instability in our example. There may also exist non-smooth inhomogeneous steady states but we did not enter into that theme in our paper. If stable inhomogeneous steady states cannot be used to explain the experimental observations what alternatives are available which still keep the same model? If experiments only measure time averages then an alternative would be limit sets other than steady states. In this context it would be interesting to know whether the system has spatially homogeneous solutions which are periodic or converge to a strange attractor. A preliminary investigation of this question in the paper did not yield definitive results. With the help of computer calculations we were able to show that it can happen that the linearization of the (spatially homogeneous) system about a steady state has non-real eigenvalues, suggesting the presence of at least damped oscillations. We proved global existence for the full system but we were not able to show whether general solutions are bounded in time, not even in L^1. It can be concluded that there are still many issues to be investigated concerning the long-time behaviour of solutions of this system.

From spectral to nonlinear instability, part 2

May 21, 2021

Here I want to extend the discussion of nonlinear instability in the previous post to the infinite-dimensional case. To formulate the problem fix a Banach space X. If the starting point is a system of PDE then finding the right X might be a non-trivial task and the best choice might depend on which application we have in mind. We consider an abstract equation of the form \frac{du}{dt}=Au+F(u). Here t\mapsto u(t) is a mapping from an interval to X. A is a linear operator which may be unbounded. In fact the case I am most interested in here is that where A is the generator of a strongly continuous semigroup of linear operators on X. Hence A may not be globally defined on X. It will be assumed that F is globally defined on X. This is not a restriction in the applications I have in mind although it might be in other interesting cases. In this setting it is necessary to think about how a solution should be defined. In fact we will define a solution of the above equation with initial value v to be a continuous solution of the integral equation u(t)=e^{tA}v+\int_0^t e^{(t-\tau)A}F(u(\tau))d\tau. Here e^{tA} is not to be thought of literally as an exponential but as an element of the semigroup generated by A. When A is bounded then e^{tA} is really an exponential and the solution concept used here is equivalent to usual concept of solution for an ODE in a Banach space. Nonlinear stability can be defined just as in the finite-dimensional case, using the norm topology. In general the fact that a solution remains in a fixed ball as long as it exists may not ensure global existence, in contrast to the case of finite dimension. We are attempting to prove instability, i.e. to prove that solutions leave a certain ball. Hence it is convenient to introduce the convention that a solution which ceases to exist after finite time is deemed to have left the ball. In other words, when proving nonlinear instability we may assume that the solution u being considered exists globally in the future. What we want to show is that there is an \epsilon>0 such that for any \delta>0 there are solutions which start in the ball of radius \delta about x_0 and leave the ball of radius \epsilon about that point.

The discussion which follows is based on a paper called ‘Spectral Condition for Instability’ by Jalal Shatah and Walter Strauss. At first sight their proof looks very different from the proof I presented in the last post and here I want to compare them, with particular attention to what happens in the finite-dimensional case. We want to show that the origin is nonlinearly unstable under the assumption that the spectrum of A intersects the half of the complex plane where the real part is positive. In the finite-dimensional case this means that A has an eigenvalue with positive real part. The spectral mapping theorem relates this to the situation where the spectrum of e^A intersects the exterior of the unit disk. In the finite-dimensional case it means that there is an eigenvalue with modulus \mu greater than one. We now consider the hypotheses of the Theorem of Shatah and Strauss. The first is that the linear operator A occurring in the equation generates a strongly continuous semigroup. The second is that the spectrum of e^A meets the exterior of the unit disk. The third is that in a neighbourhood of the origin the nonlinear term can be estimated by a power of the norm greater than one. The proof of nonlinear instability is based on three lemmas. We take e^\lambda to be a point of the spectrum of e^A whose modulus is equal to the spectral radius. In the finite-dimensional case this would be an eigenvalue and we would consider the corresponding eigenvector v. In general we need a suitable approximate analogue of this. Lemma 1 provides for this by showing that e^\lambda belongs to the approximate point spectrum of e^A. Lemma 2 then shows that there is a v whose norm grows at a rate no larger than e^{{\rm Re}\lambda t} and is such that the norm of the difference between e^Av and e^{{\rm Re}\lambda t}v can be controlled. Lemma 3 shows that the growth rate of the norm of e^A is close to e^{{\rm Re}\lambda t}. In the finite-dimensional case the proofs of Lemma 1 and Lemma 2 are trivial. In Lemma 3 the lower bound is trivial in the finite-dimensional case. To get the upper bound in that case a change of basis can be used to make the off-diagonal entries in the Jordan normal form as small as desired. This argument is similar to that used to treat A_c in the previous post. In the theorem we choose a ball B such that instability corresponds to leaving it. As long as a solution remains in that ball the nonlinearity is under good control. The idea is to show that as long as the norm of the initial condition is sufficiently small the contribution of the nonlinear term on the right hand side of the integral equation will remain small compared to that coming from the linearized equation, which is growing at a known exponential rate. The details are complicated and are of course the essence of the proof. I will not try to explain them here.

Strongly continuous semigroups

March 19, 2021

An evolution equation is an equation for a function u(t,x) depending on a time variable t and some other variables x which can often be thought of as spatial variables. There is also the case where there are no variables x, which is the one relevant for ordinary differential equations. We can reinterpret the function u as being something of the form u(t)(x), a function of x which depends on t. I am thinking of the case where u(t,x) takes its values in a Euclidean space. Then u(t) should be thought of as taking values in a function space. Different regularity assumptions on the solutions naturally lead to different choices of function space. Suppose, for instance, I consider the one-dimensional heat equation u_t=u_{xx}. Then I could choose the function space to be C^0, the space of continuous functions, C^2, the space of twice continuously differentiable functions or L^p. For some choices of function spaces we are forced to consider weak solutions. It is tempting to consider the evolution equation as an ODE in a function space. This can have advantages but also disadvantages. It gives us intuition which can suggest ideas but the analogues of statements about ODE often do not hold for more general evolution equations, in particular due to loss of compactness. (In what follows the function space considered will always be a Banach space.) We can formally write the equation of the example as u_t=Au for a linear operator A. If we choose the function space to be X=L^p then this operator cannot be globally defined, since A does not map from X to itself. This leads to the consideration of unbounded operators from X to X. This is a topic which requires care with the use of language and the ideas which we associate to certain formulations. An unbounded operator from X to X is a linear mapping from a linear subspace D(A), the domain of A, to X. Just as there may be more than one space X which is of interest for a given evolution equation there may be more than one choice of domain which is of interest even after the space has been chosen. To take account of this an unbounded operator is often written as a pair (A, D(A)). In the example we could for instance choose D(A) to be the space of C^\infty functions or the Sobolev space W^{2,p}.

In the finite-dimensional case we know the solution of the equation u_t=Au with initial datum u_0. It is u(t)=e^{tA}u_0. It is tempting to keep this formula even when A is unbounded, but it must then be supplied with a suitable interpretation. There are general ways of defining nonlinear functions of unbounded linear operators using spectral theory but here I want to pursue another direction, which uses a kind of axiomatic approach to the exponential function S(t)=e^{tA}. It should have the property that S(0)=I and it should satisfy the semigroup property S(s+t)=S(s)S(t) for all non-negative s and t. It remains to require some regularity property. One obvious possibility would be to require that S is a continuous function from [0,\infty) into the space of bounded operators L(X) with the topology defined by the operator norm. Unfortunately this is too much. Let us define an operator Ax=\lim_{t\to 0}\frac{S(t)x-x}{t} whenever this limit exists in X and D(A) to be the linear subspace for which it exists. In this way we get an unbounded operator on a definite domain. The problem with the continuity assumption made above is that it implies that D(A)=X. In other words, if the operator A is genuinely unbounded then this definition cannot apply. In particular it cannot apply to our example. It turns out that the right assumption is that \|S(t)x-x\|\to 0 for t\to 0 and any x\in D(X). This leads to what is called a strongly continuous one-parameter semigroup. A is called the infinitesimal generator of S(t). Its domain is dense and it is a closed operator, which means that its graph is a closed subset (in fact linear subspace) of the product X\times X with the topology defined by the product norm. In a case like the example above the problem with continuity is only at t=0. The solution of the heat equation is continuous in t in any reasonable topology on any reasonable Banach space for t>0 but not for t=0. In fact it is even analytic for t>0, something which is typical for linear parabolic equations.

In this discussion we have said how to start with a semigroup S(t) and get its generator (A,D(A)) but what about the converse? What is a criterion which tells us for a given operator (A,D(A)) that it is the generator of a semigroup? A fundamental result of this type is the Hille-Yosida theorem. I do not want to go into detail about this and related results here. I will just mention that it has to do with spectral theory. It is possible to define the spectrum of an unbounded operator as a generalization of the eigenvalues of a matrix. The complement of the spectrum is called the resolvent set and the resolvent is (A-\lambda I)^{-1}, which is a bounded operator. The hypotheses made on the generator of a semigroup concern the position of the resolvent set in the complex plane and estimates on the norm of the resolvent at infinity. In this context the concept of a sectorial operator arises.

My interest in these topics comes from an interest in systems of reaction-diffusion equations of the form u_t+D\Delta u=f(u). Here u is vector-valued, D is a diagonal matrix with non-negative elements and the Laplace operator is to be applied to each component of u. I have not found it easy to extract the results I would like to have from the literature. Part of the reason for this is that I am interested in examples where not all the diagonal elements of D are positive. That situation might be described as a degenerate system of reaction diffusion equations or as a system of reaction-diffusion equations coupled to a system of ODE. In that case a lot of results are not available ‘off the shelf’. Therefore to obtain an understanding it is necessary to penetrate into the underlying theory. One of the best sources I have found is the book ‘Global Solutions of Reaction-Diffusion Systems’ by Franz Rothe.

Kestrels and Dirichlet boundary conditions

February 28, 2017

The story I tell in this post is based on what I heard a long time ago in a talk by Jonathan Sherratt. References to the original work by Sherratt and his collaborators are Proc. R. Soc. Lond. B269, 327 (more biological) and SIAM J. Appl. Math. 63, 1520 (more mathematical). There are some things I say in the following which I did not find in these sources and so they are based on my memories of that talk and on things which I wrote down for my own reference at intermediate times. If this has introduced errors they will only concern details and not the basic story. The subject is a topic in population biology and how it relates to certain properties of reaction-diffusion equations.

In the north of England there is an area called the Kielder Forest with a lake in the middle and the region around the lake is inhabited by a population of the field vole Microtus\ agrestis. It is well known that populations of voles undergo large fluctuations in time. What is less known is what the spatial dependence is like. There are two alternative scenarios. In the first the population density of voles oscillates in a way which is uniform in space. In the second it is a travelling wave of the form U(x-ct). In that case the population at a fixed point of space oscillates in time but the phase of the oscillations is different at different spatial points. In general there is relatively little observational data on this type of thing. The voles in the Kielder forest are an exception to this since in that case a dedicated observer collected data which provides information on both the temporal and spatial variation of the population density. This data is the basis for the modelling which I will now describe.

The main predators of the voles are weasels Mustela\ nivalis. It is possible to set up a model where the unknowns are the populations of voles and weasels. Their interaction is modelled in a simple way common in predator-prey models. Their spatial motion is described by a diffusion term. In this way a system of reaction-diffusion equations is obtained. These are parabolic equations and to the time evolution is non-local in space. The unknowns are defined on a region with boundary which is the complement of a lake. Because of this we need not only initial values to determine a solution but also boundary conditions. How should they be chosen? In the area around the lake there live certain birds of prey, kestrels. They hunt voles from the air. In most of the area being considered there is very thick vegetation and the voles can easily hide from the kestrels. Thus the direct influence of the kestrels on the vole population is negligible and the kestrels to not need to be included in the reaction-diffusion system. They do, however, have a striking indirect effect. On the edge of the lake there is a narrow strip with little vegetation and any vole which ventures into that area is in great danger of being caught by a kestrel. This means that the kestrels essentially enforce the vanishing of the population density of voles at the edge of the lake. In other words they impose a homogeneous Dirichlet boundary condition on one of the unknowns at the boundary. Note that this is incompatible with spatially uniform oscillations. On the boundary oscillations are ruled out by the Dirichlet condition. When the PDE are solved numerically what is seen that the shore of the lake generates a train of travelling waves which propagate away from it. This can also be understood theoretically, as explained in the papers quoted above.

Fuchsian equations

July 21, 2011

Fuchsian equations are a class of differential equations which have played a big role in my research and which have now found a niche in the mathematical relativity community. In fact they have a very wide range of applications. The story begins with Lazarus Fuchs in the mid nineteenth century. He was interested in the case of a higher order linear scalar ODE for a complex-valued function whose coefficients are analytic except for isolated singularities. The cases which I have mainly been concerned with are first order nonlinear systems of PDE for real-valued functions whose coefficients may only be smooth. It sounds like almost everything is different in the two cases and so what is the similarity? The basic idea is that a Fuchsian problem concerns a system of PDE with a singularity in the coefficients where the desire is to describe the behaviour of solutions near the singularity. If certain structural conditions are satisfied then for any formal power series solution of the problem there is a corresponding actual solution. In the original situation of Fuchs the series converged and the solution was its limit. More generally there need be no convergence and the series is only asymptotic. A related task is to find singular solutions of a regular system. By suitable changes of variables this can sometimes be transformed to the problem of finding a regular solution of a singular system.

With hindsight my first contact with a problem of Fuchsian type was in some work I did with Bernd Schmidt (Class. Quantum Grav. 8, 985) where we proved the existence of certain models for spherically symmetry stars in general relativity. Writing the equations in polar coordinates leads to a system of ODE with a singularity corresponding to the centre of symmetry. We proved the existence near the centre of solutions satisfying the appropriate boundary conditions by doing a direct iteration. In later years I was interested in the problem of mathematical models of the big bang in general relativity. I spent the academic year 1994-95 at IHES (I have written about this in a previous post) and in that period I often went to PDE seminars in Orsay. On one occasion the speaker was Satyanad Kichenassamy and his subject was Fuchsian equations. This is a long term research theme of his and he has written about it extensively in his books ‘Nonlinear Wave Equations’ and ‘Fuchsian Reduction: Applications to Geometry, Cosmology and Mathematical Physics’. I found the talk very stimulating and after some time I realized that this technique might be useful for studying spacetime singularities. Kichenassamy and I cooperated on working out an application to Gowdy spacetimes and this resulted in a joint paper. A general theorem on Fuchsian systems proved there has since (together with some small later modifications) been a workhorse for investigating spacetime singularities in various classes of spacetimes.

One of the highlights of this development was the proof by Lars Andersson and myself (Commun. Math. Phys. 218, 479) that there are very general classes of solutions of the Einstein equations coupled to a massless scalar field whose initial singularities can be described in great detail. In later work with Thibault Damour, Marc Henneaux and Marsha Weaver (Ann. H. Poincare 3, 1049) we were able to generalize this considerably, in particular to the case of solutions of the vacuum Einstein equations in sufficiently high dimensions.  For these results no symmetry assumptions were necessary. More recently these results were generalized in another direction by Mark Heinzle and Patrik Sandin (arXiv:1105.1643). Fuchsian systems have also been applied to the study of the late-time behaviour of cosmological models with positive cosmological constant. In the meantime there are more satisfactory results on this question useing other methods (see this post) but this example does show that the Fuchsian method can be applied to problems in general relativity which have nothing to do with the big bang. In general this method is a kind of machine for turning heuristic calculations into theorems.

The Fuchsian method works as follows. Suppose that a system of PDE is given and write it schematically as F(u)=0. I consider the case that the equation itself is regular and the aim is to find singular solutions. Let u_0 be an explicit function which satisfies the equation up to a certain order in an expansion parameter and which is singular on a hypersurface defined by t=0. Look for a solution of the form u=u_0+t^\alpha v where t^\alpha is less singular than u_0. The original equation can be rewritten as G(v)=0, where G is singular. Now the aim is to show that there is a unique solution v of the transformed equation which vanishes as t\to 0. A theorem of this kind was proved in the analytic case in the paper mentioned above which I wrote with Kichenassamy. Results on the smooth case are harder to prove and there are less of them known. A variant of the procedure is to define u_0 as a solution (not necessarily explicit) of an equation F_0(u_0)=0 which is a simplified version of the equation F(u)=0. In the cases of spacetime singularities which have been successfully handled the latter system is the so-called velocity-dominated system.

Weighted Sobolev spaces

February 22, 2011

In the study of elliptic systems of partial differential equations on compact manifolds it is often helpful to use Sobolev spaces. The Sobolev space H^s is the space of functions which are square integrable together with their derivatives up to order s. In general it is necessary to use distributions and distributional derivatives. Sobolev spaces can be defined in a similar way for sections of vector bundles but I will restrict myself to real-valued functions here. The Sobolev spaces H^s are Hilbert spaces. It is possible to replace L^2 by L^p, 1<p<\infty in the definitions with the resulting spaces being Banach spaces. On a manifold it is necessary to specify what kind of derivatives are used, how the lengths of these objects are defined and what measure is used for integration. One way of doing this is to use a Riemannian metric defined on the given manifold. Then covariant derivatives can be used, the length of tensors is naturally defined and the metric defines a volume form. In general using different metrics defines norms which are different but equivalent. An alternative approach is based on local coordinates. Introduce a partition of unity each of whose members has a support which is contained in a coordinate chart to write any function f as a sum of functions f_i which can be transported to Euclidean space using the charts. Define the square of the Sobolev norm of f to be equal to the sum of the squares of the corresponding flat space norms of the f_i. Different choices of the partition of unity and the charts lead to norms which are equivalent to each other and to the norms defined by means of Riemannian metrics. In the end the uniqueness of the Sobolev spaces as topological vector spaces (although not as normed spaces) is guaranteed by the compactness of the manifold.

Next some properties of Sobolev spaces on compact manifolds will be listed. If s_2>s_1 then the natural embedding from H^{s_2} to H^{s_1} is compact. A linear differential operator of order k whose coefficients belong to a Sobolev space of sufficiently high order defines a bounded linear operator from H^s to H^{s-k}. If the operator is elliptic then the corresponding mapping between Sobolev spaces is Fredholm. This means by definition that its kernel is finite dimensional and that its image is closed and of finite codimension. Elliptic regularity holds: if P is an elliptic operator of order k and Pu belongs to H^s then u belongs to H^{s+k}.

On a non-compact manifold things are more complicated. We have to expect that if there is any reasonable analogue of the theory just sketched it will depend on the asymptotics of the metrics used or the type of coordinate systems chosen. I am familar with one theory of this type which plays a role in general relativity. This is the theory of weighted Sobolev spaces on manifolds with an asymptotically flat metric. Since I was interested in a generalization of this and a bit rusty on the known theory I decided to refresh my memory by rereading the basic references. These are a paper of Choquet-Bruhat and Christodoulou (Acta Math. 146, 129) and one of Bartnik (Commun. Pure Appl. Math. 39, 661). The weighted Sobolev spaces used in this theory differ from ordinary Sobolev spaces by the introduction of positive functions (weights) multiplying the volume form in the definition of the norms. The weights used depend on the choice of a real number \delta in addition to the differentiability index s. The notational conventions used in the two papers just quoted differ and to avoid any ambiguity let me state that in what follows I use Bartnik’s conventions. (They seem to be the ones most frequently encountered in the literature.)

A Riemannian manifold (M,g) is called asymptotically flat if the following conditions hold. There is a compact subset K of M such that the complement of K is diffeomorphic to the complement of a closed ball in a Euclidean space of the same dimension. If the restriction of the metric to the complement of K is transported to R^n using a suitable diffeomorphism of this type then its components in Cartesian coordinates tend to those of the flat Euclidean metric at infinity in a suitable sense. One of the ways of making this into a precise definition is to use weighted Sobolev spaces of functions on flat space. (For the experts I remark that I have restricted here to the case of one end.) The weights are powers of a function \sigma which is everywhere smooth and behaves at infinity like the distance r from a fixed point. These weights are chosen so that (with Bartnik’s conventions) the growth rate of a function in the space H^s_\delta at infinity is no greater than r^\delta and that the growth rate decreases by one for each derivative taken up to order s. These statements are to be understood in an L^2 sense and not pointwise. There do, however, exist Sobolev embedding theorems which relate them to pointwise statements. The definitions and a number of the basic properties of these spaces are valid for any complete Riemannian manifold. For instance, the embedding from H^{s_1}_{\delta_1} into H^{s_2}_{\delta_2} is compact if s_1>s_2 and \delta_1<\delta_2. The elliptic theory requires more special assumptions. It uses scaling properties of Euclidean space and the fundamental solution for the flat space Laplacian. It can be shown, for instance, that suitable second order elliptic operators with coefficients belonging to weighted Sobolev spaces are Fredholm as operators from H^{s+2}_\delta to H^s_{\delta-2} and all non-integer values of \delta. For integer values of \delta the Fredholm property is lost. This is connected to the following fact. Each Fredholm operator has an index, which is the difference between the dimensions of the kernel and cokernel. The index is invariant under continuous variations of the operator. Changing the coefficient \delta for a fixed elliptic differential operator can be thought of as a continuous variation of the operator between Hilbert spaces. However the index in general changes when passing through an integer value of \delta and so the Fredholm property must be lost there.

Critical wave maps

January 20, 2010

At the moment I am attending a research programme called ‘Quantitative studies of nonlinear wave phenomena’ at the Erwin Schrödinger Institute in Vienna. This has stimulated me to think again about a topic in the study of nonlinear wave equations which has seen remarkable developments just recently. This concerns critical wave maps. Wave maps are nonlinear generalizations of the wave equation associated to a Riemannian manifold(N,h) called the target manifold. These days wave maps are one of the most important model problems in the study of nonlinear wave equations. If (M,g) is a pseudo-Riemannian manifold and \Phi a mapping from M to N define a Lagrange function by the norm squared of the derivative of \Phi with the norm being determined by g and h. In local coordinates L=g^{ij}\partial_i\Phi^I\partial_j\Phi^J h_{IJ}. Solutions of the corresponding Euler-Lagrange equations are called harmonic maps when g is Riemannian and wave maps when g is Lorentzian. Harmonic maps are elliptic while wave maps are hyperbolic. It is common to classify nonlinear evolution equations by the scaling of the energy into subcritical, critical and supercritical. The standard point of view is then that subcritical problems are relatively straightforward, supercritical problems are impossible at present and critical problems are on the borderline. It is therefore natural that critical problems are a hive of activity at the moment. For more information about this classification and its significance see this post of Terence Tao. Actually there are a couple of
results around on problems which could be called marginally supercritical. One, due to Tao, concerns the nonlinear wave equation \nabla^\alpha\nabla_\alpha u=u^5 (\log (2+u^2)) in three space dimensions (arXiv:math/0606145). Note that the equation with nonlinearity u^5 is critical in three space dimensions. Another, due to Michael Struwe, concerns the equation \nabla^\alpha\nabla_\alpha u=ue^{u^2} in two space dimensions, where every power law nonlinearity is subcritical. Here criticality is decided by the value of the energy and Struwe’s result applies to the supercritical case, under the assumption of rotational symmetry. He gave a talk on this here on Monday.

Getting back to wave maps the criticality is decided by the dimension of the manifold M. If its dimension is two (i.e. space dimension n=1) then the problem is subcritical. The case n=2 is critical while the problem is supercritical for n>2. From now on I concentrate on the critical case. In this context there is a general guide for distinguishing between global smoothness of solutions and the development of singularities which is the curvature of the target manifold. Negative curvature tends to be good for regularity while positive curvature tends to encourage singularities. This is related to an analogous feature of harmonic maps which has been known for a long time. A good model problem for the negatively curved case is that where (N,h) is the hyperbolic plane. The thesis of Shadi Tahvildar-Zadeh played an important role in making this problem known in the mathematics community. In this work the manifold (M,g) is three-dimensional Minkowski space. It was done at the Courant Institute under the supervision of Jalal Shatah and Demetrios Christodoulou. To make the problem more tractable some symmetry assumptions were made. Consider the action of SO(2) on R^2 by rotations about a point. There is also a natural action of the rotation group on the hyperbolic plane. One symmetry assumption is invariance – applying a rotation to a point x\in R^2 leaves \Phi(x) invariant. Another symmetry assumption is equivariance – applying a rotation by an angle \theta to x leads to a rotation of \Phi(x) by an angle k\theta where k is a positive integer. It can be seen that there are different types of equivariance depending on the parameter k. The invariant case is related to the Einstein equations of general relativity. Cylindrically symmetric solutions of the vacuum Einstein equations lead to exactly this problem. On the other hand one of the key techniques used in the proofs originates in a paper of Christodoulou where he showed a stability result for an exterior region in a black hole spacetime in the presence of a scalar field. The results on the invariant case were published by Christodoulou and Tahvildar-Zadeh in 1993. They made no mention of the connection to the vacuum Einstein equations. The results on the equivariant case were published by Shatah and Tahvildar-Zadeh at about the same time. Interestingly the two papers use rather different terminology. While the work on the equivarant case uses the language of multipliers, traditional in PDE theory, the work on the invariant case uses the language of vector fields coming from the direction of general relativity.

Without the symmetry the problem becomes much harder and was the subject of a major project of Tao (project heatwave). For more information see this post of Tao and the comments on it. By the time the project was finished there were already alternative approaches to solving the same problem by Sterbenz and Tataru and Krieger and Schlag. Now this problem can be considered solved. The different papers use different methods and the detailed conclusions are different. Nevertheless they all include global regularity for wave maps from three-dimensional Minkowski space to the hyperbolic plane without any symmetry assumptions. The proof of Sterbenz and Tataru goes through an intermediate statement that if there were a singularity there would be a finite energy harmonic map into the given target. They actually concentrate on the case of a compact target manifold but the case of the hyperbolic plane can be handled by passing to the universal cover. There also give a more general statement for the case that there is a non-trivial harmonic map around. Then regularity is obtained as long as the energy is less than that of the harmonic map. These ideas are also related to results on the positive curvature case which show what some blow-up solutions look like when they occur. Consider the case that the target manifold is the two-sphere. Here it is possible, as in the case of the hyperbolic plane, to define invariant and equivariant wave maps. In recent work Raphaël and Rodnianski are able to describe the precise nature of the blow-up for an open set of initial data and for all values of the integer k describing the type of equivariance. The case k=1 is exceptional. They have also proved similar results for an analogous critical problem, the Yang-Mills equations in 4+1 dimensions. These results provide a remarkably satisfactory confirmation of conjectures made by Piotr Bizoń and collaborators on the basis of numerical and heuristic considerations which were an important stimulus for the field.

Bootstrap arguments

November 22, 2009

A bootstrap argument is an analogue of mathematical induction where the natural numbers are replaced by the non-negative real numbers. This type of argument is a powerful tool for proving long-time existence theorems for evolution equations. For instance, it plays a central role in the proof of the stability of Minkowski space by Christodoulou and Klainerman and the theorem on formation of trapped surfaces by Christodoulou discussed in previous posts. The name comes from a story where someone pulls himself up by his bootstraps, leather attachments to the back of certain boots. This story is often linked to the name of Baron Münchhausen. In another variant he pulls himself out of a bog by his pigtail. This was a person who really lived and was known for telling tall tales. In later years people wrote various books about him and incorporated many other tall tales from various sources. The word ‘booting’ applied to computers is derived from ‘bootstrapping’ in the sense of this story. There are also bootstrap methods used in statistics. They involve analysing new samples drawn from a fixed sample. In some sense this means obtaining more knowledge about a system without any further input of information. It is this aspect of ‘apparently getting something for nothing’ which is typical of the bootstrap. In French the procedure in statistics has been referred to as ‘méthode Cyrano’. Unfortunately is seems that in PDE theory the French have just adopted the English term. I say ‘unfortunately’ because of a fondness for Cyrano de Bergerac. As in the case of Münchhausen there was a real person of this name, this time a writer. However the name is much better known as that of a fictional character, the hero of a play by Edmond Rostand. The non-fictional Cyrano wrote among other things about a trip to the moon. There is also a Münchhausen story where he uses a kind of inverse bootstrap (could there be a PDE analogue here?) to return from the moon. He constructs a rope which he attaches to one of the horns of the moon but it is much too short to reach down to the ground. He climbs to the bottom of the rope, reaches up and cuts off and detaches the part ‘which he does not need any more’ and ties it onto the bottom. He then repeats this process. Returning to Cyrano, he describes seven methods for getting to the moon of which the sixth is the one relevant to the bootstrap. He stands on an iron plate and throws a magnet into the air. The iron plate is attracted by the magnet and starts to rise. Then he rapidly catches the magnet and throws it into the air again. I should point out that Cyrano does not believe in the nonsensical stories he is telling – his aim is a practical one, holding the attention of the Duc de Guiche so as to delay him for a very specific reason.

Now I return to the topic of bootstrap arguments for evolution equations. I have given a discussion of the nature of these arguments in Section 10.3 of my book. Another description can be found in section 1.3 of Terry Tao’s book ‘Nonlinear dispersive equations: local and global analysis‘. A related and more familiar concept is that of the method of continuity. Consider a statement P(t) depending on a parameter t belong to the interval [0,\infty). Let S be the subset consisting of those t for which the statement P is true on the interval [0,t). If it can be shown that S is non-empty, open and closed then it can be concluded that the statement holds for all t, by the connectedness of the interval. The special feature of a bootstrap argument is the way in which openness is obtained. Suppose that, starting from P(t), we can prove a string of implications which ends again with P(t). This is nothing other than a circular argument and proves nothing. Suppose, however, that in addition this can be improved so that the statement at the end of the string is slightly stronger than that at the beginning. This improvement is something to work with and is a typical way of proving the openness needed to apply the continuity argument. It is more convenient here to work with the open interval (0,\infty) since we want to look at properties of solutions of an evolution equation defined on the interval (0,t). Let P(t) be the statement that a certain inequality (1) holds on the interval (0,t) and suppose that P(t) implies the statment Q(t) that a stronger inequality (2) holds on the same interval. Things are usually set up so that Q(t) implies by continuity that (2) holds at t and that the the property of being ‘stronger’ then shows that P(t') holds for t' slightly greater than t. This shows the openness property. I think the best way to really understand what a bootstrap argument means is to write out a known example explicitly or, even better, to invent a new one to solve a problem which interests you. The key thing is to find the right choice of P and Q. What I have described here is only the simplest variant. In the work of Christodoulou mentioned above he uses a continuity argument on two-dimensional sets.

The Vlasov-Poisson system

June 4, 2009

The Vlasov-Poisson system is a system of partial differential equations which comes up in mathematical physics. I have been involved quite a bit with these equations and related systems for many years now. In this post I want to reflect a little on what is and is not known about solutions of this system. One of the things which has stimulated me to think more about these questions just now is a lecture course on kinetic equations which I am giving at the Free University in Berlin. Because of the physics motivation the Vlasov-Poisson system is usually studied in three space dimensions. Here I will allow the space dimension n to be general. For convenience I also introduce a parameter \gamma which can take the values +1 and -1. The equations are \partial_t f+v\cdot \nabla_x f+\gamma\nabla_x U\cdot\nabla_v f=0 and \Delta U=\rho where \rho=\int f dv. Here t is time and x and v denote position and velocity variables, each belonging to {\bf R}^n. \Delta is the Laplacian. Because of their most frequent applications the cases \gamma=1 and \gamma=-1 are often called the plasma physics case and the stellar dynamics case respectively. A natural problem here is the initial value problem (Cauchy problem) with data prescribed for f.

Local existence in the Cauchy problem is known. For n\le 3 it is furthermore known that the local solution extends to a global in time solution, independently of the sign of \gamma. (The first proofs by two different methods were given by Pfaffelmoser and Lions/Perthame around 1991.) When n\ge 4 and \gamma=-1 there are large classes of smooth initial data for which global existence fails. More specifically, these equations have a conserved energy and when this energy is negative the corresponding smooth solution breaks down after finite time. The easiest way to realize that n=4 might be important is to look at scaling properties of the equations. For a discussion of the significance of scaling properties in general see Terry Tao’s post on the Navier-Stokes equations. In the case n=3 the potential and kinetic energies satisfy an inequality of the form |{\cal E}_{\rm pot}|\le C{\cal E}_{\rm kin}^{\frac12} and this plays an important role in the global existence proof. The essential feature is that the power on the right hand side is less than one. If similar arguments are carried out in the case n=4 then the power one half is replaced by the power one. Thus in a sense n=4 is the critical case. For n\ge 4 the global existence problem for the Vlasov-Poisson system with \gamma=1 is open. For n=4 it is a critical problem and might be solvable in a not too distant future. Similar remarks might be made about the relativistic Vlasov-Poisson system with massless particles in three space dimensions which is given by \partial_t f+\hat v\cdot \nabla_x f+\nabla_x U\cdot\nabla_v f=0 where \hat v=\frac{v}{|v|}. The analogue of this last system plays an important role in the recent work of Lemou, Méhats and Raphaël on the nature of singularities in solutions of the relativistic Vlasov-Poisson system with \gamma=-1.

Other open questions concern the behaviour of solutions of the Vlasov-Poisson system at late times. There are various results on this but they seem to be far from an exhaustive understanding of the asymptotics. Interesting questions include whether the density \rho is bounded any solution with \gamma=-1 and whether \|\rho\|_{L^\infty}=O(t^{-3}) in the case \gamma=1, as is known to be the case for small initial data.

Reaction-diffusion equations, interfaces and geometric flows

November 24, 2008

The subject of this post is a reaction-diffusion equation of the form \partial_t u=\Delta u-\frac{1}{\epsilon^2} f'(u) for a real-valued function $u$ with t\in {\bf R}^+ and x\in {\bf R}^n. Here \epsilon is a small parameter. This equation is often known as the Allen-Cahn equation and the phenomena discussed in the following were described in a 1979 paper of Allen and Cahn in Acta Metallurgica (which I have not seen). My primary source is a paper by Xinfu Chen (J. Diff. Eq. 96, 116) which includes both a nice introduction to the subject and rigorous proofs of some key theorems. The function f in the equation is a smooth function with certainly qualitative properties: two minima u_1^- and u_2^- separated by a maximum u^+.

In physical terms there are two opposing effects at work in this system. There is reaction which drives the evolution to resemble that of solutions of the corresponding ODE (limit \epsilon\to\infty) and diffusion which causes the solution to spread out in space. Generic solutions of the ODE tend to one of the two minima of the potential at late times. If the diffusion is set to zero this leads to two spatial regions where the solution has almost constant values u_1^- and u_2^- respectively. They are separated by an narrow interface where the spatial derivatives of u are large and the the values of u are close to u^+. These ideas are captured mathematically by proving asympotic expansions in the parameter \epsilon for the solutions on time intervals whose lengths depend on \epsilon in a suitable way.

The following phenomena arise. Here I assume that the space dimension n is at least two. The case n=1 is special. The first statement is that on a suitable time interval, which is short when \epsilon is small, an interface forms. It can be described by a hypersurface. Without going into details it can be thought of as the level hypersurface u=u^+. The second describes how the interface moves in space on a longer timescale. If the function f has unequal values at the minima then the interface moves in the direction of its normal towards the region corresponding to the smaller minimum with a velocity which is proportional to the difference of the two values. On the timescale on which this motion takes place the interface stands still in the case that the two values are equal. It can, however, be shown that in the case of equal minima the interface moves in a definite way on an even longer timescale. The interface moves in the direction of its normal with a velocity which is proportional to its mean curvature at the given point. In other words, the hypersurface is a solution of mean curvature flow, one of the best-known geometric flows.

To finish I will state a question which I have no answer to, not even vaguely. Are there any useful analogues of these results for the hyperbolic equation \partial_t^2 u+\partial_t u=\Delta u-\frac{1}{\epsilon^2} f'(u)? Here there is a potential interaction between reaction, damping and dispersion.