## Archive for the ‘dynamical systems’ Category

### Feinberg’s proof of the deficiency zero theorem

November 23, 2015

I discussed the deficiency zero theorem of chemical reaction network theory (CRNT) in a previous post. (Some further comments on this can be found here and here.) This semester I am giving a lecture course on chemical reaction network theory. Lecture notes are growing with the course and can be found in German and English versions on the web page of the course. The English version can also be found here. Apart from introductory material the first main part of the course was a proof of the Deficiency Zero Theorem. There are different related proofs in the literature and I have followed the approach in the classic lecture notes of Feinberg on the subject closely. The proof in those notes is essentially self-contained apart from one major input from a paper of Feinberg and Horn (Arch. Rat. Mech. Anal. 66, 83). In this post I want to give a high-level overview of the proof.

The starting point of CRNT is a reaction network. It can be represented by a directed graph where the nodes are the complexes (left or right hand sides of reactions) and the directed edges correspond to the reactions themselves. The connected components of this graph are called the linkage classes of the network and their number is usually denoted by $l$. If two nodes can be connected by oriented paths in both directions they are said to be strongly equivalent. The corresponding equivalence classes are called strong linkage classes. A strong linkage class is called terminal if there is no directed edge leaving it. The number of terminal strong linkage classes is usually denoted by $t$. From the starting point of the network making the assumption of mass action kinetics allows a system of ODE $\dot c=f(c)$ to be obtained in an algorithmic way. The quantity $c(t)$ is a vector of concentrations as a function of time. Basic mathematical objects involved in the definition of the network are the set $\cal S$ of chemical species, the set $\cal C$ of complexes and the set $\cal R$ of reactions. An important role is also played by the vector spaces of real-valued functions on these finite sets which I will denote by $F({\cal S})$, $F({\cal C})$ and $F({\cal R})$, respectively. Using natural bases they can be identified with $R^m$, $R^n$ and $R^r$. The vector $c(t)$ is an element of $F({\cal S})$. The mapping $f$ from $F({\cal S})$ to itself can be written as a composition of three mappings, two of them linear, $f=YA_k\Psi$. Here $Y$, the complex matrix, is a linear mapping from $F({\cal C})$ to $F({\cal S})$. $A_k$ is a linear mapping from $F({\cal C})$ to itself. The subscript $k$ is there because this matrix is dependent on the reaction constants, which are typically denoted by $k$. It is also possible to write $f$ in the form $Nv$ where $v$ describes the reaction rates and $N$ is the stoichiometric matrix. The image of $N$ is called the stoichiometric subspace and its dimension, the rank of the network, is usually denoted by $s$. The additive cosets of the stoichiometric subspace are called stoichiometric compatibility classes and are clearly invariant under the time evolution. Finally, $\Psi$ is a nonlinear mapping from $F({\cal S})$ to $F({\cal C})$. The mapping $\Psi$ is a generalized polynomial mapping in the sense that its components are products of powers of the components of $c$. This means that $\log\Psi$ depends linearly on the logarithms of the components of $c$. The condition for a stationary solution can be written as $\Psi(c)\in {\rm ker} (YA_k)$. The image of $\Psi$ is got by exponentiating the image of a linear mapping. The matrix of this linear mapping in natural bases is $Y^T$. Thus in looking for stationary solutions we are interested in finding the intersection of the manifold which is the image of $\Psi$ with the kernel of $YA_k$. The simplest way to define the deficiency of the network is to declare it to be $\delta=n-l-s$. A fact which is not evident from this definition is that $\delta$ is always non-negative. In fact $\delta$ is the dimension of the vector space ${\rm ker} Y\cap {\rm span}(\Delta)$ where $\Delta$ is the set of complexes of the network. An alternative concept of deficiency, which can be found in lecture notes of Gunawardena, is the dimension $\delta'$ of the space ${\rm ker} Y\cap{\rm im} A_k$. Since this vector space is a subspace of the other we have the inequality $\delta'\le\delta$. The two spaces are equal precisely when each linkage class contains exactly one terminal strong linkage class. This is, in particular, true for weakly reversible networks. The distinction between the two definitions is often not mentioned since they are equal for most networks usually considered.

If $c^*$ is a stationary solution then $A_k\Psi (c^*)$ belongs to ${\rm ker} Y\cap{\rm im} A_k$. If $\delta'=0$ (and in particular if $\delta=0$) then this means that $A_k\Psi (c^*)=0$. In other words $\Psi (c^*)$ belongs to the kernel of $A_k$. Stationary solutions of this type are called complex balanced. It turns out that if $c^*$ is a complex balanced stationary solution the stationary solutions are precisely those points $c$ for which $\log c-\log c_*$ lies in the orthogonal complement of the stoichiometric subspace. It follows that whenever we have one solution we get a whole manifold of them of dimension $n-s$. It can be shown that each manifold of this type meets each stoichiometric class in precisely one point. This is proved using a variational argument and a little convex analysis.

It is clear from what has been said up to now that it is important to understand the positive elements of the kernel of $A_k$. This kernel has dimension $t$ and a basis each of whose elements is positive on a terminal strong linkage class and zero otherwise. Weak reversibility is equivalent to the condition that the union of the terminal strong linkage classes is the set of all complexes. It can be concluded that when the network is not weakly reversible there exists no positive element of the kernel of $A_k$. Thus for a network which is not weakly reversible and has deficiency zero there exist no positive stationary solutions. This is part of the Deficiency Zero Theorem. Now consider the weakly reversible case. There a key statement of the Deficiency Zero Theorem is that there exists a complex balanced stationary solution $c^*$. Where does this $c^*$ come from? We sum the vectors in the basis of ${\rm ker} A_k$ and due to weak reversibility this gives something which is positive. Then we take the logarithm of the result. When $\delta=0$ this can be represented as a sum of two contributions where one is of the form $Y^T z$. Then $c^*=e^z$. A further part of the deficiency zero theorem is that the stationary solution $c^*$ in the weakly reversible case is asymptotically stable. This is proved using the fact that for a complex balanced stationary solution the function $h(c)=\sum_{s\in\cal S}[c(s)(\log c(s)-\log c^*(s)-1)+c(s^*)]$ is a Lyapunov function which vanishes for $c=c^*$

### Siphons in reaction networks

October 8, 2015

The concept of a siphon is one which I have been more or less aware of for quite a long time. Unfortunately I never had the impression that I had understood it completely. Given the fact that it came up a lot in discussions I was involved in and talks I heard last week I thought that the time had come to make the effort to do so. It is of relevance for demonstrating the property of persistence in reaction networks. This is the property that the $\omega$-limit points of a positive solution are themselves positive. For a bounded solution this is the same as saying that the infima of all concentrations at late times are positive. The most helpful reference I have found for these topics is a paper of Angeli, de Leenheer and Sontag in a proceedings volume edited by Queinnec et. al.

There are two ways of formulating the definition of a siphon. The first is more algebraic, the second more geometric. In the first the siphon is defined to be a set $Z$ of species with the property that whenever one of the species in $Z$ occurs on the right hand side of a reaction one of the species in $Z$ occurs on the left hand side. Geometrically we replace $Z$ by the set $L_Z$ of points of the non-negative orthant which are common zeroes of the elements of $Z$, thought of as linear functions on the species space. The defining property of a siphon is that $L_Z$ is invariant under the (forward in time) flow of the dynamical system describing the evolution of the concentrations. Another way of looking at the situation is as follows. Consider a point of $L_Z$. The right hand side of the evolution equations of one of the concentrations belonging to $Z$ is a sum of positive and negative terms. The negative terms automatically vanish on $L_Z$ and the siphon condition is what is needed to ensure that the positive terms also vanish there. Sometimes minimal siphons are considered. It is important to realize that in this case $Z$ is minimal. Correspondingly $L_Z$ is maximal. The convention is that the empty set is excluded as a choice for $Z$ and correspondingly the whole non-negative orthant as a choice for $L_Z$. What is allowed is to choose $Z$ to be the whole of the species space which means that $L_Z$ is the origin. Of course whether this choice actually defines a siphon depends on the particular dynamical system being considered.

If $x_*$ is an $\omega$-limit point of a positive solution but is not itself positive then the set of concentrations which are zero at that point is a siphon. In particular stationary solutions on the boundary are contained in siphons. It is remarked by Shiu and Sturmfels (Bull. Math. Biol. 72, 1448) that for a network with only one linkage class if a siphon contains one stationary solution it consists entirely of stationary solutions. To see this let $x_*$ be a stationary solution in the siphon $Z$. There must be some complex $y$ belonging to the network which contains an element of $Z$. If $y'$ is another complex then there is a directed path from $y'$ to $y$. We can follow this path backwards from $y$ and conclude successively that each complex encountered contains an element of $Z$. Thus $y'$ contains an element of $Z$ and since $y'$ was arbitrary all complexes have this property. This means that all complexes vanish at $x_*$ so that $x_*$ is a stationary solution.

Siphons can sometimes be used to prove persistence. Suppose that $Z$ is a siphon for a certain network so that the points of $Z$ are potential $\omega$-limit points of solutions of the ODE system corresponding to this network. Suppose further that $A$ is a conserved quantity for the system which is a linear combination of the coordinates with positive coefficents. For a positive solution the quantity $A$ has a positive constant value along the solution and hence also has the same value at any of its $\omega$-limit points. It follows that if $A$ vanishes on $Z$ then no $\omega$-limit point of that solution belongs to $Z$. If it is possible to find a conserved quantity $A$ of this type for each siphon of a given system (possibly different conserved quantities for different siphons) then persistence is proved. For example this strategy is used in the paper of Angeli et al. to prove persistence for the dual futile cycle. The concept of persistence is an important one when thinking about the general properties of reaction networks. The persistence conjecture says that any weakly reversible reaction network with mass action kinetics is persistent (possibly with the additional assumption that all solutions are bounded). In his talk last week Craciun mentioned that he is working on proving this conjecture. If true it implies the global attractor conjecture. It also implies a statement claimed in a preprint of Deng et. al. (arXiv:1111.2386) that a weakly reversible network has a positive stationary solution in any stoichiometric compatobility class. This result has never been published and there seems to be some doubt as to whether the proof is correct.

### Trip to the US

October 5, 2015

Last week I visited a few places in the US. My first stop was Morgantown, West Virginia where my host was Casian Pantea. There I had a lot of discussions with Casian and Carsten Conradi on chemical reaction network theory. This synergized well with the work I have recently been doing preparing a lecture course on that subject which I will be giving in the next semester. I gave a talk on MAPK and got some feedback on that. It rained a lot and there was not much opportunity to do anything except work. One day on the way to dinner while it was relatively dry I saw a Cardinal and I fortunately did have my binoculars with me. On Wednesday afternoon I travelled to New Brunswick and spent most of Thursday talking to Eduardo Sontag at Rutgers. It was a great pleasure to talk to an excellent mathematician who also knows a lot about immunology. He and I have a lot of common interests which is in part due to the fact that I was inspired by several of his papers during the time I was getting into mathematical biology. I also had the opportunity to meet Evgeni Nikolaev who told me a variety of interesting things. They concerned bifurcation theory in general, its applications to the kinds of biological models I am interested in and his successes in applying mathematical models to understanding concrete problems in biomedical research such as the processes taking place in tuberculosis. My personal dream is to see a real coming together of mathematics and immunology and that I have the chance to make a contribution to that process.

On Friday I flew to Chicago in order to attend an AMS sectional meeting. I had been in Chicago once before but that is many years ago now. I do remember being impressed by how much Lake Michigan looks like the sea, I suppose due to the structure of the waves. This impression was even stronger this time since there were strong winds whipping up the waves. Loyola University, the site of the meeting, is right beside the lake and it felt like home for me due to the combination of wind, waves and gulls. The majority of those were Ring-Billed Gulls which made it clear which side of the Atlantic I was on. There were also some Herring Gulls and although they might have been split from those on the other side of the Atlantic by the taxonomists I did not notice any difference. It was the first time I had been at an AMS sectional meeting and my impression was that the parallel sessions were very parallel, in other words in no danger of meeting. Most of the people in our session were people I knew from the conferences I attended in Charlotte and in Copenhagen although I did make a couple of new acquaintances, improving my coverage of the reaction network community.

In a previous post I mentioned Gheorghe Craciun’s ideas about giving the deficiency of a reaction network a geometric interpretation, following a talk of his in Copenhagen. Although I asked him questions about this on that occasion I did not completely understand the idea. Correspondingly my discussion of the point here in my blog was quite incomplete. Now I talked to him again and I believe I have finally got the point. Consider first a network with a single linkage class. The complexes of the network define points in the species space whose coordinates are the stoichiometric coefficients. The reactions define oriented segments joining the educt complex to the product complex of each reaction. The stoichiometric subspace is the vector space spanned by the differences of the complexes. It can also be considered as a translate of the affine subspace spanned by the complexes themselves. This makes it clear that its dimension $s$ is at most $n-1$, where $n$ is the number of complexes. The number $s$ is the rank of the stoichiometric matrix. The deficiency is $n-1-s$. At the same time $s\le m$. If there are several linkage classes then the whole space has dimension at most $n-l$, where $l$ is the number of linkage classes. The deficiency is $n-l-s$. If the spaces corresponding to the individual linkage classes have the maximal dimension allowed by the number of complexes in that class and these spaces are linearly independent then the deficiency is zero. Thus we see that the deficiency is the extent to which the complexes fail to be in general position. If the species and the number of complexes have been fixed then deficiency zero is seen to be a generic condition. On the other hand fixing the species and adding more complexes will destroy the deficiency zero condition since then we are in the case $n-l>m$ so that the possibility of general position is excluded. The advantage of having this geometric picture is that it can often be used to read off the deficiency directly from the network. It might also be used to aid in constructing networks with a desired deficiency.

### Oscillations in the MAP kinase cascade

September 10, 2015

In a recent post I mentioned my work with Juliette Hell on the existence of oscillations in the Huang-Ferrell model for the MAP kinase cascade. We recently put our paper on the subject on ArXiv. The starting point of this project was the numerical and neuristic work of Qiao et. al., PLoS Comp. Biol. 3, 1819. Within their framework these authors did an extensive search of parameter space and found Hopf bifurcations and periodic solutions for many parameters. The size of the system is sufficiently large that it represents a significant obstacle to analytical investigations. One way of improving this situation is to pass to a limiting system (MM system) by a Michaelis-Menten reduction. In addition it turns out that the periodic solutions already occur in a truncated system consisting of the first two layers of the cascade. This leaves one layer with a single phosphorylation and one with a double phosphorylation. In a previous paper we had shown how to do Michaelis-Menten reduction for the truncated system. Now we have generalized this to the full cascade. In the truncated system the MM system is of dimension three, which is still quite convenient for doing bifurcation theory. Without truncation the MM system is of dimension five, which is already much more difficult. It is however possible to represent the system for the truncated cascade as a (singular) limit of that for the full cascade and thus transport information from the truncated to the full cascade.

Consider the MM system for the truncated cascade. The aim is then to find a Hopf bifurcation in a three-dimensional dynamical system with a lot of free parameters. Because of the many parameters is it not difficult to find a large class of stationary solutions. The strategy is then to linearize the right hand side of the equations about these stationary solutions and try show that there are parameter values where a suitable bifurcation takes place. To do this we would like to control the eigenvalues of the linearization, showing that it can happen that at some point one pair of complex conjugate eigenvalues passes through the imaginary axis with non-zero velocity as a parameter is varied, while the remaining eigenvalue has non-zero real part. The behaviour of the eigenvalues can largely be controlled by the trace, the determinant and an additional Hurwitz determinant. It suffices to arrange that there is a point where the trace is negative, the determinant is zero and the Hurwitz quantity passes through zero with non-zero velocity. This we did. A superficially similar situation is obtained by modelling an in vitro model for the MAPK cascade due to Prabakaran, Gunawardena and Sontag mentioned in a previous post in a way strictly analogous to that done in the Huang-Ferrell model. In that case the layers are in the opposite order and a crucial sign is changed. Up to now we have not been able to show the existence of a Hopf bifurcation in that system and our attempts up to now suggest that there may be a real obstruction to doing so. It should be mentioned that the known necessary condition for a stable hyperbolic periodic solution, the existence of a negative feedback loop, is satisfied by this system.

Now I will say some more about the model of Prabakaran et. al. Its purpose is to obtain insights on the issue of network reconstruction. Here is a summary of some things I understood. The in vitro biological system considered in the paper is a kind of simplification of the Raf-MEK-ERK MAPK cascade. By the use of certain mutations a situation is obtained where Raf is constitutively active and where ERK can only be phosphorylated once, instead of twice as in vivo. This comes down to a system containing only the second and third layers of the MAPK cascade with the length of the third layer reduced from three to two phosphorylation states. The second layer is modelled using simple mass action (MA) kinetics with the two phosphorylation steps being treated as one while in the third layer the enzyme concentrations are included explicitly in the dynamics in a standard Michaelis-Menten way (MM-MA). The resulting mathematical model is a system of seven ODE with three conservation laws. In the paper it is shown that for given values of the conserved quantities the system has a unique steady state. This is an application of a theorem of Angeli and Sontag. Note that this is not the same system of equations as the system analogous to that of Huang-Ferrell mentioned above.

The idea now is to vary one of the conserved quantities and monitor the behaviour of two functions $x$ and $y$ of the unknowns of the system at steady state. It is shown that for one choice of the conserved quantity $x$ and $y$ change in the same direction while for a different choice of the conserved quantity they change in opposite directions when the conserved quantity is varied. From a mathematical point of view this is not very surprising since there is no obvious reason forbidding behaviour of this kind. The significance of the result is that apparently biologists often use this type of variation in experiments to reach conclusions about causal relationships between the concentrations of different substances (activation and inhibition), which can be represented by certain signed oriented graphs. In this context ‘network reconstruction’ is the process of determining a graph of this type. The main conclusion of the paper, as I understand it, is that doing different experiments can lead to inconsistent results for this graph. Note that there is perfect agreement between the experimental results in the paper and the results obtained from the mathematical model. In a biological system if two experiments give conflicting results it is always possible to offer the explanation that some additional substance which was not included in the model is responsible for the difference. The advantage of the in vitro model is that there are no other substances which could play that role.

### Models for photosynthesis, part 3

September 8, 2015

Here I continue the discussion of models for photosynthesis in two previous posts. There I described the Pettersson and Poolman models and indicated the possibility of introducing variants of these which use exclusively mass action kinetics. I call these the Pettersson-MA and Poolman-MA models. I was interested in obtaining information about the qualitative behaviour of solutions of these ODE systems. This gave rise to the MSc project of Dorothea Möhring which she recently completed successfully. Now we have extended this work a little further and have written up the results in a paper which has just been uploaded to ArXiv. The central issue is that of overload breakdown which is related to the mathematical notion of persistence. We would like to know under what circumstances a positive solution can have $\omega$-limit points where some concentrations vanish and, if so, which concentrations vanish in that case. It seems that there was almost no information on the latter point in the literature so that the question of what exactly overload breakdown is remained a bit nebulous. The general idea is that the Pettersson model should have a stronger tendency to undergo overload breakdown while the Poolman model should have a stronger tendency to avoid it. The Pettersson-MA and Poolman-MA models represent a simpler context to work in to start with.

For the Pettersson-MA model we were able to identify a regime in which overload breakdown takes place. This is where the initial concentrations of all sugar phosphates and inorganic phosphate in the chloroplast are sufficiently small. In that case the concentrations of all sugar phosphates tend to zero at late times with two exceptions. The concentrations of xylulose-4-phosphate and sedoheptulose-7-phosphate do not tend to zero. These results are obtained by linearizing the system around a simple stationary solution on the boundary and applying the centre manifold theorem. Another result is that if the reaction constants satisfy a certain inequality a positive solution can have no positive $\omega$-limit points. In particular, there are no positive stationary solutions in that case. This is proved using a Lyapunov function related to the total number of carbon atoms. In the case of the Poolman-MA model it was shown that the stationary point which was stable in the Pettersson case becomes unstable. Moreover, a quantitative lower bound for concentration of sugar phosphates at late times in obtained.These results fit well with the intuitive picture of what should happen. Some of the results on the Poolman-MA model can be extended to analogous ones for the original Poolman model. On the other hand the task of giving a full rigorous definition of the Pettersson model was postponed for later work. The direction in which this could go has been sketched in a previous post.

There remains a lot to be done. It is possible to define a kind of hybrid model by setting $k_{32}=0$ in the Poolman model. It would be desirable to completely clarify the definition of the Pettersson model and then, perhaps, to show that it can be obtained as a well-behaved limiting system of the hybrid system in the sense of geometric singular perturbation theory. This might allow the dynamical properties of solutions of the different systems to be related to each other. The only result on stationary solutions obtained so far is a non-existence theorem. It would be of great interest to have positive results on the existence, multiplicity and stability of stationary solutions. A related question is that of classifying possible $\omega$-limit points of positive solutions where some of the concentrations are zero. This was done in part in the paper but what was not settled is whether potential $\omega$-limit points with positive concentrations of the hexose phosphates can actually occur. Finally, there are a lot of other models for the Calvin cycle on the market and it would be interesting to see to what extent they are susceptible to methods similar to those used in our paper.

### Phosphorylation systems

September 1, 2015

In order to react to their environment living cells use signalling networks to propagate information from receptors, which are often on the surface of the cell, to the nucleus where transcription factors can change the behaviour of the cell by changing the rate of production of different proteins. Signalling networks often make use of phosphorylation systems. These are networks of proteins whose enzymatic activity is switched on or off by phosphorylation or dephosphorylation. When switched on they catalyse the (de-)phophorylation of other proteins. The information passing through the network is encoded in the phosphate groups attached to specific amino acids in the proteins concerned. A frequently occurring example of this type of system is the MAPK cascade discussed in a previous post. There the phosphate groups are attached to the amino acids serine, threonine and tyrosine. Another type of system, which is common in bacteria, are the two-component systems where the phosphate groups are attached to histidine and aspartic acid.

There is a standard mathematical model for the MAPK cascade due to Huang and Ferrell. It consists of three layers, each of which is a simple or dual futile cycle. Numerical and heuristic investigations indicate that the Huang-Ferrell model admits periodic solutions for certain values of the parameters. Together with Juliette Hell we set out to find a rigorous proof of this fact. In the beginning we pursued the strategy of showing that there are relaxation oscillations. An important element of this is to prove that the dual futile cycle exhibits bistability, a fact which is interesting in its own right, and we were able to prove this, as has been discussed here. In the end we shifted to a different strategy in order to prove the existence of periodic solutions. The bistability proof used a quasistationary (Michaelis-Menten) reduction of the Huang-Ferrell system. It applied bifurcation theory to the Michaelis-Menten system and geometric singular perturbation theory to lift this result to the original system. To prove the existence of periodic solutions we used a similar strategy. This time we showed the presence of Hopf bifurcations in a Michaelis-Menten system and lifted those. The details are contained in a paper which is close to being finished. In the meantime we wrote a review article on phosphorylation systems. Here I want to mention some of the topics covered there.

In our paper there is also an introduction to two-component systems. A general conclusion of the paper is that phosphorylation systems give rise to a variety of interesting mathematical problems which are waiting to be investigated. It may also be hoped that a better mathematical understanding of this subject can lead to new insights concerning the biological systems being modelled. Biological questions of interest in this context include the following. Are dynamical features of the MAPK cascade such as oscillations desirable for the encoding of information or are they undesirable side effects? To what extent do feedback loops tend to encourage the occurrence of features of this type and to what extent do they tend to suppress them? What are their practical uses, if any? If the function of the system is damaged by mutations how can it be repaired? The last question is of special interest due to the fact that many cancer cells have mutations in the Raf-MEK-ERK cascade and there have already been many attempts to overcome their negative effects using kinase inhibitors, some of them successful. A prominent example is the Raf inhibitor Vemurafenib which has been used to treat metastatic melanoma.

### Reaction networks in Copenhagen

July 9, 2015

Last week I attended a workshop on reaction network theory organized by Elisenda Feliu and Carsten Wiuf. It took place in Copenhagen from 1st to 3rd July. I flew in late on the Tuesday evening and on arrival I had a pleasant feeling of being in the north just due to the amount and quality of the light. Looking at the weather information for Mainz I was glad I had got a reduction in temperature of several degrees by making this trip. A lot of comments and extra information on the talks at this conference can be found on the blog of John Baez and that of Matteo Polettini. Now, on my own slower time scale, I will write a bit about things I heard at the conference which I found particularly interesting. The topic of different time scales is very relevant to the theme of the meeting and the first talk, by Sebastian Walcher, was concerned with it. Often a dynamical system of interest can be thought of as containing a small parameter and letting this parameter tend to zero leads to a smaller system which may be easier to analyse. Information obtained in this way may be transported back to the original system. If the parameter is a ratio of time scales then the limit will be singular. The issue discussed in the talk is that of finding a suitable small parameter in a system when one is suspected. It is probably unreasonable to expect to find a completely general method but the talk presented algorithms which can contribute to solving this type of problem.

In the second talk Gheorghe Craciun presented his proof of the global attractor conjecture, which I have mentioned in a previous post. I was intrigued by one comment he made relating the concept of deficiency zero to systems in general position. Later he explained this to me and I will say something about the direction in which this goes. The concept of deficiency is central in chemical reaction network theory but I never found it very intuitive and I feel safe in claiming that I am in good company as far as that is concerned. Gheorghe’s idea is intended to improve this state of affairs by giving the deficiency a geometric interpretation. In this context it is worth mentioning that there are two definitions of deficiency on the market. I had heard this before but never looked at the details. I was reminded of it by the talk of Jeanne Marie Onana Eloundou-Mbebi in Copenhagen, where it played an important role. She was talking about absolute concentration robustness. The latter concept was also the subject of the talk of Dave Anderson, who was looking at the issue of whether the known results on ACR for deterministic reaction networks hold in some reasonable sense in the stochastic case. The answer seems to be that they do not. But now I return to the question of how the deficiency is defined. Here I use the notation $\delta$ for the deficiency as originally defined by Feinberg. The alternative, which can be found in Jeremy Gunawardena’s text with the title ‘Chemical reaction network theory for in silico biologists’ will be denoted by $\delta'$. Gunawardena, who seems to find the second definition more natural, proves that the two quantities are equal provided a certain condition holds (each linkage class contains precisely one terminal strong linkage class). This condition is, in particular, satisfied for weakly reversible networks and this is perhaps the reason that the difference in definitions is not often mentioned in the literature. In general $\delta\ge\delta'$, so that deficiency zero in the sense of the common definition implies deficiency zero in the sense of the other definition.

For a long time I knew very little about control theory. The desire to change this motivated me to give a course on the subject in the last winter semester, using the excellent textbook of Eduardo Sontag as my main source. Since that I had never taken the time to look back on what I learned in the course of doing this and this became clearer to me only now. In Copenhagen Nicolette Meshkat gave a talk on identifiability in reaction networks. I had heard her give a talk on a similar subject at the SIAM life science conference last summer and not understood much. I am sure that this was not her fault but mine. This time around things were suddenly clear. The reason is that this subject involves ideas coming from control theory and through giving the course I had learned to think in some new directions. The basic idea of identifiability is to extract information on the parameters of a dynamical system from the input-output relation.

There was another talk with a lot of control theory content by Mustafa Khammash. He had brought some machines with him to illustrate some of the ideas. These were made of Lego, driven by computers and communicating with each other via bluetooth devices. One of these was a physical realization of one of the favourite simple examples in control theory, stabilization of the inverted pendulum. Another was a robot programmed to come to rest 30 cm in front of a barrier facing it. Next he talked about an experiment coupling living cells to a computer to form a control system. The output from a population of cells was read by a combination of GFP labeling and a FACS machine. After processing the signal the resulting input was done by stimulating the cells using light. This got a lot of media attention unter the name ‘cyborg yeast’. After that he talked about a project in which programmes can be incorporated into the cells themselves using plasmids. In one of the last remarks in his talk he mentioned how cows use integral feedback to control the calcium concentration in their bodies. I think it would be nice to incorporate this into popular talks or calculus lectures in the form ‘cows can do integrals’ or ‘cows can solve differential equations’. The idea would be to have a striking example of what the abstract things done in calculus courses have to do with the real world.

My talk at the conference was on phosphorylation systems and interestingly there was another talk there, by Andreas Weber, which had a possibly very significant relation to this. I only became aware of the existence of the corresponding paper (Errami et. al., J. Comp. Phys. 291, 279) a few weeks ago and since it involves a lot of techniques I am not too familiar with and has a strong computer science component I have only had a limited opportunity to understand it. I hope to get deeper into it soon. It concerns a method of finding Hopf bifurcations.

This conference was a great chance to maintain and extend my contacts in the community working on reaction networks and get various types of inside information on the field

### Models for photosynthesis, part 2

May 22, 2015

In my previous post on this subject I discussed the question of the status of the variables in the Poolman model of photosynthesis and in the end I was convinced that I had understood which concentrations are to be considered as dynamical unknowns and which as constants. The Poolman model is a modified version of the Pettersson model and the corresponding questions about the nature of the variables have the same answers in both cases. What I am calling the Pettersson model was introduced in a paper of Pettersson and Ryde-Pettersson (Eur. J. Biochem 175, 661) and there the description of the variables and the equations is rather complete and comprehensible. Now I will go on to consider the second question raised in the previous post, namely what the evolution equations are. The evolution equations in the Poolman model are modifications of those in the Pettersson model and are described relative to those in the original paper on the former model. For this reason I will start by describing the equations for the Pettersson model. As a preparation for that I will treat a side issue. In a reaction network a reaction whose rate depends only on the concentrations of the substances consumed in the reaction is sometimes called NAC (for non-autocatalytic). For instance this terminology is used in the paper of Kaltenbach quoted in the previous post. The opposite of NAC is the case where the reaction rate is modulated by the concentrations of other substances, such as activators or inhibitors.

The unknowns in the Pettersson model are concentrations of substances in the stroma of the chloroplast. The substances involved are 15 carbohydrates bearing one or more phosphate groups, inorganic phosphate and ATP, thus 17 variables in total. In addition to ordinary reactions between these substances there are transport processes in which sugar phosphates are moved from the stroma to the cytosol in exchange for inorganic phosphate. For brevity I will also refer to these as reactions. The total amount of phosphate in the stroma is conserved and this leads to a conservation law for the system of equations, a fact explicitly mentioned in the paper. On the basis of experimental data some of the reactions are classified as fast and it is assumed that they are already at equilibrium. They are also assumed to be NAC and to have mass-action kinetics. This defines a set of algebraic equations. These are to be used to reduce the 17 evolution equations which are in principle there to five equations for certain linear combinations of the variables. The details of how this is done are described in the paper. I will now summarize how this works. The time derivatives of the 16 variables other than inorganic phosphate are given in terms of linear combinations of 17 reaction rates. Nine of these reaction rates, which are not NAC, are given explicitly. The others have to be treated using the 11 algebraic equations coming from the fast reactions. The right hand sides $F_i$ of the five evolution equations mentioned already are linear combinations of those reaction rates which are given explicitly. These must be expressed in terms of the quantities whose time derivatives are on the left hand side of these equations, using the algebraic equations coming from the fast reactions and the conservation equation for the total amount of phosphate. In fact all unknowns can be expressed in terms of the concentrations of RuBP, DHAP, F6P, Ru5P and ATP. Call these quantities $s_i$. Thus if the time derivatives of the $s_i$ can be expressed in terms of the $F_i$ we are done. It is shown in the appendix to the paper how a linear combination of the time derivatives of the $s_i$ with coefficients only depending on the $s_i$ is equal to $F_i$. Moreover it is stated that the time derivatives of the $s_i$ can be expressed in terms of these linear combinations.

Consider now the Poolman model. One way in which it differs from the Pettersson model is that starch degradation is included. The other is that while the kinetics for the ‘slow reactions’ (i.e. those which are not classified as fast in the Pettersson model) are left unchanged, the equilibrium assumption for the fast reactions is dropped. Instead the fast reactions are treated as reversible with mass action kinetics. In the thesis of Sergio Grimbs (Towards structure and dynamics of metabolic networks, Potsdam 2009) there is some discussion of the models of Poolman and Pettersson. It is investigated whether information about multistability in these models can be obtained using ideas coming from chemical reaction network theory. Since the results from CRNT considered require mass action kinetics it is implicit in the thesis that the systems are being considered which are obtained by applying mass action to all reactions in the networks for the Poolman and Pettersson models. These systems are therefore strictly speaking different from those of Pettersson and Poolman. In any case it turned out that these tools were not useful in this example since the simplest results did not apply and for the more complicated computer-assisted ones the systems were too large.

In the Pettersson paper the results of computations of steady states are presented and a comparison with published experimental results looks good in a graph presented there. So whay can we not conclude that the problem of modelling the dynamics of the Calvin cycle was pretty well solved in 1988? The paper contains no details on how the simulations were done and so it is problematic to repeat them. Jablonsky et. al. set up simulations of this model on their own and found results very different from those reported in the original paper. In this context the advantage of the Poolman model is that it has been put into the BioModels database so that the basic data is available to anyone with the necessary experience in doing simulations for biochemical models. Forgetting the issue of the reliability of their simulations, what did Petterson and Ryde-Pettersson find? They saw that depending on the external concentration of inorganic phosphate there is either no positive stationary solution (for high values of this parameter) or two (for low values) with a bifurcation in between. When there are two stationary solutions one is stable and one unstable. It looks like there is a fold bifurcation. There is a trivial stationary solution with all sugar concentrations zero for all values of the parameter. When the external phosphate concentration tends to zero the two positive stationary solutions coalesce with the trivial one. The absence of positive stationary solutions for high phosphate concentrations is suggested to be related to the concept of ‘overload breakdown’. This means that sugars are being exported so fast that the production from the Calvin cycle cannot keep up and the whole system breaks down. It would be nice to have an analytical proof of the existence of a fold bifurcation for the Pettersson model but that is probably very difficult to get.

### Models for photosynthesis

May 15, 2015

Photosynthesis is a process of central importance in biology. There is a large literature on modelling this process. One step is to identify networks of chemical reactions involved. Another is to derive mathematical models (usually systems of ODE) from these networks. Here when I say ‘model’ I mean ‘mathematical model’ and not the underlying network. In a paper by Jablonsky et. al. (BMC Systems Biology 5: 185) existing models are surveyed and a number or errors and inconsistencies in the literature are pointed out. This reminded me of the fact that a widespread problem in the biological literature is that the huge amount of data being generated these days contains very many errors. Here I want to discuss some issues related to this, concentrating on models for the Calvin cycle of photosynthesis and, in particular, on what I will call the Poolman model.

A point which might seem obvious and trivial to the mathematician is that a description of a mathematical model (I consider here only ODE models) should contain a clear answer to the following two questions. 1) What are the unknowns? 2) What are the evolution equations? One source of ambiguity involved in the first question is the impossibility of modelling everything. It is usually unreasonable to model a whole organism although this has been tried for some simple ones. Even if it were possible, the organism is in interaction with other organisms and its environment and these things cannot also be included. In practise it is necessary to fix a boundary of the system we want to consider and cut there. One way of handling the substances outside the cut in a mathematical model is to set their concentrations to constant values, thus implicitly assuming that to a good approximation these are not affected by the dynamics within the system. Let us call these external species and the substances whose dynamics is included in the model internal species. Thus part of answering question 1) is to decide on which species are to be treated as internal. In this post I will confine myself to discussing question 1), leaving question 2) for a later date.

Suppose we want to answer question 1) for a model in the literature. What are potential difficulties? In biological papers the equations (and even the full list of unknowns) are often banished to the supplementary material. In addition to being less easy to access and often less easy to read (due to typographical inferiority) than the main text I have the feeling that this supplementary material is often subjected to less scrutiny by the referees and by the authors, so that errors or incompleteness can occur more easily. Sometimes this information is only contained in some files intended to be read by a computer rather than a human being and it may be necessary to have, or be able to use, special software in order to read them in any reasonable way. Most of these difficulties are not absolute in nature. It is just that the mathematician embarking on such a process should ideally be aware of some of the challenges awaiting him in advance.

How does this look in the case of the Poolman model? It was first published in a journal in a paper of Poolman, Fell and Thomas (J. Exp. Botany, 51, 319). The reaction network is specified by Fig. 1 of the paper. This makes most of the unknowns clear but leaves the following cases where something more needs to be said. Firstly, it is natural to take the concentration of ADP to be defined implicitly through the concentration of ATP and the conservation of the total amount of adenosine phosphates. Secondly, it is explictly stated that the concentrations of NADP and NADPH are taken to be constant so that these are clearly external species. Presumably the concentration of inorganic phosphate in the stroma is also taken to be constant, so that this is also an external variable, although I did not find an explicit statement to this effect in the paper. The one remaining possible ambiguity involves starch – is it an internal or an external species in this model? I was not able to find anything directly addressing this point in the paper. On the other hand the paper does refer to the thesis of Poolman and some internet resources for further information. In the main body of the thesis I found no explicit resolution of the question of external phosphate but there it does seem that this quantity is treated as an external parameter. The question of starch is particularly important since this is a major change in the Poolman model compared to the earlier Pettersson model on which it is based and since Jablonsky et. al. claim that there is an error in the equation describing this step. It is stated in the thesis that ‘a meaningful concentration cannot be assigned to’ … ‘the starch substrate’ which seems to support my impression that the concentration of starch is an external species. Finally a clear answer confirming my suppositions above can be found in Appendix A of the thesis which describes the computer implementation. There we find a list of variables and constants and the latter are distinguished by being preceded by a dollar sign. So is there an error in the equation for starch degradation used in the Poolman model? My impression is that there is not, in the sense that the desired assumptions have been implemented successfully. The fact that Jablonsky et. al. get the absurd result of negative starch concentrations is because they compute an evolution for starch which is an external variable in the Poolman model. What could be criticised in the Poolman model is that the amount of starch in the chloroplast varies a lot over the course of the day. Thus a model with starch as an external variable could only be expected to give a good approximation to reality on timescales much shorter than one day.

### The species-reaction graph

May 14, 2015

In the study of chemical reaction networks important features of the networks are often summarised in certain graphs. Probably that most frequently used is the species graph (or interaction graph), which I discussed in a previous post. The vertices are the species taking part in the network and the edges are related to the non-zero entries of the Jacobian matrix of the vector field defining the dynamics. Since the Jacobian matrix depends in general on the concentrations at which it is evaluated there is a graph (the local graph) for each set of values of the concentrations. Sometimes a global graph is defined as the union of the local graphs. A sign can be attached to each edge of the local graph according to the sign of the corresponding partial derivative. In the case, which does occur quite frequently in practise, that the signs of the partial derivatives are independent of the concentrations the distinction between local and global graphs is not necessary. In the general case a variant of the species graph has been defined by Kaltenbach (arXiv:1210.0320). In that case there is a directed edge from vertex $i$ to vertex $j$ if there is any set of concentrations for which the corresponding partial derivative is non-zero and instead of being labelled with a sign the edge is labelled with a function, namely the partial derivative itself.

Another more complicated graph is the species-reaction graph or directed species-reaction graph (DSR graph). As explained in detail by Kaltenbach the definition (and the name of the object) are not consistent in the literature. The species-reaction graph was introduced in a paper of Craciun and Feinberg (SIAM J. Appl. Math. 66, 1321). In a parallel development which started earlier Mincheva and Roussel (J. Math. Biol. 55, 61) developed results using this type of graph based on ideas of Ivanova which were little known in the West and for which published proofs were not available. In the sense used by Kaltenbach the DSR graph is an object closely related to his version of the interaction graph. It is a bipartite graph (i.e. there are two different types of vertices and each edge connects a vertex of one type with a vertex of the other). In the DSR graph the species define vertices of one type and the reactions the vertices of the other type. There is a directed edge from species $i$ to reaction $j$ if species $i$ is present on the LHS of reaction $j$. There is a directed edge from reaction $i$ to species $j$ if the net production of species $j$ in reaction $i$ is non-zero. The first type of edge is labelled by the partial derivative of flux $j$ with respect to species $i$. The second type is labelled by the corresponding stoichiometric coefficient. The DSR graph determines the interaction graph. The paper of Soliman I mentioned in a recent post uses the DSR graph in the sense of Kaltenbach.

A type of species-reaction graph has been used in quite a different way by Angeli, de Leenheer and Sontag to obtain conditions for the montonicity of the equations for a network written in reaction coordinates.