## The deficiency one theorem

December 2, 2015

Here I continue with the discussion of chemical reaction network theory begun in the previous post. After having presented a proof of the Deficiency Zero Theorem in my course I proceeded to the Deficiency One Theorem following the paper of Feinberg in Arch. Rat. Mech. Anal. 132, 311. If we have a reaction network we can consider each linkage class as a network in its own right and thus we can define its deficiency. If we denote the deficiency of the full network by $\delta$ and the deficiency of the $i$th linkage class by $\delta_i$ then in general $\delta\ge\sum_i \delta_i$. The first hypothesis of the Deficiency One Theorem is that the deficiencies of the linkage classes are no greater than one. The second is that equality holds in the inequality relating the deficiency of the network to those of its linkage classes. The stoichiometric subspace $S$ of the full network is the sum of the corresponding spaces $S_i$ for the linkage classes. The sum is direct precisely when equality holds in the inequality for the deficiencies. The third condition is that there is precisely one terminal strong linkage class in each linkage class ($t=l$). The first conclusion is that if there exists a positive stationary solution there is precisely one stationary solution in each stoichiometric compatibility class. The second is that if the network is weakly reversible there exists a positive stationary solution. Thus two of the conclusions of the Deficiency Zero Theorem have direct analogues in this case. Others do not. There is no statement about the stability of the stationary solutions. Networks which are not weakly reversible but satisfy the three conditions of the Deficiency One Theorem may have positive stationary solutions. In the paper of Feinberg the proof of the Deficiency One Theorem is intertwined with that of another result. It says that the linearization about a stationary solution of the restriction of the system to a stoichiometric compatibility class has trivial kernel. A related result proved in the same paper says that for a weakly reversible network of deficiency zero each stationary solution is a hyperbolic sink within its stoichiometric class. Statements of this type ensure that these stationary solutions possess a certain structural stability.

In the proof of the Deficiency One Theorem the first step (Step 1) is to show that when the assumptions of the theorem are satisfied and there is positive stationary solution $c^*$ then the set of stationary solutions is equal to the set of points for which $\log c-\log c^*$ lies in the orthogonal complement of the stoichiometric subspace. From this the conclusion that there is exactly one stationary solution in each stoichiometric compatibility class follows just as in the proof of the Deficiency Zero Theorem (Step 2). To complete the proof it then suffices to prove the existence of a positive stationary solution in the weakly reversible case (Step 3). In Step 1 the proof is reduced to the case where the network has only one linkage class by regarding the linkage classes of the original network as networks in their own right. In this context the concept of a partition of a network $(\cal S,\cal C,\cal R)$ is introduced. This is a set of subnetworks $({\cal S},{\cal C}^i,{\cal R}^i)$. The set of species is unchanged. The set of reactions $\cal R$ is a disjoint union of the ${\cal R}^i)$ and ${\cal C}^i$ is the set of complexes occurring in the reactions contained in ${\cal R}^i)$. The partition is called direct if the stoichiometric subspace of the full network is the direct sum of those of the subnetworks. The most important example of a partition of a network in the present context is that given by the linkage classes of any network. That it is direct is the second condition of the Deficiency One Theorem. The other part of Step 1 of the proof is to show that the statement holds in the case that there is only one linkage class. The deficiency is then either zero or one. Since the case of deficiency zero is already taken care of by the Deficiency Zero Theorem we can concentrate on the case where $\delta=1$. Then the dimension of ${\rm ker} A_k$ is one and that of ${\rm ker} (YA_k)$ is two. The rest of Step 1 consists of a somewhat intricate algebraic calculation in this two-dimensional space. It remains to discuss Step 3. In this step the partition given by the linkage classes is again used to reduce the problem to the case where there is only one linkage class. The weak reversibility is preserved by this reduction. Again we can assume without loss of generality that $\delta=1$. The subspace $U={\rm im} Y^T+{\rm span}\omega_{\cal C}$ is a hyperplane in $F({\cal C})$. We define $\Gamma$ to be the set of functions of the form $\log a$ with $a$ a positive element of ${\rm ker} (YA_k)$. The desired stationary solution is obtained as a point of the intersection of $U$ with $\Gamma$. To show that this intersection is non-empty it is proved that there are points of $\Gamma$ on both sides of $U$. This is done by a careful examination of the cone of positive elements of ${\rm ker} (YA_k)$.

To my knowledge the number of uses of the Deficiency Zero Theorem and the Deficiency One Theorem in problems coming from applications is small. If anyone reading this has other information I would like to hear it. I will now list the relevant examples I know. The Deficiency Zero Theorem was applied by Sontag to prove asymptotic stability for McKeithan’s kinetic proofreading model of T cell activation. I applied it to the model of Salazar and Höfer for the dynamics of the transcription factor NFAT. Another potential application would be the multiple futile cycle. This network is not weakly reversible. The simple futile cycle has deficiency one and the dual futile cycle deficiency two. The linkage classes have deficiency zero in both cases. Thus while the first and third conditions of the Deficiency One Theorem are satisfied the second is not. Replacing the distributive phosphorylation in the multiple futile cycle by processive phosphorylation may simplify things. This has been discussed by Conradi et. al., IEE Proc. Systems Biol. 152, 243. In the case of two phosphorylation steps the system obtained is of deficiency one but the linkage classes have deficiency zero so that condition (ii) is still not satisfied. It seems that the Deficiency One Algorithm may be more helpful and that will be the next subject I cover in my course.

## Feinberg’s proof of the deficiency zero theorem

November 23, 2015

I discussed the deficiency zero theorem of chemical reaction network theory (CRNT) in a previous post. (Some further comments on this can be found here and here.) This semester I am giving a lecture course on chemical reaction network theory. Lecture notes are growing with the course and can be found in German and English versions on the web page of the course. The English version can also be found here. Apart from introductory material the first main part of the course was a proof of the Deficiency Zero Theorem. There are different related proofs in the literature and I have followed the approach in the classic lecture notes of Feinberg on the subject closely. The proof in those notes is essentially self-contained apart from one major input from a paper of Feinberg and Horn (Arch. Rat. Mech. Anal. 66, 83). In this post I want to give a high-level overview of the proof.

The starting point of CRNT is a reaction network. It can be represented by a directed graph where the nodes are the complexes (left or right hand sides of reactions) and the directed edges correspond to the reactions themselves. The connected components of this graph are called the linkage classes of the network and their number is usually denoted by $l$. If two nodes can be connected by oriented paths in both directions they are said to be strongly equivalent. The corresponding equivalence classes are called strong linkage classes. A strong linkage class is called terminal if there is no directed edge leaving it. The number of terminal strong linkage classes is usually denoted by $t$. From the starting point of the network making the assumption of mass action kinetics allows a system of ODE $\dot c=f(c)$ to be obtained in an algorithmic way. The quantity $c(t)$ is a vector of concentrations as a function of time. Basic mathematical objects involved in the definition of the network are the set $\cal S$ of chemical species, the set $\cal C$ of complexes and the set $\cal R$ of reactions. An important role is also played by the vector spaces of real-valued functions on these finite sets which I will denote by $F({\cal S})$, $F({\cal C})$ and $F({\cal R})$, respectively. Using natural bases they can be identified with $R^m$, $R^n$ and $R^r$. The vector $c(t)$ is an element of $F({\cal S})$. The mapping $f$ from $F({\cal S})$ to itself can be written as a composition of three mappings, two of them linear, $f=YA_k\Psi$. Here $Y$, the complex matrix, is a linear mapping from $F({\cal C})$ to $F({\cal S})$. $A_k$ is a linear mapping from $F({\cal C})$ to itself. The subscript $k$ is there because this matrix is dependent on the reaction constants, which are typically denoted by $k$. It is also possible to write $f$ in the form $Nv$ where $v$ describes the reaction rates and $N$ is the stoichiometric matrix. The image of $N$ is called the stoichiometric subspace and its dimension, the rank of the network, is usually denoted by $s$. The additive cosets of the stoichiometric subspace are called stoichiometric compatibility classes and are clearly invariant under the time evolution. Finally, $\Psi$ is a nonlinear mapping from $F({\cal S})$ to $F({\cal C})$. The mapping $\Psi$ is a generalized polynomial mapping in the sense that its components are products of powers of the components of $c$. This means that $\log\Psi$ depends linearly on the logarithms of the components of $c$. The condition for a stationary solution can be written as $\Psi(c)\in {\rm ker} (YA_k)$. The image of $\Psi$ is got by exponentiating the image of a linear mapping. The matrix of this linear mapping in natural bases is $Y^T$. Thus in looking for stationary solutions we are interested in finding the intersection of the manifold which is the image of $\Psi$ with the kernel of $YA_k$. The simplest way to define the deficiency of the network is to declare it to be $\delta=n-l-s$. A fact which is not evident from this definition is that $\delta$ is always non-negative. In fact $\delta$ is the dimension of the vector space ${\rm ker} Y\cap {\rm span}(\Delta)$ where $\Delta$ is the set of complexes of the network. An alternative concept of deficiency, which can be found in lecture notes of Gunawardena, is the dimension $\delta'$ of the space ${\rm ker} Y\cap{\rm im} A_k$. Since this vector space is a subspace of the other we have the inequality $\delta'\le\delta$. The two spaces are equal precisely when each linkage class contains exactly one terminal strong linkage class. This is, in particular, true for weakly reversible networks. The distinction between the two definitions is often not mentioned since they are equal for most networks usually considered.

If $c^*$ is a stationary solution then $A_k\Psi (c^*)$ belongs to ${\rm ker} Y\cap{\rm im} A_k$. If $\delta'=0$ (and in particular if $\delta=0$) then this means that $A_k\Psi (c^*)=0$. In other words $\Psi (c^*)$ belongs to the kernel of $A_k$. Stationary solutions of this type are called complex balanced. It turns out that if $c^*$ is a complex balanced stationary solution the stationary solutions are precisely those points $c$ for which $\log c-\log c_*$ lies in the orthogonal complement of the stoichiometric subspace. It follows that whenever we have one solution we get a whole manifold of them of dimension $n-s$. It can be shown that each manifold of this type meets each stoichiometric class in precisely one point. This is proved using a variational argument and a little convex analysis.

It is clear from what has been said up to now that it is important to understand the positive elements of the kernel of $A_k$. This kernel has dimension $t$ and a basis each of whose elements is positive on a terminal strong linkage class and zero otherwise. Weak reversibility is equivalent to the condition that the union of the terminal strong linkage classes is the set of all complexes. It can be concluded that when the network is not weakly reversible there exists no positive element of the kernel of $A_k$. Thus for a network which is not weakly reversible and has deficiency zero there exist no positive stationary solutions. This is part of the Deficiency Zero Theorem. Now consider the weakly reversible case. There a key statement of the Deficiency Zero Theorem is that there exists a complex balanced stationary solution $c^*$. Where does this $c^*$ come from? We sum the vectors in the basis of ${\rm ker} A_k$ and due to weak reversibility this gives something which is positive. Then we take the logarithm of the result. When $\delta=0$ this can be represented as a sum of two contributions where one is of the form $Y^T z$. Then $c^*=e^z$. A further part of the deficiency zero theorem is that the stationary solution $c^*$ in the weakly reversible case is asymptotically stable. This is proved using the fact that for a complex balanced stationary solution the function $h(c)=\sum_{s\in\cal S}[c(s)(\log c(s)-\log c^*(s)-1)+c(s^*)]$ is a Lyapunov function which vanishes for $c=c^*$

## Conference on biological oscillators at EMBL in Heidelberg

November 17, 2015

EMBL, the European Molecular Biology Laboratory, is an international institution consisting of laboratories at five sites, two in Germany, one in the UK, one in France and one in Italy. I recently attended a meeting on the theme ‘Biological Oscillators’ at the site in Heidelberg. The impressive building is in the form of a double helix. There are two spiral ramps over several stories which are linked by bridges (‘hydrogen bonds’, in German Wasserstoffbrücken). This helix provides an original setting for the poster sessions. The building is reached by ascending steep hills in the area behind the castle. I took the comfortable option of using the bus provided by the institute. This meeting had about 130 participants but I think that the capacity is much greater.

One of the most interesting talks on the first day from my point of view was by Petra Schwille from the Max Planck Institute for Biochemistry. She talked about the Min system which is used by bacteria to determine their plane of division. The idea is that certain proteins (whose identity is explicitly known) oscillate between the ends of the cell and that the plane of division is the nodal surface of the concentration of one of these. The speaker and her collaborators have been able to reconstitute this system in a cell-free context. A key role is played by the binding of the proteins to the cell membrane. Diffusion of bound proteins is much slower than that of proteins in solution and this situation of having two different diffusion constants in a coupled system is similar to the classical scenario known from the Turing instability. It sounds like modelling this system mathematically can be a lot of fun and that there is no lack of people interested in doing so.

There was also a ‘Keynote Lecture’ by Jordi Garcia-Ojalvo which lived up to the promise of its special title. The topic was the growth of a colony of Bacillus subtilis. (The published reference is Nature 523, 550.) In fact, to allow better control, the colony is constrained to be very thin and is contained in a microfluidic system which allows its environment to be manipulated precisely. A key observation is that the colony does not grow at a constant rate. Instead its growth rate is oscillatory. The speaker explained that this can be understood in terms of the competition between the cells near the edge of the colony and those in the centre. The colony is only provided with limited resources (glycerol, glutamate and salts). It may be asked which resource limits the growth rate. It is not the glycerol, which is the primary carbon source. Instead it is the glutamate, which is the primary source of nitrogen. An important intermediate compound in the use of glutamate is ammonium. If cells near the boundary of the colony produced ammonium it would be lost to the surroundings. Instead they use ammonium produced by the interior cells. It is the exterior cells which grow and they can deprive the inner cells of glutamate. This prevents the inner cells producing ammonium which is then lacking for the growth of the outer cells. This establishes a negative feedback loop which can be seen as the source of the oscillations in growth rate. The feasibility of this mechanism was checked using a mathematical model. The advantage of the set-up for the bacteria is that if the colony is exposed to damage from outside it can happen that only the exterior cells die and the interior cells generate a new colony. The talk also included a report on further work (Nature 527, 59) concerning the role of ion channels in biofilms. There are close analogies to the propagation of nerve signals and the processes taking place can be modelled by equations closely related to the Hodgkin-Huxley system.

I will now mention a collection of other topics at the conference which I found particularly interesting. One recurring theme was NF$\kappa$B. This transcription factor is known to exhibit oscillations. A key question is what their function is, if any. One of the pioneers in this area, Mike White, gave a talk at the conference. There were also a number of other people attending working on related topics. I do not want to go any deeper here since I think that this is a theme to which I might later devote a post of its own, if not more than one. I just note two points from White’s talk. One is that this substance is a kind of hub or bow-tie with a huge number of inputs and outputs. Another is that the textbook picture of the basic interactions of NF$\kappa$B is a serious oversimplification. Another transcription factor which came up to a comparable extent during the conference is Hes1, which I had never previously heard of. Jim Ferrell gave a talk about the coordination of mitosis in Xenopus eggs. These are huge cells where communication by means of diffusion would simply not be fast enough. The alternative proposed by Ferrell are trigger waves, which can travel much faster. Carl Johnson talked about mechanisms ensuring the stability of the KaiABC oscillator. He presented videos showing the binding of individual KaiA molecules to KaiC. I was was amazed that these things can be seen directly and are not limited to the cartoons to be found in biology textbooks. Other videos I found similarly impressive were those of Alexander Aulehla showing the development of early mouse embryos (segmentation clock) where it could be seen how waves of known chemical events propagating throught the tissues orchestrate the production of structures in the embryo. These pictures brought the usual type of explanations used in molecular biology to a new level of concreteness in my perception.

## David Vetter, the bubble boy

October 17, 2015

T cells are a class of white blood cells without which a human being usually cannot survive. An exception to this was David Vetter, a boy who lived 12 years without T cells. This was only possible because he lived all this time in a sterile environment, a plastic bubble. For this reason he became known as the bubble boy. The disease which he suffered from is called SCID, severe combined immunodeficiency, and it corresponds to having no T cells. The most common form of this is due to a mutation on the X chromosome and as a result it usually affects males. The effects set in a few months after birth. The mutation leads to a lack of the $\gamma$ chain of the IL-2 receptor. In fact this chain occurs in several cytokine receptors and is therefore called the ‘common chain’. Probably the key to the negative effects caused by its lack in SCID patients is the resulting lack of the receptor for IL-7, which is important for T cell development. SCID patients have a normal number of B cells but very few antibodies due to the lack of support by helper T cells. Thus in the end they lack both the immunity usually provided by T cells and that usually provided by B cells. This is the reason for the description ‘combined immunodeficiency’. I got the information on this theme which follows mainly from two sources. The first is a documentary film ‘Bodyshock – The Boy in the Bubble’ about David Vetter produced by Channel 4 and available on Youtube. (There are also less serious films on this subject, including one featuring John Travolta.) The second is the chapter on X-linked SCID in the book ‘Case Studies in Immunology’ by Raif Geha and Luigi Notarangelo. I find this book a wonderful resource for learning about immunology. It links general theory to the case history of specific patients.

David Vetter had an older brother who also suffered from SCID and died of infection very young. Thus his parents and their doctors were warned. The brother was given a bone marrow transplant from his sister, who had the necessary tissue compatibility. Unfortunately this did not save him, presumably because he had already been exposed to too many infections by the time it was carried out. The parents decided to have another child, knowing that if it was a boy the chances of another case of SCID were 50%. Their doctors had a hope of being able to save the life of such a child by isolating him and then giving him a bone marrow transplant before he had been exposed to infections. The parents very soon had another child, it was a boy, he had SCID. The child was put into a sterile plastic bubble immediately after birth. Unfortunately it turned out that the planned bone marrow donor, David’s sister, was not a good match for him. It was necessary to wait and hope for an alternative donor. This hope was not fulfilled and David had to stay in the bubble. This had not been planned and it must be asked whether the doctors involved had really thought through what would happen if the optimal variant they had thought of did not work out.

At one point David started making punctures in his bubble as a way of attracting attention. Then it was explained to him what his situation was and why he must not damage the bubble. Later there was a kind of space suit produced for him by NASA which allowed him to move around outside his home. He only used it six times since he was too afraid there could be an accident. His physical health was good but understandably his psychological situation was difficult. New ideas in the practise of bone marrow transplantation indicated that it might be possible to use donors with a lesser degree of compatibility. On this basis David was given a transplant with his sister as the donor. It was not noticed that her bone marrow was infected with Epstein-Barr virus. As a result David got Burkitt’s lymphoma, a type of cancer which can be caused by that virus. (Compare what I wrote about this role of EBV here.) He died a few months after the operation, at the age of 12. Since that time treatment techniques have improved. The patient whose case is described in the book of Geha and Notarangelo had a successful bone marrow transplant (with his mother as donor). Unfortunately his lack of antibodies was not cured but this can be controlled with injections of immunoglobulin once every three weeks.

## Siphons in reaction networks

October 8, 2015

The concept of a siphon is one which I have been more or less aware of for quite a long time. Unfortunately I never had the impression that I had understood it completely. Given the fact that it came up a lot in discussions I was involved in and talks I heard last week I thought that the time had come to make the effort to do so. It is of relevance for demonstrating the property of persistence in reaction networks. This is the property that the $\omega$-limit points of a positive solution are themselves positive. For a bounded solution this is the same as saying that the infima of all concentrations at late times are positive. The most helpful reference I have found for these topics is a paper of Angeli, de Leenheer and Sontag in a proceedings volume edited by Queinnec et. al.

There are two ways of formulating the definition of a siphon. The first is more algebraic, the second more geometric. In the first the siphon is defined to be a set $Z$ of species with the property that whenever one of the species in $Z$ occurs on the right hand side of a reaction one of the species in $Z$ occurs on the left hand side. Geometrically we replace $Z$ by the set $L_Z$ of points of the non-negative orthant which are common zeroes of the elements of $Z$, thought of as linear functions on the species space. The defining property of a siphon is that $L_Z$ is invariant under the (forward in time) flow of the dynamical system describing the evolution of the concentrations. Another way of looking at the situation is as follows. Consider a point of $L_Z$. The right hand side of the evolution equations of one of the concentrations belonging to $Z$ is a sum of positive and negative terms. The negative terms automatically vanish on $L_Z$ and the siphon condition is what is needed to ensure that the positive terms also vanish there. Sometimes minimal siphons are considered. It is important to realize that in this case $Z$ is minimal. Correspondingly $L_Z$ is maximal. The convention is that the empty set is excluded as a choice for $Z$ and correspondingly the whole non-negative orthant as a choice for $L_Z$. What is allowed is to choose $Z$ to be the whole of the species space which means that $L_Z$ is the origin. Of course whether this choice actually defines a siphon depends on the particular dynamical system being considered.

If $x_*$ is an $\omega$-limit point of a positive solution but is not itself positive then the set of concentrations which are zero at that point is a siphon. In particular stationary solutions on the boundary are contained in siphons. It is remarked by Shiu and Sturmfels (Bull. Math. Biol. 72, 1448) that for a network with only one linkage class if a siphon contains one stationary solution it consists entirely of stationary solutions. To see this let $x_*$ be a stationary solution in the siphon $Z$. There must be some complex $y$ belonging to the network which contains an element of $Z$. If $y'$ is another complex then there is a directed path from $y'$ to $y$. We can follow this path backwards from $y$ and conclude successively that each complex encountered contains an element of $Z$. Thus $y'$ contains an element of $Z$ and since $y'$ was arbitrary all complexes have this property. This means that all complexes vanish at $x_*$ so that $x_*$ is a stationary solution.

Siphons can sometimes be used to prove persistence. Suppose that $Z$ is a siphon for a certain network so that the points of $Z$ are potential $\omega$-limit points of solutions of the ODE system corresponding to this network. Suppose further that $A$ is a conserved quantity for the system which is a linear combination of the coordinates with positive coefficents. For a positive solution the quantity $A$ has a positive constant value along the solution and hence also has the same value at any of its $\omega$-limit points. It follows that if $A$ vanishes on $Z$ then no $\omega$-limit point of that solution belongs to $Z$. If it is possible to find a conserved quantity $A$ of this type for each siphon of a given system (possibly different conserved quantities for different siphons) then persistence is proved. For example this strategy is used in the paper of Angeli et al. to prove persistence for the dual futile cycle. The concept of persistence is an important one when thinking about the general properties of reaction networks. The persistence conjecture says that any weakly reversible reaction network with mass action kinetics is persistent (possibly with the additional assumption that all solutions are bounded). In his talk last week Craciun mentioned that he is working on proving this conjecture. If true it implies the global attractor conjecture. It also implies a statement claimed in a preprint of Deng et. al. (arXiv:1111.2386) that a weakly reversible network has a positive stationary solution in any stoichiometric compatobility class. This result has never been published and there seems to be some doubt as to whether the proof is correct.

## Trip to the US

October 5, 2015

Last week I visited a few places in the US. My first stop was Morgantown, West Virginia where my host was Casian Pantea. There I had a lot of discussions with Casian and Carsten Conradi on chemical reaction network theory. This synergized well with the work I have recently been doing preparing a lecture course on that subject which I will be giving in the next semester. I gave a talk on MAPK and got some feedback on that. It rained a lot and there was not much opportunity to do anything except work. One day on the way to dinner while it was relatively dry I saw a Cardinal and I fortunately did have my binoculars with me. On Wednesday afternoon I travelled to New Brunswick and spent most of Thursday talking to Eduardo Sontag at Rutgers. It was a great pleasure to talk to an excellent mathematician who also knows a lot about immunology. He and I have a lot of common interests which is in part due to the fact that I was inspired by several of his papers during the time I was getting into mathematical biology. I also had the opportunity to meet Evgeni Nikolaev who told me a variety of interesting things. They concerned bifurcation theory in general, its applications to the kinds of biological models I am interested in and his successes in applying mathematical models to understanding concrete problems in biomedical research such as the processes taking place in tuberculosis. My personal dream is to see a real coming together of mathematics and immunology and that I have the chance to make a contribution to that process.

On Friday I flew to Chicago in order to attend an AMS sectional meeting. I had been in Chicago once before but that is many years ago now. I do remember being impressed by how much Lake Michigan looks like the sea, I suppose due to the structure of the waves. This impression was even stronger this time since there were strong winds whipping up the waves. Loyola University, the site of the meeting, is right beside the lake and it felt like home for me due to the combination of wind, waves and gulls. The majority of those were Ring-Billed Gulls which made it clear which side of the Atlantic I was on. There were also some Herring Gulls and although they might have been split from those on the other side of the Atlantic by the taxonomists I did not notice any difference. It was the first time I had been at an AMS sectional meeting and my impression was that the parallel sessions were very parallel, in other words in no danger of meeting. Most of the people in our session were people I knew from the conferences I attended in Charlotte and in Copenhagen although I did make a couple of new acquaintances, improving my coverage of the reaction network community.

In a previous post I mentioned Gheorghe Craciun’s ideas about giving the deficiency of a reaction network a geometric interpretation, following a talk of his in Copenhagen. Although I asked him questions about this on that occasion I did not completely understand the idea. Correspondingly my discussion of the point here in my blog was quite incomplete. Now I talked to him again and I believe I have finally got the point. Consider first a network with a single linkage class. The complexes of the network define points in the species space whose coordinates are the stoichiometric coefficients. The reactions define oriented segments joining the educt complex to the product complex of each reaction. The stoichiometric subspace is the vector space spanned by the differences of the complexes. It can also be considered as a translate of the affine subspace spanned by the complexes themselves. This makes it clear that its dimension $s$ is at most $n-1$, where $n$ is the number of complexes. The number $s$ is the rank of the stoichiometric matrix. The deficiency is $n-1-s$. At the same time $s\le m$. If there are several linkage classes then the whole space has dimension at most $n-l$, where $l$ is the number of linkage classes. The deficiency is $n-l-s$. If the spaces corresponding to the individual linkage classes have the maximal dimension allowed by the number of complexes in that class and these spaces are linearly independent then the deficiency is zero. Thus we see that the deficiency is the extent to which the complexes fail to be in general position. If the species and the number of complexes have been fixed then deficiency zero is seen to be a generic condition. On the other hand fixing the species and adding more complexes will destroy the deficiency zero condition since then we are in the case $n-l>m$ so that the possibility of general position is excluded. The advantage of having this geometric picture is that it can often be used to read off the deficiency directly from the network. It might also be used to aid in constructing networks with a desired deficiency.

## Immunotherapy for cancer

September 20, 2015

A promising innovative approach to cancer therapy is to try to persuade the immune system to attack cancer cells effectively. The immune system does kill cancer cells and presumably removes many tumours which we never suspect we had. At the same time established tumours are able to successfully resist this type of attack in many cases. The idea of taking advantage of the immune system in this way is an old one but it took a long time before it became successful enough to reach the stage of an approved drug. This goal was achieved with the approval of ipilimumab for the treatment of melanoma by the FDA in 2011. This drug is a monoclonal antibody which binds the molecule CTLA4 occurring on the surface of T cells.

To explain the background to this treatment I first recall some facts about T cells. T cells are white blood cells which recognize foreign substances (antigens) in the body. The antigen binds to a molecule called the T cell receptor on the surface of the cell and this gives the T cell an activation signal. Since an inappropriate activation of the immune system could be very harmful there are built-in safety mechanisms. In order to be effective the primary activation signal has to be delivered together with a kind of certificate that action is really necessary. This is a second signal which is given via another surface molecule on the T cell, CD28. The T cell receptor only binds to an antigen when the latter is presented on the surface of another cell (an antigen-presenting cell, APC) in a groove within another molecule, an MHC molecule (major histocompatibility complex). On the surface of the APC there are under appropriate circumstances other molecules called B7.1 and B7.2 which can bind to CD28 and give the second signal. Once this has happened the activated T cell takes appropriate action. What this is depends on the type of T cell involved but for a cytotoxic T cell (one which carries the surface molecule CD8) it means that the T cell kills cells presenting the antigen. If the cell was a virus-infected cell and the antigen is derived from the virus then this is exactly what is desired. Coming back to the safety mechanisms, it is not only important that the T cell is not erroneously switched on. It is also important that when it is switched on in a justified case it should also be switched off after a certain time. Having it switched on for an unlimited time would never be justified. This is where CTLA4 comes in. This protein can bind to B7.1 and B7.2 and in fact does so more strongly than CD28. Thus it can crowd out CD28 and switch off the second signal. By binding to CTLA4 the antibody in ipilimumab stops it from binding to B7.1 and B7.2, thus leaving the activated T cell switched on. In some cases cancer cells present unusual antigens and become a target for T cells. The killing of these cells can be increased by CTLA4 via the mechanism just explained. At this point I should say that it may not be quite clear whether this is really the mechanism of action of CTLA4 in causing tumours to shrink. Alternative possibilities are mentioned in the Wikipedia article on CTLA4.

There are various things which have contributed to my interest in this subject. One is lectures I heard in the series ‘Universität im Rathaus’ [University in the Town Hall] in Mainz last February. The speakers were Matthias Theobald and Ugur Sahin and the theme was personalized cancer medicine. The central theme of what they were talking about is one step beyond what I have just sketched. A weakness of the therapy using antibodies to CTLA4 or the related approach using antibodies to another molecule PD-1 is that they are unspecific. In other words they lead to an increase not only in the activity of the T cells specific to cancer cells but of all T cells which have been activated by some antigen. This means that serious side effects are very likely. An approach which is theoretically better but as yet in a relatively early stage of development is to produce T cells which are specific for antigens belonging to the tumour of a specific patient and for an MHC molecule of that patient capable of presenting that antigen. From the talk I had the impression that doing this requires a lot of input from bioinformatics but I was not able to understand what kind of input it is. I would like to know more about that. Coming back to CTLA4, I have been interested for some time in modelling the activation of T cells and in that context it would be natural to think about also modelling the deactivating effects of CTLA4 or PD-1. I do not know whether this has been tried.

## Oscillations in the MAP kinase cascade

September 10, 2015

In a recent post I mentioned my work with Juliette Hell on the existence of oscillations in the Huang-Ferrell model for the MAP kinase cascade. We recently put our paper on the subject on ArXiv. The starting point of this project was the numerical and neuristic work of Qiao et. al., PLoS Comp. Biol. 3, 1819. Within their framework these authors did an extensive search of parameter space and found Hopf bifurcations and periodic solutions for many parameters. The size of the system is sufficiently large that it represents a significant obstacle to analytical investigations. One way of improving this situation is to pass to a limiting system (MM system) by a Michaelis-Menten reduction. In addition it turns out that the periodic solutions already occur in a truncated system consisting of the first two layers of the cascade. This leaves one layer with a single phosphorylation and one with a double phosphorylation. In a previous paper we had shown how to do Michaelis-Menten reduction for the truncated system. Now we have generalized this to the full cascade. In the truncated system the MM system is of dimension three, which is still quite convenient for doing bifurcation theory. Without truncation the MM system is of dimension five, which is already much more difficult. It is however possible to represent the system for the truncated cascade as a (singular) limit of that for the full cascade and thus transport information from the truncated to the full cascade.

Consider the MM system for the truncated cascade. The aim is then to find a Hopf bifurcation in a three-dimensional dynamical system with a lot of free parameters. Because of the many parameters is it not difficult to find a large class of stationary solutions. The strategy is then to linearize the right hand side of the equations about these stationary solutions and try show that there are parameter values where a suitable bifurcation takes place. To do this we would like to control the eigenvalues of the linearization, showing that it can happen that at some point one pair of complex conjugate eigenvalues passes through the imaginary axis with non-zero velocity as a parameter is varied, while the remaining eigenvalue has non-zero real part. The behaviour of the eigenvalues can largely be controlled by the trace, the determinant and an additional Hurwitz determinant. It suffices to arrange that there is a point where the trace is negative, the determinant is zero and the Hurwitz quantity passes through zero with non-zero velocity. This we did. A superficially similar situation is obtained by modelling an in vitro model for the MAPK cascade due to Prabakaran, Gunawardena and Sontag mentioned in a previous post in a way strictly analogous to that done in the Huang-Ferrell model. In that case the layers are in the opposite order and a crucial sign is changed. Up to now we have not been able to show the existence of a Hopf bifurcation in that system and our attempts up to now suggest that there may be a real obstruction to doing so. It should be mentioned that the known necessary condition for a stable hyperbolic periodic solution, the existence of a negative feedback loop, is satisfied by this system.

Now I will say some more about the model of Prabakaran et. al. Its purpose is to obtain insights on the issue of network reconstruction. Here is a summary of some things I understood. The in vitro biological system considered in the paper is a kind of simplification of the Raf-MEK-ERK MAPK cascade. By the use of certain mutations a situation is obtained where Raf is constitutively active and where ERK can only be phosphorylated once, instead of twice as in vivo. This comes down to a system containing only the second and third layers of the MAPK cascade with the length of the third layer reduced from three to two phosphorylation states. The second layer is modelled using simple mass action (MA) kinetics with the two phosphorylation steps being treated as one while in the third layer the enzyme concentrations are included explicitly in the dynamics in a standard Michaelis-Menten way (MM-MA). The resulting mathematical model is a system of seven ODE with three conservation laws. In the paper it is shown that for given values of the conserved quantities the system has a unique steady state. This is an application of a theorem of Angeli and Sontag. Note that this is not the same system of equations as the system analogous to that of Huang-Ferrell mentioned above.

The idea now is to vary one of the conserved quantities and monitor the behaviour of two functions $x$ and $y$ of the unknowns of the system at steady state. It is shown that for one choice of the conserved quantity $x$ and $y$ change in the same direction while for a different choice of the conserved quantity they change in opposite directions when the conserved quantity is varied. From a mathematical point of view this is not very surprising since there is no obvious reason forbidding behaviour of this kind. The significance of the result is that apparently biologists often use this type of variation in experiments to reach conclusions about causal relationships between the concentrations of different substances (activation and inhibition), which can be represented by certain signed oriented graphs. In this context ‘network reconstruction’ is the process of determining a graph of this type. The main conclusion of the paper, as I understand it, is that doing different experiments can lead to inconsistent results for this graph. Note that there is perfect agreement between the experimental results in the paper and the results obtained from the mathematical model. In a biological system if two experiments give conflicting results it is always possible to offer the explanation that some additional substance which was not included in the model is responsible for the difference. The advantage of the in vitro model is that there are no other substances which could play that role.

## Models for photosynthesis, part 3

September 8, 2015

Here I continue the discussion of models for photosynthesis in two previous posts. There I described the Pettersson and Poolman models and indicated the possibility of introducing variants of these which use exclusively mass action kinetics. I call these the Pettersson-MA and Poolman-MA models. I was interested in obtaining information about the qualitative behaviour of solutions of these ODE systems. This gave rise to the MSc project of Dorothea Möhring which she recently completed successfully. Now we have extended this work a little further and have written up the results in a paper which has just been uploaded to ArXiv. The central issue is that of overload breakdown which is related to the mathematical notion of persistence. We would like to know under what circumstances a positive solution can have $\omega$-limit points where some concentrations vanish and, if so, which concentrations vanish in that case. It seems that there was almost no information on the latter point in the literature so that the question of what exactly overload breakdown is remained a bit nebulous. The general idea is that the Pettersson model should have a stronger tendency to undergo overload breakdown while the Poolman model should have a stronger tendency to avoid it. The Pettersson-MA and Poolman-MA models represent a simpler context to work in to start with.

For the Pettersson-MA model we were able to identify a regime in which overload breakdown takes place. This is where the initial concentrations of all sugar phosphates and inorganic phosphate in the chloroplast are sufficiently small. In that case the concentrations of all sugar phosphates tend to zero at late times with two exceptions. The concentrations of xylulose-4-phosphate and sedoheptulose-7-phosphate do not tend to zero. These results are obtained by linearizing the system around a simple stationary solution on the boundary and applying the centre manifold theorem. Another result is that if the reaction constants satisfy a certain inequality a positive solution can have no positive $\omega$-limit points. In particular, there are no positive stationary solutions in that case. This is proved using a Lyapunov function related to the total number of carbon atoms. In the case of the Poolman-MA model it was shown that the stationary point which was stable in the Pettersson case becomes unstable. Moreover, a quantitative lower bound for concentration of sugar phosphates at late times in obtained.These results fit well with the intuitive picture of what should happen. Some of the results on the Poolman-MA model can be extended to analogous ones for the original Poolman model. On the other hand the task of giving a full rigorous definition of the Pettersson model was postponed for later work. The direction in which this could go has been sketched in a previous post.

There remains a lot to be done. It is possible to define a kind of hybrid model by setting $k_{32}=0$ in the Poolman model. It would be desirable to completely clarify the definition of the Pettersson model and then, perhaps, to show that it can be obtained as a well-behaved limiting system of the hybrid system in the sense of geometric singular perturbation theory. This might allow the dynamical properties of solutions of the different systems to be related to each other. The only result on stationary solutions obtained so far is a non-existence theorem. It would be of great interest to have positive results on the existence, multiplicity and stability of stationary solutions. A related question is that of classifying possible $\omega$-limit points of positive solutions where some of the concentrations are zero. This was done in part in the paper but what was not settled is whether potential $\omega$-limit points with positive concentrations of the hexose phosphates can actually occur. Finally, there are a lot of other models for the Calvin cycle on the market and it would be interesting to see to what extent they are susceptible to methods similar to those used in our paper.

## Phosphorylation systems

September 1, 2015

In order to react to their environment living cells use signalling networks to propagate information from receptors, which are often on the surface of the cell, to the nucleus where transcription factors can change the behaviour of the cell by changing the rate of production of different proteins. Signalling networks often make use of phosphorylation systems. These are networks of proteins whose enzymatic activity is switched on or off by phosphorylation or dephosphorylation. When switched on they catalyse the (de-)phophorylation of other proteins. The information passing through the network is encoded in the phosphate groups attached to specific amino acids in the proteins concerned. A frequently occurring example of this type of system is the MAPK cascade discussed in a previous post. There the phosphate groups are attached to the amino acids serine, threonine and tyrosine. Another type of system, which is common in bacteria, are the two-component systems where the phosphate groups are attached to histidine and aspartic acid.

There is a standard mathematical model for the MAPK cascade due to Huang and Ferrell. It consists of three layers, each of which is a simple or dual futile cycle. Numerical and heuristic investigations indicate that the Huang-Ferrell model admits periodic solutions for certain values of the parameters. Together with Juliette Hell we set out to find a rigorous proof of this fact. In the beginning we pursued the strategy of showing that there are relaxation oscillations. An important element of this is to prove that the dual futile cycle exhibits bistability, a fact which is interesting in its own right, and we were able to prove this, as has been discussed here. In the end we shifted to a different strategy in order to prove the existence of periodic solutions. The bistability proof used a quasistationary (Michaelis-Menten) reduction of the Huang-Ferrell system. It applied bifurcation theory to the Michaelis-Menten system and geometric singular perturbation theory to lift this result to the original system. To prove the existence of periodic solutions we used a similar strategy. This time we showed the presence of Hopf bifurcations in a Michaelis-Menten system and lifted those. The details are contained in a paper which is close to being finished. In the meantime we wrote a review article on phosphorylation systems. Here I want to mention some of the topics covered there.

The MAPK cascade, which is the central subject of the paper is not isolated in its natural biological context. It is connected with other biochemical reactions which can be thought of as feedback loops, positive and negative. As already mentioned, the cascade itself consists of layers which are futile cycles. The paper first reviews what is known about the dynamics of futile cycles and the stand-alone MAPK cascasde. The focus is on phenomena such as multistability, sustained oscillations and (marginally) chaos and what can be proved about these things rigorously. The techniques which can be used in proofs of this kind are also reviewed. Given the theoretical results on oscillations it is interesting to ask whether these can be observed experimentally. This has been done for the Raf-MEK-ERK cascade by Shankaran et. al. (Mol. Syst. Biol. 5, 332). In that paper it is found that the experimental results do not fit well to the oscillations in the isolated cascade but they can be modelled better when the cascade is embedded in a negative feedback loop. Two other aspects are also built into the models used – the translocation of ERK from the cytosol to the nucleus (which is what is actually measured) and the fact that when ERK and MEK are not fully phosphorylated they can bind to each other. It is also briefly mentioned in our paper that a negative feedback can arise through the interaction of ERK with its substrates, as explained in Liu et. al. (Biophys. J. 101, 2572). For the cascade as treated in the Huang-Ferrell model with feedback added no rigorous results are known yet. (For a somewhat different system there is result on oscillations due to Gedeon and Sontag, J. Diff. Eq. 239, 273, which uses the strategy based on relaxation oscillations.)

In our paper there is also an introduction to two-component systems. A general conclusion of the paper is that phosphorylation systems give rise to a variety of interesting mathematical problems which are waiting to be investigated. It may also be hoped that a better mathematical understanding of this subject can lead to new insights concerning the biological systems being modelled. Biological questions of interest in this context include the following. Are dynamical features of the MAPK cascade such as oscillations desirable for the encoding of information or are they undesirable side effects? To what extent do feedback loops tend to encourage the occurrence of features of this type and to what extent do they tend to suppress them? What are their practical uses, if any? If the function of the system is damaged by mutations how can it be repaired? The last question is of special interest due to the fact that many cancer cells have mutations in the Raf-MEK-ERK cascade and there have already been many attempts to overcome their negative effects using kinase inhibitors, some of them successful. A prominent example is the Raf inhibitor Vemurafenib which has been used to treat metastatic melanoma.