## Archive for the ‘Uncategorized’ Category

### Organizing posts by categories

August 25, 2012

I have a tendency to use the minimal amount of technology I have to in order to achieve a particular goal. So for instance, having been posting things on this blog for several years, I have made use of hardly any of the technical possibilities available. Among other things I did not assign my posts to categories, just putting them in one long list. I can well understand that not everyone who wants to read about immunology wants to read about general relativity and vice versa. Hence it is useful to have a sorting mechanism which can help to direct people to what they are interested in. Now I have invested the effort to add information on categories to most of the posts. It was easy (though time-consuming) to do and I find that the results are useful. It is also helpful for me to navigate through the material and it is interesting for me to see at a glance how many posts on which subjects there are. From now on I will systematically assign (most) new posts to a category and the effort to do so should be negligible. This post is an exception since it does not really fit into any category I have.

### Do you know these matrices?

March 9, 2012

I have come across a class of matrices with some interesting properties. I feel that they must be known but I have not been able to find anything written about them. This is probably just because I do not know the right place to look. I will describe these matrices here and I hope that somebody will be able to point out a source where I can find more information about them. Consider an $n\times n$ matrix $A$ with elements $a_{ij}$ having the following properties. The elements with $i=j$ (call them $b_i$) are negative. The elements with $j=i+1\ {\rm mod}\ n$ (call them $c_i$) are positive. All other elements are zero. The determinant of a matrix of this type is $\prod_i b_i+(-1)^{n+1}\prod_i c_i$. Notice that the two terms in this sum always have opposite signs. A property of these matrices which I found surprising is that $B=(-1)^{n+1}(\det A)A^{-1}$ is a positive matrix, i.e. all its entries $b_{ij}$ are positive. In proving this it is useful to note that the definition of the class is invariant under cyclic permutation of the indices. Therefore it is enough to show that the entries in the first row of $B$ are positive. Removing the first row and the first column from $A$ leaves a matrix belonging to the class originally considered. Removing the first row and a column other than the first from $A$ leaves a matrix where $a_{n1}$ is alone in its column. Thus the determinant can be expanded about that element. The result is that we are left to compute the determinant of an $(n-2)\times (n-2)$ matrix which is block diagonal with the first diagonal block belonging to the class originally considered and the second diagonal block being the transpose of a matrix of that class. With these remarks it is then easy to compute the determinant of the $(n-1)\times (n-1)$ matrix resulting in each of these cases. In more detail $b_{11}=(-1)^{n+1}b_2b_3\ldots b_n$ and $b_{1j}=(-1)^{n-j}b_2b_3\ldots b_{j-1}c_j\ldots c_n$ for $j>1$.
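The determinant formula and the positivity of $B$ can be checked on small examples. The following is a minimal sketch in Python using exact rational arithmetic; the function names `cyclic_matrix`, `det` and `scaled_inverse` and the sample values of $b_i$ and $c_i$ are illustrative choices, not part of the original discussion. Indices run from $0$ in the code, and $B$ is computed as a signed adjugate, which agrees with $(-1)^{n+1}(\det A)A^{-1}$ whenever $\det A\ne 0$.

```python
from fractions import Fraction
from math import prod


def cyclic_matrix(b, c):
    """Build A with A[i][i] = b[i] and A[i][(i+1) % n] = c[i], zeros elsewhere (n >= 2)."""
    n = len(b)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = Fraction(b[i])
        A[i][(i + 1) % n] = Fraction(c[i])
    return A


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    total = Fraction(0)
    for j, entry in enumerate(M[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def scaled_inverse(A):
    """Return B = (-1)^(n+1) det(A) A^{-1}, computed as a signed adjugate."""
    n = len(A)
    B = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            minor = [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]
            cofactor = (-1) ** (i + j) * det(minor)
            # The adjugate is the transposed cofactor matrix.
            B[j][i] = (-1) ** (n + 1) * cofactor
    return B


b = [-1, -2, -3, -4]   # negative diagonal entries (illustrative values)
c = [1, 2, 1, 3]       # positive cyclic entries (illustrative values)
A = cyclic_matrix(b, c)
n = len(b)
print(det(A) == prod(b) + (-1) ** (n + 1) * prod(c))
print(all(x > 0 for row in scaled_inverse(A) for x in row))
```

Working with `Fraction` avoids any rounding issues when testing the sign of the entries.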

Knowing the positivity of $(-1)^{n+1}(\det A)A^{-1}$ means that it is possible to apply the Perron-Frobenius theorem to this matrix. In the case that $\det A$ has the same sign as $(-1)^{n+1}$ it follows that $A^{-1}$ has an eigenvector all of whose entries are positive. The corresponding eigenvalue is positive and larger in magnitude than any other eigenvalue of $A^{-1}$. This vector is also an eigenvector of $A$ with a positive eigenvalue. Looking at the characteristic polynomial it is easy to see that if $(-1)^n(b_1b_2\ldots b_n+(-1)^{n+1}c_1c_2\ldots c_n)<0$ the matrix $A$ has exactly one positive eigenvalue and that none of its eigenvalues is zero.

### The Perron-Frobenius theorem

October 20, 2011

The Perron-Frobenius theorem is a result in linear algebra which I have known about for a long time. On the other hand I never took the time to study a proof carefully and think about why the result holds. I was now motivated to change this by my interest in chemical reaction network theory and the realization that the Perron-Frobenius theorem plays a central role in CRNT. In particular, it lies at the heart of the original proof of the existence part of the deficiency zero theorem. Here I will review some facts related to the Perron-Frobenius theorem and its proof.

Let $A$ be a square matrix all of whose entries are positive. Note how this condition makes no sense for an endomorphism of a vector space in the absence of a preferred basis. Then $A$ has a positive eigenvalue $\lambda_+$ and it is bigger than the magnitude of any other eigenvalue. The dimension of the generalized eigenspace corresponding to this eigenvalue is one. There is a vector in the eigenspace all of whose components are positive. Let $C_i$ be the sum of the entries in the $i$th column of $A$. Then $\lambda_+$ lies between the minimum and the maximum of the $C_i$.
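These statements can be illustrated numerically. The sketch below uses plain power iteration, which is one standard way to approximate the Perron eigenvalue of a positive matrix; it is not taken from any of the proofs discussed here, and the matrix and the names `perron_pair` and `column_sums` are hypothetical examples. Normalising the iterate so that its components sum to one makes the column-sum estimate visible: the eigenvalue estimate is then a weighted average of the $C_i$.

```python
def perron_pair(A, iterations=200):
    """Power iteration for a matrix with positive entries.

    Returns an estimate of the Perron eigenvalue and an eigenvector
    normalised so that its components sum to one.
    """
    n = len(A)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)              # equals sum_j C_j v_j because sum(v) == 1
        v = [x / lam for x in w]
    return lam, v


def column_sums(A):
    """Sums C_j of the entries in each column of A."""
    return [sum(row[j] for row in A) for j in range(len(A))]


# Illustrative positive matrix (an assumption made for this example).
A = [[2.0, 1.0, 1.0],
     [1.0, 3.0, 1.0],
     [1.0, 1.0, 4.0]]
lam, v = perron_pair(A)
C = column_sums(A)
print(min(C) <= lam <= max(C))    # eigenvalue lies between the column sums
print(all(x > 0 for x in v))      # eigenvector has positive components
```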

If the assumption on $A$ is weakened to its having non-negative entries then most of the properties listed above are lost. However analogues can be obtained if the matrix is irreducible. This means by definition that the matrix has no non-trivial invariant coordinate subspace. In that case $A$ has a positive eigenvalue which is at least as big as the magnitude of any other eigenvalue. As in the positive case it has multiplicity one. There is a vector in the eigenspace all of whose elements are positive. In general there are other eigenvalues of the same magnitude as the maximal positive eigenvalue and they are related to it by multiplication with powers of a root of unity. The estimate for the maximal real eigenvalue in terms of column sums remains true. The last statement follows from the continuous dependence of the eigenvalues on the matrix.

Suppose now that a matrix $B$ has the properties that its off-diagonal elements are non-negative and that the sum of the elements in each of its columns is zero. Then the sum of the elements in each column of a matrix of the form $B+\lambda I$ is $\lambda$. On the other hand for $\lambda$ sufficiently large the entries of the matrix $B+\lambda I$ are non-negative. If $B$ is irreducible then it can be concluded that the Perron eigenvalue of $B+\lambda I$ is $\lambda$, that the kernel of $B$ is one-dimensional and that it is spanned by a vector all of whose components are positive. In the proof of the deficiency zero theorem this is applied to certain restrictions of the kinetic matrix. The irreducibility property of $B$ follows from the fact that the network is weakly reversible.
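The shift argument can be tried out on a small example. In the sketch below the $3\times 3$ matrix `B` is a made-up example with non-negative off-diagonal entries and zero column sums (it is not a kinetic matrix from any particular network), and `positive_kernel_vector` is a hypothetical helper name. The shift is chosen large enough that $B+\lambda I$ has non-negative entries with a positive diagonal entry, so that power iteration converges.

```python
def positive_kernel_vector(B, shift, iterations=500):
    """Approximate the kernel of B by power iteration on B + shift * I."""
    n = len(B)
    M = [[B[i][j] + (shift if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)   # stays equal to shift: every column of M sums to shift
        v = [x / total for x in w]
    return v


# Off-diagonal entries are non-negative and each column sums to zero.
B = [[-1.0, 0.0, 2.0],
     [1.0, -3.0, 0.0],
     [0.0, 3.0, -2.0]]
v = positive_kernel_vector(B, shift=3.0)
residual = [sum(B[i][j] * v[j] for j in range(3)) for i in range(3)]
print(all(x > 0 for x in v))
print(max(abs(r) for r in residual) < 1e-9)
```

For this particular `B` the kernel can also be found by hand: it is spanned by $(6,2,3)$, which the iteration reproduces after normalisation.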

The Perron-Frobenius theorem is proved in Gantmacher’s book on matrices. He proves the non-negative case first and uses that as a basis for the positive case. I would have preferred to see a proof for the positive case in isolation. I was not able to extract a simple conceptual picture which I found useful. I have seen some mention of the possibility of applying the Brouwer fixed point theorem but I did not find a complete treatment of this kind of approach written anywhere. There is an infinite-dimensional version of the theorem (the Krein-Rutman theorem). It applies to compact operators on a Banach space which satisfy a suitable positivity condition. In fact this throws some light on the point raised above concerning a preferred basis. Some extra structure is necessary but it does not need to be as much as a basis. What is needed is a positive cone. Let $K$ be the set of vectors in $n$-dimensional Euclidean space, all of whose components are non-negative. A matrix is non-negative if and only if it leaves $K$ invariant and this is something which can reasonably be generalized to infinite dimensions. Thus the set $K$ is the only extra structure which is required.

### Me on TV

November 26, 2010

Recently I was interviewed by TV journalists for a documentary of the channel 3Sat called “Rätsel Dunkle Materie” [The riddle of dark matter]. It was broadcast yesterday. Before I say more about my experience with this let me do a flashback to the only other time in my life I appeared on TV. On that occasion the BBC visited our school. I guess I was perhaps twelve at the time although I do not know for sure. I was filmed reading a poem which I had written myself. I was seen sitting in a window of the Bishops’ Palace in Kirkwall, looking out. I suppose only my silhouette was visible. I no longer have the text of the poem. All I know is that the first line was ‘Björn, adventuring at last’ and that later on there was some stuff about ravens. At that time I was keen on Vikings. The poem was no doubt very heroic, so that the pose looking out the window was appropriate.

Coming back to yesterday, the documentary consisted of three main elements. There was a studio discussion with three guests – the only one I know personally is Simon White. There were some clips illustrating certain ideas. Thirdly there were short sequences from interviews with some other people. I was one of these people. They showed a few short extracts of the interview with me and I was quite happy with the selection they made. This means conversely that they nicely cut out things which I might not have liked so much. I was answering questions posed by one of the journalists and which were not heard on TV. They told me in advance that this would be the case. They told me that for this reason I should not refer to the question during my answers. I found this difficult to do and I think I would need some practice to do it effectively. Fortunately it seems that they efficiently cut out these imperfections. I did not know the questions in advance of the filming and this led to some hesitant starts in my answers. This also did not come through too much in what was shown. Summing up, it was an interesting experience and I would do it again if I had the chance. Of course being a studio guest would be even more interesting …

I found the documentary itself not so bad. I could have done without the part about religion at the end. Perhaps the inclusion of this is connected with the fact that the presenter of the series, Gert Scobel, studied theology and also has a doctorate in hermeneutics. (I had to look up that word to have an idea what it meant.) An aspect of the presentation which was a bit off track was that it gave the impression that the idea of a theory unifying general relativity and quantum theory was solely due to Stephen Hawking. Before ending this post I should perhaps say something about my own point of view on dark matter and dark energy. Of course they are symptoms of serious blemishes in our understanding of reality. I believe that dark matter and dark energy are better approaches to explaining the existing observational anomalies than any other alternative which is presently available. In the past I have done some work related to dark energy myself. The one thing that I do not like about a lot of the research in this area is that while people are very keen on proposing new ‘theories’ (which are often just more or less vague ideas for models) there is much less enthusiasm for working out these ideas to obtain a logically sound proposal. Of course that would be more difficult. A case study in this direction was carried out in the diploma thesis of Nikolaus Berndt which was done under my supervision. The theme was to what extent the so-called Cardassian models (do not) deserve to be called a theory. We later produced a joint publication on this. It has not received much attention in the research community and as far as I know has only been cited once.

### The principle of symmetric criticality

May 12, 2010

There are many interesting partial differential equations which can be expressed as the Euler-Lagrange equations corresponding to some Lagrangian. Thus they are equivalent to the condition that the action defined by the Lagrangian is stationary under all variations. Sometimes we want to study solutions of the equations which are invariant under some symmetry group. Starting from the original equations, it is possible to calculate the symmetry-reduced equations. This is what I and many others usually do, without worrying about a Lagrangian formulation. Suppose that in some particular case the task of doing a symmetry reduction of the Lagrangian is significantly easier than the corresponding task for the differential equations. Then it is tempting to take the Euler-Lagrange equations corresponding to the symmetry-reduced action and hope that for symmetric solutions they are equivalent to the Euler-Lagrange equations without symmetry. But is this always true? The Euler-Lagrange equations without symmetry are equivalent to stationarity under all variations while the Euler-Lagrange equations for the symmetry-reduced action are equivalent to stationarity under symmetric perturbations. The second property is a priori weaker than the first. This procedure is often implicit in physics papers, where the variational formulation is more at the centre of interest than the equations of motion.

The potential problem just discussed is rarely if ever mentioned in the physics literature. Fortunately this question was examined a long time ago by Richard Palais in a paper entitled ‘The principle of symmetric criticality’ (Commun. Math. Phys. 69, 19). I have known of the existence of this paper for many years but I never took the trouble to look at it seriously. Now I have finally done so. Palais shows that the principle is true if the group is compact or if the action is by isometries on a Riemannian manifold. Here the manifold is allowed to be an infinite-dimensional Hilbert manifold, so that examples of relevance to field theories in physics are included. The proof in the Riemannian case is conceptually simple and so I will give it here. Suppose that $(M,g)$ is a Riemannian manifold and $f$ a function on $M$. Let a group $G$ act smoothly on $M$ leaving $g$ and $f$ invariant. Let $p$ be a critical point of the restriction of $f$ to the set $F$ of fixed points of the group action. It can be shown that $F$ is a smooth totally geodesic submanifold. (In fact in more generality a key question is whether the fixed point set is a submanifold. If this is not the case even the definition of the principle may be problematic.) Since $p$ is a critical point of the restriction, the gradient of $f$ at $p$ is orthogonal to $F$. Now consider the geodesic starting at $p$ with initial tangent vector equal to the gradient of $f$. It is evidently invariant under the group action since all the objects entering into its definition are. It follows that this geodesic consists of fixed points of the action of $G$ and so must be tangent to $F$. The gradient is thus both orthogonal and tangent to $F$. Hence the gradient of $f$ vanishes.

When does the principle fail? Perhaps the simplest example is given by the action of the real numbers on the plane generated by the vector field $x\frac{\partial}{\partial y}$ and the function $x$. This has no critical points but its restriction to the fixed point set, which is the $y$-axis, has critical points everywhere.
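The ingredients of this counterexample are simple enough to spell out in a few lines of Python. This is only a sketch; the helper names `flow`, `f` and `gradient` are illustrative. The flow of $x\frac{\partial}{\partial y}$ is $(x,y)\mapsto (x,y+tx)$, which fixes exactly the points with $x=0$ and leaves $f=x$ invariant.

```python
def flow(t, point):
    """Flow of the vector field x d/dy: (x, y) -> (x, y + t * x)."""
    x, y = point
    return (x, y + t * x)


def f(point):
    """The invariant function f(x, y) = x."""
    return point[0]


def gradient(func, point, h=1e-6):
    """Central-difference approximation to the gradient of func at point."""
    x, y = point
    dfdx = (func((x + h, y)) - func((x - h, y))) / (2 * h)
    dfdy = (func((x, y + h)) - func((x, y - h))) / (2 * h)
    return (dfdx, dfdy)


p = (0.7, -1.2)
print(f(flow(2.5, p)) == f(p))        # f is invariant under the action
q = (0.0, 1.5)
print(flow(3.0, q) == q)              # points on the y-axis are fixed
gx, gy = gradient(f, q)
print(abs(gx - 1.0) < 1e-9, gy == 0)  # df is non-zero, but vanishes along the y-axis
```

The derivative of $f$ along the fixed point set (the $y$-direction) vanishes at every point of the $y$-axis, while the full gradient never does, which is the failure of the principle in this example.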

### Induced pluripotent stem cells

January 18, 2010

The usual career of a living cell proceeds from its beginning as a stem cell in the embryo through a process of differentiation where it becomes more and more specialized until (in most cases) it finally takes its place in some tissue as a terminally differentiated cell. This process involves various genes being switched on or off. Usually in the past this process has been thought of as being more or less irreversible. This leads to the great interest in embryonic stem cells as a potential basis of the treatment of various illnesses by regeneration of certain types of cells. Unfortunately embryonic stem (ES) cells have two big problems associated with them. The first is that their use raises ethical concerns in many people which act as a powerful inhibitor of the development of the technology. The other is that they may involve medical dangers. If the cells develop in the wrong direction they may lead to tumours, especially the type called a teratoma where cells are found which are of the wrong type of tissue (and often of many types) for the place they are in.

It was discovered in 2006 by Shinya Yamanaka and his associates that the usual development can be run backwards, producing stem cells from terminally differentiated cells, for instance skin cells. They named these cells induced pluripotent stem cells (iPS cells). On the web page of the National Institutes of Health where they have videos of lectures (http://videocast.nih.gov/) there is a talk given by Yamanaka on January 14th, 2010 which is inspiring and at the same time presented in an entertaining style. The introduction by Francis Collins, director of the NIH, suggests that Yamanaka will not have to wait long for his Nobel Prize. iPS cells are an ethically safe alternative to ES cells. Their medical safety does not look so good at the moment. Under some circumstances the safety profile of iPS cells is similar to that of ES cells. Under other circumstances a subset of the cells seem to be refractory to differentiation and can then produce teratomas at a later time. It is necessary to learn to control their development better before they can be used in regenerative medicine. Of course it would be important to know what characterizes this subset. Yamanaka suggests that this may have to do with epigenetic factors and this idea is being tested in his laboratory now. An application of iPS cells less risky than tissue regeneration is to use cells produced from iPS cells to test drugs which are toxic, or even lethal, for certain patients but not for the majority. The idea is to take skin cells from the patient, turn them into stem cells and test the drug on those cells. Unfortunately this process requires a lot of time and money.

The normal cells are turned into iPS cells by the application of certain transcription factors. This may be done by transferring genetic material or by using the proteins themselves directly. Originally four different transcription factors had to be combined. Recent work by Hans Schöler and collaborators indicates that one of these, Oct4, is enough in humans. The article is in Nature, 461 (2009) 649.

### Four-dimensional Lie algebras

January 1, 2010

In mathematical general relativity it is common to study solutions of the Einstein equations with symmetry. In other words, solutions are considered which are invariant under the action of a Lie group $G$. (In what follows I will restrict consideration to the vacuum case to avoid having to talk about matter. So a solution means a Lorentzian metric $g$ satisfying ${\rm Ric}(g)=0$.) It is usual to concentrate on the four-dimensional case, corresponding to the fact that in everyday life we encounter one time and three space dimensions. One type of solutions with symmetry are the spatially homogeneous ones where the orbits of the group action are three-dimensional and spacelike. Then the Einstein equations reduce from partial differential equations to ordinary differential equations. This is a huge simplification although the solutions of the ODEs obtained are pretty complicated. Here I will make the further assumptions that the Lie group is of dimension three and that it is simply connected. The first of these assumptions is a real restriction but the second is not from my point of view since it does not change the dynamics of the solutions, which is what I am mainly interested in. With these assumptions the unknown can naturally be considered as a one-parameter family of left-invariant Riemannian metrics on a three-dimensional Lie group. These Riemannian metrics are obtained as the metrics induced by the spacetime metric on the orbits of the group action. Any connected three-dimensional Lie group can occur. Connected and simply connected Lie groups are in one-to-one correspondence with their Lie algebras. Thus it is important to understand what three-dimensional Lie algebras there are. Fortunately there exists a classification which was found by Bianchi in 1898. People working in general relativity call the spatially homogeneous solutions of the Einstein equations whose symmetry is defined by Lie groups in this way Bianchi models. They use the terminology of Bianchi, who distinguished types I-IX. A lot of work has been done on the dynamics of these solutions. Some more information on this can be found in a previous post on the Mixmaster model.

For reasons of pure mathematical curiosity, or otherwise, it is interesting to ask what happens to all this in space dimensions greater than three. Recently Arne Gödeke has written a diploma thesis on some aspects of this question under my supervision and this has led me to go into the issue in some depth. One thing which naturally comes up is the question of classifying Lie algebras in $n$ dimensions. As far as I can see there is not a useful complete classification in general dimensions but there is quite a bit of information available in low dimensions. Here I will concentrate on the case of four dimensions. In that case there is a classification which was found by Fubini in 1904 and since then other people have produced other versions. Having worked with Bianchi models for many years I feel very much at home with the three-dimensional Lie algebras. In contrast the four-dimensional classification appeared to me quite inhospitable and so I have invested some time in trying to fit the four-dimensional Lie algebras into a framework which I find more appealing. I record some of what I found here. The best guide I found was the work of Sigbjørn Hervik, in particular his paper in Class. Quantum Grav. 19, 5409 (cf. arXiv:gr-qc/0207079).

From the point of view of the dynamics of the Einstein equations one Bianchi type which is notably different from all others is type IX. The reason for this is that the Lie group (which is $SU(2)$) admits left-invariant metrics of positive scalar curvature. Is there a natural analogue for four-dimensional Lie algebras? A useful tool here is the Levi-Malcev theorem which provides a way of splitting a general Lie algebra into two simpler pieces. More precisely it says that each Lie algebra is the semidirect sum of a semisimple and a solvable Lie algebra. The semisimple part is called a Levi subalgebra and is unique up to isomorphism. It turns out that the information about whether there exists a metric of positive scalar curvature is contained in the Levi subalgebra. There are not many semisimple Lie algebras in low dimensions. In fact in dimension no greater than four there are only two, $su(2)$ and $sl(2,R)$. These correspond to Bianchi types IX and VIII respectively. The only possible non-trivial Levi decompositions are the semidirect sum of one of the two Lie algebras just mentioned and the real numbers. In fact it turns out that the semidirect sum of a semisimple Lie algebra and the real numbers is automatically a direct sum because any derivation of a semisimple Lie algebra is an inner derivation. The corresponding simply connected Lie group is a direct product. It can be concluded from this that the only simply connected and connected four-dimensional Lie group which admits a metric of positive scalar curvature is $SU(2)\times R$. This is the analogue of Bianchi type IX for $n=4$.

It is common in general relativity to divide the three-dimensional Lie algebras into two disjoint classes, Class A and Class B. The first of these consists of the unimodular Lie algebras, i.e. those whose structure constants have vanishing trace. They are closely associated with the class of Lie groups whose left-invariant metrics can be compactified by taking the quotient by a discrete group of isometries. They also have the pleasant property that their dynamics can be reduced to the case where the matrix of components of the metric in a suitable basis of left-invariant one-forms is diagonal. This is important for the Wainwright-Hsu system, a dynamical system formulation of the Einstein equations for Class A Bianchi models which is the basis for most of the rigorous results on the dynamics of these solutions obtained up to now. If type IX is omitted there are five different Lie algebras in Class A. One way of getting unimodular Lie algebras of dimension four is to take the direct sum of the three-dimensional Lie algebras with the real numbers. Call the others indecomposable. The indecomposable unimodular four-dimensional Lie algebras can be classified into six types. Four of these are individual Lie algebras while the other two types are one-parameter families of non-isomorphic algebras. One way of putting these into a larger framework is to note that each of them has a three-dimensional Abelian subalgebra. They can therefore be considered as special cases of solutions with three commuting spacelike Killing vector fields. This generalizes the fact that all the Class A Bianchi types except VIII and IX can be considered as solutions with two commuting Killing vector fields. I do not have an overview of the questions of compactification and diagonalization for these metrics. It seems that calculations done by Isenberg, Jackson and Lu in their study of the Ricci flow on homogeneous four-dimensional manifolds (Commun. Anal. Geom. 14, 345) might be helpful in this context.

More details on some of the things mentioned in this post will be given in a forthcoming preprint by Gödeke and myself.

### Manipulating cells using light

October 27, 2009

In what follows I describe another subject which was a theme in the talk of Orion Weiner mentioned in the previous post. By now I am familiar with the fact that there are techniques which allow us to see details of what is going on in cells. Here the most prominent protagonist is the green fluorescent protein (GFP) which was honoured by a Nobel prize in 2008. It allows information to be exported from the cell. This is a passive process in the sense that once the system has been prepared we just watch what happens. A more active process which is sometimes shown on video is that where a neutrophil follows the moving tip of a micropipette which is releasing a substance to which the cell is chemotactic. The subject of the present post is how it is possible to actively manipulate cells by sending in light of certain wavelengths. This may mean bathing the cell in light, illuminating certain precisely defined areas with a laser or a combination of the two.

The first type of experiment involves proteins which can be located either at the cell membrane or in the cytosol and which are fluorescently labelled so that their position can be monitored. It is possible to cause these molecules to move rapidly from the one localization to the other. This can be done on a time scale of a couple of seconds and it looks like switching a light on and off. This can be done many times in a row. Here the effect on the cell is global. The second type of experiment has to do with localizing this type of effect. It allows patterns chosen by the experimenter to be projected onto the cell. Here coloured patches are visible. Their interpretation is that concentrations of a certain substance have been fixed according to the pattern. The third type of experiment is the most striking. Here a spot of light is moved over the cell and away from it in a certain direction. There results a long projection of the cell in that direction. On the video it looks as if the cell is being pulled by a sticky object. All these things are done by switching on certain proteins which have been made light-sensitive. The sensitivity to light is achieved by incorporating elements which are responsible for allowing certain plants to react to light. One of the plants which acts as a source here is the favourite model organism among plants, Arabidopsis thaliana. The reference to the paper describing these results is ‘Spatiotemporal control of cell signalling using a light-switchable protein interaction’, Nature 461, 997-1001 (15 October, 2009).

### Poisson brackets

August 9, 2009

Poisson brackets are very popular in theoretical physics. In the case of a classical mechanical system described in terms of a Hamiltonian the underlying mathematical structure is a finite-dimensional manifold $M$ and a symplectic structure $\omega$. This object has an inverse $\omega^{-1}$. The Poisson bracket of two functions $f$ and $g$ on $M$ is defined in terms of their exterior derivatives as $\omega^{-1}(df,dg)$. There is no problem here for the mathematician interested in understanding this. What is more difficult is the definition of the Poisson bracket for field theories. Formally this corresponds to allowing the manifold $M$ to be infinite dimensional. Consider for instance a scalar field $\phi$, in one space dimension for maximum simplicity. Thus we are considering functions $\phi (t,\theta)$ and I suppose (simplicity again) that they are periodic in $\theta$. The Lagrangian density is $\frac12 (\phi_t^2-\phi_\theta^2)$. The momentum $\pi$ conjugate to $\phi$ is defined to be the time derivative $\phi_t$. In this case the phase space $M$ is formally the space of functions $(\phi,\pi)$. I will not try to specify what kind of functions – for example we can think of them as being smooth. Given two functionals $F$ and $G$ (i.e. two functions on the infinite dimensional space $M$) the Poisson bracket is defined by the formula $\int_{S^1} \frac{\delta F}{\delta\phi}\frac{\delta G}{\delta\pi}-\frac{\delta F}{\delta\pi}\frac{\delta G}{\delta\phi}$. The derivatives here are functional derivatives. Now, feeling certain that my physicist colleagues will react with amusement or incomprehension, I must admit that I do not understand what a functional derivative is. This formula is a hieroglyph for me.

Let us not be discouraged. The functional derivative looks something like a variational derivative, a concept which is more familiar to me. This basically just means that if I have a functional which is the integral over $S^1$ of a function of $\phi$ and $\pi$ then it is possible to consider variations $(\phi(\lambda),\pi(\lambda))$, intuitively curves in the manifold $M$, and differentiate the functional along them with respect to the parameter $\lambda$. If the space $M$ can be defined as a decent infinite-dimensional manifold (e.g. a Banach manifold) then this can be related to constructions of differential geometry on that manifold. The variational derivative just corresponds to the exterior derivative and is a one-form. If we start with smooth functions then this object takes values in the dual space. In other words we encounter distributions. Now in the presence of a volume form many distributions can be identified with functions. The one-form is then integration against that function, which could be called the functional derivative and used in the definition of the Poisson bracket. If the functionals are integrals of expressions depending pointwise on $\phi$ and $\pi$ then this is nothing other than the partial derivative with respect to the same quantity. Notice that with this definition we can reproduce the fact that the momentum $\pi$ is the functional derivative of the Lagrangian with respect to $\phi_t$. If the functional is allowed to depend on the spatial derivative of $\phi$ then a further complication is added since in that case it is necessary to integrate by parts in space to compute the function corresponding to the distribution. If we can interpret the functional derivatives as functions of $\theta$ then the Poisson bracket can be written as $\int_{S^1}\frac{\delta F}{\delta\phi}(\theta)\frac{\delta G}{\delta\pi}(\theta)-\frac{\delta F}{\delta\pi}(\theta)\frac{\delta G}{\delta\phi}(\theta)d\theta$.
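The integration by parts mentioned above can be made concrete for the functional $F=\frac12\int_{S^1}\phi_\theta^2\,d\theta$. Differentiating along a curve $\phi+\lambda h$ gives $\int_{S^1}\phi_\theta h_\theta\,d\theta$, and integrating by parts identifies the functional derivative with the function $-\phi_{\theta\theta}$. The following Python sketch compares the two sides numerically. The particular field $\phi=\sin\theta+0.3\cos 2\theta$, the variation $h=\sin\theta+0.5\cos 2\theta$, the grid size and all function names are assumptions made for the purpose of illustration.

```python
import math

N = 256                       # number of grid points on the circle (arbitrary choice)
DTHETA = 2 * math.pi / N
THETA = [k * DTHETA for k in range(N)]


def integrate(samples):
    """Rectangle rule on S^1, which is very accurate for smooth periodic integrands."""
    return sum(samples) * DTHETA


def gradient_energy(phi_theta):
    """F = (1/2) * integral of phi_theta^2, evaluated from samples of phi_theta."""
    return 0.5 * integrate([p * p for p in phi_theta])


# Derivatives of the illustrative field phi and variation h, written out by hand.
phi_theta = [math.cos(t) - 0.6 * math.sin(2 * t) for t in THETA]
phi_theta_theta = [-math.sin(t) - 1.2 * math.cos(2 * t) for t in THETA]
h = [math.sin(t) + 0.5 * math.cos(2 * t) for t in THETA]
h_theta = [math.cos(t) - math.sin(2 * t) for t in THETA]


def directional_derivative(eps=1e-4):
    """Central difference of F along the curve phi + lambda * h at lambda = 0."""
    plus = gradient_energy([p + eps * q for p, q in zip(phi_theta, h_theta)])
    minus = gradient_energy([p - eps * q for p, q in zip(phi_theta, h_theta)])
    return (plus - minus) / (2 * eps)


def pairing_with_functional_derivative():
    """Integral of (delta F / delta phi) * h, taking delta F / delta phi = -phi_theta_theta."""
    return integrate([-a * b for a, b in zip(phi_theta_theta, h)])


print(abs(directional_derivative() - pairing_with_functional_derivative()) < 1e-8)
```

For this choice of $\phi$ and $h$ both quantities can be computed by hand and are equal to $1.6\pi$.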

### Casimir invariants

July 13, 2009

For some time I have wanted to learn about the concept of Casimir invariants and I was not very satisfied with the information I found. Now I have made new efforts to learn about this topic and I will record some of what I learned here. Let $g$ be a finite-dimensional Lie algebra and let $G$ be the corresponding connected and simply connected Lie group. The Lie algebra $g$ can be identified with $T_e G$, the tangent space to $G$ at the identity. There is a one-to-one correspondence between elements of this tangent space and left-invariant vector fields on $G$. Let $T(g)$ be the tensor algebra over $g$. Let $U(g)$ denote the associative algebra which is the quotient of $T(g)$ by the two-sided ideal generated by elements of the form $x\otimes y-y\otimes x-[x,y]$. This is called the universal enveloping algebra of $g$. There is a natural embedding $i$ of $g$ into $U(g)$ and it is a Lie algebra homomorphism into $L(U(g))$, the Lie algebra obtained from $U(g)$ by using the commutator to define a Lie bracket. Given an associative algebra $A$ and a Lie algebra homomorphism $\phi$ from $g$ to $L(A)$ there exists an algebra homomorphism $\psi$ from $U(g)$ to $A$ such that $\phi=\psi\circ i$. This is the universal property which appears in the name of the object. Here the fact has been used implicitly that $A$ and $L(A)$ can be identified as sets. An important example is given by a representation $\rho$ of the Lie algebra $g$ on a vector space $V$, which can be thought of as a Lie algebra homomorphism from $g$ to $gl(V)=L({\rm End}(V))$.

A Casimir invariant, or Casimir element or Casimir operator of $g$ is an element of the centre of $U(g)$. What remains unclear to me is whether these three concepts are supposed to be equivalent, or just related. I am also not sure whether (in any of these cases) any element of the centre is allowed, or just a particular one or a particular type. One definition I have found is the following. Suppose that $g$ is semisimple. Then it has a Killing form $K$ which is a non-degenerate bilinear form. Let $X^i$ be a basis of $g$ and let $X_i$ be the basis of one-forms associated to it via $K$. (I.e. for any vector $Y$ we have $X_i(Y)=K(X^i,Y)$.) Then the Casimir invariant is defined to be the element of $U(g)$ given by $C=\sum _i X^i X_i$. This is independent of the basis chosen. Since $K$ is an invariant bilinear form it follows that $C$ commutes with all elements of $g$ and in fact lies in the centre of $U(g)$. As a consequence of the universal property it is possible to define an object $\rho (C)$. This is an operator on $V$ which commutes with all elements of the image of $\rho$. If the representation is irreducible then this implies that $\rho(C)$ is a multiple of the identity. The factor of proportionality is a real number which is an invariant of the representation.
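As a concrete check, here is a small Python sketch which computes $\rho(C)$ for $sl(2,R)$ in its two-dimensional defining representation. It relies on an interpretation which is an assumption on my part: the Casimir element is taken to be $\sum_{i,j}(K^{-1})_{ij}X_iX_j$, i.e. the second factor is the basis dual to the first with respect to the Killing form. The basis $H$, $E$, $F$ and all function names are illustrative choices, and exact rational arithmetic is used throughout.

```python
from fractions import Fraction

# Basis of sl(2,R) in the defining representation (an illustrative choice).
H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
BASIS = [H, E, F]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]


def coords(M):
    """Coordinates of a traceless matrix [[a, b], [c, -a]] in the basis (H, E, F)."""
    return [M[0][0], M[0][1], M[1][0]]


def ad(X):
    """Matrix of ad(X) in the basis (H, E, F); column j holds the coordinates of [X, X_j]."""
    cols = [coords(bracket(X, Y)) for Y in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]


def killing_form():
    """K_ij = trace(ad(X_i) ad(X_j))."""
    ads = [ad(X) for X in BASIS]
    return [[Fraction(sum(matmul(a, b)[k][k] for k in range(3))) for b in ads]
            for a in ads]


def inverse(M):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(M)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def casimir():
    """rho(C) = sum_{i,j} (K^{-1})_{ij} X_i X_j in the defining representation."""
    Kinv = inverse(killing_form())
    C = [[Fraction(0)] * 2 for _ in range(2)]
    for i, X in enumerate(BASIS):
        for j, Y in enumerate(BASIS):
            XY = matmul(X, Y)
            for r in range(2):
                for s in range(2):
                    C[r][s] += Kinv[i][j] * XY[r][s]
    return C


print(casimir())
```

With this normalization the result is $\frac38$ times the identity, in agreement with the statement that $\rho(C)$ is a multiple of the identity in an irreducible representation.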

There is a theorem on the structure of the centre of the universal enveloping algebra of a semisimple Lie algebra which is associated with the term ‘Harish-Chandra homomorphism’. It can be used to list the number of elements required to generate the centre and their orders as polynomials in basis vectors of the Lie algebra. The number of these generators is the rank of the algebra. For instance for $SL(2,R)$ there is one generator of order two. For $SU(3)$ there are generators of order two and three and the rank is two. The same will be true for $SL(3,R)$ or $SU(2,1)$. My aim at the moment is not to learn the abstract theory in depth but rather to understand enough to do some calculations for a specific application. I plan to say more about the application in a later post.