PHAS0067: Advanced Physical Cosmology

3 The Geodesic Equation

3.1 How the metric affects particles

Let’s now attempt to apply the WEP in practice, given a spacetime with some known metric. First recall that the metric allows us to define an infinitesimal squared proper distance $ds^2$ according to the equation

$$ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu, \tag{25}$$

where $g_{\mu\nu}$ is the metric tensor, and $x^\mu$ are the coordinates (so $dx^\mu$ are the infinitesimal changes in the coordinates). As an aside, you should be aware that there are three possible kinds of resulting intervals (see Fig. 4):

$$ds^2 < 0:\ \text{spacelike} \tag{26}$$
$$ds^2 = 0:\ \text{null / lightlike} \tag{27}$$
$$ds^2 > 0:\ \text{timelike.} \tag{28}$$

The distinction is a physical one, not just a mathematical one. First, let’s consider why $ds^2 = 0$ can be called “lightlike”. In our units where $c = 1$, light travelling along the $x$ axis in Minkowski space would obey $dx/dt = c = 1$. So, it has $ds^2 = dt^2 - dx^2 = 0$. The WEP immediately allows us to generalize this result and say that it applies in any spacetime, for light travelling in any direction. That justifies the use of the term “lightlike”.

Back in Minkowski space, consider two spacetime points separated only by time $t$. It follows immediately from equation (20) that the overall interval has $ds^2 > 0$. Similarly, two points separated only by space ($x$, $y$ or $z$) have $ds^2 < 0$. Once again, the WEP states that these results must also be true in curved space.
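To make the sign convention of equations (26) to (28) concrete, here is a minimal Python sketch that evaluates the Minkowski interval in units with $c = 1$ and labels it. The function names are illustrative choices, not part of the notes, and the small tolerance used to decide when an interval counts as null is an assumption of the sketch.

```python
def interval_squared(dt, dx=0.0, dy=0.0, dz=0.0):
    """Minkowski interval ds^2 = dt^2 - dx^2 - dy^2 - dz^2 (units with c = 1)."""
    return dt**2 - dx**2 - dy**2 - dz**2


def classify_interval(ds2, tol=1e-12):
    """Label an interval using the sign convention of equations (26)-(28).

    tol is an illustrative tolerance for treating ds^2 as exactly zero.
    """
    if ds2 > tol:
        return "timelike"
    if ds2 < -tol:
        return "spacelike"
    return "null"


print(classify_interval(interval_squared(dt=1.0)))          # separated only in time: timelike
print(classify_interval(interval_squared(dt=0.0, dx=1.0)))  # separated only in space: spacelike
print(classify_interval(interval_squared(dt=1.0, dx=1.0)))  # light ray along x: null
```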

With this in mind let’s take a look at how particles travel through spacetime in practice.

Figure 4: A lightcone on a spacetime diagram (with y and z directions suppressed). Points that are spacelike, lightlike, and timelike separated from the origin are indicated.

3.2 Transforming from a Cartesian basis

In Minkowski space, particles travel in straight lines unless they are acted upon by an external force. In more general spacetimes, the concept of a “straight line” gets replaced by a “geodesic” – that is to say, particles follow the shortest path between two points in spacetime. This fact follows immediately from the WEP; after all, in a flat space a straight line is the shortest distance between two points.

Abstract arguments to one side, let’s see what the maths looks like in practice. We can start by generalizing Newton’s law with no forces, $d^2\boldsymbol{x}/dt^2 = 0$, under the assumption that there is an underlying Cartesian description of what’s going on. For a concrete example, you could imagine particle motion in a Euclidean 2D plane. Equations of motion in Cartesian coordinates $x_C^i = (x, y)$ for a free particle are

$$\frac{d^2 x_C^i}{dt^2} = 0. \tag{29}$$

Here, I’m using $x_C^i$ to specifically indicate Cartesian coordinates so that we can ask what the equations of motion are in a general coordinate system $x^i$. For example, we could think about polar coordinates, $x^i = (r, \theta)$. Crucially, the basis vectors for polar coordinates, $\hat{\boldsymbol{r}}$ and $\hat{\boldsymbol{\theta}}$, vary in the plane, so

$$\frac{d^2 x_C^i}{dt^2} = 0 \quad \text{does \textbf{not} imply} \quad \frac{d^2 x^i}{dt^2} = 0. \tag{30}$$
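Statement (30) can be checked numerically. The sketch below takes one particular free-particle path (the straight line $x = 1$, $y = t$, chosen purely for illustration) and estimates second derivatives by central finite differences. The Cartesian accelerations vanish, while $d^2r/dt^2$ does not. The helper names and the step size are assumptions of this sketch.

```python
import math


def cartesian_path(t):
    """An illustrative free particle: x = 1, y = t, so both Cartesian accelerations vanish."""
    return 1.0, t


def polar_path(t):
    """The same path expressed in polar coordinates (r, theta)."""
    x, y = cartesian_path(t)
    return math.hypot(x, y), math.atan2(y, x)


def second_derivative(f, t, h=1e-4):
    """Central finite-difference estimate of d^2 f / dt^2."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


t0 = 0.0
x_ddot = second_derivative(lambda t: cartesian_path(t)[0], t0)
y_ddot = second_derivative(lambda t: cartesian_path(t)[1], t0)
r_ddot = second_derivative(lambda t: polar_path(t)[0], t0)

print(x_ddot, y_ddot)  # both zero: equation (29) holds
print(r_ddot)          # close to 1, not zero: r(t) = sqrt(1 + t^2) is not linear in t
```

For this path $r(t) = \sqrt{1 + t^2}$, so $d^2r/dt^2 = 1$ at $t = 0$ even though no force acts.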

To derive the correct equation, let us start by expanding the Cartesian equation using the chain rule (just as previously applied in Eq. (11)):

$$\frac{dx_C^i}{dt} = \left(\frac{\partial x_C^i}{\partial x^j}\right)\frac{dx^j}{dt}, \tag{31}$$

where the term in the brackets on the RHS is a transformation matrix going from one basis to another. For example, in the case that $x^i$ represents polar coordinates, the relationship between Cartesian and polar coordinates reads $x_C^1 = x^1\cos x^2$, $x_C^2 = x^1\sin x^2$, giving a transformation matrix

$$\frac{\partial x_C^i}{\partial x^j} = \begin{pmatrix} \cos x^2 & -x^1\sin x^2 \\ \sin x^2 & x^1\cos x^2 \end{pmatrix}. \tag{32}$$
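As a quick sanity check on the matrix (32), the following sketch compares its analytic entries against finite-difference derivatives of the map $(r, \theta) \mapsto (x, y)$. The function names, the test point and the step size are illustrative assumptions.

```python
import math


def to_cartesian(r, theta):
    """Cartesian coordinates (x, y) of the point with polar coordinates (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)


def jacobian(r, theta):
    """Analytic matrix (32): rows are i = (x, y), columns are j = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def jacobian_numeric(r, theta, h=1e-6):
    """The same matrix estimated with central finite differences."""
    x_rp, y_rp = to_cartesian(r + h, theta)
    x_rm, y_rm = to_cartesian(r - h, theta)
    x_tp, y_tp = to_cartesian(r, theta + h)
    x_tm, y_tm = to_cartesian(r, theta - h)
    return [[(x_rp - x_rm) / (2 * h), (x_tp - x_tm) / (2 * h)],
            [(y_rp - y_rm) / (2 * h), (y_tp - y_tm) / (2 * h)]]


analytic = jacobian(2.0, 0.7)
numeric = jacobian_numeric(2.0, 0.7)
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(2) for j in range(2))
print(max_error)  # small: limited only by finite-difference and rounding error
```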

Whatever the new basis, the geodesic equation now reads

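$$0 = \frac{d}{dt}\left[\frac{dx_C^i}{dt}\right] = \frac{d}{dt}\left[\frac{\partial x_C^i}{\partial x^j}\frac{dx^j}{dt}\right] = \frac{d}{dt}\left[\frac{\partial x_C^i}{\partial x^j}\right]\frac{dx^j}{dt} + \frac{\partial x_C^i}{\partial x^j}\frac{d^2x^j}{dt^2}, \tag{33}$$

where we’ve applied the product rule to produce two terms. If the transformation were linear, the first of these would vanish and the geodesic equation in the new basis would be $d^2x^i/dt^2 = 0$, just as in the original $x_C^i$ basis. But in polar coordinates (or, in general, other non-Cartesian coordinate systems), the transformation is not linear, so we need to calculate the first term explicitly using the chain rule:

$$\frac{d}{dt}\left[\frac{\partial x_C^i}{\partial x^j}\right] = \frac{dx^k}{dt}\frac{\partial^2 x_C^i}{\partial x^k\,\partial x^j}.$$

The geodesic equation therefore becomes

$$\frac{\partial x_C^i}{\partial x^j}\frac{d^2x^j}{dt^2} + \frac{\partial^2 x_C^i}{\partial x^j\,\partial x^k}\frac{dx^k}{dt}\frac{dx^j}{dt} = 0. \tag{34}$$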

We wanted an equation for $d^2x^i/dt^2$, and we’re almost there, but first we need to remove the factor $\partial x_C^i/\partial x^j$ in front of the first term. This is accomplished by multiplying by its inverse, which is $\partial x^j/\partial x_C^k$:

$$\frac{\partial x_C^i}{\partial x^j}\frac{\partial x^k}{\partial x_C^i} = \delta^k_j, \tag{35}$$

where $\delta^k_j$ is the Kronecker delta, equation (18). So, multiplying (34) by $\partial x^\ell/\partial x_C^i$, we get

$$\frac{d^2x^\ell}{dt^2} + \left[\frac{\partial x^\ell}{\partial x_C^i}\frac{\partial^2 x_C^i}{\partial x^j\,\partial x^k}\right]\frac{dx^k}{dt}\frac{dx^j}{dt} = 0. \tag{36}$$

The term in the brackets doesn’t care about the details of the path being followed, only about the relationship between the Cartesian and general coordinate system. It’s therefore often useful to tabulate its values for each possible $j$, $k$ and $\ell$. The result is known as the Christoffel symbol,

$$\Gamma^\ell_{jk} \equiv \frac{\partial x^\ell}{\partial x_C^i}\frac{\partial^2 x_C^i}{\partial x^j\,\partial x^k}, \tag{37}$$

and is symmetric in $j,k$ (so there are not quite so many independent components as it would first appear). In Cartesian coordinates, $\Gamma^\ell_{jk} = 0$, and the geodesic equation boils back down to where we started, $d^2x_C^i/dt^2 = 0$. In general, $\Gamma^\ell_{jk} \neq 0$ describes geodesics in non-trivial coordinate systems.

Exercise 3A Use expression (36), together with the polar coordinates transformation matrix given by (32), to work out the equations of motion for a particle in circular polar coordinates when no forces are applied. Interpret your equations in terms of what you know about circular motion.

3.3 Beyond flat space: time and curvature

The geodesic equation is powerful when combined with the equivalence principle. So far, we have shown how to express straight lines in any coordinate system assuming the underlying space is flat (i.e. can be described by a global set of cartesian coordinates). But look at the form of equation (36). It depends only on the local relationship between the two sets of coordinates (i.e. on derivatives, not the actual values of the coordinates). And the equivalence principle says that in a local frame, particles in curved space behave just like they do in flat space… therefore, particles in curved space must follow an equation of this form, too!

We just need to get one last point ironed out before we can move into curved space. So far, we have treated time t as an absolute quantity. In relativity, this is no longer the case; t is a coordinate on the same footing as x or r or θ and so on. That necessitates two changes:

  • we must allow indices to range from 0 to 3 to include time and space;

  • we introduce a parameter, traditionally λ, to describe the path through spacetime (Figure 5). We normally insist that λ monotonically increases with time t, but otherwise the relationship between them is flexible.

Figure 5: A particle’s path parametrized by λ, which monotonically increases from its initial value λ1 to its final value λ2.

With these changes the geodesic equation immediately generalizes:

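$$\frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda} = 0. \tag{38}$$

A particle’s path through spacetime, such as the one described by $x^\mu(\lambda)$ above, is sometimes called a worldline.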

3.4 From Christoffel symbols to the connection

It’s all very well having something like equation (38) in hand, but we’re going to need to calculate the Christoffel symbols for every situation. From the machinery we have looked at so far, that would involve (1) constructing a set of freely falling coordinates ${x'}^\alpha$ (playing the role of the $x_C^\alpha$ from Section 3.2); (2) specifying the transformation between the original set of coordinates $x^\alpha$ and the new ${x'}^\alpha$; (3) calculating the Christoffel symbols using expression (37).

How do we know what freely falling coordinates look like? They’re the coordinates in which particles explicitly obey the laws of special relativity. Mathematically, the metric in these coordinates looks like $\eta_{\mu\nu}$. So, if we’re given some spacetime (like the expanding universe) with a specified metric, we need to find the transformation to the coordinates in which the metric has this form. For any curved spacetime, such a transformation doesn’t exist globally; it has to be constructed locally around each point.

But luckily (and perhaps surprisingly) it’s possible to do this generally – for any set of coordinates – once and for all. The result is:

$$\Gamma^\mu_{\alpha\beta} = \frac{g^{\mu\nu}}{2}\left[\frac{\partial g_{\alpha\nu}}{\partial x^\beta} + \frac{\partial g_{\beta\nu}}{\partial x^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial x^\nu}\right]. \tag{39}$$

Note there is no explicit reference to the freely-falling coordinate system $x'$ here: equation (39) is a recipe for jumping straight from the spacetime metric $g_{\mu\nu}$ to the geodesic equation. The next exercise sheds some light on how that’s possible.

Exercise 3B Verify that Eq. (37) is consistent with (39). Hint: (1) assume that some coordinates ${x'}^\alpha$ exist with metric $g'_{\alpha\beta} = \eta_{\alpha\beta}$ at a given point, and with $\partial g'_{\alpha\beta}/\partial {x'}^\gamma = 0$; (2) write down the relationship between the original metric $g_{\alpha\beta}$ and the local free-fall metric $g'_{\mu\nu}$; (3) insert your expression for $g_{\alpha\beta}$ into equation (39) and verify that it is consistent with the original definition (37).

When derived from the metric as in (39), the Christoffel symbols are known as the connection (because they help us to ‘connect’ particle behaviour between different patches of spacetime).
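To illustrate how equation (39) turns a metric into connection components, here is a minimal numerical sketch. It uses the unit 2-sphere, with metric $\mathrm{diag}(1, \sin^2\theta)$ in coordinates $(\theta, \phi)$, as a stand-in example. That choice is an assumption of this sketch, not a metric used in the notes, and the metric derivatives are estimated by finite differences rather than computed analytically. All function names are hypothetical.

```python
import math


def sphere_metric(x):
    """Unit 2-sphere metric in coordinates x = (theta, phi): diag(1, sin^2 theta)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the components g^{mu nu}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of d g_ab / d x^k, returned as a 2x2 list."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]


def christoffel(metric, x):
    """Gamma[mu][alpha][beta] evaluated from equation (39) in two dimensions."""
    n = 2
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][a][b] = d g_ab / d x^k
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for al in range(n):
            for be in range(n):
                gamma[mu][al][be] = 0.5 * sum(
                    g_inv[mu][nu] * (dg[be][al][nu] + dg[al][be][nu] - dg[nu][al][be])
                    for nu in range(n))
    return gamma


theta0 = 1.0
gamma = christoffel(sphere_metric, (theta0, 0.3))
print(gamma[0][1][1], -math.sin(theta0) * math.cos(theta0))  # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta0) / math.sin(theta0))   # Gamma^phi_{theta phi}
```

The two printed pairs should agree closely: for this metric the non-zero components are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$.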

Non-examinable aside: The requirements put on the $x'$ frame, namely that $g'_{\alpha\beta} = \eta_{\alpha\beta}$ and $\partial g'_{\alpha\beta}/\partial {x'}^\gamma = 0$ at a given point, are something like saying “these coordinates are like Minkowski space to sufficient accuracy in the locality of our point that we can apply the equivalence principle”. But you might wonder whether we really have enough freedom in the construction of a coordinate system to be always able to satisfy these conditions starting from any point of any spacetime. The answer is yes: in four dimensions, these two conditions generate 50 constraints, while there are actually 56 degrees of freedom in the transformation. The additional 6 degrees of freedom are taken up by the Lorentz group, i.e. rotations and boosts that preserve the local inertial frame.
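One way to arrive at the numbers quoted in the aside is sketched below. It assumes the usual counting: a symmetric metric has $n(n+1)/2$ independent components, its first derivatives have $n$ times as many, and the transformation contributes its $n^2$ first derivatives plus its second derivatives, which are symmetric in the two differentiation indices. The function name is illustrative.

```python
def local_flatness_counting(n):
    """Return (constraints, freedoms, leftover) for an n-dimensional spacetime."""
    sym = n * (n + 1) // 2       # independent components of a symmetric n x n matrix
    constraints = sym + n * sym  # g' = eta, plus vanishing first derivatives of g'
    freedoms = n * n + n * sym   # first derivatives plus symmetric second derivatives of the transformation
    return constraints, freedoms, freedoms - constraints


print(local_flatness_counting(4))  # (50, 56, 6)
```

The leftover $n(n-1)/2$ equals 6 in four dimensions, matching the three rotations and three boosts of the Lorentz group.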

3.5 Particles in an expanding universe

Most measurements we make in cosmology have to do with intercepting photons which arrive at Earth. They have been emitted at various epochs during the evolution of the universe. As discussed earlier, we can think of these photons as particles that travel along null geodesics. So, let’s apply the geodesic equation to a particle travelling through an expanding universe. First, we’ll need the components of the connection.

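📝 Exercise 3C Use the flat FRW metric (23) to show that the components of the connection are:

$$\Gamma^0_{00} = 0, \quad \Gamma^0_{0i} = \Gamma^0_{i0} = 0, \quad \Gamma^0_{ij} = \delta_{ij}\,\dot{a}a, \quad \Gamma^i_{0j} = \Gamma^i_{j0} = \frac{\dot{a}}{a}\,\delta^i_j, \quad \Gamma^i_{jk} = 0, \quad \Gamma^i_{00} = 0, \tag{40}$$

where overdots denote $d/dt$.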

With these in hand, we can turn equation (38) into an explicit second-order differential equation for the spacetime coordinates of the photon, $x^\alpha$. What happens next depends on the exact physical question we’re posing; here let’s consider how the particle energy changes as the universe expands. Just as in special relativity, a massless particle has energy-momentum 4-vector $p^\alpha$, where

$$p^\alpha = \frac{dx^\alpha}{d\lambda}. \tag{41}$$

In special relativity, the energy is given by $E = p^0$; the equivalence principle means this is still true in the expanding universe (see footnote 1).

Footnote 1: Take care, though: technically one has to construct a local inertial frame before making a statement like this. The coordinate frame of FRW is certainly not inertial. There are a number of equivalent ways to see that the statement $E = p^0$ still holds true nonetheless in this case; for instance one can construct a frame-independent expression like $E = p_\mu u^\mu$ for an observer 4-vector $u^\mu$. Take a look at Section 3.5 of Carroll for detail.

We can eliminate $\lambda$ by noting that $\frac{d}{d\lambda} = \frac{dx^0}{d\lambda}\frac{d}{dx^0} = E\frac{d}{dt}$. The 0-component of the geodesic equation (38), with the connection components (40), then reads

$$E\frac{dE}{dt} = -\Gamma^0_{ij}\,p^i p^j = -\delta_{ij}\,\dot{a}a\,p^i p^j. \tag{42}$$

For a massless particle, the energy-momentum vector has zero magnitude:

$$g_{\mu\nu}\,p^\mu p^\nu = E^2 - \delta_{ij}\,a^2 p^i p^j = 0. \tag{43}$$

We can use this to eliminate $\delta_{ij}\,p^i p^j$ in favour of $E$ from (42):

$$\frac{dE}{dt} + \frac{\dot{a}}{a}E = 0. \tag{44}$$

This expression implies that the energy of a massless particle decreases as the universe expands:

$$E \propto \frac{1}{a}. \tag{45}$$
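The step from (44) to (45) can be checked numerically by integrating (44) for some chosen expansion history and confirming that $E\,a$ stays constant. The sketch below assumes $a(t) = t^{2/3}$ purely as an example (this choice is not taken from the notes) and uses a hand-written fourth-order Runge-Kutta integrator. All names and step counts are illustrative.

```python
def scale_factor(t):
    """Illustrative expansion history a(t) = t^(2/3); an assumption of this sketch."""
    return t ** (2.0 / 3.0)


def hubble_rate(t, h=1e-6):
    """(da/dt) / a, with da/dt estimated by a central finite difference."""
    return (scale_factor(t + h) - scale_factor(t - h)) / (2.0 * h * scale_factor(t))


def energy_rhs(t, energy):
    """Right-hand side of equation (44): dE/dt = -(adot / a) E."""
    return -hubble_rate(t) * energy


def evolve_energy(e_start, t_start, t_end, steps=2000):
    """Integrate equation (44) from t_start to t_end with fixed-step RK4."""
    dt = (t_end - t_start) / steps
    t, energy = t_start, e_start
    for _ in range(steps):
        k1 = energy_rhs(t, energy)
        k2 = energy_rhs(t + 0.5 * dt, energy + 0.5 * dt * k1)
        k3 = energy_rhs(t + 0.5 * dt, energy + 0.5 * dt * k2)
        k4 = energy_rhs(t + dt, energy + dt * k3)
        energy += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return energy


# Between t = 1 and t = 8 the assumed scale factor grows from 1 to 4.
e_final = evolve_energy(1.0, 1.0, 8.0)
print(e_final * scale_factor(8.0))  # stays close to E * a = 1 at the start
```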

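The result we have derived accords with the intuition from a handwaving argument: $E \propto \lambda^{-1}$ (where $\lambda$ is here the wavelength, not the path parameter) and $\lambda \propto a$ is stretched along with the expansion. The frequency of a photon emitted with frequency $\nu_{\rm em}$ will therefore be observed with a lower frequency $\nu_{\rm obs}$ as the universe expands:

$$\frac{\nu_{\rm obs}}{\nu_{\rm em}} = \frac{\lambda_{\rm em}}{\lambda_{\rm obs}} = \frac{a_{\rm em}}{a_{\rm obs}}. \tag{46}$$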

A common question at this point is: where does the energy of this photon go? As ever in physics, there are a few different ways to approach this question but I think the simplest answer is to recognise that energy conservation is not an absolute law. Noether’s theorem (see Section 7) shows us that conservation laws reflect symmetries; in particular, energy conservation relates to time-translation invariance. In an expanding universe the time-translation symmetry is lost and consequently so too is energy conservation.

3.6 Redshift

Cosmologists like to speak of the above effect in terms of the redshift z between two events, defined by the fractional change in wavelength:

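$$z_{\rm em} \equiv \frac{\lambda_{\rm obs} - \lambda_{\rm em}}{\lambda_{\rm em}}. \tag{47}$$

If the observation takes place today ($a_{\rm obs} = a_0 = 1$), this implies

$$a_{\rm em} = \frac{1}{1 + z_{\rm em}}. \tag{48}$$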

So the redshift of an object tells us immediately the scale factor when the photon was emitted. For that reason, measuring the redshifts of distant objects is one of the most fundamentally important tools in the observer’s kit.

How can this be achieved in practice? Ideally, one splits the light into a spectrum and then examines features in that spectrum, such as emission or absorption lines. Because these are made by quantum transitions in atoms, their rest-frame wavelengths ($\lambda_{\rm em}$) are well-known. Especially if multiple lines at different values of $\lambda_{\rm em}$ are present, this allows $z_{\rm em}$ to be reliably inferred.
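As a minimal sketch of equations (47) and (48) in use, the code below infers a redshift from two spectral lines and converts it to the scale factor at emission. The rest-frame wavelengths are approximate values for the hydrogen H-alpha and H-beta lines in air. The observed wavelengths are invented for illustration and do not correspond to a real measurement. The function names are hypothetical.

```python
def redshift(lambda_obs, lambda_em):
    """Equation (47): fractional change in wavelength."""
    return (lambda_obs - lambda_em) / lambda_em


def scale_factor_at_emission(z_em):
    """Equation (48), assuming the observation is made today with a_0 = 1."""
    return 1.0 / (1.0 + z_em)


# Approximate rest-frame wavelengths in nm (air values).
REST_NM = {"H-alpha": 656.28, "H-beta": 486.13}

# Hypothetical observed wavelengths in nm, invented for this example.
observed_nm = {"H-alpha": 1312.56, "H-beta": 972.26}

estimates = {line: redshift(observed_nm[line], REST_NM[line]) for line in REST_NM}
z_mean = sum(estimates.values()) / len(estimates)

print(estimates)                         # both lines give z close to 1
print(scale_factor_at_emission(z_mean))  # close to 0.5
```

Agreement between the estimates from different lines is what makes a spectroscopic redshift reliable: a misidentified line would give an inconsistent value.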

Sometimes taking a spectrum isn’t practical (especially for dim objects, or with large surveys) and one instead has to estimate zem just from photometry in different broad bands. This is known as “photometric redshift” estimation, and people spend entire careers working on it.

Non-examinable aside: This redshift is not quite the same as the conventional Doppler effect. In particular, it doesn’t really make sense to talk about very distant objects in the universe travelling at some fraction of the speed of light, since comparing speeds of two objects can only be done within a single inertial frame. Since a global inertial frame doesn’t exist for any curved spacetime, any statement about recession velocities runs the risk of generating severe confusion. Some people instead like to talk about the photon being ‘stretched by the expansion of space’ (and certainly this leads to the right expectation, as we discussed above) but again it’s not really physics, since it’s unclear quite how or why expanding space would have this effect. People like to have long debates about the best view of what it all means. But really, the only safe thing to say is that “the redshift in an expanding universe follows from the equivalence principle”; at least according to the shut-up-and-calculate view (see footnote 2), no other explanation is necessary.

Footnote 2: https://en.wikipedia.org/wiki/Interpretations_of_quantum_mechanics#Instrumentalist_description

3.7 Geodesics from the principle of least action

The most famous property of a straight line in flat space is that it is the shortest route between two points. Therefore, in curved spacetime, we might argue that particles should continue to follow the shortest route between two points (see footnote 3). The argument is valid and leads back to the same spacetime path that we derived directly from the equivalence principle above. But mathematically we now have a minimisation problem, which provides an alternative perspective and a convenient calculational tool as we’ll see below.

Footnote 3: One can formalise this slightly imprecise motivation using the equivalence principle.

Recap of the Euler-Lagrange equations

Such problems are already familiar in physics from the principle of least action, which we will require in its own right later on. Setting aside GR for one moment let’s begin with the familiar example of the classical mechanics of a single particle in 1D with coordinate q(t). The equations of motion for such a particle can be derived by searching for the trajectory q(t) that makes the action S a minimum:

$$q(t)\ \text{minimises}\ S, \quad \text{where} \quad S = \int_{t_0}^{t_1} dt\, L(q, \dot{q}). \tag{49}$$

Here the function $L(q, \dot{q})$ is known as the Lagrangian, and we imagine that we know where the particle is at times $t_0$ and $t_1$. This formulation seems terribly abstract if you haven’t seen it before but it’s just an efficient (and, in the professional literature, ubiquitous) way of stating a variety of known laws of physics (see footnote 4). By stating just the form of $L$ (which is often surprisingly compact), we will be able to derive laws and many properties of those laws with far greater ease than other techniques allow.

Footnote 4: In particular, you might be wondering why it’s a function of $q$ and $\dot{q}$ but not, say, $\ddot{q}$. The answer is a bit disappointing: simply, to describe physics as we know it, we’ll need things of this form. When we crunch through the maths, you’ll see “normal” equations from physics start to emerge. The next question one thinks of is something like: “but why are equations of physics of this form that can be derived in such a particular way?” It’s an interesting enough question, but not one that we can afford to get hung up on in this course.

Without specifying anything further about $L$ or $S$ for the moment, let’s see what it means to find the minimum of $S$ with respect to the function $q(t)$. Just like finding a minimum of the curve $A(x)$, where you’d set $\partial A/\partial x = 0$ (meaning that $A$ does not change for a slight change in $x$), we need to state that $\delta S = 0$ for any small change in the $q(t)$ function. This isn’t actually too hard: by the chain rule (or, equivalently, by Taylor expanding $L$ in $q$ and $\dot{q}$),

$$0 = \delta S = \int_{t_0}^{t_1} dt \left( \frac{\partial L}{\partial q}\,\delta q(t) + \frac{\partial L}{\partial \dot{q}}\,\delta\dot{q}(t) \right), \tag{50}$$

where $\delta q(t)$ is a function expressing a small change in the original function $q(t)$, and $\delta\dot{q}(t)$ is the corresponding small change in the original function’s time derivative $\dot{q}(t)$. These are related by:

$$\delta\dot{q}(t) = \frac{d(\delta q(t))}{dt}. \tag{51}$$

😇 Exercise 3D Derive equation (51), i.e. convince yourself that $\delta$ and $d/dt$ commute.

The clever part of the principle of least action comes from using this relation to turn (50) into an equation of motion. The second term is integrated by parts, using relation (51):

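$$0 = \delta S = \int_{t_0}^{t_1} dt \left( \frac{\partial L}{\partial q}\,\delta q(t) - \frac{d}{dt}\left[\frac{\partial L}{\partial \dot{q}}\right]\delta q(t) \right) + \left[\frac{\partial L}{\partial \dot{q}}\,\delta q(t)\right]_{t_0}^{t_1}. \tag{52}$$

Remember that we said we know where the particle is at $t_0$ and $t_1$; we’re searching for its correct trajectory at intermediate times. Therefore, $\delta q(t_0) = \delta q(t_1) = 0$ and the boundary term vanishes. With a minor refactorisation inside the integral, we get:

$$0 = \delta S = \int_{t_0}^{t_1} dt\,\delta q(t) \left( \frac{\partial L}{\partial q} - \frac{d}{dt}\left[\frac{\partial L}{\partial \dot{q}}\right] \right). \tag{53}$$

Now this must be true for any slight change in $q$ obeying the boundary conditions, i.e. for any $\delta q(t)$ in the range $t_0 < t < t_1$. Accordingly, the term in the round brackets must be zero at every $t$ and we have derived the Euler-Lagrange equations:

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) = 0. \tag{54}$$

To actually get some physics, we need to specify $L(q, \dot{q})$. The Lagrangian for Newtonian point-particle problems is typically of the form $L = K - V$, where $K$ is the kinetic energy and $V$ is the potential energy. For example, $L = \frac{1}{2}m\dot{q}^2 - mV(q)$, where $V(q)$ now denotes the potential energy per unit mass, leads to the equation of motion

$$m\ddot{q} = -m\frac{dV}{dq}, \tag{55}$$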

which you should recognise as Newton’s second law for a particle in a 1D potential V(q).
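The statement that the physical trajectory makes the action stationary can be illustrated numerically. The sketch below discretises the action (49) for $L = \frac{1}{2}m\dot{q}^2 - mV(q)$ with the assumed potential $V(q) = gq$ (a uniform field, chosen only for illustration), evaluates it on the solution of (55) with fixed end points, and compares it with paths perturbed by $\pm\epsilon\sin(\pi t/T)$, which vanish at both ends. All constants and names are illustrative assumptions.

```python
import math

# Illustrative values: mass, uniform field strength, duration, number of time steps.
M, G, T, N = 1.0, 9.81, 1.0, 200


def action(path, dt):
    """Discretised S = sum of [(1/2) m qdot^2 - m g q] dt, i.e. V(q) = g q."""
    total = 0.0
    for q0, q1 in zip(path[:-1], path[1:]):
        qdot = (q1 - q0) / dt
        q_mid = 0.5 * (q0 + q1)
        total += (0.5 * M * qdot**2 - M * G * q_mid) * dt
    return total


dt = T / N
times = [i * dt for i in range(N + 1)]

# Solution of equation (55), qddot = -g, with q(0) = q(T) = 0.
true_path = [0.5 * G * t * (T - t) for t in times]


def perturbed(eps):
    """True path plus a small bump that vanishes at both end points."""
    return [q + eps * math.sin(math.pi * t / T) for q, t in zip(true_path, times)]


S_true = action(true_path, dt)
dS_plus = action(perturbed(0.1), dt) - S_true
dS_minus = action(perturbed(-0.1), dt) - S_true

print(dS_plus, dS_minus)  # both positive and nearly equal: no first-order change in S
```

Both perturbations raise the action by nearly the same amount, so the change in $S$ has no term linear in $\epsilon$, which is the content of $\delta S = 0$.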

Exercise 3E Verify equation (55) can be obtained by applying the Euler-Lagrange equation (54) to $L = \frac{1}{2}m\dot{q}^2 - mV(q)$.

Why would we choose to go to all these lengths just to rederive Newton’s law from such an abstract framework? There are a number of reasons. For one, it makes it remarkably easy to pull out conservation laws (energy, momentum etc) and see deeper reasons for their existence (the aforementioned Noether’s theorem). But for another, it makes dealing with different coordinate systems vastly easier. For example, what if you need to derive Newton’s laws in a rotating frame? Doing the coordinate transformation manually in full generality is a real nightmare. But with the Euler-Lagrange approach, there is no problem: simply write the potential energy and kinetic energy in terms of the variables you want, and everything will fall out. (We don’t have time to cover this in the present course, but find a good textbook and do it yourself if you’re curious and haven’t seen it before.) Of course, seamlessly handling weird-looking coordinate systems is a very desirable property in GR, so it pays to revisit the problem of geodesics with our new machinery.

Geodesics from the Euler-Lagrange equations

If, as claimed at the start of this section, we can derive the path of a particle in GR by searching for geodesics through spacetime, the action becomes just the length of the spacetime curve, which we can write as $\int ds$. If the particle trajectory is parametrised as $x^\alpha(\lambda)$ as before, and the metric is $g_{\mu\nu}$, the spacetime length of the curve is

$$S = \int ds = \int_{\lambda_1}^{\lambda_2} \sqrt{g_{\mu\nu}\,{x'}^\mu {x'}^\nu}\;d\lambda, \tag{56}$$

where ${x'}^\mu \equiv dx^\mu/d\lambda$ (see footnote 5). We could apply the Euler-Lagrange equations directly to this expression, with $\lambda$ playing the role of the time coordinate $t$ from the Newtonian case earlier in Section 3.7. Note that $t$ may now be one of the coordinates in $x^\alpha$, so it’s critical to distinguish $t$ from $\lambda$. Writing out the correct formula explicitly for completeness:

$$\frac{\partial L}{\partial x^\alpha} - \frac{d}{d\lambda}\left(\frac{\partial L}{\partial {x'}^\alpha}\right) = 0. \tag{57}$$

Footnote 5: This notation, where ${x'}^\mu$ means the derivative of $x^\mu$ with respect to $\lambda$, should not be confused with our earlier use of ${x'}^\mu$ to mean $x^\mu$ in a different coordinate system. Unfortunately in cosmology we often run out of symbols and meanings have to be redefined in different contexts.

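Unfortunately the square-root in equation (56) creates ugly algebra when taking derivatives. There is a simple trick that gets rid of that problem: we insist that our final geodesic will be parametrised such that $g_{\mu\nu}\,{x'}^\mu {x'}^\nu$ is constant. As a result, the solution to the E-L equations will satisfy

$$\begin{aligned}
0 &= \int_{\lambda_1}^{\lambda_2} d\lambda\;\delta\!\left(\sqrt{g_{\mu\nu}\,{x'}^\mu {x'}^\nu}\right) \\
&= \int_{\lambda_1}^{\lambda_2} d\lambda\;\frac{1}{2\sqrt{g_{\mu\nu}\,{x'}^\mu {x'}^\nu}}\;\delta\!\left(g_{\mu\nu}\,{x'}^\mu {x'}^\nu\right) \\
&= \text{constant}\times\frac{1}{2}\int_{\lambda_1}^{\lambda_2} d\lambda\;\delta\!\left(g_{\mu\nu}\,{x'}^\mu {x'}^\nu\right).
\end{aligned} \tag{58}$$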

This particular choice of normalisation for ${x'}^\mu$ is known as an affine parametrisation; see Carroll or your other GR courses for more information. The main point for our purposes is that, as a consequence of (58), we can equally well use the alternative action

$$S = \frac{1}{2}\int_{\lambda_1}^{\lambda_2} g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}\,d\lambda, \tag{59}$$

leading to far simpler but equivalent equations of motion for the geodesic from the Euler-Lagrange equations.

For example, returning to the flat FRW metric (23), we have the explicit action

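$$S = \frac{1}{2}\int_{\lambda_1}^{\lambda_2} d\lambda \left( {t'}(\lambda)^2 - a^2(t)\,\delta_{ij}\,{x'}^i(\lambda)\,{x'}^j(\lambda) \right). \tag{60}$$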

This allows us to neatly bypass calculation of the Christoffel symbols and rederive an earlier result:

📝 Exercise 3F

Apply the Euler-Lagrange equation (57) for $t(\lambda)$ to the action (60), using the null vector constraint (43) to show that

$$\frac{dE}{dt} = -\frac{\dot{a}}{a}E, \tag{61}$$

thus rederiving the redshift result (44). (Hint: remember that $E = dt/d\lambda$.)

😇 Exercise 3G Starting from the action for a general metric, eq. (59), rederive the geodesic equation (38). This confirms that the two approaches are exactly equivalent for all metrics.
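As a complement to Exercise 3F, which uses the $t(\lambda)$ equation, here is a short worked sketch of what the Euler-Lagrange equation (57) gives for the spatial coordinates of the flat FRW action (60). It is offered as an illustration of the method rather than as part of the original derivation; the integrand of (60) is written as $L$ and the index $k$ is introduced only for this example.

```latex
% Integrand of the action (60)
L = \tfrac{1}{2}\left( {t'}^2 - a^2(t)\,\delta_{ij}\,{x'}^i {x'}^j \right)

% L does not depend on the spatial coordinates themselves, only on their derivatives
\frac{\partial L}{\partial x^k} = 0,
\qquad
\frac{\partial L}{\partial {x'}^k} = -a^2(t)\,\delta_{kj}\,{x'}^j

% Equation (57) for x^k therefore reduces to a conservation law
\frac{d}{d\lambda}\left( a^2\,\delta_{kj}\,{x'}^j \right) = 0
\quad\Longrightarrow\quad
a^2\,p^k = \text{constant along the path}
```

Because the metric has no dependence on $x^k$, the combination $a^2 p^k$ is conserved. Inserting this into the null constraint (43), $E^2 = a^2\,\delta_{ij}\,p^i p^j$, gives $E^2 \propto a^{-2}$, consistent with (45).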

3.8 Summary

  • The equivalence principle is powerful (even in its weak form) as it allows us to map as follows:

    Theory:                    Special relativity    General relativity
    Minkowski metric valid:    Globally              Locally
    Free particles follow:     Straight line         Spacetime geodesic
  • According to the equivalence principle, particles that are only under the influence of gravity move in a local inertial frame as though no forces are acting. The non-trivial path they follow arises because of the interrelationships between these local inertial frames which become highly non-trivial in a curved spacetime.

  • We can calculate paths in a more convenient coordinate system by relating it back to a series of local inertial frames. This involves a connection $\Gamma^\mu_{\alpha\beta}$ which can be calculated from the metric for our chosen coordinate system; equation (39). (To be completely clear, there is no requirement to memorise this equation.)

  • One can also take a completely different approach to the trajectory calculation through the insight that particles must follow a path that makes the distance through spacetime a local maximum (with our sign convention; a minimum with the opposite signature). This, too, follows from the equivalence principle because in a flat spacetime particles have straight worldlines – such straight worldlines maximise the spacetime pathlength.

  • Because of this “extreme distance” property, the path taken by particles is known as a geodesic (in analogy to the shortest path between two places on the Earth’s surface).

  • Solving the extreme distance problem boils down to applying the Euler-Lagrange equations (57) to the action (59). To check the logic is sound, one can prove this does give equivalent results to using the connection-based approach (see Exercises 3F and 3G).

  • Using the geodesic equation coupled to the simple expanding spacetime, equation (23), one can derive that photons redshift as the universe expands. This is an illustration that energy need not be conserved in an expanding universe.

  • Energy conservation corresponds to time-translation invariance by Noether’s theorem, but the background metric has a time-varying scale factor, so energy conservation is violated. Later we will see what conservation laws GR does permit.

  • Being able to infer the redshift of an object is crucial for observers. The most precise way to accomplish this is by producing a spectrum of the observed object, and comparing its atomic emission or absorption lines with known rest-frame wavelengths.