Let’s now attempt to apply the WEP in practice, given a spacetime with some known metric. First recall that the metric allows us to define an infinitesimal squared proper distance according to the equation
$$ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu \tag{25}$$
where $g_{\mu\nu}$ is the metric tensor, and $x^\mu$ are the coordinates (so $dx^\mu$ are the infinitesimal changes in the coordinates). As an aside, you should be aware that there are three possible kinds of resulting intervals (see Fig. 4):
$$\text{spacelike:}\quad ds^2 > 0 \tag{26}$$
$$\text{null / lightlike:}\quad ds^2 = 0 \tag{27}$$
$$\text{timelike:}\quad ds^2 < 0 \tag{28}$$
The distinction is a physical, not just a mathematical one. First, let’s consider why $ds^2 = 0$ can be called “lightlike”. In our units where $c = 1$, light travelling along the $x$ axis in Minkowski space would obey $dx = \pm dt$. So, it has $ds^2 = -dt^2 + dx^2 = 0$. The WEP immediately allows us to generalize this result and say that it applies in any spacetime, for light travelling in any direction. That justifies the use of the term “lightlike”.
Back in Minkowski space, consider two spacetime points separated only by time $dt$. It follows immediately from equation (20) that the overall interval $ds^2 < 0$. Similarly, two points separated only by space ($dx$, $dy$ or $dz$) have $ds^2 > 0$. Once again, the WEP states that these results must also be true in curved space.
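The classification of intervals can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the Minkowski metric is assumed to have signature $(-,+,+,+)$ with $c = 1$, so that timelike intervals have $ds^2 < 0$.

```python
def interval_squared(g, dx):
    """Return ds^2 = g_{mu nu} dx^mu dx^nu for a metric given as a nested list."""
    n = len(dx)
    return sum(g[m][k] * dx[m] * dx[k] for m in range(n) for k in range(n))


def classify(ds2, tol=1e-12):
    """Classify an interval, ASSUMING the (-,+,+,+) signature."""
    if abs(ds2) <= tol:
        return "null"
    return "spacelike" if ds2 > 0 else "timelike"


# Minkowski metric in units with c = 1 (signature is an assumption of this sketch).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

for label, dx in [("light along x", [1, 1, 0, 0]),
                  ("time only", [1, 0, 0, 0]),
                  ("space only", [0, 1, 0, 0])]:
    print(label, "->", classify(interval_squared(ETA, dx)))
```

With the opposite signature the two inequalities in `classify` would simply swap.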
With this in mind let’s take a look at how particles travel through spacetime in practice.
In Minkowski space, particles travel in straight lines unless they are acted upon by an external force. In more general spacetimes, the concept of a “straight line” gets replaced by a “geodesic” – that is to say, particles follow the shortest path between two points in spacetime. This fact follows immediately from the WEP; after all, in a flat space a straight line is the shortest distance between two points.
Abstract arguments to one side, let’s see what the maths looks like in practice. We can start by generalizing Newton’s law with no forces, $\ddot{\mathbf{x}} = 0$, under the assumption that there is an underlying Cartesian description of what’s going on. For a concrete example, you could imagine particle motion in a Euclidean 2D plane. Equations of motion in Cartesian coordinates for a free particle are
$$\frac{d^2 x'^i}{dt^2} = 0 \tag{29}$$
Here, I’m using $x'^i$ to specifically indicate Cartesian coordinates so that we can ask what the equations of motion are in a general coordinate system $x^i$. For example, we could think about polar coordinates, $x^i = (r, \theta)$. Crucially, the basis vectors for polar coordinates, $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$, vary in the plane so
$$\frac{d^2 x^i}{dt^2} \neq 0 \tag{30}$$
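To derive the correct equation, let us start by expanding the Cartesian equation using the chain rule (just as previously applied in Eq. (11)):
$$\frac{dx'^i}{dt} = \left(\frac{\partial x'^i}{\partial x^j}\right)\frac{dx^j}{dt} \tag{31}$$
where the term in the brackets on the RHS is a transformation matrix going from one basis to another. For example, in the particular case that $x^i$ represents polar coordinates, the transformation between Cartesian and polar coordinates reads $x'^1 = r\cos\theta$, $x'^2 = r\sin\theta$, giving a transformation matrix
$$\frac{\partial x'^i}{\partial x^j} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \tag{32}$$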
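Whatever the new basis, the geodesic equation now reads
$$\frac{d}{dt}\left(\frac{\partial x'^i}{\partial x^j}\right)\frac{dx^j}{dt} + \frac{\partial x'^i}{\partial x^j}\frac{d^2 x^j}{dt^2} = 0 \tag{33}$$
where we’ve applied the product rule to produce two terms. If the transformation were linear, the first of these would vanish and the geodesic equation in the new basis would be $d^2 x^j/dt^2 = 0$, just as in the original basis. But in polar coordinates (or, in general, other non-Cartesian coordinate systems), the transformation is not linear, so we need to calculate the first term explicitly using the chain rule:
$$\frac{d}{dt}\left(\frac{\partial x'^i}{\partial x^j}\right) = \frac{\partial^2 x'^i}{\partial x^j\,\partial x^k}\frac{dx^k}{dt}$$
The geodesic equation therefore becomes
$$\frac{\partial x'^i}{\partial x^j}\frac{d^2 x^j}{dt^2} + \frac{\partial^2 x'^i}{\partial x^j\,\partial x^k}\frac{dx^j}{dt}\frac{dx^k}{dt} = 0 \tag{34}$$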
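We wanted an equation for $d^2 x^j/dt^2$, and we’re almost there, but first we need to remove the $\partial x'^i/\partial x^j$ from the front of the first term. This is accomplished by multiplying by its inverse, which is $\partial x^l/\partial x'^i$:
$$\frac{\partial x^l}{\partial x'^i}\frac{\partial x'^i}{\partial x^j} = \delta^l_j \tag{35}$$
where $\delta^l_j$ is the Kronecker delta, equation (18). So, multiplying (34) by $\partial x^l/\partial x'^i$, we get
$$\frac{d^2 x^l}{dt^2} + \left(\frac{\partial x^l}{\partial x'^i}\frac{\partial^2 x'^i}{\partial x^j\,\partial x^k}\right)\frac{dx^j}{dt}\frac{dx^k}{dt} = 0 \tag{36}$$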
The term in the brackets doesn’t care about the details of the path being followed, only about the relationship between the Cartesian and general coordinate systems. It’s therefore often useful to tabulate its values for each possible $l$, $j$ and $k$. The result is known as the Christoffel symbol,
$$\Gamma^l_{jk} = \frac{\partial x^l}{\partial x'^i}\frac{\partial^2 x'^i}{\partial x^j\,\partial x^k} \tag{37}$$
and is symmetric in $j, k$ (so there are not quite so many independent components as it would first appear). In Cartesian coordinates, $\Gamma^l_{jk} = 0$, and the geodesic equation boils back down to where we started, $d^2 x^l/dt^2 = 0$. In general, $\Gamma^l_{jk}$ describes geodesics in non-trivial coordinate systems.
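As a concrete check, here is a minimal numerical sketch for the polar-coordinate example. It assumes the Christoffel symbols that follow from applying the definition above to $x'^1 = r\cos\theta$, $x'^2 = r\sin\theta$, namely $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ (all others zero). The function names and the choice of a fourth-order Runge-Kutta integrator are illustrative only.

```python
import math


def geodesic_rhs(state):
    """Time derivative of (r, theta, r_dot, theta_dot) for a free particle.

    Uses Gamma^r_{theta theta} = -r and Gamma^theta_{r theta} = 1/r.
    """
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2                # -Gamma^r_{theta theta} theta_dot^2
    theta_ddot = -2.0 * r_dot * theta_dot / r  # -2 Gamma^theta_{r theta} r_dot theta_dot
    return (r_dot, theta_dot, r_ddot, theta_ddot)


def rk4_step(f, state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_end, steps=1000):
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state


# Start at Cartesian position (1, 0) with Cartesian velocity (0, 1):
# in polar coordinates r = 1, theta = 0, r_dot = 0, theta_dot = 1.
r, theta, _, _ = integrate((1.0, 0.0, 0.0, 1.0), t_end=1.0)
x, y = r * math.cos(theta), r * math.sin(theta)
print(x, y)  # close to (1, 1), the end point of the straight line x = 1, y = t
```

Although $r$ and $\theta$ both accelerate, the path converted back to Cartesian coordinates is the expected straight line.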
The geodesic equation is powerful when combined with the equivalence principle. So far, we have shown how to express straight lines in any coordinate system assuming the underlying space is flat (i.e. can be described by a global set of cartesian coordinates). But look at the form of equation (36). It depends only on the local relationship between the two sets of coordinates (i.e. on derivatives, not the actual values of the coordinates). And the equivalence principle says that in a local frame, particles in curved space behave just like they do in flat space… therefore, particles in curved space must follow an equation of this form, too!
We just need to get one last point ironed out before we can move into curved space. So far, we have treated time as an absolute quantity. In relativity, this is no longer the case; $t$ is a coordinate on the same footing as $x$ or $y$ or $z$ and so on. That necessitates two changes:
(1) we must allow indices to range from $0$ to $3$ to include time and space;
(2) we introduce a parameter, traditionally $\lambda$, to describe the path through spacetime (Figure 5). We normally insist that $\lambda$ monotonically increases with time $t$, but otherwise the relationship between them is flexible.
With these changes the geodesic equation immediately generalizes:
$$\frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu_{\nu\sigma}\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} = 0 \tag{38}$$
A particle’s path through spacetime, such as the one described by $x^\mu(\lambda)$ above, is sometimes called a worldline.
It’s all very well having something like equation (38) in hand, but we’re going to need to calculate the Christoffel symbols for every situation. From the machinery we have looked at so far, that would involve (1) constructing a set of freely falling coordinates $x'^\mu$ (playing the role of the Cartesian $x'^i$ from Section 2); (2) specifying the transformation between the original set of coordinates $x^\mu$ and the new $x'^\mu$; (3) calculating the Christoffel symbols using expression (37).
How do we know what freely falling coordinates look like? They’re the coordinates in which particles explicitly obey the laws of special relativity. Mathematically, the metric in these coordinates looks like $g'_{\mu\nu} = \eta_{\mu\nu}$. So, if we’re given some spacetime (like the expanding universe) with a specified metric, we need to find the transformation to the coordinates in which the metric has this form. For any curved spacetime, such a transformation doesn’t exist globally; it has to be constructed locally around each point.
But luckily (and perhaps surprisingly) it’s possible to do this generally, for any set of coordinates, once and for all. The result is:
$$\Gamma^\mu_{\nu\sigma} = \frac{1}{2}\,g^{\mu\rho}\left(\frac{\partial g_{\rho\sigma}}{\partial x^\nu} + \frac{\partial g_{\rho\nu}}{\partial x^\sigma} - \frac{\partial g_{\nu\sigma}}{\partial x^\rho}\right) \tag{39}$$
Note there is no explicit reference to the freely-falling coordinate system here – equation (39) is a recipe for jumping straight from the spacetime metric to the geodesic equation. The next exercise sheds some light on how that’s possible.
When derived from the metric as in (39), the Christoffel symbols are known as the connection (because they help us to ‘connect’ particle behaviour between different patches of spacetime).
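To make the metric-to-connection recipe concrete, the sketch below evaluates $\Gamma^\mu_{\nu\sigma} = \tfrac{1}{2} g^{\mu\rho}(\partial_\nu g_{\rho\sigma} + \partial_\sigma g_{\rho\nu} - \partial_\rho g_{\nu\sigma})$ numerically. It is an illustration under stated assumptions: the metric is taken to be flat FRW, $\mathrm{diag}(-1, a^2, a^2, a^2)$, with a toy scale factor $a(t) = t^{2/3}$; derivatives are estimated by central finite differences; and the inverse metric is formed element-wise, which is valid only because the metric is diagonal. All function names are hypothetical.

```python
def metric(x):
    """Flat FRW metric diag(-1, a^2, a^2, a^2) with an ASSUMED a(t) = t^(2/3)."""
    a = x[0] ** (2.0 / 3.0)
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -1.0
    for i in (1, 2, 3):
        g[i][i] = a * a
    return g


def d_metric(x, sigma, h=1e-5):
    """Central finite-difference estimate of d g_{mu nu} / d x^sigma."""
    xp, xm = list(x), list(x)
    xp[sigma] += h
    xm[sigma] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(4)] for m in range(4)]


def christoffel(x):
    """Return gamma[mu][nu][sigma] = Gamma^mu_{nu sigma} at the point x."""
    g = metric(x)
    # Element-wise inverse: correct for a diagonal metric only.
    g_inv = [[(1.0 / g[m][m] if m == n else 0.0) for n in range(4)] for m in range(4)]
    dg = [d_metric(x, s) for s in range(4)]  # dg[s][m][n] = d_s g_{mn}
    gamma = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for mu in range(4):
        for nu in range(4):
            for sig in range(4):
                gamma[mu][nu][sig] = 0.5 * sum(
                    g_inv[mu][rho]
                    * (dg[nu][rho][sig] + dg[sig][rho][nu] - dg[rho][nu][sig])
                    for rho in range(4))
    return gamma


gamma = christoffel([1.0, 0.0, 0.0, 0.0])
print(gamma[0][1][1], gamma[1][0][1])  # compare with a*da/dt and (da/dt)/a at t = 1
```

At $t = 1$ the assumed scale factor gives $a = 1$ and $\dot{a} = 2/3$, so both printed components should be close to $2/3$.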
Non-examinable aside: The requirements put on the frame, namely that $g'_{\mu\nu} = \eta_{\mu\nu}$ and $\partial g'_{\mu\nu}/\partial x'^\sigma = 0$ at a given point, are something like saying “these coordinates are like Minkowski space to sufficient accuracy in the locality of our point that we can apply the equivalence principle”. But you might wonder whether we really have enough freedom in the construction of a coordinate system to be always able to satisfy these conditions starting from any point of any spacetime. The answer is yes: in four dimensions, these two conditions generate $10 + 40 = 50$ constraints, while there are actually $16 + 40 = 56$ degrees of freedom in the transformation. The additional $6$ degrees of freedom are taken up by the Lorentz group: rotations and boosts that preserve the local inertial frame.
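The counting in this aside can be sketched for a general number of dimensions $n$. The function below is illustrative only. It assumes the standard bookkeeping: a symmetric $n \times n$ object has $n(n+1)/2$ independent components; the constraints are the metric components plus all their first derivatives; and the freedoms are the first derivatives of the coordinate transformation ($n^2$ numbers) plus its second derivatives ($n$ times $n(n+1)/2$ numbers).

```python
def counting(n):
    """Return (constraints, freedoms, leftover) for a local inertial frame in n dimensions."""
    sym = n * (n + 1) // 2        # independent components of a symmetric n x n matrix
    constraints = sym + n * sym   # metric equals Minkowski, and its first derivatives vanish
    freedoms = n * n + n * sym    # first and second derivatives of the transformation
    return constraints, freedoms, freedoms - constraints


print(counting(4))
```

The leftover works out to $n(n-1)/2$, which in four dimensions is the three rotations plus three boosts of the Lorentz group.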
Most measurements we make in cosmology have to do with intercepting photons which arrive at Earth. They have been emitted at various epochs during the evolution of the universe. As discussed earlier, we can think of these photons as particles that travel along null geodesics. So, let’s apply the geodesic equation to a particle travelling through an expanding universe. First, we’ll need the components of the connection.
With these in hand, we can turn equation (38) into an explicit second-order differential equation for the spacetime coordinates of the photon, $x^\mu(\lambda)$. What happens next depends on the exact physical question we’re posing; here let’s consider how the particle energy changes as the universe expands. Just as in special relativity, a massless particle has energy-momentum $4$-vector $p^\mu$, where
$$p^\mu = \frac{dx^\mu}{d\lambda} \tag{41}$$
In special relativity, the energy is given by $E = p^0$: the equivalence principle means this is still true in the expanding universe. (Footnote 1: Take care, though: technically one has to construct a local inertial frame before making a statement like this. The coordinate frame of FRW is certainly not inertial. There are a number of equivalent ways to see that the statement still holds true nonetheless in this case; for instance one can construct a frame-independent expression like $E = -p_\mu u^\mu$ for an observer 4-vector $u^\mu$. Take a look at Section 3.5 of Carroll for detail.)
We can eliminate $\lambda$ by noting that $dt/d\lambda = p^0 = E$, so that $d/d\lambda = E\,d/dt$. Using $\Gamma^0_{ij} = a\dot{a}\,\delta_{ij}$ for the flat FRW metric (with $\dot{a} \equiv da/dt$), the $0$-component of the geodesic equation reads
$$\frac{dp^0}{d\lambda} + \Gamma^0_{ij}\,p^i p^j = 0 \quad\Rightarrow\quad E\frac{dE}{dt} + a\dot{a}\,\delta_{ij}\,p^i p^j = 0 \tag{42}$$
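For a massless particle, the energy-momentum vector has zero magnitude:
$$g_{\mu\nu}\,p^\mu p^\nu = -E^2 + a^2\,\delta_{ij}\,p^i p^j = 0 \tag{43}$$
We can use this to eliminate $p^i$ in favour of $E$ from (42):
$$\frac{1}{E}\frac{dE}{dt} = -\frac{\dot{a}}{a} \tag{44}$$
This expression implies that the energy of a massless particle decreases as the universe expands:
$$E \propto \frac{1}{a} \tag{45}$$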
The result we have derived accords with the intuition from a handwaving argument: $E = h\nu \propto 1/\lambda$ (where $\lambda$ is here the wavelength, not the path parameter) and $\lambda$ is stretched along with the expansion. The frequency of a photon emitted with frequency $\nu_e$ will therefore be observed with a lower frequency $\nu_o$ as the universe expands:
$$\frac{\nu_o}{\nu_e} = \frac{a_e}{a_o} \tag{46}$$
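The scaling of photon energy with the scale factor can also be checked by integrating the geodesic equation directly. The sketch below is a toy check under stated assumptions: a flat FRW metric with scale factor $a(t) = t^{2/3}$, a photon moving along $x$ only, and the connection components $\Gamma^0_{xx} = a\dot{a}$ and $\Gamma^x_{0x} = \Gamma^x_{x0} = \dot{a}/a$. Function names and the integrator are illustrative.

```python
def scale_factor(t):
    """ASSUMED toy expansion history a(t) = t^(2/3)."""
    return t ** (2.0 / 3.0)


def hubble_rate(t):
    """(da/dt)/a for the assumed a(t) = t^(2/3)."""
    return 2.0 / (3.0 * t)


def rhs(state):
    """d/dlambda of (t, x, E, px), with E = dt/dlambda and px = dx/dlambda."""
    t, x, E, px = state
    a, H = scale_factor(t), hubble_rate(t)
    dE = -a * a * H * px * px  # -Gamma^0_{xx} (p^x)^2, with Gamma^0_{xx} = a da/dt
    dpx = -2.0 * H * E * px    # -2 Gamma^x_{0x} p^0 p^x, with Gamma^x_{0x} = (da/dt)/a
    return (E, px, dE, dpx)


def rk4_step(f, state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (p + 2 * q + 2 * r + w) / 6
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))


def propagate_photon(lam_end=3.0, steps=3000):
    # Start at t = 1 (a = 1) with E = 1; the null condition E = a * px fixes px = 1.
    state = (1.0, 0.0, 1.0, 1.0)
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, h)
    return state


t, x, E, px = propagate_photon()
print(t, E * scale_factor(t))  # the product E * a stays close to its initial value of 1
```

The product $E\,a$ remaining constant along the path is exactly the statement that $E \propto 1/a$.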
A common question at this point is: where does the energy of this photon go? As ever in physics, there are a few different ways to approach this question but I think the simplest answer is to recognise that energy conservation is not an absolute law. Noether’s theorem (see Section 7) shows us that conservation laws reflect symmetries; in particular, energy conservation relates to time-translation invariance. In an expanding universe the time-translation symmetry is lost and consequently so too is energy conservation.
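Cosmologists like to speak of the above effect in terms of the redshift $z$ between two events, defined by the fractional change in wavelength:
$$z = \frac{\lambda_o - \lambda_e}{\lambda_e} = \frac{a_o}{a_e} - 1 \tag{47}$$
If the observation takes place today ($a_o = 1$), this implies
$$a_e = \frac{1}{1+z} \tag{48}$$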
So the redshift of an object tells us immediately the scale factor when the photon was emitted. For that reason, measuring the redshifts of distant objects is one of the most fundamentally important tools in the observer’s kit.
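As a small illustration of how a measured wavelength translates into a scale factor, the sketch below defines the redshift as the fractional change in wavelength and assumes the scale factor is normalised to $1$ at the time of observation. The function names and the example wavelengths (an H-alpha line with rest wavelength of about 656.28 nm, observed at twice that value) are illustrative choices, not data from these notes.

```python
def redshift(lambda_observed, lambda_emitted):
    """Fractional change in wavelength between emission and observation."""
    return (lambda_observed - lambda_emitted) / lambda_emitted


def scale_factor_at_emission(z, a_observed=1.0):
    """Scale factor when the light was emitted, given a_observed at detection."""
    return a_observed / (1.0 + z)


# Hypothetical example: a line with rest wavelength 656.28 nm observed at 1312.56 nm.
z = redshift(1312.56, 656.28)
print(z, scale_factor_at_emission(z))  # z is about 1, so the universe was about half its present size
```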
How can this be achieved in practice? Ideally, one splits the light into a spectrum and then examines features in that spectrum, such as emission or absorption lines. Because these are made by quantum transitions in atoms, their rest-frame wavelengths ($\lambda_e$) are well-known. Especially if multiple different lines at different $\lambda_e$ are present, this allows $z$ to be reliably inferred.
Sometimes taking a spectrum isn’t practical (especially for dim objects, or with large surveys) and one instead has to estimate $z$ just from photometry in different broad bands. This is known as “photometric redshift” estimation, and people spend entire careers working on it.
Non-examinable aside: This redshift is not quite the same as the conventional Doppler effect. In particular, it doesn’t really make sense to talk about very distant objects in the universe travelling at some fraction of the speed of light, since comparing speeds of two objects can only be done within a single inertial frame. Since a global inertial frame doesn’t exist for any curved spacetime, any statement about recession velocities runs the risk of generating severe confusion. Some people instead like to talk about the photon being ‘stretched by the expansion of space’ (and certainly this leads to the right expectation, as we discussed above) but again it’s not really physics, since it’s unclear quite how or why expanding space would have this effect. People like to have long debates about the best view of what it all means. But really, the only safe thing to say is that “the redshift in an expanding universe follows from the equivalence principle”; at least according to the shut-up-and-calculate view (footnote 2: https://en.wikipedia.org/wiki/Interpretations_of_quantum_mechanics#Instrumentalist_description), no other explanation is necessary.
The most famous property of a straight line in flat space is that it is the shortest route between two points. Therefore, in curved spacetime, we might argue that particles should continue to follow the shortest route between two points (footnote 3: one can formalise this slightly imprecise motivation using the equivalence principle). The argument is valid and leads back to the same spacetime path that we derived directly from the equivalence principle above. But mathematically we now have a minimisation problem, which provides an alternative perspective and a convenient calculational tool as we’ll see below.
Such problems are already familiar in physics from the principle of least action, which we will require in its own right later on. Setting aside GR for one moment, let’s begin with the familiar example of the classical mechanics of a single particle in 1D with coordinate $x(t)$. The equations of motion for such a particle can be derived by searching for the trajectory that makes the action $S$ a minimum:
$$S = \int_{t_1}^{t_2} L(x, \dot{x})\,dt \tag{49}$$
Here the function $L$ is known as the Lagrangian, and we imagine that we know where the particle is at times $t_1$ and $t_2$. This formulation seems terribly abstract if you haven’t seen it before but it’s just an efficient (and, in the professional literature, ubiquitous) way of stating a variety of known laws of physics. (Footnote 4: In particular, you might be wondering why it’s a function of $x$ and $\dot{x}$ but not, say, $\ddot{x}$. The answer is a bit disappointing: simply, to describe physics as we know it, we’ll need things of this form. When we crunch through the maths, you’ll see “normal” equations from physics start to emerge. The next question one thinks of is something like: “but why are equations of physics of this form that can be derived in such a particular way?” It’s an interesting enough question, but not one that we can afford to get hung up on in this course.) By stating just the form of $L$ (which is often surprisingly compact), we will be able to derive laws and many properties of those laws with far greater ease than other techniques allow.
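Without specifying anything further about $L$ or $S$ for the moment, let’s see what it means to find the minimum of $S$ with respect to the function $x(t)$. Just like finding a minimum of the curve $y(x)$ where you’d set $dy/dx = 0$ (meaning that $y$ does not change for a slight change in $x$), we need to state that $\delta S = 0$ for any small change in the function. This isn’t actually too hard: by the chain rule (or, equivalently, by Taylor expanding $L$ in $x$ and $\dot{x}$),
$$\delta S = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial x}\,\delta x + \frac{\partial L}{\partial \dot{x}}\,\delta\dot{x}\right) dt \tag{50}$$
where $\delta x(t)$ is a function expressing a small change in the original function $x(t)$, and $\delta\dot{x}(t)$ is the corresponding small change in the original function’s time derivative $\dot{x}(t)$. These are related by:
$$\delta\dot{x} = \frac{d}{dt}\,\delta x \tag{51}$$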
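The clever part of the principle of least action comes from using this relation to turn (50) into an equation of motion. The second term is integrated by parts, using relation (51):
$$\delta S = \int_{t_1}^{t_2} \frac{\partial L}{\partial x}\,\delta x\,dt + \left[\frac{\partial L}{\partial \dot{x}}\,\delta x\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right)\delta x\,dt \tag{52}$$
Remember that we said we know where the particle is at $t_1$ and $t_2$; we’re searching for its correct trajectory at intermediate times. Therefore, $\delta x(t_1) = \delta x(t_2) = 0$ and the boundary term vanishes. With a minor refactorisation inside the integral, we get:
$$\delta S = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}\right)\delta x\,dt = 0 \tag{53}$$
Now this must be true for any slight change in $x(t)$ obeying the boundary conditions, i.e. for any $\delta x(t)$ in the interval $t_1 < t < t_2$. Accordingly, the term in the circular brackets must be zero at every $t$ and we have derived the Euler-Lagrange equations:
$$\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = 0 \tag{54}$$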
To actually get some physics, we need to specify $L$. The Lagrangian for Newtonian point-particle problems is typically of the form $L = T - V$, where $T$ is the kinetic energy and $V$ is the potential energy. For example, $L = \frac{1}{2} m \dot{x}^2 - V(x)$ leads to the equation of motion
$$m\ddot{x} = -\frac{dV}{dx} \tag{55}$$
which you should recognise as Newton’s second law for a particle in a 1D potential $V(x)$.
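The claim that the Newtonian trajectory makes the action stationary can be checked numerically. The sketch below is illustrative only: it assumes a uniform gravitational potential $V = m g x$, end points $x(0) = x(T) = 0$, arbitrary parameter values, and a simple discretisation of the action integral (none of which come from the notes). It compares the action of the true path with that of nearby paths sharing the same end points.

```python
import math

M, G, T_END, N = 1.0, 9.81, 1.0, 1000  # illustrative values, not from the text


def true_path(t):
    """Solution of m x'' = -dV/dx for V = m g x with x(0) = x(T_END) = 0."""
    return 0.5 * G * t * (T_END - t)


def action(path):
    """Discretised S = integral of (T - V) dt along path(t)."""
    dt = T_END / N
    total = 0.0
    for k in range(N):
        x0, x1 = path(k * dt), path((k + 1) * dt)
        kinetic = 0.5 * M * ((x1 - x0) / dt) ** 2
        potential = M * G * 0.5 * (x0 + x1)
        total += (kinetic - potential) * dt
    return total


def perturbed(eps):
    """True path plus a bump that vanishes at both end points."""
    return lambda t: true_path(t) + eps * math.sin(math.pi * t / T_END)


S0 = action(true_path)
for eps in (-0.1, 0.1):
    print(eps, action(perturbed(eps)) - S0)  # positive for both signs of eps
```

The change in the action is the same for both signs of the perturbation, which reflects the vanishing of the first-order variation: only the second-order term survives.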
Why would we choose to go to all these lengths just to rederive Newton’s law from such an abstract framework? There are a number of reasons. For one, it makes it remarkably easy to pull out conservation laws (energy, momentum etc) and see deeper reasons for their existence (the aforementioned Noether’s theorem). But for another, it makes dealing with different coordinate systems vastly easier. For example, what if you need to derive Newton’s laws in a rotating frame? Doing the coordinate transformation manually in full generality is a real nightmare. But with the Euler-Lagrange approach, there is no problem: simply write the potential energy and kinetic energy in terms of the variables you want, and everything will fall out. (We don’t have time to cover this in the present course, but find a good textbook and do it yourself if you’re curious and haven’t seen it before.) Of course, seamlessly handling weird-looking coordinate systems is a very desirable property in GR, so it pays to revisit the problem of geodesics with our new machinery.
If, as claimed at the start of this section, we can derive the path of a particle in GR by searching for geodesics through spacetime, the action becomes just the length of the spacetime curve, which we can write as $S = \int ds$. If the particle trajectory is parametrised as $x^\mu(\lambda)$ as before, and the metric is $g_{\mu\nu}$, the spacetime length of the curve is
$$S = \int \sqrt{\left|g_{\mu\nu}\,x'^\mu x'^\nu\right|}\;d\lambda \tag{56}$$
where $x'^\mu \equiv dx^\mu/d\lambda$. (Footnote 5: This notation, where $x'$ means the derivative of $x$ with respect to $\lambda$, should not be confused with our earlier use of $x'$ to mean $x$ in a different coordinate system. Unfortunately in cosmology we often run out of symbols and meanings have to be redefined in different contexts.) We could apply the Euler-Lagrange equations directly to this expression, with $\lambda$ playing the role of the time coordinate $t$ from the Newtonian case in Section 7. Note that $t$ may now be one of the coordinates in $x^\mu$, so it’s critical to distinguish $t$ from $\lambda$. Writing out the correct formula explicitly for completeness, with $L$ the integrand of (56):
$$\frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda}\left(\frac{\partial L}{\partial x'^\mu}\right) = 0 \tag{57}$$
Unfortunately the square root in equation (56) creates ugly algebra when taking derivatives. There is a simple trick that gets rid of that problem: we insist that our final geodesic will be parametrised such that $g_{\mu\nu}\,x'^\mu x'^\nu$ is constant along the path. As a result, writing $L$ for the integrand of (56), the solution to the E-L equations will satisfy
$$\frac{\partial (L^2)}{\partial x^\mu} - \frac{d}{d\lambda}\left(\frac{\partial (L^2)}{\partial x'^\mu}\right) = 0 \tag{58}$$
This particular choice of normalisation for $\lambda$ is known as an affine parametrisation; see Carroll or your other GR courses for more information. The main point for our purposes is that, as a consequence of (58), we can equally well use the alternative action
$$S = \int g_{\mu\nu}\,x'^\mu x'^\nu\,d\lambda \tag{59}$$
leading to far simpler but equivalent equations of motion for the geodesic from the Euler-Lagrange equations.
For example, returning to the flat FRW metric (23), we have the explicit action
$$S = \int \left[-t'^2 + a^2(t)\left(x'^2 + y'^2 + z'^2\right)\right] d\lambda \tag{60}$$
This allows us to neatly bypass calculation of the Christoffel symbols and rederive an earlier result:
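One way this might go is sketched below, as an illustration rather than the course’s own worked solution. It assumes the flat FRW line element $ds^2 = -dt^2 + a^2(t)\,(dx^2 + dy^2 + dz^2)$, a photon moving along $x$ only, primes denoting $d/d\lambda$, and a constant of integration $k > 0$ (a symbol introduced here). Because the Lagrangian does not depend on $x$, the Euler-Lagrange equation for $x$ is a conservation law, and the null condition then fixes the energy $E = t'$:

```latex
\begin{align}
L &= -t'^2 + a^2(t)\,x'^2, \\
\frac{\partial L}{\partial x} = 0
  \;&\Rightarrow\; \frac{d}{d\lambda}\left(2a^2 x'\right) = 0
  \;\Rightarrow\; a^2 x' = k \quad \text{(constant)}, \\
g_{\mu\nu}\,x'^\mu x'^\nu = 0
  \;&\Rightarrow\; t'^2 = a^2 x'^2 = \frac{k^2}{a^2}, \\
E = t' &= \frac{k}{a} \;\propto\; \frac{1}{a}.
\end{align}
```

This recovers the redshifting of photon energy with the scale factor without computing any Christoffel symbols.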
The equivalence principle is powerful (even in its weak form) as it allows us to map as follows:
| Theory: | Special relativity | General relativity |
|---|---|---|
| Minkowski metric valid: | Globally | Locally |
| Free particles follow: | Straight line | Spacetime geodesic |
According to the equivalence principle, particles that are only under the influence of gravity move in a local inertial frame as though no forces are acting. The non-trivial path they follow arises because of the interrelationships between these local inertial frames which become highly non-trivial in a curved spacetime.
We can calculate paths in a more convenient coordinate system by relating it back to a series of local inertial frames. This involves a connection which can be calculated from the metric for our chosen coordinate system; equation (39). (To be completely clear, there is no requirement to memorise this equation.)
One can also take a completely different approach to the trajectory calculation through the insight that particles must follow a path that makes the distance through spacetime a local maximum (with our sign convention; a minimum with the opposite signature). This, too, follows from the equivalence principle because in a flat spacetime particles have straight worldlines – such straight worldlines maximise the spacetime pathlength.
Because of this “extreme distance” property, the path taken by particles is known as a geodesic (in analogy to the shortest path between two places on the Earth’s surface).
Using the geodesic equation coupled to the simple expanding spacetime, equation (23), one can derive that photons redshift as the universe expands. This is an illustration that energy need not be conserved in an expanding universe.
Energy conservation corresponds to time-translation invariance by Noether’s theorem, but the background metric has a time-varying scale factor, so energy conservation is violated. Later we will see what conservation laws GR does permit.
Being able to infer the redshift of an object is crucial for observers. The most precise way to accomplish this is by producing a spectrum of the observed object, and comparing its atomic emission or absorption lines with known rest-frame wavelengths.