In the Annus Mirabilis, 1905, one of Einstein's accomplishments was to establish the theory of special relativity. What was special was that all observers must travel at a constant velocity, neither accelerating nor decelerating. For such an observer, the speed of light is a constant. Different observers will see different wavelengths and frequencies, but the product of the wavelength with the frequency is identical. The wavelength and frequency are said to be Lorentz covariant, meaning we know how they change for different observers. The speed of light is Lorentz invariant. It is one of my pet peeves that invariants should always be paired with their corresponding covariant quantities or else an incomplete story is being told.
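A minimal sketch of that pairing, assuming the simplest case of an observer receding directly from the source (the longitudinal relativistic Doppler shift). The function names `doppler_factor` and `observed_wave` are made up for this illustration:

```python
import math

C = 299_792_458.0  # speed of light in m/s


def doppler_factor(beta):
    """Longitudinal Doppler factor for an observer receding at speed beta * c."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


def observed_wave(wavelength, beta):
    """Return (wavelength, frequency) as seen by an observer receding at beta * c."""
    frequency = C / wavelength
    d = doppler_factor(beta)
    # The covariant pair changes in opposite directions by the same factor.
    return wavelength / d, frequency * d


for beta in (0.0, 0.3, 0.9):
    lam, nu = observed_wave(500e-9, beta)
    # The product lam * nu is the invariant: it stays equal to C for every observer.
    print(beta, lam, nu, lam * nu)
```

The wavelength stretches by exactly the factor the frequency shrinks, so the product printed in the last column does not depend on `beta`.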

Newton's law of gravity does a remarkable job in describing the motion of the planets. It is all that is needed by today's rocket ships unless those devices also carry atomic clocks or other tools of exceptional accuracy. Here is Newton's law in potential form:
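In standard notation this is Poisson's equation. The symbols φ (gravitational potential) and G (Newton's constant) are assumptions here, since the text only names the mass density ρ:

```latex
\nabla^2 \phi = 4 \pi G \rho
```

Note that only spatial derivatives appear; there is no time derivative anywhere in the equation.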

From the perspective of special relativity, the equation suffers a fatal flaw: if there is a change in the mass density rho, then that must propagate everywhere instantaneously. Oops.

Einstein set out to fix this flaw. The struggle took him ten years ("Subtle is the Lord..." by Abraham Pais http://www.amazon.com/Subtle-Is-Lord-Einstein-Paperbacks/dp/0192851381 is the way to get the real details on the subject). The math was hard then and remains hard today. From far away, it sounds easy: describe all physics the same way whether one is accelerating or not. It is the details of Riemannian geometry that are daunting. Einstein got a private tutor and collaborator for the subject, his school buddy Marcel Grossmann. He also traded letters on his math struggles with the leading math minds of his day, including David Hilbert. Einstein came to the field equations not from an action, but from thinking all about the physics. Hilbert figured out the action that generates the Einstein field equations. That is where the derivation begins:

1. Start with the Hilbert action:

Note the square root of the determinant of the metric as part of the volume element. That is required so the volume element can be in curved spacetime. It plays a critical role in the derivation, so I wish I had a better handle on why that factor in that form is required so that the differential volume element transforms like a tensor.
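A common way to write this action, sketched under assumed conventions: κ = 8πG/c⁴, 𝓛_M for the matter Lagrangian, and g for the determinant of the metric. The post may have used different factors or signs:

```latex
S = \int \left[ \frac{1}{2\kappa} R + \mathcal{L}_M \right] \sqrt{-g} \, d^4x
```

On the factor itself: under a change of coordinates, d⁴x picks up the absolute value of the Jacobian determinant while √(−g) picks up its inverse, so the product √(−g) d⁴x is unchanged.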

2. Vary with respect to the metric tensor:

3. Pull back the factor of the square root of the metric and use the product rule on the term with the Ricci scalar R:
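Steps 2 and 3 can be sketched as follows, under assumed conventions that the text does not fix: κ = 8πG/c⁴, a matter Lagrangian 𝓛_M, and the variation taken with respect to the inverse metric g^{μν}:

```latex
\begin{aligned}
0 = \delta S
  &= \int \left[ \frac{1}{2\kappa} \frac{\delta(\sqrt{-g}\,R)}{\delta g^{\mu\nu}}
     + \frac{\delta(\sqrt{-g}\,\mathcal{L}_M)}{\delta g^{\mu\nu}} \right]
     \delta g^{\mu\nu} \, d^4x \\
  &= \int \left[ \frac{1}{2\kappa} \left( \frac{\delta R}{\delta g^{\mu\nu}}
     + \frac{R}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} \right)
     + \frac{1}{\sqrt{-g}} \frac{\delta(\sqrt{-g}\,\mathcal{L}_M)}{\delta g^{\mu\nu}} \right]
     \delta g^{\mu\nu} \sqrt{-g} \, d^4x
\end{aligned}
```

The second line contains three terms: the variation of the Ricci scalar, the variation of the volume factor, and the matter term.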

4. Focus on the first term, using the definition of a Ricci scalar as a contraction of the Ricci tensor:

A total derivative does not make a contribution to the variation of the functional, so can be ignored in our quest to find an extremum. This is Stokes theorem in action.
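Written out with the inverse metric g^{μν} as the varied quantity (an assumption about the post's convention), the product rule gives:

```latex
\delta R = \delta\left( g^{\mu\nu} R_{\mu\nu} \right)
         = R_{\mu\nu} \, \delta g^{\mu\nu} + g^{\mu\nu} \, \delta R_{\mu\nu}
```

If the second piece is a total derivative, what survives is δR/δg^{μν} = R_{μν}.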

<SIDEBAR>

Show that the variation in the Ricci tensor is a total derivative.

Since I don't understand this all in detail, I will try to get you in the neighborhood of getting it.

SB1. Start with the Riemann curvature tensor:

Lots of stuff there, but here is a simplifying viewpoint. One is comparing two paths, that is why there is a subtraction here. The two paths are found by switching the order of the mu and the nu. This is a really complicated structure, but that should be obvious :-)
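One common index convention for the curvature tensor in terms of the connection Γ is shown here. The index placement is an assumption, since references order the indices differently:

```latex
R^{\rho}{}_{\sigma\mu\nu}
  = \partial_\mu \Gamma^{\rho}_{\nu\sigma}
  - \partial_\nu \Gamma^{\rho}_{\mu\sigma}
  + \Gamma^{\rho}_{\mu\lambda} \Gamma^{\lambda}_{\nu\sigma}
  - \Gamma^{\rho}_{\nu\lambda} \Gamma^{\lambda}_{\mu\sigma}
```

The second and fourth terms are the first and third with μ and ν exchanged and the sign flipped.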

SB2: Vary the Riemann curvature tensor with respect to the metric tensor:

Lots of terms, but remember the mu <-> nu exchange is responsible for half of them.

One cannot take a covariant derivative of a connection since it does not transform like a tensor. Apparently the difference of two connections does transform like a tensor. I say "apparently" because this is an example where I have to rely on authority, I don't appreciate the details.

SB3: Calculate the covariant derivative of the variation of the connection:

Notice that the third terms of these two expressions are identical because the mu and nu are neighbors in the connection.

Again, this is a step whose details I don't understand enough to clarify should others have questions.

SB4: Rewrite the variation of the Riemann curvature tensor as the difference of two covariant derivatives of the variation of the connection written in step SB3.
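Steps SB2 through SB4 can be sketched in one place. The index convention for the curvature tensor is an assumption (R^ρ_{σμν} with the derivative indices last), and the connection is taken to be symmetric in its lower indices:

```latex
\begin{aligned}
\delta R^{\rho}{}_{\sigma\mu\nu}
  &= \partial_\mu \delta\Gamma^{\rho}_{\nu\sigma}
   - \partial_\nu \delta\Gamma^{\rho}_{\mu\sigma}
   + \delta\Gamma^{\rho}_{\mu\lambda} \Gamma^{\lambda}_{\nu\sigma}
   + \Gamma^{\rho}_{\mu\lambda} \delta\Gamma^{\lambda}_{\nu\sigma}
   - \delta\Gamma^{\rho}_{\nu\lambda} \Gamma^{\lambda}_{\mu\sigma}
   - \Gamma^{\rho}_{\nu\lambda} \delta\Gamma^{\lambda}_{\mu\sigma} \\[4pt]
\nabla_\mu \left( \delta\Gamma^{\rho}_{\nu\sigma} \right)
  &= \partial_\mu \delta\Gamma^{\rho}_{\nu\sigma}
   + \Gamma^{\rho}_{\mu\lambda} \delta\Gamma^{\lambda}_{\nu\sigma}
   - \Gamma^{\lambda}_{\mu\nu} \delta\Gamma^{\rho}_{\lambda\sigma}
   - \Gamma^{\lambda}_{\mu\sigma} \delta\Gamma^{\rho}_{\nu\lambda} \\[4pt]
\nabla_\nu \left( \delta\Gamma^{\rho}_{\mu\sigma} \right)
  &= \partial_\nu \delta\Gamma^{\rho}_{\mu\sigma}
   + \Gamma^{\rho}_{\nu\lambda} \delta\Gamma^{\lambda}_{\mu\sigma}
   - \Gamma^{\lambda}_{\nu\mu} \delta\Gamma^{\rho}_{\lambda\sigma}
   - \Gamma^{\lambda}_{\nu\sigma} \delta\Gamma^{\rho}_{\mu\lambda} \\[4pt]
\delta R^{\rho}{}_{\sigma\mu\nu}
  &= \nabla_\mu \left( \delta\Gamma^{\rho}_{\nu\sigma} \right)
   - \nabla_\nu \left( \delta\Gamma^{\rho}_{\mu\sigma} \right)
\end{aligned}
```

The third terms of the two covariant derivatives are equal because Γ is symmetric in μ and ν, so they cancel in the difference. The remaining six terms match the six terms of the varied curvature tensor one for one.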

SB5: Contract the result of SB4

SB6: Contract the result of SB5:

This now looks to my eye like a total derivative, so will not contribute to the action.
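The two contractions can be sketched as follows, assuming the Ricci tensor is formed by contracting the upper index of the curvature tensor with its first derivative index (one common convention) and that the metric is covariantly constant:

```latex
\begin{aligned}
\delta R_{\sigma\nu} = \delta R^{\rho}{}_{\sigma\rho\nu}
  &= \nabla_\rho \left( \delta\Gamma^{\rho}_{\nu\sigma} \right)
   - \nabla_\nu \left( \delta\Gamma^{\rho}_{\rho\sigma} \right) \\[4pt]
g^{\sigma\nu} \, \delta R_{\sigma\nu}
  &= \nabla_\rho \left( g^{\sigma\nu} \, \delta\Gamma^{\rho}_{\nu\sigma}
   - g^{\sigma\rho} \, \delta\Gamma^{\mu}_{\mu\sigma} \right)
\end{aligned}
```

The second line is the covariant divergence of a single vector. Multiplied by √(−g), a covariant divergence of a vector becomes an ordinary divergence, which is what lets it be treated as a boundary term.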

<END SIDEBAR>

Since that was such a long sidebar, a recap: what has been shown is that the first of the three terms in the variation reduces to the Ricci tensor.

5. Focus on evaluating the variation of the second term in the action. Transform the coordinate system to one where the metric is diagonal and use the product rule:

Notice there was a flip of the metric in the variation which required one more sign change. That is the kind of detail I always trip on.
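This step can be sketched with Jacobi's formula for the variation of a determinant. For a diagonal metric the formula is just the product rule applied to the product of the diagonal entries. The choice of g^{μν} as the varied quantity is an assumption about the post's convention:

```latex
\begin{aligned}
\delta g &= g \, g^{\mu\nu} \, \delta g_{\mu\nu} \\[4pt]
\delta \sqrt{-g} &= -\frac{1}{2\sqrt{-g}} \, \delta g
   = \frac{1}{2} \sqrt{-g} \; g^{\mu\nu} \, \delta g_{\mu\nu}
   = -\frac{1}{2} \sqrt{-g} \; g_{\mu\nu} \, \delta g^{\mu\nu} \\[4pt]
\frac{1}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} &= -\frac{1}{2} \, g_{\mu\nu}
\end{aligned}
```

The sign change in the last equality of the second line comes from varying g^{μν} g_{μν} = 4, which gives g^{μν} δg_{μν} = −g_{μν} δg^{μν}.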

6. Define the stress energy tensor as the third term:

That factor of a minus a half? I don't get it. Bet it comes out of some classical limit. Hopefully I can research that later in the week.
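One widely used convention, assumed here rather than taken from the post, defines the stress energy tensor with a factor of −2, so that the third term of the variation carries the −1/2:

```latex
T_{\mu\nu} := \frac{-2}{\sqrt{-g}} \frac{\delta(\sqrt{-g}\,\mathcal{L}_M)}{\delta g^{\mu\nu}}
\quad\Longrightarrow\quad
\frac{1}{\sqrt{-g}} \frac{\delta(\sqrt{-g}\,\mathcal{L}_M)}{\delta g^{\mu\nu}}
  = -\frac{1}{2} \, T_{\mu\nu}
```

With this normalization the −1/2 pairs with the 1/(2κ) that multiplies the curvature terms, leaving a single κ in front of T_{μν} in the field equations.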

7. The variation of the Hilbert action will be at an extremum when the integrand is equal to zero:

or
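Under assumed conventions (κ = 8πG/c⁴, variation with respect to g^{μν}, and a stress energy tensor defined with a factor of −2 relative to the matter term), the two forms of the result would read:

```latex
\begin{aligned}
\frac{1}{2\kappa} \left( R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R \right)
  - \frac{1}{2} T_{\mu\nu} &= 0 \\[4pt]
R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R &= \kappa \, T_{\mu\nu}
  = \frac{8 \pi G}{c^4} \, T_{\mu\nu}
\end{aligned}
```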

Fini.

But not fini. This was a math exercise. Note how little physics was involved. There are a huge number of physics issues one could go into. As an example, these equations bind to particles with integral spin which is good for bosons, but there are quite a few fermions that also participate in gravity. To include those, one can consider the metric and the connection to be independent of each other. That is the Palatini approach.

Doug

Next Monday/Tuesday: Dot and cross products, differences and overlaps with quaternions

## Comments

"Note the square root of the determinant of the metric as part of the volume element. That is required so the volume element can be in curved spacetime. ... I wish I had a better handle on why that factor in that form is required so that the differential volume element transforms like a tensor."

I've seen you bring this up before. This is not an issue of curved spacetime. I'm hesitant to just give you the answer, since you seem to have become dependent on people doing that. So let me translate it to something that may look familiar or at least simpler.

In some inertial cartesian coordinates you have the spacetime volume element:

*dx dy dz dt*

Now let's do a Lorentz transformation (say a "boost" along the x direction), to get some new coordinate system. How do you know that the volume element is invariant? How do you know that it is still just

*dx' dy' dz' dt'*

and without any other constant factors like gamma or something? This is worth working out later, but let's simplify it further to help bridge the gap to physical intuition.

Okay, now for some calculus. In simple Euclidean 3-space, the volume element is:

*dx dy dz*

If we instead use spherical coordinates, it is

*r² sinθ dr dθ dφ*

You've probably seen this many times, but how do you actually calculate that? When I was first taught this, we just built up the volume element using simple arguments of arclengths, etc. This is probably how you saw it before as well, but is essentially cheating for this discussion because you are using your knowledge of the "physical" lengths to derive something about the lengths. Given only the original elements, and the coordinate transformation, how do you get the new volume element? Maybe try something even simpler such as the transformation

**x' = 2x**

After playing with it awhile (I encourage you to do so), and possibly rereading some calculus textbook discussion, you will eventually come up with the Jacobian. I may have ruined it just by uttering the word.
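For anyone who wants to check their answer numerically, here is a minimal sketch that estimates the volume scaling factor as the absolute value of the Jacobian determinant, using finite differences. All function names are made up for this illustration, and each map is written as "old coordinates as a function of new coordinates":

```python
import math


def jacobian(old_from_new, point, h=1e-6):
    """Numerical Jacobian d(old_i)/d(new_j) by central differences."""
    n = len(point)
    columns = []
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = old_from_new(*plus), old_from_new(*minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds d(old_i)/d(new_j); transpose into row-major form.
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def volume_scale(old_from_new, point):
    """Factor multiplying the new coordinate differentials in the volume element."""
    return abs(det(jacobian(old_from_new, point)))


def original_from_scaled(x_new):
    """Inverse of x' = 2x, so x = x' / 2."""
    return (x_new / 2.0,)


def cartesian_from_spherical(r, theta, phi):
    """Usual physics convention: theta is the polar angle, phi the azimuth."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


print(volume_scale(original_from_scaled, (3.0,)))
r, theta, phi = 2.0, math.pi / 3, 0.7
print(volume_scale(cartesian_from_spherical, (r, theta, phi)), r * r * math.sin(theta))
```

For the scaled coordinate the factor comes out as one half, so dx = (1/2) dx'. For spherical coordinates the numerical value agrees with r² sinθ at the chosen point.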

Okay, now return to spacetime. For the action to be invariant it is not sufficient for the Lagrangian density to be a scalar. We'd need the volume element to be invariant as well. Sticking with cartesian coordinates hides some factors. So let's consider a general coordinate transformation (this has nothing to do with curved spacetime, we can still be doing SR), we see

except for the special case

So before with

*dx dy dz dt*

we clearly are not working with an invariant. It just looked like it because it didn't change between two inertial coordinate systems. That fact is not enough to show invariance (sound familiar to any other discussions lately? ::smile:: ). Anyway, we need to add a scaling term that we are leaving out. If you want, you can use spherical coordinates or simple scaled coordinates to help you work this one out.

So what is this scaling factor "a"?

Hint: if you look at the transformation law for the metric tensor, the answer becomes pretty clear. This isn't worth working out in great detail. If you understand the steps above, the answer should be reachable without wading in the swamp of detailed calculations (plugging it into Mathematica doesn't teach you anything either; take the time to learn what the math means).

As a historical side note:

Einstein originally limited his field equations to coordinate systems which set this "scaling factor" to 1. When Schwarzschild worked out his solution to GR, he purposely used a "scaled" coordinate system (not the one that now bears his name) to preserve this feature. (He also made a mistake with the event horizon that Hilbert (I think?) corrected later in a brief footnote when he used it in some paper.) With the Hilbert action it became clear one could easily lift this restriction.

I'll try to get to your other questions later.

EDIT: Wow, that was a record number of typos. Probably still haven't caught them all. I hope no one was reading that in the meantime.

"One cannot take a covariant derivative of a connection since it does not transform like a tensor. ..."

You already wrote out the Riemann tensor. The Christoffel symbols can be written in terms of the metric. So you can write the whole thing in terms of the metric. It is messier, but you could proceed that way without making any 'leaps' of faith.

In fact, this leap of faith to the covariant derivative needs to be undone later, because when you say:

"A total derivative does not make a contribution to the variation of the functional, so can be ignored in our quest to find an extremum. This is Stokes theorem in action."

Note that Stokes' theorem doesn't work for covariant derivatives. It is just for the usual derivatives. This is part of the issue of defining mass in a local volume in GR, as I think David or someone else described earlier.

I forgot the rest of the steps, but wikipedia comments

unfortunately that is not obvious to me at this time of night. Can anyone explain why this should be clear?

After some sleep I'll work it out to convince myself, but I don't think I'll be able to give a simple answer.

According to this:

http://arxiv.org/pdf/gr-qc/0406088v4.pdf

The vacuum lagrangian for GR can also be written as (eq. 32):

L = -1/(k Λ) Sqrt[ det(R) ]

where Λ is the cosmological constant.

What in the world!?

And with Λ right next to k, how can we separate the gravitational constant from Λ ?? This also seems to disallow the possibility of Λ=0.

David, Henry, Doug, anyone? They call this the "affine formulation in General Relativity". Is it really equivalent? How? I really don't understand this paper.

So what happened to all the unfinished discussion about coordinate transformations of quaternions from the last thread? I hope that isn't just being dropped, and everyone is deciding to leave that in the past. Next week's topic on cross products seems unrelated. I think everyone was getting close to the real issues at hand for Doug's quest in this blog.

Doug, using David's suggested notation, can you explain what you mean by a coordinate transformation of quaternions?

"but I have to think about it in terms of managing the changes between two different basis vectors. I am certain in all the years I held a paying job, a Jacobian was never used."

If you are talking about transforming from a 4-vector basis to a quaternion basis, are you sure the question is well defined? The bases satisfy different algebras. Biquaternions represent the Lorentz Group, but can you show me a transformation matrix from biquaternions to 4-vectors?

"You can see that rotating the boosted V is the same as boosting by a rotated v, since quaternionic rotation both respects products and quaternionic conjugation. This is a direct proof that the boosts and rotations in the stand-up representation properly commute, so that all 6 generators properly work together."

I think this is a nifty way to write down the Lorentz group, and I am surprised that it is new. It is a spinorial trick specific to the four-dimensional Lorentz transformations, but it is a nice one, and it might be useful for something.

Most frigging impressive, the work done by Ron Maimon and Qmechanic at http://physics.stackexchange.com/! This feels like the first time when others more skilled than I figured out an issue without much of a bumbling comment by yours truly. They even reached out to the gamma matrices, the correct ones as it were. While I am aware of a connection between the two, I could not discuss the issue in a correct technical way due to my informal training.

**Anyone who has found this discussion of the way to do boosts with quaternions suspect should read that thread.**

To get to the actual Lorentz group, it looks like he considers quaternions with 2x2 matrix valued components.

Isn't this just biquaternions? "complexified" quaternions are already known to be able to represent the Lorentz group as Barry already pointed out previously (and is what started this side discussion).

Of course, I could always be wrong about this subject because it isn't quite my thing.

It still isn't obvious to me that two boosts make a rotation, but then that isn't obvious to me in the biquaternion case either.

I just read the stack-exchange page.

I feel you and Doug are misinterpreting. Doug has always used "real" quaternions (H) instead of "complex" quaternions which are biquaternions (C x H). Doug loves the ability to divide, and the biquaternions are equivalent to 2x2 complex matrices and as such do not form a division algebra.

Reread the stack-exchange answer and note:

In the "Verifying that it works" section, Maimon introduces a notation of writing a quaternion as two complex numbers (instead of a four-tuple of real numbers like Doug usually does). This is fully equivalent to the usual quaternions Doug uses, and is just a different notation which can be convenient at times.

In the "embedding the whole Lorentz group" section, the sigma's in those equations are 2x2 matrices, and even the 1 stands for the 2x2 identity matrix. Maimon is considering elements which are 2x2 matrices. As he makes clear:

"This is not a division algebra anymore, but it's the full space of 2 by 2 complex matrices."

This is equivalent to biquaternions. He seems to be claiming (if I understand correctly) that it is a different representation of the Lorentz group. Regardless, there should be no confusion that these are not equivalent to Doug's real quaternions anymore. This is not just a notational difference. It is not a division algebra.

If you are getting confused by the notation, please remember that the 2x2 complex matrices are isomorphic to the biquaternions.

http://en.wikipedia.org/wiki/Biquaternion

"Given any 2 × 2 complex matrix, there are complex values u, v, w, and x to put it in this form [of a quaternion] so that the matrix ring is isomorphic to the biquaternion ring."

Based on your comments in Doug's articles, it sounds like what you really wanted to know is whether the "usual" or "real valued" quaternions (not the biquaternions) can give a representation of the full Lorentz group, and what this representation looks like.

I can imagine that one says: this is the definition of a group, it uses just one multiplication product. Should you require anything more, it is not a group. If so compelled, I hope we can come up with a name for the structures of real-valued quaternions that do end up at a similar place.

"I don't know how to describe the extra algebraic structure required for the real-valued quaternion system to work."

I don't know either. My approach was to ignore it and focus on the result, which is a Lorentz transformation. I'm sure you could think about it a few different ways. I originally thought of it as a measure of how poorly quaternions are suited to represent a group. The more correction factors, the further it is away from a pure quaternion rotation transformation. But you should really find some experts. If I was in your position I would chat with them on stackexchange about it (there is a chat room there as well as a question board).

"My approach was to ignore it and focus on the result, which is a Lorentz transformation."

In some ways, that has characterized my efforts with quaternions. Yet it does cause me valid problems. Technical words usually mean but one thing. I may get to a result, but not along the same technical path.

"Based on your comments in Doug's articles, it sounds like what you really wanted to know is whether the "usual" or "real valued" quaternions (not the biquaternions) can give a representation of the full Lorentz group, and what this representation looks like."

I originally didn't believe that it preserved the Boost Group, let alone the full Lorentz Group. Rob seems to think it represents the full group with the same algebra. I really don't see that at all, but it does rotations and boosts. I'm not sure if it mixes rotations and boosts the right way. Can you show that it doesn't?

"I wish I had a better handle on why that factor in that form is required so that the differential volume element transforms like a tensor"

This is covered very well in Dirac's General Theory of Relativity, which is a great resource because it covers most of the essentials of GR in about 70 pages, without sacrificing clarity (see page 36).

One of the things that tossed me for a loop is I have only seen the square root of minus the determinant of a metric tensor in equations for GR, never for anything else. Yet it does belong for anything that is a Lagrange *density*, as indicated by a *d⁴x*. If a calculation is in Cartesian coordinates, there are no worries, the implied factor is unity. If the coordinates are different, it just gets put in by hand without reference to the metric determinant.

Did you pay the $5/day, $9/mo, $59/year Scribd.com fee? Interesting model if it works. The paperback is $22 on amazon.

I found that Dirac reference by google search. I don't pay for scribd. I believe only a portion of the book is there. Almost all of the book is also available on google books (but not that section).

Doug!

I think I did it! I think I figured out how your quaternion idea of relating a quaternion to a four-vector, what the multiplication corresponds to in the usual tensor notation.

Can you translate this to nice math so everyone can discuss it easier?

Okay, using David's notation suggestion and Henry's tweak on it, we can always write a four-vector in terms of components with a spacetime basis choice as such:

**A** = a0 **e0** + a1 **e1** + a2 **e2** + a3 **e3**

and we can write a quaternion in components with a quaternion basis choice as such

**B** = b0 **q0** + b1 **q1** + b2 **q2** + b3 **q3**

Also, in the same spacetime basis, we can write out any tensor in components as such (an example of a rank2):

**T** = T_ij **ei** **ej**

Now, your idea starts with a Cartesian inertial coordinate system, and equates the quaternion basis to the spacetime basis. And then defines in this basis, that the quaternion basis multiply like the standard 1, i, j, and k quaternion basis.

(There is also the issue that this forces you to choose some particular inertial frame where this definition applies. Which one? Who cares for now. Just pick one randomly. Let's denote this special frame as the Q frame, since it is the frame where the **q** basis is the usual quaternion basis.)

Okay, now with that all specified, we come to the issue that you are struggling to interpret. What is the result of multiplying two four-vectors? Is the first component somehow a Lorentz scalar?

Well, what I finally realized is that equating a quaternion and a four-vector in this manner means any higher tensor is now mapped back onto a rank 1 tensor. Because for example in the Q frame **q1** **q2** = **q3**.

So for example, we can always simplify a rank 2 tensor to a rank 1 tensor

**T** = T_ij **ei** **ej** = T_u **eu**

since a rank 1 tensor is a quaternion in your idea, this means any tensor object (except scalars?) can now be represented by a single quaternion.

This is exactly what your quaternion multiplication is doing, and helps explain what object you are getting out. Your multiplication isn't telling us about the inner-product of two four-vectors (which is what you were hoping with the 0 component), it is actually talking about the outer-product of two four-vectors.

This points out something that I forgot about, we already have a defined means to multiply two four-vectors directly. It is the outer-product. However this usually gives a rank 2 object. Due to your idea though, all higher rank objects can map down to a rank 1 object. This is why what you call the "vector" pieces of your quaternion, when the result of a multiplication, appeared to have two distinct pieces when adding up the terms from the multiplication: a "3 vector" like piece which is really just the T_0i or T_i0 components of a rank 2 tensor, and a "pseudo-vector" part which is really just the T_ij components of a rank 2 tensor.

Unfortunately, this also makes it clear why your idea seemed to be so coordinate system dependent whenever someone tried to be explicit about the basis choices. First, the definition of the multiplication of the basis can only work nicely in the special Q frame (not even other inertial frames as Henry showed). And second, we can see the inevitable "reduction" rule maps all higher tensors to a rank 1 tensor (a quaternion here). This clarifies the meaning of the object that is the result of a quaternion multiplication, and unfortunately the components don't mean what you intended.

The idea also strongly restricts the possible geometric objects (to basically just rank 0 numbers and rank 1 quaternions). I haven't thought yet about what this restriction means for the idea.

Henry also suggested the alternate interpretation of your idea that maybe the bases weren't literally equated, but only required to transform equivalently. Dropping the strict equivalency gives more freedom and no longer results in all higher rank tensors mapping down to a rank 1 tensor (quaternion). Besides the mapping reduction, the math leads to the same insights as before. Except now, the multiplication would be equivalent to defining that the quaternion multiplication

C = A B

this is in tensor notation

C^u = Q^u_vw A^v B^w

where in one special frame you define the components of the object Q to match the quaternion multiplication definition.

This results in all the same realization as above, regarding summing over the outer-product of four-vectors (rank 2 pieces). Here, it is the Q^u_vw "field" which represents quaternion multiplication, which contains the very coordinate dependent components in one nice little package. Unlike say the metric components, this object's components change even when sticking with inertial coordinate systems.
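A minimal sketch of this construction, with hypothetical names (`hamilton`, `Q`, `contract`, `boost_x`): build the components Q^u_vw from the quaternion multiplication table in the Q frame, confirm that the contraction reproduces the quaternion product, then check what happens if the same numeric components are reused after a boost along x (units with c = 1).

```python
import math


def hamilton(a, b):
    """Quaternion product of two 4-tuples (scalar part first)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

# Q[u][v][w] is the u-th component of q_v q_w in the Q frame.
Q = [[[hamilton(BASIS[v], BASIS[w])[u] for w in range(4)]
      for v in range(4)]
     for u in range(4)]


def contract(q, a, b):
    """C^u = Q^u_vw A^v B^w."""
    return tuple(sum(q[u][v][w] * a[v] * b[w]
                     for v in range(4) for w in range(4))
                 for u in range(4))


def boost_x(vec, beta):
    """Lorentz boost of four-vector components along x with speed beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t, x, y, z = vec
    return (gamma * (t - beta * x), gamma * (x - beta * t), y, z)


A, B = (1, 2, 3, 4), (5, 6, 7, 8)
print(contract(Q, A, B), hamilton(A, B))  # the contraction is the quaternion product

j, k = (0, 0, 1, 0), (0, 0, 0, 1)
product_of_boosted = hamilton(boost_x(j, 0.6), boost_x(k, 0.6))
boosted_product = boost_x(hamilton(j, k), 0.6)
print(product_of_boosted, boosted_product)
```

The last two tuples differ. Reusing the Q-frame components in the boosted frame does not give the boosted product, which is the coordinate dependence described above: for the product to transform like a four-vector, the components of Q would have to change from one inertial frame to the next.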

So in conclusion: trying to equate a quaternion with a four-vector results in the quaternion multiplication being very coordinate system dependent, and it appears to have no useful geometric meaning beyond some coincidences in the special frame Q where everything is just the result of definition (you get out what you put in).

**crickets**

Anyone?

I thought this would finally help explain why the object that is obtained by multiplying two quaternions treated as four-vectors yields a result containing a lot of tensor pieces added together.

It also, in my opinion, clearly answers the issues regarding claims that the quaternion product gives a Lorentz scalar in one component.

David was right: being clear about what the bases are, and what is meant by coordinate transformation, immediately gives mathematically unavoidable answers. Does this settle the issue?

Doug? Anyone?

"There are a huge number of physics issues one could go into. As an example, these equations bind to particles with integral spin which is good for bosons, but there are quite a few fermions that also participate in gravity."

Where did this come from?

Are you claiming that fermions such as electrons have no gravitational mass according to GR?

More specifically, are you claiming fermions can't contribute to the stress-energy tensor?

"In general relativity, the action is usually assumed to be a functional of the metric (and matter fields), and the connection is given by the Levi-Civita connection. The Palatini formulation of general relativity assumes the metric and connection to be independent, and varies with respect to both independently, which makes it possible to include fermionic matter fields with non-integral spin."

If someone else can expand on why this is so, that would be great.

To assuage any doubts, the mass of electrons doesn't somehow violate GR. The issue is regarding coupling to intrinsic spin. No experiment has given insight on this yet. For mathematical reasons, it is expected that allowing torsion will be necessary to include intrinsic spin cleanly. For example, Loop quantum gravity is actually trying to quantize the variation of GR that allows torsion.

I am not an expert on the subject, so from my cursory understanding, it is not clear to me why the wiki editor says integral intrinsic spin is okay but non-integral intrinsic spin is not. Especially when considering it at a classical level, it's not clear why the value of intrinsic spin would already be treated differently between some quantized values. So I'd treat that comment cautiously.

Regardless, since there is no experimental guidance at the moment, just ignore those issues unless you want to dive into quantum gravity. I would suggest understanding classical gravity before even considering that.

On "but not fini": indeed, I don't think modern science has proved much about fermion spin under gravity (most of our particle accelerators are horizontal), and we have only seen spin 1/2, half-integral spins. Are there any other kind? Are there real spin 3/2 gravitinos, and could we ever see them?