Einstein Field Equation Derivation in about a Dozen Steps
By Doug Sweetser | May 21st 2012 11:46 PM

Trying to be a semi-pro amateur physicist (yes I accept special relativity is right!). I _had_ my own effort to unify gravity with other forces in...

In this blog I will derive the Einstein field equations starting from the Hilbert action.  Since there are only two terms in the Hilbert action, one of which is left alone, there is not that much to do.  Well, there is always way more to do - how well is this step really understood?  Where does that factor come from?  What kinds of variations could one do?  The core of this blog is an extensive translation of the Wikipedia page on the Einstein-Hilbert action written in my own personal style, with minor variations so the steps make more sense to me.  Go there if my style confuses you for a step or two.

In the Annus Mirabilis, 1905, one of Einstein's accomplishments was to establish the theory of special relativity.  What was special was that all observers must travel at a constant velocity, neither accelerating nor decelerating.  For such an observer, the speed of light is a constant.  Different observers will see different wavelengths and frequencies, but the product of the wavelength with the frequency is identical.  The wavelength and frequency are said to be Lorentz covariant, meaning we know how they change for different observers.  The speed of light is Lorentz invariant.  It is one of my pet peeves that invariants should always be paired with their corresponding covariant quantities or else an incomplete story is being told.
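
As a worked example of that pairing, here is a sketch using the standard longitudinal Doppler formulas.  It assumes a second observer receding from the source along the line of sight at speed $v = \beta c$; the primes and $\beta$ are my notation.

$f' = f \sqrt{\frac{1-\beta}{1+\beta}}, \qquad \lambda' = \lambda \sqrt{\frac{1+\beta}{1-\beta}}, \qquad \lambda' f' = \lambda f = c$

The two covariant quantities change by reciprocal factors, so their product, the invariant, does not change.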

Newton's law of gravity does a remarkable job in describing the motion of the planets.  It is all that is needed by today's rocket ships unless those devices also carry atomic clocks or other tools of exceptional accuracy.  Here is Newton's law in potential form:

$4 \pi G \rho = \nabla^2 \phi$

From the perspective of special relativity, the equation suffers a fatal flaw: if there is a change in the mass density rho, then that must propagate everywhere instantaneously.  Oops.
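
The instantaneous propagation can be read off the standard solution of this Poisson equation, sketched here for a mass density that falls off at large distance.  The time $t$ is the same on both sides, with no delay for the distance $|\mathbf{x} - \mathbf{x}'|$.

$\phi(\mathbf{x}, t) = -G \int \frac{\rho(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|} d^3 x'$

A relativistic wave equation would instead evaluate the source at the retarded time $t - |\mathbf{x} - \mathbf{x}'|/c$.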

Einstein set out to fix this flaw.  The struggle took him ten years ("Subtle is the Lord..." by Abraham Pais http://www.amazon.com/Subtle-Is-Lord-Einstein-Paperbacks/dp/0192851381 is the way to get the real details on the subject).  The math was hard then and remains hard today.  From far away, it sounds easy - describe all physics the same way whether one is accelerating or not.  It is the details of Riemannian geometry that are daunting.  Einstein got a private tutor and collaborator for the subject, his school buddy Marcel Grossmann.  He also traded letters on his math struggles with the leading math minds of his day, including David Hilbert.  Einstein came to the field equations not from an action, but from thinking all about the physics.  Hilbert figured out the action that generates the Einstein field equations.  That is where the derivation begins:

1. Start with the Hilbert action:

$S = \int{\sqrt{-g} d^4 x\left( \frac{c^4}{16 \pi G}R + \mathcal{L_M} \right)}$

Note the square root of the determinant of the metric as part of the volume element.  That is required so the volume element can be in curved spacetime.  It plays a critical role in the derivation, so I wish I had a better handle on why that factor in that form is required so that the differential volume element transforms like a tensor.
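
For what it is worth, the standard argument can be sketched in three lines.  It assumes a general coordinate change $x \to x'$ with Jacobian matrix $J^{\mu}_{\;\,\alpha} = \partial x'^{\mu} / \partial x^{\alpha}$ (the symbol $J$ is my notation, not something from the Wikipedia page).  The coordinate volume picks up a factor of $|\det J|$, while the determinant of the metric picks up the inverse factor squared:

\begin{align*} d^4 x' &= |\det J| \; d^4 x\\ g'_{\mu\nu} = \frac{\partial x^{\alpha}}{\partial x'^{\mu}} \frac{\partial x^{\beta}}{\partial x'^{\nu}} g_{\alpha\beta} \quad &\Rightarrow \quad g' = (\det J)^{-2} \; g\\ \sqrt{-g'} \; d^4 x' &= |\det J|^{-1} \sqrt{-g} \; |\det J| \; d^4 x = \sqrt{-g} \; d^4 x \end{align*}

The product is the same in every coordinate system, which is what lets a scalar Lagrangian integrate to a scalar action.  In Cartesian inertial coordinates $\sqrt{-g} = 1$, so the factor is invisible there.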

2. Vary with respect to the metric tensor $g_{\mu\nu}$:

$\delta S = \int{ d^4 x\left( \frac{c^4}{16 \pi G} \frac{\delta (\sqrt{-g}R)}{\delta g^{\mu\nu}} + \frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} \right)\delta g^{\mu\nu}}$

3. Factor out the square root of the determinant of the metric and use the product rule on the term with the Ricci scalar R:

$\delta S = \int{ \sqrt{-g} d^4 x\left( \frac{c^4}{16 \pi G} \left( \frac{\delta R}{\delta g^{\mu\nu}}+ \frac{R}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} \right) + \frac{1}{\sqrt{-g}}\frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} \right)\delta g^{\mu\nu}}$

4. Focus on the first term, using the definition of a Ricci scalar as a contraction of the Ricci tensor:

\begin{align*} \frac{\delta R}{\delta g^{\mu\nu}} &= \frac{\delta(g^{\mu\nu} R_{\mu\nu})}{\delta g^{\mu\nu}}\\ &= R_{\mu\nu} \frac{\delta g^{\mu\nu}}{\delta g^{\mu\nu}} + g^{\mu\nu} \frac{\delta R_{\mu\nu}}{\delta g^{\mu\nu}}\\ &=R_{\mu\nu} + \rm{a \;total \; derivative} \end{align*}

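A total derivative does not contribute to the variation of the functional, so it can be ignored in our quest to find an extremum.  This is Stokes' theorem in action: the integral of a total derivative becomes a boundary term, and the variation is taken to vanish on the boundary.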

<SIDEBAR>
Show that the variation in the Ricci tensor is a total derivative.

Since I don't understand this all in detail, I will try to get you in the neighborhood of getting it.

SB1: Start with the definition of the Riemann curvature tensor in terms of the connection:

$R^{\rho}_{\;\,\sigma\mu\nu}= \partial_{\mu}\Gamma^{\rho}_{\;\,\sigma\nu}-\partial_{\nu}\Gamma^{\rho}_{\;\,\sigma\mu} + \Gamma^{\rho}_{\;\,\lambda \mu}\Gamma^{\lambda}_{\,\;\sigma \nu} - \Gamma^{\rho}_{\;\,\lambda \nu}\Gamma^{\lambda}_{\,\;\sigma \mu}$

Lots of stuff there, but here is a simplifying viewpoint.  One is comparing two paths, that is why there is a subtraction here.  The two paths are found by switching the order of the mu and the nu.  This is a really complicated structure, but that should be obvious :-)

SB2: Vary the Riemann curvature tensor with respect to the metric tensor:

\begin{align*} \delta R^{\rho}_{\;\,\sigma\mu\nu}=& \partial_{\mu}\delta \Gamma^{\rho}_{\;\,\sigma\nu}-\partial_{\nu}\delta \Gamma^{\rho}_{\;\,\sigma\mu}\\& + \delta \Gamma^{\rho}_{\;\,\lambda \mu}\Gamma^{\lambda}_{\,\;\sigma \nu} - \delta \Gamma^{\rho}_{\;\,\lambda \nu}\Gamma^{\lambda}_{\,\;\sigma \mu}\\& + \Gamma^{\rho}_{\;\,\lambda \mu}\delta\Gamma^{\lambda}_{\,\;\sigma \nu} - \Gamma^{\rho}_{\;\,\lambda \nu}\delta \Gamma^{\lambda}_{\,\;\sigma \mu} \end{align*}

Lots of terms, but remember the mu <-> nu exchange is responsible for half of them.

One cannot take a covariant derivative of a connection since it does not transform like a tensor.  Apparently the difference of two connections does transform like a tensor.  I say "apparently" because this is an example where I have to rely on authority, I don't appreciate the details.
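
A sketch of why the difference works, assuming the standard transformation law for the connection under a coordinate change $x \to x'$.  The law has a tensor-like piece plus an extra piece that depends only on the coordinate change, not on the metric:

$\Gamma'^{\rho}_{\;\,\mu\nu} = \frac{\partial x'^{\rho}}{\partial x^{\alpha}} \frac{\partial x^{\beta}}{\partial x'^{\mu}} \frac{\partial x^{\gamma}}{\partial x'^{\nu}} \Gamma^{\alpha}_{\;\,\beta\gamma} + \frac{\partial x'^{\rho}}{\partial x^{\alpha}} \frac{\partial^2 x^{\alpha}}{\partial x'^{\mu} \partial x'^{\nu}}$

The variation $\delta \Gamma$ is the difference between the connection built from $g_{\mu\nu} + \delta g_{\mu\nu}$ and the one built from $g_{\mu\nu}$, both in the same coordinates.  The extra piece is identical for the two and cancels, leaving only the tensor-like piece:

$\delta \Gamma'^{\rho}_{\;\,\mu\nu} = \frac{\partial x'^{\rho}}{\partial x^{\alpha}} \frac{\partial x^{\beta}}{\partial x'^{\mu}} \frac{\partial x^{\gamma}}{\partial x'^{\nu}} \delta \Gamma^{\alpha}_{\;\,\beta\gamma}$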

SB3: Calculate the covariant derivative of the variation of the connection:

\begin{align*} \nabla_{\mu} (\delta \Gamma^{\rho}_{\;\,\sigma\nu})&= \partial_{\mu}(\delta \Gamma^{\rho}_{\;\,\sigma\nu})\\&+ \Gamma^{\rho}_{\;\,\lambda \mu}\delta \Gamma^{\lambda}_{\,\;\sigma \nu} \\&-\delta \Gamma^{\rho}_{\,\;\lambda \sigma} \Gamma^{\lambda}_{\;\,\mu \nu}\\& - \delta\Gamma^{\rho}_{\,\;\lambda \nu}\Gamma^{\lambda}_{\;\,\sigma \mu} \\ \\ \nabla_{\nu} (\delta \Gamma^{\rho}_{\;\,\sigma\mu})&= \partial_{\nu}(\delta \Gamma^{\rho}_{\;\,\sigma\mu})\\&+ \Gamma^{\rho}_{\;\,\lambda \nu}\delta \Gamma^{\lambda}_{\,\;\sigma \mu} \\&-\delta \Gamma^{\rho}_{\,\;\lambda \sigma} \Gamma^{\lambda}_{\;\,\mu \nu}\\& - \delta\Gamma^{\rho}_{\,\;\lambda \mu}\Gamma^{\lambda}_{\;\,\sigma \nu} \end{align*}

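Notice that the third terms of these two expressions are identical, because the connection is symmetric in its two lower indices, so swapping mu and nu changes nothing.  They cancel when the two expressions are subtracted.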

Again, this is a step whose details I don't understand enough to clarify should others have questions.

SB4: Rewrite the variation of the Riemann curvature tensor as the difference of two covariant derivatives of the variation of the connection written in step SB3.

$\delta R^{\rho}_{\;\,\sigma\mu\nu} = \nabla_{\mu} (\delta \Gamma^{\rho}_{\;\,\sigma\nu}) -\nabla_{\nu} (\delta \Gamma^{\rho}_{\;\,\sigma\mu})$

SB5: Contract the first and third indices of the result of SB4 to get the variation of the Ricci tensor:

$\delta R^{\rho}_{\;\,\mu\rho\nu} = \delta R_{\mu\nu} = \nabla_{\rho} (\delta \Gamma^{\rho}_{\;\,\mu\nu}) -\nabla_{\nu} (\delta \Gamma^{\rho}_{\;\,\rho\mu})$

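SB6: Contract the result of SB5 with the inverse metric, which can be moved inside the covariant derivative because the covariant derivative of the metric is zero: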

\begin{align*}g^{\mu\nu} \delta R_{\mu\nu} &= \nabla_{\rho} \;g^{\mu\nu}(\delta \Gamma^{\rho}_{\;\,\mu\nu}) -\nabla_{\nu} \;g^{\mu\nu} (\delta \Gamma^{\rho}_{\;\,\rho\mu})\\&= \nabla_{\sigma} \;g^{\mu\nu}(\delta \Gamma^{\sigma}_{\;\,\mu\nu}) -\nabla_{\sigma} \;g^{\mu\sigma} (\delta \Gamma^{\rho}_{\;\,\rho\mu}) \\&= \nabla_{\sigma} \left(\;g^{\mu\nu}(\delta \Gamma^{\sigma}_{\;\,\mu\nu}) -\;g^{\mu\sigma} (\delta \Gamma^{\rho}_{\;\,\rho\mu}) \right) \end{align*}

This now looks to my eye like a total derivative, so will not contribute to the action.
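
To back up that impression, here is a sketch using the standard identity for the covariant divergence of a vector.  The name $V^{\sigma}$ for the quantity in parentheses is my own shorthand.

\begin{align*} V^{\sigma} &\equiv g^{\mu\nu}(\delta \Gamma^{\sigma}_{\;\,\mu\nu}) - g^{\mu\sigma} (\delta \Gamma^{\rho}_{\;\,\rho\mu})\\ \nabla_{\sigma} V^{\sigma} &= \partial_{\sigma} V^{\sigma} + \Gamma^{\sigma}_{\;\,\sigma\lambda} V^{\lambda}, \qquad \Gamma^{\sigma}_{\;\,\sigma\lambda} = \frac{1}{\sqrt{-g}} \partial_{\lambda} \sqrt{-g}\\ \nabla_{\sigma} V^{\sigma} &= \frac{1}{\sqrt{-g}} \partial_{\sigma} \left( \sqrt{-g} \; V^{\sigma} \right)\\ \int \sqrt{-g} \; d^4 x \; \nabla_{\sigma} V^{\sigma} &= \int d^4 x \; \partial_{\sigma} \left( \sqrt{-g} \; V^{\sigma} \right) \end{align*}

The $\sqrt{-g}$ from the volume element turns the covariant divergence into an ordinary one, so the integral is a boundary term that vanishes when the variation is zero on the boundary.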
<END SIDEBAR>

Since that was such a long sidebar, here is where things stand: the first of the three terms in the variation reduces to the Ricci tensor.

5. Focus on evaluating the second term in the variation.  Transform the coordinate system to one where the metric is diagonal and use the product rule:

\begin{align*}\frac{R}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} &= \frac{R}{\sqrt{-g}} \frac{-1}{2 \sqrt{-g}}(-1)g g^{\mu\nu} \frac{\delta g_{\mu\nu}}{\delta g^{\mu\nu}}\\&= -\frac{1}{2} g_{\mu\nu} R\end{align*}

Notice there was a flip of the metric in the variation which required one more sign change.  That is the kind of detail I always trip on.
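
The sign bookkeeping in this step can be spelled out with two standard facts, sketched here: Jacobi's formula for the variation of a determinant, and the observation that $g^{\mu\nu} g_{\mu\nu} = 4$ is a constant, so its variation is zero.

\begin{align*} \delta g &= g \; g^{\mu\nu} \delta g_{\mu\nu}\\ 0 = \delta (g^{\mu\nu} g_{\mu\nu}) \quad &\Rightarrow \quad g^{\mu\nu} \delta g_{\mu\nu} = - g_{\mu\nu} \delta g^{\mu\nu}\\ \delta \sqrt{-g} &= \frac{-1}{2 \sqrt{-g}} \delta g = \frac{-1}{2 \sqrt{-g}} (-1) g \; g_{\mu\nu} \delta g^{\mu\nu} = -\frac{1}{2} \sqrt{-g} \; g_{\mu\nu} \delta g^{\mu\nu} \end{align*}

The last equality uses $g = -(\sqrt{-g})^2$.  Dividing by $\sqrt{-g}$ and multiplying by $R$ gives the $-\frac{1}{2} g_{\mu\nu} R$ of step 5.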

6. Define the stress energy tensor through the third term:

$\frac{1}{\sqrt{-g}}\frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} = -\frac{1}{2}T_{\mu\nu}$

That factor of a minus a half?  I don't get it.  Bet it comes out of some classical limit.  Hopefully I can research that later in the week.
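
One way to get a feel for that factor is to try the definition on a simple matter Lagrangian.  The sketch below assumes a scalar field $\phi$ with potential $V(\phi)$, the metric signature $(-,+,+,+)$, and units with $c = 1$; none of these choices come from the derivation above.  It reuses the step 5 result for the variation of $\sqrt{-g}$.

\begin{align*} \mathcal{L_M} &= -\frac{1}{2} g^{\mu\nu} \partial_{\mu}\phi \, \partial_{\nu}\phi - V(\phi)\\ \frac{1}{\sqrt{-g}}\frac{\delta (\sqrt{-g}\mathcal{L_M})}{\delta g^{\mu\nu}} &= \frac{\partial \mathcal{L_M}}{\partial g^{\mu\nu}} + \frac{\mathcal{L_M}}{\sqrt{-g}} \frac{\delta \sqrt{-g}}{\delta g^{\mu\nu}} = -\frac{1}{2} \partial_{\mu}\phi \, \partial_{\nu}\phi - \frac{1}{2} g_{\mu\nu} \mathcal{L_M}\\ T_{\mu\nu} &= \partial_{\mu}\phi \, \partial_{\nu}\phi + g_{\mu\nu} \mathcal{L_M}\\ T_{00} &= \frac{1}{2} \dot{\phi}^2 + \frac{1}{2} (\nabla \phi)^2 + V(\phi) \quad \rm{(flat \; spacetime)} \end{align*}

With the minus a half in the definition, $T_{00}$ comes out as the usual positive energy density of the field, which suggests the factor is a normalization chosen to match the familiar stress energy tensor.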

7. The Hilbert action will be at an extremum when the variation is zero for an arbitrary $\delta g^{\mu\nu}$, which requires the integrand to be equal to zero:

$\frac{c^4}{16 \pi G} \left( R_{\mu\nu} -\frac{1}{2} g_{\mu\nu} R \right) - \frac{1}{2} T_{\mu\nu} = 0$

or

$R_{\mu\nu} -\frac{1}{2} g_{\mu\nu} R = \frac{8 \pi G}{c^4}T_{\mu\nu}$

Fini.

But not fini.  This was a math exercise.  Note how little physics was involved.  There are a huge number of physics issues one could go into.  As an example, these equations bind to particles with integral spin which is good for bosons, but there are quite a few fermions that also participate in gravity.  To include those, one can consider the metric and the connection to be independent of each other.  That is the Palatini approach.

Doug

Next Monday/Tuesday: Dot and cross products, differences and overlaps with quaternions

Nice work. General Relativity isn't easy. It was the class I fell below 80% on in my degree exam.

On "but not fini": indeed, I don't think modern science has proved much about fermion spin under gravity (most of our particle accelerators are horizontal), and we have only seen spin 1/2 among the half-integral spins. Are there any other kinds? Are there real spin 3/2 gravitinos, and could we ever see them?
"Note the square root of the determinant of the metric as part of the volume element. That is required so the volume element can be in curved spacetime.  ... I wish I had a better handle on why that factor in that form is required so that the differential volume element transforms like a tensor."
I've seen you bring this up before.  This is not an issue of curved spacetime.  I'm hesitant to just give you the answer, since you seem to have become dependent on people doing that.  So let me translate it to something that may look familiar or at least simpler.

In some inertial Cartesian coordinates you have the spacetime volume element:
dx dy dz dt
Now let's do a Lorentz transformation (say a "boost" along the x direction), to get some new coordinate system. How do you know that the volume element is invariant? How do you know that it is still just
dx' dy' dz' dt'
and without any other constant factors like gamma or something? This is worth working out later, but let's simplify it further to help bridge the gap to physical intuition.

Okay, now for some calculus.  In simple Euclidean 3-space, the volume element is:
dx dy dz
If we instead use spherical coordinates, it is
r² sinθ dr dθ dφ
You've probably seen this many times, but how do you actually calculate that? When I was first taught this, we just built up the volume element using simple arguments of arclengths, etc. This is probably how you saw it before as well, but is essentially cheating for this discussion because you are using your knowledge of the "physical" lengths to derive something about the lengths.  Given only the original elements, and the coordinate transformation, how do you get the new volume element?  Maybe try something even simpler such as the transformation
x' = 2x
After playing with it awhile (I encourage you to do so), and possibly rereading some calculus textbook discussion, you will eventually come up with the Jacobian.  I may have ruined it just by uttering the word.
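
For concreteness, here is a sketch of the Jacobian determinant for the three cases mentioned above: the scaling $x' = 2x$, spherical coordinates (assuming the usual convention $x = r \sin\theta \cos\phi$, $y = r \sin\theta \sin\phi$, $z = r \cos\theta$), and a boost along x with $\gamma = 1/\sqrt{1 - v^2/c^2}$.

\begin{align*} x' = 2x: \quad & dx' = \left| \frac{dx'}{dx} \right| dx = 2 \, dx\\ \rm{spherical:} \quad & dx \, dy \, dz = \left| \det \frac{\partial (x, y, z)}{\partial (r, \theta, \phi)} \right| dr \, d\theta \, d\phi = r^2 \sin\theta \; dr \, d\theta \, d\phi\\ \rm{boost:} \quad & t' = \gamma \left( t - \frac{v x}{c^2} \right), \; x' = \gamma (x - v t), \quad \det \frac{\partial (t', x')}{\partial (t, x)} = \gamma^2 \left( 1 - \frac{v^2}{c^2} \right) = 1 \end{align*}

The boost has unit determinant, which is why no extra factor shows up between inertial Cartesian frames.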

Okay, now return to the spacetime. For the action to be invariant it is not sufficient for the Lagrangian to be a scalar. We'd need the volume element to be invariant as well. Sticking with Cartesian coordinates hides some factors. So let's consider a general coordinate transformation (this has nothing to do with curved spacetime, we can still be doing SR), we see

except for the special case

So before with  dx dy dz dt  we clearly are not working with an invariant. It just looked like it because it didn't change between two inertial coordinate systems. That fact is not enough to show invariance (sound familiar to any other discussions lately? ::smile:: ). Anyway, we need to add a scaling term that we are leaving out. If you want, you can use spherical coordinates or simple scaled coordinates to help you work this one out.

So what is this scaling factor "a"?
Hint: if you look at the transformation law for the metric tensor, the answer becomes pretty clear. This isn't worth working out in great detail. If you understand the steps above, the answer should be reachable without wading in the swamp of detailed calculations (plugging it into Mathematica doesn't teach you anything either; take the time to learn what the math means).

As a historical side note:
Einstein originally limited his field equations to coordinate systems which set this "scaling factor" to 1.  When Schwarzschild worked out his solution to GR, he purposely used a "scaled" coordinate system (not the one that bears his name now) to preserve this feature. (He also made a mistake with the event horizon that Hilbert (I think?) corrected later in a brief footnote when he used it in some paper.)  With the Hilbert action it became clear one could easily lift this restriction.

I'll try to get to your other questions later.
EDIT: Wow, that was a record number of typos.  Probably still haven't caught them all.  I hope no one was reading that in the meantime.
Hmm... it has been awhile since I played with some of this, and I can't remember the answer to one part.  Hopefully it will be obvious with some sleep, but if someone can point the way, I'd be much obliged.
"One cannot take a covariant derivative of a connection since it does not transform like a tensor.  ..."
You already wrote out the Riemann tensor.  The Christoffel symbols can be written in terms of the metric.  So you can write the whole thing in terms of the metric.  It is messier, but you could proceed that way without making any 'leaps' of faith.

In fact, this leap of faith to the covariant derivative needs to be undone later, because when you say:
"A total derivative does not make a contribution to the variation of the functional, so can be ignored in our quest to find an extremum.  This is Stokes theorem in action."
Note that Stokes theorem doesn't work for covariant derivatives.  It is just for the usual derivatives.  This is part of the issue of defining mass in a local volume in GR, as I think David or someone else described earlier.

I forgot the rest of the steps, but wikipedia comments

unfortunately that is not obvious to me at this time of night. Can anyone explain why this should be clear?

After some sleep I'll work it out to convince myself, but I don't think I'll be able to give a simple answer.
According to this:
http://arxiv.org/pdf/gr-qc/0406088v4.pdf

The vacuum lagrangian for GR can also be written as (eq. 32):
L = -1/(k Λ) Sqrt[ det(R) ]
where Λ is the cosmological constant.

What in the world!?
And with Λ right next to k, how can we separate the gravitational constant from Λ ?? This also seems to disallow the possibility of Λ=0.

David, Henry, Doug, anyone? They call this the "affine formulation in General Relativity". Is it really equivalent? How? I really don't understand this paper.

So what happened to all the unfinished discussion about coordinate transformations of quaternions from the last thread? I hope that isn't just being dropped, and everyone is deciding to leave that in the past. Next week's topic on cross products seems unrelated. I think everyone was getting close to the real issues at hand for Doug's quest in this blog.

Doug, using David's suggested notation, can you explain what you mean by a coordinate transformation of quaternions?

I haven't forgotten about it, but I have to think about it in terms of managing the changes between two different basis vectors.  I am certain in all the years I held a paying job, a Jacobian was never used.  Even back in the day when I first learned about switching coordinates, I probably was not as diligent on the subject as I now wish I was.  Sorry for my non-answer, but I am pondering...
"but I have to think about it in terms of managing the changes between two different basis vectors. I am certain in all the years I held a paying job, a Jacobian was never used."
If you are talking about transforming from a 4-vector basis to a quaternion basis, are you sure the question is well defined? The bases satisfy different algebras. Biquaternions represent the Lorentz Group, but can you show me a transformation matrix from biquaternions to 4-vectors?
Will there be a quaternion blog this week? Have you read this recently? It has been updated since I first posted it and now includes this
You can see that rotating the boosted V is the same as boosting by a rotated v, since quaternionic rotation both respects products and quaternionic conjugation. This is a direct proof that the boosts and rotations in the stand-up representation properly commute, so that all 6 generators properly work together. I think this is a nifty way to write down the Lorentz group, and I am surprised that it is new. It is a spinorial trick specific to the four-dimensional Lorentz transformations, but it is a nice one, and it might be useful for something.

I took a vacation day on Memorial day :-)

Most frigging impressive, the work done by Ron Maimon and Qmechanic at http://physics.stackexchange.com/!  This feels like the first time when others more skilled than I figured out an issue without much of a bumbling comment by yours truly.  They even reached out to the gamma matrices, the correct ones as it were.  While I am aware of a connection between the two, I could not discuss the issue in a correct technical way due to my informal training.

Anyone who has found this discussion of the way to do boosts with quaternions suspect should read that thread.
It is very clear and fascinating. Particularly,
The boosts are all the h's which satisfy $h^2+\bar{h}^2=1$ which are all the rotations of the stand-up mechanic's boost.
which is a nice unification of boosts and rotations.
To get to the actual Lorentz group, it looks like he considers quaternions with 2x2 matrix valued components.

Isn't this just biquaternions? "complexified" quaternions are already known to be able to represent the Lorentz group as Barry already pointed out previously (and is what started this side discussion).

Complexified quaternions are just biquaternions.  There seems to be technical back and forth on that issue (one needs to expand the comments to see them all).  My reading is that both work for representing the Lorentz group.
It is very closely related to biquaternions, but see the "verifying why this works" section. Although the quaternions are constructed from complex numbers, the elements being acted on satisfy a quaternion algebra, and are, therefore, quaternions. Notice that the Lorentz transformation isn't the same as the biquaternion case (which has a much neater Lorentz transformation).

It still isn't obvious to me that two boosts make a rotation, but then that isn't obvious to me in the biquaternion case either.
I just read the stack-exchange page.

I feel you and Doug are misinterpreting. Doug has always used "real" quaternions (H) instead of "complex" quaternions which are biquaternions (C x H). Doug loves the ability to divide, and the biquaternions are equivalent to 2x2 complex matrices and as such do not form a division algebra.

In the "Verifying that it works" section, Maimon introduces a notation of writing a quaternion as two complex numbers (instead of a four-tuple of real numbers like Doug usually does). This is fully equivalent to the usual quaternions Doug uses, and is just a different notation which can be convenient at times.

In the "embedding the whole Lorentz group" section, the sigma's in those equations are 2x2 matrices, and even the 1 stands for the 2x2 identity matrix. Maimon is considering elements which are 2x2 matrices. As he makes clear:
"This is not a division algebra anymore, but it's the full space of 2 by 2 complex matrices."
This is equivalent to biquaternions. He seems to be claiming (if I understand correctly) that it is a different representation of the Lorentz group. Regardless, there should be no confusion that these are not equivalent to Doug's real quaternions anymore. This is not just a notational difference. It is not a division algebra.

If you are getting confused by the notation, please remember that the 2x2 complex matrices are isomorphic to the biquaternions.
http://en.wikipedia.org/wiki/Biquaternion
"Given any 2 × 2 complex matrix, there are complex values u, v, w, and x to put it in this form [of a quaternion] so that the matrix ring is isomorphic to the biquaternion ring."

Based on your comments in Doug's articles, it sounds like what you really wanted to know is whether the "usual" or "real valued" quaternions (not the biquaternions) can give a representation of the full Lorentz group, and what this representation looks like.

I thought that section was sketchy myself. QMechanic said it doesn't represent the full group, but there was originally a small error in my question. Rob seems to think that it does in places. Anyway, I thought the alternative representation of the Lorentz boost was nice. What you would like to do is study the generators, but these transformations are so strange I'm not sure where they are. If the algebra of the generators is the same, the two algebras form the same group. It looks like some aspects of Lorentz transformations might be easier to prove with quaternions.
I do have a basic group theory question based on this.  There is a binary operation multiplication that takes two elements to form a third element in the group.  All my proposed real-valued quaternion work requires quaternion triple products.  Rotations in 3D require a product on the left with a unitary quaternion as well as its conjugate on the right.  For the boosts, the triple products have one left/right, but also two where the hyperbolic trig part is on the left.  It is like the real quaternion approach has a bit more internal algebraic structure than the Lorentz group which is both simple and direct (this is an element of the Lorentz group, so the product is also a member of the Lorentz group).  I don't know how to describe the extra algebraic structure required for the real-valued quaternion system to work.

I can imagine that one says: this is the definition of a group, it uses just one multiplication product.  Should you require anything more, it is not a group.  If so compelled, I hope we can come up with a name for the structures of real-valued quaternions that do end up at a similar place.
"I don't know how to describe the extra algebraic structure required for the real-valued quaternion system to work."
I don't know either. My approach was to ignore it and focus on the result, which is a Lorentz transformation. I'm sure you could think about it a few different ways. I originally thought of it as a measure of how poorly quaternions are suited to represent a group. The more correction factors, the further it is away from a pure quaternion rotation transformation. But you should really find some experts. If I was in your position I would chat with them on stackexchange about it (there is a chat room there as well as a question board).
I enjoyed this line in particular:
"My approach was to ignore it and focus on the result, which is a Lorentz transformation."
In some ways, that has characterized my efforts with quaternions. Yet it does cause me valid problems.  Technical words usually mean but one thing.  I may get to a result, but not along the same technical path.
You might want to check out The Skeleton Key of Mathematics: A Simple Account of Complex Algebraic Theories by Dudley Ernest Littlewood. I find it difficult because it is extremely concise, but it covers Groups, Algebras (including Quaternions), and Group Algebras consecutively. The first half of the book is pure number theory and you can skip that and go straight to the group stuff. Maybe those chapters could give you the broad idea of how algebras and groups work together and the terminology. It's the only book I've come across that does this. It's very inexpensive too.
An order has been placed for the book.
"Based on your comments in Doug's articles, it sounds like what you really wanted to know is whether the "usual" or "real valued" quaternions (not the biquaternions) can give a representation of the full Lorentz group, and what this representation looks like."
I originally didn't believe that it preserved the Boost Group, let alone the full Lorentz Group. Rob seems to think it represents the full group with the same algebra. I really don't see that at all, but it does rotations and boosts. I'm not sure if it mixes rotations and boosts the right way. Can you show that it doesn't?
"I wish I had a better handle on why that factor in that form is required so that the differential volume element transforms like a tensor"
This is covered very well in Dirac's General Theory of Relativity, which is a great resource because it covers most of the essentials of GR in about 70 pages, without sacrificing clarity (see page 36).
I was a big fan of Dirac's book on quantum mechanics.  Sounds like what others have been saying here.  It is an idea one needs to walk with, by that I mean think about it as I walk around.

One of the things that tossed me for a loop is I have only seen the square root of minus the determinant of a metric tensor in equations for GR, never for anything else.  Yet it does belong for anything that is a Lagrange density, as indicated by a $d^4 x$.  If a calculation is in Cartesian coordinates, there are no worries, the implied factor is unity.  If the coordinates are different, it just gets put in by hand without reference to $\sqrt{-g}$.
Did you pay the $5/day, $9/mo, $59/year Scribd.com fee? Interesting model if it works. The paperback is $22 on Amazon.

I've never actually done an integral in a curved space in GR.  The metric determinant is a scalar density of weight +2. Some metrics have negative signatures and the negative sign is simply there to make the quantity under the square root positive. The square root is there to make it transform like a scalar density of weight +1 and, because the four dimensional volume element is weight -1, there is a cancellation of weights and this produces a scalar.  (This isn't information I have memorized. This is from pages 93-96 of D'Inverno's Introducing Einstein's Relativity.) It is all there to make the thing you're integrating into an ordinary scalar function.

I found that Dirac reference by google search. I don't pay for scribd. I believe only a portion of the book is there. Almost all of the book is also available on google books (but not that section).
Doug!
I think I did it! I think I figured out, for your idea of relating a quaternion to a four-vector, what the multiplication corresponds to in the usual tensor notation.

Can you translate this to nice math so everyone can discuss it easier?

Okay, using David's notation suggestion and Henry's tweak on it, we can always write a four-vector in terms of components with a spacetime basis choice as such:
A = a0 e0 + a1 e1 + a2 e2 + a3 e3
and we can write a quaternion in components with a quaternion basis choice as such
B = b0 q0 + b1 q1 + b2 q2 + b3 q3
Also, in the same spacetime basis, we can write out any tensor in components as such (an example of a rank2):
T = T_ij ei ej

Now, your idea starts with a Cartesian inertial coordinate system, and equates the quaternion basis to the spacetime basis. And then defines in this basis, that the quaternion basis multiply like the standard 1, i, j, and k quaternion basis.

(There is also the issue that this forces you to choose some particular inertial frame where this definition applies. Which one? Who cares for now. Just pick one randomly. Let's denote this special frame as the Q frame, since it is the frame where the q basis is the usual quaternion basis.)

Okay, now with that all specified, we come to the issue that you are struggling to interpret. What is the result of multiplying two four-vectors? Is the first component somehow a Lorentz scalar?

Well, what I finally realized is that equating a quaternion and a four-vector in this manner means any higher tensor is now mapped back onto a rank 1 tensor. Because for example in the Q frame q1 q2 = q3.
So for example, we can always simplify a rank 2 tensor to a rank 1 tensor
T = T_ij ei ej = T_u eu
since a rank 1 tensor is a quaternion in your idea, this means any tensor object (except scalars?) can now be represented by a single quaternion.

This is exactly what your quaternion multiplication is doing, and helps explain what object you are getting out. Your multiplication isn't telling us about the inner-product of two four-vectors (which is what you were hoping with the 0 component), it is actually talking about the outer-product of two four-vectors.

This points out something that I forgot about, we already have a defined means to multiply two four-vectors directly. It is the outer-product. However this usually gives a rank 2 object. Due to your idea though, all higher rank objects can map down to a rank 1 object. This is why what you call the "vector" pieces of your quaternion, when the result of a multiplication, appeared to have two distinct pieces when adding up the terms from the multiplication: a "3 vector" like piece which is really just the T_0i or T_i0 components of a rank 2 tensor, and a "pseudo-vector" part which is really just the T_ij components of a rank 2 tensor.

Unfortunately, this also makes it clear why your idea seemed to be so coordinate system dependent whenever someone tried to be explicit about the basis choices. First, the definition of the multiplication of the basis can only work nicely in the special Q frame (not even other inertial frames as Henry showed). And Second, we can see the inevitable "reduction" rule maps all higher tensors to a rank 1 tensor (a quaternion here). This clarifies the meaning of the object that is the result of a quaternion multiplication, and unfortunately the components don't mean what you intended.

The idea also strongly restricts the possible geometric objects (to basically just rank 0 numbers and rank 1 quaternions). I haven't thought yet about what this restriction means for the idea.

Henry also suggested the alternate interpretation of your idea that maybe the bases weren't literally equated, but only required to transform equivalently. Dropping the strict equivalency gives more freedom and no longer results in all higher rank tensors mapping down to a rank 1 tensor (quaternion). Besides the mapping reduction, the math leads to the same insights as before. Except now, the multiplication would be equivalent to defining that the quaternion multiplication
C = A B
is, in tensor notation,
C^u = Q^u_vw A^v B^w
where in one special frame you define the components of the object Q to match the quaternion multiplication definition.
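
To make the object Q concrete, here is a sketch of its components in the special Q frame, read off from the standard quaternion product rules ($i^2 = j^2 = k^2 = -1$, $ij = k$, and cyclic).  The Latin indices run over 1, 2, 3 with repeated indices summed, and $\epsilon_{ijk}$ is the Levi-Civita symbol; the index placement is an assumption that follows the line above.

\begin{align*} Q^0_{\;\,00} &= 1, \qquad Q^0_{\;\,ij} = -\delta_{ij}, \qquad Q^i_{\;\,0j} = Q^i_{\;\,j0} = \delta_{ij}, \qquad Q^i_{\;\,jk} = \epsilon_{ijk}\\ C^0 &= A^0 B^0 - A^i B^i\\ C^i &= A^0 B^i + A^i B^0 + \epsilon_{ijk} A^j B^k \end{align*}

All other components are zero.  These are just numbers fixed by definition in the Q frame, so nothing here says how they should look in any other frame.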

This results in all the same realization as above, regarding summing over the outer-product of four-vectors (rank 2 pieces). Here, it is the Q^u_vw "field" which represents quaternion multiplication, which contains the very coordinate dependent components in one nice little package. Unlike say the metric components, this object's components change even when sticking with inertial coordinate systems.

So in conclusion: trying to treat a quaternion as a four-vector results in the quaternion multiplication being very coordinate system dependent, and it appears to have no useful geometric meaning beyond some coincidences in the special frame Q where everything is just the result of definition (you get out what you put in).

**crickets**

Anyone?
I thought this would finally help explain why the object that is obtained by multiplying two quaternions treated as four-vectors yields a result containing a lot of tensor pieces added together.

It also, in my opinion, clearly answers the issues regarding claims that the quaternion product gives a Lorentz scalar in one component.

David was right: being clear about what the bases are, and what is meant by coordinate transformation, immediately gives mathematically unavoidable answers. Does this settle the issue?

Doug? Anyone?

I will try and supply part of the answer in my next blog.  I have been thinking about how to do transformations, life without orthonormal coordinates, and how to do the work of a Jacobian without actually using a Jacobian because the matrix representation of a Jacobian is not a quaternion.
"There are a huge number of physics issues one could go into. As an example, these equations bind to particles with integral spin which is good for bosons, but there are quite a few fermions that also participate in gravity."

Where did this come from?
Are you claiming that fermions such as electrons have no gravitational mass according to GR?
More specifically, are you claiming fermions can't contribute to the stress-energy tensor?

That bit of my blog was entirely based on this bit from the wiki page which I recall reading in other places:
In general relativity, the action is usually assumed to be a functional of the metric (and matter fields), and the connection is given by the Levi-Civita connection. The Palatini formulation of general relativity assumes the metric and connection to be independent, and varies with respect to both independently, which makes it possible to include fermionic matter fields with non-integral spin.
If someone else can expand on why this is so, that would be great.
To assuage any doubts, the mass of electrons doesn't somehow violate GR. The issue is regarding coupling to intrinsic spin. No experiment has given insight on this yet. For mathematical reasons, it is expected that allowing torsion will be necessary to include intrinsic spin cleanly. For example, loop quantum gravity is actually trying to quantize the variation of GR that allows torsion.

I am not an expert on the subject, so from my cursory understanding, it is not clear to me why the wiki editor says integral intrinsic spin is okay but non-integral intrinsic spin is not. Especially when considering it at a classical level, it's not clear why the value of intrinsic spin would already be treated differently between some quantized values. So I'd treat that comment cautiously.

Regardless, since there is no experimental guidance at the moment, just ignore those issues unless you want to dive into quantum gravity. I would suggest understanding classical gravity before even considering that.