This is the second part of a ten-part post on the foundation of our understanding of high energy physics, which is Richard Feynman's functional integral. The first part is Action, and the following parts, which will appear at intervals of about a month, are Electromagnetism, Action for Fields, Radiation in an Oven, Matrix Multiplication, The Functional Integral, Gauge Invariance, Photons, and Interactions.

I'm hoping this blog will be fun and useful for everyone with an interest in science, so although I'll pop up a few formulae again, I'll still try to keep them friendly by explaining all the pieces, as in the first part of the post. Please feel free to ask a question in the Comments, if you think anything in the post is unclear.

The clue that led to the discovery of quantum mechanics, whose principles are summarized in Feynman's functional integral, came from the attempted application to electromagnetic radiation of discoveries about heat and temperature. Today I would like to tell you about some of those discoveries.

Around 60 BC, Titus Lucretius Carus suggested in an epic poem, "On the Nature of Things," that matter consists of indivisible atoms moving incessantly in an otherwise empty void. In 1738 Daniel Bernoulli proposed that the pressure and temperature of gases are consequences of the random motions of large numbers of molecules. The theory was not immediately accepted, but around the middle of the nineteenth century, John James Waterston, August Krönig, Rudolf Clausius, and James Clerk Maxwell discovered that the everyday observations that things tend to have a temperature that can increase or decrease, and that the temperatures of adjacent objects tend to change towards a common intermediate value, follow from the random behaviour of large numbers of microscopic objects subject to Newton's laws, in particular the conservation of total energy.

To understand how this happens, it is helpful to know about the rate of change with $x$ of an expression such as $a^x$, which means $a$ to the power $x$, where $a$ is a fixed number greater than 0, and $x$ could for example be time or a position coordinate, measured in units of a fixed amount of time or a fixed distance. $x$ does not have to be a whole number, since any number can be approximated as accurately as desired by a ratio of the form $\frac{p}{q}$, where $p$ and $q$ are whole numbers, and $a^{p/q}$ is the $p$'th power of the $q$'th root of $a$. The rate of change with $x$ of $a^x$ at $x = 0$, which we can write as $\left(\frac{d}{dx}a^x\right)_{x=0}$, is called the natural logarithm of $a$, and usually written as $\ln(a)$. It was studied in the sixteenth century by John Napier. I explained the meaning of an expression like $\frac{d}{dx}$ in the first part of the post, here.

For example, this diagram shows in blue, for in the range to 1. The red line is the straight line with the same rate of change with as at , and from its value at , we can read from the graph that . The symbol means, "approximately equal to."

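We have:

$$\frac{d}{dx}a^x = \frac{a^{x+dx} - a^x}{dx} = a^x\,\frac{a^{dx} - a^0}{dx} = a^x\left(\frac{d}{dx}a^x\right)_{x=0} = a^x\ln(a),$$

since $a^0 = 1$ for all $a$ greater than 0. Thus $\ln(1) = 0$, since $1^x = 1$ for all $x$. Leibniz's $d$ indicates that the formula is to be taken in the limit where the size of the tiny quantity $dx$ tends to 0.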
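Here is a minimal sketch of how the rate of change of a power can be checked numerically. It assumes the base is written `a` and the exponent `x`, and it replaces the limit of a tiny step by the fixed small step `dx = 1e-6`, so the results are only approximate. The function names are illustrative.

```python
import math

def slope_at_zero(a: float, dx: float = 1e-6) -> float:
    """Estimate the rate of change of a**x at x = 0 with a small step dx."""
    return (a ** dx - 1.0) / dx

def slope_at(a: float, x: float, dx: float = 1e-6) -> float:
    """Estimate the rate of change of a**x at a general x with a small step dx."""
    return (a ** (x + dx) - a ** x) / dx

for a in (2.0, 10.0):
    # The estimate should be close to the natural logarithm of a.
    print(a, slope_at_zero(a), math.log(a))

# At a general x the slope should be close to a**x times the slope at 0.
print(slope_at(2.0, 1.5), 2.0 ** 1.5 * slope_at_zero(2.0))
```

Making `dx` smaller improves the estimate only up to a point, because rounding errors in the subtraction eventually dominate.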

We now find:

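The number $e$ such that $\ln(e) = 1$ is sometimes called Napier's number. From the above formula, we have:

$$\frac{d}{dx}e^x = e^x.$$

For any fixed number $b$, we have:

$$\frac{d}{dx}e^{bx} = \frac{e^{bx + b\,dx} - e^{bx}}{dx} = b\,e^{bx}\,\frac{e^{b\,dx} - 1}{b\,dx} = b\,e^{bx}\ln(e) = b\,e^{bx},$$

where the second from last equality follows because the limit of $\frac{e^{b\,dx} - 1}{b\,dx}$ as the tiny quantity $dx$ tends to 0 is the same as the limit of $\frac{e^{dx} - 1}{dx}$ as the tiny quantity $dx$ tends to 0, because for fixed $b$, $b\,dx$ is also a tiny quantity that tends to 0 as $dx$ tends to 0.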

From the above equation with $b$ chosen to be $\ln(a)$, we find that

$$e^{x\ln(a)} = a^x$$

for all $x$, because both these expressions are equal to 1 for $x = 0$, and they both satisfy the equation $\frac{d}{dx}f(x) = \ln(a)f(x)$. This equation fixes $f(x)$ for all $x$ once $f(0)$ is given, because $f(x)$ can be calculated from $f(0)$ as accurately as desired, by dividing up the interval from 0 to $x$ into a great number of sufficiently tiny intervals, and calculating the approximate value of $f$ at the end of each tiny interval from its approximate value at the start of that interval, by using $\frac{d}{dx}f(x) = \ln(a)f(x)$.
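The step-by-step procedure described above can be sketched in a few lines. The sketch assumes the equation has the form "rate of change of f equals k times f" with f equal to 1 at the start, where `k` is an illustrative name for the fixed multiplier, and it uses a large but finite number of steps, so the answer is approximate.

```python
import math

def euler_exp(k: float, x: float, steps: int = 100_000) -> float:
    """Approximate f(x), where f(0) = 1 and df/dx = k * f, by many tiny steps."""
    h = x / steps          # width of each tiny interval
    f = 1.0                # value at the start of the first interval
    for _ in range(steps):
        f += k * f * h     # value at the end of the interval from the value at its start
    return f

# With k = 1 and x = 1 the result should be close to Napier's number.
print(euler_exp(1.0, 1.0), math.e)

# With k = ln(2) the result should be close to 2**x.
print(euler_exp(math.log(2.0), 3.0), 2.0 ** 3.0)
```

Increasing `steps` brings the result closer to the exact value, which is the sense in which the equation and the starting value fix the function.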

From the above equation at $x = 1$, we find that

$$e^{\ln(a)} = a$$

for every number $a$ greater than 0. Thus for any numbers $a$ and $b$, both greater than 0, we have $e^{\ln(ab)} = ab = e^{\ln(a)}e^{\ln(b)} = e^{\ln(a) + \ln(b)}$. Thus:

$$\ln(ab) = \ln(a) + \ln(b).$$

If $y = e^x$, so that the symbol $y$, like the expression $y(x)$, represents the collection of data that gives the value of the $x$-dependent quantity $y$ at each value of $x$, then from this formula above, we have $\frac{dy}{dx} = y$, and from this formula above, we have $x = \ln(y)$, for all values of $y$ greater than 0. Thus $\frac{dx}{dy} = \frac{1}{y}$, for all $y$ greater than 0, so:

$$\frac{d}{dy}\ln(y) = \frac{1}{y}$$

for all values of $y$ greater than 0. We can treat $\frac{dy}{dx}$ as an ordinary ratio when taking its reciprocal, as I did here, because the rate of change of $x$ with $y$ is the reciprocal of the rate of change of $y$ with $x$.

To calculate the value of Napier's number $e$, we observe first that for all positive whole numbers $n$:

$$\frac{d}{dx}x^n = n\,x^{n-1}.$$

For $n = 1$, $\frac{d}{dx}x = 1$, so the above formula is true for $n = 1$, and if the formula is established for $n$, then from Leibniz's rule for the rate of change of a product, which we obtained in the first part of the post here, we have:

$$\frac{d}{dx}x^{n+1} = \frac{d}{dx}\left(x\,x^n\right) = x^n + x\,n\,x^{n-1} = (n+1)\,x^n.$$

Thus if the formula is established for $n$ then it is established for $n + 1$, so it is established for all the positive whole numbers in succession.

Therefore:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots = \sum_{n=0}^{\infty}\frac{x^n}{n!},$$

where $0!$ is defined to be 1, and for each positive whole number $n$, $n!$ is defined to be the product of all the whole numbers from 1 to $n$. The exclamation mark is usually read as "factorial". The dots $\ldots$ mean that the sum continues in accordance with the pattern shown by the terms before the $\ldots$. The symbol $\Sigma$ is the Greek letter Sigma, and is called the summation sign. I explained its meaning in the first part of the post, here. The symbol $\infty$ above the $\Sigma$, which is read as "infinity", means that the sum is unending.

The reason the above formula for $e^x$ is true is that the sum in the right-hand side of the formula is equal to 1 for $x = 0$, and it satisfies the same equation $\frac{d}{dx}f(x) = f(x)$ as $e^x$ does. So for the same reason as we discussed above, the sum in the right-hand side of the formula is equal to $e^x$, for all $x$. The reason the expression $\sum_{n=0}^{\infty}\frac{x^n}{n!}$ satisfies the equation $\frac{d}{dx}f(x) = f(x)$ is that $\frac{d}{dx}$ on the first term in the sum gives 0, and $\frac{d}{dx}$ on each term in the sum after the first gives the preceding term in the sum, since from our observation above, $\frac{d}{dx}\frac{x^n}{n!} = \frac{x^{n-1}}{(n-1)!}$.

This argument that the expression $\sum_{n=0}^{\infty}\frac{x^n}{n!}$ satisfies the equation $\frac{d}{dx}f(x) = f(x)$ assumes that the endless sum tends to a finite limiting value as more and more terms are added, no matter how large the magnitude $|x|$ of $x$ is. In fact, the expression $\frac{x^n}{n!}$ increases in magnitude with increasing $n$ for $n < |x|$, and then starts decreasing in magnitude more and more rapidly with increasing $n$, so that the endless sum always does tend to a finite limiting value, no matter how large $|x|$ is. If $n$ is larger than $2|x|$, then the endless sum of all the terms from $\frac{x^n}{n!}$ onwards does not exceed $\frac{|x|^n}{n!}\left(1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots\right) = \frac{2|x|^n}{n!}$ in magnitude. The endless sum $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots$ approaches 2 when it is continued without end, because each successive term halves the difference between 2 and the sum of the terms up to that point, as shown in this diagram.

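The above formula expressing $e^x$ as a sum of powers of $x$ is an example of a "Taylor series," named after Brook Taylor. For $x = 1$, it gives:

$$e = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \frac{1}{720} + \ldots \simeq 2.718.$$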

The sum of the first 7 terms is sufficient to obtain the result to 3 decimal places.
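As a quick check of the claim about the first 7 terms, here is a small sketch that adds up the terms exactly with fractions. The function name `partial_sum` is illustrative, and the sketch assumes the terms are x to the power n divided by n factorial, starting from n = 0.

```python
from fractions import Fraction
from math import factorial, e

def partial_sum(x, terms):
    """Exact sum of x**n / n! for n = 0, 1, ..., terms - 1."""
    return sum(Fraction(x) ** n / factorial(n) for n in range(terms))

for terms in range(1, 9):
    s = partial_sum(1, terms)
    print(terms, s, round(float(s), 3))

print("Napier's number from the math module:", e)

# Tail check for x = 3: the terms from n = 7 onwards, with 7 larger than 2 * 3,
# should not exceed twice the n = 7 term.
tail = e ** 3 - float(partial_sum(3, 7))
bound = 2 * 3 ** 7 / factorial(7)
print(tail, bound)
```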

Let's now consider again, as in the first part of the post, here, the example of a collection of objects, such that each object behaves approximately as though its mass is concentrated at a single point, the objects are moving slowly compared to the speed of light, and the forces between the objects arise from their potential energy , which depends on their positions but not on their motions. We'll continue to assume that their motions are governed by Newton's laws, and thus by de Maupertuis's principle of stationary action, which I explained in the first part of the post, here, and we'll now assume that the objects are microscopic and their number is very large, so they could be atoms in solids, liquids, gases, or living things. We'll use the same notation as in the first part of the post, here.

We'll use Cartesian coordinates for the positions of the objects, as we did when we derived Newton's second law of motion from de Maupertuis's principle, in the first part of the post, here, and we'll now assume that the potential energy depends only on the relative positions of the objects, as in the example of the gravitational potential energy, so that the value of is unaltered if the positions of the objects are shifted by a common displacement , so that , for all , . The symbol means "less than or equal to."

By adding up Newton's second law of motion for the objects, which we obtained from de Maupertuis's principle in the first part of the post, here, we find:

From the formula we derived in the first part of the post, here, for the change of when is replaced by with small, so that the positions of the objects are shifted by arbitrary small displacements , we find that if we choose all the to be the same small displacement , so that for all , , then:

in consequence of the assumption that depends only on the relative positions of the objects. This is true for an arbitrary small displacement , so it's true in particular if , where now represents an arbitrary small number rather than a whole collection of data as before, is any of the numbers 1, 2, or 3, and is the Kronecker delta symbol that I mentioned in the first part of the post, here, which is defined to be 1 if its two indices are equal, and 0 otherwise. The error of the above formula tends to 0 more rapidly than in proportion to as approaches 0, so we have:

for all the relevant values 1, 2, and 3 of . Combining this result with the formula we obtained above by adding up Newton's law of motion for the objects, we find:

The expression $\frac{dx_I}{dt}$ represents the velocity of the $I$'th object, and the product of an object's mass and its velocity is called its momentum. I shall let $p$ represent the collection of data that gives the momenta of all the objects at each moment in time, so that $p_{Ia}(t) = m_I\frac{dx_{Ia}(t)}{dt}$ for all $1 \leq I \leq N$, $1 \leq a \leq 3$, and times $t$. The above result can then be written:

$$\frac{d}{dt}\sum_{I=1}^{N}p_{Ia} = 0,$$

which is usually referred to as the conservation of total momentum.
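To see the conservation of total momentum at work, here is a minimal sketch of three objects moving along a line. It assumes a potential energy that depends only on relative positions, namely a spring-like term `k / 2 * (x_i - x_j)**2` for each pair, which is my choice for illustration and not a potential from the post. The motion is advanced in small time steps, so the individual momenta are approximate, but the forces cancel in pairs at every step.

```python
def simulate(positions, momenta, masses, k=1.0, dt=1e-3, steps=5000):
    """Advance objects on a line under pairwise forces from V = sum k/2 (x_i - x_j)^2."""
    x = list(positions)
    p = list(momenta)
    n = len(x)
    for _ in range(steps):
        force = [0.0] * n
        for i in range(n):
            for j in range(i + 1, n):
                f = -k * (x[i] - x[j])   # force on object i due to object j
                force[i] += f
                force[j] -= f            # equal and opposite force on object j
        for i in range(n):
            p[i] += force[i] * dt        # Newton's second law over a tiny time
            x[i] += p[i] / masses[i] * dt
    return x, p

start_x = [0.0, 1.0, 3.0]
start_p = [0.5, -0.2, -0.3]
masses = [1.0, 2.0, 3.0]
end_x, end_p = simulate(start_x, start_p, masses)
print("individual momenta:", start_p, "->", end_p)
print("total momentum:", sum(start_p), "->", sum(end_p))
```

The individual momenta change, while their sum stays the same up to rounding.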

If the positions and momenta of all the objects are specified at one particular time, then their values at every other time are determined by Newton's second law of motion, which we obtained in the first part of the post, here, from de Maupertuis's principle. We'll now divide the range of the possible positions and momenta of the objects into equal size "bins", and ask what the most likely number of objects in each bin will be, if the objects are randomly distributed among the bins, subject to the total energy of the objects having a fixed value $E$.

We'll assume that each bin is sufficiently small that we can treat the positions and momenta of objects in the same bin as approximately equal to one another, but also sufficiently large that the number of objects in a typical bin will be large. For this to be possible, we'll assume that the total number of objects, , is very large. This is reasonable for things in the everyday world, since the number of atoms in a kilogram of matter is in the range from about to .

We'll allow for the possibility that there could be a number of different types of object, such that the masses and interactions of objects of the same type are either identical or very similar to one another, so that the kinetic energy and the potential energy are either exactly or approximately unaltered if the positions and momenta of two objects of the same type are swapped. Objects of different types could be different types of atom, or atoms of the same type in different situations. For example we'll treat two oxygen atoms as different types of object if they form parts of gas molecules contained in separate containers, or if one is part of a gas molecule and the other is part of the wall of a glass container. We'll assume that the number of objects of each different type is very large.

We'll assume that the total momentum of the objects is 0, so that the position of their centre of mass is independent of time, and we'll assume that if any of the objects are parts of liquid or gas molecules, then some of the other objects form solid containers that prevent the liquids or gases from spreading without limit. The number of relevant bins is therefore finite, because the position coordinates of all the objects are bounded, and the momenta of the objects are also bounded, because we assumed above that the objects are moving slowly compared to the speed of light. We'll denote the number of relevant bins by $B$.

I shall let $N$ represent the collection of data that gives the total number of objects of each type, so that if $A$ represents one of the different types of object, then $N_A$ is the total number of objects of type $A$, and I shall let $n$ represent the collection of data that gives the number of objects of each type in each of the bins into which the range of possible positions and momenta of the objects has been divided, so that if $i$ represents one of the bins, then $n_{Ai}$ is the number of objects of type $A$ in bin $i$.

The objects can be distinguished from one another even if they are of the same type and identical to one another, because we can trace their motions back to a particular time, and "label" identical objects by the positions and momenta they had at that time. The number of different assignments of the $N_A$ objects of type $A$ to the $B$ bins is $B^{N_A}$, because each of the $N_A$ objects can be assigned independently to any of the $B$ bins, and of these assignments, the number such that $n_{A1}$ of the objects of type $A$ are in bin 1, $n_{A2}$ of them are in bin 2, and so on, is:

$$\frac{N_A!}{n_{A1}!\,n_{A2}!\cdots n_{AB}!},$$

where $n!$ was defined above, for non-negative whole numbers $n$, to be 1 times the product of all the whole numbers not less than 1 and not greater than $n$.

To understand why the above formula gives the number of different assignments of the $N_A$ objects of type $A$ to the $B$ bins, such that for each whole number $i$ in the range 1 to $B$, the number of objects of type $A$ in bin $i$ is $n_{Ai}$, we note first that the number of different ways of putting $n$ distinguishable objects in $n$ distinguishable places, such that exactly one object goes to each place, is $n!$, because we can put the first object in any of the $n$ places, the second object in any of the remaining $n - 1$ places, and so on. So if there were $n_{Ai}$ distinguishable places in bin $i$, for each $i$ in the range 1 to $B$, then the number of different ways of putting the $N_A$ objects in these $N_A$ distinct places would be $N_A!$. This overcounts the number of different assignments of the objects to the bins by a factor $n_{A1}!\,n_{A2}!\cdots n_{AB}!$, because we can divide up the $N_A!$ arrangements into classes, such that arrangements are in the same class if they only differ by permuting objects within bins. Each class then corresponds to a different assignment of the objects to the bins. The number of the arrangements in each class is $n_{A1}!\,n_{A2}!\cdots n_{AB}!$, so the number of different classes is $\frac{N_A!}{n_{A1}!\,n_{A2}!\cdots n_{AB}!}$.
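The counting argument can be checked by brute force for a small example. The sketch below lists every way of assigning a few labelled objects of one type to a few bins, groups the assignments by how many objects land in each bin, and compares each group's size with the factorial formula. The function names and the choice of 5 objects and 3 bins are illustrative.

```python
from collections import Counter
from itertools import product
from math import factorial

def multinomial(counts):
    """Factorial of the total, divided by the factorial of each bin count."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def brute_force(num_objects, num_bins):
    """Count the assignments of labelled objects to bins for each occupancy pattern."""
    tally = Counter()
    for assignment in product(range(num_bins), repeat=num_objects):
        occupancy = tuple(assignment.count(b) for b in range(num_bins))
        tally[occupancy] += 1
    return tally

tally = brute_force(num_objects=5, num_bins=3)
for occupancy in sorted(tally):
    print(occupancy, tally[occupancy], multinomial(occupancy))
print("total assignments:", sum(tally.values()), "compared with", 3 ** 5)
```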

The number of different assignments of all the objects to the $B$ bins is $B$ raised to the power of the total number of objects, and of these, the number such that $n_{Ai}$ of the objects of type $A$ are in bin $i$, for all $A$ and $i$, which I shall denote by $W(n)$, is the product of the above number over $A$, which we can write as:

$$W(n) = \prod_A\frac{N_A!}{n_{A1}!\,n_{A2}!\cdots n_{AB}!}.$$

The symbol $\Pi$ is the upper-case Greek letter Pi, and indicates a product of what follows it. It works in the same way as I described in the first part of the post, here, for $\Sigma$, except that instead of forming the sum of the expression that follows it for all the specified values of the specified index, we form the product. If no specific range of the index is specified, then the product is over all relevant values of the index, and thus here over all the different types of object.

The total energy of the objects, when the numbers of objects of the different types in the different bins are given by the collection of data $n$, is approximately:

$$\sum_{A,i}n_{Ai}E_{Ai},$$

where $E_{Ai}$ is the energy of an object of type $A$ in the centre of bin $i$. If we randomly drop the objects into the bins, and discard the result unless the total energy differs from $E$ by at most a fixed small amount, then the probability that the numbers of objects of the different types in the different bins are given by $n$ is $W(n)$, the number of assignments that give the numbers $n$, divided by the sum of $W(n')$ over all $n'$ such that $\sum_{A,i}n'_{Ai}E_{Ai}$ is close enough to $E$. Thus if the objects are randomly distributed among the bins, subject to the total energy of the objects having a fixed value $E$, then the most likely number of objects of each type in each bin will be given by the distribution $n$ for which $W(n)$ reaches its maximum value, among all the distributions for which $\sum_{A,i}n_{Ai}E_{Ai}$ is approximately equal to $E$.

To find the distribution for which reaches its maximum value, subject to , we'll use the observation that the slope of a smooth hill is zero at the top of the hill. Thus we'll look for the distribution such that the rate of change of with each of the numbers would be 0, if it were not for the requirement that . For convenience we'll do the calculation for rather than for , so that the product of factors in becomes the sum of the natural logarithms of those factors, due to the result we found above. This will give the same result for the most likely distribution , because increases with increasing for all greater than 0, due to the result we found above, so that the that gives the largest value of will also be the that gives the largest value of .

From the formula above for , we have:

From the above definition of , for non-negative whole numbers , we have:

For very large , we can therefore write:

where is the slope of a smooth curve that fits the values of at the whole numbers . The error of the above approximation changes in proportion to as increases, and from above, we have , for all , where the symbol means "greater than". Thus , so the error of the above approximation decreases in proportion to as increases.

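And from $\frac{d}{dn}\ln(n) = \frac{1}{n}$ together with Leibniz's rule for the rate of change of a product, which we obtained in the first part of the post, here, we have:

$$\frac{d}{dn}\left(n\ln(n) - n\right) = \ln(n) + n\cdot\frac{1}{n} - 1 = \ln(n).$$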

Thus if we use the above result that even for down to 0, then from the result that the integral of the rate of change is equal to the net change, which we obtained in the first part of the post, here, we have:

since from above, and from above, . The above approximation for is in error by an amount that increases slowly for large , but its relative error tends to 0 for large , so it is accurate enough to use for finding the distribution for which reaches its maximum value, since the numbers will all increase in proportion to the total number of objects , which we have assumed to be very large. We can also use the simpler approximation:

whose relative error also tends to 0 for large .
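Here is a small numerical sketch of how such an approximation behaves. It assumes the approximation in question is the common form "n times ln(n), minus n" for the natural logarithm of n factorial, which may differ in detail from the formulae in the post, and it uses `math.lgamma(n + 1)` as the reference value for the logarithm of n factorial.

```python
import math

def ln_factorial(n: int) -> float:
    """Natural logarithm of n!, computed as lgamma(n + 1)."""
    return math.lgamma(n + 1)

def simple_approximation(n: int) -> float:
    """The assumed approximation n ln(n) - n."""
    return n * math.log(n) - n

def absolute_error(n: int) -> float:
    return ln_factorial(n) - simple_approximation(n)

def relative_error(n: int) -> float:
    return absolute_error(n) / ln_factorial(n)

for n in (10, 1000, 10 ** 6):
    print(n, ln_factorial(n), simple_approximation(n),
          absolute_error(n), relative_error(n))
```

The absolute error grows slowly with n while the relative error shrinks, which is why the approximation is good enough when the numbers in the bins are very large.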

Thus we have:

where the second line follows from the fact that , for each type of object , and the third line follows from the relation , which we obtained above, for any numbers and , both greater than 0. Thus in the above approximation, depends only on the ratios and .

It is convenient to think of the ratios as coordinates in a "space", which I shall call the space of bin fractions, since is the fraction of the total number of objects which are objects of type in bin . The numbers are restricted to be whole numbers, but for fixed values of the ratios , these numbers will be proportional to , which we have assumed is very large. Thus the ratios only change by tiny amounts when change by , where the symbol means "plus or minus," so since depends smoothly on for all numbers , we can think of the coordinates as effectively continuous. If the number of types of object is , then the space of bin fractions has dimensions, since a point in this space is specified by the numbers .

The equation imposes one relation among the coordinates of the space of bin fractions, so it defines a -dimensional "surface" in this space. We are looking for the point on this surface at which reaches the largest value it takes anywhere on this surface. If we think of as the height of a smooth "hill", then since the slope of a smooth hill is 0 in each direction at the top of the hill, the rate of change of is 0 in each direction along the surface, at the point on the surface where reaches its maximum value. However the rate of change of in directions that are not along the surface does not have to be 0 at that point. So we are looking for a point on the surface such that the rate of change in every direction, whether along the surface or not, is a multiple of the rate of change of in that direction, since the rate of change of along the surface is 0.

We can do that by looking for a point for which an expression

reaches its maximum value, where , which is the Greek letter beta, is a fixed number whose value will be chosen later. For requiring that the rate of change of the above expression is 0 in every direction gives the equations:

for all and , so that the rate of change of in every direction is a multiple of the rate of change of in that direction. As I explained in the first part of the post, here, the symbol is an alternative notation for Leibniz's , that is usually used to express the rate of change of a quantity that depends on a number of quantities that can vary continuously. The points that solve the above equation for different values of will form a "line" through the space of bin fractions, such that different values of correspond to different points on the line, and after we have found this line, we can find the point where it intersects the surface defined by the equation . The number is called a "Lagrange multiplier", after Joseph Louis Lagrange.

The equations , one for each type of object , similarly each define a -dimensional surface in the space of bin fractions, and we'll take these equations into account by using additional Lagrange multipliers , one for each of these equations, where is the Greek letter gamma. So we'll look for a point in the space of bin fractions for which an expression:

reaches its maximum value, where , and and the are fixed numbers whose values will be chosen later. The second line here follows from the formula for , as above, and the formula for we obtained above. From a calculation similar to the one above, we have:

so:

If this is 0 for all and all at a point in the space of bin fractions, then the rate of change of will be 0 in any direction where the rates of change of and all the quantities are 0. From the result we found above, this expression for is 0 when:

where Napier's number was defined above, and its value was approximately calculated above. From the result above, is for all , so increases as increases, for all , so from the formula above, decreases as increases, for all values of greater than 0. Thus for , is positive when is less than the above value and negative when is greater than the above value, so since the terms in that depend on any one are independent of all the other , the maximum value of in the region where all are is attained when the value of each is given by the above formula.

And when the value of each is given by the above formula, the rate of change of , in any direction in the space of bin fractions, is times the rate of change of in that direction, plus the sum, over the object types , of times the rate of change of in that direction.

Remembering that and the are fixed numbers whose values will be chosen later, we'll now define the value of , and the numbers , to be such that the equation , and the equations , are all satisfied at the point where the value of each is given by the above formula.

For fixed values of , and the numbers , each of these equations defines a -dimensional surface in the space of bin fractions, and the intersection of these surfaces defines a -dimensional "surface" in the space of bin fractions. This -dimensional surface is the surface on which the equations and are all satisfied, so in any direction tangential to this -dimensional surface, the rates of change of and are all 0. Thus at the point on this -dimensional surface where the value of each is given by the above formula, the rate of change of , in any direction tangential to this -dimensional surface, is 0.

I'll refer to this -dimensional surface in the space of bin fractions as , since the fixed values of and the numbers on it are determined by the fixed values of and the . Since the maximum value of in the region where all are is attained when the value of each is given by the above formula, and the values of and the are all fixed on , the maximum value of on , in the region where all are , is attained when the value of each is given by the above formula.

From the formula for as above, the fixed values of the quantities and the are determined in terms of the quantities and the by the formulae:

So for given values of and the that are not impossible, due for example to some of the being negative or greater than 1, or the not adding up to 1, there will be values of and the , typically uniquely determined, for which and the , as determined by the above formulae, have the given values. These are the values of and the which we choose, when the values of and the are given.

From the last formula above:

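so from the formula above:

$$n_{Ai} = N_A\,\frac{e^{-\beta E_{Ai}}}{\sum_j e^{-\beta E_{Aj}}},$$

where $n_{Ai}$ is the most likely number of objects of type $A$ in bin $i$, $N_A$ is the total number of objects of type $A$, and $E_{Ai}$ is the energy of an object of type $A$ at the centre of bin $i$.

From above, Napier's number $e$ has the value $e \simeq 2.718$, so if $\beta$ was negative, then for a given type of object $A$, $n_{Ai}$ would be larger for bins for which the energy $E_{Ai}$ at their centres is larger, and if $\beta$ was 0, $n_{Ai}$ would be the same for all bins, no matter how large the energy at their centres. In either of these cases, there would be no justification for our assumption that the objects are all moving slowly compared to the speed of light, so I'll assume that $\beta > 0$.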
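The claim that this exponential distribution is the most likely one can be tested on a toy example. The sketch below assumes a single type of object, three bins with energies 0, 1 and 2 in arbitrary units, and the large-number approximation in which the logarithm of the number of assignments, per object, is minus the sum over bins of "fraction times ln(fraction)". All names and numbers are illustrative. Moving the bin fractions along the direction (1, -2, 1) leaves both the total number of objects and the total energy unchanged, so it is an allowed change.

```python
import math

def boltzmann_fractions(beta, energies):
    """Bin fractions proportional to exp(-beta * energy)."""
    weights = [math.exp(-beta * E) for E in energies]
    total = sum(weights)
    return [w / total for w in weights]

def log_count_per_object(fractions):
    """Large-number approximation to ln(number of assignments) / (number of objects)."""
    return -sum(f * math.log(f) for f in fractions if f > 0)

energies = [0.0, 1.0, 2.0]
base = boltzmann_fractions(1.0, energies)
direction = [1.0, -2.0, 1.0]   # keeps the sum of fractions and the mean energy fixed

for t in (-0.01, 0.0, 0.01):
    shifted = [f + t * d for f, d in zip(base, direction)]
    mean_energy = sum(f * E for f, E in zip(shifted, energies))
    print(t, sum(shifted), mean_energy, log_count_per_object(shifted))
```

The printed value in the last column should be largest for the unshifted fractions.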

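From the above formula for $n_{Ai}$, we have:

$$\sum_i n_{Ai}E_{Ai} = N_A\bar{E}_A,$$

where:

$$\bar{E}_A = \frac{\sum_i E_{Ai}\,e^{-\beta E_{Ai}}}{\sum_i e^{-\beta E_{Ai}}}$$

is the average energy of an object of type $A$.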

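We observe that $\bar{E}_A$ is a weighted average of the energies $E_{Ai}$ of the objects of type $A$ at the bin centres, such that the relative weights of larger $E_{Ai}$ decrease as $\beta$ increases, so we expect that $\bar{E}_A$ will decrease as $\beta$ increases. To check this, we note that by a rearrangement of the above formula:

$$\bar{E}_A\sum_i e^{-\beta E_{Ai}} = \sum_i E_{Ai}\,e^{-\beta E_{Ai}}.$$

So from Leibniz's rule for the rate of change of a product, which we obtained in the first part of the post, here, we have:

$$\frac{d\bar{E}_A}{d\beta}\sum_i e^{-\beta E_{Ai}} + \bar{E}_A\frac{d}{d\beta}\sum_i e^{-\beta E_{Ai}} = \frac{d}{d\beta}\sum_i E_{Ai}\,e^{-\beta E_{Ai}}.$$

So from the result above, with the fixed number $b$ taken as $-E_{Ai}$, and $x$ taken as $\beta$, we have:

$$\frac{d\bar{E}_A}{d\beta}\sum_i e^{-\beta E_{Ai}} - \bar{E}_A\sum_i E_{Ai}\,e^{-\beta E_{Ai}} = -\sum_i E_{Ai}^2\,e^{-\beta E_{Ai}}.$$

From a rearrangement of this formula, we have:

$$\frac{d\bar{E}_A}{d\beta}\sum_i e^{-\beta E_{Ai}} = -\sum_i E_{Ai}\left(E_{Ai} - \bar{E}_A\right)e^{-\beta E_{Ai}}.$$

From the formula for $\bar{E}_A$ above, we have:

$$0 = \sum_i\bar{E}_A\left(E_{Ai} - \bar{E}_A\right)e^{-\beta E_{Ai}}.$$

From the sum of this formula and the previous one, we have:

$$\frac{d\bar{E}_A}{d\beta}\sum_i e^{-\beta E_{Ai}} = -\sum_i\left(E_{Ai} - \bar{E}_A\right)^2e^{-\beta E_{Ai}},$$

which confirms our expectation that $\frac{d\bar{E}_A}{d\beta} \leq 0$, and shows that $\frac{d\bar{E}_A}{d\beta} < 0$ for finite $\beta$, unless $E_{Ai}$ has the same value for all bins $i$.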
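This result can be checked numerically. The sketch below assumes the average energy is the exponentially weighted average of the bin-centre energies, with weights exp(-beta times energy), and compares a finite-difference estimate of its rate of change with beta against minus the weighted spread of the energies about the average. The energies and the value of beta are made up for illustration.

```python
import math

def average_energy(beta, energies):
    """Average of the energies with weights exp(-beta * energy)."""
    weights = [math.exp(-beta * E) for E in energies]
    return sum(E * w for E, w in zip(energies, weights)) / sum(weights)

def weighted_spread(beta, energies):
    """Weighted average of the squared difference between each energy and the average."""
    weights = [math.exp(-beta * E) for E in energies]
    mean = average_energy(beta, energies)
    return sum((E - mean) ** 2 * w for E, w in zip(energies, weights)) / sum(weights)

def rate_of_change(beta, energies, h=1e-5):
    """Finite-difference estimate of the rate of change of the average energy with beta."""
    return (average_energy(beta + h, energies) - average_energy(beta - h, energies)) / (2 * h)

energies = [0.0, 1.0, 2.0, 5.0]
beta = 0.7
print(rate_of_change(beta, energies), -weighted_spread(beta, energies))

# If every bin has the same energy, the average energy does not change with beta.
print(rate_of_change(beta, [3.0, 3.0, 3.0]))
```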

From the formula above for in terms of the , we have:

From above, we are considering objects as being of different types if they are different types of atom, or atoms of the same type in different situations. For example, we are treating two oxygen atoms as different types of object if they form parts of gas molecules contained in separate containers, or if one is part of a gas molecule and the other is part of the wall of a glass container. And from above, is the number of different assignments of the objects to the bins, such that the number of objects of type in bin , for all and , is . Properties of the system such as the pressures of gases in separate containers depend only on the numbers , and not on the details of which objects are in which bins, other than through the numbers .

If each different assignment of the objects to the bins, consistent with the given total energy of the system, is equally likely, then as we observed above, the most likely values of the numbers will be those for which reaches its maximum value, consistent with the given total energy . If the numbers initially differ from these values, then over the course of time, we expect them to tend towards these values. The reason for this is that we have assumed that the total energy can be expressed as a sum of the energies of the individual objects. There will be small corrections to this assumption, due for example to interactions between gas molecules in the same container, or small mutual interactions between atoms vibrating near the surfaces of different containers that are touching one another. These interactions will occur randomly and can change the numbers by small amounts such as , so their net effect is that the numbers will drift towards their most likely values.

It's convenient, now, to change the meaning of , which I defined above to be the average energy of an object of type , to be the total energy of the objects of type , instead. The total energy of the objects of type depends on the numbers , so a drift of these numbers with time can result in a net transfer of energy from one type of object to another, while the total energy remains constant. If two systems, each of which might contain a number of different types of object, are initially separated from one another, with total energies and and initial values and of , and are brought into contact with one another, such that neither system exerts any mechanical, electromagnetic, or gravitational force on the other, but the numbers for each system can drift due to random microscopic interactions between parts of the two systems as above, for example where containers of gas that were initially separated are now touching one another, then the numbers for each system will drift towards values corresponding to a common final value of for both systems, which is the value for which for the combined system is maximized, when the total energy of the combined system is .

If the initial values and of are such that , then the final common value of cannot be such that , for by the result above, that would mean that the final energies and of the two systems satisfy and , in contradiction with the conservation of the total energy of the combined system, which we found above follows from Newton's laws or de Maupertuis's principle, and which implies that . And similarly, cannot be such that , for by the result above, that would imply and , which again contradicts the conservation of the total energy of the combined system. Thus we must have , so if , then .

If , and each of the two systems contains at least one type of object for which has different values for at least two different bins, then from the result above, and , so , and and , so the drift of the numbers to their final values results in a net transfer of energy from the second system to the first. The values of , , and are determined by the requirement that , so that .
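The argument above can be illustrated with two toy systems. The sketch assumes each system is a number of objects together with a short list of bin-centre energies, that each system's most likely energy is its number of objects times the exponentially weighted average energy, and that the first system starts with the larger value of beta. The numbers, the names, and the use of bisection to find the common final beta are all my choices for illustration.

```python
import math

def average_energy(beta, energies):
    """Average of the energies with weights exp(-beta * energy)."""
    weights = [math.exp(-beta * E) for E in energies]
    return sum(E * w for E, w in zip(energies, weights)) / sum(weights)

def system_energy(beta, system):
    count, levels = system
    return count * average_energy(beta, levels)

def common_beta(systems, total_energy, lo=1e-6, hi=50.0, iterations=200):
    """Bisect for the beta at which the systems' energies add up to total_energy."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(system_energy(mid, s) for s in systems) > total_energy:
            lo = mid   # too much energy at this beta, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

system_1 = (1000, [0.0, 1.0, 2.0, 3.0])
system_2 = (2000, [0.0, 0.5, 1.5])
beta_1, beta_2 = 2.0, 0.5          # system 1 starts with the larger beta

E1 = system_energy(beta_1, system_1)
E2 = system_energy(beta_2, system_2)
beta_f = common_beta([system_1, system_2], E1 + E2)
E1_final = system_energy(beta_f, system_1)
E2_final = system_energy(beta_f, system_2)

print("final beta:", beta_f)
print("system 1 energy:", E1, "->", E1_final)
print("system 2 energy:", E2, "->", E2_final)
```

The final beta lies between the two starting values, and energy moves from the system that started with the smaller beta to the one that started with the larger beta, while the total stays fixed.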

These results show that has the basic observed properties of temperature, except that increases where temperature decreases, and conversely. To determine the relation between and temperature, we'll consider the example of an ideal gas, which is a collection of randomly moving non-interacting molecules of mass , enclosed in a container. In accordance with our assumptions above, we'll assume that each molecule behaves approximately as though its mass is concentrated at a single point.

We'll take the container of the gas to be a box whose edges are aligned with the Cartesian coordinate directions, such that the interior dimensions of the box are $L_1$, $L_2$, and $L_3$. The total momentum of the molecules and the box is 0 in accordance with our assumption above, so the position of the centre of mass of the molecules and the box is independent of time. The molecules are moving randomly in the interior of the box, and we'll assume that the box is sufficiently rigid, and its mass is sufficiently large compared to the mass of each molecule, that we can treat the box to a good approximation as staying in a fixed position. The ranges of the position coordinates in the interior of the box are $0 \leq x_1 \leq L_1$, $0 \leq x_2 \leq L_2$, and $0 \leq x_3 \leq L_3$. The potential energy $V$ is 0 when all the molecules are in the interior of the box, and $\infty$ when any of the molecules is outside the interior of the box.

We'll now divide the range of the possible positions and momenta of each molecule into equal size bins as I described above, and we'll choose each bin to be a box with its edges aligned with the Cartesian coordinate directions, such that the length of each position edge of a bin is $\Delta x$ and the difference between the values of a momentum coordinate at the ends of a momentum edge of a bin is $\Delta p$. The sizes $\Delta x$ and $\Delta p$ of the bin edges are sufficiently small that we can treat all the molecules in a bin as being approximately at the same position and having approximately the same momentum, but sufficiently large that the number of molecules in a bin is large. This is not a problem for gas containers of everyday sizes, since the number of molecules in a cubic metre of air, to the nearest power of 10, is about $10^{25}$.

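From above, the momentum of a molecule at position $x$ moving with velocity $v = \frac{dx}{dt}$ is $p = mv$, so the kinetic energy of the molecule is:

$$\frac{1}{2} m \left( v_1^2 + v_2^2 + v_3^2 \right) = \frac{p_1^2 + p_2^2 + p_3^2}{2m} = \frac{p^2}{2m}.$$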

For convenience we'll label the bins by the position $x$ and the momentum $p$ at their centres. We are ignoring the vibrations of the atoms in the walls of the container about their mean positions, so the gas molecules are the only relevant type of object, and we can drop the index that represents the type of object in the formulae above. So from the formula above, the most likely number of molecules in the bin centred at position $x$ and momentum $p$ is:

$$N_{x,p} = \frac{N e^{-\beta E_{x,p}}}{Z},$$

where $N$ is the total number of gas molecules, $E_{x,p}$ is the energy of a molecule at position $x$ with momentum $p$, $\beta$ is the Lagrange multiplier related to the temperature of the gas, whose value is to be calculated from the total energy of the gas, as above, and:

$$Z = \sum_{x,p} e^{-\beta E_{x,p}},$$

where the sum runs over all the bins.

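If $x$ is in the interior of the container, then the potential energy is 0, so from the formula above for the kinetic energy of a molecule:

$$e^{-\beta E_{x,p}} = e^{-\frac{\beta p^2}{2m}}.$$

And if $x$ is outside the interior of the container, then the potential energy is $\infty$, so since we are assuming $\beta > 0$ in accordance with the discussion above, the expression $e^{-\beta \infty}$ is 0, so:

$$e^{-\beta E_{x,p}} = 0.$$

Thus:

$$Z = \sum_{x \textrm{ inside the container},\, p} e^{-\frac{\beta p^2}{2m}}.$$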

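If we now write the bin edge sizes as $\epsilon_x = dx_1 = dx_2 = dx_3$, and $\epsilon_p = dp_1 = dp_2 = dp_3$, where we have temporarily relaxed the rule that Leibniz's $d$ means that formulae are to be evaluated in the limit where expressions such as $dx_1$ tend to 0, we thus have approximately:

$$Z \simeq \frac{1}{dx_1 dx_2 dx_3 dp_1 dp_2 dp_3} \int_0^{L_1} dx_1 \int_0^{L_2} dx_2 \int_0^{L_3} dx_3 \int \int \int e^{-\frac{\beta \left( p_1^2 + p_2^2 + p_3^2 \right)}{2m}} dp_1 dp_2 dp_3.$$

If we did the calculations in the limit where the sizes $\epsilon_x$ and $\epsilon_p$ of the bin edges tend to 0, this formula would be exact. We assumed above that $\epsilon_x$ and $\epsilon_p$ are sufficiently small that we can treat all the molecules in a bin as being approximately at the same position and having approximately the same momentum, so I'll treat this formula as exact. We therefore have:

$$Z = \frac{L_1 L_2 L_3}{dx_1 dx_2 dx_3 dp_1 dp_2 dp_3} \left( \int e^{-\frac{\beta p_1^2}{2m}} dp_1 \right) \left( \int e^{-\frac{\beta p_2^2}{2m}} dp_2 \right) \left( \int e^{-\frac{\beta p_3^2}{2m}} dp_3 \right).$$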

We assumed above that the number of bins is finite, and we could implement that by putting an upper limit on the magnitudes $|p_1|$, $|p_2|$, and $|p_3|$ of the momentum coordinates. However the expression $e^{-\frac{\beta p_1^2}{2m}}$ tends to 0 very rapidly with increasing $|p_1|$, once $|p_1|$ is larger than about $\sqrt{\frac{2m}{\beta}}$, so the value of the sum $Z$ calculated with an upper limit much larger than $\sqrt{\frac{2m}{\beta}}$ on the magnitudes $|p_a|$ will be almost identical to the value of $Z$ calculated with no upper limit on those magnitudes. So we'll calculate $Z$ with the limits on each $p_a$ taken as $-\infty$ and $\infty$, where the symbol $\infty$ was defined above.

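We'll complete the calculation of $Z$ by first calculating the number $\int_{-\infty}^{\infty} e^{-x^2} dx$, and we'll calculate this number by first calculating its square, which I'll represent by $I$. We can write this as:

$$I = \left( \int_{-\infty}^{\infty} e^{-x^2} dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2} dy \right) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\left( x^2 + y^2 \right)} dx \, dy.$$

We can think of $x$ and $y$ as Cartesian coordinates for the 2-dimensional plane of Euclidean geometry. The distance from the point $(0, 0)$ to the point $(x, y)$ is then $\sqrt{x^2 + y^2}$. We'll now define $I(r)$, for $r \geq 0$, to be the value of the above double integral over the region $\sqrt{x^2 + y^2} \leq r$, so that $I$ is the limit of $I(r)$ as $r$ tends to $\infty$. Then $\frac{dI(r)}{dr}$ is $e^{-r^2}$ times the area between the circles $\sqrt{x^2 + y^2} = r$ and $\sqrt{x^2 + y^2} = r + dr$, divided by $dr$, which is $2 \pi r e^{-r^2}$. Now:

$$\begin{aligned}
\frac{d}{dr} e^{-r^2} &= \frac{e^{-\left( r + dr \right)^2} - e^{-r^2}}{dr} = \frac{e^{-r^2 - 2 r \, dr} - e^{-r^2}}{dr} = -2r \, \frac{e^{-r^2 - 2 r \, dr} - e^{-r^2}}{-2 r \, dr} \\
&= -2r \, \frac{e^{-r^2 + du} - e^{-r^2}}{du} = -2r e^{-r^2} \, \frac{e^{du} - 1}{du} = -2r e^{-r^2} \left. \frac{d e^u}{du} \right|_{u = 0} = -2r e^{-r^2}.
\end{aligned}$$

The second step here is obtained because the terms neglected by dropping $\left( dr \right)^2$ in the exponent are all proportional to $dr$ or higher powers of $dr$, and thus give 0 in the limit where $dr$ tends to 0. The fourth step is obtained by writing the small quantity $-2 r \, dr$ as $du$. Leibniz's $d$ means that the expressions are to be taken in the limit where $dr$ tends to 0 from either positive or negative values, and for $r$ not equal to 0, this is equivalent to the limit where $du$ tends to 0 from either negative or positive values. The seventh step follows from the result $\frac{d}{dx} a^x = \ln \left( a \right) a^x$ which we obtained above, with $a$ taken as $e$, so that $\ln \left( e \right) = 1$.

Thus:

$$\frac{dI(r)}{dr} = 2 \pi r e^{-r^2} = -\pi \frac{d}{dr} e^{-r^2}.$$

From the result that the integral of the rate of change of a quantity is equal to the net change of that quantity, which we obtained in the first part of the post, here, we have:

$$I(r) = \int_0^r \frac{dI(r')}{dr'} dr' = -\pi \left( e^{-r^2} - e^0 \right) = \pi \left( 1 - e^{-r^2} \right),$$

since $I(0) = 0$. Thus $I = \pi$, since the limit of $e^{-r^2}$ as $r$ tends to $\infty$ is 0. Thus:

$$\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}.$$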
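
If you'd like to check this result numerically, here is a minimal Python sketch. It adds up $e^{-x^2}$ over many tiny intervals, and separately adds up $e^{-s^2}$ times the areas of thin rings, which is the quantity the derivation calls $I(r)$. The function names, the cut-off at $|x| = 10$, and the numbers of intervals are my own illustrative choices, not part of the derivation.

```python
import math

def gaussian_integral(half_width=10.0, steps=20_000):
    """Midpoint sum of exp(-x^2) over the range -half_width to half_width."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx  # centre of tiny interval number i
        total += math.exp(-x * x) * dx
    return total

def disc_integral(r, rings=20_000):
    """Sum of exp(-s^2) times the area of each thin ring, out to radius r."""
    ds = r / rings
    total = 0.0
    for i in range(rings):
        s = (i + 0.5) * ds                 # mid-radius of ring number i
        ring_area = 2.0 * math.pi * s * ds # exact area between radii s -/+ ds/2
        total += math.exp(-s * s) * ring_area
    return total

if __name__ == "__main__":
    print(gaussian_integral(), math.sqrt(math.pi))
    print(disc_integral(2.0), math.pi * (1.0 - math.exp(-4.0)))
```

Each printed pair should agree to many decimal places, which is a useful sanity check on the formula $I(r) = \pi \left( 1 - e^{-r^2} \right)$ and on the final result $\sqrt{\pi}$.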

We now observe that according to the explanation I gave in the first part of the post, here, of the meaning of the integral of a quantity, say $f$, that depends smoothly on another quantity, say $y$, over a range of values of $y$, say from $y_0$ to $y_1$, where $y_0 \leq y_1$, the range of $y$ from $y_0$ to $y_1$ is divided up into a great number of tiny intervals, and the integral is approximately the sum of a contribution from each of these tiny intervals, such that the contribution from each tiny interval is the value of $f$ at some point in that tiny interval, times the difference between the values of $y$ at the ends of that tiny interval. The exact value of the integral is the limit of sums of this form, as the tiny intervals become so small and their number so great, that the size of the largest tiny interval tends to 0. So if $y$ in turn depends on another quantity, say $x$, such that $\frac{dy}{dx}$, the rate of change of $y$ with respect to $x$, is greater than 0 for all values of $x$ from $x_0$ to $x_1$, then we have:

$$\int_{y_0}^{y_1} f \, dy = \int_{x_0}^{x_1} f \frac{dy}{dx} dx,$$

where $x_0$ and $x_1$ are the values of $x$ such that $y(x_0) = y_0$ and $y(x_1) = y_1$. For to each way of dividing up the range of $x$ from $x_0$ to $x_1$ into tiny intervals, there is a corresponding division of the range of $y$ from $y_0$ to $y_1$ into tiny intervals, such that if $x$ is the end of one tiny interval and the start of another tiny interval in the range from $x_0$ to $x_1$, then $y(x)$ is the end of a tiny interval and the start of another tiny interval in the range from $y_0$ to $y_1$. And if the ends of a tiny interval in the range from $x_0$ to $x_1$ are at $x$ and $x + \delta$, where $\delta$ is very small, then the difference between the values of $y$ at the ends of the corresponding tiny interval in the range from $y_0$ to $y_1$ is approximately $\frac{dy}{dx} \delta$, and the error of this approximation tends to 0 more rapidly than in proportion to $\delta$ as $\delta$ tends to 0. So for every way of dividing up the range of $x$ from $x_0$ to $x_1$ into tiny intervals, the contributions of corresponding tiny intervals to the left-hand side and the right-hand side of the above equation are approximately equal, and in the limit where the tiny intervals become so small and their number so great that the size of the largest tiny interval tends to 0, the totals of all the contributions to the left-hand side and to the right-hand side become exactly equal. For if the size of a typical tiny interval is $\propto \eta$, where the symbol $\propto$ means "proportional to," then the number of tiny intervals is $\propto \frac{1}{\eta}$, and the total error is a sum of $\propto \frac{1}{\eta}$ quantities, each of which tends to 0 more rapidly than in proportion to $\eta$ as $\eta$ tends to 0, so the total error tends to 0 as $\eta$ tends to 0.

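From this observation, with $y$ taken as $\sqrt{\frac{\beta}{2m}} \, p_1$, $f$ taken as $e^{-y^2}$, and $x$ taken as $p_1$, so that $\frac{dy}{dx} = \sqrt{\frac{\beta}{2m}}$, $x_0 = y_0 = -\infty$, and $x_1 = y_1 = \infty$, we find from the result above that:

$$\int_{-\infty}^{\infty} e^{-\frac{\beta p_1^2}{2m}} dp_1 = \sqrt{\frac{2m}{\beta}} \int_{-\infty}^{\infty} e^{-y^2} dy = \sqrt{\frac{2 \pi m}{\beta}}.$$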
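
The change of variable can also be checked by brute force. The sketch below, which is only an illustration, sums $e^{-\frac{\beta p^2}{2m}}$ directly over tiny momentum intervals and leaves the comparison with $\sqrt{\frac{2 \pi m}{\beta}}$ to you. The function name, the trial values of $\beta$ and $m$ (in arbitrary units), and the cut-off at 10 times $\sqrt{\frac{2m}{\beta}}$ are assumptions I have made for the example.

```python
import math

def momentum_integral(beta, m, steps=20_000, widths=10.0):
    """Midpoint sum of exp(-beta * p^2 / (2 m)) over a wide range of p."""
    scale = math.sqrt(2.0 * m / beta)  # momentum at which the exponent is -1
    half_range = widths * scale        # the integrand is negligible beyond this
    dp = 2.0 * half_range / steps
    total = 0.0
    for i in range(steps):
        p = -half_range + (i + 0.5) * dp
        total += math.exp(-beta * p * p / (2.0 * m)) * dp
    return total

if __name__ == "__main__":
    beta, m = 2.0, 3.0  # arbitrary trial values
    print(momentum_integral(beta, m), math.sqrt(2.0 * math.pi * m / beta))
```

The cut-off also illustrates the earlier remark that putting an upper limit on the magnitude of the momentum makes almost no difference, once that limit is much larger than $\sqrt{\frac{2m}{\beta}}$.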

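So from the formula above:

$$Z = \frac{L_1 L_2 L_3}{dx_1 dx_2 dx_3 dp_1 dp_2 dp_3} \left( \frac{2 \pi m}{\beta} \right)^{\frac{3}{2}}.$$

So from the formula above, the most likely number of molecules in a bin centred at a position $x$ inside the container and momentum $p$, with edge sizes $dx_1$, $dx_2$, $dx_3$, $dp_1$, $dp_2$, and $dp_3$, is:

$$N_{x,p} = \frac{N}{L_1 L_2 L_3} \left( \frac{\beta}{2 \pi m} \right)^{\frac{3}{2}} e^{-\frac{\beta \left( p_1^2 + p_2^2 + p_3^2 \right)}{2m}} dx_1 dx_2 dx_3 dp_1 dp_2 dp_3.$$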

The pressure of the gas is the force per unit area on the walls of the container that results from the gas molecules bouncing off the walls. We'll calculate the force from the molecules bouncing off the wall of area $L_2 L_3$ at $x_1 = L_1$.

When an object of mass $m$ moving with velocity $v$ collides with an object of mass $M$ that is initially at rest, and no other objects are involved, and the potential energy is 0 except at the moment when the objects are in contact, then by the conservation of total energy, which we obtained in the first part of the post, here, the sum of the kinetic energies of the objects is the same before and after the collision, and by the result we found above, the sum of the momenta of the objects is the same before and after the collision. So if the final velocity of the object of mass $m$ is $u$, and the final velocity of the object of mass $M$ is $U$, we have:

$$\frac{1}{2} m \left( v_1^2 + v_2^2 + v_3^2 \right) = \frac{1}{2} m \left( u_1^2 + u_2^2 + u_3^2 \right) + \frac{1}{2} M \left( U_1^2 + U_2^2 + U_3^2 \right),$$

$$m v_a = m u_a + M U_a, \qquad a = 1, 2, 3.$$

If there is no force between the objects in the 2 or 3 coordinate directions during the collision, as will be the case when a point-like gas molecule hits a wall perpendicular to the 1 coordinate direction, then $u_2 = v_2$, $u_3 = v_3$, and $U_2 = U_3 = 0$. So the equations above become:

$$\frac{1}{2} m v_1^2 = \frac{1}{2} m u_1^2 + \frac{1}{2} M U_1^2, \qquad m v_1 = m u_1 + M U_1,$$

whose solution with $u_1 \neq v_1$ is:

$$u_1 = \frac{m - M}{m + M} v_1, \qquad U_1 = \frac{2m}{m + M} v_1.$$

So when $M$ is extremely large in comparison to $m$, as for example when the object of mass $M$ is the wall of a gas container of mass of the order of 1 kilogram, and the object of mass $m$ is a gas molecule of mass of the order of $10^{-26}$ kilograms, we have $u_1 = -v_1$ and $U_1 = 0$ to such a high precision that I shall treat these relations as exact.
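
To see how good this approximation is, here is a small Python sketch of the head-on collision formulae for the 1 components of the velocities. The function name and the sample masses and speed are my own illustrative assumptions.

```python
def collide_1d(m, M, v1):
    """Final 1-components of velocity when mass m, moving at v1, hits mass M at rest.

    These are the non-trivial solutions of energy and momentum conservation.
    """
    u1 = (m - M) / (m + M) * v1  # final velocity of the object of mass m
    U1 = 2.0 * m / (m + M) * v1  # final velocity of the object of mass M
    return u1, U1

if __name__ == "__main__":
    # Comparable masses: both objects move afterwards.
    print(collide_1d(1.0, 3.0, 2.0))
    # A molecule of roughly 1e-26 kg hitting a 1 kg wall at 500 metres per second:
    # the molecule's velocity is reversed and the wall barely moves.
    print(collide_1d(1e-26, 1.0, 500.0))
```

For the molecule and the wall, the recoil speed of the wall comes out around $10^{-23}$ metres per second, which is why treating $u_1 = -v_1$ and $U_1 = 0$ as exact does no harm.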

So if the 1 component of the velocity of a molecule of the ideal gas is $v_1$ at a particular time, then the only values the 1 component of the velocity of that molecule ever takes are $\pm v_1$. If $v_1 \neq 0$, then that molecule transfers momentum $2 m |v_1|$ to the container wall at $x_1 = L_1$, at moments separated by time intervals $\frac{2 L_1}{|v_1|}$, so the average rate at which that molecule transfers momentum to that container wall is $\frac{m v_1^2}{L_1}$ per unit time, so since force is the rate of change of momentum, the average force exerted by that molecule on that container wall is $\frac{m v_1^2}{L_1}$ in the outwards direction. Thus since the area of that container wall is $L_2 L_3$, the pressure on that container wall is the sum of $\frac{m v_1^2}{L_1 L_2 L_3} = \frac{p_1^2}{m L_1 L_2 L_3}$ over all the gas molecules in the container.
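
The average force from a single molecule can be checked with a little event-by-event simulation. The Python sketch below follows one molecule as it bounces back and forth between the walls at $x_1 = 0$ and $x_1 = L_1$, adds up the momentum handed to the wall at $x_1 = L_1$, and divides by the elapsed time. The function name and the trial numbers are illustrative assumptions of mine.

```python
def average_wall_force(m, v1, L1, total_time, x_start=0.0):
    """Average force on the wall at x1 = L1 from one bouncing molecule."""
    if v1 == 0.0:
        raise ValueError("the molecule must be moving in the 1 direction")
    x, v, t = x_start, v1, 0.0
    transferred = 0.0
    while True:
        # Time until the molecule reaches the wall it is heading towards.
        dt = (L1 - x) / v if v > 0.0 else x / (-v)
        if t + dt > total_time:
            break
        t += dt
        if v > 0.0:
            x = L1
            transferred += 2.0 * m * v  # velocity reverses, so the wall gains 2 m |v1|
        else:
            x = 0.0
        v = -v
    return transferred / total_time

if __name__ == "__main__":
    m, v1, L1 = 2.0, 3.0, 1.5  # arbitrary units
    print(average_wall_force(m, v1, L1, total_time=1000.0), m * v1**2 / L1)
```

The two printed numbers should agree closely whenever the total time covers many round trips of the molecule.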

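From the formula above, integrated over the volume of the container, the most likely number of molecules in a momentum bin centred at momentum $p$, with edge sizes $dp_1$, $dp_2$, and $dp_3$, is:

$$N_p = N \left( \frac{\beta}{2 \pi m} \right)^{\frac{3}{2}} e^{-\frac{\beta \left( p_1^2 + p_2^2 + p_3^2 \right)}{2m}} dp_1 dp_2 dp_3.$$

We'll assume now that this most likely number of molecules in each momentum bin is the actual number of molecules in each momentum bin. Then the sum of $\frac{p_1^2}{m L_1 L_2 L_3}$ over all the gas molecules in the container is:

$$\frac{N}{m L_1 L_2 L_3} \left( \frac{\beta}{2 \pi m} \right)^{\frac{3}{2}} \int \int \int p_1^2 e^{-\frac{\beta \left( p_1^2 + p_2^2 + p_3^2 \right)}{2m}} dp_1 dp_2 dp_3 = \frac{N}{m L_1 L_2 L_3} \sqrt{\frac{\beta}{2 \pi m}} \int_{-\infty}^{\infty} p_1^2 e^{-\frac{\beta p_1^2}{2m}} dp_1,$$

where I used the result above to calculate the integrals over $p_2$ and $p_3$.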

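We'll calculate the above integral over $p_1$ by first calculating the integral $\int_{-\infty}^{\infty} y^2 e^{-y^2} dy$. We found above that $\frac{d}{dy} e^{-y^2} = -2 y e^{-y^2}$, so:

$$\int_{-\infty}^{\infty} y^2 e^{-y^2} dy = -\frac{1}{2} \int_{-\infty}^{\infty} y \frac{d}{dy} e^{-y^2} dy.$$

From Leibniz's rule for the rate of change of a product, which we obtained in the first part of the post, here, we have:

$$\frac{d}{dy} \left( y e^{-y^2} \right) = e^{-y^2} + y \frac{d}{dy} e^{-y^2}.$$

Thus:

$$\int_{-\infty}^{\infty} y \frac{d}{dy} e^{-y^2} dy = \int_{-\infty}^{\infty} \frac{d}{dy} \left( y e^{-y^2} \right) dy - \int_{-\infty}^{\infty} e^{-y^2} dy.$$

And from the result that the integral of the rate of change of a quantity is equal to the net change of that quantity, which we found in the first part of the post, here, we have:

$$\int_{-\infty}^{\infty} \frac{d}{dy} \left( y e^{-y^2} \right) dy = \lim_{y \to \infty} y e^{-y^2} - \lim_{y \to -\infty} y e^{-y^2}.$$

The magnitude of $e^{y^2}$ increases much more rapidly than the magnitude of $y$, as $|y|$ becomes large, because the magnitude of $e^{y^2}$ is multiplied by a factor $e^{2 |y| + 1}$ every time $|y|$ increases by 1. Thus since $y e^{-y^2} = \frac{y}{e^{y^2}}$, the magnitude of $y e^{-y^2}$ tends rapidly to 0 as $|y|$ becomes large, so both terms in the right-hand side of the above formula are 0. Thus:

$$\int_{-\infty}^{\infty} y \frac{d}{dy} e^{-y^2} dy = -\int_{-\infty}^{\infty} e^{-y^2} dy = -\sqrt{\pi},$$

where at the last step, I used the result we found above. So from the formula above:

$$\int_{-\infty}^{\infty} y^2 e^{-y^2} dy = \frac{\sqrt{\pi}}{2}.$$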

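So in a similar manner to the calculation above, with $y$ again taken as $\sqrt{\frac{\beta}{2m}} \, p_1$, we have:

$$\int_{-\infty}^{\infty} p_1^2 e^{-\frac{\beta p_1^2}{2m}} dp_1 = \left( \frac{2m}{\beta} \right)^{\frac{3}{2}} \int_{-\infty}^{\infty} y^2 e^{-y^2} dy = \frac{\sqrt{\pi}}{2} \left( \frac{2m}{\beta} \right)^{\frac{3}{2}}.$$

So from the result above, the pressure of the gas in the container is:

$$P = \frac{N}{m L_1 L_2 L_3} \sqrt{\frac{\beta}{2 \pi m}} \, \frac{\sqrt{\pi}}{2} \left( \frac{2m}{\beta} \right)^{\frac{3}{2}} = \frac{N}{L_1 L_2 L_3 \beta}.$$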
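
As a rough cross-check of this formula, here is a Python sketch that draws the 1 component of the momentum of each molecule at random from the distribution $e^{-\frac{\beta p_1^2}{2m}}$, which is a bell curve of variance $\frac{m}{\beta}$, and adds up $\frac{p_1^2}{m L_1 L_2 L_3}$ over the molecules. This random-sampling approach is a stand-in I have chosen for illustration, in place of the integrals used above, and the function name, the seed, and the trial values are assumptions.

```python
import random

def sampled_pressure(n_molecules, beta, m, volume, seed=0):
    """Sum of p1^2 / (m * volume) over molecules with randomly drawn p1."""
    rng = random.Random(seed)      # fixed seed so that the run is repeatable
    sigma = (m / beta) ** 0.5      # standard deviation of p1
    total = 0.0
    for _ in range(n_molecules):
        p1 = rng.gauss(0.0, sigma)
        total += p1 * p1 / (m * volume)
    return total

if __name__ == "__main__":
    n, beta, m, volume = 200_000, 2.0, 3.0, 5.0  # arbitrary units
    print(sampled_pressure(n, beta, m, volume), n / (volume * beta))
```

With a couple of hundred thousand molecules the two numbers typically agree to within a fraction of a percent, and the agreement improves as the number of molecules grows.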

Comparing with the ideal gas law:

$$PV = NkT,$$

which was deduced by Émile Clapeyron in 1834 from the experimental observations of Robert Boyle and Jacques Charles, where $V$ now represents the volume $L_1 L_2 L_3$ of the container, $T$ is the absolute temperature in kelvins, which is the same as the centigrade scale except that the zero of temperature is at $-273.15$ centigrade instead of at 0 centigrade, and:

$$k = 1.380649 \times 10^{-23} \textrm{ joules per kelvin}$$

is known as Boltzmann's constant, after Ludwig Boltzmann, we therefore find that:

$$\beta = \frac{1}{kT}.$$
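
To get a feel for the sizes involved, here is a short Python sketch that turns a temperature into $\beta$, and uses $P = \frac{N}{V \beta}$ to estimate the number of molecules in a cubic metre of air. The function names are my own, and the pressure of 101325 pascals and temperature of 288.15 kelvins are assumed typical sea-level values, not numbers taken from the derivation.

```python
import math

K_BOLTZMANN = 1.380649e-23  # joules per kelvin

def beta_from_temperature(temperature_kelvin):
    """The Lagrange multiplier beta = 1 / (k T)."""
    return 1.0 / (K_BOLTZMANN * temperature_kelvin)

def number_density(pressure_pascal, temperature_kelvin):
    """Molecules per cubic metre, N / V = P * beta, for an ideal gas."""
    return pressure_pascal * beta_from_temperature(temperature_kelvin)

if __name__ == "__main__":
    n_air = number_density(101_325.0, 288.15)  # assumed sea-level conditions
    print(f"{n_air:.2e} molecules per cubic metre")
    print("nearest power of 10:", round(math.log10(n_air)))
```

The result is about $2.5 \times 10^{25}$ molecules per cubic metre, in line with the estimate of $10^{25}$ to the nearest power of 10.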

So if a system in thermal equilibrium at absolute temperature $T$ is composed of a very large number of microscopic objects of various types subject to Newton's laws of motion, which we derived from de Maupertuis's principle of stationary action in the first part of the post, here, and if the range of possible positions and momenta of the objects is divided up into tiny bins of equal size, then from the result we found above, the most likely number of objects of type $s$ in position and momentum bin number $i$ is:

$$N_{si} = N_s \frac{e^{-\frac{E_{si}}{kT}}}{\sum_j e^{-\frac{E_{sj}}{kT}}},$$

where $N_s$ is the total number of objects of type $s$, and $E_{si}$ is the energy of an object of type $s$ at the centre of bin number $i$. This is called the Boltzmann distribution, and the corresponding distribution of the momenta of the molecules in an ideal gas, which we found above, with $\beta$ identified as $\frac{1}{kT}$, is called the Maxwell-Boltzmann distribution.
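
Here is a minimal Python sketch of the Boltzmann distribution for one type of object, given a list of bin energies in joules. The function name and the two-bin example are illustrative assumptions. Subtracting the lowest energy before taking the exponentials is a numerical convenience I have added: it leaves the ratios unchanged, but avoids very large or very small intermediate numbers.

```python
import math

K_BOLTZMANN = 1.380649e-23  # joules per kelvin

def boltzmann_occupations(total_objects, bin_energies, temperature_kelvin):
    """Most likely number of objects in each bin at the given temperature."""
    beta = 1.0 / (K_BOLTZMANN * temperature_kelvin)
    lowest = min(bin_energies)
    # Relative weight exp(-beta * E) of each bin, measured from the lowest energy.
    weights = [math.exp(-beta * (energy - lowest)) for energy in bin_energies]
    normalisation = sum(weights)
    return [total_objects * w / normalisation for w in weights]

if __name__ == "__main__":
    temperature = 300.0
    # Two bins whose energies differ by k T ln(2): the lower bin should hold
    # twice as many objects as the upper one.
    gap = K_BOLTZMANN * temperature * math.log(2.0)
    print(boltzmann_occupations(300.0, [0.0, gap], temperature))
```

Raising the temperature in this example evens out the two occupation numbers, and lowering it piles the objects into the lower-energy bin.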

The clue that led to the discovery of quantum mechanics, whose principles are summarized in Feynman's functional integral, and which made possible, among other things, the design and construction of the computer on which you are reading this blog post, came from the attempted application of the Boltzmann distribution to electromagnetic radiation. In the next part of the post, Electromagnetism, we'll look at the discoveries about electricity and magnetism that enabled James Clerk Maxwell, in the middle of the nineteenth century, to identify light as waves of oscillating electric and magnetic fields, and to calculate the speed of light from measurements of:

1. the force between parallel wires carrying electric currents;
2. the heat given off by a long thin wire carrying an electric current; and
3. the time integral of the temporary electric current that flows through a long thin wire, when a voltage is introduced between two parallel metal plates, close to each other but not touching, via that wire.