Indistinguishability

But while (2) gave the correct behavior for $\bar{E}_s$ in the high-frequency limit (Wien's law), it departed sharply from the Planck distribution at low frequencies. Eq. (1) was empirically correct, not (2). The implication was that if light was made of particles labeled by frequency, they were particles that could not be considered as independent of each other at low frequencies.5 Eq. (1) is true of bosons; bosons are represented by totally symmetrized states in quantum mechanics and quantum field theory; totally symmetrized states are entangled states. There is no doubt that Einstein, and later Schrödinger, were puzzled by the lack of independence of light-quanta at low frequencies. They were also puzzled by quantum nonlocality and entanglement. It is tempting to view all these puzzles as related.6

Others concluded that light could not after all be made of particles, or that it is made up of both particles and waves, or that it is made up of a special category of entities that are not really objects at all.7 We shall come back to these questions separately. Planck's own views on the matter were perhaps closest to Gibbs's.8 Gibbs had arrived at the concept of particle indistinguishability quite independently of quantum theory. To understand this development, however, considerably more stage-setting is needed, in both classical statistical mechanics and thermodynamics, the business of sections 1.2 to 1.4. (Those familiar with the Gibbs paradox may skip directly to section 1.5.)

1.2 The Gibbs Paradox in Thermodynamics

Consider the entropy of a volume $V$ of gas composed of $N_A$ molecules of kind $A$ and $N_B$ molecules of kind $B$.9 It differs from the entropy of a gas at the same temperature and pressure when $A$ and $B$ are identical. The difference is:

$$-kN_A \log\frac{N_A}{N_A+N_B} - kN_B \log\frac{N_B}{N_A+N_B} \tag{3}$$

where $k$ is Boltzmann's constant, $k = 1.38 \times 10^{-16}$ erg K$^{-1}$.
The expression (3) is unchanged no matter how similar $A$ and $B$ are, even when in practice the two gases cannot be distinguished; but it must vanish when $A$ and $B$ are the same. This is the Gibbs paradox in thermodynamics.

It is not clear that the puzzle as stated is really paradoxical, but it certainly bears on the notion of identity, and on whether identity admits of degrees. Thus, Denbigh and Redhead argue:

The entropy of mixing has the same value … however alike are the two substances, but suddenly collapses to zero when they are the same. It is the absence of any “warning” of the impending catastrophe, as the substances are made more and more similar, which is the truly paradoxical feature. (Denbigh and Redhead 1989, 284)

The difficulty is more severe for those who see thermodynamics as founded on operational concepts. Identity, as distinct from similarity under all practical measurements, seems to outstrip any possible experimental determination.

To see how experiment does bear on the matter, recall that the classical thermodynamic entropy is an extensive function of the mass (or particle number) and volume. That is to say, for real numbers $\lambda$, the thermodynamic entropy $S$ as a function of $N$ and $V$ scales linearly:

$$S(\lambda N, \lambda V) = \lambda S(N, V).$$

By contrast the pressure and temperature are intensive variables that do not scale with mass and volume. The thermodynamic entropy function for an ideal gas is:

$$S(P, T, N) = \frac{5}{2} Nk \log T - Nk \log P + cN \tag{4}$$

where $c$ is an arbitrary constant. It is extensive by inspection.
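The two formulas just given can be checked numerically. The following is a minimal sketch, assuming that Eq. (3) is the standard ideal-gas entropy of mixing and that Eq. (4) has the monatomic form $S = \frac{5}{2}Nk\log T - Nk\log P + cN$; the function names `mixing_entropy` and `ideal_gas_entropy` are illustrative and not taken from the text.

```python
import math

K_B = 1.380649e-16  # Boltzmann's constant in erg/K (CGS units, as in the text)


def mixing_entropy(n_a, n_b, k=K_B):
    """Entropy of mixing, assumed form of Eq. (3).

    Note that no measure of the similarity of A and B enters: the value
    depends only on the particle numbers.
    """
    n = n_a + n_b
    return -k * n_a * math.log(n_a / n) - k * n_b * math.log(n_b / n)


def ideal_gas_entropy(p, t, n, c=0.0, k=K_B):
    """Assumed form of Eq. (4): S = (5/2) N k log T - N k log P + c N."""
    return 2.5 * n * k * math.log(t) - n * k * math.log(p) + c * n


n_a = n_b = 1.0e20

# For equal amounts the entropy of mixing reduces to k (N_A + N_B) log 2.
s_mix = mixing_entropy(n_a, n_b)
print(math.isclose(s_mix, K_B * (n_a + n_b) * math.log(2)))

# Extensivity: at fixed P and T, scaling N by lam scales S by lam.
lam = 3.0
s_one = ideal_gas_entropy(p=1.0e6, t=300.0, n=n_a)
s_scaled = ideal_gas_entropy(p=1.0e6, t=300.0, n=lam * n_a)
print(math.isclose(s_scaled, lam * s_one))
```

The absence of any similarity parameter in `mixing_entropy` is the point of the paradox: the function either applies in full or not at all.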
The extensivity of the entropy allows one to define the analogue of a density (entropy per unit mass or unit volume), important to nonequilibrium thermodynamics, but the concept clearly has its limits: for example, it is hardly expected to apply to gravitating systems, and more generally it ignores surface effects and other sources of inhomogeneity. It is to be sharply distinguished from additivity of the entropy, needed to define a total entropy for a collection of equilibrium systems each separately described: typically, as (at least initially) physically isolated
systems. The assumption of additivity is that a total entropy can be defined as their sum:

$$S_{A+B} = S_A + S_B.$$

It is doubtful that any general statement of the second law would be possible without additivity. Thus, collect together a dozen equilibrium systems, some samples of gas, homogeneous fluids or material bodies, initially isolated, and determine the entropy of each as a function of its temperature, volume, and mass. Energetically isolate them from external influences, but allow them to interact with each other in any way you like (mechanical, thermal, chemical, nuclear), so long as the result is a new collection of equilibrium systems. Then the second law can be expressed as follows: the sum of the entropies of the latter systems is equal to or greater than the sum of the entropies of the former systems.10

Now for the connection with the Gibbs paradox. The thermodynamic entropy difference between states 1 and 2 is defined as the integral, over any reversible process11 that links the two states, of $dQ/T$, that is, as the quantity:

$$\int_1^2 \frac{dQ}{T}$$

where $dQ$ is the heat transfer. If the insertion or removal of a partition between $A$ and $B$ is to count as a reversible process, then from additivity, and given that negligible work is done on the partition, it follows that there will be no change in entropy, so no entropy of mixing. This implies the entropy must be extensive. Conversely extensivity, under the same presupposition, and again given additivity, implies there is no entropy of mixing.

Whether the removal of a partition between $A$ and $B$ should count as a reversible process is another matter: surely not if means are available to tell the two gases apart.
Thus, if a membrane is opaque to $A$ and transparent to $B$, then under compression work $P_A\,dV$ must be done against the partial pressure $P_A$ in voiding one part of the cylinder of gas $A$ (and similarly for $B$), where:

$$P_A = \frac{N_A}{N_A+N_B}\,P, \qquad P_B = \frac{N_B}{N_A+N_B}\,P.$$

The work $dW$ required to separate the two gases isothermally at temperature $T$ is related to the entropy change and the heat transfer by:

$$dW = dQ = T\,dS.$$

Using the equation of state for the ideal gas, $PV = NkT$ where $N = N_A + N_B$, to determine $dW = P\,dV$, the result is the entropy of mixing, Eq. (3).

However, there can be no such semi-permeable membrane when the two gases are identical,12 so in this case the entropy of mixing is zero.

Would it matter to the latter conclusion if the differences between the two gases were sufficiently small (were ignored or remained undiscovered)? But as van Kampen argues, it is hard to see how the chemist will be led into any practical error in ignoring an entropy of mixing, if he cannot take mechanical advantage of it. Most thermodynamic substances, in practice, are composites of two or more substances (typically, different isotopes), but such mixtures are usually treated as homogeneous. In thermodynamics, as a science based on operational concepts, the meaning of the entropy function does not extend beyond the competencies of the experimenter:

Thus, whether such a process is reversible or not depends on how discriminating the observer is. The expression for the entropy depends on whether or not he is able and willing to distinguish between the molecules A and B. This is a paradox only for those who attach more physical reality to the entropy than is implied by its definition. (Van Kampen 1984, 307)

A similar resolution of the Gibbs paradox was given by Jaynes (1992). It appears, on this reading, that the entropy is not a real physical property of a thermodynamic system, independent of our knowledge of it.
According to Van Kampen, it is attributed to a system on the basis of a system of conventions: on whether the removal of a partition is to be counted as a reversible process, and on whether the entropy function for the two samples of gas is counted as extensive. That explains why the entropy of mixing is an all-or-nothing affair.

1.3 The Gibbs Paradox in Statistical Mechanics
Thermodynamics is the one fundamental theory of physics that might lay claim to being based on operational concepts and definitions. The situation is different in statistical mechanics, where the concept of entropy is not limited to equilibrium states, nor bound to the concept of reversibility.

There is an immediate difficulty, however; for the classical derivations of the entropy in statistical mechanics yield a function that is not extensive, even as an idealization. That is, classically, there is always an entropy of mixing, even for samples of the same gas. If the original Gibbs paradox was that there was no entropy of mixing in the limit of identity, the new paradox is that there is.13

To see the nature of the problem, it will suffice to consider the ideal gas, using the Boltzmann definition of entropy, so-called.14 The state of a system of $N$ particles is represented by a set of $N$ points in the 6-dimensional one-particle phase space (or μ-space), or equivalently, by a single point in the total $6N$-dimensional phase space $\Gamma_N$. A fine-graining of $\Gamma_N$ is a division of this space into cells of equal volume $\tau^N$ (corresponding to a division of μ-space into cells of volume $\tau$, where $\tau$ has dimensions of [momentum]$^3$[length]$^3$). A coarse-graining is a division of $\Gamma_N$ into regions with a given range of energy. For weakly interacting particles these regions can be parameterized by the one-particle energies $\varepsilon_s$, with $N_s$ the number of particles with energy in the range $[\varepsilon_s, \varepsilon_s + \Delta\varepsilon_s]$, and the coarse-graining extended to μ-space as well. These numbers must satisfy:

$$\sum_s N_s = N; \qquad \sum_s N_s \varepsilon_s = E \tag{5}$$

where $E$ is the total energy. Thus, for any fine-grained description (microstate) of the gas, which specifies how, for each $s$, $N_s$ particles are distributed over the fine-graining, there is a definite coarse-grained description (macrostate) which only specifies the number in each energy range.
Each macrostate corresponds to a definite volume of phase space. We can now define the Boltzmann entropy of a gas of $N$ particles in a given microstate: it is proportional to the logarithm of the volume, in $\Gamma_N$, of the corresponding macrostate. In this the choice of $\tau$ only affects an additive constant, irrelevant to entropy differences.

This entropy is computed as follows. For each $s$, let there be $C_s$ cells in μ-space of volume $\tau$ bounded by the energies $\varepsilon_s$, $\varepsilon_s + \Delta\varepsilon_s$, containing $N_s$ particles. Counting microstates as distinct if they differ in which particles are in which cells, we use (2) for the number of microstates, each with the same phase space volume $\tau^{N_s}$, yielding the volume:

$$C_s^{N_s}\,\tau^{N_s}. \tag{6}$$

The product of these quantities (over $s$) is the $N$-particle phase-space volume of the macrostate $N_1, N_2, \ldots, N_s, \ldots$ for just one way of partitioning the $N$ particles among the various one-particle energies. There are

$$\frac{N!}{N_1!\,N_2! \cdots N_s! \cdots} \tag{7}$$

partitionings in all. The total phase space volume $W_B$ (“B” is for Boltzmann) of the macrostate $N_1, N_2, \ldots, N_s, \ldots$ is the product of terms (6) (over $s$) and (7):

$$W_B = \frac{N!}{N_1! \cdots N_s! \cdots} \prod_s C_s^{N_s}\,\tau^{N_s} \tag{8}$$

and the entropy is:

$$S_B = k \log W_B.$$

From the Stirling approximation for $x$ large, $\log x! \approx x\log x - x$:

$$S_B \approx kN\log N + k\sum_s N_s \log\frac{C_s}{N_s} + kN\log\tau. \tag{9}$$

By inspection, this entropy function is not extensive. When the spatial volume and particle number are doubled, the second and third expressions on the RHS scale properly, but not the first. This picks up a term $kN\log 2$, with $N$ the particle number of the doubled system, corresponding to the $2^N$ choices as to which of the two sub-volumes contains which particle.
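The failure of extensivity can be seen without the Stirling approximation. The sketch below assumes the Boltzmann volume of Eq. (8) in the form $W_B = (N!/\prod_s N_s!)\prod_s C_s^{N_s}\tau^{N_s}$ and evaluates its logarithm exactly with the log-gamma function; the occupation numbers, cell counts, and the name `log_boltzmann_volume` are illustrative choices, not values from the text. Doubling the spatial volume is modeled as doubling the number of cells $C_s$ in each energy range.

```python
import math


def log_boltzmann_volume(occupations, cells, log_tau=0.0):
    """log W_B for the assumed form of Eq. (8), in units where k = 1.

    occupations: the N_s; cells: the C_s; log_tau: log of the cell volume.
    """
    n_total = sum(occupations)
    log_w = math.lgamma(n_total + 1)  # log N!
    for n_s, c_s in zip(occupations, cells):
        # log of C_s^{N_s} tau^{N_s} / N_s!
        log_w += n_s * (math.log(c_s) + log_tau) - math.lgamma(n_s + 1)
    return log_w


occupations = [1000, 2000, 3000]  # hypothetical N_s
cells = [10**6, 10**6, 10**6]     # hypothetical C_s, dilute: C_s >> N_s

single = log_boltzmann_volume(occupations, cells)
doubled = log_boltzmann_volume([2 * n for n in occupations],
                               [2 * c for c in cells])

# An extensive entropy would give doubled == 2 * single.
excess = doubled - 2 * single
n_doubled = 2 * sum(occupations)
print(excess, n_doubled * math.log(2))
```

The excess agrees with $N\log 2$ (with $N$ the particle number of the doubled system) up to corrections that grow only logarithmically with the occupation numbers.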
One way to obtain an extensive entropy function is to simply subtract the term $kN\log N$. In the Stirling approximation (up to a constant scaling with $N$ and $V$) that is equivalent to dividing the volume (8) by $N!$. But with what justification? If, after all, permutations of particles did not yield distinct fine-grained distributions, the factor (7) would not be divided by $N!$; it would be set equal to unity. Call this the N! problem. This is itself sometimes called the Gibbs paradox, but is clearly only a fragment of it. It is the main topic of sections 1.5 and 2.1.

1.4 The Equilibrium Entropy

Although not needed in the sequel, for completeness we obtain the equilibrium entropy, thus making the connection with observable quantities.15

A system is in equilibrium when the entropy of its coarse-grained distribution is a maximum; that is, when the entropy is stationary under variation of the numbers $N_s \to N_s + \delta N_s$, consistent with (5), that is from (9):

$$0 = \delta S_B = k\sum_s \delta N_s \log\frac{C_s}{N_s} \tag{10}$$

where

$$\sum_s \delta N_s = 0; \qquad \sum_s \delta N_s\,\varepsilon_s = 0. \tag{11}$$

If the variations $\delta N_s$ were entirely independent, each term in the summand (10) would have to vanish. Instead introduce Lagrange multipliers $\alpha$, $\beta$ for the respective constraint equations (11). Conclude for each $s$:

$$\log\frac{C_s}{N_s} - \alpha - \beta\varepsilon_s = 0.$$

Rearranging:

$$N_s = C_s\,e^{-\alpha - \beta\varepsilon_s}. \tag{12}$$

Substituting in (9) and using (5) gives the equilibrium entropy $\bar{S}_B$:

$$\bar{S}_B = kN\log N + k\alpha N + k\beta E + kN\log\tau. \tag{13}$$

The values of $\alpha$ and $\beta$ are fixed by (5) and (12). Replacing the schematic label $s$ by coordinates on phase space for a monatomic gas $\vec{x}, \vec{p}$, with $\varepsilon_s$ the kinetic energy $\frac{1}{2m}\vec{p}^{\,2}$, the sum over $N_s$ in the first equation of (5) becomes:

$$\frac{1}{\tau}\int e^{-\alpha - \beta\vec{p}^{\,2}/2m}\,d^3x\,d^3p = N.$$

The spatial integral gives the volume $V$; the momentum integral gives $(2\pi m/\beta)^{3/2}$, so

$$e^{\alpha} = \frac{V}{N\tau}\left(\frac{2\pi m}{\beta}\right)^{3/2}.$$

From the analogous normalization condition on the total energy (the second constraint (5)), substituting (12) and given that for an ideal monatomic gas $E = \frac{3}{2}NkT$, deduce that $\beta = \frac{1}{kT}$. Substituting in (13), the equilibrium entropy is:

$$\bar{S}_B = Nk\log V + \frac{3}{2}Nk\log(2\pi mkT) + \frac{3}{2}Nk.$$

It is clearly not extensive.
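A quick numerical check makes the point concrete. The sketch below assumes the equilibrium Boltzmann entropy takes the form $\bar{S}_B = Nk\log V + \frac{3}{2}Nk\log(2\pi mkT) + \frac{3}{2}Nk$, as follows from the Lagrange-multiplier argument above; the function name and the numerical values (CGS units, a helium-like mass) are illustrative assumptions.

```python
import math

K_B = 1.380649e-16  # Boltzmann's constant in erg/K


def equilibrium_boltzmann_entropy(n, v, t, m, k=K_B):
    """Assumed form: N k log V + (3/2) N k log(2 pi m k T) + (3/2) N k."""
    return (n * k * math.log(v)
            + 1.5 * n * k * math.log(2 * math.pi * m * k * t)
            + 1.5 * n * k)


def corrected_entropy(n, v, t, m, k=K_B):
    """The same function with the term N k log N subtracted."""
    return equilibrium_boltzmann_entropy(n, v, t, m, k) - n * k * math.log(n)


n, v, t, m = 1.0e20, 1.0e3, 300.0, 6.6e-24  # hypothetical sample

# Doubling N and V at fixed T: an extensive function would give zero excess.
excess = (equilibrium_boltzmann_entropy(2 * n, 2 * v, t, m)
          - 2 * equilibrium_boltzmann_entropy(n, v, t, m))
corrected_excess = (corrected_entropy(2 * n, 2 * v, t, m)
                    - 2 * corrected_entropy(n, v, t, m))

print(excess / (2 * n * K_B * math.log(2)))  # ratio close to 1
print(corrected_excess / excess)             # ratio close to 0
```

The uncorrected function picks up $2Nk\log 2$ on doubling (with $N$ the original particle number), while subtracting $Nk\log N$ removes the excess.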
Compare Eq. (4), which using the equation of state for the ideal gas takes the form (the Sackur-Tetrode equation):

$$S(N, V, T) = Nk\log\frac{V}{N} + \frac{3}{2}Nk\log T + cN$$

where $c$ is an arbitrary constant. They differ by the term $Nk\log N$, as already noted.

1.5 The N! Puzzle
The N! puzzle is this: What justifies the subtraction of the term $Nk\log N$ from the entropy? Or equivalently, what justifies the division of the phase space volume Eq. (8) by $N!$? In fact it has a fairly obvious answer (see section 2.1): classical particles, if identical, should be treated as permutable, just like identical quantum particles. But this suggestion has rarely been taken seriously.

Much more widely favored is the view that quantum theory is needed. Classical statistical mechanics is not after all a correct theory; quantum statistical mechanics (Eq. (1)), in the dilute limit $C_s \gg N_s$, gives:

$$\frac{(N_s + C_s - 1)!}{N_s!\,(C_s - 1)!} \approx \frac{C_s^{N_s}}{N_s!}$$

yielding the required correction to (6) (setting (7) to unity). Call this the orthodox solution to the N! puzzle.

This reasoning, so far as it goes, is perfectly sound, but it does not go very far. It says nothing about why particles in quantum theory but not classical theory are permutable. If a rationale is offered, it is that classical particles are localized in space and hence are distinguishable (we shall consider this in more detail in the next section); and along with that, that the quantum state for identical particles is unchanged by permutations.16 But how the two are connected is rarely explained.

Erwin Schrödinger, in his book Statistical Thermodynamics, did give an analysis:

It was a famous paradox pointed out for the first time by W. Gibbs, that the same increase of entropy must not be taken into account, when the two molecules are of the same gas, although (according to naive gas-theoretical views) diffusion takes place then too, but unnoticeably to us, because all the particles are alike. The modern view [of quantum mechanics] solves this paradox by declaring that in the second case there is no real diffusion, because exchange between like particles is not a real event; if it were, we should have to take account of it statistically.
It has always been believed that Gibbs's paradox embodied profound thought. That it was intimately linked up with something so important and entirely new [as quantum mechanics] could hardly be foreseen. (Schrödinger 1946, 61)

Evidently, by “exchange between like particles” Schrödinger meant the sort of thing that happens when gases of classical molecules diffuse (the trajectories of individual molecules are twisted around one another), in contrast to the behavior of quantum particles, which do not have trajectories, and so do not diffuse in this way. But why the exchange of quantum particles “is not a real event” (whereas it is classically) is lost in the even more obscure question of what quantum particles really are.

Schrödinger elsewhere said something more. He wrote of indistinguishable particles as “losing their identity,” as “non-individuals,” in the way of units of money in the bank (they are “fungible”). That fitted with Planck's original idea of indistinguishable quanta as elements of energy, rather than material things: so, again, quite unlike classical particles.

On this point there seems to have been wide agreement. Schrödinger's claims about the Gibbs paradox came under plenty of criticism, for example, by Otto Stern, but Stern remarked at the end:

In conclusion, it should be emphasized that in the foregoing remarks classical statistics is considered in principle as a part of classical mechanics which deals with individuals (Boltzmann). The conception of atoms as particles losing their identity cannot be introduced into the classical theory without contradiction. (Stern 1949, 534)

This comment or similar can be found scattered throughout the literature on the foundations of quantum statistics.

There is a second solution to the N! puzzle that goes in the diametrically opposite direction: it appeals only to classical theory, precisely assuming particle distinguishability.
Call this the classical solution to the puzzle. Its origins lie in a treatment by Ehrenfest and Trkal (1920) of the equilibrium conditions for molecules subject to dissociation into a total of $N^*$ atoms. This number is conserved, but the number of molecules $N_A, N_B, \ldots$ formed of these atoms, of various types $A, B, \ldots$, may vary. The dependence of the entropy function on $N^*$ is not needed since this number never changes: it is the dependence on $N_A, N_B, \ldots$ that is relevant to the extensivity of the entropy for molecules of type $A, B, \ldots$, which can be measured. By similar considerations as in section 1.3, the number of ways
the $N^*$ atoms can be partitioned among $N_A$ molecules of type $A$, $N_B$ molecules of type $B$, … is the factor $N^*!/N_A!\,N_B! \cdots$. This multiplies the product of all the phase space volumes for each type of molecule, delivering the required division by $N_A!$ for molecules of type $A$, by $N_B!$ for molecules of type $B$, and so on (with the dependence on $N^*$ absorbed into an overall constant).

A similar argument was given by van Kampen (1984), but using Gibbs's methods. The canonical ensemble for a gas of $N^*$ particles has the probability distribution:

$$W(q, p) = f^{-1} e^{-\beta H(q, p)}.$$

Here $(q, p)$ are coordinates on the $6N^*$-dimensional phase space for the $N^*$ particles, which we suppose are confined to a volume $V^*$, $H$ is the Hamiltonian, and $f$ is a normalization constant. Let us determine the probability of finding $N$ particles with total energy $E$ in the sub-volume $V$ (so $N' = N^* - N$ are in volume $V' = V^* - V$). If the interaction energy between particles in $V'$ and $V$ is small, the Hamiltonian $H_{N^*}$ of the total system can be approximately written as the sum $H_N + H_{N'}$ of the Hamiltonians for the two subsystems. The probability density $W(N, q, p)$ for $N$ particles as a function of $\langle N, q, p\rangle = \langle \vec{q}_1, \vec{p}_1; \vec{q}_2, \vec{p}_2; \ldots; \vec{q}_N, \vec{p}_N\rangle$, where $\vec{q}_i \in V$, is then the marginal on integrating out the remaining $N'$ particles in $V'$, multiplied by the number of ways of selecting $N$ particles from $N^*$ particles. The latter is given by the binomial function:

$$\binom{N^*}{N} = \frac{N^*!}{N!\,(N^* - N)!}.$$

The result is:

$$W(N, q, p) = f^{-1}\binom{N^*}{N}\,e^{-\beta H_N(q, p)}\int_{V'} e^{-\beta H_{N'}(q', p')}\,dq'\,dp'.$$

In the limit $N^* \gg N$, the binomial is to a good approximation:

$$\binom{N^*}{N} \approx \frac{N^{*N}}{N!}.$$

The volume integral yields $V'^{\,N^* - N}$. For noninteracting particles, for constant density $\rho = V'/N'$ in the large volume limit $V' \gg V$ we obtain:

$$W(N, q, p) \propto \frac{z^N}{N!}\,e^{-\beta H_N(q, p)}$$

where $z$ is a function of $\rho$ and $\beta$. It has the required division by $N!$. Evidently this solution to the N!
puzzle is the same as in Ehrenfest and Trkal's derivation: extensivity of the entropy can only be obtained for an open system, that is, for a proper subsystem of a closed system, never for a closed one; and it follows precisely because the particles are nonpermutable. The tables are thus neatly turned.17

Which of the two, the orthodox or the classical, is the “correct” solution to the N! puzzle? It is tempting to say that both are correct, but as answers to different questions: the orthodox solution is about the thermodynamics of real gases, governed by quantum mechanics, and the classical solution is about the consistency of a hypothetical classical system of thermodynamics that in reality does not exist.

But on either line of reasoning, identical quantum particles are treated as radically unlike identical classical particles (only the former are permutable). This fits with the standard account of the departures of quantum from classical statistics: they are explained by permutability. These are the claims challenged in Part 2.

2. Indistinguishability as a Uniform Symmetry

2.1 Gibbs' Solution

There is another answer as to which of the two solutions to the N! puzzle is correct: neither. The N! puzzle arises in both classical and quantum theories and is solved in exactly the same way: by passing to the quotient space (of phase space and Hilbert space, respectively). This is not to deny that atoms really are quantum mechanical, or
that measurements of the dependence of the entropy on particle number are made in the way that Ehrenfest et al. envisaged; it is to deny that the combinatorics factors thus introduced are, except in special cases, either justified or needed.

Gibbs, in his Elementary Principles in Statistical Mechanics, put the matter as follows:

If two phases differ only in that certain entirely similar particles have changed places with one another, are they to be regarded as identical or different phases? If the particles are regarded as indistinguishable, it seems in accordance with the spirit of the statistical method to regard the phases as identical. (Gibbs 1902, 187)

He proposed that the phase of an $N$-particle system be unaltered “by the exchange of places between similar particles.” Phases (points in phase space) like this he called generic (and those that are altered, specific). The state space of generic phases is the reduced phase space $\Gamma_N/\Pi_N$, the quotient space under the permutation group $\Pi_N$ of $N$ elements. In this space points of $\Gamma_N$ related by permutations are identified.

The suggestion is that even classically, the expressions (6) and (7) are wrong. (7) is replaced by unity (as already noted): there is just one way of partitioning $N$ permutable particles among the various states so as to give $N_s$ particles to each state. But (6) is wrong too: it should be replaced by the volume of reduced phase space corresponding to the macrostate (for $s$), the volume

$$\frac{C_s^{N_s}\,\tau^{N_s}}{N_s!}.$$

For the macrostate $N_1, N_2, \ldots, N_s, \ldots$ the total reduced volume, denoted $W_{\mathrm{red}}$, is:

$$W_{\mathrm{red}} = \prod_s \frac{C_s^{N_s}\,\tau^{N_s}}{N_s!}. \tag{14}$$

The derivation does not depend on the limiting behavior of Eq. (1), or on the assumption of equiprobability or equality of volume of each fine-grained distribution (and is in fact in contradiction with that assumption, as we shall see).

Given (14), there is no entropy of mixing. Consider a system of particles all with the same energy $\varepsilon_s$.
The total entropy before mixing is, from additivity:

$$S_A + S_B = k\log\frac{C_A^{N_A}\,\tau^{N_A}}{N_A!} + k\log\frac{C_B^{N_B}\,\tau^{N_B}}{N_B!}. \tag{15}$$

After mixing, if $A$ and $B$ are identical:

$$S_{A+B} = k\log\frac{(C_A + C_B)^{N_A + N_B}\,\tau^{N_A + N_B}}{(N_A + N_B)!}. \tag{16}$$

If the pressure of the two samples is initially the same (so $C_A/N_A = C_B/N_B$), the quantities (15), (16) should be approximately equal18 (as can easily be verified in the Stirling approximation). But if $A$ and $B$ are not identical, and permutation of $A$ particles with $B$ particles is not a symmetry, we pass to the quotient spaces under $\Pi_{N_A}$ and $\Pi_{N_B}$ separately and take their product, and the denominator in (16) should be $N_A!\,N_B!$. With that, $S_A + S_B$ and $S_{A+B}$ are no longer even approximately the same.

Gibbs concluded his discussion of whether to use generic or specific phases with the words, “The question is one to be decided in accordance with the requirements of practical convenience in the discussion of the problems with which we are engaged” (Gibbs 1902, 188). Practically speaking, if we are interested in defining an extensive classical entropy function (even for closed systems), use of the generic phase (permutability) is clearly desirable. On the other hand, integral and differential calculus is simple on manifolds homeomorphic to $\mathbb{R}^{6N}$, like $\Gamma_N$; the reduced phase space $\Gamma_N/\Pi_N$ has by contrast a much more complex topology (a point made by Gibbs). If the needed correction, division by $N!$, can be simply made at the end of a calculation, the second consideration will surely trump the first.
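The comparison between the entropies before and after mixing can be made exact with the log-gamma function. The following is a minimal sketch, assuming the reduced single-level volume has the form $C^N\tau^N/N!$ and working in units where $k = 1$ and $\tau = 1$; the particle numbers, cell counts, and names are illustrative, chosen so that $C_A/N_A = C_B/N_B$ (equal initial pressure).

```python
import math


def log_reduced_volume(n, c):
    """log of C^N / N!: assumed reduced volume for one energy level, tau = 1."""
    return n * math.log(c) - math.lgamma(n + 1)


n_a, n_b = 4000, 6000                 # hypothetical particle numbers
c_a, c_b = 4 * 10**6, 6 * 10**6       # cell counts with C_A/N_A == C_B/N_B
n, c = n_a + n_b, c_a + c_b

# Entropy before mixing, from additivity (in units of k).
before = log_reduced_volume(n_a, c_a) + log_reduced_volume(n_b, c_b)

# After mixing, A and B identical: quotient under the full permutation group.
after_identical = n * math.log(c) - math.lgamma(n + 1)

# After mixing, A and B distinct: quotient under the two groups separately.
after_distinct = (n * math.log(c)
                  - math.lgamma(n_a + 1) - math.lgamma(n_b + 1))

# Thermodynamic entropy of mixing in units of k.
mixing = -n_a * math.log(n_a / n) - n_b * math.log(n_b / n)

print(after_identical - before)           # small compared with `mixing`
print(after_distinct - before, mixing)    # these two agree
```

For identical gases the residual difference is logarithmic in the particle numbers, negligible against terms of order $N$; for distinct gases the full entropy of mixing reappears.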
2.2 Arguments against Classical Indistinguishability

Are there principled arguments against permutability thus treated uniformly, the same in the classical as in the quantum case?

The concept of permutability can certainly be misrepresented. Thus, classically, of course, it makes sense to move atoms about so as to interchange one with another, for particles have definite trajectories; in that sense an “exchange of places” must make for a real physical difference, and in that sense “indistinguishability” cannot apply to classical particles. But that is not what is meant by “interchange”; Schrödinger was just misleading on this point. It is interchange of points in phase space whose significance is denied, not in configuration space over time. Points in phase space are in 1:1 correspondence with the dynamically allowed trajectories. A system of $N$ particles whose trajectories in μ-space swirl about one another, leading to an exchange of two or more of them in their places in space at two different times, is described by each of $N!$ points in the $6N$-dimensional phase space $\Gamma_N$, each faithfully representing the same swirl of trajectories in μ-space (but assigning different labels to each trajectory). In passing to points of the quotient space $\Gamma_N/\Pi_N$ there is therefore no risk of descriptive inadequacy in representing particle interchange in Schrödinger's sense.

Another and more obscure muddle is to suppose that points of phase space can only be identified insofar as they are all traversed by one and the same trajectory. That appears to be the principle underlying van Kampen's argument:

One could add, as an aside, that the energy surface can be partitioned in N! equivalent parts, which differ from one another only by a permutation of the molecules. The trajectory, however, does not recognize this equivalence because it cannot jump from one point to an equivalent one.
There can be no good reason for identifying the Z-star [the region of phase space picked out by given macroscopic conditions] with only one of these equivalent parts. (Van Kampen 1984, 307)

But if the whole reason to consider the phase-space volumes of macrostates in deriving thermodynamic behavior is because (say by ergodicity) they are proportional to the amount of time the system spends in the associated macrostates, then, just because the trajectory cannot jump from one point to an equivalent one, it should be enough to consider only one of the equivalent parts of the Z-star. We should draw precisely the opposite conclusion to van Kampen.

However, van Kampen put the matter somewhat differently, in terms only of probability:

Gibbs argued that, since the observer cannot distinguish between different molecules, “it seems in accordance with the spirit of the statistical method” to count all microscopic states that differ only by a permutation as a single one. Actually it is exactly opposite to the basic idea of statistical mechanics, namely that the probability of a macrostate is given by the measure of the Z-star, i.e. the number of corresponding, macroscopically indistinguishable microstates. As mentioned … it is impossible to justify the N! as long as one restricts oneself to a single closed system. (van Kampen 1984, 309, emphasis added)

Moreover, he speaks of probabilities of macroscopically indistinguishable microstates, whereas the contentious question concerns microscopically indistinguishable microstates. The contentious question is whether microstates that differ only by particle permutations, with all physical properties unchanged (which are in this sense indistinguishable), should be identified.
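What it means to identify microstates that differ only by particle permutations can be illustrated with a toy computation. The sketch below is not drawn from any of the authors discussed: it assumes three identical particles, each with a single position and momentum coordinate, and uses sorting to pick a canonical representative of each orbit under permutations; `generic_phase` is a hypothetical name echoing Gibbs's terminology.

```python
from itertools import permutations


def generic_phase(specific_phase):
    """Map a labeled (specific) phase to a canonical representative of its
    orbit under particle permutations, i.e. to a point of the quotient space.

    Sorting the one-particle states discards only the labels: every
    one-particle (position, momentum) pair is retained.
    """
    return tuple(sorted(specific_phase))


# Three particles, each with one position and one momentum coordinate.
specific = ((0.0, 1.5), (2.0, -0.3), (1.0, 0.7))

# The N! = 6 labeled points of phase space related by permutations.
orbit = {tuple(p) for p in permutations(specific)}
print(len(orbit))  # 6

# All of them map to a single point of the reduced phase space.
print(len({generic_phase(p) for p in orbit}))  # 1
```

Nothing about the one-particle states is lost in the passage to the representative; only the assignment of labels to those states is dropped.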
Alexander Bach in his book Classical Particle Indistinguishability defended the concept of permutability of states in classical statistical mechanics, understood as the requirement that probability distributions over microstates be invariant under permutations. But what he meant by this is the invariance of functions on $\Gamma_N$. As such, as probability measures, they could never provide complete descriptions of the particles (unless all their coordinates coincide): they could not be concentrated on individual trajectories. He called this the “deterministic setting.” In his own words:

Indistinguishable Classical Particles Have No Trajectories. The unconventional role of indistinguishable classical particles is best expressed by the fact that in a deterministic setting no
indistinguishable particles exist, or, equivalently, that indistinguishable classical particles have no trajectories. Before I give a formal proof I argue as follows. Suppose they have trajectories, then the particles can be identified by them and are, therefore, not indistinguishable. (Bach 1997, 7)

His formal argument was as follows. Consider the coordinates of two particles at a given time, in one dimension, as an extremal of the set of probability measures $M_+^1(\mathbb{R}^2)$ on $\mathbb{R}^2$ (a 2-dimensional configuration space), from which, assuming the two particles are impenetrable, the diagonal $D = \{\langle x, x\rangle \in \mathbb{R}^2,\ x \in \mathbb{R}\}$ has been removed. Since indistinguishable, the state of the two particles must be unchanged under permutations (permutability), so it must be in $M_{+,\mathrm{sym}}^1(\mathbb{R}^2)$, the space of symmetrized measures. It consists of sums of delta functions of the form:

$$\frac{1}{2}\left(\delta_{\langle x, y\rangle} + \delta_{\langle y, x\rangle}\right), \qquad x \neq y.$$

But no such state is an extremal of $M_+^1(\mathbb{R}^2)$.

As already remarked, the argument presupposes that the coordinates of the two particles define a point in $M_+^1(\mathbb{R}^2)$, the unreduced space, rather than in $M_+^1(\mathbb{R}^2/\Pi_2)$, the space of probability measures over the reduced space $\mathbb{R}^2/\Pi_2$. In the latter case, since $M_+^1(\mathbb{R}^2/\Pi_2)$ is isomorphic to $M_{+,\mathrm{sym}}^1(\mathbb{R}^2)$, there is no difficulty.

Bach's informal argument above is more instructive. Why not use the trajectory of a particle to identify it, by the way it twists and turns in space? Why not indeed: if that is all there is to being a particle, you have already passed to a trajectory in the quotient space $\Gamma_N/\Pi_N$, for those related by permutations twist and turn in exactly the same way.
The concept of particle distinguishability is not about the trajectory or the one-particle state: it is about the label of the trajectory or the one-particle state, or equivalently, the question of which particle has that trajectory, that state.19

2.3 Haecceitism

Gibbs's suggestion was called “fundamentally idealistic” by Rosenfeld, “mystical” by van Kampen, “inconsistent” by Bach; none of them was prepared to see in indistinguishability the rejection of what is at first sight a purely metaphysical doctrine: that after every describable characteristic of a thing has been accounted for, there still remains the question of which thing has those characteristics.

The key word is “every”; describe a thing only partly, and the question of which it is of several more precisely described things is obviously physically meaningful. But microstates, we take it, are maximal, complete descriptions. If there is a more complete level of description it is the microstate as given by another theory, or at a deeper level of description by the same theory, and to the latter our considerations apply.

The doctrine, now that we have understood it correctly, has a suitably technical name in philosophy. It is called haecceitism. Its origins are medieval if not ancient, and it was in play, one way or another, in a connected line of argument from Newton and Clarke to Leibniz and Kant. That centered on the need, given symmetries, including permutations, not just for symmetry-breaking in the choice of initial conditions,20 but for a choice among haecceitistic differences: in the case of continuous symmetries, among values of absolute positions, absolute directions, and absolute velocities.

All parties to this debate agreed on haecceitism.
These choices were acts of God, with their consequences visible only to God (Newton, Clarke); or they were humanly visible too, but in ways that could not be put into words, that could only be grasped by “intuition” (Kant); or they involved choices not even available to God, who can only choose on the basis of reason; so there could be no created things such as indistinguishable atoms or points of a featureless space (Leibniz).21

So much philosophical baggage raises a worry in its own right. If it is the truth or falsity of haecceitism that is at issue, it seems unlikely that it can be settled by any empirical finding. If that is what the extensivity of the entropy is about, perhaps extensivity has no real physical meaning after all. It is, perhaps, itself metaphysical, or conventional. This was the view advocated by Nick Huggett when he first drew the comparison between Boltzmann's combinatorics and haecceitism.22
But this point of view is only remotely tenable if haecceitism is similarly irrelevant to empirical questions in quantum statistics. And on the face of it that cannot be correct. Planck was, after all, led by experiment to Eq. (1). Use of the unreduced state space in quantum mechanics rather than the reduced (symmetrized) space surely has direct empirical consequences.

Against this two objections can be made. The first, following Reichenbach (1956), is that the important difference between quantum and classical systems is the absence in quantum theory of a criterion for the re-identification of identical particles over time. They are, for this reason, “non-individuals” (this links with Schrödinger's writings23). This, rather than any failure of haecceitism, is what is responsible for the departures from classical statistics.24 The second, following Post (1963) and French and Redhead (1988), is that haecceitism must be consistent with quantum statistics (including Planck's formula) because particles, even given the symmetrization of the state, may nevertheless possess “transcendental” individuality, and symmetrization of the state can itself be understood as a dynamical constraint on the state, rather than in terms of permutability.

Of these the second need not detain us. Perhaps metaphysical claims can be isolated from any possible impact on physics, but better, surely, is to link them with physics where such links are possible. Or perhaps we were wrong to think that haecceitism is a metaphysical doctrine: it just means nonpermutability, it is to break the permutation symmetry. The converse view is to respect this symmetry.

As for the first, it is simply not true that indistinguishable quantum particles can never be reidentified over time.
Such identifications are only exact in the kinematic limit, to be sure, and even then only for a certain class of states; but the ideal gas is commonly treated in just such a kinematic limit, and the restriction in states applies just as much to the reidentification of identical quantum particles that are not indistinguishable (that are not permutable) but that are otherwise entangled.

This point needs some elaboration. Consider first the case of nonpermutable identical particles. The N-particle state space is then $\mathcal{H}^N = \mathcal{H} \otimes \mathcal{H} \otimes \dots \otimes \mathcal{H}$, the N-fold tensor product of the one-particle state space $\mathcal{H}$. Consider states of the form:

$$|\phi_a\rangle \otimes |\phi_b\rangle \otimes \dots \otimes |\phi_c\rangle \otimes \dots \otimes |\phi_d\rangle \qquad (17)$$

where the one-particle states are members of some orthonormal basis (we allow for repetitions). The kth particle is then in the one-particle state $|\phi_c\rangle$ that occupies the kth place in the product. The ordering of the tensor product breaks the permutation symmetry. If the particles are only weakly interacting, and the state remains a product state, the kth particle can also be assigned a one-particle state at later times, namely the unitary evolute of $|\phi_c\rangle$. Even if more than one particle has the initial state $|\phi_c\rangle$, still it will be the case that each particle in that state has a definite orbit under the unitary evolution. It is true that in those circumstances it would seem impossible to tell the two orbits apart, but the same will be true of two classical particles with exactly the same representative points in μ-space.25

Now notice the limitation of this way of speaking of particles as one-particle states that are (at least conceptually) identifiable over time: it does not in general apply to superpositions of states of the form (17), as will naturally arise if the particles are interacting, even starting from (17). In general, given superpositions of product states, there is no single collection of N one-particle states, or orbits of one-particle states, sufficient for the description of the N particles over time.
In these circumstances no definite histories, no orbits of one-particle states, can be attributed to identical but distinguishable particles.

Now consider identical permutable quantum particles (indistinguishable quantum particles). The state must now be invariant under permutations, so (for vector states):

$$U_\pi |\Phi\rangle = |\Phi\rangle \qquad (18)$$

for every $\pi \in \Pi_N$, where $U: \pi \to U_\pi$ is a unitary representation of the permutation group $\Pi_N$. Given (18), $|\Phi\rangle$ must be of the form:

$$|\Phi\rangle = c \sum_{\pi \in \Pi_N} |\phi_{\pi(a)}\rangle \otimes |\phi_{\pi(b)}\rangle \otimes \dots \otimes |\phi_{\pi(c)}\rangle \otimes \dots \otimes |\phi_{\pi(d)}\rangle \qquad (19)$$

and superpositions thereof. Here c is a normalization constant, $\pi \in \Pi_N$ is a permutation of the N symbols {a, b,
…, c, …, d} (which, again, may have repetitions), and as before, the one-particle states are drawn from some orthonormal set in $\mathcal{H}$. If noninteracting, and initially in the state (19), the particle in the state $|\phi_c\rangle$ can still be reidentified over time, as the particle in the state which is the unitary evolute of $|\phi_c\rangle$.26 That is to say, for entanglements like this, one-particle states can still be tracked over time.

It is true that we can no longer refer to the state as that of the kth particle, in contrast to states of the form (17), but that labeling (unless shorthand for something else, say the lattice position of an atom in a crystal) never had any physical meaning. As for more entangled states (superpositions of states of the form (19)) there is of course a difficulty; but it is the same difficulty as we encountered for identical but distinguishable particles.

Reichenbach was therefore right to say that quantum theory poses special problems for the reidentification of identical particles over time, and that these problems derive from entanglement; but not from the “mild”27 form of entanglement required by symmetrization itself (as involved in states of the form (19)), of the sort that explains quantum statistics.

On the other hand, this much is also true: permutability does rule out appeal to the reduced density matrix to distinguish each particle in time (defined, for the kth particle, by taking the partial trace of the state over the Hilbert space of all the particles save the kth). Given (anti)symmetrization, the reduced density matrices will all be the same. But it is hard to see how the reduced density matrix can provide an operational as opposed to a conceptual criterion for the reidentification of one among N identical particles over time.

What would an operational criterion look like?
Here is a simple example: a helium atom in the canister of gas by the laboratory door is thereby distinguished from one in the high-vacuum chamber in the corner, a criterion that is preserved over time. This means: the one-particle state localized in the canister is distinguished from the one in the vacuum chamber.

We shall encounter this idea of reference and reidentification by location (or more generally by properties) again, so let us give it a name: call it individuating reference, and the properties concerned individuating properties. In quantum mechanics the latter can be represented in the usual way by projection operators. Thus if $P_{\mathrm{can}}$ is the projector onto the region of space $\Delta_{\mathrm{can}}$ occupied by the canister, and $P_{\mathrm{cham}}$ onto the region $\Delta_{\mathrm{cham}}$ occupied by the vacuum chamber, and if $|\chi_1\rangle$, $|\chi_2\rangle$ are localized in $\Delta_{\mathrm{can}}$ (and similarly $|\psi_1\rangle$, $|\psi_2\rangle$ in $\Delta_{\mathrm{cham}}$), then even in the superposition (where $|c_1|^2 + |c_2|^2 = 1$) one can still say there is a state in which one particle is in region $\Delta_{\mathrm{can}}$ and one in $\Delta_{\mathrm{cham}}$ (but we cannot say which); still we have:

(20)

If the canister and vacuum chamber are well-sealed, this condition will be preserved over time. Individuating properties can be defined in this way just as well for permutable as for nonpermutable identical particles.

It is time to take stock. We asked whether the notion of permutability can be applied to classical statistical mechanics. We found that it can, in a way that yields the desired properties of the statistical mechanical entropy function, bringing it in line with the classical thermodynamic entropy. We saw that arguments for the unintelligibility of classical permutability in the literature are invalid or unsound, amounting, at best, to appeal to the philosophical doctrine of haecceitism.
We knew from the beginning that state-descriptions in the quantum case should be invariant under permutations, and that this has empirical consequences, so on the most straightforward reading of haecceitism the doctrine is false in that context. Unless it is emasculated from all relevance to physics, haecceitism cannot be true a priori. We wondered if it was required or implied if particles are to be reidentified over time, and found the answer was no to both, in the quantum as in the classical case. We conclude: permutation symmetry holds of identical classical particles just as it does of identical quantum particles, and may be treated in the same way, by passing to the quotient space.

Yet an important lacuna remains, for among the desirable consequences of permutation symmetry in the case of quantum particles are the departures from classical statistics, statistics that are unchanged in the case of classical particles. Why is there this difference?
2.4 The Explanation of Quantum Statistics

Consider again the classical reduced phase-space volume for the macrostate $N_1, N_2, \dots, N_s, \dots$, as given by Eq.(14):

$$\prod_s \frac{C_s^{N_s}\,\tau^{N_s}}{N_s!} \qquad (21)$$

In effect, Planck replaced the one-particle phase-space volume element τ, hitherto arbitrary, by $h^3$, and changed the factor $Z_s$ by which it was multiplied to obtain:

$$\prod_s \frac{(N_s + C_s - 1)!}{N_s!\,(C_s - 1)!}\, h^{3N_s} \qquad (22)$$

Continuing from this point, using the method of sections 1.3 and 1.4 one is led to the equilibrium entropy function and equation of state for the ideal Bose-Einstein gas. The entire difference between this and the classical ideal gas is that for each s, the integer $C_s^{N_s}$ is replaced by $(N_s + C_s - 1)!/(C_s - 1)!$.

What is the rationale for this? It does not come from particle indistinguishability (permutability); that has already been taken into account in (21). Let us focus on just one value of s, that is, on $N_s$ particles distributed over $C_s$ cells, all of the same energy (and hereinafter drop the subscript s). At the level of the fine-grained description, in terms of how many (indistinguishable) particles are in each (distinguishable) cell, a microstate is specified by a sequence of fine-grained occupation numbers $\langle n_1, n_2, \dots, n_C\rangle$, where $\sum_{j=1}^{C} n_j = N$; there are many such corresponding to the coarse-grained description (N, C) (for a single value of s). Their sum is

$$\sum_{\langle n_1, \dots, n_C\rangle:\ \sum_j n_j = N} 1 = \frac{(N + C - 1)!}{N!\,(C - 1)!} \qquad (23)$$

as before. But here is another mathematical identity:28

$$\sum_{\langle n_1, \dots, n_C\rangle:\ \sum_j n_j = N} \frac{1}{n_1! \cdots n_C!} = \frac{C^N}{N!} \qquad (24)$$

In other words, the difference between the two expressions (21) and (22), apart from the replacement of the unit τ by $h^3$, is that in quantum mechanics every microstate $\langle n_1, n_2, \dots, n_C\rangle$ has equal weight, whereas in classical mechanics each is weighted by the factor $(n_1! \cdots n_C!)^{-1}$.

Because of this weighting, a classical fine-grained distribution where the N particles are evenly distributed over the C cells has a much greater weight than one where most of the particles are concentrated in a small handful.
In contrast, in quantum mechanics, the weights are always the same. Given that “weight,” one way or another, translates into statistics, particles weighted classically thus tend to repel, in comparison to their quantum mechanical counterparts; or put the other way, quantum particles tend to bunch together, in comparison to their classical counterparts.

That is what the weighting does, but why is it there? Consider figure 10.1a, for N = 2, C = 4. Suppose, for concreteness, we are modeling two classical, non-permutable identical coins, such that the first two cells correspond to one of the coins landing heads (H), and the remainder to that coin landing tails (T) (and similarly for the other coin).29 The cells along the diagonal correspond not just to both coins landing heads or both landing tails: they are cells in which the two coins have all their fine-grained properties the same. For any cell away from the diagonal, there is a corresponding cell that differs only in which coin has which fine-grained property (its reflection in the diagonal). Their combined volume in phase space is therefore twice that of any cell on the diagonal.
Figure 10.1 Phase space and reduced phase space for two particles

The same is true in the reduced phase space, figure 10.1b. For N = 3 there are three such diagonals; cells along these have one half the volume of the others. And there is an additional boundary, where all three particles have the same fine-grained properties, each with one sixth their volume. The weights in Eq.(24) follow from the structure of reduced phase space, as faithfully preserving ratios of volumes of microstates in the unreduced space.

As explained by Huggett (1999a), two classical identical coins, if permutable, still yield a weight for {H, T} twice that of the weight for {H, H} or {T, T}, just as for nonpermutable coins, that is with probabilities one-half, one-quarter, and one-quarter respectively.

Contrast quantum mechanics, where subspaces of Hilbert space replace regions of phase space, and subspace dimension replaces volume measure. Phase space structure, insofar as it can be defined in quantum theory, is derivative and emergent. Since the only measure available is subspace dimension, each of a set of orthogonal directions in each subspace is weighted precisely the same, yielding, for the symmetrized Hilbert space, Eq.(23) instead.30
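The classical side of this comparison can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original argument: it assumes that a fine-grained microstate is a tuple of occupation numbers and that its classical weight is the factor 1/(n_1! ... n_C!) described above. The function names are illustrative only, and the coin model assumes two cells per face, as in the N = 2, C = 4 example.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def occupation_sequences(n_particles, n_cells):
    """Yield every tuple (n_1, ..., n_C) of non-negative integers summing to N."""
    for seq in product(range(n_particles + 1), repeat=n_cells):
        if sum(seq) == n_particles:
            yield seq


def classical_weight(seq):
    """Weight 1/(n_1! ... n_C!) of a microstate in reduced phase space."""
    denom = 1
    for n in seq:
        denom *= factorial(n)
    return Fraction(1, denom)


def count_and_weighted_sum(n_particles, n_cells):
    """Return (number of microstates, sum of their classical weights)."""
    seqs = list(occupation_sequences(n_particles, n_cells))
    return len(seqs), sum(classical_weight(s) for s in seqs)


def permutable_coin_probabilities(cells_per_face=2):
    """Two classical permutable coins: the first cells are heads, the rest tails."""
    n_cells = 2 * cells_per_face
    weights = {"HH": Fraction(0), "HT": Fraction(0), "TT": Fraction(0)}
    for seq in occupation_sequences(2, n_cells):
        heads = sum(seq[:cells_per_face])
        weights[{2: "HH", 1: "HT", 0: "TT"}[heads]] += classical_weight(seq)
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


if __name__ == "__main__":
    N, C = 2, 4
    count, weighted = count_and_weighted_sum(N, C)
    print(count == comb(N + C - 1, N))                  # equal-weight count
    print(weighted == Fraction(C**N, factorial(N)))     # weighted sum
    print(permutable_coin_probabilities())
```

Under these assumptions the unweighted count reproduces (N + C − 1)!/(N!(C − 1)!), the weighted sum reproduces C^N/N!, and the permutable classical coins come out with probabilities one-half for {H, T} and one-quarter each for {H, H} and {T, T}.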
Figure 10.2 Discrete measures for Hilbert space

But there are two cases when subspace dimension and volume measure are proportional to one another, or rather, for we take quantum theory as fundamental,31 when phase-space structure, complete with volume measure, emerges from quantum theory. One is in the limit C ≫ N, when the contribution from the states along the diagonals is negligible in comparison to the total (figure 10.2b), and the other is when the full Hilbert space for nonpermutable particles is used. That is why permutability makes a difference to statistics in the quantum case but not the classical: for N ≈ C, as in figure 10.2a, the dimensionality measure departs significantly from the volume measure (in figure 10.2a, as five-eighths to one-half). For N = 2, C = 2 there are just three orthogonal microstates, each of equal weight. Take two two-state quantum particles (qubits) as quantum coins, and the probabilities {H, H}, {T, T}, {H, T} are all one-third.

Is there a remaining puzzle about quantum statistics, say the nonindependence of permutable quantum particles, as noted by Einstein? Statistical independence fails, in that the state cannot be specified for N − 1 particles, independent of the state of the Nth, but that is true of classical states on reduced phase space too (or, indeed, for permutation-invariant states on the unreduced phase space; see Bach 1997). Find a way to impose a discrete measure on a classical permutable system, and one can hope to reproduce quantum statistics as well (Gottesman 2005). Quantum holism has some role to play in the explanation of quantum statistics, but like entanglement and identity over time, less than meets the eye.

2.5 Fermions

We have made almost no mention so far of fermions. In fact most of our discussion applies to fermions too, but there are some differences.

Why are there fermions at all?
The reason is that microstates in quantum theory are actually rays, not vector states $|\phi\rangle$, that is, they are 1-dimensional subspaces of Hilbert space. As such they are invariant under multiplication by complex numbers of unit norm. If only the ray need be invariant under permutations, there is an alternative to Eq. (18), namely:

$$U_\pi |\Phi\rangle = e^{i\theta} |\Phi\rangle \qquad (25)$$
where θ ∊ [0, 2π]. Since any permutation can be decomposed as a product of permutations $\pi_{ij}$ (that interchange i and j), even or odd in number, and since $\pi_{ij}\pi_{ij} = I$, it follows that (18) need not be obeyed after all: there is the new possibility that θ = 0 or π for even and odd permutations, respectively. Such states are antisymmetrized, that is, of the form:

$$|\Psi_{FD}\rangle = c \sum_{\pi \in \Pi_N} \mathrm{sgn}(\pi)\, |\phi_{\pi(a)}\rangle \otimes |\phi_{\pi(b)}\rangle \otimes \dots \otimes |\phi_{\pi(c)}\rangle \otimes \dots \otimes |\phi_{\pi(d)}\rangle \qquad (26)$$

where sgn(π) = 1 (−1) for even (odd) permutations, and superpositions thereof. An immediate consequence is that, unlike in (19), every one-particle state in (26) must now be orthogonal to every other: repetitions would automatically cancel, leaving no contribution to $|\Psi_{FD}\rangle$. Since superpositions of states (19) with (26) satisfy neither (18) nor (25), permutable particles in quantum mechanics must be of one kind or the other.32

The connection between phase space structure and antisymmetrization of the state is made by the Pauli exclusion principle, the principle that no two fermions can share the same complete set of quantum numbers, or equivalently, have the same one-particle state. In view of the effective identification of elementary phase space cells of volume $h^3$ with rays in Hilbert space, fermions will be constrained so that no two occupy the same elementary volume. In other words, in terms of microstates in phase space, the $n_k$'s are all zeros or ones. In place of Eq.(23), we obtain for the number of microstates for the coarse-grained distribution 〈C, N〉 (as before, for a single energy level s):

$$\frac{C!}{N!\,(C - N)!} \qquad (27)$$

Use of (27) in place of (1) yields the entropy and equation of state for the Fermi-Dirac ideal gas. It is, of course, extensive.

A classical phase space structure emerges from this theory in the same limit C ≫ N (for each s) as for the Bose-Einstein gas, when the classical weights for cells along the diagonals are small in comparison to the total.
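The quantum counts for bosons and fermions, and their common classical limit, can be illustrated in the same enumerative spirit. This is a sketch under stated assumptions rather than anything taken from the text: a boson microstate is modeled as an unordered selection of cells with repetition, a fermion microstate as an unordered selection without repetition, and the classical volume is taken in units of the elementary cell as C^N/N!. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import comb, factorial


def boson_microstates(n, cells):
    """Unordered selections with repetition: one per symmetrized basis state."""
    return list(combinations_with_replacement(cells, n))


def fermion_microstates(n, cells):
    """Unordered selections without repetition: occupation numbers 0 or 1 only."""
    return list(combinations(cells, n))


def classical_volume(n, c):
    """Reduced classical phase-space volume in units of the cell size: C^N / N!."""
    return Fraction(c**n, factorial(n))


def quantum_coin_probabilities():
    """Two permutable two-state particles, every microstate weighted equally."""
    states = boson_microstates(2, "HT")
    return {"".join(s): Fraction(1, len(states)) for s in states}


if __name__ == "__main__":
    N, C = 2, 4
    n_bose = len(boson_microstates(N, range(C)))
    n_fermi = len(fermion_microstates(N, range(C)))
    print(n_bose == comb(N + C - 1, N), n_fermi == comb(C, N))
    # Dimension measure versus volume measure, as fractions of the unreduced total.
    print(Fraction(n_bose, C**N), classical_volume(N, C) / C**N)
    print(quantum_coin_probabilities())
    # As C grows with N fixed, both quantum counts approach the classical volume.
    for c in (4, 100, 10_000):
        v = classical_volume(N, c)
        print(c, float(comb(N + c - 1, N) / v), float(comb(c, N) / v))
```

For N = 2, C = 4 the two measures come out as five-eighths and one-half, and the boson and fermion counts bracket the classical volume from above and below, with both ratios tending to one as C grows.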
Away from this limit, whereas for bosons their weight is too small (as suppressed by the factor $(n_1! \cdots n_C!)^{-1}$), for fermions their weight is too large (as not suppressed enough; they should be set equal to zero). Thus fermions tend to repel, in comparison to non-permutable particles.33

3. Ontology

The explanation of quantum statistics completes the main argument of this chapter: permutation symmetry falls in place as with any other exact symmetry in physics, and applies just as much to classical systems of equations that display it as to quantum systems.34 In both cases only quantities invariant under permutations are physically real. This is the sense in which “exchange between like particles is not a real event”; it has nothing to do with the swirling of particles around each other, it has only to do with haecceistic redundancies in the mathematical description of such particles, swirling or otherwise.

Similar comments apply to other symmetries in physics, where instead of haecceistic differences one usually speaks of coordinate-dependent distinctions. In both classical and quantum theory state-spaces can be defined in terms only of invariant quantities. In quantum mechanics particle labels need never be introduced at all (the so-called “occupation number formalism”), a formulation recommended by Teller (1995). Why introduce quantities (particle labels) only to deprive them of physical significance? What is their point if they are permutable? We come back to Quine's question and to eliminativism.

There are two sides to this question. One is whether, or how, permutable particles can be adequate as ontology (section 3.1), and link in a reasonable way with philosophical theories of ontology (sections 3.2 and 3.3). The other question is whether some other way of talking might not be preferable, in which permutability as a symmetry does not even arise (section 3.4).

3.1 The Gibbs Paradox, Again
A first pass at the question of whether permutable entities are really objects is to ask how they may give rise to nonpermutable objects. That returns us to the Gibbs paradox in the sense of section 1.2: How similar do objects have to be to count as identical?

On this problem (as opposed to the N! problem) section 2 may seem a disappointment. It focused on indistinguishability as a symmetry, but the existence of a symmetry (or otherwise) seems just as much an all-or-nothing affair as identity. But section 2 did more than that: it offered a microscopic dynamical analysis of the process of mixing of two gases.

In fact, not even the N! problem is entirely solved, for we would still like to have an extensive entropy function even where particles are obviously non-identical, say in the statistical behavior of large objects (like stars), and of small but complex objects like fatty molecules in colloids.35 In these cases we can appeal to the Ehrenfest-Trkal-van Kampen approach, but only given that we can arrive at a description of such objects as distinguishable: How do we do that, exactly?

The two problems are related, and an answer to both lies in the idea of individuating properties, already introduced, and the idea of phase-space structure as emergent, already mentioned. For if particles (or bound states of particles) acquire some dynamically stable properties, there is no reason that they should not play much the same role, in the definition of effective phase-space structure, as do intrinsic ones. Thus two or more nonidentical gases may arise, even though their elementary constituents are identical and permutable, if all the molecules of one gas have some characteristic arrangement, different from those of the other. The two gases will be nonidentical only at an effective, emergent level of description to be sure, and permutation symmetries will still apply at the level of the full phase-space.
The effective theory will have only approximate validity, in regimes where those individuating properties are stable in time. Similar comments apply to Hilbert-space structures.36

Figure 10.3 Individuating properties as particle labels

In illustration, consider again figure 10.1b for two classical permutable coins. Suppose that the dynamics is such that one of the coins always rotates about its axis of symmetry in the opposite direction to the other. This fact is recorded in the microstate: each coin not only lands either heads (H) or tails (T), but lands rotating one way (A) or the other (B). It follows that certain regions of the reduced phase space are no longer accessible, among them the cells on the diagonal for which all the properties of the two coins are the same (shaded, figure 10.3a). By inspection, the available phase space has the effective structure of an unreduced phase space for distinguishable coins, the A coin and the B coin (figure 10.3b). It is tempting to add “even if there is no fact of the matter as to which of the coins is the A coin, and which is the B coin,” but there is another way of putting it: the coin that is the A coin is the one rotating one way, the B coin is the one rotating the other way.37

The elimination of the diagonals makes no difference to particle statistics (since this is classical theory), but analogous reasoning applies to the quantum case, where it does. Two quantum coins, thus dynamically distinguished, will land one head and one tail with probability one half, not one third.

The argument carries over unchanged in the language of Feynman diagrams. Thus, the two scattering processes depicted in figure 10.4 cannot (normally) be dynamically distinguished if the particles are permutable.
Correspondingly, there is an interference effect that leads to a difference in the probability distributions for scattering processes involving permutable particles from those for distinguishable particles. But if dynamical distinctions A and B can be made between the two particles, stable over time (in our terms, if A and B are individuating properties), the interference terms will vanish, and the scattering amplitudes will be the same as for distinguishable particles.38
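The coin illustration above can be made concrete by enumeration. The sketch below is an assumption-laden toy model, not the author's construction: a one-coin state is a tuple whose first entry is the face (H or T), optionally followed by a rotation label (A or B); the microstates of two permutable quantum coins are unordered pairs of such states, all equally weighted; and the dynamical constraint is modeled simply by discarding pairs that do not contain one A and one B. All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product

FACES = ("H", "T")
ROTATIONS = ("A", "B")


def permutable_pairs(one_coin_states):
    """Unordered pairs: equally weighted microstates of two permutable quantum coins."""
    return list(combinations_with_replacement(one_coin_states, 2))


def prob_one_head_one_tail(pairs):
    """Fraction of the given microstates in which the faces differ."""
    hits = [p for p in pairs if {p[0][0], p[1][0]} == {"H", "T"}]
    return Fraction(len(hits), len(pairs))


def accessible(pairs):
    """Keep only pairs with one coin rotating each way (the individuating property)."""
    return [p for p in pairs if {p[0][1], p[1][1]} == {"A", "B"}]


def as_ordered(pair):
    """Relabel an accessible pair as (face of the A coin, face of the B coin)."""
    a_coin = next(s for s in pair if s[1] == "A")
    b_coin = next(s for s in pair if s[1] == "B")
    return (a_coin[0], b_coin[0])


if __name__ == "__main__":
    plain = permutable_pairs([(f,) for f in FACES])
    print(len(plain), prob_one_head_one_tail(plain))

    labelled = accessible(permutable_pairs(list(product(FACES, ROTATIONS))))
    print(len(labelled), prob_one_head_one_tail(labelled))

    # The accessible microstates map one-to-one onto ordered outcomes for
    # an "A coin" and a "B coin", as for distinguishable coins.
    print(sorted(as_ordered(p) for p in labelled))
```

In this toy model the unlabelled coins give three microstates and a probability of one third for one head and one tail, while the coins distinguished by rotation give four accessible microstates and a probability of one half, with the accessible states in one-to-one correspondence with the ordered outcomes of an A coin and a B coin.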
The same procedure can be applied to $N = N_A + N_B$ coins, $N_A$ of which rotate one way and $N_B$ the other. The result for large $N_A$, $N_B$ is an effective phase space representation for two nonidentical gases A and B, each separately permutable, each with an extensive entropy function, with an entropy of mixing as given by (3). And it is clear this representation admits of degrees: it is an effective representation, more or less accurate, more or less adequate to practical purposes.

Figure 10.4 Feynman diagrams for particle scattering

But by these means we are a long way from arriving at an effective phase space theory of N distinguishable particles. That would require, at a minimum, N distinct individuating properties of the kind we have described, at which point, if used in an effective phase space representation, the original permutation symmetry will have completely disappeared. But it is hardly plausible (for microscopic systems), when N is large, that a representation like this can be dynamically defined. Even where there are such individuating properties, it is hard to see what purposes their introduction (their dynamical definition) would serve, unless it is to model explicitly a Maxwell demon.39

It may be no harm is done by starting ab initio with a system of distinguishable particles. On this point we are in agreement with van Kampen. But it must be added: we should recognize that the use of unreduced phase space, and the structure $\mathbb{R}^{6N}$ underlying it, is in general a mathematical simplification, introducing distinctions in thought that are not instantiated in the dynamics.

That seems to be exactly what Gibbs thought on the matter.
He had, recall, an epistemological argument for passing to reduced phase space (that nothing but similarity in qualities could be used to identify particles across members of an ensemble of gases), but he immediately went on to say:

And this would be true, if the ensemble of systems had a simultaneous objective existence. But it hardly applies to the creations of the imagination. In the cases which we have been considering …. it is not only possible to conceive of the motion of an ensemble of similar systems simply as possible cases of the motion of a single system, but it is actually in large measure for the sake of representing more clearly the possible cases of the motion of a single system that we use the conception of an ensemble of systems. The perfect similarity of several particles of a system will not in the least interfere with the identification of a particular particle in one case with a particular particle in another. (Gibbs 1902, 188, emphasis added)

If pressed, it may be added that a mathematician can always construct a domain of objects in set theory, or in one-to-one correspondence with the real numbers, each number uniquely represented.40 Likewise for reference to elements of nonrigid structures, which admit nontrivial symmetries, for example to a particular one of the two roots of −1 in the complex number field, or to a particular orientation on 3-dimensional Euclidean space, the left-handed orientation rather than the right-handed one.41 But it is another matter entirely as to whether reference like this, in the absence of individuating properties, can carry over to physical objects. The whole of this chapter can be seen as an investigation of whether it can in the case of the concept of particle; our conclusion is negative.

The lesson may well be more general. It may be objects in mathematics are always objects of singular thought, involving, perhaps, an irreducible indexical element.
If, as structuralists like Russell and Ramsey argued, the most one can hope for in representation of physical objects is structural isomorphisms with objects of direct acquaintance, these indexical elements can be of no use to physics. It is the opposite conclusion to Kant's.

3.2 Philosophical Logic
A second pass at our question of whether permutable entities can be considered as objects is to ask whether they can be quantified over in standard logical terms. Posed in this way, the question takes us to language and objects as values of bound variables. Arguably, the notion of object has no other home; physical theories are not directly about objects, properties, and identity in the logical sense (namely equality).

But if we are to introduce a formal language, we should be clear on its limits. We are not trying to reproduce the mathematical workings of a physical theory in formal terms. That would be an ambitious, but hardly novel undertaking; it is the one proposed by Hilbert and Russell, that so inspired Carnap and others in the early days of logical empiricism. Our proposal is more modest. The suggestion is that by formalization we gain clarity on the ontology of a physical theory, not rigor or clarity of deduction, or even of explanation.

But it is ontology subject to symmetries: in our case, permutability. We earlier saw how invariant descriptions and invariant states (under the permutation group) suffice for statistical mechanics, suffice even for the description of individual trajectories; we should now see how this invariance is to be cashed out in formal, logical terms.

Permutability of objects, as a symmetry, has a simple formal expression: predicates should be invariant (have the same truth value) under permutations of values of variables. Call such a predicate totally symmetric. Restriction to predicates like these certainly seems onerous. Thus take the simple case where there are only two things, whereupon it is enough for a predicate to be totally symmetric that it be symmetric in the usual sense.
When we say:

(i) Buckbeak the hippogriff can fly higher than Pegasus the winged horse

the sentence is clearly informative, at least for readers of literature on mythical beasts; but “flies higher” is not a symmetric predicate. How can we convey (i) without this asymmetry?

Like this: by omitting use of proper names. Let us suppose our language has the resources to replace them with Russellean descriptions, say with “Buckbeak-shaped” and “Pegasus-shaped” as predicates (individuating predicates). We can then say in place of (i):

(ii) x is Buckbeak-shaped and y is Pegasus-shaped and x can fly higher than y.

But now (ii) gives over to the equally informative totally symmetric complex predicate:

(iii) x is Buckbeak-shaped and y is Pegasus-shaped and x can fly higher than y, or y is Buckbeak-shaped and x is Pegasus-shaped and y can fly higher than x.

The latter is invariant under permutation of x and y. Prefacing by existential quantifiers, it says what (i) says (modulo uniqueness), leaving open only the question of which of the two objects is the one that is Buckbeak-shaped, rather than Pegasus-shaped, and vice versa. But continuing in this way (adding further definition to the individuating predicate) the question that is left open is increasingly empty. If no further specification is available, one loses nothing in referring to that which is Buckbeak-shaped, that which is Pegasus-shaped (given that there are just the two); or to using “Buckbeak” and “Pegasus” as mass terms, like “butter” or “soil.” We then have from (iii):

(iv) There is Buckbeak and there is Pegasus and Buckbeak can fly higher than Pegasus, or there is Buckbeak and there is Pegasus and Buckbeak can fly higher than Pegasus.

With “Pegasus” and “Buckbeak” in object position, (iv) is not permutable; it now says the same thing twice. We have recovered (i).

How does this work when there are several other objects?
Consider the treatment of properties as projectors in quantum mechanics. For a one-particle projector P there corresponds the N-fold symmetrized projector:

P ⊗ (I − P) ⊗ … ⊗ (I − P) + (I − P) ⊗ P ⊗ (I − P) ⊗ … ⊗ (I − P) + … + (I − P) ⊗ … ⊗ (I − P) ⊗ P

where I is the identity and there are N factors in each term of the summation, of which there are (N choose 1) = N. For a two-particle projector of the form P ⊗ Q (where P and Q are either the same or orthogonal), the symmetrized projector is likewise a sum over products of projectors and their complements (N factors in each), but now there will be (N choose 2) = N(N − 1)/2 summands. And so on.

The obvious way to mimic these constructions in the predicate calculus, for the case of N objects, is to define, for each one-place predicate A, the totally symmetric N-ary predicate:

(v) (Ax1 ∧ ¬Ax2 ∧ … ∧ ¬AxN) ∨ (¬Ax1 ∧ Ax2 ∧ ¬Ax3 ∧ … ∧ ¬AxN) ∨ … ∨ (¬Ax1 ∧ … ∧ ¬AxN−1 ∧ AxN).

The truth of (v) (if it is true) will not be affected by permutations of values of the N variables. It says only that exactly one particle, or object, satisfies A, not which particle or object does so. The construction starting with a two-place predicate follows similar lines; and so on for any n-ary predicate for n ≤ N. Disjunctions of these can be formed as well.

Do these constructions tell us all that we need to know? Indeed they must, given our assumption that the N objects are adequately described in the predicate calculus without use of proper names, for we have:

Theorem 1 Let L be a first-order language with equality, without any proper names. Let S be any L-sentence true only in models of cardinality N. Then there is a totally symmetric N-ary predicate G ∈ L such that ∃x1 … ∃xN Gx1 … xN is logically equivalent to S.

(For the proof see Saunders 2006a.) Given that there is some finite number of objects N, anything that can be said of them without using proper names (with no restriction on predicates) can be said of them using a totally symmetric N-ary predicate.42

On the strength of this, it follows we can handle uniqueness of reference in the sense of the “that which” construction, as well, “the unique x which is Ax.” In Peano's notation it is the object ιxAx. Following Russell, it is contextually defined by sentences of the form:

(vi) the x that is an A is a B

or B(ιx)Ax, which is cashed out as:

(vii) ∃x(Ax ∧ ∀y(Ay → y = x) ∧ Bx).

From Theorem 1 it follows that (vii), supplemented by information on just how many objects there are, is logically equivalent to a sentence that existentially quantifies over a totally symmetric predicate.
It says that a thing which is A is a B, that something is an A, and that there are no two distinct things that are both A, without ever saying which of N things is the thing which is A. (v) shows how.

How much of this will apply to quantum particles? All of it. Of course definite descriptions of objects of definite number are rarely needed in talk of atoms, and rarely available. Individuating properties at the macroscopic level normally provide indefinite descriptions of one of an indeterminate number of particles. So it was earlier; I was talking of any old helium atom in the canister by the door, any old helium atom in the vacuum chamber, out of an indeterminate number in each case. But sometimes numbers matter: a handful of atoms of plutonium in the wrong part of the human body might be very bad news indeed. Even one might be too many.

Nor need we stop with Russellean descriptions, definite or otherwise. There are plenty of other referential devices in ordinary language that may be significant. It is a virtue of passing from the object level, from objects themselves (the “material mode,” to use Carnap's term), to talk of objects (the “formal mode”), that the door is open to linguistic investigations of quite broad scope. Still, in agreement with Carnap and with Quine, our litmus test is compatibility with elementary logic and quantification theory.

To conclude: in the light of Theorem 1, and the use of individuating properties to replace proper names, nothing is lost in passing from nonpermutable objects to permutable ones. There is no loss of expressive content in talking of N permutable things, over and above what is lost in restricting oneself to the predicate calculus and abjuring the use of names.

That should dissipate most philosophical worries about permutability. There remains one possible bugbear, however, namely identity in the logical sense (what we are calling equality).
Quantum objects have long been thought problematic on the grounds that they pose insuperable difficulties to any reasonable account of logical equality—for example, in terms of the principle of identity of indiscernibles (see below). To this one can reply, too bad for an account of equality;43 the equality sign can be taken as primitive, as is usual in formal logic. (That is to say, in any model of L, if L is a language with equality, the equals sign goes over to equality in the set-theoretic sense.) But here too one might do better.

3.3 Identity Conditions

If physical theories were (among other things) directly about identity in the logical sense, an account of it would be available from them. It is just because physical theories are not like this (although that could change) that I am suggesting the notion of object should be formalized in linguistic terms. It is not spelled out for us directly in any physical theory.

But by an “account of equality” I do not mean a theory of logical equality in full generality. I mean a theory of equality only of physical objects, and specific to a scientific language. It may better be called an account of identity conditions, contextualized to a physical theory.

Given our linguistic methods, there is an obvious candidate: exhaustion of predicates. That is, if F…s… if and only if F…t…, for every predicate F in L and for every predicate position of F, then s and t are equal. Call this L-equality, denoted “s =L t.” It is clearly a version of Leibniz's famous “principle of identity of indiscernibles.” This is often paraphrased as the principle that objects which share the same properties, or even the same relational properties, are the same, but this parsing is unsatisfactory in an important respect. It suggests that conjunctions of conditions of the form (28) are sufficient to imply that x and y are equal, but more than this is required for exhaustion of predicates. The latter also requires the truth of sentences of the form (29). These are the key to demonstrating the nonequality of many supposed counterexamples to Leibniz's principle (see Saunders 2003).
L-equality is the only defined notion of equality (in first-order languages) that has been taken seriously by logicians.44 It satisfies Gödel's axioms for the sign “=,” used in his celebrated completeness proof for the predicate calculus with equality, namely the axiom scheme:

s = t → (Fs → Ft)

together with the scheme s = s. Since one has completeness, anything true in L equipped with the sign “=” remains true in L equipped with the sign “=L.” The difference between L-equality and primitive equality cannot be stated in L.45

But the notion that we are interested in is not L-equality, sameness with respect to every predicate in L, but sameness with respect to the invariant predicates constructible in L, a set we denote “L∗.” Call equality defined in this way physical equality, denoted “=L∗.” With that completeness is no longer guaranteed, but our concern is with ontology, not with deduction. In summary, we have:

s =L∗ t if and only if, for every F ∈ L∗ and every predicate position of F, F…s… ↔ F…t…

and, as a necessary condition for physical objects (the identity of physical indiscernibles):

if s =L∗ t, then s = t.46

If s ≠L∗ t, we shall say s and t are (physically) discernible; otherwise (physically) indiscernible.

There are certain logical distinctions (first pointed out by Quine) for equality in our defined sense that will prove useful. Call s and t strongly (or absolutely) discernible if for an open sentence F in one free variable, Fs and not Ft; call s and t weakly discernible (respectively relatively discernible) if for an open sentence F in two free variables Fst but not Fss (respectively, but not Fts). Objects that are only weakly or relatively discernible are discerned by failure of conditions of the form (29), not (28).
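The grades of discernibility just defined can be checked mechanically in a small finite model. The sketch below is illustrative only: the two-point domain, the relation names, and the helper functions are assumptions made for the example. Predicates are treated as Python functions, and each test follows the corresponding definition (Fs and not Ft; Fst but not Fts; Fst but not Fss).

```python
# Hypothetical model: two points on a line, one unit apart.
POSITION = {"s": 0.0, "t": 1.0}


def strongly_discerns(F, s, t):
    """One-place F such that Fs and not Ft."""
    return F(s) and not F(t)


def relatively_discerns(F, s, t):
    """Two-place F such that Fst but not Fts."""
    return F(s, t) and not F(t, s)


def weakly_discerns(F, s, t):
    """Two-place F such that Fst but not Fss."""
    return F(s, t) and not F(s, s)


def at_origin(x):
    """A one-place predicate that depends on the choice of origin."""
    return POSITION[x] == 0.0


def left_of(x, y):
    """An asymmetric relation that depends on the orientation of the line."""
    return POSITION[x] < POSITION[y]


def one_apart(x, y):
    """A symmetric, irreflexive relation depending only on relative distance."""
    return abs(POSITION[x] - POSITION[y]) == 1.0
```

In this model `at_origin` discerns the points strongly and `left_of` discerns them relatively, but both depend on a choice of origin or orientation. The relation `one_apart` depends only on relative distance, and it discerns the two points weakly but not relatively.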
Of these, as already mentioned, weak discernibility is of greater interest from both a logical and physical point of view. Satisfaction of any symmetric but irreflexive relation is enough for weak discernibility: ≠ and ≠L∗ are prime examples. And many simple invariant physical relations are symmetric and irreflexive: for example, having nonzero relative distance in a Euclidean space (a relation invariant under translations and rotations). Thus, take Max Black's famous example of identical iron spheres s, t, one mile apart, in an otherwise empty Euclidean space. The spheres are weakly discerned by the relation D of being one mile apart, for if Dst is true, it is not the case that Dxs ↔ Dxt for any x, since Dst but not Dss (or Dtt), so s ≠L∗ t. And, fairly obviously, if L∗ contains only totally symmetric predicates, physical objects will be at most weakly discernible.

Here as before “s” and “t” are terms, that is variables, functions of variables, or proper names. What difference do the latter make? Names are important to discernibility under L-equality. Thus if it is established that s and t are weakly L-discernible, then, if “s” or “t” are proper names, they are absolutely L-discernible. In the example just given, if Dst and “s” is a proper name, then Dsx is true of t but not s. But the presence of names in L makes no difference to L∗-discernibility (discernibility by totally symmetric predicates). Thus, even if Dxy ∈ L∗, on entering a proper name in variable position one does not obtain a one-place predicate in L∗. Permutable objects are only weakly discernible, if discernible at all. We may never say of permutables which of them is a such-and-such; only that there is a such-and-such.

It remains to determine whether permutable particles are discernible at all. In the classical case, assuming particles are impenetrable, they are always some nonzero distance apart, so the answer is positive.
Impenetrability also ensures that, on giving up permutability and passing to a description of things that are particle states or trajectories, these will be at least weakly discernible. Typically they will be strongly discernible, but as Black's two spheres illustrate (supposing they just sit there), not always.

It is the quantum case that presents the greater challenge; indistinguishable quantum particles have long been thought to violate any interesting formulation of Leibniz's principle of indiscernibles.47 But in fact the same options arise as in the classical case. One can speak of that which has such-and-such a state, or orbit, and pass to states and orbits of states as things, giving up permutability. One-particle states or their orbits, like classical trajectories, will in general be absolutely discernible, but sometimes only weakly discernible—or (failing impenetrability) not even that. Or retaining permutability, one can speak of particles as being in one or other states, and of N particles as being in an N-particle state, using only totally symmetric predicates, and satisfying some totally irreflexive relation.

On both strategies there is a real difficulty in the case of bosons, at least for elementary bosons. On the first approach, there may be two bosonic one-particle states, each exactly the same; on the second, there seems to be no general symmetric and irreflexive relation that bosons always satisfy.

But the situation is different when it comes to fermions. On the first approach, given only the mild entanglement required by antisymmetrization, one is guaranteed that of the N one-particle states, each is orthogonal to every other, so objects as one-particle states are always absolutely discernible; and on the second approach, again following from antisymmetrization, an irreflexive symmetric relation can always be defined (whatever the degree of entanglement). Some further comments on each.
The first strategy is not without its difficulties. To begin with, even restricting to only mildly entangled states, which one-particle states are to be the objects replacing particles is ambiguous. The problem is familiar from the case of the singlet state of spin: neglecting spatial degrees of freedom the antisymmetrized state is

(30) |Ψ0⟩ = (1/√2)(|ψ+z⟩ ⊗ |ψ−z⟩ − |ψ−z⟩ ⊗ |ψ+z⟩)

where |ψ±z⟩ are eigenstates of spin in the z direction. But this state can equally be expanded in terms of eigenstates of spin in the x direction, or of the y direction (up to an overall phase):

|Ψ0⟩ = (1/√2)(|ψ+x⟩ ⊗ |ψ−x⟩ − |ψ−x⟩ ⊗ |ψ+x⟩) = (1/√2)(|ψ+y⟩ ⊗ |ψ−y⟩ − |ψ−y⟩ ⊗ |ψ+y⟩).

Which pair of absolutely discernible one-particle states is present, exactly?

The problem generalizes. Thus, for arbitrary orthogonal one-particle states |φa⟩, |φb⟩, and a two-fermion state of the form:

(31) |Φ⟩ = (1/√2)(|φa⟩ ⊗ |φb⟩ − |φb⟩ ⊗ |φa⟩)

define the states (the first is just a change of notation):

(32)

They yield a representation of the rotation group. One then has, just as for components of spin, an ambiguity in attributing one-particle states to the two particles, with (31) as with (30). I shall come back to this in section 3.4.

This difficulty can be sidestepped at the level of permutable particles, however. In the case of (30), we may weakly discern the particles by the relation “opposite spin,” with respect to any direction in space (Saunders 2003, 2006b; Muller and Saunders 2008). Thus if σx, σy, σz are the Pauli spin matrices, the self-adjoint operator (33) has eigenvalue −1 in the singlet state |Ψ0⟩, with the clear interpretation that the spins are anticorrelated (with respect to any direction in space). Asserting this relation does not pick out any direction in space, any more than saying Black's spheres are one mile apart picks out any position in space.

For the construction in the generalized sense (32), define projection operators onto the states |φ±k⟩ and the corresponding self-adjoint operators. Each has eigenvalue −1 for |Φ⟩, and likewise picks out no “direction” in space (i.e. the analogue of (33) is satisfied). Moreover, one can define sums of such in the case of finite superpositions of states of the form (31), by means of which fermions can be weakly discerned.

On the strength of this, one can hope to weakly discern bosons that are composites of fermions, like helium atoms. And even in the case of elementary bosons, self-adjoint operators representing irreflexive, symmetric relations required of any pair of bosons have been proposed.48 The difficulty of reconciling particle indistinguishability in quantum mechanics with the identity of physical indiscernibles (IPI) looks well on its way to being solved.
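The claim that “opposite spin” holds of the singlet state with respect to each coordinate direction can be verified numerically. The sketch below assumes, for illustration, that the relation is represented by the operators σk ⊗ σk for k = x, y, z; this choice is an assumption of the example rather than a restatement of (33), and the helper names are hypothetical. It uses only the standard library and checks that the singlet vector is an eigenvector of each such operator with eigenvalue −1.

```python
import math

# Pauli matrices as nested lists (entries may be complex).
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def kron(a, b):
    """Kronecker (tensor) product of two 2x2 matrices, giving a 4x4 matrix."""
    return [
        [a[i][j] * b[k][l] for j in range(2) for l in range(2)]
        for i in range(2)
        for k in range(2)
    ]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


# Singlet state in the basis |+z,+z>, |+z,-z>, |-z,+z>, |-z,-z>.
SINGLET = [0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0]


def anticorrelated(axis, state=SINGLET, tol=1e-12):
    """True if (sigma_axis ⊗ sigma_axis) applied to state gives -state."""
    out = apply(kron(SIGMA[axis], SIGMA[axis]), state)
    return all(abs(o + s) < tol for o, s in zip(out, state))
```

A product state such as |+z⟩ ⊗ |−z⟩, represented here as `[0, 1, 0, 0]`, passes the test for z but fails it for x. That is one way to see that the relation asserted of the singlet state picks out no direction in space.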
3.4 Eliminativism

We are finally in a position to address the arguments for and against eliminativism—that is, for and against renouncing talk of permutable objects in favor of nonpermutable objects defined in terms of individuating properties, whether points in μ-space, trajectories, one-particle states, or orbits of one-particle states. The gain, usually, is absolute discernibility. On the other hand, we have found that quantification over permutable objects satisfies every conservative guideline we have been able to extract from elementary logic (with the possible exception of identity conditions for elementary bosons). There seems to be nothing wrong with the logic of weak discernibles.

And there remains another conservative guideline: we should maintain standard linguistic usage where possible. That stacks the odds against eliminativism, for talk of particles, and not just of one-particle states, is everywhere in physics. But even putting this to one side, eliminativism would seem to fare poorly, for (anti)symmetrized states are generically entangled, whereupon no set of N one-particle states will suffice for the description of N particles. And as we have seen in section 3.3, where such a set is available it may be non-unique.
Against this there are two objections. The first is that we anyway know the particle concept is stretched to breaking point in strongly interacting regimes. There the best we can say is that there are quantum fields, and, perhaps, superpositions of states of different particle number. Where the latter can be defined, one can talk of modes of quantum fields instead. In the free-field limit, or as defined by a second-quantization of a particle theory,49 such modes are in one-to-one correspondence with one-particle states (or, in terms of Fourier expansions of the fields, in correspondence with “generalized” momentum eigenstates). The elimination of particles in favor of fields and modes of fields is thus independently motivated.

The second objection is that we cannot lightly accept indeterminateness in attributing a definite set of N one-particle states to an N-particle system, for it applies equally to particles identified by individuating properties. That is, not even the properties of being a bound electron in a helium atom in the canister by the corner, and being one in the vacuum chamber by the door, hold unambiguously. The construction (32) applies just as much to (20).

But this difficulty we recognize as a fragment of the measurement problem. Specifically, it is the preferred basis problem: Into what states does a macroscopic superposition collapse (if there is any collapse)? Or, if macroscopic superpositions exist: What singles out the basis in which they are written? Whatever settles this question (decoherence, say) will dictate the choice of basis used to express the state in terms of macroscopic individuating properties.

Whether such a choice of basis—or such a solution to the preferred basis problem—can extend to a preferred basis at the microscopic level is moot. It depends, to some extent, on the nature of the solution (decoherence only goes down so far).
Of course it is standard practice in quantum theory to express microscopic states in terms of a basis associated with physically interpreted operators (typically generators of one-parameter spacetime symmetry groups, or in terms of the dynamical quantities that are measured). The use of quantum numbers for bound states of electrons in the atom, for energy, orbital angular momentum, and components of angular momentum and spin—in conventional notation, quadruples of numbers ⟨n, l, ml, ms⟩—is a case in point. When energy degeneracies are completely removed (introducing an orientation in space) one can assign these numbers uniquely. The Pauli exclusion principle then dictates that every electron has a unique set of quantum numbers. Use such quadruples as names and talk of permutable particles can be eliminated.

It is now clearer that the first objection adds support to the second. Quadruples of quantum numbers provide a natural replacement for particles in atoms; modes of quantum fields (and their excitation numbers) provide a natural replacement for particles involved in scattering. And in strongly interacting regimes, even modes of quantum fields give out (or they have only a shadow existence, as with virtual particles).

All this is as it should be. Our inquiry was never about fundamental ontology (a question we can leave to a final theory, if there ever is one), but with good-enough ontology, in a definite regime. In the regime we are concerned with, stable particles of ordinary matter whose number is conserved in time, there is the equivalence between one-particle states and modes of a quantum field already mentioned. Let us settle on a preferred decomposition of the field (or preferred basis) in a given context. But suppose that context involves nontrivial entanglement: Can entanglements of particles be understood as entanglements of modes of fields?
Surely they can—but on pain of introducing many more modes of the field than there were particles, and a variable number to boot. As with one-particle states so modes of the field: in a general entanglement, arbitrarily many such modes are involved, even given a preferred decomposition of the field, whereas the number of particles is determinate. Just where the particle concept is the most stable, in the regime in which particle number is conserved, eliminativism in favor of fields and modes of fields introduces those very features of the particle concept that we found unsatisfactory in strongly interacting regimes. That speaks against eliminativism.

This does not, of course, militate against the reality of quantum fields. We recognize that permutable particles are emergent from quantum fields, just as nonpermutable particles are emergent from permutable ones. Understood in this way, we can explain a remaining fragment of the Gibbs paradox—the fact that particle identity, and with it permutation symmetry, can ever be exact. How is it that intrinsic quantities, like charge and mass, are identically the same? (Their values are real numbers, note.) The answer is that for a given particle species, the particles are one and all excitations of a single quantum field—whereupon these numerical identities are forced, and permutation symmetry has to obtain. The existence of exact permutation symmetry, in regimes in which particle equations are approximately valid, is therefore explained, and with it particle indistinguishability.
References

Albert, D. (2000). Time and Chance. Cambridge, MA: Harvard University Press.
Bach, A. (1990). Boltzmann's probability distribution of 1877. Archive for the History of the Exact Sciences 41: 1–40.
———. (1997). Indistinguishable classical particles. Berlin: Springer.
Darrigol, O. (1991). Statistics and combinatorics in early quantum theory, II: Early symptoma of indistinguishability and holism. Historical Studies in the Physical and Biological Sciences 21: 237–298.
Denbigh, K., and M. Redhead (1989). Gibbs' paradox and non-uniform convergence. Synthese 81: 283–312.
Dieks, D., and A. Lubberdink (2011). How classical particles emerge from the quantum world. Foundations of Physics 41: 1051–1064. Available online at http://arxiv.org/abs/1002.2544.
Ehrenfest, P., and V. Trkal (1920). Deduction of the dissociation equilibrium from the theory of quanta and a calculation of the chemical constant based on this. Proceedings of the Amsterdam Academy 23: 162–183. Reprinted in P. Bush, ed., P. Ehrenfest, Collected scientific papers. Amsterdam: North-Holland, 1959.
French, S., and D. Krause (2006). Identity in physics: A historical, philosophical, and formal analysis. Oxford: Oxford University Press.
French, S., and M. Redhead (1988). Quantum physics and the identity of indiscernibles. British Journal for the Philosophy of Science 39: 233–246.
Ghirardi, G., and L. Marinatto (2004). General criterion for the entanglement of two indistinguishable states. Physical Review A70: 012109.
Ghirardi, G., L. Marinatto, and Y. Weber (2002). Entanglement and properties of composite quantum systems: A conceptual and mathematical analysis. Journal of Statistical Physics 108: 49–112.
Gibbs, J. W. (1902). Elementary principles in statistical mechanics. New Haven: Yale University Press.
Goldstein, S., J. Taylor, R. Tumulka, and N. Zanghi (2005a). Are all particles identical? Journal of Physics A38: 1567–1576.
———. (2005b). Are all particles real? Studies in the History and Philosophy of Modern Physics 36: 103–112.
Gottesman, D. (2005). Quantum statistics with classical particles. Available online at http://xxx.lanl.gov/cond-mat/0511207.
Greiner, W., and B. Müller (1994). Quantum Mechanics: Symmetries. 2d ed. New York: Springer.
Hercus, E. (1950). Elements of thermodynamics and statistical thermodynamics. Melbourne: Melbourne University Press.
Hilbert, D., and P. Bernays (1934). Grundlagen der Mathematik, Vol. 1. Berlin: Springer.
Howard, D. (1990). “Nicht sein kann was nicht sein darf”, or the prehistory of EPR, 1909–1935: Einstein's early worries about the quantum mechanics of compound systems. In Sixty-two years of uncertainty: Historical, philosophical and physical inquiries into the foundations of quantum mechanics, ed. A. Miller. New York: Plenum.
Huggett, N. (1999a). Atomic metaphysics. Journal of Philosophy 96: 5–24.
———. (1999b). Space from Zeno to Einstein: Classical readings with a contemporary commentary. Cambridge, MA: Bradford Books.
Jammer, M. (1966). The conceptual development of quantum mechanics. McGraw-Hill.
Jaynes, E. (1992). The Gibbs paradox. In Maximum-entropy and Bayesian methods, ed. G. Erickson, P. Neudorfer, and C. R. Smith. Dordrecht: Kluwer. 1–21.
van Kampen, N. (1984). The Gibbs paradox. Essays in theoretical physics in honour of Dirk ter Haar, ed. W.E. Parry. Oxford: Pergamon Press. 303–312.
Lieb, E., and J. Yngvason (1999). The physics and mathematics of the second law of thermodynamics. Physics Reports 310: 1–96. Available online at http://arxiv.org/abs/cond-mat/9708200.
Muller, F., and S. Saunders (2008). Distinguishing fermions. British Journal for the Philosophy of Science 59: 499–548.
Muller, F., and M. Seevinck (2009). Discerning elementary particles. Philosophy of Science 76: 179–200. Available online at arxiv.org/PS_cache/arxiv/pdf/0905/0905.3273v1.pdf.
Nagle, J. (2004). Regarding the entropy of distinguishable particles. Journal of Statistical Physics 117: 1047–1062.
Penrose, R. (2004). The Road to Reality. London: Vintage Press.
Planck, M. (1900). Zur Theorie des Gesetzes der Energieverteilung im Normalspectrum. Verhandlungen der Deutschen Physikalischen Gesellschaft 2: 202–204. Translated in D. ter Haar, ed., The old quantum theory. Oxford: Pergamon Press, 1967.
———. (1912). La loi du rayonnement noir et l'hypothèse des quantités élémentaires d'action. In La Théorie du Rayonnement et les Quanta: Rapports et Discussions de la Réunion Tenue à Bruxelles, 1911, ed. P. Langevin and M. de Broglie. Paris: Gauthier-Villars.
———. (1921). Theorie der Wärmestrahlung. 4th ed. Leipzig: Barth.
Pniower, J. (2006). Particles, Objects, and Physics. D.Phil. thesis, University of Oxford. Available online at philsci-archive.pitt.edu/3135/1/dphil-bod.pdf.
Poincaré, H. (1911). Sur la théorie des quanta. Comptes Rendus 153: 1103–1108.
———. (1912). Sur la théorie des quanta. Journal de Physique 2: 1–34.
Pooley, O. (2006). Points, particles, and structural realism. In The Structural Foundations of Quantum Gravity, ed. D. Rickles, S. French, and J. Saatsi. Oxford: Oxford University Press. 83–120.
Post, H. (1963). Individuality and physics. The Listener, 10 October, 534–537, reprinted in Vedanta for East and West 132: 14–22 (1973).
Quine, W. van (1960). Word and object. Cambridge, MA: Harvard University Press.
———. (1970). Philosophy of logic. Cambridge, MA: Harvard University Press.
———. (1990). The pursuit of truth. Cambridge, MA: Harvard University Press.
Rapp, D. (1972). Statistical mechanics. New York: Holt, Rinehart and Winston.
Reichenbach, H. (1956). The direction of time. Berkeley: University of California Press.
Rosenfeld, L. (1959). Max Planck et la definition statistique de l'entropie. Max-Planck Festschrift 1958. Berlin: Deutsche Verlag der Wissenschaften. Trans. as Max Planck and the statistical definition of entropy. In R. Cohen and J. Stachel, eds., Selected papers of Leon Rosenfeld. Dordrecht: Reidel, 1979.
Saunders, S. (1991). The Negative Energy Sea. In Philosophy of Vacuum, ed. S. Saunders and H. Brown. Oxford: Clarendon Press. 67–107.
———. (1992). Locality, Complex Numbers, and Relativistic Quantum Theory. Proceedings of the Philosophy of Science Association, Vol. 1, 365–380.
———. (2003). Physics and Leibniz's principles. In Symmetries in physics: New reflections, ed. K. Brading and E. Castellani. Cambridge: Cambridge University Press.
———. (2006a). On the explanation of quantum statistics. Studies in the History and Philosophy of Modern Physics 37: 192–211. Available online at arXiv.org/quant-ph/0511136.
———. (2006b). Are quantum particles objects? Analysis 66: 52–63. Available online at philsci-archive.pitt.edu/2623/.
———. (2007). Mirroring as an a priori symmetry. Philosophy of Science 74: 452–480.
Schrödinger, E. (1946). Statistical thermodynamics. Cambridge: Cambridge University Press.
———. (1984). What is an elementary particle? Collected Papers, Vol. 4, Österreichische Akademie der Wissenschaften. Reprinted in E. Castellani, ed., Interpreting Bodies: classical and quantum objects in modern physics. Princeton: Princeton University Press, 1998.
Stern, O. (1949). On the term N! in the entropy. Reviews of Modern Physics 21: 534–35.
Swendsen, R. (2002). Statistical mechanics of classical systems with distinguishable particles. Journal of Statistical Physics 107: 1143–66.
———. (2006). Statistical mechanics of colloids and Boltzmann's definition of the entropy. American Journal of Physics 74: 187–190.
Teller, P. (1995). An interpretative introduction to quantum field theory. Princeton: Princeton University Press.
Wiggins, D. (2004). Sameness and substance renewed. Oxford: Oxford University Press.

Notes:

(1) See the chapters by David Wallace and Guido Bacciagaluppi, this volume.
(2) I have used a different notation from Planck's for consistency with the notation in the sequel.
(3) A microstate as just defined can be specified as a string of Ns symbols “p” and Cs − 1 symbols “|” (thus, for example, Ns = 3, Cs = 4, the string p||pp| corresponds to one particle in the first cell, none in the second, two in the third, and none in the fourth). The number of distinct strings is (Ns + Cs − 1)! divided by (Cs − 1)!Ns!, because permutations of the symbol “|” among themselves or the symbol “p” among themselves give the same string. (This derivation of (1) was given by Ehrenfest in 1912.)
(4) I take “indistinguishable” and “permutable” to mean the same. But others take “indistinguishable” to have a broader meaning, so I will give up that word and use “permutable” instead.
(5) The locus classicus for this story is Jammer (1966), but see also Darrigol (1991).
(6) Or as at bottom the same, as argued most prominently by Howard (1990).
(7) As suggested by Quine. See French and Krause (2006) for a comprehensive survey of debates of this kind.
(8) See Planck (1912, 1921) and, for commentary, Rosenfeld (1959).
(9) This section largely follows van Kampen (1984).
(10) See, e.g., Lieb and Yngvason (1999) for a statement of the second law at this sort of level of generality.
(11) Meaning a process which at any point in its progress can be reversed, to as good an approximation as is required. Necessary conditions are that temperature gradients are small and effects due to friction and turbulence are small (but it is doubtful these are sufficient).
(12) At least in the absence of Maxwell demons: see section 3.1.
(13) I owe this turn of phrase to Jos Uffink.
(14) Boltzmann defined the entropy in several different ways; see Bach (1990).
(15) For a textbook derivation using our notation, see, e.g., Hercus (1950).
(16) Statements like this can be found in almost any textbook on statistical mechanics.
(17) For another variant of the Ehrenfest-Trkal approach, see Swendsen (2002, 2006). (However, Swendsen does not acknowledge the restriction of the result to open systems. See further Nagle (2004).)
(18) Should they be exactly equal? No, because it is an additional constraint to insist, given that NA + NB particles are in volume VA + VB, that exactly NA are in VA and NB in VB.
(19) These considerations apply to quantum particles too, when described in terms of the de Broglie-Bohm pilot-wave theory. For the latter, see Bacciagaluppi, this volume.
(20) As in, e.g., a cigar-shaped mass distribution, rather than a sphere. Of course, this is not really a breaking of rotational symmetry, in that each is described by relative angles and distances between masses, invariant under rotations.
(21) For more on this vein, see Saunders (2003). For a compilation of original sources and commentary, see Huggett (1999b).
(22) See Huggett (1999a). It was endorsed shortly after by David Albert in his book Time and Chance (Albert 2000, 47–48).
(23) See Schrödinger (1984, 207–210). The word “individual” has also been used to mean an object answering to a unique description at a single time (as “absolutely discernible” in the terminology of Saunders (2003, 2006b)).
(24) As recently endorsed by Pooley (2006, section 8).
(25) One might in classical mechanics add the condition that the particles are impenetrable; but one can also, in quantum mechanics, require that no two particles occupy the same one-particle state (the Pauli exclusion principle). See sections 2.5, 3.3.
(26) As we shall see, there is a complication in the case of fermions (section 3.3), although it does not affect the point about identity over time.
(27) The terminology is due to Penrose (2004, 598). See Ghirardi and Marinatto (2004) and Ghirardi, Marinatto, and Weber (2002) for the claim that entanglement due to (anti)symmetrization is not really entanglement at all.
(28) A special case of the multinomial theorem (see, e.g., Rapp 1972, 49–50).
(29) Of course, for macroscopic coins, the assumption of degeneracy of the energy is wildly unrealistic, but let that pass.
(30) One way of putting this is that in the quantum case, the measure on phase space must be discrete, concentrated on points representing each unit cell of “volume” $h^3$. For early arguments to this effect, see Planck (1912), Poincaré (1911, 1912).
(31) See Wallace, this volume.
(32) This is to rule out parastatistics, that is, representations of the permutation group that are not one-dimensional (see, e.g., Greiner and Müller 1994). This would be desirable (since parastatistics have not been observed, except in 2 dimensions, where special considerations apply), but I doubt that it has really been explained.
(33) The situation is a little more complicated, as antisymmetry in the spin part of the overall state forces symmetry in the spatial part, which can lead to spatial bunching (this is the origin of the homopolar bond in quantum
chemistry).
(34) But see Gordon Belot (this volume) for pitfalls in defining such symmetries.
(35) This problem afflicts the orthodox solution to the Gibbs paradox, too (and was raised as such by Swendsen 2006).
(36) Further, even the familiar intrinsic properties of particles (like charge, spin, and mass) may be state-dependent: string theory and supersymmetric theories provide obvious examples. See Goldstein et al. (2005a,b) for the argument that all particles may be treated as permutable, identical or otherwise.
(37) For further discussion, see section 3.3. Whether the A coin after one toss is the same as the A coin on another toss (and likewise the B coin) will make a difference to the effective dynamics.
(38) There is, however, more to say about indistinguishability and path integral methods. I do not pretend to do justice to this topic here.
(39) The memory records of such a demon in effect provide a system of individuating properties for the N particles.
(40) For further discussion, see Muller and Saunders (2008). (Set-theory, of course, yields rigid structures par excellence.)
(41) This was also, of course, a key problem for Kant. For further discussion, and an analysis of the status of mirror symmetry given parity violation in weak-interaction physics, see Saunders (2007).
(42) This construction was overlooked by Dieks and Lubberdink (2011) in their criticisms of the concept of classical indistinguishable particles. They go further, rejecting indistinguishability even in the quantum case (they consider that particles only emerge in quantum mechanics in the limit where Maxwell-Boltzmann statistics hold sway, where individuating predicates in our sense can be defined).
(43) See Pniower (2006) for arguments to this effect.
(44) It was first proposed by Hilbert and Bernays (1934); it was subsequently championed by Quine (1960, 1970).
(45) For further discussion, see Quine (1970, 61–64), and, for criticism, Wiggins (2004, 184–188).
(46) For further discussion of this form of the principle of identity of indiscernibles, see Muller and Saunders (2008, 522–23).
(47) See French and Krause (2006) for this history.
(48) See Muller and Seevinck (2009). Their idea is to use certain commutator relations that could not be satisfied were there only a single particle.
(49) For a discussion of the relation between second quantized and free-field theories (fermionic and bosonic respectively), see Saunders (1991, 1992).

Simon Saunders
Simon Saunders is Professor in the Philosophy of Physics and Fellow of Linacre College at the University of Oxford. He has worked in the foundations of quantum field theory, quantum mechanics, symmetries, thermodynamics, and statistical mechanics and in the philosophy of time and spacetime. He was an early proponent of the view of branching in the Everett interpretation as an “effective” process based on decoherence. He is co-editor (with Jonathan Barrett, Adrian Kent, and David Wallace) of Many worlds? Everett, quantum theory, and reality (OUP 2010).
Unification in Physics
Margaret Morrison
The Oxford Handbook of Philosophy of Physics
Edited by Robert Batterman

Abstract and Keywords
This chapter focuses on unification in the field of physics, arguing that there are distinct senses of unification in physics, each with different implications for how we view unified theories and phenomena. It describes the unification provided by Maxwellian electrodynamics and by Newtonian mechanics, which brought together terrestrial and celestial phenomena. The chapter also argues for a third type of unification that focuses on unification of phenomena independent of the micro-reduction characteristic of unified field theory approaches.
Keywords: physics, unification, unified theories, phenomena, Maxwellian electrodynamics, Newtonian mechanics, unified field theory

1. Introduction and Background
What exactly is unification and what form does it take in physics? Typically when this question is asked we think about high-energy physics and the search for a Theory of Everything (TOE). But what would such a theory look like and what kind of unification would it encompass? Again, the preliminary answer is that it would bring together the four forces of nature and show that they are low-energy manifestations of the same force. But would such a theory involve deductions from a few simple laws or would it require several free parameters and complex models to apply it in concrete situations? If it is the latter, at what point are we willing to claim the theory presents a “unified” account of the phenomena?
Much of the discussion surrounding the operation of the Large Hadron Collider (LHC) and theories like string theory and quantum gravity suggests that the immediate goals of unification in physics involve finding the Higgs particle, determining its nature and properties, and somehow bringing gravity into the framework of the Standard Model. The first goal has been achieved; in July 2012 CERN announced that two experiments using the Large Hadron Collider, ATLAS and CMS, had both amassed strong statistical evidence (around 5 sigma) for a new particle with a mass of roughly 126 GeV which is consistent with Standard Model predictions for the Higgs boson. However, that the Standard Model provides a unified account of the electromagnetic, weak and strong forces under one theory is far from clear, and the recent discovery of the Higgs boson may not necessarily solve that problem. In order to see why this is the case, we need to go back to our initial question regarding what unification is and how we should characterize a “unified theory.” Put differently, how should we understand the drive for unification in physics? Although these questions are of a philosophical nature, they are directly connected with scientific theory and experiment, and they are the questions that will occupy the bulk of this essay. No account of unification would be complete without an investigation into its associated difficulties, and in the context of high-energy physics that focus will be partly on the role of effective theories as a way of dealing with phenomena at different energy levels. Interestingly the mathematics involved in constructing effective theories points to a different kind of unification in physics, a unification at the level of phenomena that had been poorly understood prior to the use of renormalization group (RG) techniques.
Our discussion will also focus on this type of unification, referred to as “universality,” and its relation to the techniques used in the high-energy context.
The first systematic unification in physics was Newtonian mechanics, which brought together terrestrial phenomena (e.g., tides) and celestial phenomena (the moon, planets, etc.) by showing how they mutually influenced each other's motion and were subject to the same force law, universal gravitation. In that sense Newtonian physics represented a grand unification in that it accounted for all the phenomena in the heavens and on earth. More recently, the quest for a theory of everything (a grand unification) involves showing that when energies are high enough, the forces (interactions), while very different in strength, range, and the types of particles on which they act, become one and the same force. The fact that these interactions are known to have many underlying mathematical features in common suggests that they can all be accounted for by a unified field theory. Such a theory describes elementary particles in terms of force fields that further unify all the interactions by treating particles and interactions in a technically and conceptually similar way. It is this theoretical framework that allows for the prediction that measurements made at a certain energy level will supposedly indicate that the four separate forces are low-energy manifestations of a single force. Aspects of this unification are what will hopefully be revealed in the LHC, the biggest and most complicated physics experiment ever seen. The successful experiments responsible for the discovery of the Higgs particle (the confirmatory link in the theory that unifies the weak and electromagnetic forces) will also be important for discovering other aspects of the Standard Model, which includes the strong force and of which the electroweak theory is a part.
In many cases of unification not only is there an ontological reduction where different phenomena, usually forces, are seen as one and the same, but the mathematical framework(s) used to describe the fields associated with these forces facilitates their description in a unified theory. Specific types of symmetries serve an important function in these contexts, not only in the construction of quantum field theories (QFT) but also in the classification of particles: classifications that can lead to new predictions and novel ways of understanding properties like quantum numbers. Hence, in order to address issues about unification and reduction in contemporary physics we must also address the way that symmetries support the development of unified theoretical frameworks. Despite the association of reduction and unification, there are clear cases where the reductionist ideal has not been met and unification has involved a synthesis where the phenomena have remained largely independent but are nevertheless described using the same theory. The electroweak theory is a case in point. The theory unifies the weak and electromagnetic forces under the SU(2) × U(1) symmetry group via a mixing of the fields, but the carriers of the forces (particles) remain distinct. Contrast this with the unification of electromagnetism and optics where Maxwell's theory showed the identity of light and electromagnetic waves.1 The other issue relevant in the case of synthetic unity is, of course, the role of free parameters. The electroweak theory contains one free parameter, the Weinberg angle, which represents the mixing of the fields and yields the masses for the W and Z bosons. The Standard Model, by contrast, contains somewhere in the range of 26 such parameters, which have to be put in by hand and whose values are extracted from experimental data.
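How a single mixing angle ties the two fields together can be illustrated with the standard tree-level relations of the electroweak theory. These are supplied here as background and are not stated in the text: the photon and Z fields are rotations of the neutral SU(2) and U(1) fields through the Weinberg angle θ_W, and the boson masses satisfy m_W = m_Z cos θ_W. The following is a minimal sketch only; the mass values are approximate measured figures assumed for illustration, and the function names are hypothetical.

```python
import math

# Approximate measured boson masses in GeV (assumed values for illustration;
# they do not appear in the text).
M_W = 80.38
M_Z = 91.19


def weinberg_angle(m_w: float, m_z: float) -> float:
    """Mixing angle from the tree-level relation m_W = m_Z * cos(theta_W)."""
    return math.acos(m_w / m_z)


def mix_neutral_fields(w3: float, b: float, theta: float) -> tuple[float, float]:
    """Rotate the neutral SU(2) field W3 and the U(1) field B into (photon, Z)."""
    photon = b * math.cos(theta) + w3 * math.sin(theta)
    z_boson = -b * math.sin(theta) + w3 * math.cos(theta)
    return photon, z_boson


theta_w = weinberg_angle(M_W, M_Z)
sin2_theta_w = math.sin(theta_w) ** 2

# A pure W3 component ends up shared between the photon and the Z.
photon_part, z_part = mix_neutral_fields(w3=1.0, b=0.0, theta=theta_w)

print(f"theta_W = {math.degrees(theta_w):.1f} degrees")
print(f"sin^2(theta_W) = {sin2_theta_w:.3f}")
```

The rotation preserves the norm of the pair of fields, so the photon and Z are a re-expression of the two original fields rather than a single new one, which fits the point that the carriers remain distinct.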
This issue of free parameters is extremely important because the raison d’être for unification is the ability to account for a variety of phenomena using a few general principles. The addition of free parameters not only erodes that capability but undermines inferences about the identification of phenomena (like forces) on the basis of their description under a single theory. In other words, it casts doubt on the idea that nature itself is unified.2 The search for a theory of everything that would incorporate gravity presupposes, in some sense, that the Standard Model has unified the weak, strong, and electromagnetic forces. But as I mentioned above, the problem of free parameters and the fact that the theory is an amalgam of three different symmetry groups SU(3) × SU(2) × U(1) rather than a single group speak against the idea that this is a truly unified theory. Moreover, the search for a TOE presumes that gravity is a force like the others when according to General Relativity it is very unlike the others in that there are no particles that couple to the gravitational field and act as force carriers; the effects of gravitation are ascribed to spacetime curvature instead of a force per se. Some of the most prominent attempts to incorporate gravity into a unified framework with quantum mechanics include string theory or others related to supersymmetry (SUSY) and loop quantum gravity, all of which face theoretical difficulties.3 Those problems aside, the other threat to the unificationist picture of physics comes from the failure of reduction in a different context, specifically in the case of condensed matter physics where many of the phenomena are described as emergent. This picture is exemplified by Anderson's remark that “the ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe.
… The behaviour of large and complex aggregates of particles … is not to be understood in terms of a simple extrapolation of the properties of a few particles. Instead at each level of complexity entirely new properties
appear” (1972, 393). Examples of emergent phenomena in condensed matter physics include superconductivity and superfluidity. The defining feature of these phenomena is that their behavior or existence, for that matter, cannot be explained, predicted, or reduced to their micro constituents and the laws that govern them. So, while superconductivity involves the pairing of electrons, its essential features (e.g., infinite conductivity) do not depend on microphysical details related to that pairing. This decoupling of physics at different energy levels that is characteristic of emergent phenomena has also been a prominent feature of quantum field theories where effective theories containing appropriate degrees of freedom are used to describe physical phenomena occurring at a chosen length scale, while ignoring substructure and degrees of freedom at shorter distances (or, equivalently, at higher energies). Indeed, much of high-energy physics is now dominated by the use of effective field theories (EFTs), and the decoupling theorem of Appelquist and Carazzone (1975) has often been understood as a basis for interpreting physical reality as constituting a hierarchy of layers that are quasi-autonomous.4 As we shall see, this in itself need not speak against the possibility of reduction and unification. The question is whether the discoveries at the LHC will change physics sufficiently such that effective theories will no longer be a theoretical requirement. As I mentioned above, this emphasis on emergence and effective theories has produced a new and different kind of unity that is often not considered in the context of unification in physics. What I have in mind is the explanation of what is termed “universal behavior” by the renormalization group methods developed by Kenneth Wilson (1971, 1975) and others.
Before RG, there was no account of why systems as different as magnets and superfluids shared the same critical exponents and displayed the same behavior near a second-order phase transition. RG explained this phenomenon by showing that the differences between them are related to irrelevant observables that play no role in the explanation of behavior near the critical point. In other words, features of the system that are responsible for the similarities in behavior are largely independent of microphysical structure. Phenomena that share the same critical exponents are said to belong to the same universality class. This grouping of phenomena into universality classes exhibits a type of unification that is the antithesis of reductive unity insofar as the microphysical constituents are irrelevant to the universal properties or behavior to be explained. The RG is also an important component in the effective field theory program in particle physics, so interesting questions arise related to the unity of method in these two very distinct domains. I will have more to say about these questions below. In order to illustrate, extend, and clarify these issues I want to begin by discussing examples of the two different types of unification mentioned above: reductive and synthetic unity. In particular I will address not just the ontological features involved in each case but also the role of mathematics in constructing unified theories. In that sense our discussion will focus on both the epistemic and ontological features of unification and reduction. While reductive unity exemplifies the goals of unification by illustrating the identity of different types of phenomena, its synthetic counterpart presents a rather different picture in that it unifies phenomena under the same theory but falls short of identifying them as one and the same.
From there I will go on to discuss the challenges facing the unification picture from effective theories and emergent phenomena. Finally I discuss the way that RG techniques have facilitated an understanding of similarities among very different types of phenomena, indicating a new type of unity in physics that had been largely ignored and previously inexplicable.

2. Reductive Unity: Maxwell's Electrodynamics
The development of Maxwell's electrodynamics is interesting in that the theory was initially formulated using a completely fictitious aether model from which was derived a wave equation that led to the identification of electromagnetic and light waves. This model was given up in later formulations of the theory but its most important feature was the incorporation of a phenomenon known as the displacement current which was responsible for the transmission of electric waves through space, thereby producing the effect of having a closed circuit between two conductors. The aether model explained how the displacement of electricity took place, but it was this notion of electric displacement that was the key to producing a field-theoretic account of electromagnetism. The idea at the foundation of Maxwell's theory was Faraday's account of electromagnetism in terms of lines of force filling space. Prior to that, it was thought that electromagnetic force resided in material bodies and could only be transmitted through some type of mechanical interaction. The notion that the seat of electromagnetic charge was the field rather than matter was both revolutionary and controversial. In order to disengage the theory from its questionable origins, later versions relied on what Maxwell described as “firmly established empirical facts” together with a few general dynamical principles as characterized by the
abstract mathematical structure of Lagrangian mechanics.5 That structure, unlike the mechanical model, provided no explanatory account of how electromagnetic waves were propagated through space nor any understanding of the nature of electric charge. The new unified theory based on that abstract dynamics entailed no ontological commitment to the existence of forces or structures that could be seen as the source of electromagnetic phenomena. The displacement current was retained as a basic feature (one of the equations), but no mechanical hypothesis was put forward regarding its nature. The question that immediately arises is how we should view this type of unification given that the initial model was fictitious and the later version had no underlying ontological foundation capable of grounding the apparent reduction. In order to address that question I want to draw attention to the role that mathematical structures play in the unifying process and how those structures were able to facilitate a reductive unity without implying an ontological unity in nature. As it turns out, Maxwell's theory did accomplish the latter but no evidence for that was forthcoming until the production of electromagnetic waves by Hertz in 1888. Maxwell's use of the Lagrangian approach was due primarily to its generality, which makes it applicable in a variety of contexts; and it was ultimately this feature that made it especially suited to unifying different phenomena/domains. In addition to the importance of these types of mathematical structures for unification, I also want to highlight what I see as the mark of a truly unified theory: the presence of a specific theoretical quantity/parameter that represents the theory's ability to reduce, identify, or synthesize two or more processes within a single theoretical framework.
What I have in mind here is the idea that one particular parameter functions as a manifestation of the reduction of different phenomena to one specific kind, or its presence produces a theoretical context wherein different phenomena can be unified. In Maxwell's theory the displacement current plays just such a role. It figures prominently as a fundamental quantity in the field equations, and without it there could be no notion of a quantity of electricity crossing a boundary, no derivation of the electromagnetic wave equation and hence no field theoretic basis for electromagnetism. In other words, displacement is responsible for creating the field theoretic picture that allows Maxwell to identify light and electromagnetic waves as field theoretical processes. As we shall see below, the Weinberg angle in the electroweak theory functions as the “unifying parameter” in that it represents the mixing of the weak and electromagnetic fields. Without such a parameter, we simply have a theory that can accommodate different kinds of phenomena but without any relation or connection between them. To see exactly why electrodynamics qualifies as a reductive unification and to illustrate the differences with synthetic unity let me give a brief overview of the evolution of the theory from its origins in the aether model to its abstract dynamical formulation. Tracing some of these details is important because in both reductive and synthetic unification there is a reliance on mathematical frameworks as a specific type of unifying tool (Lagrangian mechanics in the Maxwellian case and gauge theory in the electroweak case) yet the outcomes are very different in each case. In other words, the generality in the application of these frameworks to diverse phenomena does not entail anything specific about the type of unity that is produced.
The latter is solely a product of the specific way in which the phenomena are brought together. The difference between reductive and synthetic unity is an important feature in establishing ontological claims about unity in nature; hence, the details of how each is achieved are an important part of articulating how, exactly, unification in physics ought to be understood. In other words, we can sometimes construct unified theories but whether there is evidence for unity in nature is a different matter.

2.1 From Fictional Models to a Unified Theory
Maxwell describes his 1861–62 paper, “On Physical Lines of Force,” as an attempt to “examine” electromagnetic phenomena from a mechanical point of view and to determine what tensions in, or motions of, a medium were capable of producing the observed mechanical phenomena (Maxwell 1965, 1: 467). Faraday had described electromagnetic phenomena as lines of force permeating space rather than the result of an interaction among material bodies. At the time Thomson had developed an account of magnetism that involved the rotation of molecular vortices in a fluid aether, an idea that led Maxwell to hypothesize that in a magnetic field the medium (or aether) was in rotation around the lines of force, the rotation being performed by molecular vortices whose axes were parallel to the lines. In order to specify the forces that caused the medium to move and to account for electric currents, Maxwell needed to provide an explanation of the transmission of rotation of the vortices, something he achieved via his aether model. The specific details of this early version of the model are not important here but what is important is how the second of his aether models, developed to account for electrostatics, facilitated the
derivation of his theory of light. In order to explain charge and to derive the law of attraction between charged bodies, Maxwell constructed an elastic solid model in which the aetherial substance formed spherical cells endowed with elasticity. The cells were separated by electric particles whose action on the cells would result in a kind of distortion. Hence, the effect of an electromotive force was to distort the cells by a change in the positions of the electric particles. Because changes in displacement involved a motion of electricity, Maxwell argued that they should be “treated as” currents (1965, 1: 491). That gave rise to an elastic force that set off a chain reaction. Maxwell saw the distortion of the cells as a displacement of electricity within each molecule, with the total effect over the entire medium producing a “general displacement of electricity in a given direction” (Maxwell 1965, 1: 491). Understood literally, the notion of displacement meant that the elements of the dielectric had changed positions. Displacement also served as a model for dielectric polarization; electromotive force was responsible for distorting the cells, and its action on the dielectric produced a state of polarization. When the force was removed, the cells would recover their form and the electricity would return to its former position (Maxwell 1965, 1: 492). The amount of displacement depended on the nature of the body and on the electromotive force. Because the phenomenological law governing displacement expressed the relation between polarization and force, Maxwell was able to use it to calculate the aether's elasticity (the coefficient of rigidity), the crucial step that led him to identify the electromagnetic and luminiferous aethers. It is interesting to note that in Parts I and II of “On Physical Lines” there is no mention of the optical aether.
However, once the electromagnetic medium was endowed with elasticity, Maxwell relied on the optical aether in support of his assumption: “The undulatory theory of light requires us to admit this kind of elasticity in the luminiferous medium in order to account for transverse vibrations. We need not then be surprised if the magneto-electric medium possesses the same property” (1965, 1: 489). After a series of mathematical steps, which included correcting the equations of electric currents for the effect produced by elasticity and calculating the value for e, the quantity of free electricity in a unit volume, and E, the dielectric constant, he went on to determine the velocity with which transverse waves were propagated through the electromagnetic aether. The rate of propagation was based on the assumption described above: that the elasticity was due to forces acting between pairs of particles. Using the formula $V = \sqrt{m/\rho}$, where m is the coefficient of rigidity, ρ is the aethereal mass density, and μ is the coefficient of magnetic induction, we have $\mu = \pi\rho$, giving us $\pi m = V^2 \mu$, which, since $E^2 = \pi m$, yields $E = V\sqrt{\mu}$. Maxwell arrived at a value for V that, much to his astonishment, agreed with the value calculated for the velocity of light (V = 310,740,000,000 mm/sec), which led him to remark that: “The velocity of transverse undulations in our hypothetical medium, calculated from the electromagnetic experiments of Kohlrausch and Weber, agrees so exactly with the velocity of light calculated from the optical experiment of M.
Fizeau that we can scarcely avoid the inference that light consists in the transverse undulations of the same medium which is the cause of electric and magnetic phenomena” (1965, 1: 500). Maxwell's success involved linking the equation describing displacement ($R = -4\pi E^2 h$) with the aether's elasticity (modeled on Hooke's law), where displacement produces a restoring force in response to the distortion of the cells of the medium. However, $R = -4\pi E^2 h$ is also an electrical equation representing the flow of charge produced by electromotive force. Consequently, the dielectric constant E is both an elastic coefficient and an electric constant. Interpreting E in this way allowed Maxwell to determine its value and ultimately identify it with the velocity of transverse waves traveling through an elastic aether. In modern differential form, Maxwell's four equations relate the electric field (E) and magnetic field (B) to the charge (ρ) and current (J) densities that specify the fields and give rise to electromagnetic radiation, that is, light.
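For reference, a standard SI-unit statement of the four equations in differential form is given here, written with the field symbols used in the surrounding text. The ordering (Gauss's law, Gauss's law for magnetism, Faraday's law, Ampère's law with Maxwell's correction) is assumed so as to match the description that follows, and the vacuum relations between D and E and between H and B are added as an assumption rather than taken from the source.

```latex
\begin{align}
\nabla \cdot \mathbf{D}  &= \rho \\
\nabla \cdot \mathbf{B}  &= 0 \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{H} &= \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}
\end{align}
% In vacuum (assumed here): \mathbf{D} = \varepsilon_0 \mathbf{E}, \quad \mathbf{H} = \mathbf{B} / \mu_0
```

The term $\partial \mathbf{D} / \partial t$ in the last equation is the displacement current.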
D is the displacement field and H the magnetizing field. The first equation, Gauss's law, describes how an electric field is generated by electric charges where the former tends to point away from positive and toward negative charges. More specifically, it relates the electric flux through any hypothetical closed Gaussian surface to the electric charge within the surface. Gauss's law for magnetism states that there are no “magnetic charges” (magnetic monopoles), analogous to electric charges; or, that the total magnetic flux through any Gaussian surface is zero. Faraday's law describes how a changing magnetic field can induce an electric field. Finally, Ampère's law with Maxwell's correction states that magnetic fields can be generated by electrical current (which was the original “Ampère law”) and by changing electric fields (Maxwell's correction). Maxwell's correction to Ampère's law is crucial, since it means that just as a changing magnetic field gives rise to an electric field, a changing electric field creates a magnetic field. Consequently, self-sustaining electromagnetic waves can propagate through space. In other words, it allows for the possibility of “open circuits.”6 Given the importance of displacement for producing a field-theoretic account of electromagnetism and its role in calculating the velocity of waves, it is obviously the essential parameter in identifying the optical and electromagnetic aethers. In later versions of the theory, the aether was abandoned, but displacement remained as a fundamental quantity. However, its status changed once it was incorporated into the Lagrangian formulation of the theory in that it was no longer associated with an electric/elastic mechanical explanation.
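The step from Maxwell's correction to propagating waves can be made concrete with a small numerical check. Combining Faraday's law with the corrected Ampère law in empty space gives a wave equation whose speed is 1/√(μ₀ε₀). The sketch below is an illustration and not a reconstruction of Maxwell's own calculation: it evaluates that speed from modern SI values of the vacuum constants (assumed here, not taken from the text) and compares the result with the 1862 figure quoted earlier, 310,740,000,000 mm/sec. The function and variable names are hypothetical.

```python
import math

# Modern SI values for the vacuum constants (assumed for illustration;
# they do not appear in the text).
MU_0 = 4 * math.pi * 1e-7       # magnetic constant in H/m (approximate value)
EPSILON_0 = 8.8541878128e-12    # electric constant in F/m
C_DEFINED = 299_792_458.0       # defined speed of light in m/s

# Maxwell's 1862 figure as quoted in the text, converted from mm/s to m/s.
V_MAXWELL_1862 = 310_740_000_000 / 1000.0


def wave_speed(mu: float, epsilon: float) -> float:
    """Speed of waves obeying the vacuum wave equation, 1/sqrt(mu * epsilon)."""
    return 1.0 / math.sqrt(mu * epsilon)


def relative_difference(value: float, reference: float) -> float:
    """Signed fractional deviation of value from reference."""
    return (value - reference) / reference


c_from_constants = wave_speed(MU_0, EPSILON_0)
maxwell_excess = relative_difference(V_MAXWELL_1862, C_DEFINED)

print(f"1/sqrt(mu0 * eps0) = {c_from_constants:.0f} m/s")
print(f"Maxwell's 1862 figure exceeds the modern value by {maxwell_excess:.1%}")
```

Under these assumptions the purely electromagnetic constants reproduce the speed of light, and the 1862 figure lies within a few percent of the modern value, which is the kind of agreement the quoted remark refers to.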
What Maxwell had in fact shown was that given the specific assumptions employed in developing the mechanical details of his model, the elastic properties of the electromagnetic medium were just those required of the luminiferous aether by the wave theory of light. Hence, what was effected was the reduction of electromagnetism and optics to the mechanics of one aether, rather than a reduction of optics to electromagnetism simpliciter. In that sense, the first form of Maxwell's theory displayed a reductive unity, but the more interesting question is whether, in the absence of the aether, the identification of electromagnetic and optical waves still constitutes a reduction of two different processes to a single natural kind. The answer to this question is complicated by the difficulties that plagued the model, the most serious being the status of electric displacement itself. Not only did it suffer from ambiguities in interpretation, it was not a natural consequence of the model and there was no experimental data that required its postulation. It was introduced purely to facilitate a field-theoretic account of electromagnetic processes. Moreover, the equation relating displacement with charge was not explicitly given, and without any “physical” account of the field it became difficult to see just how charge could occur. Maxwell claimed that his later account entitled “A Dynamical Theory of the Electromagnetic Field” (1865) (DT) was based on experimental facts and general dynamical principles about matter in motion as characterized by the abstract dynamics of Lagrange. The aim of Lagrange's Mécanique Analytique (1788) was to rid mechanics of Newtonian forces and the requirement that we must construct a separate acting force for each particle.
The equations of motion for a mechanical system were derived from the principle of virtual velocities and d'Alembert's principle.7 The method consisted of expressing the elementary dynamical relations in terms of the corresponding relations of pure algebraic quantities, which facilitated the deduction of the equations of motion. Consequently, insofar as the formal structure is concerned, analytical mechanics, electromagnetism, and wave mechanics can all be deduced from a variational principle, the result being that each theory has a uniform Lagrangian appearance. Velocities, momenta, and forces related to the coordinates in the equations of motion need not be interpreted literally in the fashion of their Newtonian counterparts. This allows for the field to be represented as a connected mechanical system with currents, integral currents, and generalized coordinates corresponding to the velocities and positions of the conductors. In other words, we can have a quantitative determination of the field without knowing the actual motion, location, and nature of the system itself. Using this method Maxwell went on to derive the basic wave equations of electromagnetism without any special assumptions about molecular vortices or forces between electrical particles, and without specifying the details of the mechanical structure of the field.
The 20 equations consisted of three equations each for magnetic force, electric currents, electromotive force, electric elasticity, electric resistance, and total currents, and one equation each for free electricity and continuity. This allowed him to treat the aether (or field) as a mechanical system without any specification of the machinery that gave rise to the characteristics exhibited by the potential-energy function.8 It becomes clear, then, that the unifying power of the Lagrangian approach lay in the fact that it ignored the nature of the system and the details of its motion. Because very little information is provided about the physical system, it is easier to bring together diverse phenomena under a common framework. Only their general features are accounted for, yielding a unification that, to some extent, is simply a formal analogy between two different kinds of phenomena. The Lagrangian emphasis on energetic properties of a system, rather than its internal structure, became especially important after the establishment of the principle of conservation of energy. In fact, in the Treatise on Electricity and Magnetism the notion of “field energy” became the physical principle on which an otherwise abstract dynamics could rest. Maxwell claimed that all physical concepts in “A Dynamical Theory,” except energy, were understood to be merely illustrative, rather than substantial. Displacement constituted one of the basic equations and was defined simply as the motion of electricity, that is, in terms of a quantity of charge crossing a designated area. But, if electricity was being displaced, how did this occur? Due to the lack of a mechanical foundation, the idea that there was a displacement of electricity in the field (a charge), without an associated mechanical source or body, became difficult to motivate theoretically.
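The Lagrangian procedure described above, in which equations of motion follow from energies alone, can be illustrated with a deliberately simple case. The sketch assumes a single circuit whose generalized coordinate q is the charge that has passed, so that the current plays the role of a velocity; the inductance L_0 and capacitance C are illustrative symbols that do not appear in the text.

```latex
\begin{align}
\mathcal{L}(q, \dot q) &= T - V
  = \tfrac{1}{2} L_0 \dot q^{\,2} - \frac{q^2}{2C} \\
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot q}
  - \frac{\partial \mathcal{L}}{\partial q} &= 0
  \quad \text{(Euler--Lagrange equation)} \\
\Rightarrow\quad L_0 \ddot q + \frac{q}{C} &= 0
\end{align}
```

The equation of motion is obtained without saying anything about what mechanism stores the two energies, which is the feature identified above as the source of the formalism's unifying power.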
These issues did not pose significant problems for Maxwell himself, since he associated the force fields with the underlying potentials.9 Because the velocity of wave propagation for electromagnetic phenomena is equivalent to that for light, the wave equation represents the reduction of electromagnetism and optics, a process that was facilitated by the displacement current. What methodological lessons about unification can be gleaned from the Maxwell case? At the very least, it shows that theory unification can be a rather complex process that integrates mathematical techniques and broad-ranging physical principles that govern material systems. In addition to the generality of the Lagrangian formalism, its deductive character displays a crucial feature for successful unification: the ability to derive equations of motion for a physical system with a minimum of information. But, as I noted above, this mathematical framework provided little or no insight into specific physical details, leaving the problem of whether to interpret the unification as indicative of a physical unity in nature. This problem is particularly relevant because of the accompanying difficulties with displacement. Because displacement provides a necessary condition for formulating the field equations, it forms the foundation for a truly unified theory that integrates or reduces various phenomena as opposed to one that simply incorporates more phenomena than its rivals. However, the theory cast in terms of the Lagrangian formalism lacked real explanatory power due to the absence of specific theoretical details. The field equations could account for both optical and electromagnetic processes as the results of waves traveling through space, but there was no theoretical foundation for understanding how that took place.
And, in the absence of any experimental evidence for electromagnetic waves, what Maxwell had shown was only that a unification and reduction of electromagnetism and optics was theoretically possible. Although the unity achieved in “On Physical Lines” and later versions of the theory involved the reduction of optical and electromagnetic processes, the electric and magnetic fields retained their independence; the theory simply showed the interrelationship of the two: where a varying electric field exists, there is also a varying magnetic field induced at right angles, and vice versa. The two together form the electromagnetic field. In that sense the theory united the two kinds of forces by integrating them in a systematic or synthetic way, but their true unification did not take place until 1905 with the Special Theory of Relativity. Maxwell's equations were crucial in motivating Einstein's paper, where he noted in the opening paragraph that a description of a conductor moving with respect to a magnet must generate a consistent set of fields irrespective of whether the force is calculated in the rest frame of the magnet or that of the conductor. Maxwell's equations generated an asymmetry that was not present in the phenomena (1952, 37). Without going into the details of the unification provided by special relativity, it is important to point out that the unity of electricity and magnetism was also indicative of something deeper and more pervasive, specifically, a unification of two domains of physics: mechanics and electrodynamics. This latter unification was a realization of the requirement that the laws of physics must assume the same form in all inertial frames. The further mathematization of the event structure of the theory at the hands of Minkowski showed that the relationship between electricity and magnetism could be represented by the transformation properties of the electromagnetic field tensor. If one begins with a field E due to a static charge distribution, with no magnetic field, and transforms to another frame moving with uniform velocity, the transformation equations show that there exists a magnetic field in the moving frame even though none existed in the original frame. Hence, the magnetic field appears as an effect of the transformation from one frame of reference to another. In Maxwell's theory the electric and magnetic fields were two entities combined by an angle of interaction, whereas in the Minkowski formulation the electromagnetic field is one entity represented by one tensor; their separation is merely a frame-dependent phenomenon.10

Maxwell's electrodynamics was the first unified field theory in physics. It exemplified the same type of reductive unity present in Newtonian mechanics, which unified terrestrial and celestial motion under the same force law, universal gravitation. But few if any subsequent cases of unification have demonstrated this kind of reduction; in fact most unified theories are the result of a synthesis of different phenomena under a single theoretical framework. Mathematics continues to be crucial for achieving unification, but as we shall see below, in the electroweak case the goal is to use mathematical tools like symmetry for generating a unified dynamics as opposed to a framework for representing existing theoretical relations among different phenomena, as in the case of electrodynamics.

3. Synthetic Unity: The Electroweak Theory

The electroweak theory brings together electromagnetism with the weak force in a single relativistic quantum field theory that involves the product of two gauge symmetry groups. From the perspective of phenomenology these two forces are very different.
Electromagnetism has an infinite range, whereas the weak force, which produces radioactive beta decay, spans distances shorter than approximately 10⁻¹⁵ cm. Moreover, the photon associated with the electromagnetic field is massless, while the bosons associated with the weak force are massive, consistent with the short range of that force. Despite these differences, they do share some common features: both kinds of interactions affect leptons and hadrons; both appear to be vector interactions brought about by the exchange of particles carrying unit spin and negative parity; and both have their own universal coupling constant that governs the strength of the interactions. The electroweak theory is joined with quantum chromodynamics (QCD), the theory of the strong interactions, to form the Standard Model. My focus here will be largely on the electroweak theory for several reasons. First, and perhaps most important for our purposes, by examining the structure of the electroweak theory it is possible to illustrate the nature of unification in a way that is not possible with the larger Standard Model. The electroweak theory involves a combination of the SU(2) group governing isospin/weak interactions and the U(1) group of electromagnetism. The mixing of these fields is represented by the Weinberg angle, sin²θ, which is a free parameter whose value is determined experimentally. The Standard Model structure involves the addition of the SU(3) symmetry group that governs the color-charged fermions (quarks) to form the SU(3) × SU(2) × U(1) group. The SU(3) color group corresponds to the local symmetry whose gauging gives rise to quantum chromodynamics, the theory that governs the strong force.
In addition to some of the outstanding theoretical problems with the Standard Model, such as the origin of the masses and mixings of the quarks and leptons, the most significant problem from the “unification” perspective comes in the application of the theory, which involves significantly more free parameters than the electroweak theory, approximately 26 in total.11 While the incompatibility of the Standard Model with gravity, and until recently the status of the Higgs boson, are often cited as stumbling blocks for unification, it is the internal structure of the Standard Model itself that undermines its status as a unified theory. Moreover, finding the Higgs particle will only partly rectify the problems. By contrast, the electroweak theory has only one free parameter and involves more than a simple pasting together of the two different force fields under a combined symmetry group. As we shall see, it furnishes an account of the mixing of the fields that involves a synthetic unity that is simply not possible in the current version of the Standard Model. That is not to say that the electroweak theory is without its own difficulties, but only that it clearly qualifies as a unified theory in a way that the Standard Model does not. Below I discuss some of these theoretical issues as they arise in the context of the electroweak theory and their relation to the larger Standard Model, but first let me turn to a more detailed discussion of the specifics of the electroweak theory to illustrate the exact nature of the unification and how it was produced. A solution to the incompatibility between electromagnetism and the weak force was achieved by postulating the Higgs mechanism, the newly found element of the Standard Model. This facilitated a unification of the fields but did so in a way that leaves the forces more or less distinct. To see how this unity was achieved let me begin by discussing how gauge symmetry functions as a unifying structure and go on to show how this type of unification presents us with a very different picture than the reductive unity provided by Newtonian mechanics and Maxwellian electrodynamics.12

3.1 Symmetry as a Tool for Unification

In physics, a gauge theory is a type of field theory where the Lagrangian is invariant under a continuous group of local transformations known as gauge transformations. These transformations form a Lie group, which is the symmetry group or the gauge group, with an associated Lie algebra of group generators. For each group generator there necessarily arises a corresponding vector field called the gauge field, which is included in the Lagrangian to ensure its invariance under the local group transformations. Simply put: in a gauge theory there is a group of transformations of the field variables (gauge transformations) that leaves the basic physics of the quantum field unchanged. This condition, called gauge invariance, gives the theory a certain symmetry, which governs its equations. Hence, the structure of the group of gauge transformations in a particular gauge theory entails general restrictions on how the field described by that theory can interact with other fields and elementary particles. This is the sense in which gauge theories are sometimes said to “generate” particle dynamics: their associated symmetry constraints specify the form of interaction terms. The symmetry associated with electric charge is a local symmetry where physical laws are invariant under a local transformation. This involves an infinite number of separate transformations that are different at every point in space and time.
But by introducing new force fields that transform in certain ways and interact with the original particles in the theory, a local invariance can be restored. To see how local gauge invariance is related to physical dynamics consider the following: if we write the nonrelativistic Schrödinger equation (where the canonical momentum operator pμ − eAμ is replaced by the quantum operator −iħ∇ − eA), then after a phase change an additional gradient term proportional to e∇λ emerges, the result of the operator −iħ∇ acting on the transformed wave function. This additional term spoils the local phase invariance, which can then be restored by introducing the new gauge field Aμ. The accompanying gauge transformation of Aμ cancels out the new term. This new gauge field is simply the vector potential defining the electromagnetic field. A different choice of phase at each point can be accommodated by interpreting Aμ as the connection relating phases at different points. In other words, the choice of a phase function λ(x) will not affect any observable quantity as long as the gauge transformation for Aμ has a form that allows the phase change and the change in potential to cancel each other. What this means is that we cannot distinguish between the effects of a local phase change and the effects of a new vector field. The combination of the additional gradient term with the vector field Aμ prescribes the form of the interaction between matter and the field because Aμ provides the connections between phase values at nearby points. The phase of a particle's wave function can be identified as a new physical degree of freedom that is dependent on spacetime position. In fact, it is possible to show that from the conservation of electric charge one can, given Noether's theorem, choose a symmetry, and the requirement that it be local forces us to introduce a gauge field, which turns out to be the electromagnetic field.
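The cancellation between the phase change and the change in potential can be checked in a few lines. The sketch assumes units with ħ = 1, a time-independent phase change of the form ψ → e^{ieλ(x)}ψ, and the operator −i∇ − eA quoted above; the sign and normalization of λ are conventions chosen here for illustration.

```latex
\begin{align}
\psi' &= e^{i e \lambda(x)} \psi \\
-i\nabla \psi' &= e^{i e \lambda} \left( -i\nabla + e \nabla\lambda \right) \psi
  \quad \text{(extra gradient term } e\nabla\lambda \text{)} \\
\mathbf{A}' &= \mathbf{A} + \nabla\lambda
  \quad \text{(gauge transformation of the vector potential)} \\
\left( -i\nabla - e\mathbf{A}' \right) \psi'
  &= e^{i e \lambda} \left( -i\nabla + e\nabla\lambda - e\mathbf{A} - e\nabla\lambda \right) \psi
   = e^{i e \lambda} \left( -i\nabla - e\mathbf{A} \right) \psi
\end{align}
```

The combined operator acts on the transformed wave function exactly as it did on the original, up to the overall phase, so no observable depends on the choice of λ(x). A time-dependent λ would require a corresponding shift of the scalar potential as well.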
The structure of this field, which is dictated by the requirement of local symmetry, in turn dictates, almost uniquely, the form of the interaction, that is, the precise form of the forces on the charged particle and the way in which the electric-charge current density serves as the source for the gauge field. In Maxwell's theory the basic field variables are the strengths of the electric and magnetic fields, which may be described in terms of auxiliary variables (e.g., the scalar and vector potentials). The gauge transformations in this theory consist of certain alterations in the values of those potentials that do not result in a change of the electric and magnetic fields. This gauge invariance is preserved in quantum electrodynamics (QED), where the phase transformations are one-parameter transformations and form a one-dimensional Abelian group (meaning that any two transformations commute), in this case the U(1) group of a U(1) gauge symmetry.
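In the classical theory the invariance of the fields under changes of the potentials can be stated directly. The following is the standard textbook form in SI units, with φ the scalar potential, A the vector potential, and χ an arbitrary smooth function introduced here for illustration.

```latex
\begin{align}
\mathbf{E} &= -\nabla\varphi - \frac{\partial \mathbf{A}}{\partial t},
\qquad
\mathbf{B} = \nabla \times \mathbf{A} \\
\varphi &\to \varphi - \frac{\partial \chi}{\partial t},
\qquad
\mathbf{A} \to \mathbf{A} + \nabla\chi \\
\mathbf{E} &\to -\nabla\varphi + \nabla\frac{\partial \chi}{\partial t}
  - \frac{\partial \mathbf{A}}{\partial t} - \frac{\partial}{\partial t}\nabla\chi
  = \mathbf{E},
\qquad
\mathbf{B} \to \nabla \times \mathbf{A} + \nabla \times \nabla\chi = \mathbf{B}
\end{align}
```

Two successive transformations with functions χ₁ and χ₂ simply add, in either order, which is the commutativity that marks the group as Abelian.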
Symmetry groups, however, are more than simply mathematizations of certain kinds of transformations. In the non-Abelian case (non-commutative transformations) the mathematical structure of the symmetry group determines the structure of the gauge field and the form of the interaction. In these more complicated situations, there are several wave functions or fields transforming together, as in the case of SU(2) and SU(3) transformations, which involve unitary matrices acting on multiplets.13 These symmetries are internal symmetries and typically are associated with families of identical particles. In each case the conserved quantities are simply the quantum numbers that label the members of the multiplets (such as isospin and color), together with operators that induce transitions from one member of a multiplet to another. Hence, the operators correspond, on the one hand, to the conserved dynamical variables (isospin, etc.) and, on the other hand, to the group of transformations of the symmetry group of the multiplets.14 The extension of gauge invariance beyond electromagnetism began with the work of Yang and Mills (1954), who generalized it to the conserved quantity isospin (violated in electromagnetic and weak interactions), which allows the proton and neutron to be considered as two states of the same particle. Here a local gauge invariance means that although we can, in one location, label the proton as the “up” state of isospin, and the neutron as the “down” state, the up state need not be the same at another location. But because the SU(2) symmetry group that governs isospin is also the group that governs rotations in a three-dimensional space, the “phase” is replaced by a local variable that specifies the direction of the isospin. However, it was not until the work of Schwinger (1957) that any significant connection was made between the weak and electromagnetic forces.
Schwinger's approach was to begin with some basic principles of symmetry and field theory, and go on to develop a framework for fundamental interactions derived from that fixed structure. As we saw above, in QED it was possible to show that from the conservation of electric charge, one could, on the basis of Noether's theorem, assume the existence of a symmetry, and the requirement that it be local forces one to introduce a gauge field, which turns out to be just the electromagnetic field. The symmetry structure of the gauge field dictates, almost uniquely, the form of the interaction; that is, the precise form of the forces on the charged particle and the way in which the electric charge current density serves as the source for the gauge field. The question was how to extend that methodology beyond quantum electrodynamics to embody weak interactions.

3.2 From Mathematics to Physics

Because of the mass differences between the weak force bosons and photons, a different kind of symmetry was required if electrodynamics and the weak interaction were to be unified and the weak and electromagnetic couplings related. Due to the mass problem, it was thought that perhaps only partial symmetries (invariance of only part of the Lagrangian under a group of infinitesimal transformations) could relate the massive bosons to the massless photon. In 1961 Glashow developed a model based on the SU(2) × U(1) symmetry group, which required the introduction of an additional neutral boson Zs, which couples to its own neutral lepton current Js. By properly choosing the mass terms to be inserted into the Lagrangian, Glashow was able to show that the singlet neutral boson from U(1) and the neutral member of the SU(2) triplet would mix in such a way as to produce a massive particle B (now identified as Z0) and a massless particle that was identified with the photon.
But, in order to retain Lagrangian invariance, gauge theory requires the introduction of only massless particles. As a result the boson masses had to be added to the theory by hand, making the models phenomenologically accurate but destroying the gauge invariance of the Lagrangian, thereby ruling out the possibility of renormalization. Although gauge theory provided a powerful tool for generating an electroweak model, unlike electrodynamics, one could not reconcile the physical demands of the weak force for the existence of massive particles with the structural demands of gauge invariance. Both needed to be accommodated if there was to be a unified theory, yet they were mutually incompatible. Hopes of achieving a true synthesis of weak and electromagnetic interactions came a few years later with Steven Weinberg's (1967) idea that one could understand the mass problem and the coupling differences of the different interactions by supposing that the symmetries relating the two interactions were exact symmetries of the Lagrangian that were somehow broken by the vacuum. These ideas originated in the early 1960s and were motivated by work done in solid state physics on superconductivity. But, if the weak and the electromagnetic interactions were truly unified and mediated by the same kind of gauge particles, then how could such a difference in the masses of the bosons and the photons exist? In order for the electroweak theory to work, it had to be possible for the gauge particles to acquire a mass in a way that would preserve gauge invariance.
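The conflict between boson masses and gauge invariance can be seen in the simplest (Abelian) case. The sketch assumes a single vector field A_μ with field strength F_{μν} and a mass term of the standard Proca form; the non-Abelian case relevant to the weak force behaves in the same way but carries more indices.

```latex
\begin{align}
\mathcal{L} &= -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu} + \tfrac{1}{2} m^2 A_\mu A^\mu,
\qquad
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \\
A_\mu &\to A_\mu + \partial_\mu \lambda
\quad\Rightarrow\quad
F_{\mu\nu} \to F_{\mu\nu} + (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu)\lambda = F_{\mu\nu} \\
\tfrac{1}{2} m^2 A_\mu A^\mu &\to \tfrac{1}{2} m^2 A_\mu A^\mu
  + m^2 A^\mu \partial_\mu \lambda
  + \tfrac{1}{2} m^2 \partial_\mu \lambda \, \partial^\mu \lambda
\end{align}
```

The kinetic term is unchanged but the mass term is not, unless m = 0. A mass inserted by hand therefore breaks the symmetry explicitly.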
The answer to these questions was provided by the mechanism of spontaneous symmetry breaking. From work in solid state physics, it was known that when a local symmetry is spontaneously broken the vector particles acquire a mass through a phenomenon that came to be known as the Higgs mechanism (Higgs 1964a, 1964b). This principle of spontaneous symmetry breaking implies that the actual symmetry of a system can be less than the symmetry of its underlying physical laws; in other words, the Hamiltonian and commutation relations of a quantum theory would possess an exact symmetry while physically the system (in this case the particle physics vacuum) would be nonsymmetrical. In order for the idea to have any merit one must assume that the vacuum is a degenerate state (i.e., not unique) such that for each unsymmetrical vacuum state there are others of the same minimal energy that are related to the first by various symmetry transformations that preserve the invariance of physical laws. The phenomena observed within the framework of this unsymmetrical vacuum state will exhibit the broken symmetry even in the way that the physical laws appear to operate. Although there is no evidence that the vacuum state for the electroweak theory is degenerate, it can be made so by the introduction of the Higgs mechanism, which involves an additional field with a definite but arbitrary orientation in the isospin vector space. The orientation breaks the symmetry of the vacuum. The Higgs field (or its associated particle the Higgs boson) is really a complex SU(2) doublet consisting of four real fields, which are needed to transform the massless gauge fields into massive ones. A massless gauge boson like the photon has two orthogonal spin components transverse to the direction of motion, while massive gauge bosons have three, including a longitudinal component in the direction of motion.
In the electroweak theory the W± and the Z0, which are the carriers of the weak force, absorb three of the four Higgs fields, thereby forming their longitudinal spin components and acquiring a mass. The remaining neutral Higgs field is not affected and should therefore be observable as a particle in its own right. The Higgs field breaks the symmetry of the vacuum by having a preferred direction in isospin space, but the Lagrangian remains invariant. So, the electroweak gauge theory predicts the existence of four gauge quanta: a neutral photon-like object, sometimes referred to as the X0 and associated with the U(1) symmetry, as well as a weak isospin triplet W± and W0 associated with the SU(2) symmetry. As a result of the Higgs symmetry-breaking mechanism the particles W± acquire a mass and the X0 and W0 are mixed so that the neutral particles one sees in nature are really two different linear combinations of these two. One of these neutral particles, the Z0, has a mass while the other, the photon, is massless. Since the masses of the W± and Z0 are governed by the structure of the Higgs field they do not affect the basic gauge invariance of the theory. The so-called “weakness” of the weak interaction, which is mediated by the W± and the Z0, is understood as a consequence of the masses of these particles. We can see from the discussion above that the Higgs phenomenon plays two related roles in the theory. First, it explains the discrepancy between the photon and the intermediate vector boson masses: the photon remains massless because it corresponds to the unbroken symmetry subgroup U(1) associated with the conservation of charge, while the bosons have masses because they correspond to SU(2) symmetries that are broken. Second, the avoidance of an explicit mass term in the Lagrangian allows for gauge invariance and the possibility of renormalizability.
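The bookkeeping behind this account can be made explicit. The sketch below uses the conventional quartic potential for the complex doublet Φ, with parameters μ² and λ_H that are standard textbook symbols rather than notation from the text, and then counts field degrees of freedom before and after symmetry breaking.

```latex
\begin{align}
V(\Phi) &= \mu^2 \, \Phi^\dagger \Phi + \lambda_H \, (\Phi^\dagger \Phi)^2,
\qquad \mu^2 < 0,\ \lambda_H > 0 \\
\frac{\partial V}{\partial (\Phi^\dagger \Phi)} = 0
  &\;\Rightarrow\;
  |\Phi_0|^2 = -\frac{\mu^2}{2\lambda_H}
  \quad \text{(a continuous set of degenerate minima)} \\
\text{before:}\quad
  & 4 \times 2 \ (\text{massless gauge fields}) + 4 \ (\text{real Higgs fields}) = 12 \\
\text{after:}\quad
  & 3 \times 3 \ (W^\pm, Z^0) + 1 \times 2 \ (\text{photon}) + 1 \ (\text{neutral Higgs}) = 12
\end{align}
```

Every choice of Φ₀ with this magnitude has the same energy, and picking one of them is what breaks the symmetry. The count confirms that three of the four real fields reappear as the longitudinal components of the massive bosons, leaving one neutral scalar.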
With this mechanism in place the weak and electromagnetic interactions could be unified under a larger gauge symmetry group that resulted from the product of the SU(2) group that governed the weak interactions and the U(1) group of electrodynamics.15 From this very brief sketch, one can get at least a snapshot of the role played by the formal, structural constraints provided by gauge theory/symmetry in the development of the electroweak theory. I now want to turn to the specific kind of unity that emerged in this context. The point I want to emphasize regarding the electroweak unification is that the unity achieved was largely structural rather than substantial and as a result does not fit with the ideal of reducing elements of the weak and electromagnetic force to the same basic entity. In the case of electrodynamics, the generality provided by the Lagrangian formalism allowed Maxwell to unify electromagnetism and optics without providing any specific details about how the electromagnetic waves were produced or how they were propagated through space. However, in addition to the structural aspects of the unification, light and electromagnetic waves were thought to be identical; hence the reductive aspect of the unification. The SU(2) × U(1) gauge theory furnishes a similar kind of structure; it specifies the form of the interactions between the weak and electromagnetic forces but provides no causal account as to why the fields must be unified. In this case both the electromagnetic and weak forces remain essentially distinct; the unity that is supposedly achieved results from the unique way in which these forces interact. Hence, with respect to the unifying process the core of the theory is really the representation of the interaction or mixing of the various fields.
Because the fields remain distinct, the theory retains two distinct coupling constants, q associated with the U(1) electromagnetic field and g with the SU(2) gauge field. In order to make specific predictions for the masses of the W± and Z0 particles, one needs to know the value for the Higgs ground state |Φ0|. Unfortunately, this cannot be directly calculated, since its value depends explicitly on the parameters of the Higgs potential and at the time the theory was formulated little was known about the properties of the field.16 In order to rectify the problem, the coupling constants are combined into a single parameter known as the Weinberg angle θw. The angle is defined from the normalized forms of Aem and Z0, and the mixing of the Aμ gauge field of U(1) and the new neutral gauge field W³μ is interpreted as a rotation through θw. By relating the weak coupling constant g to the Fermi coupling constant G one obviates the need for the quantity |Φ0| (the value of the Higgs ground state), so that the masses can be defined without it. In order to obtain a value for θw, one needs to know the relative sign and values of g and q; the problem, however, is that they are not directly measurable. Instead one must measure the interaction rates for the W± and Z0 exchange processes and then extract values for g, q, and θw. What θw does is fix the ratio of U(1) and SU(2) couplings, and in order for the theory to be unified θw must be the same for all processes. Despite this rather restrictive condition, the theory itself does not provide direct values for the Weinberg angle and hence does not furnish a full account of how the fields are mixed (i.e., the degree of mixing is not determined by the theory).
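For orientation, the mixing relations can be written out in their usual textbook form. This is a sketch under stated assumptions: X_μ denotes the neutral U(1) field and W^3_μ the neutral SU(2) field, q and g are the two couplings named above, α = e²/4π is the fine-structure constant in natural units, and the mass formulas hold only at lowest order. The sign conventions are one common choice and are not taken from the text.

```latex
\begin{align}
\tan\theta_w &= \frac{q}{g},
\qquad
\sin\theta_w = \frac{q}{\sqrt{g^2 + q^2}},
\qquad
\cos\theta_w = \frac{g}{\sqrt{g^2 + q^2}} \\
A^{em}_\mu &= \cos\theta_w \, X_\mu + \sin\theta_w \, W^3_\mu
  = \frac{g X_\mu + q W^3_\mu}{\sqrt{g^2 + q^2}} \\
Z^0_\mu &= -\sin\theta_w \, X_\mu + \cos\theta_w \, W^3_\mu
  = \frac{g W^3_\mu - q X_\mu}{\sqrt{g^2 + q^2}} \\
e &= g \sin\theta_w = q \cos\theta_w \\
\frac{G}{\sqrt{2}} &= \frac{g^2}{8 M_W^2}
\quad\Rightarrow\quad
M_W^2 = \frac{\pi \alpha}{\sqrt{2}\, G \sin^2\theta_w},
\qquad
M_Z = \frac{M_W}{\cos\theta_w}
\end{align}
```

Only the angle, fixed by the ratio q/g, enters the rotation and the mass relations; its numerical value has to be taken from experiment, which is the sense in which the degree of mixing is not determined by the theory.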
More important, the mixing is not the result of constraints imposed directly by gauge theory itself; rather it ultimately depends on the assumption that leptons can be classified as weak isospin doublets governed by the SU(2) symmetry group. The latter requires the introduction of the new neutral gauge field W3 in order to complete the group generators, that is, a field corresponding to the isospin operator τ. This is the field that combines with the neutral photon-like X0 to produce the Z0 necessary for the unity. We can see then that the use of symmetries to categorize various kinds of particles and their interaction fields is much more than simply a phenomenological classification; in addition it allows for a kind of particle dynamics to emerge. In other words, the symmetry group provides the foundation for the locally gauge-invariant quantum field theory. Hence, given the assumption about isospin, the formal restrictions of the symmetry groups and gauge theory can be deployed in order to produce a formal model showing how these gauge fields could be unified. The crucial feature that facilitates this interaction is the non-Abelian structure of the group rather than something derivable from the phenomenology of the physics. Although the Higgs mechanism is a crucial part of the physical dynamics of the theory and necessary for a unified picture to emerge, the framework within which the unification is realized results from the constraints of the isospin SU(2) group and the non-Abelian structure of the field.

To summarize: gauge theory serves as a unifying tool by specifying the form for the strong, weak, and electromagnetic fields. In that sense it functions in a global way to restrict the class of acceptable theories and in a local way to determine specific kinds of interactions, producing not only unified theories but also a unified method.
But it is not simply the presence of a unifying method or structure that is required for theory unification. As we saw with electrodynamics, the displacement current was the crucial theoretical parameter that allowed Maxwell to formulate a field theoretic account of electromagnetism and to calculate the velocity of wave propagation. The Higgs mechanism facilitates the unification in the electroweak theory by providing the symmetry-breaking mechanism that creates the boson masses; however, it does not explain the mixing of the fields. That mixing was
possible through the identification of leptons with the SU(2) isospin symmetry group and represented in the Weinberg angle θw. Employing gauge-theoretical constraints, one could then generate the dynamics of an electroweak model from the mathematical framework of gauge theory. But in what sense does this mixing represent a unification? Because of the neutral-current interactions, the old measure of electric charge given by Coulomb's law (which supposedly gives the total force between electrons) was no longer applicable. Owing to the contribution from the new weak interaction, the electromagnetic potential A^em_μ could not be just the gauge field A_μ but had to be a linear combination of the U(1) gauge field and the W^3_μ field of SU(2). Hence, the mixing was necessary if the electromagnetic potential was to have a physical interpretation in the new theory. So although the two interactions are integrated under a framework that results from a combination of their independent symmetry groups, there is a genuine unity, not merely the conjunction of two theories. A reconceptualization of the electromagnetic potential and a new dynamics emerged from the mixing of the fields. Although this synthesis retains an element of independence for each domain, it also yields a broader theoretical framework within which their integration can be achieved. So, despite the lack of reduction, the electroweak theory nevertheless provides a unified account of the two fields.

4. Problems and Prospects: Electroweak Unification and Beyond

As we noted above, the crucial parameter in the electroweak theory is the Weinberg angle, or as it is sometimes called, the weak mixing angle, but its value is not predicted from within the theory and needs to be extracted from parity-violating neutral-current experiments.
The electroweak theory has enjoyed overwhelming successes, with predictions holding over a range of distances from 10^-18 m to more than 10^8 m. It has predicted the existence and properties of weak neutral-current interactions and the properties of the gauge bosons W± and Z0 that mediate charged- and neutral-current interactions, and it required the fourth quark flavor, charm. The recent discovery of the Higgs boson provides the missing link for the electroweak theory, but there is still a great deal left unanswered. With a large amount of data still unanalysed, questions remain as to whether the discovery points to a simple Higgs particle or a more complex entity in a larger family of Higgs particles.

The Standard Model predicts that the Higgs boson lasts for only a very short time before it decays into other well-known particles. These decay patterns are the data relevant for the discovery. The five decay channels studied by CMS yielded a signal with a statistical significance of 4.9 sigma above background. The combined fit to the two most sensitive and high-resolution channels (photons and leptons) yielded a statistical significance of 5 sigma. What this means is that the probability of the background alone fluctuating up by this amount or more is about one in three million. Further data are required to measure properties like the decay rates in various channels as well as the spin and parity. These will determine whether the observed particle is the Higgs boson as predicted by the Standard Model, a more complicated version of it, or the result of new physics beyond it.

Some of the other problems facing the electroweak theory specifically include the fact that it accommodates but does not predict or explain fermion masses and mixings (elementary fermions are quarks and leptons, while composite fermions are baryons, which include protons and neutrons).
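The link between a significance quoted in sigma and a probability such as "one in three million" can be checked with a few lines of arithmetic. The sketch below assumes the usual particle-physics convention of a one-sided tail of a standard normal distribution; the function name is illustrative and not taken from any experiment's software.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a standard normal variable exceeds n_sigma.

    Assumption: the significance is converted with a one-sided Gaussian tail.
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


# Tail probabilities for the two significances mentioned in the text.
p_5_sigma = one_sided_p_value(5.0)
p_4_9_sigma = one_sided_p_value(4.9)

# Express each as "one chance in N" of a background-only fluctuation.
odds_5_sigma = 1.0 / p_5_sigma
odds_4_9_sigma = 1.0 / p_4_9_sigma

print(f"5.0 sigma: p = {p_5_sigma:.2e}, about 1 in {odds_5_sigma:,.0f}")
print(f"4.9 sigma: p = {p_4_9_sigma:.2e}, about 1 in {odds_4_9_sigma:,.0f}")
```

Under this convention 5 sigma corresponds to a few chances in ten million, which is the order of magnitude the text reports.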
The CKM (Cabibbo-Kobayashi-Maskawa) framework, which represents quark mixing using a 3 × 3 unitary matrix, describes CP violation but does not explain its origin.17 The mass of the neutrino, which is implied as a result of the discovery of neutrino flavor mixing, also requires an extension of the current electroweak theory, since specific values are determined by Yukawa couplings of fermions to the Higgs field rather than being set by the theory itself.18 There are several other problems related to the instability of the Higgs sector to large radiative corrections as well as the lack of any candidates to explain the cold dark matter required for structure formation in the early universe. The Higgs boson, however, is unlikely to provide an explanation for dark matter, since the latter must be stable with a very long lifetime and the Higgs decays very rapidly. The favoured explanation is the least massive supersymmetric particle, because it cannot decay any further; but, despite the enormous quantity of data from the LHC, there is as yet no evidence for the existence of any supersymmetric particles.

Many of these issues speak to the incompleteness of the Standard Model in general and against the view that it provides a unified description of the strong, weak, and electromagnetic forces. But some of these issues are also related to the connection between the electroweak theory and the larger context of the Standard Model. For example, the CP violation mentioned above is one such problem. Quantum chromodynamics (QCD) does not seem to break the CP symmetry even though the electroweak theory does. Although there are natural terms in the
QCD Lagrangian that can break the CP symmetry, experiments do not indicate any CP violation in the QCD sector. One of the reasons the CP problem is troublesome is that it leaves unanswered the question of why the universe does not consist of equal parts matter and antimatter. In fact, it is possible to show that one of the conditions required for the current imbalance is CP violation during the first seconds after the Big Bang. Other explanations require the imbalance to be present from the beginning, which is far less plausible. Although the violation of CP symmetry has been verified in the case of the weak force, it only accounts for a small portion of the violation required to explain the matter in the universe. The fact that this discrepancy is not even predicted by the Standard Model suggests a rather serious gap or incompatibility with the electroweak sector.

And there are other more pressing problems for electroweak theory itself. In addition to the fermion mass problem, there are also the mixing angles that parameterize the discrepancies between neutrino mass eigenstates and those in the quark sector. Although the Higgs boson may be responsible for fermion masses, there is nothing in the electroweak theory that can or will determine the couplings of the Higgs particles to fermions; and in that sense the theory is seriously incomplete.19 Another equally serious concern is the gauge hierarchy problem, which refers to the marked difference between fundamental parameters like masses and couplings that are contained in the Lagrangian and the values that are measured experimentally. Typically the latter are related to the former via renormalization, but in many cases there are cancellations between the fundamental quantity and quantum corrections that involve short-distance physics. The problem is that very often the details of physics at short distances are largely unknown.
More specifically, the gauge hierarchy problem relates to the fact that the weak force is 10^32 times stronger than gravity. This discrepancy gives rise to the question of why the Higgs boson or the weak scale (at 100 GeV) is so much smaller than the Planck scale (at 10^19 GeV). The weak scale is given by the vacuum expectation value of the Higgs, VEV = 246 GeV, but it is not naturally stable under radiative corrections. The radiative corrections to the Higgs mass, which result from its couplings to gauge bosons, Yukawa couplings to fermions, and its self-couplings, result in a quadratic sensitivity to the ultraviolet cutoff. Hence, if the Standard Model were valid up to the Planck scale, then mH, and therefore the minimum of the Higgs potential, would be driven to the Planck scale by the radiative corrections. To avoid this one has to adjust the Higgs bare mass in the Standard Model Lagrangian to one part in 10^17. This is called “unnatural fine-tuning,” where naturalness is defined in terms of the magnitude of quantum corrections: the bare value and the quantum correction appear to have an unexpected cancellation that gives a result much smaller than either component.

The issue of fine-tuning is important here because, as we saw above, the mass of the Higgs boson is not given by the theory, and without fine-tuning the mass would be so large as to undermine the internal consistency of the electroweak theory. Hence, the question becomes whether additions to the Standard Model or any new physics will still require fine-tuning. The answer will depend on what further data from the LHC will reveal about the nature of the Higgs boson and what additional particles might be discovered.
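The figure of "one part in 10^17" can be recovered from the two scales quoted above. The following is an order-of-magnitude sketch only: it assumes a Higgs mass of about 125 GeV, a cutoff at the Planck scale of about 1.22 × 10^19 GeV, and that the quoted tuning refers to the ratio of the masses rather than of their squares. The variable names are illustrative.

```python
import math

# Assumed inputs (order-of-magnitude values, not taken from the source text).
HIGGS_MASS_GEV = 125.0
PLANCK_SCALE_GEV = 1.22e19

# If radiative corrections drag the Higgs mass up to the cutoff, the bare mass
# must cancel them to leave a remainder of relative size m_H / cutoff.
mass_ratio = HIGGS_MASS_GEV / PLANCK_SCALE_GEV

# Because the sensitivity to the cutoff is quadratic, the same cancellation
# expressed in terms of mass squared is the square of that ratio.
mass_squared_ratio = mass_ratio ** 2

orders_in_mass = -math.log10(mass_ratio)
orders_in_mass_squared = -math.log10(mass_squared_ratio)

print(f"m_H / cutoff     ~ 10^-{orders_in_mass:.0f}")
print(f"(m_H / cutoff)^2 ~ 10^-{orders_in_mass_squared:.0f}")
```

Lowering the assumed cutoff to the TeV range shrinks the required cancellation to a few orders of magnitude, which is the sense in which new physics near that scale would restore naturalness.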
Implicit in the reasoning that leads to the fine-tuning is the unsubstantiated assumption that very little physics other than renormalization group scaling exists between the Higgs scale and the grand unification energy, which are separated by roughly 11 orders of magnitude (known as the “big desert” assumption). If this is true, then it would seem that fine-tuning is something we need to live with, at least for the time being.20 Of course, depending on the specific findings at the Higgs scale, the need for fine-tuning may very well be obviated.

Another instance of the hierarchy problem, and one that is a more serious violation of the naturalness requirement, involves the cosmological constant. Observations of an accelerating universe imply the existence of a small but nonzero cosmological constant. But the essential fact is that the observed vacuum energy density must be extremely small, corresponding to an energy scale of a few milli-electronvolts. However, if we take v, the Higgs vacuum expectation value, which is roughly 246 GeV, and insert the value of mH, the Higgs mass, which is about 126 GeV, then the Higgs field contribution to the vacuum energy density is roughly 54 orders of magnitude greater than the upper bound inferred from the cosmological constant. If there are other, heavier Higgs fields, the problem is even worse.

It seems clear from our discussion that the often cited problem of trying to adapt the quantum field theoretic framework to general relativity is simply one of several problems facing the Standard Model. Indeed, many of the pressing theoretical difficulties are generated from within the structure of the theory itself. The task of finding the Higgs particle is intimately connected with the possibility of discovering “new physics” beyond the Standard Model that would explain or rectify the origins of the hierarchy problem, among others. Given the list of unanswered
questions that arise from the electroweak theory and its connection with the Standard Model, it is reasonably clear that nothing like a unified understanding of the electromagnetic, weak, and strong forces is available from our present theories. So, while the discovery of the Higgs boson has verified an important part of the electroweak theory, it will not necessarily solve the outstanding internal problems facing the theory.

In order for physics beyond the Standard Model to regulate the Higgs mass and restore naturalness, its energy scale must be around the TeV scale. Most of the alternative theories that offer solutions to the problem imply that new physics will be discovered at the LHC, the most popular candidate being weak-scale supersymmetry. Supersymmetry (SUSY) relates particles of one spin to “superpartners” whose spin differs by half a unit. In a theory with an unbroken supersymmetry, every type of boson has a corresponding type of fermion with the same mass and internal quantum numbers, and vice versa. Because the superpartners of the Standard Model particles have not been observed, if supersymmetry exists it must be broken, thereby allowing the superparticles to be heavier than their corresponding Standard Model particles. There are currently many models proposed to explain SUSY breaking, as well as models that incorporate weakly interacting massive particles that serve as candidates for dark matter.21

The other bonus supplied by supersymmetry is its ability to unify the different coupling constants at a high-energy scale. Currently, within the framework of the Standard Model, there is no single energy at which they all become equal. However, incorporating supersymmetry changes the rate at which the couplings vary with energy, allowing them to be unified at a single point.
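The claim about the couplings can be illustrated with the textbook one-loop running formula. The sketch below is deliberately crude and rests on stated assumptions: approximate inverse couplings at the Z mass (with the U(1) coupling in the conventional grand-unified normalization), the standard one-loop coefficients for the Standard Model and for its minimal supersymmetric extension, no threshold effects, and superpartners treated as active from the Z mass upward. All names are illustrative.

```python
import math

M_Z_GEV = 91.19
# Approximate inverse gauge couplings at M_Z (assumed input values).
ALPHA_INV_AT_MZ = (59.0, 29.6, 8.5)
# One-loop beta coefficients for U(1), SU(2), SU(3).
B_STANDARD_MODEL = (41.0 / 10.0, -19.0 / 6.0, -7.0)
B_SUPERSYMMETRIC = (33.0 / 5.0, 1.0, -3.0)


def alpha_inverse(mu_gev, b):
    """One-loop running: 1/alpha_i(mu) = 1/alpha_i(M_Z) - b_i/(2 pi) ln(mu/M_Z)."""
    t = math.log(mu_gev / M_Z_GEV)
    return tuple(a0 - bi * t / (2.0 * math.pi) for a0, bi in zip(ALPHA_INV_AT_MZ, b))


def mismatch_where_first_two_meet(b):
    """Find the scale where the U(1) and SU(2) couplings cross, and report
    how far the SU(3) coupling is from them at that scale."""
    t = 2.0 * math.pi * (ALPHA_INV_AT_MZ[0] - ALPHA_INV_AT_MZ[1]) / (b[0] - b[1])
    mu = M_Z_GEV * math.exp(t)
    a1, a2, a3 = alpha_inverse(mu, b)
    return mu, abs(a3 - a2)


mu_sm, spread_sm = mismatch_where_first_two_meet(B_STANDARD_MODEL)
mu_susy, spread_susy = mismatch_where_first_two_meet(B_SUPERSYMMETRIC)

print(f"Standard Model: crossing near {mu_sm:.1e} GeV, mismatch {spread_sm:.2f}")
print(f"Supersymmetric: crossing near {mu_susy:.1e} GeV, mismatch {spread_susy:.2f}")
```

With these inputs the third coupling misses the crossing point by several units of 1/α in the Standard Model but by a small fraction of a unit in the supersymmetric case, which is the sense in which supersymmetry "allows them to be unified at a single point."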
If supersymmetry exists close to the TeV scale, it allows for a solution of the hierarchy problem because the superpartners of the Standard Model particles, having different statistics, contribute to the radiative corrections to the Higgs mass with the opposite sign. In the limit of exact supersymmetry, all corrections to mH cancel.

In the quest to unify the four forces into a single fundamental framework (a TOE), supersymmetry also includes a theory of quantum gravity that would unite general relativity and the Standard Model. Currently, the two predominant approaches to quantum gravity are string theory and loop quantum gravity (LQG). For string theory to be consistent, supersymmetry appears to be required at some level (although it may be a strongly broken symmetry).22 Loop quantum gravity, in its current formulation, predicts no additional spatial dimensions, as string theory does, or anything else about particle physics. Nor does LQG require any assumptions about supersymmetry.23 Experimental evidence at the LHC confirming supersymmetry in the form of supersymmetric particles could provide support for string theory, since supersymmetry is one of its required components.

However, the outlook isn't bright. Consistency of the Standard Model demanded that the Higgs could not be too massive, but because the superpartners (particles predicted by supersymmetry) are supposed to be only slightly heavier than the Higgs, it was assumed that once the Higgs was found the superpartners would also be in evidence. Moreover, they were supposed to be produced in much greater numbers. Because none has been found, a possible explanation is that their mass is an order of magnitude heavier than the Higgs, making them currently inaccessible; but that value is inconsistent with the Standard Model account.
Hence, many versions of string theory that predict certain low-mass superpartners will need to be significantly revised.

As was the case with Maxwell's electrodynamics at the time of its construction, the electroweak theory and the Standard Model in general are by no means free of theoretical difficulties. The experiments at the Large Hadron Collider at CERN will probe the electroweak symmetry-breaking sector to determine whether the properties of the newly discovered particle are consistent with those predicted for the Standard Model Higgs boson. Although the electroweak theory successfully unifies the weak and electromagnetic fields, the broader theoretical implications create significant problems that serve to undermine its ability to furnish a theoretically coherent account, that is, one that is consistent with other well-established theoretical claims in particle physics and cosmology. Consequently, despite its unifying power its overall epistemic status is not wholly unproblematic.

5. Effective Field Theories, Renormalization, and a New Type of Unification

As we have seen above, much of what falls under the title “unification” in high-energy physics involves a synthesis under the product of different symmetry groups rather than the kind of reductive unity characteristic of Newtonian mechanics and electrodynamics. More generally, the failure of the unification/reduction strategy in particle physics has given way to the effective field theory (EFT) program, where the “theory” incorporates only the
particles that are important for the energy levels or distance scales being investigated. Because the theory is valid only below the masses of the heavy particles, it must be superseded by another effective theory on that energy scale or a complete fundamental theory. The predominance of effective theories is sometimes seen as evidence against reduction and the goal of unification, but many, including Weinberg, claim EFTs can be interpreted as simply low-energy approximations to a more fundamental theory (e.g., string theory), thereby allowing one to embrace EFTs while remaining loyal to the reductivist/unification goal. The alternative involves the “tower” of EFTs, where there may be no end to the process, just more and more scales as the energies get higher. Moreover, the lack of experimental evidence and the difficulties associated with unification that necessitate the use of EFTs may no longer be an issue once the LHC starts producing sufficient data.

Regardless of the future output from the LHC, philosophical questions arise concerning the epistemic and ontological status of unity given the theoretical problems mentioned above and the prevalence of EFTs in many other areas of physics besides high energy. An examination of the evidence from both experiment and theorizing suggests the following characteristics of unity: it is something that can be achieved in certain local contexts, and it is characterizable in different ways, but it cannot be extended to a “unity of nature” that is systematically defined. None of this speaks against the possibility of grand unification, but the question that we, as philosophers, need to address is how to interpret the evidence at hand, particularly the extensive role of EFTs.
Several authors have contributed to this debate, including Hartmann (2001), who claims that good scientific research can be characterized by a fruitful interaction between fundamental theories, phenomenological models, and effective field theories. All of them have their appropriate functions in the research process, and all of them are indispensable, complementing each other and hanging together in a coherent way. Cao and Schweber (1993) take a more radical approach, claiming that the current situation is evidence for a pluralism in theoretical ontology, antifoundationalism in epistemology, and antireductionism in methodology. But what exactly are the implications of these claims, and are they borne out by the evidence?

Consider, for instance, methodological antireductionism and pluralistic ontologies; no one would deny that low- and high-energy domains involve not only different kinds of phenomena but also different methodologies, in the sense that the reductionism inherent in the search for fundamental theories has been largely unsuccessful in treating many phenomena in the low-energy domain. While recognizing that low- and high-energy domains have rather different goals and require different techniques, it is important to note that they also both make use of effective theories and renormalization group (RG) methods. In that sense, there is a unity of method, especially where the latter is concerned. But that in itself is not philosophically interesting unless we can point to reasons why the method should work so well in two rather disparate domains. In other words, is there some other sense of unity in physics that accounts for the success of RG methods?
Before addressing that question, it is important to note that the development of RG methods also revealed a rather different kind of unity that had been previously inexplicable, namely, the way that different phenomena such as liquids and magnets exhibit the same type of behavior near critical points regardless of differences in their microstructure. These phenomena are grouped together into universality classes and share the same critical exponents, the parameters that characterize phase transitions. These critical exponents at, for example, the liquid-gas transition are independent of the chemical composition of the fluid. The predictions of universal behavior based on RG methods result from the fact that thermodynamic properties of a system near a phase transition depend only on a small number of features, such as dimensionality and symmetry, and are insensitive to the underlying microscopic properties of the system. Although this kind of unity among different kinds of phenomena is quite distinct from the unification of theories in high-energy physics, in some way the goals are similar: explaining why seemingly different phenomena exhibit the same type of behavior.24 I will say more about this below, but first let me turn to the more general methodological issues of unification as they arise with RG.

The first systematic use of the renormalization group in quantum field theory was by Gell-Mann and Low (1954). A consequence of their approach was that quantum electrodynamics could exhibit a simple scaling behavior at small distances. In other words, quantum field theory has a scale invariance that is broken by particle masses, but these masses are negligible at high energies or short distances provided one renormalizes in the appropriate way. In statistical physics Kadanoff (1966) developed the basis for an application of RG to thermodynamic systems near a critical point.
This picture also led to certain scaling equations for the correlation functions used in the statistical description, a method that was refined and extended by Wilson (1971).

With respect to the unification issue two different questions arise. First: What, if anything, is the unifying thread that connects the different RG methods and
why can we use RG to describe very different kinds of phenomena? In some sense this question involves two parts: the first concerns the different mathematical techniques, with an eye to articulating a common ground that will underwrite the use of RG in both fields. Once the different techniques have been compared, the question is whether there is anything about the method itself that facilitates its use in different domains. In other words, once we have illustrated the similarities between the quantum field theoretic approach and that used in statistical physics, will that reveal a unity of method in the two domains? That brings us to the second question: Is there anything common to the phenomena themselves such that they can all be treated using the RG approach? It is important to keep in mind here that I am not simply assuming that because we can use the RG approach as a unifying methodology it also unifies phenomena in a way that shows them to be similar. Rather, a proper answer to the second question involves seeing what similarities might be exhibited between statistical and field theoretic phenomena such that they can both be successfully treated using RG techniques.

There is a brief answer to the first question which can then be spelled out in greater detail, but for our purposes here I will outline just the main point. In order to do that I first need to say a couple of things about the basic idea behind the RG approach. Initially one can think of QFT and statistical physics as having similar kinds of peculiarities that give rise to certain types of problems (e.g., many degrees of freedom, fluctuations, and diverse spatial and temporal scales). The RG framework is significant in its ability to link physical behavior across different scales and in cases where fluctuations on many different scales interact.
Hence, it becomes crucial for treating asymptotic behavior at very high (or, in massless theories, very low) energies (even where the coupling constants at the relevant scale are too large for perturbation theory). In field theory, when bare couplings and fields are replaced with renormalized ones defined at a characteristic energy scale μ, the integrals over virtual momenta will be cut off at energy and momentum scales of order μ. As we change μ we are in effect changing the scope of the degrees of freedom in the calculations. So, to avoid large logarithms, one takes μ to be of the order of the energy E that is relevant to the process under investigation. In other words, the problem is broken down into a sequence of sub-problems, with each one involving only a few length scales. Each one has a characteristic length, and you get rid of the degrees of freedom you do not need. Reducing the degrees of freedom gives you a sequence of corresponding Hamiltonians, which can be pictured as a trajectory in a space spanned by the system parameters (temperature, external fields, and coupling constants). So the RG gives us a transformation of the form

H′ = R[H] (1)

where H is the original Hamiltonian with N degrees of freedom. A wide choice of operators R is possible. Not only are there momentum or Fourier space methods, which are usually associated with field theory, but also what is termed real space renormalization, used in statistical physics (cases where there is a definite lattice).

The initial version, the Gell-Mann/Low formulation, involved the momentum space approach and hinged on the degree of arbitrariness in the renormalization procedure. They essentially reformulated and renormalized perturbation theory in terms of a cutoff-dependent coupling constant e(Λ).
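A scale-dependent coupling of this kind can be made concrete with the standard one-loop formula for the QED fine-structure constant. The sketch below is an illustration under strong simplifying assumptions, not the Gell-Mann/Low derivation itself: it keeps only the electron loop, so the numbers understate the running found when all charged particles are included. The function and constant names are illustrative.

```python
import math

ALPHA_LOW_ENERGY = 1.0 / 137.036   # coupling measured in long-distance experiments
ELECTRON_MASS_GEV = 0.000511


def alpha_at_scale(mu_gev: float) -> float:
    """One-loop QED coupling at momentum scale mu, electron loop only (assumption).

    alpha(mu) = alpha / (1 - (2 alpha / 3 pi) ln(mu / m_e)), valid for mu >> m_e.
    """
    log_term = math.log(mu_gev / ELECTRON_MASS_GEV)
    return ALPHA_LOW_ENERGY / (1.0 - (2.0 * ALPHA_LOW_ENERGY / (3.0 * math.pi)) * log_term)


# The same theory described by different members of the family of couplings.
for scale_gev in (0.001, 1.0, 91.19):
    print(f"mu = {scale_gev:8.3f} GeV  ->  1/alpha = {1.0 / alpha_at_scale(scale_gev):.2f}")

alpha_inverse_at_z = 1.0 / alpha_at_scale(91.19)
```

Each value of μ gives a different numerical coupling, yet all describe the same physics; choosing μ near the energy of the process keeps the logarithm small.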
For example, e, measured in classical experiments, is a property of the very long distance behavior of QED (whereas the natural scale is the Compton wavelength of the electron, ~10^-11 cm). Gell-Mann and Low showed that a family of alternative parameters eλ could be introduced, any one of which could be used in place of e. The parameter eλ is related to the behavior of QED at an arbitrary momentum scale λ instead of the low momenta for which e is appropriate. In other words, you can change the renormalization point freely in a QFT and the physics will not be affected. Introducing a sliding renormalization scale effectively suppresses the low-energy degrees of freedom.

The real space approach is linked to the Wilson-Kadanoff method. Kadanoff's account of scaling relations involves a lattice of interacting spins (ferromagnetic transition) and transformations from a site lattice with the Hamiltonian H_a(S) to a block lattice with Hamiltonian H_2a(S). Each block is considered as a new basic entity. One then calculates the effective interactions between them and in this way constructs a family of corresponding Hamiltonians. If one starts from a lattice model of lattice size a, one would sum over degrees of freedom at size a while keeping their average on the sub-lattice of size 2a fixed. Starting from a Hamiltonian H_a(S) on the initial lattice, one would generate an effective Hamiltonian H_2a(S) on the lattice of double spacing. This transformation is repeated as long as the lattice spacing remains small compared to the correlation length. The key idea is that the
transition from H_a(S) to H_2a(S) can be regarded as a rule for obtaining the parameters of H_2a(S) from those of H_a(S). The process can be repeated, with the lattice of small blocks being treated as a site lattice for a lattice of larger blocks. Close to a critical point the correlation length (the distance over which the fluctuations of one microscopic variable are associated with another) far exceeds the lattice constant a, which is the distance between neighboring spins. As we move from small to larger block lattices we gradually exclude the small-scale degrees of freedom by averaging out through a process of coarse graining. So, for each new block lattice one has to construct effective interactions and find their connection with the interactions of the previous lattice. What Wilson did was show how the coupling constants at different length scales could be computed, how critical exponents could be estimated, and hence how to understand universality, which follows from the fact that the process can be iterated (i.e., universal properties follow from the limiting behavior of such iterative processes).25 I will have more to say about these processes in answer to question (2) below.

At first it looks as though one is doing very different things in the two cases: in the context of critical phenomena one is interested only in long-distance, not short-distance, behavior. In the case of QFT, the renormalization scheme is used to provide an ultraviolet cutoff, while in critical behavior the very short wave numbers are integrated out. Moreover, why should scale invariance of the sort found in QFT be important in cases of phase transitions?
To answer these questions we can think of the similarities in the following way: in the K-W version the grouping together of the variables referring to different degrees of freedom induces a transformation of the statistical ensemble describing the thermodynamic system. Or, one can argue in terms of a transformation of the Hamiltonian. Regardless of the notation, what we are interested in is the successive applications of the transformation that allow us to probe the system over large distances. In the field theoretic case, we do not change the “statistical ensemble” but the stochastic variables do undergo a local transformation whereby one can probe the region of large values of the fluctuating variables. Using the RG equations, one can take this to be formally equivalent to an analysis of the system over large distances. This formal similarity also provides some clues to why RG can be successfully applied to such diverse phenomena.

But here I think we need to look more closely at what exactly the RG method does. In statistical physics we distinguish between two phases by defining an order parameter that has a nonzero value in the ordered phase and zero in the disordered phase (high temperature). In a ferromagnetic transition the order parameter is homogeneous magnetization. A nonzero value for the order parameter corresponds to symmetry breaking (here, rotational symmetry). In the liquid-gas transition the order parameter is defined in terms of the difference in density. In the vicinity of a transition, a system has fluctuations for which one can define a correlation length ξ that increases as $T \to T_c$ (provided all other parameters are fixed). If the correlation length diverges as $T \to T_c$, then the fluctuations become completely dominant and we are left without a characteristic length scale because all lengths are equally important.
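In the standard notation of critical phenomena, the divergence just described is usually written as a power law. The reduced temperature t, the exponents ν and β, and the symbol M for the order parameter are not used in the text above and are introduced here only for illustration:

```latex
t \equiv \frac{T - T_c}{T_c}, \qquad
\xi(T) \sim |t|^{-\nu} \quad (t \to 0), \qquad
M(T) \sim (-t)^{\beta} \quad (t \to 0^{-}).
```

Since ν > 0, ξ grows without bound as t → 0, which is the sense in which no characteristic length scale survives at the transition.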
Reducing the number of degrees of freedom with RG amounts to establishing a correspondence between one problem having a given correlation length and another whose length is smaller by a certain factor. So, we get a very concrete model (hence real space renormalization) for reducing degrees of freedom.

In cases of relativistic quantum field theories like QED, the theory works well for the electron because at long distances there is simply not enough energy to observe the behavior of other charged particles; that is, they are present only at distances very small compared to the electron's Compton wavelength. By choosing the appropriate renormalization scale, the logarithms that appear in perturbation theory will be minimized because all the momenta will be of the order of the chosen scale. In other words, one introduces an upper limit Λ on the allowed momentum equivalent to a microscopic length scale $h/2\pi\Lambda c$. We can think of a change in each of these scales as analogous to a phase transition where the different phases depend on the values of the parameters, with the RG allowing us to connect each of these different scales. So, regardless of whether you are integrating out very short wavelengths or using the renormalization scheme to provide an ultraviolet cutoff, the effect is the same in that you are getting the right degrees of freedom for the problem at hand.26 Hence, because the formal nature of the problems is similar in these two domains, one can see why the RG method is so successful in dealing with different phenomena. In the momentum space or field theory approach, we can think of the high-momentum variables as corresponding to short-range fluctuations integrated out.
And in the K-W version the reciprocal of a (the lattice constant, which is the distance between neighboring spins) acts as a cutoff parameter for large momenta; that is, it eliminates short-wavelength fluctuations with wavenumbers close to the cutoff parameter.
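The correspondence between the two kinds of cutoff can be made explicit in a short worked relation. It assumes that Λ is expressed in mass units (momentum divided by c), as the length scale quoted above implies. The symbols $\ell_\Lambda$, k, and $\Lambda_{\text{lattice}}$ are introduced here for illustration and do not appear in the original:

```latex
\ell_\Lambda = \frac{h}{2\pi \Lambda c} = \frac{\hbar}{\Lambda c},
\qquad
|k| \le \frac{\pi}{a}
\;\;\Longrightarrow\;\;
\Lambda_{\text{lattice}} \sim \frac{\pi \hbar}{a c},
\qquad
\ell_{\Lambda_{\text{lattice}}} \sim \frac{a}{\pi}.
```

On a lattice of spacing a the wavenumbers are confined to $|k| \le \pi/a$, so the lattice constant plays the role of the microscopic length that the momentum cutoff plays in the field theory.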
The notion that the RG equations and EFTs support ontological pluralism, as suggested by Cao and Schweber, is directly connected to the success of the decoupling theorem (Appelquist and Carazzone 1975). In simple terms the theorem states that if one has a renormalizable theory where some fields have much larger masses compared with others, a renormalization procedure can be found enabling the heavy particles to decouple from the low-energy domain. The low-energy physics is then described by an effective theory that deals only with the particles that are important for the energy level being considered. Using the RG equations, one can delete the heavy fields from the composite system and redefine the coupling constants and masses. However, what is significant here is that the decoupling is, to some extent, only partial. In some cases the heavy particles produce renormalization effects but are suppressed by a power of the relevant experimental energy divided by a heavy mass (the fundamental energy). In that sense the cutoffs represented by the heavy particles define the domain in which the EFT is applicable, that is, the process is mass dependent.27

But what about unification? As we saw above, properties near the critical point are determined primarily by the correlation length for fluctuations in the order parameter (i.e., blocks of spins within a correlation length of each other will be coherently magnetized). The correlation length diverges on approaching the critical point, but using the RG equations to reduce the degrees of freedom is in effect reducing the correlation length. As the process is iterated the Hamiltonian becomes more and more insensitive to what happens on smaller length scales. These ideas are important for defining the notion of universality mentioned above: the similar behavior in different kinds of systems in the neighborhood of the critical point.
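The claim that each RG step reduces the correlation length can be checked in the simplest exactly solvable case. The sketch below uses the zero-field one-dimensional Ising chain, which is chosen here for illustration and is not discussed in the text. Summing over every second spin gives the standard recursion tanh K′ = tanh² K for the dimensionless coupling K, and the correlation length in units of the current lattice spacing is ξ = −1/ln tanh K. The function names are hypothetical.

```python
import math

def decimate(K):
    """One b = 2 decimation step for the zero-field 1D Ising chain.

    Summing over every second spin gives tanh(K') = tanh(K)**2,
    which is equivalent to K' = 0.5 * ln(cosh(2K)).
    """
    return 0.5 * math.log(math.cosh(2.0 * K))

def correlation_length(K):
    """Correlation length in units of the current lattice spacing."""
    return -1.0 / math.log(math.tanh(K))

K = 2.0  # illustrative starting coupling (J / kT)
for step in range(6):
    print(step, round(K, 6), round(correlation_length(K), 4))
    K = decimate(K)
```

Each step halves ξ measured in lattice units, and the coupling flows toward K = 0. The one-dimensional chain has no finite-temperature critical point, so this illustrates only the reduction of the correlation length, not a nontrivial fixed point.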
An instance of this is the wide variety of liquid-vapor systems whose correlation lengths appear to diverge in precisely the same way as ferromagnets. The systems form a “universality class” that is determined primarily by the nature of the order parameter. The behavior of thermodynamic parameters near the critical point is also characterized by what are called critical indices. Phase transitions with the same set of critical indices are said to belong to the same universality class. It is important to point out that this is not simply a case of sharing the same exponents in the way that gravitation and electromagnetism both obey an inverse square law (exponent −2); that does not show a unity between the forces. A correspondence of exponents whose values are fractions like 0.63 provides evidence that the microstructure is unimportant. In that sense the unity among these phenomena has nothing to do with similarity at the level of constituent properties as in the case of unification via reduction.

One of the crucial features of Wilson's work was that it showed that in the long wave-length/large space-scale limit the scaling process leads to a fixed point when the system is at a critical point. The properties of this fixed point determine the critical exponents that characterize the fluctuations at the critical point. The same fixed point interactions can describe a number of different types of systems. RG shows that different kinds of transitions have the same critical exponents and can be understood in terms of the same fixed-point interaction that describes all these systems. What the fixed points do is determine the kinds of cooperative behavior that are possible. So, the important point here is not just the elimination of irrelevant degrees of freedom but also the existence of cooperative behavior and its relation to the order parameter (symmetry breaking) that characterizes the different kinds of systems.
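How a fixed point fixes a critical exponent can be seen in a small numerical sketch. It uses the Migdal-Kadanoff approximation for the two-dimensional Ising model with scale factor b = 2, a textbook approximation chosen here for illustration. It is not Wilson's own calculation, and it is not expected to reproduce exact values. The recursion K′ = ½ ln cosh 4K and all function names are assumptions of the sketch.

```python
import math

def mk_step(K):
    """Migdal-Kadanoff map for the 2D Ising model with b = 2.

    Bond moving doubles the coupling (K -> 2K); decimation then gives
    K' = 0.5 * ln(cosh(4K)).
    """
    return 0.5 * math.log(math.cosh(4.0 * K))

def fixed_point(lo=0.1, hi=1.0, tol=1e-12):
    """Bisection on f(K) = mk_step(K) - K, which changes sign on [lo, hi]."""
    def f(K):
        return mk_step(K) - K
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

K_star = fixed_point()
slope = 2.0 * math.tanh(4.0 * K_star)   # dK'/dK evaluated at the fixed point
nu = math.log(2.0) / math.log(slope)    # correlation-length exponent, b = 2
print(K_star, nu)
```

Under these assumptions the fixed point lies near K* ≈ 0.305 and the exponent near ν ≈ 1.34, against the exact two-dimensional values K_c = ½ ln(1 + √2) ≈ 0.4407 and ν = 1. The approximation is crude, but the structure is the one described above: the exponent is read off from the linearized map at the fixed point, not from the microscopic details of the starting Hamiltonian.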
What the renormalization group equations show is that phenomena at critical points have an underlying order. Indeed what makes the behavior of critical point phenomena predictable, even in a limited way, is the existence of certain scaling properties that exhibit “universal” behavior. The problem of calculating the critical indices for these different systems was simplified by using the renormalization group because it shows us that the different kinds of transitions such as liquid-gas, magnetic, alloy, and so on that have the same critical exponents experimentally can be understood in terms of the same fixed-point interaction that describes all these systems. In other words, the RG equations provide a mathematical framework that shows how and why these phenomena are related to each other.

While the notion of unification defined here is in terms of universality, the final question remains to be answered, namely, whether there is some notion of unification based on a connection between the phenomena in QFT and condensed matter physics that is elucidated via the renormalization group techniques. One possibility is to think of gauge theories characteristic of QFT as exhibiting different phases depending on the value of the parameters. Each phase is associated with a symmetry breaking in the same way that phase change in statistical physics is associated with the order parameter. In statistical physics nature presents us with a microscopic length scale. Cooperative phenomena near a critical point create a correlation length, and in the limit of the critical point the ratio of these two lengths tends to ∞. In QFT we introduce an upper limit Λ on the allowed momentum defined in terms of a microscopic length scale $h/2\pi\Lambda c$. The real physics is recovered in the limit in which the artificial scale is small compared to the Compton wavelength of the relevant particles. The ratio of the two scales, Λ/m, is tuned toward infinity. In that sense all relativistic QFTs describe critical points with associated fluctuations on arbitrarily many length scales (Weinberg, 1983). And, to that extent we can think of them together with those in condensed matter as exhibiting a kind of generic structure; a structure that is made more explicit as a result of the application of RG techniques. What RG does is expose physical structural similarities in the phenomena it treats.

As I said above, the unification associated with universal behavior is very different from what is normally understood when we think of unification in physics. But that is exactly the point I want to stress. Unification is a diverse notion that takes many different forms, some of which are linked with reduction while others are not. Indeed some speak against the very notion of reduction by showing that we can have a unity among phenomena that is completely unrelated to their underlying microstructure. Despite these various ways of understanding unification and the theoretical and experimental difficulties associated with theories of everything, unification remains the goal that drives most if not all of high-energy physics. The question of whether, how, and in what form that goal will be realized and how it relates to a unity in nature is an ongoing aspect of both physics research and philosophical inquiry.

References

Aitchison, I. J. R., and Hey, A. J. (1989). Gauge theories in particle physics. Bristol: Adam Hilger.
Anderson, P. (1972). More is different: Broken symmetry and the nature of the hierarchical structure of science. Science 177(4047): 393–396.
Appelquist, T., and Carazzone, J. (1975). Infrared singularities and massive fields. Physical Review D 11: 2856–2861.
Batterman, R. (2002). Devil in the details: Asymptotic reasoning in explanation. Oxford: Oxford University Press.
Brading, K., and Castellani, E. (2003). Symmetries in physics: Philosophical reflections. Cambridge: Cambridge University Press.
Cao, T. Y., and Schweber, S. (1993). The conceptual foundations and the philosophical aspects of renormalization theory. Synthese 97: 33–108.
Dine, M. (2007). Supersymmetry and string theory: Beyond the standard model. Cambridge: Cambridge University Press.
Einstein, A. (1952). On the electrodynamics of moving bodies. In The principle of relativity: A collection of original memoirs on the special and general theory of relativity, trans. W. Perrett and G. B. Jeffrey, 37–65. New York: Dover. (Originally published 1905).
Galison, P. (1987). How experiments end. Chicago: University of Chicago Press.
Gell-Mann, M., and Low, F. E. (1954). Quantum electrodynamics at small distances. Physical Review 95(5): 1300–1312.
Georgi, H. (1993). Effective field theory. Annual Review of Nuclear and Particle Science 43: 209–252.
Glashow, S. (1961). Partial symmetries of weak interactions. Nuclear Physics 22: 579–588.
Hartmann, S. (2001). Effective field theories, reduction and scientific explanation. Studies in History and Philosophy of Modern Physics 32B: 267–304.
Higgs, P. (1964a). Broken symmetries, massless particles and gauge fields. Physics Letters 12: 132–133.
Higgs, P. (1964b). Broken symmetries and masses of gauge bosons. Physical Review Letters 13: 508–509.
Kadanoff, L. P. (1966). Scaling laws for Ising models near $T_c$. Physics 2: 263.
Lagrange, J. L. (1788). Mécanique analytique. Paris.
Maudlin, T. (1996). On the unification of physics. Journal of Philosophy 93(3): 129–144.
Maxwell, J. C. (1873). Treatise on electricity and magnetism. 2 vols. Oxford: Clarendon Press. Reprinted 1954, New York: Dover.
Maxwell, J. C. (1965). The scientific papers of James Clerk Maxwell. 2 vols. Ed. W. D. Niven. New York: Dover.
Morrison, M. (1995). The new aspect: Symmetries as meta-laws. In Laws of Nature: Essays on the Philosophical, Scientific and Historical Dimension, ed. F. Weinert, 157–90. Berlin: De Gruyter.
Morrison, M. (2000). Unifying scientific theories: Physical concepts and mathematical structures. Cambridge: Cambridge University Press.
Morrison, M. (2008). Fictions, representation and reality. In Fictions in Science: Philosophical Essays on Modelling and Idealization, ed. Mauricio Suarez, 110–138. London: Routledge.
Pickering, A. (1984). Constructing quarks. Chicago: University of Chicago Press.
Quigg, C. (2009). Unanswered questions in the electroweak theory. Annual Review of Nuclear and Particle Science 59: 506–555.
Rovelli, C. (2007). Quantum gravity. Cambridge: Cambridge University Press.
Schwinger, J. (1957). A theory of fundamental interactions. Annals of Physics 2: 407–434.
Smolin, L. (2001). Three roads to quantum gravity. New York: Basic Books.
Weinberg, S. (1967). A model of leptons. Physical Review Letters 19: 1264–1266.
Weinberg, S. (1983). Why the renormalization group is a good thing. In Asymptotic realms of physics: Essays in honor of Francis Low, ed. A. Guth, K. Huang, and R. L. Jaffe, 1–19. Cambridge, MA: MIT Press.
Wilson, K. (1971). The renormalization group (RG) and critical phenomena 1. Physical Review B 4: 3174.
Wilson, K. (1975). The renormalization group: Critical phenomena and the Kondo problem. Reviews of Modern Physics 47: 773–839.
Yang, C. N., and Mills, R. L. (1954). Conservation of isotopic spin and isotopic gauge invariance. Physical Review 96: 191.
Zinn-Justin, J. (1998). Renormalization and renormalization group: From the discovery of UV divergences to the concept of effective field theories. In Proceedings of the NATO ASI on Quantum Field Theory: Perspective and Prospective, ed. C. de Witt-Morette and J.-B. Zuber, 375–388. Les Houches, France: Kluwer Academic Publishers, NATO ASI Series C 530.

Notes:

(1) The first unification in Maxwell's theory was in terms of a reduction of the electromagnetic and luminiferous aethers.

(2) For an extensive treatment of different types of unification in physics, as well as the way that mathematical structures are used as unifying tools in biology, see Morrison (2000). See also Maudlin (1996) for a discussion of unification in physics.

(3) Perhaps the most cited problem with string theory is that it has a huge number of equally possible solutions, called string vacua, that may be sufficiently diverse to explain almost any phenomena one might observe at lower energies. If so, it would have little or no predictive power for low-energy particle physics experiments. Other criticisms include the fact that it is background dependent, requiring a specific starting point. This is incompatible with general relativity, which is background independent. The problems associated with loop quantum gravity also involve computational difficulties in making predictions directly from the theory and the fact that its description of spacetime at the Planck scale has a continuum limit that is not compatible with general relativity. Obviously, there are many more detailed issues here that I have not mentioned. For more discussion, see Dine (2007) on string theory and supersymmetry and Rovelli (2007) on quantum gravity. See Smolin (2001) for a popular account of the latter.

(4) In order to analyze a physical problem, it is necessary to isolate the relevant details or choice of variables that will capture the physics one is interested in. Since this will involve separated energy scales, we can study low-energy dynamics independently of the details of high-energy interactions. The procedure is to identify the parameters that are very large (small) compared with the relevant energy scale of the physical system and put them to infinity (zero). We can then use this as an approximation that can be improved by adding corrections induced by the neglected energy scales as small perturbations.

(5) Maxwell (1965, 1: 564). The experimental facts concerned the induction of currents by increases or decreases in neighboring currents, the distribution of magnetic intensity according to variations of a magnetic potential, and the induction of statical electricity through dielectrics.

(6) For a more extensive discussion of this point, see Morrison (2008).

(7) Given a system described by n generalized coordinates $q_i$ and their velocities $\dot{q}_i$, along with purely holonomic constraints, d'Alembert's principle yields n equations of motion
$$\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_j} - \frac{\partial T}{\partial q_j} = Q_j,$$
where $T = \sum_i \tfrac{1}{2} m_i v_i^2$ is the kinetic energy and
$$Q_j = \sum_i \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}$$
is the generalized force corresponding to $q_j$. For a conservative system, the forces $\mathbf{F}_i$ may be written in terms of a potential function $V(\mathbf{r}_1, \mathbf{r}_2, \ldots)$, such that
$$\mathbf{F}_i = -\nabla_i V.$$
Therefore
$$Q_j = -\sum_i \nabla_i V \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} = -\frac{\partial V}{\partial q_j}.$$
The equations of motion become
$$\frac{d}{dt}\frac{\partial (T - V)}{\partial \dot{q}_j} - \frac{\partial (T - V)}{\partial q_j} = 0,$$
where we have made use of the fact that V depends only on the generalized coordinates q and not their velocities. This motivates the definition of the Lagrangian
$$L = T - V,$$
from which the Euler-Lagrange equations follow:
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0.$$
The utility of the Lagrangian approach is that, by virtue of d'Alembert's use of generalized coordinates, (holonomic) constraint forces do not appear explicitly.

(8) His attachment to the potentials as primary was also criticized, since virtually all theorists of the day believed that the potentials were simply mathematical conveniences having no physical reality whatsoever. To them, the force fields were the only physical reality in Maxwell's theory but the formulation in DT provided no account of this. Today, of course, we know in the quantum theory that it is the potentials that are primary, and the fields are derived from changes in the potentials.

(9) The methods used in “A Dynamical Theory” were extended and more fully developed in the Treatise on Electricity and Magnetism (TEM), where the goal was to examine the consequences of the assumption that electric currents were simply moving systems whose motion was communicated to each of the parts by certain forces, the