The Tyranny of Scales

Figure 7.5 Gambles within gambles within gambles …

Now why should she think that these two parameters—properties of collections of casinos offering different and varied kinds of games (roulette, poker, blackjack, slots, etc.)—are the correct ones with which to make the presentation to the board? Equivalently, why should she employ a Gaussian probability distribution (it is uniquely defined by the mean and variance) in the first place, as opposed to some other probability distribution? The answer is effectively provided by an RG argument analogous to that which allows us to determine the functional form of the order parameter M near criticality—namely, that it scales as $|t|^{\beta}$. It is an argument that leads us to expect behavior in accord with the central limit theorem.

There are deep similarities between the arguments for why the functional form with exponent β is universal and why Gaussian or central limiting behavior is so ubiquitous. In the former case, the RG demonstrates that various systems all flow to a single fixed point in an abstract space of Hamiltonians or coupling constants. That fixed point determines the universality class that is characterized by the scaling exponent β. Similarly, the Gaussian probability distribution is a fixed point for a wide class of probability distributions under a similar renormalization group transformation. (For details see Batterman 2010 and Sinai 1992.) Thus, the answer to why the mean μ and the variance σ² are the relevant parameters depends upon an RG, limiting argument. Generalizing, one should expect related argument strategies to tell us why the two elastic “constants” (related to Young's modulus) are the correct parameters with which to characterize the universality class of elastic solids.
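The claim that the Gaussian is a fixed point can be made concrete with a small sketch. It assumes one particular transformation that the text does not spell out: add two independent copies of a zero-mean random variable and rescale the sum by 1/√2. Under that assumed map the variance is unchanged while the excess kurtosis is halved at every step, so a non-Gaussian starting point drifts toward the Gaussian values. The function names and the uniform starting distribution are illustrative choices, not taken from Batterman (2010) or Sinai (1992).

```python
def rg_step(variance, fourth_moment):
    """One assumed RG-like step: X -> (X1 + X2) / sqrt(2) for i.i.d. zero-mean X.

    E[(X1 + X2)^2] = 2 * variance and E[(X1 + X2)^4] = 2 * m4 + 6 * variance^2,
    so dividing by 2 and by 4 gives the moments of the rescaled sum.
    """
    new_variance = variance
    new_fourth = fourth_moment / 2.0 + 1.5 * variance ** 2
    return new_variance, new_fourth


def excess_kurtosis(variance, fourth_moment):
    """Zero for a Gaussian; used here as a distance from the fixed point."""
    return fourth_moment / variance ** 2 - 3.0


def flow(variance, fourth_moment, steps):
    """Return the excess kurtosis after 0, 1, ..., steps iterations."""
    history = [excess_kurtosis(variance, fourth_moment)]
    for _ in range(steps):
        variance, fourth_moment = rg_step(variance, fourth_moment)
        history.append(excess_kurtosis(variance, fourth_moment))
    return history


# Illustrative start: uniform on [-sqrt(3), sqrt(3)], variance 1, fourth moment 9/5.
uniform_flow = flow(1.0, 9.0 / 5.0, steps=10)
# A Gaussian with variance 1 has fourth moment 3 and does not move.
gaussian_flow = flow(1.0, 3.0, steps=10)

if __name__ == "__main__":
    for step, kappa in enumerate(uniform_flow):
        print(step, kappa)
```

Tracking only the variance and the fourth moment keeps the sketch exact and deterministic; a fuller treatment would follow the whole distribution, which is where the cited references should be consulted.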
The appeal to something like central limiting behavior is characteristic of homogenization theory and distinguishes this line of argumentation from that employing REV averaging techniques. In fact, the difference between averaging and homogenization is related to the difference between the law of large numbers and the central limit theorem: averaging or first order perturbation theory “can often be thought of as a form (or consequence) of the law of large numbers.” Homogenization or second order perturbation theory “can often be thought of as a form (or consequence) of the central limit theorem” (Pavliotis and Stuart 2008, pp. 6–7).

Here is a brief discussion that serves to motivate these connections. Consider a sum function of independent and identically distributed random variables, $Y_i$: $S(n) = \sum_{i=1}^{n} Y_i$. The sample average $\bar{S}(n) = \frac{1}{n}\sum_{i=1}^{n} Y_i$ converges to the mean or expected value μ. The strong law of large numbers asserts that $\bar{S}(n) \to \mu$ almost surely as $n \to \infty$. As such it tells us about the first moment of the random variable $\bar{S}(n)$—the average. The central limit theorem by contrast tells us about the second moment of the normalized sum $\bar{S}(n)$; that is, it tells us about the behavior of fluctuations about the average μ. It says that for $n \to \infty$ the probability distribution of $\sqrt{n}\,(\bar{S}(n) - \mu)$ converges to the normal or Gaussian distribution $N(0, \sigma^2)$, with mean 0 and variance $\sigma^2$, where σ is the standard deviation of the $Y_i$'s.24
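The contrast between the two limit theorems can be checked numerically. The following is a minimal sketch, assuming fair die rolls as the variables $Y_i$ (so μ = 3.5 and σ² = 35/12); the die, the sample sizes, and the function names are illustrative assumptions and do not come from the text.

```python
import random
import statistics

MU = 3.5            # mean of one fair die roll
SIGMA2 = 35.0 / 12  # variance of one fair die roll


def sample_average(n, rng):
    """Average of n fair die rolls, playing the role of the sample average."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


def normalized_fluctuations(n, trials, rng):
    """Samples of sqrt(n) * (average - mu), the quantity in the central limit theorem."""
    return [n ** 0.5 * (sample_average(n, rng) - MU) for _ in range(trials)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Law of large numbers: the average itself settles near mu.
long_run_average = sample_average(100_000, rng)

# Central limit theorem: the fluctuations about mu have variance near sigma^2.
fluct = normalized_fluctuations(n=400, trials=2_000, rng=rng)
fluct_mean = statistics.mean(fluct)
fluct_var = statistics.pvariance(fluct)

if __name__ == "__main__":
    print("long-run average:", long_run_average)
    print("fluctuation mean:", fluct_mean, "variance:", fluct_var)
```

The first quantity concerns only where the average ends up; the second concerns the spread around it, which is the information the averaging picture discards.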
Thus again we see that in the probabilistic scenario, as in the case of critical phenomena, we must pay attention to the fact that collections of gambles (bubbles) contribute to the behavior of the system at the macroscale. Once again, we need to pay attention to fluctuations about some average behavior, and not just the average behavior itself. Furthermore, a similar picture is possible regarding the upscaling of our modeling of the behavior of the steel girder with which we started. Compare the two cases, figure 7.6, noting that here too only a small number of phenomenological parameters are needed to model the continuum/macroscale behavior. (E is Young's modulus and I is the area moment of inertia of a cross-section of the girder.)

The general problem of justifying the use of Euler's continuum recipe to determine the macroscopic equation models involves connecting a statistical/discrete theory in terms of atoms or lattice sites to a hydrodynamic or continuum theory. Much effort has been spent on this problem by applied mathematicians and materials scientists. And, as I mentioned above, the RG argument that effectively determines the continuum behavior of systems near criticality is a relatively simple example of this general homogenization program.

Figure 7.6 Gaussian and steel—few (macro) parameters: [μ, σ²]; [E, I]

In hydrodynamics, for example Navier-Stokes theory, there appear density functions, ρ(x), that are defined over a continuous variable x. These functions exhibit no atomic structure at all. On the other hand, for a statistical theory, such as the Ising model of a ferromagnet, we have seen that one defines an order parameter (a magnetic density function) M(x) that is the average magnetization in a volume surrounding x that contains many lattice sites or atoms.
The radius of the volume, L, is intermediate between the lattice constant (or atomic spacing) and the correlation length ξ: (a ≪ L ≪ ξ). As noted in section 2.1 this makes the order parameter depend upon the length L (Wilson 1974, p. 123). A crucial difference between the hydrodynamic (thermodynamic) theory and the statistical theory is that the free energy in the former is determined using the single magnetization function M(x). In statistical mechanics, on the other hand, the free energy is “a weighted average over all possible forms of the magnetization M(x)” (Wilson 1974, p. 123). This latter set of functions is parameterized by the volume radius L. On the statistical theory due originally to Landau, the free energy defined as a function of M(x) takes the following form: (9) where R and U are (temperature dependent) constants and B is a (possibly absent) external magnetic field (Wilson 1974, p. 122). This (mean field) theory predicts the wrong value, 1/2, for β, the critical exponent. The problem, as diagnosed by Wilson, is that while the Landau theory can accommodate fluctuations for lengths λ < L in its definition of M as an average, it cannot accommodate fluctuations of lengths L or greater.

A sure sign of trouble in the Landau theory would be the dependence of the constants R and U on L. That is, suppose one sets up a procedure for calculating R and U which involves statistically averaging over fluctuations with wavelengths λ < L. If one finds R and U depending on L, this is proof that long-wavelength fluctuations are important and Landau's theory must be modified. (p. 123)
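The mean field value β = 1/2 can be recovered numerically. This is a minimal sketch under stated assumptions, not a reproduction of equation (9): it takes a spatially uniform magnetization, no external field (B = 0), and the textbook Landau form f(M) = R M² + U M⁴ with R = r0 · t linear in the reduced temperature t and U a positive constant. The parameter names `r0` and `u`, the grid search, and the chosen temperatures are all illustrative.

```python
import math


def landau_free_energy(m, t, r0=1.0, u=1.0):
    """Assumed uniform Landau form f(M) = R M^2 + U M^4 with R = r0 * t and B = 0."""
    return r0 * t * m ** 2 + u * m ** 4


def equilibrium_magnetization(t, r0=1.0, u=1.0, grid=200_000, m_max=2.0):
    """Locate the non-negative minimizer of f by brute-force grid search."""
    best_m, best_f = 0.0, landau_free_energy(0.0, t, r0, u)
    for k in range(1, grid + 1):
        m = m_max * k / grid
        f = landau_free_energy(m, t, r0, u)
        if f < best_f:
            best_m, best_f = m, f
    return best_m


def fitted_exponent(temperatures):
    """Least-squares slope of log M against log |t|: an estimate of beta."""
    xs = [math.log(abs(t)) for t in temperatures]
    ys = [math.log(equilibrium_magnetization(t)) for t in temperatures]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Temperatures below the critical point (t < 0), where M is nonzero.
beta_estimate = fitted_exponent([-0.4, -0.2, -0.1, -0.05, -0.025])

if __name__ == "__main__":
    print("fitted beta:", beta_estimate)
```

Because R and U are held fixed here, the sketch has no way to register the L-dependence that Wilson identifies as the sign of trouble; it only shows where the value 1/2 comes from.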
The RG account enables one to exploit this L-dependence and eventually derive differential equations (RG) for R and U as functions of L that allow for the calculation of the exponent β in agreement with experiment. The key is to calculate and compare the free energy for different averaging sizes L and L + δL. One can proceed as follows25: Divide M(x) in the volume element into two parts: (10) $M_H$ is a hydrodynamic part with wavelengths of order ξ and $M_{fl}$ is a fluctuating part with wavelength between L and L + δL. The former will be effectively constant over the volume. By performing a single integral over m—the scale factor in (10)—we get an iterative expression for the free energy for the averaging size L + δL, $F_{L+\delta L}$, in terms of the free energy for the averaging size L: (11) In effect, one finds a step by step way to include all the fluctuations—all the physics—that play a role near criticality. One moves from a statistical theory defined over finite N and dependent on L to a hydrodynamic theory of the continuum behavior at criticality. “Including all of the physics” means that the geometric structure of the bubbles within bubbles picture gets preserved and exploited as one upscales from the finite discrete atomistic account to the continuum model at the scale of ξ—the size of the system. That is exactly the structure that is wiped out by the standard REV averaging, and it is for that reason that Landau's mean field theory failed.

5.1 Homogenization

Continuum modeling is concerned with the effective properties of materials that, in many instances, are microstructurally heterogeneous. These microstructures, as noted, are not always to be identified with atomic or lowest scale “fundamental” properties of materials.
Simple REV averaging techniques often assume something like that, but in general the effective, phenomenological properties of materials are not simple mixtures of volume fractions of different composite phases or materials. Many times the microstructural features are geometric or topological including (in addition to volume fractions) “surface areas of interfaces, orientations, sizes, shapes, spatial distributions of the phase domains; connectivity of the phases; etc.” (Torquato 2002, p. 12). In trying to bridge the scales between the atomic domain and that of the macroscale, one needs to connect rapidly varying local functions of the different phases to differential equations characterizing the system at much larger scales. Homogenization theory accomplishes this by taking limits in which the local length (small length scale) of the heterogeneities approaches zero in a way that preserves (and incorporates) the topological and geometric features of the microstructures.

Most simply, and abstractly, homogenization theory considers systems at two scales: ξ, a macroscopic scale characterizing the system size, and a microscopic scale, a, associated with the microscale heterogeneities. There may also be applied external fields that operate at yet a third scale Λ. If the microscale, a, is comparable with either ξ or Λ, then the modeler is stuck trying to solve equations at that smallest scale. However, as is often the case, if a ≪ Λ ≪ ξ, then one can introduce a parameter that is associated with the fluctuations at the microscale of the heterogeneities—the local properties (Torquato 2002, pp. 305–6). In effect, then, one looks at a family of functions $u_\varepsilon$ and searches for a limit $u = \lim_{\varepsilon \to 0} u_\varepsilon$ that tells us what the effective properties of the material will be at the macroscale.
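A standard textbook case shows why the effective property need not be a volume-fraction average. The sketch below assumes a one-dimensional layered conductor made of two phases with conductivities `k1` and `k2`; the layered geometry, the numerical values, and the function names are illustrative assumptions and are not drawn from the cited pages of Torquato (2002). Refining the layering (more periods in the same length) stands in for letting the microscale shrink.

```python
def series_effective_conductivity(k1, k2, phi1, periods, length=1.0):
    """Effective conductivity for current flowing across alternating layers.

    The slab of total `length` is cut into `periods` identical cells; each cell
    holds a layer of phase 1 (fraction phi1) followed by a layer of phase 2.
    Layers in series add their resistances (thickness / conductivity).
    """
    cell = length / periods
    resistance = 0.0
    for _ in range(periods):
        resistance += phi1 * cell / k1
        resistance += (1.0 - phi1) * cell / k2
    return length / resistance


def parallel_effective_conductivity(k1, k2, phi1):
    """Effective conductivity for current flowing along the layers.

    Layers in parallel add their conductances, giving the volume-weighted mean.
    """
    return phi1 * k1 + (1.0 - phi1) * k2


k1, k2, phi1 = 1.0, 100.0, 0.5
across = [series_effective_conductivity(k1, k2, phi1, n) for n in (1, 10, 1000)]
along = parallel_effective_conductivity(k1, k2, phi1)
harmonic_mean = 1.0 / (phi1 / k1 + (1.0 - phi1) / k2)

if __name__ == "__main__":
    print("across layers:", across)
    print("along layers:", along)
    print("harmonic mean:", harmonic_mean)
```

With identical volume fractions, the two orientations give very different effective values, and in this simple geometry the across-layer value does not change as the layering is refined. Orientation, a geometric feature, fixes the macroscale parameter.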
Figure 7.7 Homogenization limit (after Torquato 2002, pp. 2, 305–6)

Figure 7.7 illustrates this. The left box shows the two scales a and ξ with two phases of the material $K_1$ and $K_2$. The homogenization limit enables one to treat the heterogeneous system at scale a as a homogeneous system at scale ξ with an effective material property represented by $K_e$. For an elastic solid like the steel girder, $K_e$ would be the effective stiffness tensor and is related experimentally to Young's modulus. For a conductor, $K_e$ would be the effective conductivity tensor that is related experimentally to the parameter σ—the specific conductance—appearing in Ohm's law: $J(x) = \sigma E(x)$, where J is the current density at a given location x in the material and E is the electric field at x. At the risk of being overly repetitive, note that in these and other cases, it is unlikely that the effective material property $K_e$ will be a simple average.

Let me end this brief discussion of homogenization by highlighting what I take to be a very important concept for the general problem of upscaling. This is the concept of an order parameter and related functions. The notion of an order parameter was introduced in our discussion of continuous phase transitions in thermodynamics, and the statistical mechanical explanations of certain of their features. In effect, the order parameter is a microstructure (mesoscopic scale) dependent function introduced to codify the phenomenologically observed transition between different states of matter. As we have seen, the magnetization M represented in figure 7.2 is introduced to reflect the fact that at the Curie temperature the system goes from an unordered phase, above $T_c$, to an ordered phase, below $T_c$. In this context, the divergences and nonanalyticities at the critical point play an essential role in determining the fixed point that characterizes the class of systems exhibiting the same scaling behavior: $M \propto |t|^{\beta}$.

But, again following Nelson's suggestion, entire classes of systems such as the class of linear elastic solids are also characterized by “fixed points” represented by a relatively few phenomenological parameters like Young's modulus. It is useful to introduce an order-like parameter in this more general context of upscaling where criticality is not really an issue. For example, consider the left image in figure 7.7. In upscaling to get to the right image, one can begin by defining indicator or characteristic functions for the different phases as a function of spatial location (Torquato 2002, pp. 24–5). For instance, if the shaded phase occupies a region $U_s$ in the space, then an indicator function of that phase is given by the function that takes the value 1 if $x \in U_s$ and 0 otherwise. One can also introduce indicator functions for the interfaces or boundaries between the two phases.26 Much information can then be determined by investigating n-point probability functions expressing the probabilities that n locations $x_1, \ldots, x_n$ are to be found in regions occupied by the shaded phase.27 In this way many features, other than simple volume fraction, that exist at microscales can be represented and employed in determining the homogenization limit for complex heterogeneous systems.

The introduction of such field variables, correlation functions, etc., allows us to characterize the heterogeneous structures above the atomic scales. In some cases, such as the bubbles within bubbles structure of the different phases at a continuum phase transition, much of this additional apparatus will not be necessary. (Though, of course, it is essential to take into consideration that structure in that particular case.) But for many more involved upscaling problems such as steel,
the additional mathematical apparatus will be critical in determining the appropriate effective phenomenological theory at the continuum level. As we have seen, these microstructures are critical for an understanding of how the phenomenological parameters at the continuum scale emerge.

The main lesson to take from this all-too-brief discussion is that physics at these micro/meso-scopic scales needs to be considered. Bottom-up modeling of systems that exist across a large range of scales is not sufficient to yield observed properties of those systems at higher scales. Neither is complete top-down modeling. After all, we know that the parameters appearing in continuum models must depend upon details at lower scale levels. The interplay between the two strategies—a kind of mutual adjustment in which lower scale physics informs upper scale models and upper scale physics corrects lower scale models—is complex, fascinating, and unavoidable.

6. Conclusion

The solution to the tyranny of scales problem has been presented as one of seeing if it is possible to exploit microstructural scale information (intermediate between atomic scales and macroscopic scales) to bridge between two dominant and apparently incompatible modeling strategies. These are the traditional bottom-up strategies associated with a broadly reductionist account of science and pure top-down strategies that held sway in the nineteenth century and motivated the likes of Mach, Duhem, Maxwell, and others. Despite great progress in understanding the physics of atomic and subatomic particles, the persistence of continuum modeling has led to heated debates in philosophy about emergence, reduction, realism, etc. We have canvassed several different attitudes to the apparent ineliminability of continuum level modeling in physics.
On the one hand, there is the view of Butterfield and others, that the use of continuum limits represents nothing more than a preference for the mathematical convenience of the infinite. Another possible view, coming out of the tyranny of scales, suggests a kind of skepticism: we need both atomic scale models and continuum scale models that essentially employ infinite idealizations. However, a unified account of applied mathematics that incorporates both the literally correct atomic models and the essentially idealized continuum models seems to be beyond our reach.28

I claim that neither of these attitudes is ultimately acceptable. Butterfield et al. are wrong to believe that continuum models are simply mathematical conveniences posing no real philosophical concerns. This position fails to respect some rather deep differences between kinds of continuum modeling. In particular, the strategies employed in the renormalization group and in homogenization theory differ significantly from those employed in standard representative elementary volume (REV) averaging scenarios. The significance of Wilson's renormalization group advance was exactly to point out why such REV methods fail and how that failure can be overcome. The answer, as we have seen, is to pay attention to “between” scale structures as in the case of the bubbles within bubbles picture of what happens at phase transitions. Incorporating such structures—features that cannot be understood as averages over atomic level structures—is exactly the strategy behind upscaling attempts that connect Euler-type discrete modeling recipes to Euler-type continuum recipes. Homogenization lets us give an answer to why the use of the continuum recipe is safe and robust. It provides a satisfactory justification for the use of such continuum models, but not one that is “straightforward” or pragmatically motivated.
As such, homogenization provides the beginning of an account of applied mathematics that unifies the radically different scale-dependent modeling strategies.

I have also tried here to focus attention on a rather large subfield of applied mathematics that should be of interest to philosophers working on specific issues of modeling, simulation, numerical methods, and idealizations. In addition, understanding the nature of materials in terms of homogenization strategies can inform certain questions about the nature of physical properties and issues about realism. For instance, we have seen that many materials at macroscales are characterized by a few phenomenological parameters such as the elastic constants. Understanding the nature of materials requires understanding why these constants and not others are appropriate, as well as understanding from where the constants arise. One important lesson is that many of these material defining parameters are not simply dependent upon the nature of the atoms that compose the material. There is a crucial link between structure at intermediate scales and observed properties at the macroscale. It may do to end with a nice statement (partially cited earlier) from Rob Phillips's excellent book Crystals, Defects, and Microstructures (2001) expressing this point of view.

Despite the power of the idea of a material parameter, it must be greeted with caution. For many features of
materials, certain “properties” are not intrinsic. For example, both the yield strength and fracture toughness of a material depend upon its internal constitution. That is, the measured material response can depend upon microstructural features such as the grain size, the porosity, etc. Depending upon the extent to which the material has been subjected to prior working and annealing, these properties can vary considerably. Even a seemingly elementary property such as the density can depend significantly upon that material's life history. The significance of the types of observations given above is the realization that many material properties depend upon more than just the identity of the particular atomic constituents that make up that material.…[M]icrostructural features such as point defects, dislocations, and grain boundaries can each alter the measured macroscopic “properties” of a material. (pp. 5–8)

Philosophers who insist that bottom-up explanations of the macroscopic properties of materials are desirable to the exclusion of top-down modeling considerations are, I think, being naive, similar to those who maintain that top-down continuum type modeling strategies are superior. The tyranny of scales appears to force us to choose between these strategies. However, new work on understanding the problem of upscaling or modeling across scales suggests that both types of strategies are required. Our top-down considerations will inform the construction of models at lower scales. And our bottom-up attempts will likewise induce changes and improvements in the construction of higher scale models. Mesoscopic structures cannot be ignored and, in fact, provide the bridges that allow us to model across scales.

References

Sorin Bangu. Understanding thermodynamic singularities: Phase transitions, data, and phenomena. Philosophy of Science, 76(4):488–505, 2009.

Robert W. Batterman. Theories between theories: Asymptotic limiting intertheoretic relations. Synthese, 103:171–201, 1995.

Robert W. Batterman. Intertheory Relations in Physics. The Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/physics-interrelate/, 2001.

Robert W. Batterman. The Devil in the Details: Asymptotic Reasoning in Explanation, Reduction, and Emergence. Oxford Studies in Philosophy of Science. Oxford University Press, New York, 2002.

Robert W. Batterman. Critical phenomena and breaking drops: Infinite idealizations in physics. Studies in History and Philosophy of Modern Physics, 36:225–244, 2005a.

Robert W. Batterman. Response to Belot's “Whose devil? Which details?”. Philosophy of Science, 72(1):154–163, 2005b.

Robert W. Batterman. Encyclopedia of Philosophy, chapter Reduction. Macmillan Reference, Detroit, 2nd edition, 2006.

Robert W. Batterman. On the explanatory role of mathematics in empirical science. The British Journal for the Philosophy of Science, doi:10.1093/bjps/axp018, 1–25, 2009.

Robert W. Batterman. Reduction and renormalization. In Gerhard Ernst and Andreas Hüttemann, editors, Time, Chance, and Reduction: Philosophical Aspects of Statistical Mechanics. Cambridge University Press, Cambridge, 2010.

Robert W. Batterman. Emergence, singularities, and symmetry breaking. Foundations of Physics, 41(6):1031–1050, 2011.

Gordon Belot. Whose devil? Which details? Philosophy of Science, 71(1):128–153, 2005.

Jeremy Butterfield. Less is different: Emergence and reduction reconciled. Foundations of Physics, 41(6):1065–1135, 2011a.

Jeremy Butterfield and Nazim Bouatta. Emergence and reduction combined in phase transitions. http://philsci-archive.pitt.edu/id/eprint/8554, 2011b.

Reint de Boer. Theory of Porous Media: Highlights in Historical Development and Current State. Springer, Berlin, 2000.

R. Feynman, R. Leighton, and M. Sands. The Feynman Lectures on Physics, volume 2. Addison-Wesley, Reading, Massachusetts, 1964.

Ulrich Hornung, editor. Homogenization and Porous Media, volume 6 of Interdisciplinary Applied Mathematics. Springer, New York, 1997.

Leo P. Kadanoff. Scaling, universality, and operator algebras. In C. Domb and M. S. Green, editors, Phase Transitions and Critical Phenomena, volume 5A. Academic Press, San Diego, 1976.

Leo P. Kadanoff. Statistical Physics: Statics, Dynamics, and Renormalization. World Scientific, Singapore, 2000.

Penelope Maddy. How applied mathematics became pure. The Review of Symbolic Logic, 1(1):16–41, 2008.

Tarun Menon and Craig Callender. Turn and face the strange … Ch-Ch-Changes: Philosophical questions raised by phase transitions, This Volume 2012.

Ernest Nagel. The Structure of Science: Problems in the Logic of Scientific Explanation. Harcourt, Brace, & World, New York, 1961.

David R. Nelson. Defects and Geometry in Condensed Matter Physics. Cambridge University Press, Cambridge, 2002.

John Norton. Approximation and idealization: Why the difference matters. http://philsci-archive.pitt.edu/8622/, 2011.

J. T. Oden (Chair). Simulation based engineering science—an NSF Blue Ribbon Report. www.nsf.gov/pubs/reports/sbes_final_report.pdf, 2006.

Grigorios A. Pavliotis and Andrew M. Stuart. Multiscale Methods: Averaging and Homogenization. Texts in Applied Mathematics. Springer, New York, 2008.

Rob Phillips. Crystals, Defects, and Microstructures: Modeling across Scales. Cambridge University Press, Cambridge, 2001.

Ya G. Sinai. Probability Theory: An Introductory Course. Springer-Verlag, Berlin, 1992. Trans.: D. Haughton.

Lawrence Sklar. Physics and Chance: Philosophical Issues in the Foundations of Statistical Mechanics. Cambridge University Press, Cambridge, 1993.

Isaac Todhunter and Karl Pearson (Ed.). A History of the Theory of Elasticity and of the Strength of Materials from Galilei to Lord Kelvin, volume 1: Galilei to Saint-Venant 1639–1850. Dover, New York, 1960.

Salvatore Torquato. Random Heterogeneous Materials: Microstructure and Macroscopic Properties. Springer, New York, 2002.

Kenneth G. Wilson. Critical phenomena in 3.99 dimensions. Physica, 73:119–128, 1974.

Mark Wilson. Mechanics, classical. In Routledge Encyclopedia of Philosophy. Routledge, London, 1998.

Notes:

(1) For related discussions, see Mark Wilson's forthcoming Physics Avoidance and Other Essays.

(2) See Maddy (2008) for a forceful expression of this skeptical worry.
(3) Those who think that the renormalization group provides a bottom-up explanation of the universality of critical phenomena, e.g. Norton (2011), are mistaken, as we shall see below.

(4) “Local” in the sense that the invariance holds for scales of several orders of magnitude but fails to hold if we zoom in even further, using x-ray diffraction techniques, for example.

(6) I call these intermediate scales “microscales” and the structures at these scales “microstructures” following the practice in the literature, but it may be best to think of them as “mesoscopic.”

(7) These latter are transformations that take place under cooling when a relatively high symmetry lattice such as one with cubic symmetry loses symmetry to become tetragonal. Some properties of steel girders therefore depend crucially on dynamical changes that take place at scales in between the atomic and the macroscopic (Phillips 2001, pp. 547–8).

(8) Though simpler than the case of understanding how atomic aspects of steel affect its phenomenological properties, this is, itself, a difficult problem for which a Nobel prize was awarded.

(9) This is the limit in which the number of particles N in a system approaches infinity in such a way that the density remains constant—the volume has to go to infinity at the same time as the number of particles.

(10) See Batterman 2001 and 2006 for surveys of this and more sophisticated strategies.

(11) In the present example, it is hard indeed to see how to define or identify a nonstatistical quantity such as temperature or pressure in thermodynamics with a necessarily statistical quantity or set of quantities in the reducing statistical mechanics. (See Sklar 1993, Chapter 9.)

(12) I believe that the use of the evaluative terms “better,” “worse,” and “tainted” reflects an inherent prejudice against nonreductionist points of view.
In particular, as one of the issues is whether a more detailed (atomic) theory is really better for explanatory, predictive, and modeling concerns, this way of speaking serves to block debate before it can get started.

(13) I have taken this terminology from Hornung (1997, p. 1).

(14) See Kadanoff (2000) and Batterman (2002, 2005, 2011) for details.

(15) Systems above the critical temperature will also appear homogeneous as the spins will be uncorrelated, randomly pointing up and down.

(16) Thanks to Mark Wilson for the colorful terminology!

(17) See Phillips (2001).

(18) Note that in continuum mechanics, generally, a material point or “material particle” is not an atom or molecule of the system; rather it is an imaginary region that is large enough to contain many atomic subscales (whether or not they really exist) and small enough relative to the scale of field variables characterizing the impressed forces. Of course, as noted, Navier's derivation did make reference to atoms.

(19) I have fixed a typographical error in these equations.

(20) See Todhunter and Pearson (1960, pp. 224 and 235–27) for details. Note also how this limiting assumption yields different and correct results in comparison to the finite atomistic hypotheses.

(21) Cited in Todhunter and Pearson (1960, p. 495).

(22) This is the temptation promised by an ultimate reductionist point of view.

(23) See Phillips (2001, pp. 41–42).

(24) Proofs of the central limit theorem that involve moment generating functions M(t) for the component random variables $Y_i$ make explicit that there is an asymptotic expansion in a small parameter t, where truncation of the series at first order gives the mean, and truncation of the series at second order gives the fluctuation term. Hence
the connection between these limit theorems and first and second order perturbation theory. In fact, two limits are involved: the limit as the small parameter t → 0 and the limit n → ∞.

(25) Details in Wilson (1974, pp. 125–27).

(26) These will be generalized distribution functions.

(27) See Torquato (2002) for a detailed development of this approach.

(28) See Maddy (2008) for a good discussion of this point of view among other interesting topics about the applicability of mathematics.

Robert Batterman

Robert Batterman is Professor of Philosophy at The University of Pittsburgh. He is a Fellow of the Royal Society of Canada. He is the author of _The Devil in the Details: Asymptotic Reasoning in Explanation, Reduction, and Emergence_ (Oxford, 2002). His work in philosophy of physics focuses primarily upon the area of condensed matter broadly construed. His research interests include the foundations of statistical physics, dynamical systems and chaos, asymptotic reasoning, mathematical idealizations, the philosophy of applied mathematics, explanation, reduction, and emergence.
Symmetry

Sorin Bangu

The Oxford Handbook of Philosophy of Physics
Edited by Robert Batterman

Abstract and Keywords

This chapter, which provides a broad and comprehensive survey of concepts of symmetry and invariance, discusses the classification of symmetries and analyzes continuous symmetries and the so-called gauge argument. It describes a concrete situation in the 1960s where symmetry arguments led to the predictions of elementary particles. The chapter also examines the connection between symmetries and laws, and evaluates the idea that certain symmetries can be considered “superprinciples.”

Keywords: symmetry, invariance, continuous symmetries, gauge argument, elementary particles, superprinciples

1. Introduction

Always a fascination for the human mind, symmetry plays a fundamental role in modern physics. Ancient Greek thinkers introduced the concept (and the word, συμμετρια) from which the modern one derives;1 their interest, however, was mainly aroused by the aesthetic connotations attached to this notion (harmony, good proportion, unity). Later on symmetry considerations became instrumental in the domain of mathematized physical science. The work of Johannes Kepler illustrates the lure of mathematical harmony exemplarily: he famously attempted, in his Mysterium Cosmographicum (1596), to devise a theoretical model of the solar system by drawing inspiration from geometrical relations. Kepler was particularly impressed by the austere beauty of the image of the spherical shells inscribed within, and circumscribed around, the five Platonic solids (tetrahedron, cube, octahedron, icosahedron, and dodecahedron).
More precisely, he sought to demonstrate a correspondence between the distances of the planets from the sun and the radii of these shells; he was convinced that God's blueprint of the universe reflects an agreement between the observed ratios of the maximum and minimum radii of the planets and the geometrical ratios calculated for the nested Platonic solids. Yet, as the celebrated physicist Freeman Dyson once noted, “This model is a supreme example of misguided mathematical intuition” (1964, 130). The sought correlations did not exist, since there are discrepancies between the predictions and the observed data. A similar belief in the perfection of the circle also hindered Kepler's efforts to find the correct (elliptical) orbits of the planets. While this type of belief has sometimes led researchers astray, it is beyond doubt that it has very often helped them a great deal. It appears that Kepler's confidence that the universe was created following mathematical symmetry principles was in the end useful in discovering the laws of planetary motion. In fact, it turns out that his second law2 does establish an important astronomical relation that is grounded on a symmetry (the time-invariance of angular momentum). An aesthetic notion of symmetry certainly played a role in the early evolution of physics, since cases other than Kepler's can be easily documented. But it is the more precise notion of symmetry as invariance that is truly fundamental for the modern period. At an intuitive level we speak of symmetrical geometrical figures, or symmetrical concrete shapes, such as snowflakes; what we mean, however, is that they have invariance
properties. For instance, a square has the property that there are ways to manipulate it such that the end result of the manipulation is a square identical to the initial one. Such transformations can be a reflection in one of its diagonals, or a clockwise 90° rotation about its center; by comparison, a circle is invariant under arbitrary amounts of rotation about its center. Rotations and reflections of a geometrical figure are just a particular type of transformation, performed on a particular kind of entity. As has been observed, this idea can be generalized naturally: in addition to visualizable, concrete entities (such as squares, disks, or snowflakes), abstract entities can be manipulated too, and found to be invariant. For instance, the relations holding within a certain configuration of objects can be invariant under permuting the objects or under uniformly shifting the objects’ positions in space; also, the form of a mathematical expression can be invariant under certain mathematical transformations. Once this generalization was effected, the next major conceptual advance in the study of invariance was the observation that one can consider the set of all transformations that leave an entity (either abstract or concrete) unchanged, and then define the operation of composition of transformations on this set. A simple example is the set of transformations that leave the square invariant, together with the composition operation. There are eight such transformations: four clockwise rotations (by 90°, 180°, 270°, and 360°) and four reflections (two in the diagonals and two in the lines joining the midpoints of opposite sides).
This set, call it R, is closed under successive composition of these transformations: performing two successive transformations (either rotations or reflections) that leave the square invariant is equivalent to performing one (either rotation or reflection), with this resultant transformation being included in R too. Moreover, there is a special member of this set, a clockwise rotation of 360°, which has no compositional effect when following or preceding other transformations from the set. And, for any transformation t in R there is another transformation in the set that can be composed with t to produce the effect of a 360° rotation. Finally, we note that the operation of composition is also associative. The set R and the operation of composition of transformations form a specific mathematical structure, called a group. In this case, the group is called the symmetry group of the square. As we have seen, we can manipulate both concrete and abstract entities. This latter kind of manipulation is relevant to physics: if the invariant object is a certain mathematical function associated with the evolution of a physical system, namely the Lagrangian (or, more precisely, its time integral, the action), then important consequences follow from its invariance. They have to do with the existence of conservation laws, in a way made precise by Noether's theorem (discussed in section 3). This generalization is in the spirit of Hermann Weyl's remark that the deeper significance of symmetry for modern physics comes from the fact that “we no longer seek this harmony in static forms like regular solids, but in dynamic laws” (Weyl 1952, 77). Group theory is the branch of mathematics that studies the most general properties of structures like R.
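The group properties claimed for the symmetries of the square can be checked mechanically. The following is a minimal sketch, not taken from the chapter: it assumes the square's corners are labelled 0 to 3 going clockwise and represents each transformation as a permutation of corner positions. The names `rotation`, `reflection`, `compose`, and `is_group` are illustrative choices.

```python
from itertools import product

N = 4  # corners of the square, labelled 0..3 going clockwise (an assumed labelling)

def rotation(k):
    """Clockwise rotation by k * 90 degrees, as a permutation of corner positions."""
    return tuple((i + k) % N for i in range(N))

def reflection(k):
    """One of the four reflections: even k gives a diagonal, odd k a midline."""
    return tuple((k - i) % N for i in range(N))

def compose(first, second):
    """Perform `first`, then `second`."""
    return tuple(second[first[i]] for i in range(N))

# The set R of the text: four rotations and four reflections.
R = {rotation(k) for k in range(N)} | {reflection(k) for k in range(N)}
identity = rotation(0)  # has the same effect as the 360-degree rotation

def is_group(elements, e):
    """Check closure, identity, inverses, and associativity by brute force."""
    closed = all(compose(a, b) in elements for a, b in product(elements, repeat=2))
    has_identity = all(compose(a, e) == a == compose(e, a) for a in elements)
    has_inverses = all(any(compose(a, b) == e for b in elements) for a in elements)
    associative = all(
        compose(compose(a, b), c) == compose(a, compose(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    return closed and has_identity and has_inverses and associative

print(len(R), is_group(R, identity))  # prints: 8 True
# Order matters: a rotation followed by a reflection differs from the reverse.
print(compose(rotation(1), reflection(0)) == compose(reflection(0), rotation(1)))
```

The last line illustrates that this particular group is not commutative, a feature the text does not need but which the representation makes easy to see.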
Unlike that of analysis or differential geometry, the connection of this mathematical theory to physics was underappreciated until the beginning of the twentieth century.3 But, as has been pointed out (Brading and Castellani 2003, 4–5; 2007, sect. 5), earlier attempts to link physics and mathematical transformations can be documented. The case they discuss is that of C. G. Jacobi's canonical transformation theory, developed in the context of the “analytical” version of classical mechanics elaborated by Lagrange, d'Alembert, Liouville, Poisson, Hamilton, and others. The essence of the difficulty confronting the analytical formulation of mechanics was that the canonical Hamiltonian equations of motion could not be integrated directly. For conservative systems, what was needed was a (canonical) transformation that would turn the Hamiltonian into a function of new variables, the goal being to transform the equations of motion into equivalent ones, and thus the initial problem into an equivalent but simpler one (i.e., one in which the canonical equations, in the new system of coordinates, can be integrated).4 What grounds this approach is the powerful methodological symmetry principle “Same problem, same solution!” (discussed in van Fraassen (1991, 25) and (1989, ch. 10)). This transformational strategy is the conceptual ancestor of the line of research pursued by the famous Göttingen mathematicians Felix Klein, Hermann Weyl, David Hilbert, and Emmy Noether, who pioneered a new way to conceive of the aims and methods of physical science: as the study of the invariant properties of theories. The goal of the following survey is to highlight some of the themes, problems, and arguments that justify viewing symmetry and invariance as important topics in physics and in the philosophy of physics.
The approach taken here (one of several possible ones5) is to focus on the impressive methodological and heuristic effectiveness of
symmetry thinking. Although methodology and heuristics are granted center stage, the discussion will branch off naturally toward a variety of related issues, especially traditional metaphysical and epistemological queries about scientific classification, explanation, prediction, ontology, and unification. More concretely, as we just saw with Jacobi, specific problems in physics are sometimes solved by simplifying them through the application of certain invariance transformations. Or, as is the case with Einstein's theories of special and general relativity (STR and GTR henceforth), laws of nature and even whole theories are selected by imposing invariance constraints. Furthermore, by requiring that a certain type of symmetry hold locally (as opposed to globally), one discovers that the most natural attempt to comply with this constraint leads to the introduction of a new field, which happens to be the electromagnetic field (this is the famous gauge argument, to be outlined in section 3). Mathematical symmetries are also an indispensable classificatory tool in particle physics. Their taxonomic function brings with it two main benefits. The first is that order can be imposed over the huge variety of elementary particles. The second comes from the use of these schemes of classification as guides toward the fundamental ontology: while the known particles neatly fitted the schemes, physicists also perceived “gaps” in these schemes, that is, positions for which no corresponding particle was detected. The existence of these gaps suggests that the physical particles that would fill them might themselves exist, as discussed in section 5. Moreover, symmetry considerations can operate at a higher level, in the form of overarching methodological or heuristic principles.
We have already encountered the seemingly unassailable dictum “Same problem, same solution!” constraining the types of answers proposed to physical questions. Even more general is the famous Principle of Sufficient Reason, which is often invoked as a justification for attempts to discover factors responsible for breaking a symmetry. A related, but more specific, methodological rule is the so-called “Curie Principle,” which, by urging that any asymmetry in the effects is reflected in an asymmetry of the cause (or, equivalently, that no asymmetry arises spontaneously), suggests a direction of research when dealing with certain problems in physics.6 In addition to its heuristic-methodological role, symmetry has a more specific epistemological function too. It is quite natural to try to capture the very notion of objectivity in terms of invariance, and a number of recent authors have tried to do this in a variety of ways (Nozick 2001; Kosso 2003; Debs and Redhead 2007). When judged in connection with physics, the key link between invariance and objectivity is the idea that, roughly speaking, what is truly, objectively real must look the same independent of the perspective from which it is described. In other words, it should be invariant under changing the frame of reference of the observer. The experimental side of science also offers good examples of the conceptual connection between objectivity and invariance: the result of an experiment qualifies as a piece of objective knowledge insofar as the experiment can be replicated. And this means that the result is robust, or invariant under changing laboratories, lab technicians, countries, political systems, and so on.
This epistemological relation between invariance and objectivity is not restricted to physics, but occurs naturally in ethics as well: our moral judgments should not be influenced by (i.e., should remain invariant under changes in) the social status, race, ethnicity, nationality, and so on of the persons involved. It is widely accepted that these invariance properties are important constituents of an objective ethical assessment. Finally, while symmetry is no doubt a pivotal notion in modern science, one should not forget that there are important aspects of recent physics in which it is an asymmetry, or the breaking of a symmetry, that must receive special attention. One such example is the asymmetry between past and future, as observed in the behavior of entropy in thermodynamics and statistical mechanics.7 Another example is the asymmetry between matter and antimatter, an asymmetry effect incorporated in the weak interactions.8 Equally important, spontaneous symmetry breaking (SSB) plays a central role in the Standard Model for particle physics.

This chapter is structured as follows. Section 2, after introducing the classification of symmetries, focuses on discrete symmetries. Section 3 begins with a presentation of the main result holding for continuous symmetries (Noether's theorem) and develops naturally toward an examination of the so-called “gauge argument.” The aim of section 4 is to further the general case for the fertility of the gauge idea but also to offer some details on the more recent concrete uses of symmetry in physics, such as in achieving electroweak unification and in the ongoing hunt for the Higgs boson. Section 5 discusses a concrete situation in which symmetry arguments led to the spectacular predictions of elementary particles in the 1960s.
Section 6 enlarges the perspective, taking up the theme of the connection between laws and symmetries, aiming to explore Wigner's idea that certain symmetries can be thought of as “superprinciples” (Wigner 1967, 43) able to explain the laws themselves.

2. Classifying Symmetries
The literature reviewing the role of symmetry in physics, by philosophers and physicists alike (Brading and Castellani 2003b, 2007; Morrison 2008; Coughlan and Dodd 1991, ch. 6, etc.), uses the following classificatory divisions. First, there are the distinctions (i) between spacetime symmetries and symmetries that do not involve spacetime, (ii) between continuous and discrete symmetries, and (iii) between local (“gauge”) and global (“rigid”) symmetries. Another important distinction is the one introduced by Eugene Wigner (1967) between geometrical and dynamical symmetries. Geometrical symmetries are universal, spacetime invariances of laws of nature: all laws of physics have to be Lorentz (Poincaré) invariant. Dynamical symmetries, on the other hand, are invariances of the laws governing the specific interactions found in nature (weak, strong, electromagnetic, and gravitational). After I sketch the variety of options physicists have in classifying particular invariances in section 2.1, I discuss the discrete symmetries (charge conjugation, parity, and time-reversal), and then their relations (section 2.2). Continuous symmetries and the local vs. global distinction will be examined in section 3; I will return to Wigner's views in the final section. When an individual symmetry is considered, it usually falls into more than one category. For instance, isospin symmetry (to be discussed below) is not a spacetime symmetry while being a global symmetry (in the sense, to be explained, that the proton-neutron transformation is effected everywhere at once).
By contrast, the grounding symmetry principle of Einstein's GTR, the invariance of the laws of physics under transformations of coordinates depending on arbitrary functions of space and time, is of course a spacetime symmetry, a local symmetry, and, moreover, a dynamical one (this being the very example that prompted Wigner to introduce the geometrical-dynamical distinction (Wigner 1967, 23)). Yet, when referring to spatiotemporal symmetries physicists usually have in mind the symmetries of STR, expressed mathematically as the 10-parameter Poincaré group. Another typical example of a symmetry unrelated to spatiotemporal transformations of coordinates is charge-conjugation symmetry, which holds when systems transform into themselves upon swapping particles for antiparticles (e.g., the electron and the positron). This last example brings up another way to classify symmetries, into continuous and discrete.

2.1 Discrete Symmetries: Parity, Charge Conjugation, Time-Reversal
Intuitively, the distinction of continuous vs. discrete symmetries can be captured in terms of the simple example presented in section 1, the symmetries of a square vs. the symmetries of a circle. They both have rotational symmetry, but while the square is invariant only under discrete amounts of rotation (multiples of 90°), the circle remains invariant under any amount of rotation about an axis passing through its center. Not all spatiotemporal symmetries are continuous. An interesting case of discrete symmetry that is also spatiotemporal is parity, or space inversion. The operation involved in the parity symmetry (the operator is typically denoted by P) is a reflection through the origin of the coordinate system. The transformation simply reverses the spatial coordinates of an event from (x, y, z) to (−x, −y, −z).
Importantly, the way the parity transformation has been described above amounts to understanding it in an “active” way: it is the system that undergoes the transformation. We can conceive of the transformation as a “passive” one too; in this case the transformation is applied to the coordinate system (used in the system's description), not to the system itself. Thus, the parity transformation amounts to turning a left-handed coordinate system into a right-handed one. In general, physical systems (such as a single particle or a collection of them) that remain the same after a parity transformation are said to have “even” parity; if they do not, they are assigned “odd” parity. More concretely, the parity transformation operator P acts on wave-functions as follows. If $\mathbf{r}$ is the spatial coordinates vector, we have, by definition, that $P\,|\psi(\mathbf{r}, t)\rangle = |\psi(-\mathbf{r}, t)\rangle$. Those states that are eigenstates of the parity operator have definite parity, indicated as even by the eigenvalue $+1$ (i.e., satisfying $P\,|\psi(\mathbf{r}, t)\rangle = |\psi(\mathbf{r}, t)\rangle$), or, if the eigenvalue is $-1$, as odd ($P\,|\psi(\mathbf{r}, t)\rangle = -|\psi(\mathbf{r}, t)\rangle$). While parity is not always conserved (e.g., in processes involving weak interaction, such as the beta decay of cobalt-60), physicists recognize an important epistemic payoff associated with its conservation. When the system is subjected to a type of interaction that obeys this symmetry (and all three other types of physical interaction, strong, electromagnetic, and gravitational, do obey it), then changes of parity state are forbidden. Thus, parity conservation imposes a constraint on the possible evolutions of the system in question.9
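The definition of even and odd parity states can be illustrated with a small numerical check. This is only a sketch under stated assumptions: it works in one spatial dimension, ignores time dependence and normalization, and uses two sample wave-functions (a Gaussian and a Gaussian multiplied by x) that are illustrative choices rather than examples from the text. The function names are hypothetical.

```python
import math

def parity(psi):
    """Return the parity-transformed function: (P psi)(x) = psi(-x)."""
    return lambda x: psi(-x)

def parity_eigenvalue(psi, samples, tol=1e-12):
    """Return +1 or -1 if psi behaves as a parity eigenstate on the samples, else None."""
    p_psi = parity(psi)
    if all(abs(p_psi(x) - psi(x)) <= tol for x in samples):
        return +1   # even: P psi = psi
    if all(abs(p_psi(x) + psi(x)) <= tol for x in samples):
        return -1   # odd: P psi = -psi
    return None     # a superposition of even and odd parts has no definite parity

samples = [0.1 * k for k in range(-30, 31)]

def even_state(x):
    return math.exp(-x * x / 2)

def odd_state(x):
    return x * math.exp(-x * x / 2)

def mixed_state(x):
    return even_state(x) + odd_state(x)

for name, state in [("even", even_state), ("odd", odd_state), ("mixed", mixed_state)]:
    print(name, parity_eigenvalue(state, samples))
```

The mixed state shows why the text restricts the labels “even” and “odd” to eigenstates of P: a generic state is simply mapped to a different state.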
We have already encountered another important discrete symmetry, charge conjugation symmetry; its operator is typically denoted by C, and the charge conjugation transformation is $C\,|\psi(\mathbf{r})\rangle = \pm|\psi(\mathbf{r})\rangle$. Similar to the parity symmetry, wave-functions of systems can have either even or odd charge conjugation symmetry. The photon wave-function, for instance, is odd. This helps determine the charge-conjugation parity of a particle that previously decayed into two photons: it must be even (the product of the charge-conjugation parities of the two photons), if the interaction responsible for the decay conserves this symmetry. This piece of information is extremely valuable: it forbids other types of processes from happening, in particular the decay of that particle into an odd-charge-conjugated state. The third important discrete symmetry is time-reversal, whose operator is designated as T. Generally speaking, this invariance means that the direction of the flow of time is irrelevant in fundamental interactions.10 To say that a system is invariant under T amounts to saying that if the system evolves from an initial state to a final one, then the reversal of the direction of motion of its components is possible and will bring the system from the final state back to the initial one (mathematically speaking, we simply replace the time variable with its negative). Particles’ collision and their time-reversed twin, the decay, are typical contexts in which this symmetry can be demonstrated.

2.2 The CPT Theorem
The C and P discrete symmetries can operate together, as a product (or composite) symmetry; one way to combine them is in the form of a CP transformation (i.e., charge conjugation and space reversal). If this symmetry is to hold, then, naturally, the laws of physics should be invariant under two operations: interchanging a particle with its antiparticle, and spatial inversion of left and right.
But this symmetry is violated, as was discovered by Christenson, Cronin, Fitch, and Turlay (1964) in their study of the decays of neutral kaons. More precisely, what they showed was that weak interactions violate both the charge-conjugation symmetry C and the mirror reflection symmetry P, and also their combination. While this violation might sound like bad news, it turns out that it is intimately linked with another asymmetry mentioned above: the dominance of matter over antimatter in the known universe. However, once the third operation (time-reversal T) is taken into account, the final product CPT is an exact symmetry, as far as we can tell for now. The claim has testable consequences, and two are usually stressed. First, that particles and antiparticles have the same masses and lifetimes; as a corroboration of the CPT result, this has been confirmed through many experiments over the years. Second, that some sort of compensation rule is in effect: when one symmetry (or a pair of them) is broken, the other(s) cancel the violation out so that the final composite CPT symmetry remains intact (Coughlan and Dodd 1991, 48; see also Greaves 2010, for some philosophical puzzles associated with this symmetry). I shall stop here with the review of discrete symmetries and in the next section I will turn to continuous symmetries. The reason they must be given special attention is the abovementioned connection between these symmetries and a special class of laws of nature, the conservation laws, as established by a famous theorem proved by Emmy Noether in 1918 (presented in section 3.1). This discussion will progress naturally toward another important topic, the distinction between the global (or “rigid”) and local, or gauge, symmetries, followed by the so-called “gauge argument” (section 3.2).

3. Continuous Symmetries, Conservation Laws, and the “Gauge Argument”

3.1 Noether's Theorem
In a nutshell, the theorem of interest here11 maintains that for every continuous global symmetry of the Lagrangian there is a conservation law (and vice versa, though this second claim is not made in the original theorem). It is clear, even from this rough formulation, that this result is not explicitly available in standard Newtonian classical mechanics. It presupposes the conceptual framework of the so-called “analytical” mechanics, in which the Lagrangian function L (in essence, the difference between the kinetic and potential energy) plays the central role. An important methodological innovation lies behind the Lagrangian formulation of mechanics: the step-by-step Newtonian description of physical systems is abandoned, and an overall approach is adopted. Within the Newtonian scheme, the aim is to compute what the system (say, a moving particle) will do in the next infinitesimal time interval. Within the Lagrangian scheme, the issue is tackled from a different angle: we take an overall view of
the trajectory of the particle and determine it all at once. The actual path is singled out as the one that satisfies certain minimization constraints. While in terms of predictive power the Lagrangian and Newtonian versions of mechanics are taken to be equivalent, physicists tend to think that results such as Noether's make the former preferable to the latter;12 however, recent analyses of this relation (such as Mark Wilson's, in this volume) recommend more caution in assigning such priorities. The Lagrangian methodology can be broken down into two steps. First, after the function L has been defined, a functional S, called the action, is introduced. S is defined for each possible path (“history”) connecting the initial and the final spacetime positions of a particle, as the time integral of the Lagrangian. Second, the actual path followed by the particle is selected: among all possible paths, this is the one that minimizes S. The application of this minimization (extremum) constraint, Hamilton's “Principle of the Least Action,” leads to the Euler-Lagrange equations:

$$\frac{\partial L}{\partial q} = \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}$$

Physicists recognize important heuristic gains obtained from reconceiving classical mechanics in this way. Unlike the Newtonian approach, the analytical approach requires the construction of a single quantity L, a scalar, which yields the equations of motion. Moreover, due to the introduction of the so-called “generalized” coordinates in the analytical scheme (usually denoted by q), the Euler-Lagrange equations preserve the same form upon switching from Cartesian coordinates to any other general set of coordinates. Finally, the observation that brings us closer to Noether's theorem is that in this formalism the link between symmetries and conservation laws becomes easily noticeable.
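The two-step Lagrangian procedure described above (assign an action to every path, then pick the path that minimizes it) can be sketched numerically. The example below is an illustration under stated assumptions, not the chapter's own: it takes a free particle of unit mass moving from x = 0 at t = 0 to x = 1 at t = 1, discretizes the action as a finite sum, and compares the straight-line path with a family of perturbed paths that share the same endpoints. The names `action`, `straight`, and `perturbed` are hypothetical.

```python
def action(path, dt, mass=1.0, potential=lambda x: 0.0):
    """Discretized action: sum over time steps of (kinetic - potential) * dt."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        velocity = (x1 - x0) / dt
        midpoint = 0.5 * (x0 + x1)
        total += (0.5 * mass * velocity ** 2 - potential(midpoint)) * dt
    return total

steps = 200
dt = 1.0 / steps
times = [k / steps for k in range(steps + 1)]

# Candidate for the actual path of a free particle: uniform motion.
straight = list(times)

def perturbed(eps):
    """A path with the same endpoints, bent by eps * t * (1 - t)."""
    return [t + eps * t * (1.0 - t) for t in times]

print("straight path:", action(straight, dt))
for eps in (-0.5, -0.1, 0.1, 0.5):
    print("eps =", eps, "->", action(perturbed(eps), dt))
```

Every perturbed path yields a larger action than the straight one, which is the discrete counterpart of the statement that the actual path is the one that extremizes S. A nonzero `potential` argument can be passed to explore other systems, although the straight line is then no longer the expected minimizer.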
If we require that the Lagrangian L be independent of a certain coordinate q (which is to require that L is invariant under the transformations of this coordinate) then, in mathematical terms, this is tantamount to saying that its corresponding partial derivative is zero:

$$\frac{\partial L}{\partial q} = 0$$

Now, if we look at the Euler-Lagrange equation, it is evident that the time derivative appearing on its right-hand side has to vanish too, which amounts to the statement of a conservation law: namely, that the quantity $\partial L / \partial \dot{q}$ does not vary in time. In particular, if we return to usual Cartesian coordinates and plug in $L = \tfrac{1}{2} m \dot{x}^2$ we get that $\frac{d}{dt}(m\dot{x}) = 0$ and, since $m\dot{x}$ is the linear momentum p, it follows that this quantity is conserved. In other words, invariance under translations in space implies conservation of linear momentum. The lesson to draw from these simple considerations is that we actually do not need Noether's theorem to establish some straightforward, but important, implications like the one above. Since the relation between some of the familiar symmetries and the conservation laws was not a surprise at the time she communicated her work, the main reason the theorem was praised was its generality (Brading and Brown 2003, 89, 98): roughly put, it proves the most general fact that if the action integral S is invariant under a continuous (Lie) group of transformations (characterized by a finite number s of parameters), then, provided the Euler-Lagrange equations are satisfied, there exist s conserved “currents.” Yet, as Brading and Brown (2003, 92) point out, Noether's original concern had to do with the following (again, more general) question, the so-called “variational problem” (which is in fact the title of her paper): given a smooth infinitesimal transformation of the independent or dependent variables (appearing in the Lagrangian), under what conditions does the action remain invariant?
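The link between invariance under spatial translations and conservation of linear momentum can also be seen in a small simulation. The sketch below rests on assumptions that are not in the text: two unit-mass particles on a line, a spring-like potential, and a leapfrog integrator. In the first case the potential depends only on the separation of the particles, so shifting both by the same amount changes nothing. In the second case each particle is tied to the origin, which breaks translation invariance. All names and parameter values are illustrative.

```python
def simulate(force, x, p, mass=1.0, dt=1e-3, steps=5000):
    """Integrate two particles on a line with a leapfrog scheme; return final (x, p)."""
    x, p = list(x), list(p)
    for _ in range(steps):
        f = force(x)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]   # half kick
        x = [xi + dt * pi / mass for xi, pi in zip(x, p)]  # drift
        f = force(x)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]   # half kick
    return x, p

k = 1.0  # spring constant (assumed value)

def internal_force(x):
    # V = 0.5 * k * (x1 - x2)**2 depends only on the separation,
    # so it is unchanged by x1 -> x1 + a, x2 -> x2 + a.
    stretch = x[0] - x[1]
    return [-k * stretch, +k * stretch]

def external_force(x):
    # V = 0.5 * k * (x1**2 + x2**2) singles out the origin,
    # so translation invariance is broken.
    return [-k * x[0], -k * x[1]]

x0, p0 = [0.0, 1.5], [0.3, 0.2]
_, p_int = simulate(internal_force, x0, p0)
_, p_ext = simulate(external_force, x0, p0)

print("initial total momentum:      ", sum(p0))
print("translation-invariant case:  ", sum(p_int))
print("translation-breaking case:   ", sum(p_ext))
```

In the translation-invariant case the total momentum stays at its initial value up to rounding, while in the other case it drifts. This is a numerical illustration of the implication discussed above, not a proof of it.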
More precisely, her aim was to find the general conditions that the variables must satisfy, if the first-order functional variation of the action δS vanishes (assuming that the region of integration in the integral defining the action is arbitrary). The more familiar implications (such as the one holding between the invariance under translations in time and the conservation of energy, or between the invariance under spatial rotations and the conservation of angular momentum) follow immediately, as applications of the general result. Furthermore, the theorem is general in yet another way. Loosely speaking, it turns out that the details of the action are irrelevant: if two different actions remain invariant under the
same transformation, the same conservation law corresponds to both (Zee 2007, 119–120).13 In Noether's obituary, Einstein placed her theorem in the special category of “spiritual formulas.” This praise was meant to convey the point that the result amounts to significantly more than a technically brilliant achievement: it is a profound insight into the order of nature. In addition to this gain in understanding, the practical-heuristic use of the theorem is equally impressive. Examples are easy to find, especially in the high-energy domains. As Zee (2007, 119–120) explains, these are situations in which physicists knew the continuous symmetry governing a certain physical situation, and thus could confidently begin to look for a conserved quantity (as we will see, isospin is a case in point). The reverse case is also possible: sometimes the physicists did not have any clue as to what the action was but were able to identify experimentally certain conserved quantities, so they inferred that some corresponding symmetries must exist. The theorem is then invoked to give them hints about what the action might be. In fact, this second situation is illustrated by the famous example of the electric charge. That this quantity is conserved had been known for a long time, and Noether's theorem indicates that a certain symmetry must correspond to it. Indeed, in 1927, Fritz London, building on some ideas advanced by Weyl in the early 1920s, showed that the conservation of charge follows from global phase invariance (that is, invariance of the physics under an arbitrary shift in the complex phase of the wave-function). Weyl's previous idea was to propose scale invariance, i.e., the requirement that the physical laws do not change if the scale of all length measurements is shifted by the same amount.
This insight, although ultimately incorrect, was nothing short of revolutionary: it marked the first appearance of the concept of “gauge.” The context in which Weyl did this work was his reflections on possible ways to generalize the Riemannian geometry of Einstein's GTR (for more details, see Ryckman 2003).

3.2 The Gauge Argument
It is worth beginning the following brief exposition of the so-called “gauge” argument (and “gauge” principle) by noting that, in and of itself, the requirement of global phase invariance (i.e., the requirement that the physics does not change upon the multiplication of the wave-function by a constant phase factor) is rather uncontroversial. The more interesting question is what happens when a related but stronger constraint is imposed, namely local phase invariance. (Historically, the locality idea was envisaged in analogy with the symmetry grounding GTR, the requirement of invariance under arbitrary curvilinear coordinate transformations.) Thus, the original thought was to study those situations in which the phase factor is not held constant (same everywhere), but is allowed to vary with each spacetime point (hence the global/local distinction).14 Following Quigg (1983, 45–47) and Martin (2003, 42–43), more detail can be filled in the above general description. If we start with the Lagrangian L for a free complex scalar field $\psi(x)$, the corresponding action will be invariant ($\delta L = 0$) under global transformations of the form $\psi \to e^{iq\theta}\psi$; $\bar{\psi} \to e^{-iq\theta}\bar{\psi}$, where $\theta$ is constant (the corresponding group is the abelian Lie group U(1)). When the equations of motion are satisfied, Noether's theorem gives us the corresponding conserved current (where q can be identified as the electric charge). As noted, the next step is the localization of the transformation; we now take $\theta$ to be $\theta(x)$, a function of spacetime coordinates.
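The contrast between a constant phase θ and a position-dependent phase θ(x) can be made concrete with a numerical check. The sketch below is illustrative only and makes several assumptions that go beyond the text: it keeps just the spatial-gradient part of a complex scalar field's Lagrangian density, works on a periodic one-dimensional grid, uses central finite differences, and picks an arbitrary sample field. The names `psi`, `gradient_energy`, and `transformed` are hypothetical.

```python
import cmath
import math

q = 1.0                      # charge appearing in the phase factor (assumed value)
n = 400                      # number of grid points
dx = 2 * math.pi / n
xs = [k * dx for k in range(n)]

def psi(x):
    """An arbitrary sample field, periodic on [0, 2*pi)."""
    return cmath.exp(2j * x) * (1.0 + 0.3 * math.cos(x))

def gradient_energy(field):
    """Approximate the integral of |d(field)/dx|^2 with central differences."""
    values = [field(x) for x in xs]
    total = 0.0
    for k in range(n):
        derivative = (values[(k + 1) % n] - values[k - 1]) / (2 * dx)
        total += abs(derivative) ** 2 * dx
    return total

def transformed(theta):
    """Apply psi -> exp(i q theta(x)) psi for a given phase function theta."""
    return lambda x: cmath.exp(1j * q * theta(x)) * psi(x)

base = gradient_energy(psi)
global_case = gradient_energy(transformed(lambda x: 0.7))                 # constant phase
local_case = gradient_energy(transformed(lambda x: 0.5 * math.sin(x)))    # phase varies with x

print("untransformed:", base)
print("global phase: ", global_case)
print("local phase:  ", local_case)
```

The constant phase drops out of the gradient term, whereas the position-dependent phase contributes extra terms proportional to the derivative of θ(x). This is the sense in which a Lagrangian that is globally invariant need not be locally invariant.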
It is immediate that L is no longer invariant under the corresponding transformation $\psi(x) \to e^{iq\theta(x)}\psi(x)$, $\bar{\psi}(x) \to e^{-iq\theta(x)}\bar{\psi}(x)$, since simple computations show that the derivatives $\partial_\mu\theta(x)$ do not vanish. Therefore, if the local invariance is to be preserved, L has to be modified. Instead of L we consider a new Lagrangian, $L^* = L - J^\mu A_\mu$, where $J^\mu = q\bar{\psi}\gamma^\mu\psi$ (the factors $\gamma^\mu$ are the Dirac matrices). This new Lagrangian is invariant, and what contributed to securing the invariance was the introduction of the (“compensatory”) field $A_\mu$ (the so-called “gauge potential”), which transforms as $A_\mu(x) \to A_\mu(x) - \partial_\mu\theta(x)$. It is in virtue of this behavior that $A_\mu$ can be reinterpreted as the (familiar) electromagnetic potential. Thus, the consequences of imposing this “local” symmetry are quite astonishing: roughly speaking, one realizes that a natural route to take is to introduce a new field that has the properties of the electromagnetic field! It is as if there is a “gauge logic” of nature (Martin 2003, 43). This field must have an infinite range, so its quantum must be massless (to obey the time-energy uncertainty relation: massive particles, the quanta of the fields, decay quickly, and can travel only short distances).15 The surprise is that this is what actually happens in nature: the quantum is the photon. The situation is conceptually intriguing, since it looks like something (a physically significant object) has been gotten from nothing (mere mathematical re-description). A rehearsal of the main philosophical problems raised by this argument is in order. Physicists reflecting on the “power of the gauge” usually endorse this feeling of mystery and surprise, especially in their more popular presentations (e.g., Schumm 2004). Most philosophers, however, adopt a more circumspect attitude (see Brown
1999; Teller 2000; Earman 2003a; Healey 2007; etc.), expressing, in some cases, serious reservations about this power. Martin (2003) provides a recent comprehensive analysis of this issue, along the following lines. First of all, it is not clear what are the metaphysical and epistemological grounds upon which to demand local gauge invariance; or, in other words, it is far from clear why one should embrace the Yang-Mills gauge principle: “every continuous symmetry of nature is a local symmetry” (Mills 1989, 496; emphasis in original). A series of reasons are traditionally advanced in the textbooks, but a quick review of the philosophical literature (some of it mentioned above) reveals that they are not found entirely satisfying. In their 1954 paper, Yang and Mills argue that the idea of local symmetries is “more consistent with the concept of localized fields” (1954, 191; see also Auyang 1995). More generally, physicists emphasize that locality is required on the basis of STR precluding instantaneous communication between distant spacetime locations. Yet, it is objected, it is not immediately evident whether the local–global distinction grounding the gauge argument perfectly mirrors the locality constraint as imposed by STR.

Another topic that caught philosophers’ attention is the uniqueness of the compensatory modification described above. Some deny this feature altogether; Martin, for instance, argues that “the modification is not uniquely dictated by the demand of local gauge invariance. There are infinitely many gauge-invariant terms that might be added to the Lagrangian if gauge invariance were the only input to the argument” (2003, 44).
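The compensation step and Martin's non-uniqueness point can be made concrete with a short worked sketch. It assumes a free Dirac Lagrangian $L = \bar{\psi}(i\gamma^\mu\partial_\mu - m)\psi$ (the text does not fix the form of L), and the coefficients κ and λ are hypothetical symbols introduced here only to label the extra terms.

```latex
% Assumption: L = \bar\psi (i\gamma^\mu\partial_\mu - m)\psi, with J^\mu = q\bar\psi\gamma^\mu\psi
\begin{align}
\text{local phase change:}\quad
  & L \to L - J^\mu\,\partial_\mu\theta, \qquad J^\mu \to J^\mu \\
\text{compensating field:}\quad
  & A_\mu \to A_\mu - \partial_\mu\theta \\
L^* = L - J^\mu A_\mu \;\to\;
  & \bigl(L - J^\mu\partial_\mu\theta\bigr) - J^\mu\bigl(A_\mu - \partial_\mu\theta\bigr) = L^* \\
\text{field strength:}\quad
  & F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \;\to\; F_{\mu\nu} \\
\text{further invariant terms:}\quad
  & -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}, \qquad
    \kappa\,\bar{\psi}\sigma^{\mu\nu}\psi\,F_{\mu\nu}, \qquad
    \lambda\,(F_{\mu\nu}F^{\mu\nu})^2, \ \dots
\end{align}
```

Every term in the last line is unchanged by the local transformation, so gauge invariance alone does not single out $L^*$. The κ and λ terms have mass dimension greater than four, which is why a further requirement such as renormalizability is needed to exclude them.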
In other words, what is questioned here is the real power of this invariance demand: if taken in isolation, local gauge invariance does not dictate the form of the field; this uniqueness is achieved only by also requiring (i) Lorentz invariance, and (ii) that the final outcome be a renormalizable theory.16 Thus, Martin suggests, the renormalizability of the theory (roughly, the mathematical technique by which it is ensured that the theory delivers finite values as predictions of the quantities of interest, such as the electric charge) should be granted a more prominent role in the evaluation of the outcome of the gauge argument. One can even ask why the gauge-invariance constraint is more significant than renormalizability in the economy of the argument. Thus, a different angle of attack against the argument is possible, according to which renormalizability is in fact the pivotal feature of the model; gauge invariance becomes only one among a few valuable features of a quantum theory.

Finally, a good deal of the discussion in the philosophical literature is devoted to clarifying to what extent the argument presents us with a situation in which an interaction field is somehow “generated” out of sheer mathematical formalism.17 Earman (2003a, 157) summarizes the grounds for a reserved attitude toward the “magic” of the gauge as follows:

I am in agreement with Martin ([2003]) who finds the “getting something from nothing” character of the gauge argument too good to be true.
In particular, a careful look at applications of this argument reveals that a unique theory of the interacting field results only if some meaty restrictions on the form of the final Lagrangian are implicitly in operation; and furthermore, the kind of locality needed for the move from the “global symmetry” (invoking Noether's first theorem) to the “local symmetry” (invoking Noether's second theorem) is not justified by an appeal to the no-action-at-a-distance sense of locality supported by relativity theory. Not only is there no magic to be found in the gauge argument, but the “gauge principle” that prescribes a move from global to local symmetries for interacting fields can be viewed as output rather than input: for example, it can be viewed as the product of a self-consistency requirement (see, for example, Wald 1986) or as a consequence of the requirement of renormalizability (see, for example, Weinberg 1974).

4. Gauge Theories, Unification, Symmetry Breaking, and the Higgs Field

4.1 The Fertility of the Gauge Idea

Electromagnetism can be understood as the first illustration of the (heuristic) effectiveness of the gauge principle.18 But physicists were able to find a larger significance in the local gauge invariance idea. As Mills (1989) narrates, C. N. Yang was, in the mid-1950s, among the very few who realized the potential of this insight; he envisaged the thought that all fundamental physical theories could be formulated as gauge theories. Back then, the only conservation law seemingly similar to that of the electric charge was the conservation of a quantity called “isospin” (more on this below).
In fact, the scheme Yang had in mind was an analogy with the local gauge invariance idea that worked so well in the electromagnetic case: as we saw, for the strong interaction the role of the electric charge would be played by isospin, and, once the local gauge invariance constraint was in place, new
(gauge) fields needed to be introduced, having the “gluing” role assigned to the electromagnetic field in electrodynamics.

The isospin idea was developed along the following lines.19 Physicists were intrigued by two indisputable experimental findings. First, neutrons and protons have approximately the same mass.20 Second, despite the fact that they have different electric charges (neutrons are neutral while protons are positive), they are bound together inside the nucleus by the strong nuclear force. This means, in physicists’ jargon, that the strong force is “charge-blind.” Hence, their charge could not be of physical significance, as far as the strong force is concerned, that is, once other types of interactions, the electromagnetic one in particular, are “switched off.” For this reason, a natural idea was to conceive of their different charges as mere different “labels” applied to them. Heisenberg, who introduced the isospin idea in 1932, proposed to treat protons and neutrons as different states (labeled n and p) of the same particle, called the nucleon. In this way, the pair n−p can be understood either as referring to two particles (the orientation of the isospin distinguishing them), or as describing two different states of the same particle, the nucleon. Thus, isospin is introduced by analogy with the electron spin; just as the spin of the electron can have two orientations along the third axis, so too the nucleon can appear in two isospin states (the positive eigenvalue indicates a proton, while the negative one a neutron). This invariance of strong interactions under neutron-proton permutations simply means that the proton and the neutron are indistinguishable21 when looked at from the viewpoint of this kind of interaction. (I mention this aspect now, since, as we will see below, it is important for classificatory reasons.)
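The analogy with spin can be sketched numerically. The block below is a minimal illustration, not a piece of the historical formalism: it represents the nucleon as a two-component state, uses the standard spin-1/2 form of the third isospin component, and applies a rotation about the second isospin axis. The function names and the choice of rotation axis are assumptions made here.

```python
import math

# Nucleon states as two-component vectors
# (convention from the text: proton has I3 = +1/2, neutron has I3 = -1/2).
PROTON = (1 + 0j, 0j)
NEUTRON = (0j, 1 + 0j)

def isospin_rotation(angle):
    """SU(2) element exp(-i * angle * sigma_2 / 2), written out as a 2x2 matrix."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return ((c, -s), (s, c))

def apply(matrix, state):
    """Matrix-vector product for a 2x2 matrix and a two-component state."""
    (a, b), (c, d) = matrix
    return (a * state[0] + b * state[1], c * state[0] + d * state[1])

def i3_expectation(state):
    """Expectation value of I3 = diag(+1/2, -1/2) in a normalized state."""
    return 0.5 * abs(state[0]) ** 2 - 0.5 * abs(state[1]) ** 2

def determinant(matrix):
    """Determinant of a 2x2 matrix (equals 1 for elements of SU(2))."""
    (a, b), (c, d) = matrix
    return a * d - b * c

# A rotation by pi carries the proton state into the neutron state;
# a rotation by pi/2 gives an equal-weight superposition of the two.
rotated = apply(isospin_rotation(math.pi), PROTON)
superposed = apply(isospin_rotation(math.pi / 2), PROTON)

print(i3_expectation(PROTON), i3_expectation(NEUTRON))
print(i3_expectation(rotated), i3_expectation(superposed))
```

An interaction that is blind to such rotations cannot tell the two states apart, which is the sense of indistinguishability at issue.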
As with all symmetries, the proton-neutron isospin symmetry is captured mathematically in terms of a group structure; the group involved here is SU(2). The elements of this group “rotate” protons into neutrons (and vice versa); nucleons are thus “mixtures” of these two components. Yang and Mills built upon this idea and required invariance under local redefinitions of these components of the nucleon. In accordance with the “gauge logic” presented above, this invariance led to the introduction of a massless spin-1 gauge particle called ρ, which, because of the possibility that the nucleon might change its electric charge during interactions, had to exist in three charged states: positive, negative, and neutral. But these particles (gauge bosons) could not be found in nature, so a new theory had to be advanced. As gradually became evident, the strong interactions are governed by a bigger symmetry and, consequently, the structure of interest had to be a bigger group. Roughly speaking, the story unfolds as follows (Coughlan and Dodd 1991, 60). First, a new characteristic of strong interactions was discovered: the conservation of a new quantity, dubbed by Gell-Mann “strangeness.” Second, once this new quantum number was considered in addition to the isospin number, the new symmetry governing these interactions was found to be SU(3). Thus, it turned out that the original theories about isospin had in fact been invoking the symmetry above its fundamental level.
Nevertheless, those theories were of crucial importance in the development of the ideas, and the symmetries, underlying the ultimately successful theory: the gauge theory based on the SU(3) group, quantum chromodynamics (QCD), which postulates quarks (of various “colors”) as the basic entities and the ultimate constituents of hadrons (the generic name for particles participating in strong interactions).22

Not only strong interactions, but weak interactions too can be treated within the gauge framework.23 The analogy with isospin is also heuristically helpful in this context, the conserved quantity being called “weak isospin.” The key physical invariance grounding the theory is that the weak interaction is blind to distinctions between the neutrino and the electron; it only “sees” a generic lepton. Mathematically, the structure of interest is the group of weak isospin, which, as it happens, is the previously introduced SU(2). So, after overcoming a series of false starts and various experimental difficulties,24 one can say that at the beginning of the 1970s the strong, electromagnetic and weak interactions were described by relatively well-understood gauge theories.

An important theoretical breakthrough, in which symmetry considerations had a crucial role to play, came in 1967–68 when Steven Weinberg and, independently, Abdus Salam, advanced a unified model of the electromagnetic and weak interactions. Their model drew on previous work by Sheldon Glashow and, in a nutshell, was a gauge theory whose gauge group consisted in “pasting” together the two groups governing the component interactions: U(1) × SU(2).
The Glashow-Salam-Weinberg (GSW) model, shown to be renormalizable by 't Hooft and Veltman in 1971, was meant to describe the interactions of leptons through the exchange of weak bosons and photons.25 While the thought that the two types of interactions might receive a unified treatment was in the air for a while (since two of the weak bosons bear electrical charge), the new, distinctive feature of the model was the incorporation of a “mechanism” meant to ensure that the W and Z bosons acquire mass, while the photon remains massless, and
today this is called the “Higgs mechanism” after Peter Higgs, the physicist who proposed it in 1964.26

4.2 Symmetry Breaking to the Rescue: Electroweak Unification and the Higgs Mechanism

One way to solve the mass problem for the Standard Model was to borrow an idea from condensed matter physics, the spontaneous breaking of a symmetry (SSB). More precisely, it was the work done on phase transitions in (quantum) statistical mechanics that offered the best analogy. Upon a drop in temperature below a certain value, liquid water freezes; the formation of ice crystals brings the system into a stable state (free energy is minimized), but the rotational symmetry of molecules available in the liquid state is lost (or “broken”). In essence, SSB claims that the massiveness of the particles arises in an analogous fashion: right after the Big Bang, when the value of the available energy dropped under a critical value, the unified (and symmetric) electroweak interaction broke into what we observe today, the electromagnetic and weak components. Particles’ interaction with the Higgs field pervading the whole space “slowed them down,” inducing effects similar to inertial effects, and thus mimicking their possession of mass. Importantly, this is a general solution to the mass problem: all particles (not only the W and Z bosons, but electrons and protons too, the top quark, and so on) must interact with this “dragging” field to appear as massive; the more intensely they couple to the Higgs field, the more massive they appear. The photon, on the other hand, does not couple to the Higgs field at all, and thus appears as massless.
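The claim that stronger coupling means larger mass can be given a rough quantitative form. For fermions in the Standard Model the tree-level relation is m = y·v/√2, where y is the coupling to the Higgs field and v its vacuum expectation value. The sketch below uses approximate textbook values for v and for the masses; these numbers and all the names are assumptions introduced here, and the photon entry simply encodes "no coupling."

```python
import math

# Assumed inputs (approximate textbook values, in GeV; not taken from the text).
HIGGS_VEV_GEV = 246.0
MASSES_GEV = {
    "electron": 0.000511,
    "muon": 0.1057,
    "top quark": 172.7,
    "photon": 0.0,   # no coupling to the Higgs field, hence no mass
}

def mass_from_coupling(coupling, vev=HIGGS_VEV_GEV):
    """Tree-level relation m = y * v / sqrt(2)."""
    return coupling * vev / math.sqrt(2)

def coupling_from_mass(mass, vev=HIGGS_VEV_GEV):
    """Inverse of the relation above: y = sqrt(2) * m / v."""
    return math.sqrt(2) * mass / vev

# Couplings implied by the assumed masses, listed from weakest to strongest.
couplings = {name: coupling_from_mass(m) for name, m in MASSES_GEV.items()}
for name, y in sorted(couplings.items(), key=lambda item: item[1]):
    print(f"{name:10s} y = {y:.3e}  ->  m = {mass_from_coupling(y):.6g} GeV")
```

On these assumed inputs the couplings span several orders of magnitude, with the top quark's coupling close to one.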
More precisely, the Higgs field is a doublet of fields (Φ_upper, Φ_lower), and the theory claims that right after the Big Bang nature chose one of these components (the lower one) as being the field pervading the whole universe.27 As this choice is arbitrary, the symmetry of the Higgs doublet is in fact preserved; it is not really broken, only hidden.

Models borrowed from classical mechanics have also been heuristically instrumental in developing the SSB insight.28 Imagine a small ball on top of a Mexican hat; this is a symmetric configuration, but not the one of minimum energy. Similar to the freezing example, the ground state is asymmetric; it is the state toward which the system tends to evolve as a result of a small perturbation, the one in which the ball tumbles down and settles into the rim in an arbitrary position (one among the infinite number of positions available).29 In the quantum context, SSB occurs (roughly speaking) when the Lagrangian is symmetric but the ground state is not.

When tried as a solution to the mass problem, the idea of a spontaneous breaking of a global symmetry works only up to a point. Mathematical manipulations (in fact, redefinitions of the two component fields) do deliver the sought outcome, that is, the needed mass is “generated”; yet, as it turns out, the main drawback comes from a result known as the Goldstone theorem (see Goldstone 1961, 1962), which claims in essence that the spontaneous breaking of a (global) symmetry results in the appearance of a massless spin-0 boson, the so-called “Goldstone boson.” But such an entity does not exist in nature, so this specific way to exploit SSB had to be abandoned. The main insight could be rescued though, by demanding that the Mexican-hat-shaped Lagrangian be invariant under local gauge transformations. As expected, the logic of gauge has it that this requirement can be satisfied if a massless gauge boson is introduced.
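The Mexican-hat picture and the Goldstone result can be checked on a toy potential. The sketch below is only an illustration under stated assumptions: a two-component classical field with potential V = −MU2·r² + LAM·r⁴, where the parameter values and all names are chosen here. It shows that the potential is rotationally symmetric while any particular ground state is not, and that the curvature vanishes along the rim (the classical counterpart of the massless Goldstone mode) but not across it.

```python
import math

MU2, LAM = 1.0, 0.25   # illustrative parameters (assumed): V = -MU2*r^2 + LAM*r^4

def potential(x, y):
    """Mexican-hat potential of a two-component field (x, y)."""
    r2 = x * x + y * y
    return -MU2 * r2 + LAM * r2 * r2

def rotate(x, y, angle):
    """Rotate the field configuration (x, y) by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y

def curvature(f, x, y, dx, dy, h=1e-4):
    """Finite-difference second derivative of f at (x, y) along direction (dx, dy)."""
    return (f(x + h * dx, y + h * dy) - 2 * f(x, y) + f(x - h * dx, y - h * dy)) / h**2

R_MIN = math.sqrt(MU2 / (2 * LAM))   # radius of the rim, from dV/dr = 0
ground_state = (R_MIN, 0.0)          # one arbitrary choice among the rim positions
rotated_state = rotate(*ground_state, 1.0)

top = curvature(potential, 0.0, 0.0, 1.0, 0.0)                # negative: unstable summit
radial = curvature(potential, *ground_state, 1.0, 0.0)        # positive: across the rim
along_rim = curvature(potential, *ground_state, 0.0, 1.0)     # about zero: along the rim

print(potential(0.0, 0.0), potential(*ground_state), potential(*rotated_state))
print(top, radial, along_rim)
```

The rotated state has the same energy as the chosen ground state but is a different configuration, which is the sense in which the symmetry is hidden rather than destroyed.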
In this new context, however, the old trick works. Upon redefining the two component fields, physicists achieved exactly what they needed: the Goldstone boson disappears, as if absorbed into the massless gauge boson just added,30 which thus acquires mass (since, as physicists say, it “eats” the Goldstone boson). For this story to work, the element still in need of experimental confirmation is the actual existence of the quanta of the Higgs field (the Higgs boson), one of the emerging redefined fields. As it happens, the present essay is being written during the time when experimentalists at Fermilab's Tevatron and at CERN's Large Hadron Collider are searching for the Higgs boson; as of May 2012, its existence has yet to be either confirmed or ruled out.

While various events associated with this biggest scientific experiment ever performed make the popular news regularly, what fascinates physicists and philosophers of physics alike is a series of extremely intriguing theoretical features of the Higgs field. It is generally acknowledged that this field, were it to exist, would be rather unique. The Higgs mechanism is indeed the simplest mechanism having the necessary features (it gives mass to the gauge bosons and is incorporable in a gauge theory). The field has a nonzero (renormalized) vacuum expectation value (of 246 GeV) and, as Gunion, Stange, and Willenbrock point out, “despite the simplicity of the standard Higgs model, it does not appear to be a candidate for a fundamental theory. The introduction of a fundamental scalar field is ad hoc; the other fields in the theory are spin-one gauge fields and spin-half fermion fields…. The standard Higgs model accommodates, but does not explain, those features of the electroweak theory for which it is responsible” (1996, 24).
The worry here is akin to the more general and familiar philosophical antirealist concern that the postulation of entities and structures guided by pragmatic values might generate a conflict, in this case between simplicity and explanatory power.
Even when told in this oversimplified form, this story invites a diversity of philosophical questions, mainly concerning three issues: (i) the ontological implications of the electroweak unification, (ii) the nature of the Higgs mechanism, and (iii) the right epistemological attitude toward the hidden/spontaneously broken symmetries. In essence, the literature in this area emphasizes the serious difficulties encountered in the process of clarifying and interpreting the conceptual moves made by the theoreticians. In particular, a key epistemological problem stands out: insofar as it is impossible to “switch off” the Higgs field, we cannot really know what happens with the particles’ mass in its absence. Moreover, what are the metaphysical lessons to draw from the electroweak unification? This is unclear given that the pasting together of the two theories and the mixing of the fields does not seem to be much more than a mathematical, formal operation, doing rather little to vindicate the idea that we have discovered true unity in nature, as Georgi (1989), Maudlin (1996), and Morrison (2000) discuss. Furthermore, one can ask whether this case of unification lends support or actually undermines the idea that unification and explanation are connected (Morrison 2000). Another source of concern is the actual meaning of SSB and the (methodological) reasons, if any, for which we should give priority to symmetric, as opposed to asymmetric, laws (Earman 2003b, 2004b). Qualms are also raised with regard to the empirical grounds for believing in the existence of underlying hidden symmetries given that what we observe is asymmetric phenomena (Morrison 2000, 2003). Topics prompted by the analysis of the Higgs mechanism range from the difficulty of getting a grip on its gauge-invariant content (Earman) to the question of whether we should interpret it realistically or not (Morrison).31

5. Symmetries, Classification, and Prediction

Ontological32 issues such as the ones discussed above are not the only area where symmetry considerations are important. They are also essential in classification and prediction; more precisely, they are instrumental in framing a new form of prediction called, by the physicists themselves, “prediction from multiplet structure” (Lipkin 1966). While not unknown to philosophers of science, this idea has not been fully investigated yet. Below I present it very briefly, by drawing on a concrete example, the prediction of the “omega minus” hadron by Gell-Mann and Ne'eman in 1962.

Since Weyl and Wigner's important successes in applying group theory to quantum mechanics, physicists have tried to replicate their methods and take inspiration from their insights. In particular, internal symmetries have become an indispensable tool in the classification of elementary particles and, as we will see, equally important achievements have followed. Wigner is credited with introducing the idea that the 2-dimensional space defined by the proton and neutron has a mathematical correspondent in the 2-dimensional irreducible representation of the group SU(2). This insight is now exploited more generally: in mathematical terms, an elementary particle is conceived to be a physical system whose possible states transform into each other according to some representation of the appropriate symmetry group; in the case first discussed by Wigner, the group is SU(2) and the specific way in which these transformations take place is described mathematically in terms of the irreducible representations of SU(2).
There is an intuitive way to grasp this relation (Castellani 1998): the elementarity of a particle (system) is mirrored by the irreducibility of the representation of a certain group, where the elements of the group are the transformations that govern the interaction into which the particle enters.33 Given the symmetry group governing a physical system, the superposed states of the system (in particular, the protonness and the neutronness states) transform into each other according to the irreducible representations of the group. But what is the connection between the physics and the mathematics more precisely? At the physical level, there are the transformations (of superposed states), and these transformations are expressed mathematically as operators acting on the state space. The eigenvalues of these operators supply the invariant numbers (to be used as labels) for identifying, or classifying, the irreducible representations of the group.34

Given their similarity, the proton and the neutron were thought to occupy the same place in a scheme of classification. On the face of things, one might wonder why this is important. The answer is that the epistemic context of particle physics is a special one. The taxonomical aspects are far from trivial here because elementary particles, by their very elementary nature, lack the great number of properties displayed by the medium-sized physical objects (color, shape, texture, etc.). Therefore, insofar as these typical classificatory criteria cannot be found in the micro-world, taxonomies in fundamental physics are very hard to come by, so any unambiguous criterion able to contribute to particles’ identification and classification is welcome.35 These criteria are usually
supplied by the particles’ associated sets of quantum numbers (mass, charge, spin, isospin, strangeness, etc.), which describe their conservation properties under various sets of transformations and thus determine their positions in multiplets. These multiplets, in turn, are mathematically determined as bases of irreducible representations of different groups of transformations (such as U(1), SU(2), or SU(3)).

In addition to ordering the multitude of particles, the schemes of classification based on symmetries have been put to a different, and somewhat surprising, use: they have been instrumental in making predictions. Perhaps the most famous one is the prediction of a new hadron, called “Omega minus.” Once the physicists were provided with a multiplet scheme of classification, the prediction of the particles filling out the places left unoccupied in the multiplet came “as a matter of course” (Lipkin 1966, 25–26, 53). Physicists even introduced a term to refer to it, Lipkin calling this idea “prediction from the multiplet structure.”

How does this work? As noted, SU(2) is not the group describing the strong interactions; they are governed by a bigger symmetry, (the flavor) SU(3). Similar to what happens with the SU(2) group, the dimensionalities of SU(3)'s irreducible representations (1, 8, 10, 27, …) give the cardinality of the sets of hadron multiplets. The definitive success of this classificatory strategy came in 1964 with the detection of a new particle that completed the spin-3/2 baryon decuplet. The main idea behind this prediction is rather simple: given the classification scheme for the already known spin-3/2 baryons, the unoccupied, apparently superfluous, entry in the scheme was taken as a guide to the existence of a new particle. It was exactly this “surplus” that suggested the existence of new physical reality (in the form of new particles, to fill in gaps in multiplets).
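Both the list of multiplet sizes and the "gap" in the decuplet can be reproduced with a few lines of arithmetic. The sketch below relies on two standard results that the text does not spell out, so they are flagged as assumptions of the illustration: the dimension formula (p+1)(q+1)(p+q+2)/2 for the SU(3) representation labeled (p, q), and the Gell-Mann–Nishijima relation Q = I3 + (B + S)/2 for the electric charge. All function names are hypothetical.

```python
from fractions import Fraction

def su3_dimension(p, q):
    """Dimension of the SU(3) irreducible representation with labels (p, q)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

def smallest_hadron_multiplets(count=4, max_label=6):
    """Smallest dimensions among irreps with p - q divisible by 3.

    Any irrep with a label above 6 has dimension at least 36, so scanning
    labels up to 6 is enough to get the first four sizes right.
    """
    sizes = {su3_dimension(p, q)
             for p in range(max_label + 1)
             for q in range(max_label + 1)
             if (p - q) % 3 == 0}
    return sorted(sizes)[:count]

def decuplet_slots():
    """(strangeness, I3) positions of the ten-member multiplet: rows of 4, 3, 2, 1."""
    slots = []
    for strangeness in (0, -1, -2, -3):
        isospin = Fraction(3 + strangeness, 2)   # 3/2, 1, 1/2, 0
        i3 = -isospin
        while i3 <= isospin:
            slots.append((strangeness, i3))
            i3 += 1
    return slots

def charge(strangeness, i3, baryon_number=1):
    """Gell-Mann-Nishijima relation Q = I3 + (B + S) / 2."""
    return i3 + Fraction(baryon_number + strangeness, 2)

# The nine known spin-3/2 resonances occupy the rows with strangeness 0, -1, -2.
known = [slot for slot in decuplet_slots() if slot[0] > -3]
missing = [slot for slot in decuplet_slots() if slot not in known]

print(smallest_hadron_multiplets())
for strangeness, i3 in missing:
    print("empty slot:", strangeness, i3, "charge", charge(strangeness, i3))
```

The single empty slot has strangeness −3 and isospin 0, and the charge relation assigns it charge −1, which is the "minus" in the name of the predicted particle.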
Reminiscent of Mendeleev's prediction of new chemical elements from his table, Murray Gell-Mann and Yuval Ne'eman's predictive reasoning can be extracted from the rather detailed account of Ne'eman and Kirsh:36

In 1961 four baryons of spin 3/2 were known. These were the four resonances Δ−, Δ0, Δ+, Δ++ which had been discovered by Fermi in 1952. It was clear that they could not be fitted into an octet, and the eightfold way predicted that they were part of a decuplet or of a family of 27 particles. A decuplet would form a triangle in the S–I3 [strangeness-isospin] plane, while the 27 particles would be arranged in a large hexagon. (According to the formalism of SU(3), supermultiplets of 1, 8, 10 and 27 particles were allowed.) In the same year (1961) the three resonances Σ(1385) were discovered, with strangeness −1 and probable spin 3/2, which could fit well either into the decuplet or the 27-member family. At a conference of particle physics held at CERN, Geneva, in 1962, two new resonances were reported, with strangeness −2, and the electric charge −1 and 0 (today known as the Ξ(1530)). They fitted well into the third course of both schemes (and could thus be predicted to have spin 3/2). On the other hand, Gerson and Shoulamit Goldhaber reported a “failure”: in collisions of K+ or K0 with protons and neutrons, one did not find resonances. Such resonances would indeed be expected if the family had 27 members. The creators of the eightfold way, who attended the conference, felt that this failure clearly pointed out that the solution lay in the decuplet. They saw the pyramid being completed before their very eyes [see figure 8.1].

Figure 8.1 A spin-3/2 baryon decuplet

Only the apex was missing, and with the aid of the model they had conceived, it was possible to describe exactly what the properties of the missing particle should be!
Before the conclusion of the conference Gell-Mann went up to the blackboard and spelled out the anticipated characteristics of the missing particle, which he called “omega minus” (because of its negative charge and because omega is the last letter of the Greek alphabet). He also advised the experimentalists to look for that particle in their accelerators.
Yuval Ne'eman had spoken in a similar vein to the Goldhabers the previous evening and had presented them in a written form with an explanation of the theory and the prediction.

The Gell-Mann and Ne'eman predictive reasoning started off with the observation that each of the upper nine positions in the symmetry scheme37 has a physical interpretation. In other words, each of these positions gives the physical coordinates of a known spin-3/2 baryon. Now, based on this regularity, it was conjectured that a new physical law might be in sight, claiming that the SU(3) scheme describes the correct symmetry of spin-3/2 baryons. But, were this the case, there should have been 10 baryons, not only nine, since the scheme contains 10 positions. So, a question was raised about the existence of the tenth particle. Gell-Mann noticed that the apex is formally/mathematically similar to the other nine positions, and this is so because it is, like them, an element of the scheme. Therefore, he made the prediction that the apex position has a physical interpretation too: in other words, that the coordinates of this position describe a tenth spin-3/2 baryon as well.38 However, as is easy to notice, an extra premise, not explicitly stated in the text, is of course needed in order to complete the reasoning.
This is the idea that the existence of a baryon having the predicted characteristics is not forbidden by the laws of physics, and thus can occur in nature (even if it was not detected so far).39

This line of reasoning is supposed to answer the question asked by an experimentalist ready to perform the detections (and by the politicians in charge of approving the expenses for the accelerators!), namely, “What are the grounds to believe (a) that there is a new entity in nature, and also (b) that this entity has the predicted physical characteristics?” On closer inspection, however, the reasoning involved here is very peculiar. It bears little resemblance to other famous predictions such as Leverrier and Adams's prediction of the planet Neptune in 1846, or Wolfgang Pauli's postulation of the neutrino in 1931. Unlike these two predictions, the omega minus one is essentially formal, because the criteria of similarity are given in mathematical terms (Steiner 1998). The heuristic-ontological role of the mathematical scheme of classification is thus crucial; mathematics does not play merely an eliminable, descriptive, or computational role, since in this case it seems that the predictive argument cannot be reconstructed without invoking the mathematical features of the physical description. This appears to be a rather startling example of the “unreasonable effectiveness of mathematics” (Wigner 1960), though further analysis is certainly needed in order to sort out what is unreasonable about the success of this, and other, applications.40

6. Final Remarks: “The Reversal of a Trend”

While the present survey has touched upon a relatively wide range of arguments and issues, a central theme is worth discussing before the end: the emergence of a new relation, without precedent in the history of physics, between symmetries and the laws of nature, an idea advocated by Wigner, who in turn attributes it to Einstein.
Wigner highlights it in a number of his philosophical writings, a particularly sharp formulation of it being given in his Nobel lecture of 1963.

Wigner begins by drawing attention to an important distinction, which, he suggests, seems to have been first perceived by Newton: roughly, this is the separation between a certain kind of general statement expressing a regularity holding between events (a law of nature), and more specific statements taking the form of descriptions of current states of affairs (the initial conditions).41 To Wigner, the distinction is methodologically crucial: its introduction simply delineates the very object of the physical science, as it amounts to “the specification of the explainable” (1967, 39); moreover, this specification “may have been the greatest discovery of physics so far.” So, according to him, it is wrong to say that physics explains “nature,” if this means that the focus of physical theorizing is on accounting for some specific states of affairs. The aim of physics is rather “to explain the regularities in the behaviour of objects” (1967, 39). He puts this contrast in historical terms, mentioning Kepler, who, as we saw, was also concerned with finding what determines the specific magnitudes of the planetary orbits, in addition to his search for the laws of motion. On the other hand, Newton had restricted his interest to searching only for the explanation of the regularities, the laws of motion (1967, 39–40). If physicists’ interest lies in describing, explaining, and predicting regular behavior, then the separation of laws from initial conditions becomes necessary, as the laws do not also specify their initial conditions.
While the former are “precise beyond anything reasonable,” “we know virtually nothing” about the latter (1967, 40), as they contain “a strong element of randomness” (1967, 41–42).42

Thus, in the Wignerian scheme the object of physics is to discover the laws of nature, which describe regular correlations between events. A new event is predicted and
Symmetry explained if suc h a c orrelation, or law, is known. But, what about the laws themselves? The question is thus whether there might exist “a superprinciple whic h is in a similar relation to the laws of nature as these are to events” (1967, 43, emphasis added). That is, just as the laws c onstrain what events might take plac e in the world, suc h a superprinc iple would c onstrain, or determine, the laws—in other words, it would have the role of a meta-law whic h would somehow explain the laws.4 3 While Wigner never states this heuristic superprinc iple explic itly, it is pretty c lear that he understands it as inc orporating c ertain c onstraints, whic h he c alls “invarianc e princ iples,” (“symmetry princ iples” or “invarianc e transformations”). Wigner explains the role of the superprinc iple as follows. Suppose that events A, B, C,… entail the oc c urrenc e of another event X—this is the general form of a law of nature. The question is whether there are transformations that turn A, B, C into A′, B′, C′,… and X into X′ suc h that if A′, B′, C′,… obtain, then X′ obtains too— so, onc e these transformations are found, new laws of nature are established. Next, he identifies three types of suc h “invarianc e transformations.” First, there are the Euc lidean transformations, where the primed events are identic al to the unprimed ones exc ept for a shift in the loc ation in spac e. In partic ular, the spatial relations within the c onfigurations of primed and unprimed events are retained. Sec ond, there are time displac ements: the primed events are the same as the unprimed ones, but they oc c ur at a different time, and the time intervals in the primed and unprimed c onfigurations are retained. Third, there is the uniform motion transformation: when assessed from the perspec tive of a uniformly moving c oordinate system, the primed events appear to be identic al to the unprimed events (Wigner 1967, 43). 
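Wigner's test can be made concrete with a small numerical sketch. The example below is not drawn from Wigner's text: it assumes a hypothetical toy "law" (two masses on a line joined by a spring), and every name and constant in it (`evolve`, `translate`, `delay`, `boost`, `dilate`, `is_invariance`, the spring parameters) is illustrative. The law maps initial conditions A to a later event X; a transformation counts as an invariance if applying it to A and then evolving gives the same result as evolving A and then applying it to X. Spatial shifts here are only the one-dimensional special case of the Euclidean transformations.

```python
import math

# Hypothetical toy "law of nature": two masses on a line joined by a spring.
# An event is a tuple (t, x1, v1, x2, v2). All constants are illustrative.
K, REST, M1, M2 = 2.0, 1.0, 1.0, 3.0

def evolve(event, duration, steps=2000):
    """Map initial conditions A to the later event X (velocity-Verlet steps)."""
    t, x1, v1, x2, v2 = event
    dt = duration / steps

    def accel(x1, x2):
        force = K * ((x2 - x1) - REST)  # depends only on the separation
        return force / M1, -force / M2

    a1, a2 = accel(x1, x2)
    for _ in range(steps):
        x1 += v1 * dt + 0.5 * a1 * dt * dt
        x2 += v2 * dt + 0.5 * a2 * dt * dt
        b1, b2 = accel(x1, x2)
        v1 += 0.5 * (a1 + b1) * dt
        v2 += 0.5 * (a2 + b2) * dt
        a1, a2 = b1, b2
    return (t + duration, x1, v1, x2, v2)

def translate(a):   # shift in spatial location
    return lambda e: (e[0], e[1] + a, e[2], e[3] + a, e[4])

def delay(tau):     # time displacement
    return lambda e: (e[0] + tau, e[1], e[2], e[3], e[4])

def boost(u):       # view from a uniformly moving (Galilean) frame
    return lambda e: (e[0], e[1] + u * e[0], e[2] + u, e[3] + u * e[0], e[4] + u)

def dilate(s):      # rescaling of lengths; included as a contrast case
    return lambda e: (e[0], s * e[1], s * e[2], s * e[3], s * e[4])

def is_invariance(transform, event, duration, tol=1e-9):
    """True if evolving the primed A yields the primed X, within tolerance."""
    primed_then_evolved = evolve(transform(event), duration)
    evolved_then_primed = transform(evolve(event, duration))
    return all(math.isclose(p, q, abs_tol=tol)
               for p, q in zip(primed_then_evolved, evolved_then_primed))

A = (0.0, 0.0, 0.3, 1.5, -0.2)
for name, g in [("translation", translate(5.0)),
                ("time displacement", delay(7.0)),
                ("uniform motion", boost(0.8)),
                ("dilation", dilate(2.0))]:
    print(name, is_invariance(g, A, duration=2.0))
```

Under these assumptions one would expect the three transformations Wigner lists to pass the check, while the dilation fails because the spring's rest length fixes a length scale. This also illustrates the use of invariance as a filter on candidate laws: a proposed law that failed the first three checks would be rejected.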
Having identified the components of the superprinciple (of which the last invariance was crucial in Einstein's formulation of STR), Wigner points out that “the use of the set of invariance principles which is surely most important at present” is as a test for “the validity of possible laws of nature” (1967, 46). More precisely, “a law of nature can be accepted as valid only if the correlations which it postulates are consistent with the accepted invariance principles” (1967, 46); historically, the first illustration of this insight is Einstein's 1905 construction of STR.⁴⁴ In fact, Wigner stresses that, viewed from this perspective, Einstein's work on STR is methodologically revolutionary: it marks “the reversal of a trend” (1967, 5). Before it, the principles of invariance were seen merely as interesting features to be noted when examining the laws; after it, a new methodological move became available: “it is now natural for us to derive the laws of nature and to test their validity by means of the laws of invariance, rather than to derive the laws of invariance from what we believe to be the laws of nature” (1967, 5).
Even a cursory glance should support the view that symmetry is an extremely generous foundational topic in physics and philosophy. It prompts questions about the relation between formal or mathematical structures and the constitution of the world (does the world really exhibit symmetrical structures, or are they just an artifact of our description of it?), and it also challenges us to make sense of the success of symmetry thinking at a methodological-heuristic level. Regardless of the perspective from which this topic is approached, understanding the power of symmetry is certainly one of the most important tasks for both philosophers of science and philosophically minded physicists.
References
Aitchison, I. J. R., and A. J. G. Hey (1989). Gauge theories in particle physics.
Bristol: IOP Publishing.
Albert, D. (2000). Time and chance. Cambridge, MA: Harvard University Press.
Arntzenius, F., and H. Greaves (2009). Time reversal in classical electromagnetism. British Journal for the Philosophy of Science 60(3): 557–584.
Auyang, S. (1995). How is quantum field theory possible? Oxford: Oxford University Press.
Bangu, S. (2006). Steiner on the applicability of mathematics and naturalism. Philosophia Mathematica 14(1): 26–43.
———. (2008). Reifying mathematics? Prediction and symmetry classification. Studies in History and Philosophy of Modern Physics 39(2): 239–258.
———. (2009). Understanding thermodynamic singularities: Phase transitions, data and phenomena. Philosophy of Science 76(4): 488–505.
Batterman, R. W. (2001). The devil in the details: Asymptotic reasoning in explanation, reduction, and emergence. Oxford: Oxford University Press.
———. (2005). Critical phenomena and breaking drops: Infinite idealizations in physics. Studies in History and Philosophy of Modern Physics 36: 225–244.
———. (2006). On the specialness of special functions (The nonrandom effusions of the divine mathematician). British Journal for the Philosophy of Science 58: 263–286.
———. (2010). On the explanatory role of mathematics in empirical science. British Journal for the Philosophy of Science 61: 1–25.
Belot, G. (1996). Whatever is never and nowhere is not: Space, time, and ontology in classical and quantum gravity. Ph.D. diss., University of Pittsburgh. Available at http://sitemaker.umich.edu/belot/files/dissertation.pdf.
———. (1998). Understanding electromagnetism. British Journal for the Philosophy of Science 49: 531–555.
Brading, K., and H. Brown (2003). Symmetries and Noether's theorems. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 89–109. Cambridge: Cambridge University Press.
———. (2004). Are gauge symmetry transformations observable? British Journal for the Philosophy of Science 55(4): 645–665.
Brading, K., and E. Castellani, eds. (2003a). Symmetries in physics: Philosophical reflections. Cambridge: Cambridge University Press.
———. (2003b). Introduction. In Brading and Castellani (2003a), 1–18.
———. (2007). Symmetry in classical physics. In Handbook of the philosophy of physics, ed. J. Butterfield and J. Earman, 1331–1367. Amsterdam: Elsevier.
Brown, H. (1999). Aspects of objectivity in quantum mechanics. In From physics to philosophy, ed. J. Butterfield and C. Pagonis, 45–71. Cambridge: Cambridge University Press.
Butterfield, J. (1989). The hole truth. British Journal for the Philosophy of Science 40: 1–28.
———. (2006). On symmetries and conserved quantities in classical mechanics.
In Physical theory and its interpretation, ed. W. Demopoulos and I. Pitowsky, 43–99. Dordrecht: Springer.
Callender, C. (2001). Taking thermodynamics too seriously. Studies in History and Philosophy of Modern Physics 32: 539–553.
Castellani, E. (1998). Galilean particles: An example of constitution of objects. In Interpreting bodies: Classical and quantum objects in modern physics, ed. E. Castellani, 181–194. Princeton: Princeton University Press.
Christenson, J., J. Cronin, V. Fitch, and R. Turlay (1964). Evidence for the 2π decay of the K Meson. Physical Review Letters 13: 138.
Coughlan, G. D., and J. E. Dodd (1991). The ideas of particle physics. 2d ed. Cambridge: Cambridge University Press.
Curie, P. (1894). Sur la symétrie dans les phénomènes physiques: Symétrie d'un champ électrique et d'un champ magnétique. Journal de Physique 3(1): 393–417.
Debs, T., and M. Redhead (2007). Objectivity, invariance, and convention: Symmetry in physical science. Cambridge, MA: Harvard University Press.
Dyson, F. (1964). Mathematics in the physical sciences. Scientific American 211(3): 129–146.
Earman, J. (2003a). Tracking down gauge: An ode to the constrained Hamiltonian formalism. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 140–162. Cambridge: Cambridge University Press.
———. (2003b). Rough guide to spontaneous symmetry breaking. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 334–345. Cambridge: Cambridge University Press.
———. (2004a). Curie's principle and spontaneous symmetry breaking. International Studies in the Philosophy of Science 18: 173–198.
———. (2004b). Laws, symmetry, and symmetry breaking: Invariance, conservation principles, and objectivity. Philosophy of Science 71(5): 1227–1241.
Earman, J., and J. Norton (1987). What price spacetime substantivalism? The hole story. British Journal for the Philosophy of Science 38: 515–525.
Feynman, R. (1985). QED. Princeton: Princeton University Press.
Francoise, J. L., G. L. Naber, and T. S. Tsun, eds. (2006). Encyclopedia of mathematical physics. Berlin: Springer.
French, S. (1998). On the withering away of physical objects. In Interpreting bodies: Classical and quantum objects in modern physics, ed. E. Castellani, 93–113. Princeton: Princeton University Press.
———. (2000). The reasonable effectiveness of mathematics: Partial structures and the application of group theory to physics. Synthese 125: 103–120.
———. (2008). Identity and individuality in quantum theory. In The Stanford encyclopedia of philosophy (Fall 2008 edition), ed. E. Zalta. Available at http://plato.stanford.edu/archives/fall2008/entries/qt-idind/.
French, S., and J. Ladyman (2010). In defence of ontic structural realism. In Scientific structuralism, ed. P. Bokulich and A. Bokulich, 25–43. Boston: Boston Studies in the Philosophy of Science, Springer.
French, S., and D. Rickles (2003). Understanding permutation symmetry. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 212–238. Cambridge: Cambridge University Press.
Gell-Mann, M., and Y. Ne'eman, eds. (1964). The eightfold way.
New York: W. A. Benjamin.
Georgi, H. (1989). Grand unified theories. In The new physics, ed. P. Davies, 425–446. New York: Cambridge University Press.
Goldstone, J. (1961). Field theories with superconductor solutions. Nuovo Cimento 19: 154–164.
Goldstone, J., et al. (1962). Broken symmetries. Physical Review 127: 965–970.
Greaves, H. (2010). Towards a geometrical understanding of the CPT Theorem. British Journal for the Philosophy of Science 61: 27–50.
Gunion, J. F., A. Stange, and S. Willenbrock (1996). Weakly-coupled Higgs bosons. In Electroweak symmetry breaking and the new physics at the TeV scale, ed. T. L. Barklow, S. Dawson, H. E. Haber, and J. L. Siegrist, 23–124. Singapore: World Scientific Publishing.
Healey, R. (2007). Gauging what's real: The conceptual foundations of contemporary gauge theories. Oxford: Oxford University Press.
Higgs, P. W. (1964). Broken symmetries and the masses of gauge bosons. Physical Review Letters 13: 508–509.
Hoefer, C. (2000). Kant's hands and Earman's pions: Chirality arguments for substantival space. International Studies in the Philosophy of Science 14: 237–256.
Hon, G., and B. R. Goldstein (2008). From summetria to symmetry: The making of a revolutionary scientific concept. Berlin: Springer.
Huggett, N. (2003). Mirror symmetry: What is it for relational space to be orientable? In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 281–289. Cambridge: Cambridge University Press.
Icke, V. (1995). The force of symmetry. Cambridge: Cambridge University Press.
Ismael, J. (1997). Curie's principle. Synthese 110: 167–190.
Joshi, A. W. (1982). Elements of group theory for physicists. 3d ed. Hoboken, NJ: John Wiley & Sons.
Kosso, P. (2000). The empirical status of symmetries in physics. British Journal for the Philosophy of Science 51(1): 81–98.
———. (2003). Symmetry, objectivity, and design. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 410–421. Cambridge: Cambridge University Press.
Ladyman, J. (2009). Structural realism. In The Stanford encyclopedia of philosophy (Summer 2009 edition), ed. E. Zalta. Available at http://plato.stanford.edu/archives/sum2009/entries/structural-realism/.
Lanczos, C. (1949). The variational principles of mechanics. Toronto: University of Toronto Press.
Lange, M. (2007). Laws and meta-laws of nature: Conservation laws and symmetries. Studies in History and Philosophy of Modern Physics 38: 457–481.
Lipkin, H. (1966). Lie groups for pedestrians. Amsterdam: North-Holland Publishing Company.
Liu, C. (2001). Infinite systems in SM explanations: Thermodynamic limit, renormalization (semi-)groups, and irreversibility. Philosophy of Science 68: S325–S344.
———. (2003). Spontaneous symmetry breaking and chance in a classical world. Philosophy of Science 70: 590–608.
Lyre, H. (2008). Does the Higgs mechanism exist? International Studies in the Philosophy of Science 22(2): 119–133.
Maddy, P. (2007). Second philosophy. Oxford: Oxford University Press.
Mainzer, K. (2005). Symmetry and complexity: The spirit and beauty of nonlinear science. Singapore: World Scientific Publishing.
Malament, D. (2004).
On the time reversal invariance of classical electromagnetic theory. Studies in History and Philosophy of Modern Physics 35B(2): 295–315.
Martin, C. (2003). On continuous symmetries and the foundations of modern physics. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 29–60. Cambridge: Cambridge University Press.
Maudlin, T. (1996). On the unification of physics. Journal of Philosophy 93: 129–144.
Mills, R. (1989). Gauge fields. American Journal of Physics 57: 493.
Morrison, M. (2000). Unifying scientific theories: Physical concepts and mathematical structures. Cambridge: Cambridge University Press.
———. (2003). Spontaneous symmetry breaking: Theoretical arguments and philosophical problems. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 346–362. Cambridge: Cambridge University Press.
———. (2008). Symmetry. In The Routledge companion to philosophy of science, ed. S. Psillos and M. Curd, 468–478. London: Routledge.
Ne'eman, Y., and Y. Kirsh (1996). The particle hunters. 2d ed. Cambridge: Cambridge University Press.
Nerlich, G. (1994). The shape of space. 2d ed. Cambridge: Cambridge University Press.
Noether, E. (1918). Invariante Variationsprobleme. Königliche Gesellschaft der Wissenschaften zu Göttingen. Mathematisch-Physikalische Klasse. Nachrichten, 235–257. Trans. M. A. Tavel (1971). Invariant variation problems. In Transport Theory and Statistical Physics, I, 186–207.
Norton, J. (2004). Einstein's investigations of Galilean covariant electrodynamics prior to 1905. Archive for History of Exact Sciences 59: 45–105.
———. (2008). The hole argument. In The Stanford encyclopedia of philosophy (Winter 2008 edition), ed. E. Zalta. Available at http://plato.stanford.edu/archives/win2008/entries/spacetime-holearg/.
Nozick, R. (2001). Invariances: The structure of the objective world. Cambridge, MA: Harvard University Press.
Pickering, A. (1998). Against putting the phenomena first: The discovery of the weak neutral current. In Scientific knowledge: Basic issues in the philosophy of science, ed. J. Kourany, 135–152. Stamford, CT: Wadsworth Publishing Company. Reprinted from Studies in History and Philosophy of Science 15(2): 85–117.
Pincock, C. (2010). Applicability of mathematics. In Internet encyclopedia of philosophy. Available at http://www.iep.utm.edu/math-app/.
Pooley, O. (2003). Handedness, parity violation, and the reality of space. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 250–280. Cambridge: Cambridge University Press.
Quigg, C. (1983). Gauge theories of the strong, weak, and electromagnetic interactions. Reading, MA: Addison-Wesley.
Redhead, M. (2003). The interpretation of gauge symmetry. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 124–140. Cambridge: Cambridge University Press.
Robson, E. (2008). Mathematics in Ancient Iraq: A social history. Princeton: Princeton University Press.
Ruetsche, L. (2006). Johnny's so long at the ferromagnet. Philosophy of Science 73: 473–486.
———. (2003). A matter of degree: Putting unitary inequivalence to work.
Philosophy of Science 70: 1329–1342.
Ryckman, T. A. (2003). The philosophical roots of the gauge principle: Weyl and transcendental phenomenological idealism. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 61–88. Cambridge: Cambridge University Press.
———. (2008). Invariance principles as regulative ideals: From Wigner to Hilbert. Royal Institute of Philosophy Supplement 83: 63–80.
Saunders, S. (2002). Indiscernibles, general covariance, and other symmetries. In Revisiting the foundations of relativistic physics: Festschrift in honour of John Stachel, ed. A. Ashtekar, D. Howard, J. Renn, S. Sarkar, and A. Shimony, 151–173. Dordrecht: Kluwer.
Schumm, B. (2004). Deep down things. Baltimore, MD: Johns Hopkins University Press.
Sklar, L. (1993). Physics and chance. Cambridge: Cambridge University Press.
Smith, S. (2008). Symmetries and the explanation of conservation laws in the light of the inverse problem in Lagrangian mechanics. Studies in History and Philosophy of Modern Physics 39: 325–345.
Steiner, M. (1998). The applicability of mathematics as a philosophical problem. Cambridge, MA: Harvard University Press.
———. (2005). Mathematics—Application and applicability. In Oxford handbook of philosophy of mathematics and logic, ed. S. Shapiro, 625–661. Oxford: Oxford University Press.
Teller, P. (2000). The gauge argument. Philosophy of Science 67: S466–S481.
van Fraassen, B. C. (1989). Laws and symmetry. Oxford: Oxford University Press.
———. (1991). Quantum mechanics: An empiricist view. Oxford: Clarendon Press.
Wald, R. M. (1986). Spin-two fields and general covariance. Physical Review D 33: 3613–3625.
Weinberg, S. (1974). Recent progress in gauge theories of the weak, electromagnetic and strong interactions. Reviews of Modern Physics 46: 255–277.
Weyl, H. (1952). Symmetry. Princeton: Princeton University Press.
Wigner, E. (1959). Group theory and its application to the quantum mechanics of atomic spectra. London: Academic Press.
———. (1960). The unreasonable effectiveness of mathematics in the natural sciences. Communications in Pure and Applied Mathematics 13(1): 1–14. Reprinted in Wigner (1967), 222–238.
———. (1963). Events, laws of nature, and invariance principles. Nobel Lecture, December 12, 1963. Reprinted in Science 145(3636): 995–999 and Wigner (1967), 38–51. References are given to the 1967 version.
———. (1967). Symmetries and reflections. Bloomington: Indiana University Press.
Wilkinson, D. H., ed. (1969). Isospin in nuclear physics. Amsterdam: North-Holland Pub. Co.
Wilson, M. (2000). The unreasonable uncooperativeness of mathematics in the natural sciences. The Monist 83(2): 296–314.
Wu, C. S., E. Ambler, R. W. Hayward, D. D. Hoppes, and R. P. Hudson (1957). Experimental test of parity conservation in beta decay. Physical Review 105(4): 1413–1415.
Yang, C. N., and R. Mills (1954). Conservation of isotopic spin and isotopic gauge invariance. Physical Review 96(1): 191–195.
Zee, A. (2007). Fearful symmetry: The search for beauty in modern physics. Princeton: Princeton University Press.
Notes:
(1) For discussions of symmetry before the Greeks, see Robson (2008, 43–48) on Mesopotamian mathematics, third and early second millennia B.C. Mainzer (2005, especially ch. 1) surveys the role of symmetry in a variety of cultural spaces.
Hon and Goldstein (2008) aim to rewrite the conceptual history of the idea of symmetry, taking issue with a series of received views.
(2) This is also known as the “equal areas” law: a segment connecting the Sun and a planet on an elliptical orbit sweeps out equal areas in equal time intervals.
(3) Dyson (1964, 129) reports a conversation between O. Veblen and J. Jeans in 1910 about the reformation of the mathematics curriculum at Princeton. Jeans was of the opinion that “we may as well cut out group theory. This is a subject which will never be of any use in physics.”
(4) This follows Lanczos (1949, 229–239), which contains the technical details.
(5) Excellent surveys of the symmetry theme are already available, a few by philosophers (e.g., Brading and Castellani 2003b, 2007; Morrison 2008) and many more by physicists (Coughlan and Dodd 1991, ch. 6; Icke 1995; entries in Francoise, Naber, and Tsun 2006; Zee 2007, etc.), so a challenge for the present survey is to minimize the overlap. The existence of common themes is, however, unavoidable.
(6) The name of the principle comes from the French physicist Pierre Curie, who formulated it while working on the properties of crystals. See Curie (1894). For recent discussions, see Ismael (1997), Earman (2004a), and Brading and Castellani (2007, sect. 2).
(7) See Sklar (1993) for a lucid exposition.
(8) The theory of this type of interaction cannot yet explain a key observed asymmetry, namely why there is significantly more matter than antimatter in the universe.
(9) Coughlan and Dodd (1991, 44–49) provide further technical details. Parity violation (demonstrated experimentally by a team led by Wu, in 1957) has reignited interest in the discussions on the structure of physical space and the nature of chiral objects, going back to Kant's attempts to account for the difference between the “incongruent counterparts” by appeal to their relation to absolute space (“incongruent counterparts” are objects that are mirror images of each other but are not superposable through rigid motion, e.g., a right and left glove). See Nerlich (1994) for an introduction and Hoefer (2000), Huggett (2003), and Pooley (2003) for recent discussions.
(10) One can recall here Feynman's famous proposal to understand antiparticles as particles moving backward in time, or, in other words, that the time-reversal operation applied to a particle state would turn it into the corresponding antiparticle state (Feynman 1985). For details and criticism, see Arntzenius and Greaves (2009). This discussion is related to an earlier debate between Albert and Malament with regard to classical electromagnetism. Albert (2000) argued that classical electromagnetism is not time-reversal invariant, while Malament (2004) defended the standard view, according to which the theory does possess this feature.
(11) Noether's original 1918 paper in fact contains two theorems. The first one, briefly discussed here, concerns the (“global”) invariance of the action under a Lie group characterized by a finite number of parameters; the second (“local”) concerns an infinite dimensional Lie group. For the full version of the two theorems, see p.
3 of Tavel's (1971) translation of Noether (1918) at http://arxiv.org/PS_cache/physics/pdf/0503/0503066v1.pdf.
(12) In addition to this, the Lagrangian framework permits a more natural passage to quantum mechanics, which was in fact worked out by Feynman in the 1940s, in his path integral formalism.
(13) For recent work on the philosophical implications of the Lagrangian formalism for symmetries and other related issues, see Butterfield (2006) and Smith (2008).
(14) Interesting questions to ask here are (i) to what extent gauge symmetries are actually observable, as well as (ii) whether it makes sense to apply to them the distinction between the active and passive ways of understanding a transformation (introduced in section 2.1). See Kosso (2000) and Brading and Brown (2004) for a discussion.
(15) However, mass and lifetime are only indirectly linked. (The proton is heavy and lives more than $10^{28}$ years.) To be more precise, it is the Compton wavelength that is inversely proportional to the mass of the particle ($\lambda = h/mc$), and thus directly linked to its range.
(16) Moreover, it is demanded that we write as simple a theory as possible, hoping that the notorious vagueness of the simplicity constraint can somehow be satisfied.
(17) Note that it is the “kinetic” component $-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$ of the Lagrangian for the full theory (also featuring an “interacting” component), which, as Martin nicely puts it, “imbues the field with its own existence, accounting for the presence of non-zero electromagnetic fields, for the propagation of free photons” (2003, 43). See Quigg (1983, 45–48) for the technical details. But, technicalities aside, one of the main complaints against this standard story has been that this generation talk is misleading, as the gauge field is put in by hand. For discussion, see Brown (1999). A number of further issues arise, having to do with the (in)determinist character of a gauge theory.
A source of concern is the identification of those quantities that are actually “physical,” as opposed to mere artifacts of description. The discussions in the literature focus on Einstein's “hole argument” (Earman and Norton 1987; Butterfield 1989; Belot 1996, esp. chs. 5, 6, 7; 1998; Saunders 2002; etc.; for a recent introduction, see Norton 2008). Equally pressing is the question about the right ontological interpretation that should be given to those quantities that are not gauge-invariant, the so-called (by Redhead 2003) “surplus structure.”
(18) We can say this with the benefit of hindsight; electromagnetism, as a theory, of course pre-dates gauge.
(19) This presentation follows Wilkinson (1969).
(20) The current explanation of the difference is in terms of the mass difference between the up and down quark.
(21) See French (2008) for identity and individuality in quantum theory, and French and Rickles (2003) on some subtleties of the permutation symmetry.
(22) To clarify: Gell-Mann's so-called “Eightfold Way” SU(3)-based theory mentioned at the beginning of this paragraph is not QCD as developed later on. The degrees of freedom of the Eightfold Way are not the degrees of freedom of the SU(3)-based QCD, though the group is the same, SU(3). This later theory postulates three different types of strong-force charge (the red, green, and blue quarks). The former SU(3) space (where only global invariance is required) is a different entity from the SU(3) space of strong charge, which is under the constraint of local (“gauge”) invariance. Within the former theory, we only categorize nonfundamental collections of quarks (for more, see section 5). It is the latter theory which is the currently accepted dynamical account of the strong nuclear force. Yang and Mills (see the previous paragraph) attempted to make a dynamical theory out of the SU(2) isospin space, but we can now see that this is clearly wrong-headed, since protons and neutrons are not fundamental particles.
(23) This account follows Coughlan and Dodd (1991).
(24) Part of this story provided social constructivists with a case to uphold their position. As we will see below (next footnote), the experimental demonstration of the so-called “weak neutral currents” would have corroborated the unified model. Analyzing this episode, Andy Pickering (1998, 136) writes: “There I argue that the acceptability of the weak neutral current (and hence of the associated interpretative practices) was determined by the opportunities its existence offered for future experimental and theoretical practice in particle physics.
Quite simply, particle physicists accepted the existence of the neutral current because they could see how to ply their trade more profitably in a world in which the neutral current was real. The key idea here is that of a symbiotic relationship between experimenters and theorists, the two distinct professional groupings within particle physics.”
(25) Slightly more technically, the situation is as follows. After the development of the U(1) × SU(2) GSW unified (“standard”) model (SM), one should distinguish between the photon of the electromagnetic interaction (as described in quantum electrodynamics, QED) and the quantum of the U(1) symmetry of the Standard Model, which is the so-called $B^0$ field. A somewhat similar point holds for the quanta of weak interaction; the SU(2) field quanta are the previously known $W^+$ and $W^-$ weak field quanta, but the GSW model also predicts a third, neutral $W^0$. Experimental results (such as the impossibility to tell whether a certain interaction is the result of exchanging $B^0$s or $W^0$s) led to the idea that those interactions leaving the electrical charge of the particles involved unchanged must take place through the exchange of a composite of the two neutral quanta. Remarkably, the model combines precise amounts of $B^0$ and $W^0$ and recovers the properties of the (neutral) photon of quantum electrodynamics. The percentage of each in the mixture is known, being given by the Weinberg mixing angle $\theta_w$. For the photon (call it $A$), $A = W^0 \sin\theta_w + B^0 \cos\theta_w$. The “leftovers” of each $B^0$ and $W^0$ make up a new electrically neutral field quantum which is responsible for the weak nuclear interaction as we observe it: the $Z^0$ boson, where $Z^0 = W^0 \cos\theta_w + B^0 \sin\theta_w$. (For more details, see Coughlan and Dodd 1991, 100; as they explain it, the masses of the $W$s and of $Z^0$s depend on $\theta_w$, which is such that $\sin^2\theta_w \approx 0.23$.)
Note, however, that the $B^0$ field quantum is “unphysical.” If, for instance, one would somehow manage to produce a $B^0$, it would decay rapidly, and what one would see in the detector is either that it lives forever (as a photon), or it decays quickly as a $Z^0$. The experimental discovery (in 1973) that weak interactions can also take place via the exchange of a neutral weak field quantum is the topic of the essay by Pickering mentioned in the previous footnote.
(26) See Higgs (1964). A number of other physicists (R. Brout, F. Englert, G. Guralnik, C. R. Hagen, and T. Kibble) have presented ideas similar to Higgs's.
(27) The specific choice is dictated by the so-called “Higgs potential.”
(28) Liu (2003) discusses SSB in the classical context.
(29) Additional difficulties occur in the matter of SSB because only systems with infinite degrees of freedom can undergo such “phase transitions.” For philosophical discussion, see Liu (2001), Batterman (2001, 2005), Callender (2001), Ruetsche (2003, 2006), and Bangu (2009).
(30) More precisely, the Goldstone boson becomes the third polarization state of the mass-acquiring boson. The technical details on which this account draws can be found in the textbooks (e.g., Quigg 1983; Aitchison and Hey 1989; Coughlan and Dodd 1991).
(31) More recent work on the Higgs mechanism is Lyre (2008).
(32) This section draws on Bangu (2008).
(33) The literature on the (Wignerian) group theoretic approach to the constitution of physical objects has been growing in the last decade, when a variety of approaches have been attempted. See Castellani (1998) and especially the work on ontic structural realism by French (1998), French and Ladyman (2010), and Ladyman (2009), esp. sect. 4 and the bibliography therein.
(34) In particular, physicists associate these labels (e.g., −1/2 and +1/2 in the isospin case) with the values of the invariant properties (isospin) characterizing physical systems (in this case, the doublet neutron-proton). Wigner (1959) derives a formula that encodes the general form of the representations. For a more modern approach, see Joshi (1982, 131).
(35) The diversity and the large number of particles had always bothered high-energy physicists. Willis Lamb voiced this uneasiness in his Nobel speech, in which he reminded the public of a popular saying in the particle physics community: anyone who discovers a new particle ought to be punished by a $10,000 fine (instead of being awarded a Nobel Prize!).
(36) From Ne'eman and Kirsh (1996, 202–203). For more details on Gell-Mann and Ne'eman's work, see their (1964). This collection also contains the Brookhaven experimental report “Observation of a Hyperon with Strangeness Minus Three” (Phys. Rev. Letters 12 (1964)), which describes the details of the detection of the omega minus.
(37) More precisely, the “scheme” refers to the “10-dimensional representation of the group SU(3)” pictured above.
(38) Its mass is 1672 MeV, strangeness −3, spin 3/2, and 0 isospin in the z-direction.
(39) The so-called “totalitarian principle” (attributed to Gell-Mann), according to which “what is not forbidden must occur,” is well known in the particle physics community.
It is unclear, however, whether this dictum (reminiscent of the ancient Principle of Plenitude, which states, roughly, that given an infinite time, all genuine possibilities actualize) played an important role in this episode. Its converse (when it seems that an event can happen but it does not, look for a conservation law that precludes it) is also a well-known heuristic tool.
(40) See Steiner (1998, 2005), French (2000), Wilson (2000), Bangu (2006), Maddy (2007, part IV.2), Batterman (2006, 2010), and Pincock (2010) for a variety of recent perspectives on this issue.
(41) See Ryckman (2008) for discussion.
(42) Wigner also discusses to what extent the initial conditions are arbitrary; see pp. 40–41.
(43) For symmetries as meta-laws, see Lange (2007).
(44) See Norton (2004) and the bibliography therein for historical details and philosophical discussion of Einstein's methodology prior to, and related to, his 1905 STR paper.

Sorin Bangu
Sorin Bangu is Associate Professor of Philosophy at the University of Bergen, Norway. He received his Ph.D. from the University of Toronto and has previously been a postdoctoral fellow at the University of Western Ontario and a fixed-term lecturer at the University of Cambridge, Department of History and Philosophy of Science. His main interests are in philosophy of science (especially philosophy of physics, mathematics, and probability) and later Wittgenstein. He has published extensively in these areas and has recently completed a book manuscript on the metaphysical and epistemological issues arising from the applicability of mathematics to science.
Symmetry and Equivalence
Gordon Belot
The Oxford Handbook of Philosophy of Physics
Edited by Robert Batterman

Abstract and Keywords
This chapter considers the issues of symmetry and physical equivalence or invariance. It suggests that if the notions of symmetry and equivalence coincide, then there is a tight connection between a purely formal conception of the symmetries of a theory and a methodological/interpretive conception of what it is for two solutions to represent the same physical state of affairs. The chapter also describes different ways for making precise the notion of the symmetries of a classical theory.
Keywords: symmetry, physical equivalence, invariance, methodological/interpretive conception, classical theory

1. Introduction
My topic is the relation between two notions, that of a symmetry of a physical theory and that of the physical equivalence of two solutions or models of such a theory. In various guises, this topic has been widely addressed by philosophers in recent years.1 As I intend to use the term here, a symmetry of a theory is a map or transformation that leaves invariant the structure used to encode the laws of the theory. The notion of physical equivalence that I have in mind is as follows: two solutions (models) of a physical theory are physically equivalent if and only if, for each possible physical situation, the two are equally well- or ill-suited to represent that situation. The first of these notions is a formal one, in the sense that in specifying the formalism of a theory, one specifies its symmetries. The second is an interpretative notion: two solutions that are physically equivalent according to one interpretation of a given formalism may be inequivalent according to another. Part of the interest in these notions among philosophers derives from the following line of thought.
In an influential discussion, John Earman argued that in the special case of spacetime symmetries, facts about symmetries place interesting constraints on good interpretative practice.2 Consider the case of time translation by b temporal units. In the first instance, this operation is a map that takes each point of spacetime to a point of spacetime b units later than itself. But we can also think of time translation as acting in an obvious way on fields living on spacetime. Earman observes that in this context, a transformation like time translation is a symmetry of a theory if and only if it maps solutions of the theory to solutions of the theory. And he goes on to argue that under any reasonable interpretation, a transformation of this kind ought to be a symmetry of the theory if and only if it is a symmetry of the geometric structure of spacetime.3 In particular, it follows from Earman's arguments that if one focuses on theories in which spacetime symmetries are the only relevant symmetries, then under any reasonable interpretation, solutions are physically equivalent if and only if they are related by a symmetry. In this way formal facts place interesting constraints on (good) interpretation. It is natural to wonder how much of this picture carries over if one moves beyond spacetime symmetries. Much
recent philosophical writing on symmetries and related topics seems to assume that the answer is: all of it.4 For the following two doctrines play an important (if often implicit) role in this literature.

D1. The symmetries of a classical theory are those transformations that map solutions of the theory's equation of motion to solutions of the theory's equation of motion.
D2. Two solutions of a classical theory's equation of motion are related by a symmetry if and only if they are physically equivalent, in the sense that they are equally well- or ill-suited to represent any particular physical situation.

Here D1 is a formal condition, D2 a constraint on good interpretative practice. Combined, they would yield a strong and interesting constraint on (good) interpretation. However, it is not difficult to see that the combination of D1 and D2 is an unhappy one. Consider any classical theory. Let u1 and u2 be solutions of the theory's equation of motion. Consider the transformation T on the space of solutions that maps u1 to u2, u2 to u1, and leaves every other solution where it is. According to the first doctrine above, T is a symmetry of the theory. The second doctrine above then implies that u1 and u2 are physically equivalent. So the two doctrines jointly imply that any two solutions of any classical theory are physically equivalent, and hence that each classical theory is more or less useless because it is unable to discriminate among the systems for which it provides good models.5 There are many strategies for encoding the laws of a theory in a mathematical form, and each such strategy is associated with a formal notion of symmetry. D1 encapsulates the notion of symmetry associated with the most spare and naïve method of encoding the laws of a theory. It is natural to wonder whether some more sophisticated relative of D1 might form a fruitful combination with D2.
There has been some recent discussion of this question among philosophers, with some authors adopting an optimistic view, arguing or assuming that some more sophisticated relative of D1 will do the trick, and some authors adopting a pessimistic view, according to which the notion of physical equivalence is not closely linked with a formal notion of symmetry (even when we restrict to well-behaved interpretations).6 My project here is to examine the viability of D1 and D2 and to say a bit about the source of their prima facie plausibility. In the next section, I briefly sketch the framework in which I will proceed. Section 3 is devoted to a discussion of D1 and its inadequacies. Sections 4 and 5 give a brief overview of some of the more sophisticated alternatives to D1. Section 6 argues that none of the notions of symmetry considered combine satisfactorily with D2. Section 7 presents my cautiously pessimistic conclusions.

2. Stage-Setting
Consider a classical theory in which a history of a physical system is represented by a function u : M → W; here M is the manifold parameterized by the independent variables of the theory and W is the manifold parameterized by the dependent variables of the theory.7 Thus M might correspond to time and W to the space of possible configurations of some system of particles. Or M might correspond to spacetime and W to the space of values available to some field at each point of spacetime. The laws of the theory are given by an equation of the form Δ[u] = 0, where Δ is a (linear or nonlinear) differential operator.8 The job performed by such an equation is to single out from a large space K of functions that represent “kinematically possible” histories of the system those that represent dynamically possible histories. Typically, K is specified by specifying the independent and dependent variables of the theory and the degree of regularity (smoothness etc.)
of candidate functions, as well as any (asymptotic) boundary conditions that they must satisfy. We call the subspace S ⊂ K consisting of those u that satisfy the equation of the theory the space of solutions.

3. A Recipe for Disaster
Here is a recipe for arriving at the characterization of symmetries of classical theories that is embodied in the doctrine D1 discussed above: begin with the standard abstract notion of a symmetry; reflect on the case of spacetime symmetries; extrapolate.
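Before following the recipe, it may help to see the stage-setting in a concrete case. The following is a minimal sketch, not part of the chapter's own apparatus: it assumes a one-dimensional harmonic oscillator with unit mass and frequency, writes its equation of motion as Δ[u] = u'' + u = 0 (the bracket notation and the names `residual`, `is_solution`, and `time_translate` are illustrative assumptions), and approximates the derivative numerically. It checks that one kinematically possible history lies in S while another does not, and that time translation carries the solution to a solution.

```python
import math

def residual(u, t, h=1e-3):
    """Approximate Delta[u](t) = u''(t) + u(t) using a central difference."""
    second = (u(t + h) - 2.0 * u(t) + u(t - h)) / (h * h)
    return second + u(t)

def is_solution(u, samples=(-2.0, -0.5, 0.0, 1.0, 3.0), tol=1e-4):
    """Crude membership test for S: the residual is small at a few sample times."""
    return all(abs(residual(u, t)) < tol for t in samples)

def time_translate(u, b):
    """Time translation by b acting on histories: u(t) -> u(t - b)."""
    return lambda t: u(t - b)

# Two kinematically possible histories (smooth functions of time).
oscillating = lambda t: 2.0 * math.cos(t) + 0.5 * math.sin(t)  # dynamically possible
parabola = lambda t: t * t                                      # kinematically possible only

print(is_solution(oscillating))                       # membership in S
print(is_solution(parabola))                          # not in S
print(is_solution(time_translate(oscillating, 0.7)))  # translation preserves S
```

The sampling-based test is only a heuristic stand-in for "satisfies the equation"; it is enough to display the division of K into solutions and non-solutions on which the rest of the discussion relies.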
Symmetries, abstractly speaking. There is a basic notion of symmetry, elaborated in various ways in the various branches of mathematics. Consider a structure: a set of objects, D, equipped with some relations, R1, …, and functions, f1, …. A symmetry of this structure is a one-to-one and onto map F : D → D that preserves all of the relations and functions between the objects.9

Spacetime symmetries. Suppose that we want to check whether a given field theory is invariant under a spacetime transformation such as time-translation. Then all we have to do is to check to make sure that whenever u(x, t) is a solution of the theory's equation of motion, so is u(x, t − b), for any real number b. That is, we define in the obvious way an operator on the set of kinematically possible fields that implements the putative symmetry, then check to see whether it maps solutions to solutions.

Generalization. Suppose that we think of a classical theory as a structure consisting of a large set of functions K with a distinguished subset S. Then we would expect the symmetries of such a theory to be those one-to-one and onto maps from K to itself that preserve S. Spacetime symmetries fit this pattern exactly! This suggests that we ought to think of classical theories in the way just outlined.

Something has gone wrong. Following this recipe leads to the following

Fruitless Definition: A symmetry of a differential equation Δ is a one-to-one and onto map T : u ∈ K ↦ T(u) ∈ K that preserves the space of solutions, in the sense that u ∈ S if and only if T(u) ∈ S.

As an attempt to capture the ordinary notion of a symmetry of an equation (and hence of a theory), this is a disaster. Ordinarily, symmetries of theories are hard to come by. But some remarkable theories have atypically large symmetry groups (relative to the number of degrees of freedom of the systems that they treat).
The definition above effaces this sort of distinction between theories. For if we allow arbitrary permutations of the solutions of a theory to count as symmetries, then the size of a theory's group of symmetries depends only on the size of its space of solutions.10 All of this is liable to strike fans of the Fruitless Definition as cheap and misleading: after all, since the spaces involved have topological and differential structure, surely one should restrict attention to continuous or smooth transformations of the space of kinematic possibilities? This is fair enough, but it would not make much difference. At best, it would lead us to identify the symmetries of a theory with something like the family of smooth permutations of the space of solutions, rather than with the full family of permutations of that space.11 But this would still lead us to the conclusion that every theory had an enormous group of symmetries that depended only on relatively coarse features of its space of solutions.12

4. Symmetries of Differential Equations
Laws of classical physics are given by differential equations, and there are a number of ways that a differential equation can be encoded in a structure whose symmetries can be investigated. D1 corresponds to the most flat-footed of these. In this section and the next we briefly survey some more sophisticated alternatives: the focus in this section is on approaches that directly encode the equation of motion of the theory in a structure whose symmetries can be identified with the symmetries of the theory; in the next section, we consider approaches that take a detour via a Lagrangian or Hamiltonian formulation of the theory.
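Before turning to the alternatives, the flat-footedness of D1 can be made vivid with a finite toy model. The sketch below is an illustration under stated assumptions, not anything drawn from the literature: the "theory" is a hypothetical five-element set K of labelled histories with a three-element solution set S, and the names `swap` and `is_d1_symmetry` are invented for the example. It checks that the map exchanging any two solutions and fixing everything else counts as a symmetry by the lights of D1, while a map exchanging a solution with a non-solution does not.

```python
def swap(u1, u2):
    """The map T that exchanges u1 and u2 and leaves every other history alone."""
    def T(u):
        if u == u1:
            return u2
        if u == u2:
            return u1
        return u
    return T

def is_d1_symmetry(T, K, S):
    """D1-style test: T is a bijection of K, and u is in S iff T(u) is in S."""
    image = [T(u) for u in K]
    bijective = sorted(image) == sorted(K)
    preserves_solutions = all((u in S) == (T(u) in S) for u in K)
    return bijective and preserves_solutions

# Hypothetical toy theory: histories are just labels.
K = ["rest", "slow", "fast", "junk1", "junk2"]  # kinematically possible histories
S = {"rest", "slow", "fast"}                     # dynamically possible histories

# Every pairwise swap of solutions passes the D1 test ...
all_swaps_pass = all(is_d1_symmetry(swap(a, b), K, S) for a in S for b in S)
# ... whereas swapping a solution with a non-solution fails it.
mixed_swap_passes = is_d1_symmetry(swap("rest", "junk1"), K, S)

print(all_swaps_pass, mixed_swap_passes)
```

Combined with D2, the first result would make "rest", "slow", and "fast" pairwise physically equivalent, which is the trivialization that motivates looking for a more discriminating notion of symmetry.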
There is no settled, definitive notion of a symmetry of a differential equation; rather, there is a family of related notions.13 Three are especially important: the notion of a classical symmetry, the notion of a generalized symmetry, and the notion of a nonlocal symmetry. I will sketch the first of these and then describe how the other two are related to it. A few more details concerning classical symmetries are presented in an appendix below. Let us focus on the field-theoretic case: the independent variables of the theory parameterize the spacetime manifold M; a configuration of the field is represented by a function u : M → W; and the dynamical laws of the theory are encoded in a kth-order partial differential equation Δ. Roughly speaking, the classical symmetries of such a theory can be characterized as follows. Let E be the manifold M × W that is parameterized by the independent and dependent variables of the theory taken together. Any diffeomorphism d̄ : E → E will induce a transformation d from the space of kinematically possible fields of the theory to itself (this claim is unpacked in the
appendix below). A map d that arises in this way is a classical symmetry of the equation if and only if it maps solutions to solutions.14 So classical symmetries are, roughly speaking, transformations of the space of kinematically possible fields that: (a) map solutions to solutions; and (b) are suitably local, in the sense that they arise from smooth transformations of the dependent and independent variables of the theory. Classical symmetries are also known as Lie symmetries or point symmetries. The spacetime symmetries of a theory are those classical symmetries that arise from transformations of the independent variables of the theory that leave the dependent variables untouched.15 (For particle theories in fixed spacetime backgrounds, the spacetime symmetries are those translations, rotations, boosts, and dilatations that map solutions to solutions.) The classical symmetries of an equation form a family much more restricted than that picked out by D1, and typically this means that solutions related by a classical symmetry share many salient features. In typical cases, the classical symmetries of a theory of n Newtonian particles are just the spacetime symmetries, so solutions are related by a symmetry if and only if they have the particles instantiating the same sequence of relative distance relations.16 In the case of general relativity, the classical symmetries are generated by spacetime diffeomorphisms and by scale transformations, so that, roughly speaking, two solutions are related by a symmetry if and only if they agree about the pattern of ratios of distances instantiated.17 The notion of a classical symmetry is a special case of the basic mathematical notion of symmetry: the classical symmetries of a differential equation are the structure-preserving maps for a certain structure associated with the equation.
Here is the story in briefest outline (a few more details are provided in the appendix below).18 One constructs a space, J^k(E) (the kth jet bundle over E), a point of which is specified, intuitively speaking, by specifying a point x of spacetime and the values of the field and its partial derivatives through order k at x. J^k(E) comes equipped with some geometric structure, C (the Cartan distribution), that ensures that the specification of a point of J^k(E) lives up to this intuitive picture. Our differential equation Δ determines a submanifold ℰ ⊂ J^k(E), since for each point of spacetime Δ imposes a constraint on the values of the fields and their partial derivatives at that point. So we have a structure consisting of a large space, J^k(E), a geometric structure C on J^k(E), and a distinguished subspace ℰ of J^k(E). In accord with our template for defining symmetries of structures, we take a symmetry of this contraption to be a one-to-one and onto map F : J^k(E) → J^k(E) that respects C and maps points of ℰ to points of ℰ. It turns out that the resulting notion coincides with the notion of a classical symmetry as characterized above. In order to characterize generalized and nonlocal symmetries, it is helpful (as well as more honest; see fn. 60 below) to describe the relevant notions by characterizing their infinitesimal generators.19 From this perspective, classical symmetries (investigated by Sophus Lie and others in the late nineteenth century) are, roughly speaking, transformations that map solutions to solutions and whose infinitesimal generators depend only on the independent and dependent variables of the theory.
Emmy Noether introduced a substantially more general notion at the beginning of the twentieth century.20 Her generalized symmetries (also known as local or higher symmetries or as Lie–Bäcklund transformations) are, roughly speaking, transformations that map solutions to solutions and whose infinitesimal generators depend on derivatives of the fields, as well as on the independent and dependent variables of the theory.21 These arise as the symmetries of a fancier structure that can be used to encode the differential equation of interest: again the equation is represented as a submanifold of an ambient space equipped with a Cartan structure; this time the ambient space is an infinite jet bundle, the specification of a point of which involves specifying a point of spacetime together with the values at that point of a function and all of its partial derivatives. Every classical symmetry is also a generalized symmetry. Some equations have no nonclassical generalized symmetries (the Einstein field equations of general relativity are an example).22 But many equations have generalized symmetries that are not classical symmetries. Striking examples include:
The Kepler Problem. This is the problem of determining the motion of a Newtonian massive particle moving in the external gravitational field of a fixed massive body.23 In addition to the “obvious” conserved quantities for this system (energy and angular momentum) there is a “hidden” one, the Lenz–Runge vector.24 Associated with this hidden conserved quantity is a nonclassical generalized symmetry of the Kepler problem (more on this below).25

The KdV Equation. The Korteweg–de Vries equation is an equation used to model waves in shallow water. We can think of it as a third-order nonlinear partial differential equation governing a field u(x, t) on a two-dimensional spacetime. The only classical symmetries of this equation are spacetime and scaling symmetries; but it has an infinite-dimensional family of generalized symmetries.26

Yet more general are the nonlocal symmetries that have been investigated in recent decades.27 These are, roughly speaking, transformations that map solutions to solutions and whose infinitesimal generators are allowed to depend on nonlocal functionals (such as integrals) of the fields, as well as on their derivatives and on the dependent and independent variables of the equation.28 They arise as symmetries of a structure that results when the differential equation is encoded in a certain extension of the infinite jet bundle. Many examples of nonlocal symmetries are known; even the Kepler problem admits nonlocal symmetries that are not generalized symmetries.29

5. Variational and Hamiltonian Symmetries
The governing equations of many physical theories can be given a Lagrangian or Hamiltonian treatment (although this sometimes requires some craft or flexibility). For such theories, it is natural to consider the symmetries of the structures employed in the Lagrangian or Hamiltonian treatment, which need not coincide with the symmetries of the equation itself.
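The "hidden" conserved quantity of the Kepler problem mentioned above can be checked directly, which gives a feel for what the associated symmetries leave fixed and what they do not. The sketch below rests on assumptions not made in the text: planar motion, unit mass, an attractive potential -k/r with k = 1, the standard textbook parameterization of a bound orbit by semi-major axis a, eccentricity e, and true anomaly theta, and the common convention A = v × L − k r̂ for the Lenz–Runge vector. The function names are illustrative.

```python
import math

K_GRAV = 1.0  # assumed strength k of the potential -k/r (unit mass throughout)

def kepler_state(a, e, theta):
    """Position and velocity on the ellipse with semi-major axis a and
    eccentricity e at true anomaly theta, with periapsis on the +x axis."""
    p = a * (1.0 - e * e)                       # semi-latus rectum
    L = math.sqrt(K_GRAV * p)                   # angular momentum magnitude
    r = p / (1.0 + e * math.cos(theta))
    v_r = (K_GRAV / L) * e * math.sin(theta)    # radial velocity
    v_t = (K_GRAV / L) * (1.0 + e * math.cos(theta))  # transverse velocity
    c, s = math.cos(theta), math.sin(theta)
    return (r * c, r * s), (v_r * c - v_t * s, v_r * s + v_t * c)

def energy(pos, vel):
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - K_GRAV / math.hypot(*pos)

def angular_momentum(pos, vel):
    return pos[0] * vel[1] - pos[1] * vel[0]

def lenz_runge(pos, vel):
    """A = v x L - k r_hat for planar motion with L along the z axis."""
    L = angular_momentum(pos, vel)
    r = math.hypot(*pos)
    return (vel[1] * L - K_GRAV * pos[0] / r,
            -vel[0] * L - K_GRAV * pos[1] / r)

# Same major axis, different eccentricity: equal energy, different Lenz-Runge vector.
circular = kepler_state(1.0, 0.0, 0.4)
eccentric = kepler_state(1.0, 0.8, 0.4)
print(energy(*circular), energy(*eccentric))
print(lenz_runge(*circular), lenz_runge(*eccentric))
```

On these assumptions the vector comes out the same at every point of a given orbit, with magnitude k·e and direction along the major axis, so it records the eccentricity and orientation of the ellipse, while the energy depends only on the length of the major axis.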
5.1 Variational Symmetries
There are two main styles of Lagrangian formalism, sometimes called the dynamical and the covariant approaches, which coincide for theories with finitely many degrees of freedom but differ for field theories.30 Dynamical approaches are quite similar in spirit to the Hamiltonian approach considered below.31 Under covariant approaches, which are our concern here, the basic space of interest is a jet bundle over the space of dependent and independent variables of the theory (as in the treatment of classical symmetries sketched above) and the Lagrangian is now to be thought of as an object that when fed a kinematically possible field u gives in return a d-form L[u] on the spacetime M (here d = dim M).32 Each such Lagrangian is associated with a differential equation, whose solutions correspond to those fields that satisfy a certain variational principle for the Lagrangian. A (classical or generalized) symmetry of an equation arising from a Lagrangian is called a variational symmetry if it leaves the Lagrangian invariant. Even if a theory admits a Lagrangian formulation, some symmetries of the underlying equation may not show up as variational symmetries; Galilean boosts and scaling symmetries are often not variational.33 Modulo niceties about boundary conditions, Noether's theorem assures us that each one-parameter family of variational symmetries is associated in a canonical way with a conservation law.34

5.2 Hamiltonian Symmetries
Let us restrict attention to the ideal case in which specifying an initial data set (= a possible instantaneous dynamical state) for the equation of the theory determines a unique solution defined at all times. Then, speaking roughly and heuristically, a Hamiltonian treatment amounts to the following. The phase space of the theory is the space, J, of all initial data sets.
The Hamiltonian, H, of the theory is the real-valued function on the phase space that assigns to each point of the phase space the energy of the corresponding physical state. The phase space can be equipped with a geometric structure, ω, with the following marvelous feature: together H and ω determine a
family of curves in J, exactly one passing through each point; each of these curves corresponds to a solution of the theory's equation of motion, in the sense that the two objects pick out the same sequence of instantaneous dynamical states; and for any such solution there is a corresponding curve of this kind.35 So the structure (J, ω, H) in effect encodes the differential equation of the theory. It is natural to investigate the Hamiltonian symmetries of the theory: those one-to-one and onto maps from J to itself that preserve both ω and H.36 The Hamiltonian version of Noether's theorem assures us (fussing about boundary conditions aside) that there is a conserved quantity associated with each one-parameter family of Hamiltonian symmetries of a theory.

6. Symmetry and Physical Equivalence
Why was it (in hindsight, in one sense) a mistake for Newton to postulate absolute space? Because he thereby postulated more spacetime structure than was required for his dynamics: boosts are symmetries of his laws of motion but not of the spatiotemporal structure that he postulated. Reflection on this example and others motivates the principle that sound interpretative practice requires that the spacetime symmetries of one's ontology ought to include the spacetime symmetries of one's preferred theory. Conversely, it seems like something has gone wrong if the spacetime symmetry group of one's equations of motion is more restricted than the spacetime symmetry group of one's ontology, for in this case, one would be shirking one's duty by in effect employing geometrical structure in one's dynamical theory while not being willing to pay the ontological cost. It is tempting to think that this picture ought to generalize: Why should spacetime symmetries be special?
At the level of slogans, the idea is easy to state: our interpretative practice should be guided by the principle that the symmetries of a theory's laws and the symmetries of its ontology should coincide. It is not obvious how to give a precise and general formulation of the idea. But one thing that seems clear is that any such formulation would have as a consequence the second doctrine discussed in the opening section of this chapter:

D2. Two solutions of a classical theory's equation of motion are related by a symmetry if and only if they are physically equivalent, in the sense that they are equally well- or ill-suited to represent any particular physical situation.37

We have seen that disaster follows if this doctrine is combined with the notion that the symmetries of a classical theory are the maps that send solutions to solutions. But having set aside that notion, we can ask whether D2 can be safely combined with one of the more nuanced and discriminating notions of symmetry on offer. Indeed, one can give various plausibility arguments in favor of theses in the neighborhood of D2.

(a) Intuitively speaking, two solutions are related by a symmetry if and only if they are interchangeable by the lights of the theory's formalism. And surely if everything has been set up properly (i.e., just the right ingredients have been built into the formalism of the theory), two solutions that are formally interchangeable should also have identical representational capacities, and so should be physically equivalent in the present sense.

(b) Further, one might think that there is a pretty good reason for thinking that two solutions related by a classical symmetry of a theory must be physically equivalent. For such symmetries can be thought of as smooth transformations of the dependent and independent variables of the theory that preserve the form of the equations of the theory.
How could our representational practices reasonably distinguish between two solutions of a theory that are related by a reparameterization of the theory's variables to which the equations of the theory are themselves indifferent?

(c) That this is the right way to go is also suggested by consideration of familiar cases. No one denies that solutions of a theory's equations are physically equivalent if they are related by a spacetime symmetry of the theory (i.e., by a symmetry that transforms the independent variables of the theory without affecting the dependent variables).38 A similar consensus exists regarding paradigm cases of (global and local) internal symmetries (that involve transformations of dependent variables of the theory that do not affect the independent variables). Thus, if one has a theory involving a complex scalar field such that global phase transformations of the form φ(x) ↦ e^{iθ} φ(x) (θ a constant) are symmetries, then one regards solutions related by such a transformation as physically equivalent. Similarly, one regards as physically equivalent solutions of Maxwell's equations in vector potential form that are related by local gauge transformations of the form Aμ(x) ↦ Aμ(x) + dΛ(x) (here Λ is a smooth function on spacetime, so in general the transformation effected on the
dependent variables varies from spacetime point to spacetime point). Why should other symmetries be different?

Unfortunately, these mutually reinforcing half-arguments do not add up to much: each of the general notions of symmetry that we have considered leads to unpalatable consequences when plugged into D2. Consider first the notion of a classical symmetry of a differential equation. As noted above, most equations admit relatively few such symmetries, so that solutions related by a classical symmetry share a great deal in common. But there exist equations whose symmetry groups are so large that they act transitively on the space of solutions (i.e., for any two solutions, u1 and u2, there is a classical symmetry of the equation that maps u1 to u2). This sort of shocking behavior can be found in some of the most basic theories of classical physics.

(i) A single Newtonian free particle. In this case, the spacetime symmetries of the theory are given by the Galilei group, and any solution can be mapped onto any other by such a symmetry.39
(ii) The harmonic oscillator. In one spatial dimension, an arbitrary solution takes the form q = A cos t + B sin t. The theory admits a one-parameter group of classical symmetries that acts on solutions by changing the value of B, and another such group that changes the value of A, so any solution can be mapped to any other by a symmetry of the theory.40
(iii) Linear homogeneous partial differential equations, such as the heat equation, the wave equation, the source-free Maxwell equations, etc. Corresponding to any solution u0 of such an equation, there is a classical symmetry T_{u0} : u ↦ u + u0.41 And, of course, for any two solutions u1 and u2 of a linear equation, there is a solution u0 such that u2 = u1 + u0. So any two solutions of a linear homogeneous partial differential equation are related by a symmetry.
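The transitivity claimed in (ii) and (iii) is easy to exhibit for the harmonic oscillator. The following is a small sketch under assumptions of my own choosing: unit mass and frequency, so that solutions are q(t) = A cos t + B sin t with energy (1/2)q'² + (1/2)q², a velocity estimated by a central difference, and hypothetical helper names `solution`, `add_solution`, and `energy`. It builds the map u ↦ u + u0 with u0 = u2 − u1 and confirms that it carries the motionless solution to an oscillating one of quite different energy.

```python
import math

def solution(A, B):
    """q(t) = A cos t + B sin t, a solution of q'' = -q (unit mass and frequency)."""
    return lambda t: A * math.cos(t) + B * math.sin(t)

def add_solution(u0):
    """The map T_{u0}: u -> u + u0, defined for any linear homogeneous equation."""
    return lambda u: (lambda t: u(t) + u0(t))

def energy(u, t=0.0, h=1e-5):
    """Total energy (1/2) q'^2 + (1/2) q^2, with q' from a central difference."""
    velocity = (u(t + h) - u(t - h)) / (2.0 * h)
    return 0.5 * velocity ** 2 + 0.5 * u(t) ** 2

at_rest = solution(0.0, 0.0)       # oscillator permanently at the origin
oscillating = solution(3.0, 4.0)   # oscillator in motion

# u0 = u2 - u1 is itself a solution, so T_{u0} carries u1 to u2.
T = add_solution(solution(3.0 - 0.0, 4.0 - 0.0))
moved = T(at_rest)

print(moved(1.3), oscillating(1.3))          # the two histories agree
print(energy(at_rest), energy(oscillating))  # but the energies differ
```

Nothing here depends on the oscillator in particular beyond linearity: the same construction goes through for any equation whose solutions can be added.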
If we maintain that any two solutions related by a classical symmetry are physically equivalent, then each of these theories will be unable to discriminate among the systems it provides good models for; it will consider them to have the same physics. That is presumably the right verdict in the case of the theory of a free Newtonian particle: if we follow the policy of regarding solutions related by a spacetime symmetry as physically equivalent, then no degrees of freedom remain in this theory once such symmetries are factored out. But the other examples are very different: under any ordinary reading, they admit solutions that represent situations in which nothing is happening (the oscillator is permanently immobile at the origin, the field is in a ground state) and others that represent situations in which plenty is going on (the oscillator is continually sproinging around, energy in the form of heat or waves is propagating). An approach to understanding physical theories that leaves us unable to see these distinctions is not something we can live with. So D2 is false if understood as a thesis concerning classical symmetries of differential equations. Further, employing generalized or nonlocal symmetries would not help: the problems just noted would remain, and new ones of the same ilk would crop up.42 Nor does it help to shift attention to variational or Hamiltonian symmetries. Consider first variational symmetries. The class of (classical or generalized) variational symmetries of an equation is more restrictive than the class of (classical or generalized) symmetries of an equation.
And that has some benefits: the variational symmetries of the wave equation do not include addition-of-an-arbitrary-solution (and do not act transitively on the space of solutions).43 But the problem with the harmonic oscillator remains: for any two solutions, there is a variational symmetry that maps one to the other.44 So the class of variational symmetries is still in some respects too generous to underwrite D2. It is also in some respects too restrictive: for example, neither the boost symmetry of classical mechanics nor the scaling symmetry of general relativity is a variational symmetry.45 But it is hard to deny that two solutions related by such a symmetry are physically equivalent in the relevant sense.46 Further, recall that not every equation of motion admits a Lagrangian treatment.47 True, with some ingenuity one can often find surrogates for such equations that do admit Lagrangian treatment (e.g., by replacing the variables of the original theory by suitable potentials, or by introducing nonphysical fields).48 But it is hard to see why one should have to take such detours in order to understand the connection between symmetry and physical equivalence. What about Hamiltonian symmetries, then? This class is again too restrictive in some respects. As in the variational case, scaling symmetries are typically not Hamiltonian symmetries. Galilean boosts cause the same sort of trouble
(since they typically leave the potential energy of a system invariant while altering its kinetic energy, they fail to preserve the Hamiltonian). So if one took solutions to be physically equivalent only if related by a Hamiltonian symmetry, then one would have to violate the general principle that solutions of classical theories are physically equivalent if related by a spacetime symmetry.49

At the same time, the class of Hamiltonian symmetries is also too generous in some respects. In the Hamiltonian setting we are at least safe from the threat of the group of symmetries of a theory acting transitively on the space of states (since Hamiltonian symmetries preserve the Hamiltonian and theories ordinarily allow states of differing energy). But we nonetheless still run into cases in which we are unwilling to consider states or solutions related by Hamiltonian symmetries as physically equivalent. Consider the Kepler problem. This theory has a number of Hamiltonian symmetries, some of which are spacetime symmetries and some of which are not. The spacetime symmetries include time translation (associated with conservation of energy) and rotation (associated with conservation of angular momentum). As usual, we want to regard solutions related by spacetime symmetries as physically equivalent; in the present case, this means that we regard solutions as physically equivalent if the corresponding elliptical orbits are of the same shape (i.e., have the same eccentricity and have equally long major axes). The further Hamiltonian symmetries of the Kepler problem are associated with the conservation of the Lenz–Runge vector.
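The conservation claim can be illustrated numerically. What follows is a minimal sketch and not part of the original discussion: it assumes units in which the mass and the force constant are both 1 (so that the constant μ in the usual normalized expression for the Lenz–Runge vector, −q/|q| + (1/μ) p × (q × p), equals 1), it uses a hand-written fourth-order Runge–Kutta integrator, and every function name is hypothetical. It follows one bound Kepler orbit and reports how far the vector drifts from its initial value.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def norm(a):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(x * x for x in a))

def lenz_runge(q, p, mu=1.0):
    """Normalized Lenz-Runge vector: -q/|q| + (1/mu) p x (q x p)."""
    r = norm(q)
    p_cross_l = cross(p, cross(q, p))
    return tuple(-qi / r + ci / mu for qi, ci in zip(q, p_cross_l))

def deriv(state):
    """Hamilton's equations for H = |p|^2/2 - 1/|q| (assumed units m = k = 1)."""
    q, p = state[:3], state[3:]
    r3 = norm(q) ** 3
    return p + tuple(-qi / r3 for qi in q)  # (dq/dt, dp/dt) as one 6-tuple

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shift(s, ds, h):
        return tuple(si + h * di for si, di in zip(s, ds))
    k1 = deriv(state)
    k2 = deriv(shift(state, k1, dt / 2))
    k3 = deriv(shift(state, k2, dt / 2))
    k4 = deriv(shift(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_drift(q0, p0, dt=1e-3, steps=4000):
    """Largest deviation of the Lenz-Runge vector from its initial value."""
    state = tuple(q0) + tuple(p0)
    a0 = lenz_runge(state[:3], state[3:])
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        a = lenz_runge(state[:3], state[3:])
        worst = max(worst, norm(tuple(x - y for x, y in zip(a, a0))))
    return worst

q0, p0 = (1.0, 0.0, 0.0), (0.0, 0.8, 0.0)  # a bound (negative-energy) orbit
print(norm(lenz_runge(q0, p0)))  # length of the vector: the eccentricity, about 0.36
print(max_drift(q0, p0))         # tiny: only integration error remains
```

In this normalization the length of the vector equals the eccentricity of the orbit and the vector lies along the major axis, so a quantity built from the instantaneous position and momentum encodes the shape and orientation of the whole ellipse.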
If two solutions are related by one of these symmetries, then the corresponding ellipses have equally long major axes, but (in general) have different eccentricities and different orientations in space.50 The upshot is that if we take being related by a Hamiltonian symmetry to imply physical equivalence, then we must take solutions of the (negative energy) Kepler problem to be physically equivalent if and only if they correspond to ellipses with equally long major axes.51 But we do not normally regard highly eccentric orbits and perfectly circular orbits as being physically equivalent.52

We run into the same problem with the harmonic oscillator: the family of Hamiltonian symmetries acts transitively on the surfaces of constant energy in the space of initial data, so two solutions of the theory are related by a Hamiltonian symmetry if and only if they represent the system as having the same total energy.53 But two solutions of the harmonic oscillator can have the same energy while corresponding to quite different motions: for example, in the two-dimensional case, one might have the particle moving back and forth on a line segment while the other has it executing a circular motion.54

Or, again, consider the Korteweg–de Vries equation. This equation governs a real-valued field u in one spatial dimension. Under its usual interpretation, this field describes the dynamics of a shallow body of water with one horizontal dimension, with u(x,t) giving the depth of the water above spatial point x at time t. Consider a particular solution, u0. How large is the family of solutions physically equivalent to u0? Well, it surely includes solutions related to u0 by spacetime symmetries, and perhaps also scaling symmetries.
But the family of generalized symmetries of the equation and the family of Hamiltonian symmetries are both infinite-dimensional, so it seems clear that each must contain many symmetries that relate physically inequivalent solutions.55

7. Outlook

Where does this leave us? There are various interesting formal notions of symmetry applicable to classical physical theories. But none of the standard ones are suited to underwrite a principle like D2 that makes a direct link between being related by a symmetry and being physically equivalent. I leave it as a challenge to the reader to identify a general and interesting formal notion of symmetry that renders D2 true, or, better, to identify a family X of symmetries such that two solutions of a theory that are related by a symmetry are physically equivalent if and only if they are related by a symmetry that belongs to X. Above we have seen that we ordinarily take spacetime symmetries to belong to X. Further, we have seen that there are classical symmetries, generalized symmetries, nonlocal symmetries, variational symmetries, and Hamiltonian symmetries that are not in X. We have also seen that X contains symmetries that fall outside of the classes of spacetime symmetries, variational symmetries, and Hamiltonian symmetries. Perhaps it is possible to find some formal notion of symmetry that combines well with D2. But it is hard to be optimistic: certainly, the ways of encoding the content of laws that are most appealing to mathematicians and
physicists appear to lead to notions of symmetry that are coolly indifferent to considerations of representational equivalence. So it appears that the sort of constraint that knowledge of the symmetries of a theory places on the range of reasonable interpretations of that theory may well be more modest than one might have hoped.56

Here are a few more details about classical symmetries of differential equations. Let us begin by describing $J^k(E)$.57 For present purposes, it is easiest to work in terms of coordinates.58 A point $y$ in $M$ is specified by specifying a $d$-tuple of real numbers, $(y_1,\dots,y_d)$ (the coordinates of $y$). A point $(y,v)$ in $E$ is specified by specifying a $(d+m)$-tuple of real numbers, $(y_1,\dots,y_d; v_1,\dots,v_m)$ (the coordinates in $M$ of $y$ plus the coordinates in $W$ of $u(y)$, the value of the field $u$ at spacetime point $y$). A point $(y,v,p)$ in $J^k(E)$ is specified by specifying a $\left[d + m\binom{d+k}{k}\right]$-tuple of real numbers, $(y_1,\dots,y_d; v_1,\dots,v_m; p_1,\dots,p_n)$, where the $p_i$ are just numerous enough to be regarded as a list of the values of the partial derivatives (from order one through $k$) of $u$ with respect to the spacetime coordinates. When we think of $J^k(E)$ in this way, it provides a natural way to encode the content of any $k$th order partial differential equation whose dependent and independent variables parameterize $E$. For think what such a differential equation does: for each point $x$ of spacetime, the equation imposes a constraint on the values of the field and its partial derivatives through order $k$ when evaluated at $x$. But, roughly speaking, that is to say that such a differential equation determines a submanifold $\mathcal{E} \subset J^k(E)$ and that the content of the equation is exhausted by the structure of $\mathcal{E}$ as a submanifold of $J^k(E)$. Now we are getting somewhere: we can identify symmetries as one-to-one and onto maps from $J^k(E)$ to itself that preserve $\mathcal{E}$ and all relevant structure of the ambient space $J^k(E)$.
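The coordinate count for the jet space can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names are hypothetical. It counts the partial derivatives of orders one through k of an m-component field in d independent variables and confirms that the total number of coordinates agrees with the closed form d + m·C(d+k, k), where C denotes the binomial coefficient.

```python
from math import comb

def num_derivative_coords(d: int, m: int, k: int) -> int:
    """Number n of coordinates p_1..p_n: partial derivatives of orders 1..k."""
    # One field component in d variables has C(d + j - 1, j) distinct
    # partial derivatives of order exactly j (mixed partials commute).
    return m * sum(comb(d + j - 1, j) for j in range(1, k + 1))

def jet_dim(d: int, m: int, k: int) -> int:
    """Dimension of the k-th jet space: d independent variables,
    m dependent variables, and n derivative coordinates."""
    return d + m + num_derivative_coords(d, m, k)

# The direct count agrees with the closed form d + m * C(d + k, k).
for d, m, k in [(1, 1, 1), (2, 1, 2), (4, 6, 2)]:
    assert jet_dim(d, m, k) == d + m * comb(d + k, k)

print(jet_dim(1, 1, 1))  # → 3: one particle on a line, first order (t, x, dx/dt)
print(jet_dim(2, 1, 2))  # → 8: a scalar field in two variables, second order
```

The second case lists the eight coordinates one would expect for a second-order equation for a scalar field in two variables: two independent variables, the field value, two first derivatives, and three second derivatives.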
But what does this structure amount to? We know that $J^k(E)$ is a manifold and that we are working in a setting in which everything is at least a bit smooth, so it is natural to require our symmetries to be appropriately smooth diffeomorphisms from $J^k(E)$ to itself. There is one further kind of structure on $J^k(E)$ that matters in the present context. Recall that we have said that, intuitively, the extra variables that we add in moving from $E$ to $J^k(E)$ are supposed to correspond to the values of the partial derivatives of the field with respect to the spacetime coordinates. We need to put some structure on $J^k(E)$ in order to enforce this intuitive demand.

In order to get a feeling for what is required here, consider the following line of thought. Let $u : M \to W$ be a kinematically possible field. The content of $u$ is encoded in its graph, $\Gamma_u$, the subset of $E$ consisting of pairs of the form $(x, u(x))$, for $x \in M$. Of course, the map $x \in M \mapsto (x, u(x)) \in \Gamma_u$ is a diffeomorphism. Further, a sufficiently smooth submanifold $\Gamma \subset E$ corresponds to a kinematically possible $u$ in this way if and only if the projection map $\bar{\pi} : (x, v) \in \Gamma \mapsto x \in M$ is a diffeomorphism from $\Gamma$ onto $M$. Now, each kinematically possible $u$ also determines a diffeomorphism from $M$ to $J^k(E)$. We define the $k$-jet of $u$ to be the map $j[u] : x \in M \mapsto (x, u(x), p) \in J^k(E)$, where $p = (p_1,\dots,p_n)$ encodes the values at $x$ of the partial derivatives of $u$ through order $k$. The $k$-jet of $u$ is a diffeomorphism onto its image $J_u$. There is of course a projection map $\pi : (x,v,p) \in J^k(E) \mapsto x \in M$, whose restriction to $J_u$ is the inverse of the jet $j[u]$. It may be tempting to suppose that any sufficiently smooth submanifold $K \subset J^k(E)$ such that $\pi : K \to M$ is a diffeomorphism onto its image arises as the $J_u$ for some kinematically possible $u$. But this is false. For consider such a $K$. For each spacetime point $x \in M$, $K$ includes exactly one point $(x,v,p)$.
So, questions of boundary conditions aside, $K$ does determine a kinematically possible $u$: for each spacetime point $x$ just set $u(x) = v$, where $(x,v,p)$ is the unique point in $K$ that $\pi$ sends to $x$. But there is no guarantee that $J_u = K$, precisely because we have not yet built into the structure of $J^k(E)$ any connection between the $p_i$ and the partial derivatives of the field. So in general, if $(x,v,p)$ is a point in $K$ and $u$ is the kinematically possible field determined by $K$, there is no reason to expect that the values of the partial derivatives of $u$ at $x$ are given by $p$ (that is, in general one expects that $j[u](x) \neq (x,v,p)$ so that $J_u \neq K$). There is an elegant solution to this problem. One imposes on $J^k(E)$ some geometric structure, in the form of the Cartan distribution, $\mathcal{C}$, which singles out at each point of $J^k(E)$ a distinguished subspace of dimension
$\left[d + m\binom{d+k-1}{d-1}\right]$ in the $\left[d + m\binom{d+k}{k}\right]$-dimensional tangent space at that point. The Cartan distribution has the following beautiful feature: a submanifold $K$ of $J^k(E)$ that is diffeomorphic to $M$ via the natural projection map $\pi : (x,v,p) \mapsto x$ is of the form $J_u$ for some kinematically possible $u$ if and only if at any point of $K$, all tangent vectors pointing along $K$ lie in the privileged subspace picked out at that point by $\mathcal{C}$. That is: the Cartan distribution encodes our intuitive constraint that the $p_i$ should correspond to the values of the partial derivatives of fields.59

Putting all of this together: a symmetry (in the present sense) of our equation is a diffeomorphism from $J^k(E)$ to itself that preserves: (i) the submanifold $\mathcal{E} \subset J^k(E)$ that encodes the equation; and (ii) the Cartan distribution $\mathcal{C}$. It now turns out that (except in the special case where $m = 1$) the only such diffeomorphisms arise via diffeomorphisms from $E$ to itself (this claim will be unpacked in the next paragraph). And of course, since each kinematically possible field $u$ can be identified with the submanifold $j[u](M) \subset J^k(E)$, any diffeomorphism from $J^k(E)$ to itself acts in a natural way on the space of kinematically possible fields, and (subtleties aside) any diffeomorphism from $J^k(E)$ to itself that preserves $\mathcal{E}$ maps solutions to solutions. So: the symmetries (in the present sense) of a differential equation are the transformations of the space of kinematically possible solutions that map solutions to solutions and that are suitably local, in the sense of depending only on the independent and dependent variables of the theory.

There are two unfinished bits of business. The first is to unpack the notion of a diffeomorphism from $J^k(E)$ to itself arising via a diffeomorphism from $E$ to itself. Let $\bar{d} : E \to E$ be a diffeomorphism. We seek to define a corresponding diffeomorphism $d : J^k(E) \to J^k(E)$.
We proceed by selecting an arbitrary $(x,v,p) \in J^k(E)$ and showing how to find $d(x,v,p)$. Let $u$ be a kinematically possible field such that $u(x) = v$ and $j[u](x) = (x,v,p)$ (i.e., $(x,v,p)$ gives the value of $u$ and its partial derivatives at $x$). Let $\Gamma_u$ be the graph of $u$ in $E$ (i.e., the set of points of $E$ of the form $(x,u(x))$, for all $x \in M$). $\Gamma_u$ is a smooth submanifold of $E$ that projects diffeomorphically to $M$ under the map $\bar{\pi} : (x, v) \mapsto x$. Now, since $\bar{d}$ is a diffeomorphism from $E$ to itself, the set $\bar{d}(\Gamma_u)$ is also a smooth submanifold of $E$. Sadly, $\bar{d}(\Gamma_u)$ need not project diffeomorphically to $M$. But let us ignore that detail, and pretend that it does, and hence corresponds to a kinematically possible $u^*$.60 Then we can define $d(x,v,p) := j[u^*](x')$, where $x' = \bar{\pi}(\bar{d}(x, v))$. The resulting map $d : J^k(E) \to J^k(E)$ is a well-defined diffeomorphism (in particular, it is independent of the choices we made along the way).

The final piece of unfinished business is to note that one typically works with fields that have well-defined transformation laws under changes of coordinates on spacetime: so a spacetime diffeomorphism $\bar{d} : M \to M$ induces in a natural way a diffeomorphism $\bar{D} : E \to E$.61 The corresponding transformation $D : \mathcal{K} \to \mathcal{K}$ is determined as above. And we find, as expected, that $D$ is a classical symmetry if and only if it maps solutions to solutions.

References

Abraham, R., and J. Marsden (1985). Foundations of mechanics. 2d ed. Cambridge, MA: Perseus.
Anco, S., and J. Pohjanpelto (2008). Generalized symmetries of massless free fields on Minkowski space. Symmetry, Integrability and Geometry: Methods and Applications 4: 004.
Anderson, I., and C. Torre (1996). Classification of local generalized symmetries for the vacuum Einstein equations. Communications in Mathematical Physics 176: 479–539.
Baker, D. (2010). Symmetry and the metaphysics of physics. Philosophy Compass 5: 1157–1166.
Bluman, G. (2005).
Connections between symmetries and conservation laws. Symmetry, Integrability and Geometry: Methods and Applications 1: 011.
Bluman, G., and S. Anco (2002). Symmetry and integration methods for differential equations. Berlin: Springer–Verlag.
Brading, K., and E. Castellani (2007). Symmetries and invariances in classical physics. In Philosophy of physics, Part B, ed. J. Butterfield and J. Earman, 1331–1367. Amsterdam: Elsevier.
Cantwell, B. (2002). Introduction to symmetry analysis. Cambridge: Cambridge University Press.
Castrillón López, M., and J. Marsden (2008). Covariant and dynamical reduction for principal bundle field theories. Annals of Global Analysis and Geometry 34: 263–285.
Cushman, R., and L. Bates (1997). Global aspects of classical integrable systems. Basel: Birkhäuser.
Cushman, R., and D. Rod (1982). Reduction of the semisimple 1:1 Resonance. Physica D 6: 105–112.
Dasgupta, S. (2010). Symmetries in physical reasoning. Unpublished manuscript.
Debs, T., and M. Redhead (2007). Objectivity, invariance, and convention: symmetry in physical science. Cambridge, MA: Harvard University Press.
Duarte, L., S. Duarte, and I. Moreira (1987). One-dimensional equations with the maximum number of symmetry generators. Journal of Physics A 20: L701–L704.
Earman, J. (1989). World enough and spacetime: Absolute versus relational theories of space and time. Cambridge, MA: MIT Press.
Goldstein, H., C. Poole, and J. Safko (2002). Classical mechanics. 3d ed. New York: Addison–Wesley.
Guillemin, V., and S. Sternberg (1990). Variations on a theme by Kepler. Providence, RI: American Mathematical Society.
Healey, R. (2009). Perfect symmetries. British Journal for the Philosophy of Science 60: 697–720.
Hydon, P. (2000). Symmetry methods and differential equations: A beginner's guide. Cambridge: Cambridge University Press.
Ibragimov, N., and T. Kolsrud (2004). Lagrangian approach to evolution equations: Symmetries and conservation laws. Nonlinear Dynamics 36: 29–40.
Ismael, J., and B. van Fraassen (2003). Symmetry as a guide to superfluous theoretical structure. In Symmetries in physics: Philosophical reflections, ed. K. Brading and E. Castellani, 371–392. Cambridge: Cambridge University Press.
Jauch, J., and E. Hill (1940). On the problem of degeneracy in quantum mechanics. Physical Review 57: 641–645.
Kishimoto, A., N. Ozawa, and S. Sakai (2003).
Homogeneity of the pure state space of a separable C∗-algebra. Canadian Mathematical Bulletin 46: 365–372.
Klainerman, S. (2008). Partial differential equations. In The Princeton companion to mathematics, ed. T. Gowers, J. Barrow–Green, and I. Leader, 455–483. Princeton: Princeton University Press.
Kolář, I., P. Michor, and J. Slovák (1993). Natural operations in differential geometry. Berlin: Springer–Verlag.
Krasil'shchik, I., and A. Vinogradov, eds. (1999). Symmetries and conservation laws for differential equations of mathematical physics. Providence, RI: American Mathematical Society.
Leach, P., K. Andriopoulos, and M. Nucci (2003). The Ermanno–Bernoulli constants and the representations of the complete symmetry group of the Kepler problem. Journal of Mathematical Physics 44: 4090–4106.
Lévy–Leblond, J.–M. (1971). Conservation laws for gauge-variant Lagrangians in classical mechanics. American Journal of Physics 39: 502–506.
Lutzky, M. (1978). Symmetry groups and conserved quantities for the harmonic oscillator. Journal of Physics A 11: 249–258.
Morehead, J. (2005). Visualizing the extra symmetry of the Kepler problem. American Journal of Physics 73: 234–239.
North, J. (2010). Structure in classical mechanics. Unpublished manuscript.
Olver, P. (1984). Conservation laws in elasticity: II. Linear homogeneous isotropic elastostatics. Archive for Rational Mechanics and Analysis 85: 131–160.
Olver, P. (1993). Applications of Lie groups to differential equations. 2d ed. Berlin: Springer–Verlag.
Pohjanpelto, J. (1995). Symmetries, conservation laws, and Maxwell's equations. In Advanced electromagnetism: foundations, theory and applications, ed. T. Barrett and D. Grimes, 560–589. Singapore: World Scientific.
Prince, G. (2000). The inverse problem in the calculus of variations and its ramifications. In Geometric approaches to differential equations, ed. P. Vassiliou and I. Lisle, 171–200. Cambridge: Cambridge University Press.
Prince, G., and C. Eliezer (1980). Symmetries of the time-dependent N-dimensional oscillator. Journal of Physics A 13: 815–823.
Prince, G., and C. Eliezer (1981). On the Lie symmetries of the classical Kepler problem. Journal of Physics A 14: 587–596.
Roberts, J. (2008). A puzzle about laws, symmetries, and measurability. British Journal for the Philosophy of Science 59: 143–168.
Rosen, N. (1966). Flat space and variational principle. In Perspectives in geometry and relativity: Essays in honor of Václav Hlavatý, ed. B. Hoffmann, 325–327. Bloomington: Indiana University Press.
Ruetsche, L. (2011). Interpreting quantum theories. Oxford: Oxford University Press.
Saunders, D. (2008). Jet manifolds and natural bundles. In Handbook of global analysis, ed. D. Krupka and D. Saunders, 1035–1068. Amsterdam: Elsevier.
Sorkin, R. (2002). An example relevant to the Kretschmann–Einstein debate. Modern Physics Letters A 17: 695–700.
Torre, C. (1995). Natural symmetries and Yang–Mills equations. Journal of Mathematical Physics 36: 2113–2130.
Notes:

(1) See, e.g., Baker (2010), Brading and Castellani (2007), Dasgupta (2010), Debs and Redhead (2007), Healey (2009), Ismael and van Fraassen (2003), North (2010), and Roberts (2008).
(2) See Earman (1989, §3.4).
(3) If spacetime geometry fails to be invariant under some such transformation that is a symmetry of the laws, then one is positing unnecessary geometric structure (think of Newton on absolute space). Conversely, if one's spacetime structure is invariant under transformations that are not symmetries of the laws, then one is cheating somehow or other, employing some structure in one's formalism to break a spacetime symmetry without being willing to pay the ontological price (think of someone whose theory has a point mass sproinging back and forth about the origin in the Cartesian plane, but who is unwilling to posit a privileged point of space or any physical structure, whether matter, field, force, etc., other than the single moving point-particle).
(4) I won't do anything to substantiate this claim here (let us not descend into recriminations at this time!), beyond saying that, like many a rant, this one is directed in part at earlier versions of its author.
(5) Our world contains many systems that can be treated (for all practical purposes) as being isolated. It follows from the conjunction of D1 and D2 that if a theory provides equally good models of two isolated systems, then any model of the theory must provide equally good representations of each of these systems, which is to say that the theory is just about useless in application.
(6) Since none of the authors I have in mind address the topic in quite the same terms employed here (or in quite the same terms as one another), the following attributions should perhaps be taken with a grain of salt. On the
optimistic side: Baker (2010, §1) conjectures that D2 is true relative to some generalization of the Hamiltonian approach discussed in §5 below; Brading and Castellani (2007, §4.1 and 8.2) claim that D2 is true relative to the notion of a classical symmetry discussed in §4 below; Roberts (2008, fn. 3) holds that D2 is true for the sort of version of D1 discussed in fn. 11 below. On the pessimistic side: Ismael and van Fraassen (2003) take D1 to be the only philosophically salient notion of symmetry, and thus think that in order to get around the sort of problems noted above, one should take models to be physically equivalent if and only if they are related by a symmetry and agree in (roughly speaking) their directly perceivable features; Healey (2009) likewise notes some of the problems surrounding D1, mentions that there are other formal notions of symmetry available, then pursues an approach that builds physical equivalence into its notion of symmetry; Dasgupta (2010) argues that a notion of symmetry appropriate to D2 must involve notions like that of a mental state.
(7) More generally, here and throughout, one could take histories to be represented by sections of a fiber bundle $E \to M$ with typical fiber $W$. Almost any classical field can be thought of this way. For example, a gauge field, standardly represented by a connection one-form on a principal bundle $P \to M$, can be represented instead by a section of a certain affine bundle $J^1(P)/G \to M$ associated with $P$; see, e.g., Kolář, Michor, and Slovák (1993, §17.4).
(8) So the left-hand side of this equation is the function on M that results from applying Δ to u and the right-hand side is the zero function.
(9) I.e., $R(x_1,\dots,x_n)$ if and only if $R(F(x_1),\dots,F(x_n))$; and $f(x_1,\dots,x_n) = y$ if and only if $f(F(x_1),\dots,F(x_n)) = F(y)$.
(10) Don't mathematicians sometimes offer up characterizations of symmetries along these lines?
Yes, but only when speaking loosely and heuristically. Thus in the introduction to Olver's influential textbook on symmetries of differential equations, we are told that: “Roughly speaking, a symmetry group of a system of differential equations is a group which transforms solutions of the system to other solutions” (Olver 1993, xviii); see also, e.g., Bluman and Anco (2002, 2) and Klainerman (2008, 457). But on the next page we are told that once “one has determined the symmetry group of a system of differential equations, a number of applications become available. To start with, one can directly use the defining property of such a group and construct new solutions to the system from known ones.” Of course, one cannot do this if one's notion of a symmetry is given by the Fruitless Definition; one needs to be working with one of the more specialized notions that are the focus of Olver's book, some of which are described below. And likewise for the other applications on Olver's list.
(11) Roughly speaking: for non-freaky Δ, one expects $\mathcal{S}$ to sit inside $\mathcal{K}$ as (something like) a submanifold; if $\phi_1$ and $\phi_2$ are (generic) points in $\mathcal{S}$ then there will be a diffeomorphism $F : \mathcal{S} \to \mathcal{S}$ with $\phi_2 = F(\phi_1)$; and one expects to be able to extend $F$ to a diffeomorphism $\bar{F} : \mathcal{K} \to \mathcal{K}$. (Slightly more carefully: one expects that any obstructions to extending $F$ in this way are going to be technical in nature, and not detract from the main point.)
(12) Consider, by way of illustration, the Newtonian theory of three gravitating point particles of distinct masses. Here a point in the space of kinematically possible fields, $\mathcal{K}$, essentially assigns each of the particles a worldline in spacetime (without worrying about whether these worldlines jointly satisfy the Newtonian laws of motion). The space of solutions, $\mathcal{S}$, is the 18-dimensional submanifold of $\mathcal{K}$ consisting of points corresponding to particle motions obeying Newton's laws.
So one expects that any diffeomorphism from $\mathcal{S}$ to itself can be extended to a suitably nice map from $\mathcal{K}$ to itself. But for any solutions $u_1, u_2 \in \mathcal{S}$, we can find a diffeomorphism from $\mathcal{S}$ to itself that maps $u_1$ to $u_2$, so we again find that arbitrary pairs of solutions are related by symmetries. This seems unacceptable, since we ordinarily think of this theory as having a relatively small symmetry group (consisting just of spacetime symmetries).
(13) For an historical overview and references, see Olver (1993, 172ff. and 374ff.). There is of course a trade-off to be made between fecundity and generality. In practice, the notions that mathematicians and physicists find interesting are far more restrictive than the fruitless notion considered above. But there would appear to be no feeling that there is a correct notion of a symmetry of a differential equation; plausibly, for every interesting such notion, there is a yet more general one that is still interesting.
(14) Two subtleties are glossed over in the text. (i) In general d may map a kinematically possible field to an object that is a sort of partially-defined multiply-valued field rather than a kinematically possible field (see fn. 60 below). (ii) The story is (yet) more complicated in the important special case where dim W = 1.
(15) Note, in particular, that we make no appeal here to metric tensors and the like; we just look for transformations of the independent variables that leave invariant the equations of motion. (Some subtleties arise at this point for certain types of fields; see, e.g., Kolář, Michor, and Slovák (1993) on gauge natural bundles.)
(16) Hydon (2000, example 4.4).
(17) See Anderson and Torre (1996).
(18) For what follows, see, e.g., Krasil'shchik and Vinogradov (1999, ch. 3).
(19) Unfortunately, discrete symmetries threaten to go missing when one adopts this perspective. For one approach to this problem, see Hydon (2000, ch. 11).
(20) See, e.g., Olver (1993, ch. 5) or Krasil'shchik and Vinogradov (1999, ch. 4).
(21) To get a feeling for what this means, consider the sort of gauge transformations that normally arise in presentations of Maxwell's theory: if we take the vector potential A(x) as our field, then the theory is invariant under infinitesimal transformations of the form A ↦ A + εdΛ where Λ(x) is a real-valued function on spacetime. Now suppose that Λ is a map that, when fed a kinematically possible A, returns a real-valued function Λ[A] on spacetime. If for each spacetime point x and each A, the value of Λ[A] at x depends only on x and on A(x), then the infinitesimal transformation A ↦ A + εdΛ[A] corresponds to a classical symmetry of Maxwell's theory; if the value of Λ[A](x) depends also on a finite number of derivatives of A at x, then this map is a generalized symmetry of Maxwell's theory. See the discussion of generalized gauge symmetries in Pohjanpelto (1995) and in Torre (1995). For a thoroughly worked-out example involving only finitely many degrees of freedom, see Cantwell (2002, §14.4.1).
(22) See Anderson and Torre (1996).
(23) Note that the Kepler problem contains all of the dynamics of the honest Newtonian two-body problem. See, e.g., Goldstein, Poole, and Safko (2002, §3.1).
(24) The Lenz–Runge vector is $-\frac{q}{|q|} + \frac{1}{\mu}\, p \times (q \times p)$, where $q$ is the position of the moving particle, $p$ is its momentum, and $\mu$ is a constant that depends on the masses.
(25) For the generalized symmetry of the Kepler problem, see Lévy–Leblond (1971, §5.B). Note that this system also admits a classical symmetry that is in a sense associated with the Lenz–Runge vector; see Prince and Eliezer (1981).
(26) For the classical and generalized symmetries of the Korteweg–de Vries equation, see Olver (1993, 125ff. and 312ff.).
(27) See, e.g., Krasil'shchik and Vinogradov (1999, ch. 6).
(28) For a thoroughly worked-out example in which an infinitesimal symmetry and the corresponding finite symmetry both depend on an integral over space of the dependent variable of the theory, see Cantwell (2002, §16.2.2.1).
(29) See, e.g., Leach, Andriopoulos, and Nucci (2003).
(30) For these labels and for an investigation of the relation between the two approaches, see Castrillón López and Marsden (2008). For introductions to the two approaches, see Abraham and Marsden (1985, §3.5ff. and Example 5.5.9).
(31) Under such approaches, one works with a space of instantaneous states (which will be infinite-dimensional in the field-theoretic case), equips this space with a real-valued function, L (the Lagrangian), and employs a variational principle to find those curves in the space of states that correspond to dynamically possible histories of the system.
(32) For instances of this style of approach, see, e.g., Castrillón López and Marsden (2008) and Krasil'shchik and
Vinogradov (1999, ch. 5).
(33) On scale transformations, see Olver (1993, 255). A standard remedy is to introduce the notion of a divergence symmetry, a transformation that leaves the Lagrangian invariant up to a total divergence; many interesting symmetries are divergence symmetries but not variational symmetries, including boosts of Newtonian systems and the conformal symmetries of the wave equation; see Olver (1993, 278–281). Scaling symmetries are more subtle. Scale transformations are symmetries of general relativity, but are neither variational nor divergence symmetries; see Anderson and Torre (1996, §2.B). Rescaling of space and time is a symmetry of the wave equation that is neither a variational nor a divergence symmetry, although there is a related scale transformation that acts on the dependent variables, as well as the independent variables, which is a divergence symmetry (but not a variational symmetry); see Olver (1993, Examples 2.43, 4.15, and 4.36).
(34) But: certain types of variational (or divergence) symmetries of theories whose initial value problems are ill-posed are associated with so-called trivial conservation laws; see Olver (1993, 342–346) on Noether's second theorem. And: there exist techniques for associating conservation laws with symmetries that do not rely on Noether's theorem; see, e.g., Bluman (2005).
(35) ω is a symplectic form, that is, a closed, nondegenerate two-form. ω and H determine a vector field $X_H$ on $\mathcal{J}$: $X_H$ is the vector field that when contracted with ω yields the one-form dH. Integrating this vector field gives the curves mentioned in the text. Note that there is a canonical recipe for constructing $\mathcal{J}$, H, and ω given a Lagrangian treatment of the theory.
(36) Note that the symplectic space $(\mathcal{J}, \omega)$ has a vast family of symmetries. Suppose that we are interested in a Newtonian theory of finitely many particles.
Then $\mathcal{J}$ is finite-dimensional, but the family of smooth permutations of $\mathcal{J}$ that preserve ω is infinite-dimensional; it is only when we restrict attention to transformations that also preserve H that we end up with something like what we want. Something similar is of course true in ordinary quantum mechanics: while the family of unitary transformations of a Hilbert space will be very large, the family of such transformations that preserve a given Hamiltonian will be quite small, and only the latter is a good candidate for the symmetry group of a theory. The situation is more perplexing in the case of fancier quantum theories. On the one hand, an arbitrary C∗-algebra automorphism is pretty clearly the analogue of an arbitrary symplectic or unitary transformation and so is not a good candidate to be a symmetry of a theory: indeed, in many cases of interest any two states are related by such an automorphism; see Kishimoto, Ozawa, and Sakai (2003). On the other hand, in some contexts it is not possible to identify symmetries of a theory with those C∗-algebra automorphisms that preserve the Hamiltonian because there is no Hamiltonian operator available at the C∗-algebra level; on this point, see Ruetsche (2011, §12.3).
(37) In truth, in order to formulate a plausible doctrine in the neighborhood of D2, one should probably work with infinitesimal symmetries; otherwise it is easy to concoct counterexamples by deleting points from a theory's space of solutions.
(38) There is disagreement over the question whether there are distinct physical possibilities related by shifts and the like. But that is a different question.
(39) In fact, the classical symmetries for this theory are not exhausted by the Galilei symmetries: already in two spacetime dimensions, where the Galilei group is three-dimensional, the classical symmetry group for the free particle is eight-dimensional. See, e.g., Duarte, Duarte, and Moreira (1987).
(40) See, e.g., Lutzky (1978, §3). Note that the complete classical symmetry group of the one-dimensional oscillator is eight-dimensional, and so outstrips the group of spacetime symmetries. Note also that all of these results carry over, mutatis mutandis, to the case of a time-dependent oscillator in n spatial dimensions; see Prince and Eliezer (1980).

(41) See, e.g., Hydon (2000, 145). For the classical symmetries of the heat equation and the wave equation, see, e.g., Olver (1993, Examples 2.41 and 2.43). For symmetries of the source-free Maxwell equations, see Anco and Pohjanpelto (2008).

(42) E.g., the group of nonlocal symmetries of the Kepler problem acts transitively on solutions; see Leach,
Andriopoulos, and Nucci (2003).

(43) See Olver (1993, Example 4.15).

(44) See Lutzky (1978) and Prince and Eliezer (1980).

(45) What happens if in place of variational symmetries, we consider divergence symmetries? Some problems go away, others reappear. Good news: Galilean boosts and certain types of scaling transformations count as symmetries (see fn. 33 above). Bad news: addition-of-an-arbitrary-solution is a divergence symmetry of the wave equation; Olver (1993, Example 4.36). So if we count solutions related by a divergence symmetry as physically equivalent, we have to view every pair of solutions of the wave equation as being physically equivalent.

(46) In the case of general relativity, denying that solutions g and c · g (c a positive constant) related by a scale transformation are physically equivalent is especially difficult for those who deny that there are possible worlds that agree about distance ratios but disagree about matters of absolute distance, for according to such philosophers there is only one world that is adequately represented by g and c · g taken together, and it is not easy to see how one of the two solutions could have a better claim than the other to represent that world.

(47) The problem of identifying those which do so is known as the inverse problem of the calculus of variations or as Helmholtz's problem. For an introductory survey, see Prince (2000).

(48) See, e.g., Ibragimov and Kolsrud (2004), Olver (1993, Exercises 5.35, 5.36, and 5.46), Rosen (1966), and Sorkin (2002).

(49) Just as one might move from variational symmetries to divergence symmetries (see fn. 33 above), one might consider transformations of a system's phase space that leave invariant the set of Hamiltonian trajectories without worrying about whether they also leave the Hamiltonian itself invariant.
Suitably interpreted, this should manage to capture Galilean boosts in Newtonian mechanics and the scaling symmetry of the Kepler problem; see Abraham and Marsden (1985, 446 f.) and Prince and Eliezer (1981, §5). Of course, it also includes the various undesirable characters that already count as Hamiltonian symmetries (see below).

(50) See Morehead (2005). This symmetry corresponds to a generalized symmetry of the equations of motion.

(51) For the negative energy Kepler problem, the length of the major axis determines the energy of a solution, so any Hamiltonian symmetry leaves this quantity invariant; see, e.g., Goldstein, Poole, and Safko (2002, §3.7).

(52) (1) One might think that we would do so if scaling transformations were up for grabs, since one can indeed transform a circular orbit into an eccentric orbit by rescaling one coordinate axis while leaving the others invariant. But this sort of rescaling does not preserve the Hamiltonian of the Kepler problem (because it changes the length of the major axis of some solutions). (2) The discussion above glosses over an interesting subtlety. At the infinitesimal level, one does indeed find a large set of symmetries of the Kepler problem. But the space of initial data features a singular set of points (corresponding to situations in which the two particles collide). And the existence of this singular set provides an obstruction to integrating the infinitesimal symmetries into a group action; see Cushman and Bates (1997, 74). However, there exist ways of regularizing the singularities of the Kepler problem, and these constructions lead to a family of finite symmetries that act transitively on surfaces of constant energy. For discussion and references, see Cushman and Bates (1997, ch. 2) or Guillemin and Sternberg (1990, §2.7).
(3) Note that the fact that the Hamiltonian symmetry group of the Kepler problem is larger than its spacetime symmetry group plays an important role in the quantum theory of the hydrogen atom; see Jauch and Hill (1940) or Guillemin and Sternberg (1990, §7).

(53) Consider the two-dimensional case. There are four conserved quantities, corresponding to four independent Hamiltonian symmetries. A surface of constant energy is a three-sphere in the phase space and the group of Hamiltonian symmetries is the group U(2), which acts transitively on the energy surfaces. For details, see, e.g., Goldstein, Poole, and Safko (2002, §9.8), Cushman and Rod (1982), or Cushman and Bates (1997, ch. 2).

(54) Again, the fact that the symmetry group of the classical system is U(2) rather than just the spacetime symmetry group plays an important role in the quantum theory; see Jauch and Hill (1940).

(55) In the examples just considered, it is pretty clear that one does not want to count every pair of solutions
related by a generalized symmetry as being physically equivalent. Does one ever want to count solutions as physically equivalent that are related by a generalized symmetry that is not a classical symmetry? Yes: for instance, when the solutions in question are also related by a respectable classical symmetry. Consider, e.g., the generalized gauge transformations described in fn. 21 above; if two solutions are related by such a symmetry, then they are also related by an ordinary gauge transformation. Are there pairs of solutions related by a generalized symmetry (but not by any classical symmetry) that one would want to consider physically equivalent? That appears to be a more difficult question. Part of the difficulty lies in the fact that what one has in practice are the infinitesimal generators of generalized symmetries: it is in general a nontrivial task to find the corresponding group actions; see, e.g., Olver (1993, 297ff.). Further, even in cases where the corresponding groups of transformations can be determined, their physical interpretation can be obscure; see, e.g., Olver (1984, 136f.).

(56) Of course, some interesting weaker relative of D2 might be true. E.g., for all that has been said here, being related by a Hamiltonian symmetry that corresponds to a classical symmetry of their equation of motion may be a sufficient condition for two solutions to be physically equivalent.

(57) For what follows, see, e.g., Krasil'shchik and Vinogradov (1999, ch. 3).

(58) But note that everything described below can be done in a respectable global and coordinate-independent fashion. See, e.g., Saunders (2008).

(59) The space of vectors picked out by C at a point of J^k(E) coincides with the vectors annihilated by the family of one-forms on J^k(E) that enforce the differential relations that require the p_i to be the derivatives of the components of u with respect to the x_j.
(60) To do everything honestly, we would need to: (i) introduce a generalized notion of a solution; (ii) work locally; or (iii) shift our focus to the infinitesimal symmetries.

(61) This is one point at which the story can become more complicated for fields that are not given by tensors. See, e.g., Kolář, Michor, and Slovák (1993) on gauge natural bundles.

Gordon Belot

Gordon Belot is Professor of Philosophy at the University of Michigan. He has published a number of articles on philosophy of physics and related areas, and one small book, Geometric Possibility (Oxford, 2011).
Indistinguishability

Simon Saunders

The Oxford Handbook of Philosophy of Physics
Edited by Robert Batterman

Abstract and Keywords

This chapter analyzes permutation symmetry, focusing on the proper understanding of particle indistinguishability in classical statistical mechanics and in quantum theory. It shows that it is possible to treat the statistical mechanical statistics for classical particles as invariant under permutation symmetry and argues that while the concept of indistinguishable, permutation invariant, classical particles is coherent, it is in contradiction with many claims found in the literature. The chapter also contends that the concept of permutation symmetry should be considered on the same level as other symmetries and invariances of physical theories.

Keywords: permutation symmetry, particle indistinguishability, statistical mechanics, quantum theory, mechanical statistics, classical particles, physical theories

By the end of the nineteenth century the concept of particle indistinguishability had entered physics in two apparently quite independent ways: in statistical mechanics, where, according to Gibbs, it was needed in order to define an extensive entropy function; and in the theory of black-body radiation, where, according to Planck, it was needed to interpolate between the high frequency (Wien law) limit of thermal radiative equilibrium, and the low frequency (Rayleigh-Jeans) limit. The latter, of course, also required the quantization of energy, and the introduction of Planck's constant: the birth of quantum mechanics.

It was not only quantum mechanics. Planck's work, and later that of Einstein and Debye, foreshadowed the first quantum field theory as written down by Dirac in 1927.
Indistinguishability is essential to the interpretation of quantum fields in terms of particles (Fock space representations), and thereby to the entire framework of high-energy particle physics as a theory of local interacting fields. In this chapter, however, we confine ourselves to particle indistinguishability in low-energy theories, in quantum and classical statistical mechanics describing ordinary matter. We are also interested in indistinguishability as a symmetry, to be treated in a uniform way with other symmetries of physical theories, especially with spacetime symmetries. That adds to the need to study permutation symmetry in classical theory, and returns us to Gibbs and the derivation of the entropy function.

The concept of particle indistinguishability thus construed faces some obvious challenges. It remains controversial, now for more than a century, whether classical particles can be treated as indistinguishable; or if they can, whether the puzzles raised by Gibbs are thereby solved or alleviated; and if so, how the differences between quantum and classical statistics are to be explained. The bulk of this chapter is about these questions.

In part they are philosophical. As Quine remarked:

Those results [in quantum statistics] seem to show that there is no difference even in principle between saying of two elementary particles of a given kind that they are in the respective places a and b and that they are oppositely placed, in b and a. It would seem then not merely that elementary particles are unlike
bodies; it would seem that there are no such denizens of spacetime at all, and that we should speak of places a and b merely as being in certain states, indeed the same state, rather than as being occupied by two things. (Quine 1990, 35)

He was speaking of indistinguishable particles in quantum mechanics, but if particles in classical theory are treated the same way, the same questions arise.

This chapter is organized in three sections. The first is about the Gibbs paradox and is largely expository. The second is on particle indistinguishability, and the explanation of quantum statistics granted that classical particles, just like quantum particles, can be treated as permutable. The third is about the more philosophical questions raised by sections 1 and 2, and the question posed by Quine.

There is a special difficulty in matters of ontology in quantum mechanics, if only because of the measurement problem.1 I shall, so far as is possible, be neutral on this. My conclusions apply to most realist solutions of the measurement problem, and even some nonrealist ones.

1. The Gibbs Paradox

1.1 Indistinguishability and the Quantum

Quantum theory began with a puzzle over the statistical equilibrium of radiation with matter. Specifically, Planck was led to a certain combinatorial problem: For each frequency ν_s, what is the number of ways of distributing an integral number N_s of "energy elements" over a system of C_s states (or "resonators")?

The distribution of energy over each type of resonator must now be considered, first, the distribution of the energy E_s over the C_s resonators with frequency ν_s. If E_s is regarded as infinitely divisible, an infinite number of different distributions is possible. We, however, consider (and this is the essential point) E_s to be composed of a determinate number of equal finite parts and employ in their determination the natural constant h = 6.55 × 10^−27 erg sec.
This constant, multiplied by the frequency, ν_s, of the resonator yields the energy element Δε_s in ergs, and dividing E_s by hν_s, we obtain the number N_s, of energy elements to be distributed over the C_s resonators. (Planck 1900, 239)2

Thus was made what is quite possibly the most successful single conjecture in the entire history of physics: the existence of Planck's constant h, postulated in 1900 in the role of energy quantization. The number of distributions Z_s, or microstates as we shall call them, as a function of frequency, was sought by Planck in an effort to apply Boltzmann's statistical method to calculate the energy-density Ē_s of radiative equilibrium as a function of temperature T and of Z_s. To obtain agreement with experiment he found

\[ Z_s = \frac{(N_s + C_s - 1)!}{N_s!\,(C_s - 1)!} \tag{1} \]

The expression has a ready interpretation: it is the number of ways of distributing N_s indistinguishable elements over C_s distinguishable cells, that is, of noting only how many elements are in which cell, not which element is in which cell.3 Equivalently, the microstates are distributions invariant under permutations. When this condition is met, we call the elements permutable.4 Following standard physics terminology, they are identical if these elements, independent of their microstates, have exactly the same properties (like charge, mass, and spin).

Planck's "energy elements" at a given frequency were certainly identical; but whether it followed that they should be considered permutable was hotly disputed. Once interpreted as particles ("light quanta"), as Einstein proposed, there was a natural alternative: Why not count microstates as distinct if they differ in which particle is located in which cell, as had Boltzmann in the case of material particles?
On that count the number of distinct microstates should be:

\[ Z_s = C_s^{N_s} \tag{2} \]

Considered in probabilistic terms, again as Einstein proposed, if each of the N_s elements is assigned one of the C_s cells at random, independent of each other, the number of such assignments will be given by (2), each of them equiprobable.
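The two counting rules just described can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names are illustrative, and the closed forms are the standard combinatorial ones for the two rules, namely (N_s + C_s − 1)! / (N_s! (C_s − 1)!) when only the occupation numbers of the cells matter, and C_s raised to the power N_s when it matters which element sits in which cell. A brute-force enumeration over small values confirms both.

```python
from itertools import product
from math import comb


def permutable_count(n_elements, n_cells):
    """Microstates when only occupation numbers matter:
    (N + C - 1)! / (N! (C - 1)!), computed as a binomial coefficient."""
    return comb(n_elements + n_cells - 1, n_elements)


def labelled_count(n_elements, n_cells):
    """Microstates when it matters which element is in which cell: C ** N."""
    return n_cells ** n_elements


def enumerate_counts(n_elements, n_cells):
    """Brute-force check of both counts for small systems.

    Lists every assignment of labelled elements to cells, then forgets
    which element is where by keeping only the occupation numbers.
    Returns (number of distinct occupation patterns, number of assignments).
    """
    assignments = list(product(range(n_cells), repeat=n_elements))
    occupations = {
        tuple(a.count(cell) for cell in range(n_cells)) for a in assignments
    }
    return len(occupations), len(assignments)


if __name__ == "__main__":
    for n, c in [(2, 2), (3, 2), (2, 3)]:
        print(n, c, permutable_count(n, c), labelled_count(n, c),
              enumerate_counts(n, c))
```

For N_s = 2 elements and C_s = 2 cells, the labelled count is 4 and the permutable count is 3: the two assignments that differ only by swapping the elements collapse into a single microstate once the elements are treated as permutable.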