Friday, July 1, 2022

Which Standard Model Muon g-2 Calculation Is Correct?

Background

Recent experimental data has confirmed, to high precision and with well understood uncertainties, a prior measurement of the anomalous magnetic moment (muon g-2) of the muon, a "second generation" heavy electron whose magnetic properties are predicted to extremely high precision in the Standard Model of Particle Physics.

This magnetic property of muons has been both measured experimentally, and calculated, at precisions on the order of parts per billion. But, because the experimental measurement and calculation are both so precise, differences between the experimental measurement and the theoretical prediction that are tiny in absolute terms can still be very statistically significant and point to potential flaws in the Standard Model of Particle Physics used to make the calculation. 

Also, the uncertainties in the theoretical calculation, while quantified as having particular values, are less well understood than the uncertainties in the experimental measurement and may be understated.

There are three components to the calculation corresponding to the three fundamental forces of the Standard Model (gravity can be safely neglected in this context): the electromagnetic component (QED), the weak force component (EW), and the strong force component (QCD a.k.a. the hadronic component).

The contributions of the QED, EW, and QCD parts to the overall result, and of the two subparts of the QCD part called hadronic vacuum polarization (HVP) and hadronic light by light (HLbL), from the Theory Initiative's determination in Aoyama, et al., "The anomalous magnetic moment of the muon in the Standard Model" arXiv (June 8, 2020), can be summarized as follows (multiplied by 10^11 for easier reading):

Muon g-2 = QED+EW+QCD

QED = 116 584 718.931 ± 0.104

EW = 153.6 ± 1.0

QCD = HVP+HLbL = 6937 ± 44

HVP = 6845 ± 40

HLbL = 92 ± 18

The QED and EW components are both profoundly easier to calculate, and much more precisely determinable, than the QCD component, for intrinsic scientific reasons related to the differences between these forces discussed in the final section of this post.

About 99.5% of the uncertainty in the muon g-2 calculation comes from the QCD part, even though it is the source of only about 0.006% of the absolute value of muon g-2.

Both the HVP and HLbL parts of the QCD calculation make a material contribution to the uncertainty in the muon g-2 calculation, but the HVP part contributes much more to the overall uncertainty than the HLbL part, because when you have multiple sources of uncertainty in a calculation, the bigger uncertainties tend to swamp the smaller ones unless they are very close in magnitude and there are a great many distinct smaller ones.
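As a rough check on the arithmetic above, the sketch below adds up the central values and combines the two hadronic uncertainties in quadrature. It assumes the HVP and HLbL uncertainties are independent, which is a simplification, and the variable names are purely illustrative.

```python
import math

# Theory Initiative central values and one standard deviation uncertainties,
# multiplied by 10^11 as in the list above.
QED = 116_584_718.931
EW = 153.6
HVP, HVP_ERR = 6845.0, 40.0
HLBL, HLBL_ERR = 92.0, 18.0

# Central values simply add.
qcd = HVP + HLBL
total = QED + EW + qcd

# Assumption: independent uncertainties, so they add in quadrature
# (square root of the sum of squares), not linearly.
qcd_err = math.hypot(HVP_ERR, HLBL_ERR)

# Fraction of the total value that comes from QCD.
qcd_share_of_value = qcd / total

# Fraction of the hadronic variance that comes from HVP alone.
hvp_share_of_variance = HVP_ERR**2 / qcd_err**2

print(f"QCD = {qcd:.0f} +/- {qcd_err:.1f}")
print(f"Total = {total:.3f}")
print(f"QCD share of value = {100 * qcd_share_of_value:.4f}%")
print(f"HVP share of hadronic variance = {100 * hvp_share_of_variance:.1f}%")
```

Because the squares are what add, an uncertainty of 40 combined with one of 18 gives a result only slightly larger than 40, which is the "swamping" effect described above.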

There is no disagreement concerning the values of the QED and EW contributions to the calculation. So, naturally, all of the debate over the correct Standard Model prediction of muon g-2 comes down to the QCD calculation with the HVP part of that calculation taking center stage, even though both parts are important.

There are two leading determinations of the hadronic component of the Standard Model prediction, which is by far the most difficult part of the calculation. One is the Theory Initiative value, which substitutes experimental data for some lattice QCD computations and differs significantly from the experimentally measured value of muon g-2. The other is the BMW group value, which uses purely lattice QCD computations and is consistent with the experimentally measured value.

The experimental results from directly measuring muon g-2, and the theoretically calculated Standard Model predictions for the value of muon g-2, can be summarized as follows (multiplied by 10^11 for easier reading, with the one standard deviation uncertainty in the final digits shown in parentheses after each result):

Fermilab (2021): 116,592,040(54)
Brookhaven's E821 (2006): 116,592,089(63)
Combined measurement: 116,592,061(41)
Difference between measurements: 49 (0.6 sigma)

Theory Initiative (TI) prediction: 116,591,810(43)
BMW prediction: 116,591,954(55)
Difference between predictions: 144 (2.1 sigma)

Combined measurement v. TI: 251 (4.2 sigma)
Combined measurement v. BMW: 107 (1.6 sigma)
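
The differences and significances in this list can be recomputed from the central values and uncertainties given above. The sketch below is a minimal illustration that assumes the two uncertainties in each comparison are independent and combines them in quadrature; the function name tension and the dictionary layout are illustrative choices, not anything taken from the collaborations.

```python
import math

def tension(value_a, err_a, value_b, err_b):
    """Return (absolute difference, difference in combined standard deviations)."""
    difference = abs(value_a - value_b)
    # Assumption: independent errors, combined in quadrature.
    combined_err = math.hypot(err_a, err_b)
    return difference, difference / combined_err

# (central value, one standard deviation), multiplied by 10^11 as in the list above.
results = {
    "Fermilab": (116_592_040, 54),
    "Brookhaven": (116_592_089, 63),
    "Combined": (116_592_061, 41),
    "TI": (116_591_810, 43),
    "BMW": (116_591_954, 55),
}

pairs = [
    ("Fermilab", "Brookhaven"),
    ("TI", "BMW"),
    ("Combined", "TI"),
    ("Combined", "BMW"),
]

for first, second in pairs:
    diff, n_sigma = tension(*results[first], *results[second])
    print(f"{first} v. {second}: {diff} ({n_sigma:.1f} sigma)")
```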

What's At Stake?

The stakes regarding which of these calculations of the Standard Model predicted value is correct are high, because the value of muon g-2 is an experimentally observable quantity that is globally sensitive to the existence and magnitude of most lower energy deviations from the Standard Model of Particle Physics in a single high precision measurement.

If the Theory Initiative value is correct, something about the Standard Model is incorrect at energy scales that can be reached by existing or near future high energy physics experiments. 

If the BMW value is correct, then any deviations between Nature and the Standard Model at energy scales that can be reached by existing or near future high energy physics experiments are limited to those that cancel out in muon g-2 calculations (which, as a practical matter, rules out almost all seriously considered experimentally accessible new particle physics theories).

If the BMW calculation is right, we are almost certainly in a "new physics desert" and a next generation particle collider will probably reveal no new physics.

If the Theory Initiative calculation is right, there is a very high likelihood that new physics is right around the corner and will be seen at a next generation particle collider.

Given that the cost of a next generation particle collider is on the order of billions of dollars to build and operate for its full experimental run, the desirability of this very big new purchase for humanity hinges to a great extent on whether the BMW or Theory Initiative calculation is right.

Also, even if we can't resolve that question, knowing precisely what is causing the differences between the calculations can highlight and clarify which kind of next generation collider is most likely to be useful to see new physics if it is out there, and what features it is important for a next generation collider to have to be able to resolve the questions that are the underlying source of the discrepancy.

The Latest Development

A new lattice QCD calculation continues the process of pinning down the exact source of the difference in calculations of the hadronic component of the Standard Model expected value of the anomalous magnetic moment of the muon (muon g-2) between the Theory Initiative value and the BMW value.

This new calculation isolates a particular part of the hadronic vacuum polarization (HVP) part of the hadronic component of the muon g-2 prediction, called the intermediate time-distance window, that is one of the main sources of the discrepancy between the Theory Initiative calculation and the BMW value.

This focuses the effort of getting to the bottom of the question of why two collaborations of distinguished physicists are getting different results, which can facilitate the process of figuring out which is correct.

The Paper

The paper and its abstract (not reformatted for blogger to preserve superscripts and subscripts and the like in formulas):

We present a lattice determination of the leading-order hadronic vacuum polarization (HVP) contribution to the muon anomalous magnetic moment, a_μ^HVP, in the so-called short and intermediate time-distance windows, a_μ^SD and a_μ^W, defined by the RBC/UKQCD Collaboration.
We employ a subset of the gauge ensembles produced by the Extended Twisted Mass Collaboration (ETMC) with N_f = 2+1+1 flavors of Wilson-clover twisted-mass quarks, which are close to the physical point for the masses of all the dynamical flavors. The simulations are carried out at three values of the lattice spacing ranging from ≃ 0.057 to ≃ 0.080 fm with linear lattice sizes up to L ≃ 7.6 fm.
For the short distance window we obtain a_μ^SD(ETMC) = 69.33(29) × 10^-10, which is consistent with the recent dispersive value a_μ^SD(e+e-) = 68.4(5) × 10^-10 within ≃ 1.6σ.
In the case of the intermediate window we get the value a_μ^W(ETMC) = 235.0(1.1) × 10^-10, which is consistent with the result a_μ^W(BMW) = 236.7(1.4) × 10^-10 by the BMW collaboration as well as with the recent determination by the CLS/Mainz group of a_μ^W(CLS) = 237.30(1.46) × 10^-10 at the ∼ 1.0 - 1.3σ level. However, it is larger than the dispersive result a_μ^W(e+e-) = 229.4(1.4) × 10^-10 by ≃ 3.1σ. The tension increases to ≃ 4.2σ if we average our ETMC result with the BMW and the CLS/Mainz ones.
Our accurate lattice results in the short and intermediate windows hint at possible deviations of the e+e- cross section data with respect to Standard Model (SM) predictions distributed somewhere in the low (and possibly intermediate) energy regions, but not in the high energy region.
C. Alexandrou, et al., "Lattice calculation of the short and intermediate time-distance hadronic vacuum polarization contributions to the muon magnetic moment using twisted-mass fermions" arXiv:2206.15084 (June 30, 2022) (82 pages).

My Take

For what it is worth, I personally believe it is very likely that the BMW calculation that matches the experimentally measured value is the correct one, and that the apparent muon g-2 anomaly hinting at a variety of possible "new physics" is, in fact, merely due to a flaw in the Theory Initiative determination of the Standard Model predicted value of muon g-2. 

The Theory Initiative determination might be flawed because the experimental data it is using to substitute for some lattice QCD calculations is itself flawed. This is something that was found previously to have caused the muonic proton radius problem. 

If that is the problem, it could be resolved by redoing the electron collider experiments conducted at the Large Electron-Positron Collider (LEP) from 1989-2000, upon which the Theory Initiative is mostly relying, with the greater precision and quality control methods that the subsequent two decades of high energy physics have made possible.

On the other hand, if the problem with the Theory Initiative calculation is that the way this experimental data was incorporated into the overall calculation has some subtle flaw, a new theoretical paper could point out the source of the error. This task would be advanced by a better understanding of what part of the Theory Initiative determination is most likely to be flawed, allowing scientists to better focus on what kind of methodological error might be involved, which is what this new paper helps to do.

Perspective On The Precision Of These Measurements

In order to maintain perspective, it is also important to note that both experimental measurements of muon g-2, and both leading Standard Model theoretical predictions, are identical when rounded to six significant digits. Thus, they agree to within a few parts per million. Only at the seventh significant digit do discrepancies emerge.

This actually underestimates the precision, because the full value of the magnetic moment of the muon which is actually measured, called g(µ), as opposed to the merely anomalous component of the magnetic moment of the muon called muon g-2, is approximately 2.00233184(1) (i.e. double the anomalous magnetic moment plus two). This adds three more significant digits to its value, so that the measurements and predictions agree when rounded to nine significant digits, with discrepancies of only a few parts per billion.

This is greater precision than, for example, the empirically determined precision of theoretically much easier tasks, such as a first round count of the ballots cast in a statewide or national election, or the count of the number of people residing in the United States on a particular day every ten years in the decennial census.

The discrepancies in g(µ) are arising at a precision equivalent to a few millimeters per thousand kilometers.
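
For readers who want to check the relative size of these gaps themselves, here is a minimal sketch. It uses the relation described above (g is double the anomalous magnetic moment plus two) together with the combined measurement and the Theory Initiative prediction quoted earlier in this post; the names g_factor, A_EXP and A_TI are illustrative only.

```python
# Combined measurement and Theory Initiative prediction of muon g-2,
# converted from the "multiplied by 10^11" convention used earlier.
A_EXP = 116_592_061 * 1e-11
A_TI = 116_591_810 * 1e-11

def g_factor(a):
    """Full magnetic moment g from the anomalous part a, using g = 2 * (1 + a)."""
    return 2.0 * (1.0 + a)

g_exp = g_factor(A_EXP)
g_ti = g_factor(A_TI)

# Gap relative to the anomalous magnetic moment itself.
rel_gap_a = (A_EXP - A_TI) / A_EXP

# The same absolute gap relative to the full value of g, which is
# roughly a thousand times larger than the anomalous part.
rel_gap_g = (g_exp - g_ti) / g_exp

print(f"g (measured)  = {g_exp:.11f}")
print(f"g (predicted) = {g_ti:.11f}")
print(f"relative gap in muon g-2: {rel_gap_a:.2e}")
print(f"relative gap in g:        {rel_gap_g:.2e}")
```

The same absolute gap looks about a thousand times smaller when expressed relative to g, simply because g is about a thousand times larger than its anomalous part.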

Footnote Regarding Statistical Significance

In physics (and most fields) a discrepancy of less than two sigma (i.e. two standard deviations in a "normal" distribution of data) is considered statistically insignificant and constitutes results that are "consistent" with each other. 

In physics, a discrepancy with a global statistical significance of five sigma or more that is replicated and has some plausible theoretical reason is the standard for a definitive scientific discovery.

The focus on "global significance" is due to the "look elsewhere effect", which observes that if you do enough experiments of the same kind, you expect some of the results to be statistical flukes that would be statistically significant if you were only doing one experiment. For example, if you do twenty experiments of the same kind, you expect to have, on average, one outlier that is more than two sigma from the true value. But, correctly calculating global significance is a matter that is more art than science, because you need to determine how to count the total number of experiments you have done that are measuring the same thing, and this turns out to be very hard to define in any complex multifaceted context like particle collider experiments that do millions or billions of collisions or more over the lifetime of the experiment, not all of which are comparable to each other.
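
The twenty experiment example can be checked with a few lines of code. This is a sketch that assumes each result is drawn independently from a normal distribution centered on the true value; the function name two_sided_tail is illustrative.

```python
import math

def two_sided_tail(n_sigma):
    """Probability that a normally distributed result lands more than
    n_sigma standard deviations from the true value, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

n_experiments = 20
p_outlier = two_sided_tail(2.0)

# Average number of two sigma outliers among the experiments.
expected_outliers = n_experiments * p_outlier

# Chance that at least one of the experiments is a two sigma outlier,
# assuming the experiments are independent.
p_at_least_one = 1.0 - (1.0 - p_outlier) ** n_experiments

print(f"P(more than 2 sigma) per experiment: {p_outlier:.4f}")
print(f"Expected outliers in {n_experiments} experiments: {expected_outliers:.2f}")
print(f"P(at least one outlier): {p_at_least_one:.2f}")
```

The hard part in practice is not this arithmetic but deciding what number to use for n_experiments.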

In physics, a discrepancy of more than two but less than five sigma, or a discrepancy that hasn't been replicated, or doesn't have any plausible theoretical explanation, is considered a "tension" between theory and experiment, that is stronger if the number of sigma differing between experiment and theory is larger, but doesn't constitute a definitive scientific discovery. Scientists spend a lot of time seeing if tensions that they observe go away with further research, or solidify into higher significance scientific discoveries.

Why Is The QCD Calculation So Difficult?

The QCD calculation is much more difficult than the QED and EW calculations for two main reasons.

Coupling constant strength 

One is that all of the calculations involve terms for each power of the coupling constant (a dimensionless number) of the force in question, and the magnitude of these coupling constants is very different for the respective forces.

In other words, they take the form:

a*g + b*g^2 + c*g^3 . . .

where a, b, and c are real numbers that come from adding up the calculations for the terms with the same power of the coupling constant for the force in question, and g is the coupling  constant for the force in question.

The strong force coupling constant at the muon mass is in the ballpark of:

0.7 to 0.9

and even at the fourth power it is still about 0.24 to 0.66.

It gets significantly weaker at higher energy scales, reaching 0.1184 at the energy scale of the Z boson mass (about 91.2 GeV).

The QED coupling constant is, in the low energy limit:

0.007 297 352 569 3(11)

and at the fourth power it is about 0.000 000 002 8, which is about one hundred million times smaller than the fourth power of the strong force coupling constant at the muon mass.

The QED coupling constant gets slightly stronger at higher energy scales, reaching about 0.00787 at the energy scale of the Z boson mass.

The weak force coupling constant is on the order of:

0.000001

and at the fourth power it is about 10^-24.

Converted into terms comparable to the coupling constants above, the gravitational coupling constant between two protons is about

6 * 10^-39

which is roughly comparable in magnitude to the sixth power of the weak force coupling constant, the eighteenth power of the QED coupling constant, and a far higher power of the QCD coupling constant.

As a result, terms with higher powers of the QCD coupling constant can't be ignored (especially in low energy interactions, where the methodology used for QED and weak force calculations, called perturbative methods, breaks down and different methods called lattice QCD need to be used), while higher order terms in QED (typically calculated to the fifth power of the QED coupling constant) and the weak force calculation can be ignored.
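
To make the comparison concrete, the sketch below tabulates the first five powers of each coupling constant quoted above. It looks only at the powers of g themselves and ignores the coefficients a, b, c and so on, which is a deliberate simplification; the dictionary labels and the function name power_table are illustrative.

```python
# Approximate coupling constants quoted above (dimensionless).
couplings = {
    "strong, low end at muon mass": 0.7,
    "strong, high end at muon mass": 0.9,
    "QED, low energy limit": 0.0072973525693,
    "weak": 1e-6,
}

def power_table(g, max_power=5):
    """Return [g, g^2, ..., g^max_power]."""
    return [g ** n for n in range(1, max_power + 1)]

tables = {name: power_table(g) for name, g in couplings.items()}

for name, powers in tables.items():
    print(f"{name:32s}", "  ".join(f"{p:.2e}" for p in powers))

# How much larger the fourth power of the strong coupling (low end)
# is than the fourth power of the QED coupling.
ratio_at_fourth_power = (
    tables["strong, low end at muon mass"][3] / tables["QED, low energy limit"][3]
)
print(f"strong / QED at fourth power: {ratio_at_fourth_power:.1e}")
```

Each additional power of the QED coupling shrinks a term by more than a factor of one hundred, while each additional power of the strong coupling at the muon mass shrinks it by well under a factor of two.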

Gluon self-interactions

Let's return to our formula for each contribution in the form 

a*g + b*g^2 + c*g^3 . . .

where a, b, and c are real numbers that come from adding up the calculations for the terms with the same power of the coupling constant for the force in question, and g is the coupling  constant for the force in question.

This formula is really the sum of terms for every possible way that a process can happen (which is described by a Feynman diagram), and the power of the coupling constant is a function of how many interactions there are with the force in question in a possible way that something can happen.

In the case of QED, the electromagnetic force is carried by the photon, which interacts with electromagnetically charged particles, but not with other photons.

In the case of the weak force, which is carried by W and Z bosons, these force carrying particles can interact with each other, but those interactions are so weak that their contributions are very small and need to be considered only at the first or second order level.

But, in the case of the strong force, which is carried by gluons, gluons interact with each other with a strength on the same order of magnitude as interactions between gluons and quarks in the strong force. This means that at each power of the strong force coupling constant, there are far more terms to be considered than in the QED or EW calculations, and that the rate at which the number of terms grows with each additional power of the strong force coupling constant is much greater than in the QED or EW calculations.

Conclusion

So, the bottom line is that to get comparable precision, you need to consider far higher powers of the coupling constant to do strong force calculations than QED or EW calculations, and the number of terms that have to be calculated at each power of the coupling constant in strong force calculations is also profoundly greater than in QED or EW calculations, with the disparity getting worse with each additional power of the coupling constant you try to consider.

Once the calculations are set up for the QED or EW cases, those calculations can be made to maximal precision in less than a day with an ordinary desktop computer with a single processor, and the limiting factor on the number of calculations you do is the precision of the coupling constant measurement, which leaves you with spurious accuracy beyond the fifth power of that coupling constant in QED and sooner in the EW calculation.

In contrast, the strong force calculations done to three or four powers of the strong force coupling constant, which still aren't very precise, take weeks of non-stop calculations with the equivalent of millions of single processor desktop computers working together.
