Dispatches From Turtle Island: mathematics


Wednesday, July 9, 2025

Non-Linear Cosmology Dynamics

Assuming the data has a Gaussian distribution (i.e. is distributed in a "normal" probability curve) is often reasonable, since this is what tends to happen when data reflects the sum of many independent random events. And, it is a convenient assumption when it works, because mathematically it is much easier to work with Gaussian distributions than most other probability distributions. But, sometimes reality is more complicated than that and this assumption isn't reasonable.

The supernova data used to characterize dark energy phenomena isn't Gaussian.

Trivially, this means that statistical uncertainty estimates based upon Gaussian distributions overestimate the statistical significance of observations in the fat tailed t-distribution.
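To make the fat tail point concrete, here is a minimal sketch comparing the chance of landing more than three scale units from the center under a Gaussian and under a Student's t-distribution. The choice of 3 degrees of freedom is purely an assumption for illustration (the paper's fitted value is not quoted here), and the two curves are compared at the same raw value rather than matched in variance.

```python
import math

def normal_two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal variable."""
    return math.erfc(k / math.sqrt(2.0))

def student_t3_two_sided_tail(k: float) -> float:
    """P(|T| > k) for Student's t with 3 degrees of freedom (closed form CDF)."""
    x = k / math.sqrt(3.0)
    cdf = 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi
    return 2.0 * (1.0 - cdf)

k = 3.0  # "three sigma" in Gaussian language
p_normal = normal_two_sided_tail(k)
p_t3 = student_t3_two_sided_tail(k)

print(f"Gaussian tail probability beyond {k}: {p_normal:.5f}")
print(f"t (3 d.o.f.) tail probability beyond {k}: {p_t3:.5f}")
print(f"ratio: {p_t3 / p_normal:.1f}")
```

An outlier that looks rare under the Gaussian assumption is far less surprising under the fat tailed alternative, which is why Gaussian error bars overstate significance in that case.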

Non-trivially, this means that the underlying physics of dark energy phenomena are more mathematically complex than something like Newtonian gravity (often assumed for astronomy purposes as a reasonable approximation of general relativity) or a simple cosmological constant. Simple cosmology models don't match the data.

This paper estimates dark energy parameters for more complex dark energy models that can fit the data.

Type Ia supernovae have provided fundamental observational data in the discovery of the late acceleration of the expansion of the Universe in cosmology. However, this analysis has relied on the assumption of a Gaussian distribution for the data, a hypothesis that can be challenged with the increasing volume and precision of available supernova data.

In this work, we rigorously assess this Gaussianity hypothesis and analyze its impact on parameter estimation for dark energy cosmological models. We utilize the Pantheon+ dataset and perform a comprehensive statistical analysis, including the Lilliefors and Jarque-Bera tests, to assess the normality of both the data and model residuals.

We find that the Gaussianity assumption is untenable and that the redshift distribution is more accurately described by a t-distribution, as indicated by the Kolmogorov-Smirnov test. Parameters are estimated for a model incorporating a nonlinear cosmological interaction for the dark sector. The free parameters are estimated using multiple methods, and bootstrap confidence intervals are constructed for them.

Fabiola Arevalo, Luis Firinguetti, Marcos Peña, "On the Gaussian Assumption in the Estimation of Parameters for Dark Energy Models" arXiv:2507.05468 (July 7, 2025).
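For readers who haven't met the Jarque-Bera test mentioned in the abstract, the sketch below computes its statistic from sample skewness and kurtosis using only the standard library. The sample values are made up for illustration and are not Pantheon+ data; the p-value uses the asymptotic chi-square distribution with two degrees of freedom, which is only a rough guide for small samples.

```python
import math

def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments using the 1/n (biased) convention of the JB test.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Hypothetical residuals, for illustration only.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
jb_stat, jb_p = jarque_bera(residuals)
print(f"JB = {jb_stat:.4f}, p = {jb_p:.4f}")
```

A large statistic (small p-value) is evidence against normality; with real supernova residuals one would also want the small-sample corrections that statistical packages provide.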

Monday, June 30, 2025

Criticism Of Numerical Approaches To Indo-European Language Phylogeny

I've long been critical of Gray & Atkinson and the New Zealand school's efforts to do computational linguistics for the Indo-European languages, and specifically questioned Heggarty (2023) when it was released. A big factor in that is mishandling the importance of language contact, which can vary depending upon the relative dominance of the languages in contact and the nature of the words, phonetic values, and grammatical structures involved.

In this paper, we present a brief critical analysis of the data, methodology, and results of the most recent publication on the computational phylogeny of the Indo-European family (Heggarty et al. 2023), comparing them to previous efforts in this area carried out by (roughly) the same team of scholars (informally designated as the “New Zealand school”), as well as concurrent research by scholars belonging to the “Moscow school” of historical linguistics.

We show that the general quality of the lexical data used as the basis for classification has significantly improved from earlier studies, reflecting a more careful curation process on the part of qualified historical linguists involved in the project; however, there remain serious issues when it comes to marking cognation between different characters, such as failure (in many cases) to distinguish between true cognacy and areal diffusion and the inability to take into account the influence of the so-called derivational drift (independent morphological formations from the same root in languages belonging to different branches).

Considering that both the topological features of the resulting consensus tree and the established datings contradict historical evidence in several major aspects, these shortcomings may partially be responsible for the results. Our principal conclusion is that the correlation between the number of included languages and the size of the list may simply be insufficient for a guaranteed robust topology; either the list should be drastically expanded (not a realistic option for various practical reasons) or the number of compared taxa be reduced, possibly by means of using intermediate reconstructions for ancestral stages instead of multiple languages (the principle advocated by the Moscow school).

Alexei S. Kassian and George Starostin, "Do 'language trees with sampled ancestors' really support a 'hybrid model' for the origin of Indo-European? Thoughts on the most recent attempt at yet another IE phylogeny". 12 (682) Humanities and Social Sciences Communications (May 16, 2025).

From the body text:

Discussion and conclusions

In the previous sections, we have tried to identify several factors that might have been responsible for the dubious topological and chronological results of Heggarty et al. 2023 experiment, not likely to be accepted by the majority of “mainstream” Indo-European linguists. Unfortunately, it is hard to give a definite answer without extensive tests, since, in many respects, the machine-processed Bayesian analysis remains a “black box”. We did, however, conclude at least that this time around, errors in input data are not a key shortcoming of the study (as was highly likely for such previous IE classifications as published by Gray and Atkinson, 2003; Bouckaert et al. 2012), although failure to identify a certain number of non-transparent areal borrowings and/or to distinguish between innovations shared through common ancestry and those arising independently of one another across different lineages (linguistic homoplasy) may have contributed to the skewed topography.

One additional hypothesis is that the number of characters (170 Swadesh concepts) is simply too low for the given number of taxa (161 lects). From the combinatorial and statistical point of view, it is a trivial consideration that more taxa require more characters for robust classification (see Rama and Wichmann, 2018 for attempts at estimation of optimal dataset size for reliable classification of language taxa). Previous IE classifications by Gray, Atkinson et al. involved fewer taxa and more characters (see Table 1 for the comparison).
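A rough way to see why more taxa demand more characters is to count the candidate trees. The sketch below uses the standard combinatorial result that n labeled taxa admit (2n-3)!! distinct rooted bifurcating topologies. This is an illustration of the scale of the search space, not the estimation method of Rama and Wichmann or of the authors.

```python
def rooted_binary_topologies(n_taxa: int) -> int:
    """Count rooted, bifurcating, labeled tree topologies: (2n - 3)!!"""
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for odd in range(3, 2 * n_taxa - 2, 2):
        count *= odd
    return count

for n in (3, 4, 5, 10, 161):
    digits = len(str(rooted_binary_topologies(n)))
    print(f"{n:>3} taxa: a {digits}-digit number of possible topologies")
```

With 161 lects the number of candidate trees is astronomically large, while 170 concepts supply only a modest amount of evidence to choose among them.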

Table 1 suggests that the approach maintained and expanded upon in Heggarty et al. 2023 project can actually be a dead-end in classifying large and diversified language families. In general, the more languages are involved in the procedure, the more characters (Swadesh concepts) are required to make the classification sufficiently robust. Such a task, in turn, requires a huge number of man-hours for wordlist compilation and is inevitably accompanied by various errors, partly due to poor lexicographic sources for some languages, and partly due to the human factor. Likewise, expanding the list of concepts would lead us to less and less stable concepts with vague semantic definitions.

Instead of such an “expansionist” approach, a “reductionist” perspective, such as the one adopted by Kassian, Zhivlov et al. (2021), may be preferable, which places more emphasis on preliminary elimination of the noise factor rather than its increase by manually producing intermediate ancestral state reconstructions (produced by means of a transparent and relatively objective procedure). Unfortunately, use of linguistic reconstructions as characters for modern phylogenetic classifications still seems to be frowned upon by many, if not most, scholars involved in such research — in our opinion, an unwarranted bias that hinders progress in this area.

Overall one could say that Heggarty et al. (2023) at the same time represents an important step forward (in its clearly improved attitude to selection and curation of input data) and, unfortunately, a surprising step back in that the resulting IE tree, in many respects, is even less plausible and less likely to find acceptance in mainstream historical linguistics than the trees previously published by Gray & Atkinson (2003) and by Bouckaert et al. (2012).

Consequently, the paper enhances the already serious risk of discrediting the very idea of the usefulness of formal mathematical methods for the genealogical classification of languages; it is highly likely, for instance, that a “classically trained” historical linguist, knowledgeable in both the diachronic aspects of Indo-European languages and such adjacent disciplines as general history and archaeology, but not particularly well versed in computational methods of classification, will walk away from the paper in question with the overall impression that even the best possible linguistic data may yield radically different results depending on all sorts of “tampering” with the complex parameters of the selected methods — and that the authors have intentionally chosen that particular set of parameters which better suits their already existing pre-conceptions of the history and chronology of the spread of Indo-European languages.

While we are not necessarily implying that this criticism is true, it at least seems obvious that in a situation of conflict between “classic” and “computational” models of historical linguistics, assuming that the results of the latter automatically override those of the former would be a pseudo-scientific approach; instead, such conflicts should be analyzed and resolved with much more diligence and much deeper analysis than the one presented in Heggarty et al. 2023 study.

Friday, April 5, 2024

A New Cosmology Based Neutrino Mass Estimate

For the flat ΛCDM model with the sum of neutrino mass $\sum m_\nu$ free, combining the DESI and CMB data yields an upper limit $\sum m_\nu < 0.072$ $(0.113)$ eV at 95% confidence for a $\sum m_\nu > 0$ $(\sum m_\nu > 0.059)$ eV prior. These neutrino-mass constraints are substantially relaxed in models beyond ΛCDM.

From arXiv:2404.0300.

This suggests that a near zero lightest neutrino mass is the best fit value, that a one sigma range for the lightest neutrino mass in a normal hierarchy is about 0-8.7 meV, and that a two sigma range for the lightest neutrino mass in a normal hierarchy is about 0-17.3 meV. It also disfavors, but not decisively, an inverted neutrino mass hierarchy.

The merely non-zero Bayesian prior for the sum of the three neutrino masses is contrary to the whole point of using Bayesian statistics, and should just be ignored as meaningless.

Monday, March 13, 2023

Free Floating Planets And Other Astronomy Quick Hits

* There are vast numbers of free floating planets out there, ripped from the stars around which they formed. The James Webb Space Telescope (JWST) will soon reveal many more of them.

While these are ultimately just cold rocks, there are also isolated stars outside any galaxy out there. What if life developed in a place like that?

The possibilities found in the universe are awe inspiring.

* In other astronomy observations, "general relativistic contributions" reduce "the probability that the solar system destabilizes within 5 Gyr by a factor of 60."

* Power laws continue to be fascinating and make random phenomena, while still random, far more ordered than they seem, while also suggesting the kind of processes that give rise to them.

Many astronomical phenomena, including Fast Radio Bursts and Soft Gamma Repeaters, consist of brief distinct aperiodic events. The intervals between these events vary randomly, but there are periods of greater activity, with shorter mean intervals, and of lesser activity, with longer mean intervals. A single dimensionless parameter, the width of a log-normal function fitted to the distribution of waiting times between events, quantifies the variability of the activity. This parameter describes its dynamics in analogy to the critical exponents and universality classes of renormalization group theory. If the distribution of event strengths is a power law, the width of the log-normal fit is independent of the detection threshold and is a robust measure of the dynamics of the phenomenon.

J. I. Katz, "Log-Normal Waiting Time Widths Characterize Dynamics" arXiv:2303.05578 (March 9, 2023) (3 pages).

* Organic molecules that seeded life may have had a head start in interstellar space according to a new preprint: "Protoplanetary disk around a just born young star contains a lot of cosmic dust. especially polycyclic-aromatic-hydrocarbon (PAH), which would become basic component to create biological organics. "
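Returning to the log-normal waiting time width described in the Katz abstract above, here is a minimal sketch of how that single parameter could be estimated from a list of event times. It assumes the simplest possible fit, the standard deviation of the logarithms of the waiting times, which may differ from the fitting procedure used in the paper; the event times are invented for illustration.

```python
import math
import statistics

def lognormal_width(event_times):
    """Estimate the log-normal width as the std. dev. of ln(waiting time)."""
    ordered = sorted(event_times)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # Zero intervals (simultaneous events) have no logarithm and are skipped.
    log_intervals = [math.log(dt) for dt in intervals if dt > 0]
    return statistics.pstdev(log_intervals)

# Hypothetical burst arrival times in arbitrary units.
burst_times = [0.0, 1.0, 3.0, 7.0, 15.0]
width = lognormal_width(burst_times)
print(f"log-normal width: {width:.4f}")
```

Because the width is computed from logarithms of intervals, it is dimensionless and unchanged if every time is rescaled by the same factor.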

Sunday, January 1, 2023

Breakthrough Made In Numerically Solving Feynman Integrals

Theorists have found a way to solve complex Feynman integrals numerically by reducing them to simple linear algebra.

From Science.

If you don't know what that means, the article at the link does a decent job of explaining it at the undergraduate physics-math-engineering major level.

80% of new publications solving Feynman integrals used these theorists' open source code, which was released a year ago, to do so.

This is especially important for the physics of the strong force (that holds protons and neutrons made up of quarks together) a.k.a. QCD, and efforts to figure out quantum gravity, even though the article refers to the more familiar case of the quantum version of electromagnetism called quantum electrodynamics (QED for short).

Thursday, December 1, 2022

Four New Metric Prefixes Adopted

The first uses of prefixes in SI date back to the definition of the kilogram after the French Revolution at the end of the 18th century. Several more prefixes had come into use by the time of the 1947 IUPAC 14th International Conference of Chemistry, before being officially adopted for the first time in 1960.

The most recent prefixes adopted were ronna-, quetta-, ronto-, and quecto- in 2022, after a proposal from British metrologist Richard J. C. Brown. The large prefixes ronna- and quetta- were adopted in anticipation of needs from data science, and because unofficial prefixes that did not meet SI requirements were already circulating. The small prefixes were added as well even without such a driver in order to maintain symmetry. After these adoptions, all Latin letters have now been used for prefixes or units.

From here.
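As a small illustration of the scale of the new prefixes, the sketch below records their symbols and powers of ten and converts two familiar masses. The Earth and electron masses are rounded textbook values, and the helper name is just something made up for this example.

```python
# Symbol and power of ten for each prefix adopted in 2022.
NEW_SI_PREFIXES = {
    "quetta": ("Q", 30),
    "ronna": ("R", 27),
    "ronto": ("r", -27),
    "quecto": ("q", -30),
}

def in_prefixed_units(value_in_base_units: float, prefix: str) -> float:
    """Express a value given in the base unit in terms of the prefixed unit."""
    _symbol, exponent = NEW_SI_PREFIXES[prefix]
    return value_in_base_units / 10.0 ** exponent

earth_mass_grams = 5.97e27      # roughly 5.97e24 kg
electron_mass_grams = 9.109e-28  # roughly 9.109e-31 kg

earth_in_ronnagrams = in_prefixed_units(earth_mass_grams, "ronna")
electron_in_rontograms = in_prefixed_units(electron_mass_grams, "ronto")
print(f"Earth: about {earth_in_ronnagrams:.2f} Rg")
print(f"Electron: about {electron_in_rontograms:.3f} rg")
```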

Friday, November 4, 2022

How Strong Are Observational Constraints On Decaying Dark Matter?

The fundamental problem with using Bayesian statistics, as the authors of the paper below rightly point out, is that your results are only as good as your Bayesian priors. The whole point of Bayesian statistics, relative to frequentist statistics, is to leverage information you have before you look at the data to maximize the amount of information you can glean from new data.

Previous studies using the dark matter particle paradigm strongly disfavor decaying dark matter models with mean lifetimes of less than many times the age of the universe, unless it is very short lived and in a dynamic equilibrium that keeps the total amount of dark matter almost precisely constant.

For example, as I noted in an answer at the Physics Stack Exchange:

Assuming a dark matter particle paradigm, according to a pre-print by Yang (2015) subsequently published in Physical Review D, the lower bound on the mean lifetime of dark matter particles is $3.57 \times 10^{24}$ seconds. This is roughly $10^{17}$ years. By comparison the age of the universe is roughly $1.38 \times 10^{10}$ years.

The authors make a different Bayesian prior assumption than prior decaying dark matter parameter estimates and conclude that in their best fit model, around 3% of cold dark matter decays just prior to recombination. In the conventional cosmology timeline, "recombination" (which is "the epoch during which charged electrons and protons first became bound to form electrically neutral hydrogen atoms") occurs about 370,000 years after the Big Bang (at a redshift of z = 1100). This implies dark matter with a mean lifetime of about 12.15 million years, about ten billion times shorter than estimates from previous studies. The new Bayesian prior favors metastable, rather than truly stable, dark matter candidates.
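The lifetime arithmetic can be checked in a few lines. This sketch assumes, purely for illustration, that the entire dark matter population decays exponentially with a single mean lifetime and that 3% of it is gone 370,000 years after the Big Bang; the paper's actual model may be parameterized differently.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year

# Lower bound on the mean lifetime quoted from Yang (2015), in years.
lifetime_bound_years = 3.57e24 / SECONDS_PER_YEAR

# Illustrative assumption: simple exponential decay of all dark matter,
# with 3% decayed by the time of recombination.
t_recombination_years = 370_000
fraction_decayed = 0.03
implied_lifetime_years = t_recombination_years / -math.log(1.0 - fraction_decayed)

ratio = lifetime_bound_years / implied_lifetime_years
print(f"lifetime bound:   {lifetime_bound_years:.3e} years")
print(f"implied lifetime: {implied_lifetime_years:.3e} years")
print(f"ratio:            {ratio:.2e}")
```

Under these assumptions the implied mean lifetime comes out on the order of ten million years, roughly ten orders of magnitude below the earlier lower bound.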

Since this is driven by a choice of Bayesian prior, however, it is worth considering why a scientist might be biased towards a prior that leads to more dark matter decays. The most obvious reason is that decaying dark matter provides a motivation for an entire subfield of astronomy studies looking for dark matter decay signatures. Those studies would otherwise be ill-motivated, since in the standard ΛCDM model dark matter doesn't decay and there are no decay signatures to look for.

A large number of studies, all using Bayesian parameter inference from Markov Chain Monte Carlo methods, have constrained the presence of a decaying dark matter component. All such studies find a strong preference for either very long-lived or very short-lived dark matter.

However, in this letter, we demonstrate that this preference is due to parameter volume effects that drive the model towards the standard ΛCDM model, which is known to provide a good fit to most observational data.

Using profile likelihoods, which are free from volume effects, we instead find that the best-fitting parameters are associated with an intermediate regime where around 3% of cold dark matter decays just prior to recombination. With two additional parameters, the model yields an overall preference over the ΛCDM model of $\Delta\chi^2 \approx -2.8$ with Planck and BAO and $\Delta\chi^2 \approx -7.8$ with the SH0ES $H_0$ measurement, while only slightly alleviating the $H_0$ tension.

Ultimately, our results reveal that decaying dark matter is more viable than previously assumed, and illustrate the dangers of relying exclusively on Bayesian parameter inference when analysing extensions to the ΛCDM model.

Emil Brinch Holm, et al., "Discovering a new well: Decaying dark matter with profile likelihoods" arXiv:2211.01935 (November 3, 2022).

Thursday, November 3, 2022

Algebra

The Preaching of the Antichrist by Luca Signorelli

Thursday, October 13, 2022

Less Than Explicit Proofs

Saturday, September 3, 2022

A Duel That Set Back Science

French mathematician Évariste Galois was a B-list genius whose work made otherwise insoluble equations possible to calculate with and solve, and also made it easier to determine when an equation could not be solved analytically. His methods are used today in solving difficult questions in particle physics, among other things.

But, due to his untimely death in a duel at the age of twenty, the point at which his work was widely known and appreciated was delayed more than seventy years.

We will never know what other great discoveries this prodigy might have made had he lived a full life. There is every reason to think that some scientific discoveries might have been made a generation or two earlier if he had lived. Even science today might have reached greater heights with access to mathematical tools that have not yet been devised that he might have invented.

In 1830 [Évariste] Galois (at the age of 18) submitted to the Paris Academy of Sciences a memoir on his theory of solvability by radicals; Galois' paper was ultimately rejected in 1831 as being too sketchy and for giving a condition in terms of the roots of the equation instead of its coefficients. Galois then died in a duel in 1832, and his paper, "Mémoire sur les conditions de résolubilité des équations par radicaux", remained unpublished until 1846 when it was published by Joseph Liouville accompanied by some of his own explanations. Prior to this publication, Liouville announced Galois' result to the Academy in a speech he gave on 4 July 1843.

From here. See also a recent Physics Forums Insights Post on the subject.

Thursday, August 4, 2022

Compactly Approximating The Fine Structure Constant

The website vixra.org is a completely non-selective pre-print service for amateur physicists, most of which, to be perfectly honest, is crackpot material. A recent paper there, however, caught my eye because its claim is unambitious, unambiguously proven, and cute.

The author of the one page paper simply proposes a simple formula to approximate one divided by the Fine Structure Constant that matches its true value to two parts per billion precision (in fact, the actual precision is four parts per billion).

In standard deviations of statistical uncertainty of the experimental value, it isn't really even close - it is sixteen sigma from the measured value. The experimentally measured value of the Fine Structure Constant, according to the Particle Data Group, is actually:

1/137.035 999 084(21).

But, it is still a lot better than the three or six significant digit approximations often used in practice when greater precision isn't necessary.

The author, Roger N. Weller, doesn't claim that it has any theoretical foundation.

It is simply a cute and compact way to approximate this physical constant to a 4 parts per billion precision (the claim is two parts per billion, but the measured value is out of date), mostly utilizing the base ten numbers 1, 2, 7 and 9 and various combinations.

I dedicate this post to the late Marni Dee Sheppeard, a theoretical physicist and Internet friend of mine, whose relic blog I link to, who would have really appreciated this formula. Some of her PhD informed theoretical physics work leaned heavily on combinations of 2, 7 and 9 in the fundamental structure of Nature (see, e.g. here).

Friday, July 29, 2022

Unknown Unknowns

This article discusses an important methodological issue of wide interdisciplinary importance: how to deal with "unknown unknowns" so as not to be overconfident about scientific results, without throwing out the baby with the bathwater and retreating to a nihilist position that we know nothing.

It demonstrates an approach to estimating the uncertainty of results even though we don't know the precise sources of the uncertainties, including possible researcher fraud.

Uncertainty quantification is a key part of astronomy and physics; scientific researchers attempt to model both statistical and systematic uncertainties in their data as best as possible, often using a Bayesian framework. Decisions might then be made on the resulting uncertainty quantification -- perhaps whether or not to believe in a certain theory, or whether to take certain actions.

However it is well known that most statistical claims should be taken contextually; even if certain models are excluded at a very high degree of confidence, researchers are typically aware there may be systematics that were not accounted for, and thus typically will require confirmation from multiple independent sources before any novel results are truly accepted.

In this paper we compare two methods in the astronomical literature that seek to attempt to quantify these `unknown unknowns' -- in particular attempting to produce realistic thick tails in the posterior of parameter estimation problems, that account for the possible existence of very large unknown effects.

We test these methods on a series of case studies, and discuss how robust these methods would be in the presence of malicious interference with the scientific data.

Peter Hatfield, "Quantification of Unknown Unknowns in Astronomy and Physics" arXiv:2207.13993 (July 28, 2022).

Friday, July 1, 2022

Which Standard Model Muon g-2 Calculation Is Correct?

Background

Recent experimental data has confirmed, to high precision with well understood uncertainties, a prior measurement of the anomalous magnetic moment of the muon (muon g-2). The muon is a "second generation" heavy electron whose magnetic properties are predicted to extremely high precision in the Standard Model of Particle Physics.

This magnetic property of muons has been both measured experimentally, and calculated, at precisions on the order of parts per billion. But, because the experimental measurement and calculation are both so precise, differences between the experimental measurement and the theoretical prediction that are tiny in absolute terms can still be very statistically significant and point to potential flaws in the Standard Model of Particle Physics used to make the calculation.

Also, the uncertainties in the theoretical calculation, while quantified as having particular values, are less well understood than the uncertainties in the experimental measurement and may be understated.

There are three components to the calculation corresponding to the three fundamental forces of the Standard Model (gravity can be safely neglected in this context): the electromagnetic component (QED), the weak force component (EW), and the strong force component (QCD a.k.a. the hadronic component).

The contributions of the QED, EW, and QCD parts to the overall result, and of the two subparts of the QCD part called hadronic vacuum polarization (HVP) and hadronic light by light (HLbL), from the Theory Initiative's determination in Aoyama, et al., "The anomalous magnetic moment of the muon in the Standard Model" arXiv (June 8, 2020), can be summarized as follows (multiplied by 10^11 for easier reading):

Muon g-2 = QED+EW+QCD

QED = 116 584 718.931 ± 0.104
EW = 153.6 ± 0.1
QCD = HVP+HLbL = 6937 ± 44

HVP = 6845 ± 40
HLbL = 92 ± 18

The QED and EW components are both profoundly easier to calculate, and much more precisely determinable, than the QCD component, for intrinsic scientific reasons related to the differences between these forces discussed in the final section of this post.

About 99.5% of the uncertainty in the muon g-2 calculation comes from the QCD part, even though it is the source of only about 0.006% of the absolute value of muon g-2.
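These percentages can be recomputed directly from the component values listed above. The sketch assumes the uncertainty share is a simple linear share of the summed uncertainties, which appears to reproduce the figure quoted here; a share computed from variances added in quadrature would be higher still.

```python
# (value, one sigma uncertainty), both multiplied by 10^11 as in the post.
components = {
    "QED": (116_584_718.931, 0.104),
    "EW": (153.6, 0.1),
    "QCD": (6937.0, 44.0),
}

total_value = sum(value for value, _ in components.values())
total_uncertainty_linear = sum(sigma for _, sigma in components.values())

qcd_value, qcd_sigma = components["QCD"]
qcd_value_share = qcd_value / total_value
qcd_uncertainty_share = qcd_sigma / total_uncertainty_linear

print(f"sum of components: {total_value:,.3f}")
print(f"QCD share of the value: {100 * qcd_value_share:.4f}%")
print(f"QCD share of the uncertainty (linear): {100 * qcd_uncertainty_share:.1f}%")
```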

Both the HVP and HLbL parts of the QCD calculation make a material contribution to the uncertainty in the muon g-2 calculation, but the HVP part contributes much more to the overall uncertainty than the HLbL part, because when you have multiple sources of uncertainty in a calculation, the bigger uncertainties tend to swamp the smaller ones unless they are very close in magnitude and there are a great many distinct smaller ones.
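The swamping effect is easy to see with the HVP and HLbL numbers themselves. This sketch assumes the two uncertainties are independent, so that they combine in quadrature (square root of the sum of squares).

```python
import math

def combine_in_quadrature(*sigmas: float) -> float:
    """Combine independent one sigma uncertainties."""
    return math.sqrt(sum(s * s for s in sigmas))

hvp_sigma = 40.0   # HVP uncertainty, times 10^11
hlbl_sigma = 18.0  # HLbL uncertainty, times 10^11

qcd_sigma_combined = combine_in_quadrature(hvp_sigma, hlbl_sigma)
increase = qcd_sigma_combined / hvp_sigma - 1.0

print(f"combined QCD uncertainty: {qcd_sigma_combined:.1f}")
print(f"increase over HVP alone:  {100 * increase:.1f}%")
```

Even though the HLbL uncertainty is almost half the size of the HVP uncertainty, adding it raises the combined figure by less than a tenth, consistent with the QCD total of ± 44 listed above.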

There is no disagreement concerning the values of the QED and EW contributions to the calculation. So, naturally, all of the debate over the correct Standard Model prediction of muon g-2 comes down to the QCD calculation with the HVP part of that calculation taking center stage, even though both parts are important.

There are two leading determinations of the Standard Model prediction's hadronic component which is a profoundly more difficult calculation. One is the Theory Initiative value that substitutes experimental data for some lattice QCD computations of the predicted value, which has a significant difference from the experimentally measured value of muon g-2. The other is the BMW group value that is consistent with the experimentally measured value using purely lattice QCD computations.

The experimental results from directly measuring muon g-2, and theoretically calculated Standard Model predictions for the value of muon g-2 can be summarized as follows (multiplied by 10^11 for easier reading, with one standard deviation magnitude in the final digits shown in parenthesis after each result):

Fermilab (2021): 116,592,040(54)

Brookhaven's E821 (2006): 116,592,089(63)

Combined measurement: 116,592,061(41)

Difference between measurements: 49 (0.6 sigma)

Theory Initiative (TI) prediction: 116,591,810(43)

BMW prediction: 116,591,954(55)

Difference between predictions: 144 (2.1 sigma)

Combined measurement v. TI: 251 (4.2 sigma)

Combined measurement v. BMW: 107 (1.6 sigma)
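The differences and significances in this list can be recomputed from the central values and uncertainties. The sketch assumes the two numbers in each comparison have independent Gaussian uncertainties, so the uncertainty of the difference is the quadrature sum.

```python
import math

def tension(a: int, sigma_a: float, b: int, sigma_b: float):
    """Return (absolute difference, difference in combined standard deviations)."""
    difference = abs(a - b)
    return difference, difference / math.hypot(sigma_a, sigma_b)

# (central value, one sigma), both multiplied by 10^11 as in the post.
fermilab = (116_592_040, 54)
brookhaven = (116_592_089, 63)
combined = (116_592_061, 41)
theory_initiative = (116_591_810, 43)
bmw = (116_591_954, 55)

comparisons = {
    "Fermilab v. Brookhaven": tension(*fermilab, *brookhaven),
    "TI v. BMW": tension(*theory_initiative, *bmw),
    "Combined v. TI": tension(*combined, *theory_initiative),
    "Combined v. BMW": tension(*combined, *bmw),
}

for label, (difference, n_sigma) in comparisons.items():
    print(f"{label}: {difference} ({n_sigma:.1f} sigma)")
```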

What's At Stake?

The stakes regarding which of these calculations of the Standard Model predicted value is correct are high, because the value of muon g-2 is an experimentally observable quantity that is globally sensitive to the existence and magnitude of most lower energy deviations from the Standard Model of Particle Physics in a single high precision measurement.

If the Theory Initiative value is correct, something about the Standard Model is incorrect at energy scales that can be reached by existing or near future high energy physics experiments.

If the BMW value is correct, then any deviations between Nature and the Standard Model at energy scales that can be reached by existing or near future high energy physics experiments, are limited to those that cancel out in muon g-2 calculations (which as a practical matter, rules out almost all seriously considered experimentally accessible new particle physics theories).

If the BMW calculation is right, we are almost certainly in a "new physics desert" and a next generation particle collider will probably reveal no new physics.

If the Theory Initiative calculation is right, there is a very high likelihood that new physics is right around the corner and will be seen at a next generation particle collider.

Given that the cost of a next generation particle collider is on the order of billions of dollars to build and operate for its full experimental run, the desirability of this very big new purchase for humanity hinges to a great extent on whether the BMW or Theory Initiative calculation is right.

Also, even if we can't resolve that question, knowing precisely what is causing the differences between the calculations can highlight and clarify which kind of next generation collider is most likely to be useful to see new physics if it is out there, and what features it is important for a next generation collider to have to be able to resolve the questions that are the underlying source of the discrepancy.

The Latest Development

A new lattice QCD calculation continues the process of pinning down the exact source of the difference in calculations of the hadronic component of the Standard Model expected value of the anomalous magnetic moment of the muon (muon g-2) between the Theory Initiative value and the BMW value.

This new calculation isolates a particular part of the hadronic vacuum polarization (HVP) part of the hadronic component of the muon g-2 prediction calculation, called the intermediate time-distance window, that is one of the main sources of the discrepancy between the Theory Initiative calculation and the BMW value.

This focuses the effort of getting to the bottom of the question of why two collaborations of distinguished physicists are getting different results, which can facilitate the process of figuring out how to determine which is correct.

The Paper

The paper and its abstract (not reformatted for blogger to preserve superscripts and subscripts and the like in formulas):

We present a lattice determination of the leading-order hadronic vacuum polarization (HVP) contribution to the muon anomalous magnetic moment, $a_\mu^{\rm HVP}$, in the so-called short and intermediate time-distance windows, $a_\mu^{\rm SD}$ and $a_\mu^{\rm W}$, defined by the RBC/UKQCD Collaboration.

We employ a subset of the gauge ensembles produced by the Extended Twisted Mass Collaboration (ETMC) with $N_f = 2+1+1$ flavors of Wilson-clover twisted-mass quarks, which are close to the physical point for the masses of all the dynamical flavors. The simulations are carried out at three values of the lattice spacing ranging from $\simeq 0.057$ to $\simeq 0.080$ fm with linear lattice sizes up to $L \simeq 7.6$ fm.

For the short distance window we obtain $a_\mu^{\rm SD}({\rm ETMC}) = 69.33(29) \cdot 10^{-10}$, which is consistent with the recent dispersive value $a_\mu^{\rm SD}(e^+e^-) = 68.4(5) \cdot 10^{-10}$ within $\simeq 1.6\sigma$.

In the case of the intermediate window we get the value $a_\mu^{\rm W}({\rm ETMC}) = 235.0(1.1) \cdot 10^{-10}$, which is consistent with the result $a_\mu^{\rm W}({\rm BMW}) = 236.7(1.4) \cdot 10^{-10}$ by the BMW collaboration as well as with the recent determination by the CLS/Mainz group of $a_\mu^{\rm W}({\rm CLS}) = 237.30(1.46) \cdot 10^{-10}$ at the $\sim 1.0 - 1.3\sigma$ level. However, it is larger than the dispersive result $a_\mu^{\rm W}(e^+e^-) = 229.4(1.4) \cdot 10^{-10}$ by $\simeq 3.1\sigma$. The tension increases to $\simeq 4.2\sigma$ if we average our ETMC result with the BMW and the CLS/Mainz ones.

Our accurate lattice results in the short and intermediate windows hint at possible deviations of the $e^+e^-$ cross section data with respect to Standard Model (SM) predictions distributed somewhere in the low (and possibly intermediate) energy regions, but not in the high energy region.

C. Alexandrou, et al., "Lattice calculation of the short and intermediate time-distance hadronic vacuum polarization contributions to the muon magnetic moment using twisted-mass fermions" arXiv:2206.15084 (June 30, 2022) (82 pages).
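As a rough cross-check, the significance figures quoted in the abstract can be approximately reproduced with simple arithmetic. The sketch below assumes that the quoted uncertainties are Gaussian and uncorrelated, and it combines the three lattice results with an inverse-variance weighted average. The paper's own averaging procedure may differ, and the function names are illustrative only.

```python
import math

def tension(x1, s1, x2, s2):
    """Difference between two results in units of their combined uncertainty,
    assuming independent Gaussian errors added in quadrature."""
    return abs(x1 - x2) / math.hypot(s1, s2)

def weighted_average(results):
    """Inverse-variance weighted mean and its uncertainty for (value, sigma)
    pairs, assuming the inputs are uncorrelated."""
    weights = [1.0 / s**2 for _, s in results]
    mean = sum(w * x for w, (x, _) in zip(weights, results)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

# Intermediate-window values from the abstract, in units of 10^-10.
etmc = (235.0, 1.1)
bmw = (236.7, 1.4)
cls_mainz = (237.30, 1.46)
dispersive = (229.4, 1.4)

print(f"ETMC vs dispersive: {tension(*etmc, *dispersive):.1f} sigma")
avg = weighted_average([etmc, bmw, cls_mainz])
print(f"lattice average: {avg[0]:.1f} +/- {avg[1]:.1f}")
print(f"average vs dispersive: {tension(*avg, *dispersive):.1f} sigma")
```

Under these assumptions the script gives about 3.1 sigma for the ETMC result alone and about 4.2 sigma for the three-way lattice average, in line with the figures in the abstract.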

My Take

For what it is worth, I personally believe it is very likely that the BMW calculation that matches the experimentally measured value is the correct one, and that the apparent muon g-2 anomaly hinting at a variety of possible "new physics" is, in fact, merely due to a flaw in the Theory Initiative determination of the Standard Model predicted value of muon g-2.

The Theory Initiative determination might be flawed because the experimental data it is using to substitute for some lattice QCD calculations is itself flawed. This is something that was found previously to have caused the muonic proton radius problem.

If that is the problem, it could be resolved by redoing the electron-positron collider experiments conducted at the Large Electron-Positron Collider (LEP) from 1989-2000, upon which the Theory Initiative is mostly relying, with the greater precision and quality control methods that the subsequent two decades of high energy physics have made possible.

On the other hand, if the problem with the Theory Initiative calculation is that the way this experimental data was incorporated into the overall calculation has some subtle flaw, a new theoretical paper could point out the source of the error. This task would be advanced by a better understanding of what part of the Theory Initiative determination is most likely to be flawed, allowing scientists to better focus on what kind of methodological error might be involved, which is what this new paper helps to do.

Perspective On The Precision Of These Measurements

In order to maintain perspective it is also important to note that both experimental measurements of muon g-2, and both leading Standard Model theoretical predictions, are identical to the first six significant digits. Thus, they are in perfect agreement up to the one part per million level. Only at the parts per ten million level do discrepancies emerge.

This actually underestimates the precision, because the full value of the magnetic moment of the muon that is actually measured, called g(µ), as opposed to the merely anomalous component of the magnetic moment of the muon called muon g-2, is approximately 2.00233184(1) (i.e. double the anomalous magnetic moment plus two). This adds three more significant digits to its value, making it a parts per billion agreement, with discrepancies arising only at the parts per tens of billions level.

This is greater precision, for example, than the empirically determined precision of theoretically much easier tasks, such as a first round count of the ballots cast in a statewide or national election, or the count of the number of people residing in the United States on a particular day every ten years in the decennial census.

The discrepancies are arising at a precision equivalent to one millimeter per ten kilometers.
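The arithmetic behind these comparisons can be sketched in a few lines. The snippet takes the approximate value of g(µ) quoted above as its only physics input, and the variable names are illustrative.

```python
# Approximate full magnetic moment of the muon quoted in the text.
g_mu = 2.00233184

# The anomalous magnetic moment is a = (g - 2) / 2, so that g is
# "double the anomalous magnetic moment plus two".
a_mu = (g_mu - 2) / 2
print(f"a_mu ~ {a_mu:.8f}")

# One millimeter per ten kilometers as a dimensionless fraction.
mm_per_10km = 1e-3 / 1e4
print(f"1 mm / 10 km = {mm_per_10km:.0e}")  # one part in ten million
```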

Footnote Regarding Statistical Significance

In physics (and most fields) a discrepancy of less than two sigma (i.e. two standard deviations in a "normal" distribution of data) is considered statistically insignificant and constitutes results that are "consistent" with each other.

In physics, a discrepancy with a global statistical significance of five sigma or more that is replicated and has some plausible theoretical reason is the standard for a definitive scientific discovery.

The focus on "global significance" is due to the "look elsewhere effect," which observes that if you do enough experiments of the same kind, you expect some of the results to be statistical flukes that would be statistically significant if you were only doing one experiment. For example, if you do twenty experiments of the same kind, you expect to have, on average, one outlier that is more than two sigma from the true value. But correctly calculating global significance is more art than science, because you need to determine how to count the total number of experiments you have done that are measuring the same thing. This turns out to be very hard to define in any complex multifaceted context like particle collider experiments, which do millions and billions of collisions or more over the lifetime of the experiment, not all of which are comparable to each other.
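A quick calculation illustrates the twenty-experiment example. This is a minimal sketch that assumes independent experiments with Gaussian errors and counts deviations in either direction. It is not how collaborations actually compute global significance.

```python
import math

def two_sided_p(n_sigma):
    """Probability that a Gaussian measurement lands more than n_sigma
    from the true value, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2))

n_experiments = 20
p = two_sided_p(2.0)

# Average number of flukes, and the chance of seeing at least one.
expected_outliers = n_experiments * p
prob_at_least_one = 1 - (1 - p) ** n_experiments

print(f"chance of a >2 sigma fluke in one experiment: {p:.3f}")
print(f"expected flukes in {n_experiments} experiments: {expected_outliers:.2f}")
print(f"chance of at least one fluke: {prob_at_least_one:.2f}")
```

With these assumptions the expected number of two sigma outliers comes out a little under one, and the chance of seeing at least one is roughly 60%.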

In physics, a discrepancy of more than two but less than five sigma, or a discrepancy that hasn't been replicated, or doesn't have any plausible theoretical explanation, is considered a "tension" between theory and experiment, that is stronger if the number of sigma differing between experiment and theory is larger, but doesn't constitute a definitive scientific discovery. Scientists spend a lot of time seeing if tensions that they observe go away with further research, or solidify into higher significance scientific discoveries.

Why Is The QCD Calculation So Difficult?

The QCD calculation is much more difficult than the QED and EW calculations for two main reasons.

Coupling constant strength

One is that all of the calculations involve terms for each power of the coupling constant (a dimensionless number) of the force in question, and the magnitude of these coupling constants is very different for the respective forces.

In other words, they take the form:

a*g + b*g^2 + c*g^3 . . .

where a, b, and c are real numbers that come from adding up the calculations for the terms with the same power of the coupling constant for the force in question, and g is the coupling constant for the force in question.

The strong force coupling constant at the muon mass is in the ballpark of:

0.7 to 0.9

and even at the fourth power it is still about 0.24 (taking the low end of that range, 0.7).

It gets significantly weaker at higher energy scales, reaching 0.1184 at the energy scale of the Z boson mass (about 91.1 GeV).

The QED coupling constant is, in the low energy limit:

0.007 297 352 569 3(11)

and at the fourth power it is about 0.000 000 002 8, which is about one hundred million times smaller than the fourth power of the strong force coupling constant at the muon mass.

The QED coupling constant gets slightly stronger at higher energy scales, reaching about 0.00787 at the energy scale of the Z boson mass.

The weak force coupling constant is on the order of:

0.000001

and at the fourth power it is about 10^-24.

Converted into terms comparable to the coupling constants above, at the electron mass, the gravitational coupling constant is about

6 * 10^-39

which is comparable in magnitude to the sixth power of the weak force coupling constant, roughly the eighteenth power of the QED coupling constant, and a far higher power of the QCD coupling constant.

As a result, terms with higher powers of the QCD coupling constant can't be ignored (especially in low energy interactions, where the methodology used for QED and weak force calculations, called perturbative methods, breaks down and different methods called lattice QCD need to be used), while higher order terms in QED (typically calculated to the fifth power of the QED coupling constant) and the weak force calculation can be ignored.
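To make the comparison concrete, the sketch below raises representative coupling constants to successive powers. It uses the values quoted above, taking 0.7 (the low end of the quoted range) for the strong force at the muon mass. It also sets every coefficient a, b, c and so on to 1, which is a simplifying assumption made only to isolate the effect of the coupling constant itself.

```python
# Representative coupling constants quoted in the text.
couplings = {
    "strong (muon mass scale)": 0.7,  # low end of the 0.7 to 0.9 range
    "QED (low energy limit)": 0.0072973525693,
    "weak": 1e-6,
}

# Size of g^n for the first five powers, with all coefficients set to 1
# (an illustrative simplification, not a real calculation).
for name, g in couplings.items():
    powers = ", ".join(f"{g**n:.1e}" for n in range(1, 6))
    print(f"{name}: {powers}")

# Ratio of the fourth powers of the strong and QED coupling constants.
strong4 = couplings["strong (muon mass scale)"] ** 4
qed4 = couplings["QED (low energy limit)"] ** 4
ratio = strong4 / qed4
print(f"strong^4 / QED^4 ~ {ratio:.1e}")
```

The fourth power of the strong coupling is still of order 0.1 while the fourth power of the QED coupling is of order 10^-9, so the ratio is on the order of one hundred million.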

Gluon self-interactions

Let's return to our formula for each contribution in the form

a*g + b*g^2 + c*g^3 . . .

This formula is really the sum of terms for every possible way that a process can happen (which is described by a Feynman diagram), and the power of the coupling constant is a function of how many interactions there are with the force in question in a possible way that something can happen.

In the case of QED, the electromagnetic force is carried by the photon, which interacts with electromagnetically charged particles, but not with other photons.

In the case of the weak force, which is carried by W and Z bosons, these force carrying particles can interact with each other, but the interactions are so weak that their contribution is very small and needs to be considered only at the first or second order level.

But, in the case of the strong force, which is carried by gluons, gluons interact with each other with a strength on the same order of magnitude as interactions between gluons and quarks in the strong force. This means that at each power of the strong force coupling constant, there are far more terms to be considered than in the QED or EW calculations, and that the rate at which the number of terms grows with each additional power of the strong force coupling constant is much greater than in the QED or EW calculations.

Conclusion

So, the bottom line is that to get comparable precision, you need to consider far higher powers of the coupling constant to do strong force calculations than QED or EW calculations, and the number of terms that have to be calculated at each power of the coupling constant in strong force calculations is also profoundly greater than in QED or EW calculations, with the disparity getting worse with each additional power of the coupling constant you try to consider.

Once the calculations are set up for the QED or EW cases, those calculations can be made to maximal precision in less than a day with an ordinary desktop computer with a single processor, and the limiting factor on the number of calculations you do is the precision of the coupling constant measurement, which leaves you with spurious accuracy beyond the fifth power of that coupling constant in QED and sooner in the EW calculation.

In contrast, the strong force calculations done to three or four powers of the strong force coupling constant, which still aren't very precise, take weeks of non-stop calculations with the equivalent of millions of single processor desktop computers working together.