Friday, July 21, 2023

The Case For Warning Labels For Physics Articles

Current academic and scientific publishing norms require certain factors to be disclosed in something akin to "warning labels": personal interests of the authors that could bias the result (e.g. drug company sponsorship of research related to a drug), or the lack of peer review for materials released in advance of publication in a peer-reviewed scientific journal.

But, there are other known factors, at least as serious, that make conclusions in scientific journal articles less credible. Failure to recognize these risk factors is an important reason that mainstream media science reporting makes misleading reports about new scientific research. One solution to this problem is for these risk factors to be routinely called out in the same manner.

In the spirit of free inquiry and not suppressing minority views, these risk factors should not be used to deny publication to such papers, however. Instead, they should only be used to alert readers with "yellow flags" to the need to be particularly skeptical of the conclusions in the papers rather than accepting them uncritically.

What are risk factors that should trigger warning labels in scientific papers in physics?

    Data Quality

(1) An observation relied upon for new physics has not been replicated by an independent research group. More generally, the paper relies upon a single observation (e.g. a single astronomy observation) to support broad theoretical conclusions. This certainly shouldn't be used to deny publication, since somebody has to be the first to observe anything, but it still calls for a warning label (see, e.g., the OPERA experiment's superluminal neutrino measurement that turned out to be due to a flaw in the experimental measurement).

(2) The paper does not consider global statistical significance in addition to local statistical significance when consideration of global statistical significance is necessary for a sound evaluation of the notability of the observation. Failure to consider global statistical significance is an utterly predictable and inevitable way to make noise look like a signal.

(3) The paper relies for the statistical significance of its result upon the combined significance of multiple observations that are individually either not globally statistically significant or represent mild (less than 3 sigma) tensions with a null hypothesis, especially if the separate observations are assumed to have uncorrelated uncertainties (which is rarely true, even though it is hard to quantify the correlations). In part, this is because all manner of theories can be devised after the fact to connect what are actually unrelated statistical flukes due to measurement uncertainty.

(4) An observation relied upon has been contradicted by, or is in strong tension with, other observations of the same phenomenon (i.e. the paper relies upon "outlier observations"). This warning should be heightened if the paper relies upon outlier observations to the exclusion of non-outlier observations of the same thing (see, e.g., papers based upon a recent outlier measurement of the W boson mass). Likewise, there are often strong discrepancies between inclusive and exclusive measurements of the same quantity, and considering only one of these without a well reasoned explanation for doing so should trigger a warning label. Several quantities show consistent disparities of this kind between different kinds of measurements of the same quantity.

(5) The observation relied upon is only marginally statistically significant (i.e. less than 3 sigma of global statistical significance), or the statistical significance of the result relied upon has decreased as more measurements have been made of the same thing. Again, there is nothing wrong with publishing these results, but they should be taken with a grain of salt, especially when used to support otherwise ill-motivated new physics (i.e. outcomes that are not long-standing, as yet unobserved predictions of some theory that hasn't been strongly disfavored by other evidence).

(6) The group making an observation relied upon has, in the past, reported results that were later found to be incorrect or were contradicted by multiple other observers of the same phenomenon.
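
Items (2) and (3) above can be made concrete with a little arithmetic. The following is a minimal Python sketch, not a recipe taken from any particular experiment. It assumes that the look-elsewhere effect can be approximated by a number of independent places in which a fluctuation could have shown up (`n_trials`, a hypothetical parameter), and that the correlations between combined measurements can be summarized by a single common pairwise correlation `rho` (also an assumption, since real correlations are hard to quantify). The function names are illustrative only.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sigma 1


def sigma_to_p(sigma: float) -> float:
    """One-sided tail probability of an excess of `sigma` standard deviations."""
    return _STD_NORMAL.cdf(-sigma)


def p_to_sigma(p: float) -> float:
    """Inverse of sigma_to_p; requires 0 < p < 1."""
    return -_STD_NORMAL.inv_cdf(p)


def global_significance(local_sigma: float, n_trials: int) -> float:
    """Approximate global significance of a local excess.

    Assumption: the search amounts to `n_trials` independent looks, so
    p_global = 1 - (1 - p_local) ** n_trials.
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    p_local = sigma_to_p(local_sigma)
    # expm1/log1p keep precision when p_local is very small.
    p_global = -math.expm1(n_trials * math.log1p(-p_local))
    return p_to_sigma(p_global)


def combined_sigma(sigmas: list[float], rho: float = 0.0) -> float:
    """Unweighted Stouffer-style combination of several tensions.

    Assumption: every pair of measurements shares the same correlation `rho`.
    The variance of the sum of n unit-variance scores is n + n*(n-1)*rho.
    """
    n = len(sigmas)
    variance = n + n * (n - 1) * rho
    if n == 0 or variance <= 0:
        raise ValueError("need at least one measurement and a valid rho")
    return sum(sigmas) / math.sqrt(variance)


if __name__ == "__main__":
    print(f"3 sigma local, 100 trials: {global_significance(3.0, 100):.2f} sigma global")
    print(f"four 2 sigma results, rho=0.0: {combined_sigma([2, 2, 2, 2]):.2f} sigma")
    print(f"four 2 sigma results, rho=0.5: {combined_sigma([2, 2, 2, 2], 0.5):.2f} sigma")
```

Under these assumptions, a 3 sigma local excess found after looking in 100 independent places corresponds to only a bit more than 1 sigma globally. Four 2 sigma tensions combine to 4 sigma only if they are fully uncorrelated; with a common correlation of 0.5 the combination drops to about 2.5 sigma.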

    Literature Review And Authorship 

(7) Papers should clearly state that they are proposing or relying upon new physics rather than widely accepted existing physics. This warning should be heightened where a paper by someone not from the group that did the experiment purports to explain an experimental anomaly with new physics within three weeks of the anomalous result being reported (i.e. before adequate time has elapsed for papers offering explanations without new physics to be released). The heightened warning is necessary to counter the rush to release ill-vetted new physics explanations immediately after a new result is announced, with any necessary corrections made later, in order to get credit for being the first to discover a new physics phenomenon in the unlikely case that the proposal is the correct one.

(8) The paper relies upon a result from a "fake journal".

(9) None of the authors of the paper is a professional academic or researcher with a PhD in the field in which it is written. (Papers with a professional academic or researcher and also a co-author who is a student in the field, an industry professional, an autodidact, or someone making contributions from outside the field, such as a graphic designer who did especially difficult figures for the paper or a native English speaker who helped the authors write idiomatically, do not require a warning label.)

(10) A proposed new physics explanation of a phenomenon is presented when the phenomenon has purportedly been explained already without new physics in a published paper or preprint.

(11) The paper explains an observation with new physics without first discussing and ruling out possible explanations that do not involve new physics, including the possibility that there are understated or overlooked sources of uncertainty or inaccuracy in the observations relied upon.

(12) The paper does not review the literature, or its literature review identifies a paper that reaches a contrary result but does not engage with that paper's analysis. A literature review shouldn't count as sufficient to avoid a warning label unless it identifies literature critical of an overall new physics research program that the paper advances (e.g. supersymmetry, string theory, or QCD axions) or states that the authors have looked for and not found any such criticisms. Often a simple Wikipedia search would reveal such criticisms, but only a minority of papers advancing new physics include these criticisms in their literature reviews (instead often stating that a theory is "well-motivated" based upon papers written decades earlier, before criticisms of the research program had surfaced). It is O.K. to write and publish a "Note" or "Letter" without a full literature review and analysis, but any publication of this kind should come with a warning label to that effect.

A warning-label-oriented approach would provide an easy checklist for science reporters, for legitimate scientific journals considering sending articles out for peer review, and for peer reviewers. Even when the authors and/or publishers fail to address the points in it, the checklist could encourage greater quality control in scientific paper writing and would reduce the incentive to publish dubious speculative work.

In a related matter, when considering a scientist's or academic's publications in hiring, or for tenure or promotion, the papers published ought to be flagged in cases where the conclusions made were later refuted by other research or significantly revised due to errors identified by others, the research program did not pan out, or the papers have been cited unfavorably or critically. Current norms in the field call only for disclosure of papers that were retracted due to academic dishonesty such as faked data or citations to non-existent literature (which are, of course, "red flags" rather than "yellow flags").

2 comments:

4gravitons said...

I like the general idea, though I think your specific proposals aren't really pithy enough to work for this kind of thing: the advantage of stuff like "this has not been published yet" is it's pretty easy for readers to understand, even if they don't really get the implications. A couple of these are almost at that level already with more responsible publications: stuff like "yet to be replicated" and "not statistically significant" get said in about the same way as "not yet published".

andrew said...

Fair point.