Showing posts with label bad scientists. Show all posts

Monday, August 18, 2025

A Potentially Good New World Population Genetics Study Bumbled

A paper was published last year about the population genetics and historical genetics of the Blackfoot people. It compared a modest sample of Blackfoot-affiliated genomes with other New World and Old World genomes. A small sample size isn't a big problem for a paleo-genetic study, however, because each individual's DNA has so many data points, and generations of intermarriage in a fairly closed gene pool make each individual highly representative of the population as a whole. But while the study does lots of things right, it makes a critical error that seriously detracts from the reliability of its analysis. This error arises from a weak review of the literature and was not caught by a deficient peer review.

The big problem with the paper is that it makes flawed assumptions about the peopling of the Americas. It relies on a model in which all Native Americans fit into two groups: Native North Americans (ANC-B) and Central and Southern Americans (ANC-A), and tries to determine where the Blackfoot people fit into that model.

The trouble is that the established paradigm is more complicated. While ANC-A is a valid and pretty much unified group that descends from basically Pacific coast route peoples in a primary founding population wave perhaps 14,000 years ago, Native Americans in North America have a more complex ancestry.

North American Native Americans have the lineages found in ANC-A (which results from a serial founder effect) and probably at least two other clades close in time to the initial founding era that spread into different parts of North America.

Then, around 3500-2400 BCE, the ancestors of the Na-Dene people migrated to Alaska from Northeast Asia and admixed with pre-existing populations. Their languages have remote but traceable connections to those of the Paleo-Siberian Ket people, whose language family is named after the Yenisei River in central Siberia. They are associated with the Saqqaq Paleo-Eskimo culture, which was also the source of the Dorset Paleo-Eskimo populations (see also here and here). About 10% of Na-Dene ancestry is distinct from the initial founding population of the Americas.[2] The Na-Dene, like Inuits, have Y-DNA haplogroups that are specific to them and of more recent origin than the founding Y-DNA haplogroups of the Americas.[3]

And then, a final significant pre-Columbian wave with lasting demographic impact arrived from Northeast Asia, perhaps in the 500s and 600s CE. These migrants are the ancestors of the Inuits (a.k.a. modern Eskimo-Aleut peoples), who have their roots in an Arctic and sub-Arctic population also known as the Thule. The 6th to 7th century CE Beringian Birnirk culture (in turn derived from Siberian populations) is the source of the proto-Inuit Thule people, who were the last substantial and sustained pre-Columbian peoples to migrate to the Americas.

A paper in 2020 refined and confirmed this analysis, and the 2024 paper even adopts its NNA v. SNA classification while failing to recognize the distinct temporal waves involved in the pre-Columbian peopling of the Americas.

See generally:

[1] Maanasa Raghavan, et al., "The genetic prehistory of the New World Arctic", Science 29 August 2014: Vol. 345 no. 6200 DOI: 10.1126/science.1255832.
[2] David Reich, et al., "Reconstructing Native American population history", Nature 488, 370-374 (16 August 2012) doi: 10.1038/nature11258
[4] Erika Tamm, et al., "Beringian Standstill and Spread of Native American Founders", PLOS One doi: 10.1371/journal.pone.0000829 (September 5, 2007).
[5] Alessandro Achilli, et al., "Reconciling migration models to the Americas with the variation of North American native mitogenomes", 110 PNAS 35 (August 27, 2013) doi: 10.1073/pnas.1306290110
[7] Judith R. Kidd, et al., "SNPs and Haplotypes in Native American Populations", Am J. Phys Anthropol. 146(4) 495-502 (Dec. 2011) doi: 10.1002/ajpa.21560

The critical problem with the paper is that Athabascans are a poor representative of Northern Native American lineages from the founding era ca. 14,000 years ago, because they have significant Na-Dene wave admixture. That admixture is also shared, for example, with the Navajo, who in turn migrated from what is now central to western Canada to the American Southwest around 1,000 CE (possibly, in part, due to the push factor of the incoming wave of proto-Inuits).

In contrast, the vast majority of North American Native Americans have no Na-Dene or Inuit ancestry and are in population genetic continuity with one or more of the several founding populations of North America. Almost any other choice of a North American Native American comparison population would have been much, much better.

The Karitiana, on the other hand, are indeed representative of (and the standard choice to represent) the ANC-A population.

It is entirely plausible that the Blackfoot are indeed from a wave of North American founding population that is undersampled and that their lineage is not represented in prior published works.

Latin American indigenous peoples (and, to a lesser extent and more recently, Canadian First Peoples) have, in general, been more receptive to population genetic work by anthropologists than Native American populations in the United States, who gave these researchers the cold shoulder until very recently. This is due to a historical legacy that has understandably fostered distrust of people associated with the establishment in the U.S., including anthropologists. So, Native Americans in the U.S. are greatly undersampled.

But, because the thrust of the paper relies heavily on comparisons between Blackfoot DNA and Athabascan DNA, with misguided assumptions about Athabascan population history entering into the calculations and analysis, it is hard to extract reliable conclusions from that analysis with any confidence. The Athabascans may be mostly ANC-B, but they are probably the most divergent sample one could use to represent that population, particularly since no attempt is made to distinguish the ancestry components in that population. This seriously confounds the efforts to pin down the prehistoric timeline.

A good quality peer-review should have caught this problem, but peer-review in practice is less effective than it is given credit for being.

Realistically, the only way to really do it right would be to withdraw the 2024 paper and replace it with a new paper that reanalyzes the Blackfoot genetic data by comparing it to a more suitable representative of North American Native American ancestry.




Studies of human genomes have aided historical research in the Americas by providing rich information about demographic events and population histories of Indigenous peoples, including the initial peopling of the continents. The ability to study genomes of Ancestors in the Americas through paleo-genomics has greatly increased the power and resolution at which we can infer past events and processes. However, few genomic studies have been completed with populations in North America, which could be the most informative about the initial peopling process. Those that have been completed in North America have identified Indigenous Ancestors with previously undescribed genomic lineages that evolved in the Late Pleistocene, before the split of two lineages [called the “Northern Native American (NNA)” or “ANC-B” and “Central and Southern American (SNA)” or “ANC-A” lineages] from which all present-day Indigenous populations in the double continent that have been sampled derive much, if not all, their ancestry before European contact. Specifically, the lineage termed “Ancient Beringian” was ascribed to a genome in an Ancestor who lived 11,500 years ago at Xaasaa Na’ (Upward Sun River) and named Xach’itee’aanenh t’eede gaay (USR1) by the local Healy Lake Village Council in Alaska. An Ancestor who lived 9500 years ago at what is now called Trail Creek Caves on the Seward Peninsula, Alaska, also belongs to the Ancient Beringian lineage. In addition, another Ancestor, under the stewardship of Stswecem’c Xgat’tem First Nation, who lived in what is now called British Columbia, belongs to a distinct genomic lineage that predates the NNA-SNA split but postdates the split from Ancient Beringians on the Americas’ genomic timeline. This Ancestor was identified at Big Bar Lake near the Frasier River and lived 5600 years ago. Thus, these previous studies of North American Indigenous Ancestors have successfully helped to identify previously unknown genomic diversity. 
However, the ancient lineages identified in these studies have not been observed in samples of Indigenous peoples of the Americas living today. Research in Mesoamerica and South America suggests that certain sampled populations (e.g., Mixe) have at least partial ancestry in present-day Indigenous groups from unknown genomic lineages in the Americas, possibly dating as far back as 25,000 years ago. . . .

With multiple genomic analyses showing the ancient Blood/Blackfoot clustering together with present-day Blood/Blackfoot but on a separate lineage from other North and South American groups, we created a demographic model using momi2, which used the site frequency spectra of present-day Blood/Blackfoot, Athabascan (as a representative of Northern Native American lineage), Karitiana (as a representative of Southern Native American lineage), and Han, English, Finnish, and French representing lineages from Eurasia. The best-fitting model shows a split time of the present-day Blood/Blackfoot at 18,104 years ago, followed by a split of Athabascan and Karitiana at 13,031 years ago.
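For readers unfamiliar with the "site frequency spectra" mentioned in the passage above, the following is a minimal sketch of how such a summary is tallied for one population. It is not the paper's pipeline and does not use momi2; the function name and the toy 0/1 data (1 marks a derived allele) are hypothetical and chosen only for illustration.

```python
from collections import Counter

def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS of a sample of n chromosomes.

    Entry k-1 of the result counts the sites at which the derived
    allele (coded 1) appears on exactly k of the n chromosomes,
    for k = 1 .. n-1. Sites that do not vary in the sample are ignored.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Derived-allele count at each site, tallied across chromosomes.
    counts = Counter(sum(h[s] for h in haplotypes) for s in range(n_sites))
    return [counts.get(k, 0) for k in range(1, n)]

# Hypothetical toy sample: 4 chromosomes (rows) by 5 sites (columns).
toy_sample = [
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
]

sfs = site_frequency_spectrum(toy_sample)
print(sfs)
```

A demographic model of the kind described above is fitted to spectra like this, computed jointly across several populations, so the choice of which populations stand in for each lineage directly shapes the inferred split times.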

The paper and its abstract are as follows:

Mutually beneficial partnerships between genomics researchers and North American Indigenous Nations are rare yet becoming more common. Here, we present one such partnership that provides insight into the peopling of the Americas and furnishes another line of evidence that can be used to further treaty and aboriginal rights. We show that the genomics of sampled individuals from the Blackfoot Confederacy belong to a previously undescribed ancient lineage that diverged from other genomic lineages in the Americas in Late Pleistocene times. Using multiple complementary forms of knowledge, we provide a scenario for Blackfoot population history that fits with oral tradition and provides a plausible model for the evolutionary process of the peopling of the Americas.
Dorothy First Rider, et al., "Genomic analyses correspond with deep persistence of peoples of Blackfoot Confederacy from glacial times" 10(14) Science Advances (April 3, 2024).

Monday, January 20, 2025

Garbage Experiment Of The Day

This experiment searches for dark matter whose properties fall in a part of the dark matter parameter space that direct dark matter detection experiments and particle collider tests have already ruled out by a dozen orders of magnitude or more. That part of the parameter space is also basically ruled out by the LAT collaboration.

It is arguably one of the biggest wastes of time in the astronomy community right now. I would never have voted to fund it, if I were sitting on a committee considering the proposal.

Numerous observations confirm the existence of dark matter (DM) at astrophysical and cosmological scales. Theory and simulations of galaxy formation predict that DM should cluster on small scales in bound structures called sub-halos or DM clumps. While the most massive DM sub-halos host baryonic matter, less massive, unpopulated sub-halos could be abundant in the Milky Way (MW), as well and yield high-energy gamma rays as final products of DM annihilation. Recently, it has been highlighted that the brightest halos should also have a sizeable extension in the sky. In this study, we examine the prospects offered by the Cherenkov Telescope Array Observatory (CTAO), a next-generation gamma-ray instrument, for detecting and characterizing such objects. Previous studies have primarily focused on high-latitude observations; here, we assess the potential impact of the CTAO's Galactic Plane Survey, which will provide unprecedentedly deep survey data for the inner five degrees of the Galactic plane. Our modeling accounts for tidal effects on the sub-halo population, examining the conditions under which DM sub-halos can be detected and distinguished from conventional astrophysical sources. We find that regions a few degrees above or below the Galactic plane offer the highest likelihood for DM sub-halo detection. For an individual sub-halo -- the brightest from among various realizations of the MW subhalo population -- we find that detection at the 5σ level is achievable for an annihilation cross section of ⟨σv⟩∼3×10^−25 cm^3/s for TeV-scale DM annihilating into bb̄. For a full population study, depending on the distribution and luminosity model of Galactic sub-halos, yet unconstrained cross sections in the range ⟨σv⟩∼10^−23 to 10^−22 cm^3/s for TeV DM candidates are necessary for the brightest sub-halos to be detected.
Christopher Eckner, et al., "Detecting dark matter sub-halos in the Galactic plane with the Cherenkov Telescope Array Observatory" arXiv:2501.09789 (January 16, 2025).

Thursday, January 2, 2025

Quote Of The Day

They say that science progresses one funeral at a time. But it’s no longer true. Because the first generation of string theorists has raised their students who are now continuing the same stuff. And why would they not, these are cozy jobs, and there is nothing and no one that could stop them. So yeah, Siegfried is right. String theory is not dead. It’s undead, and now walks around like a zombie eating people’s brains.
From Sabine Hossenfelder's YouTube video "String Theory Isn’t Dead", about the article that Peter Woit discussed here.

Wednesday, October 16, 2024

Dark Matter Is Still Probably The Wrong Answer

Stacy McGaugh has a reaction blog post to the Scientific American article "What if We Never Find Dark Matter?" by Slatyer & Tait.

It nicely sums up the sociological conundrum in astrophysics that has led the discipline to throw a lot of weight and support behind a deeply flawed dark matter particle hypothesis, one with no detected particle, no hypothetical particle that can fit the astronomy observations, and no theory that has made many significant ex ante predictions. The alternative, MOND and modified gravity, is a much better fit to the astronomy observations and has made many significant ex ante predictions.

He is spot on. Some good quotes:
In the 1980s, cold dark matter was motivated by both astronomical observations and physical theory. Absent the radical thought of modifying gravity, we had a clear need for unseen mass. Some of that unseen mass could simply have been undetected normal matter, but most of it needed to be some form of non-baryonic dark matter that exceeded the baryon density allowed by Big Bang Nucleosynthesis and did not interact directly with photons. That meant entirely new physics from beyond the Standard Model of particle physics: no particle in the known stable of particles suffices. This new physics was seen as a good thing, because particle physicists already had the feeling that there should be something more than the Standard Model. There was a desire for Grand Unified Theories (GUTs) and supersymmetry (SUSY). SUSY naturally provides a home for particles that could be the dark matter, in particular the Weakly Interacting Massive Particles (WIMPs) that are the prime target for the vast majority of experiments that are working to achieve the exceptionally difficult task of detecting them. So there was a confluence of reasons from very different perspectives to make the search for WIMPs very well motivated.

That was then. Fast forward a few decades, and the search for WIMPs has failed. Repeatedly. Continuing to pursue it is an example of the sunk cost fallacy. We keep doing it because we’ve already done so much of it that surely we should keep going. So I feel the need to comment on this seemingly innocuous remark:

although many versions of supersymmetry predict WIMP dark matter, the converse isn’t true; WIMPs are viable dark matter candidates even in a universe without supersymmetry.

Strictly speaking, this is correct. It is also weak sauce. The neutrino is an example of a weakly interacting particle that has some mass. We know neutrinos exist, and they reside in the Standard Model – no need for supersymmetry. We also know that they cannot be the dark matter, so it would be disingenuous to conflate the two. Beyond that, it is possible to imagine a practically infinite variety of particles that are weakly interacting but not part of supersymmetry. That’s just throwing mud at the wall. SUSY WIMPs were extraordinarily well motivated, with the WIMP miracle being the beautiful argument that launched a thousand experiments. But lacking SUSY – which seems practically dead at this juncture – WIMPS as originally motivated are dead along with it. The motivation for more generic WIMPs is lacking, so the above statement is nothing more than an assertion that runs interference for the fact that we no longer have good reason to expect WIMPs at all. . . . 
I can save everyone a lot of time, effort, and expense. It ain’t WIMPs and it ain’t axions. Nor is the dark matter any of the plethora of other ideas illustrated in the eye-watering depiction of the landscape of particle possibilities in the article. These simply add mass while providing no explanation of the observed MOND phenomenology. This phenomenology is fundamental to the problem, so any approach that ignores it is doomed to failure. I’m happy to consider explanations based on dark matter, but these need to have a direct connection to baryons baked-in to be viable. None of the ideas they discuss meet this minimum criterion.

Of course it could be that MOND – either as modified gravity or modified inertia, an important possibility that usually gets overlooked – is essentially correct and that’s why it keeps having predictions come true. That’s what motivates considering it now: repeated and sustained predictive success, particularly for phenomena that dark matter does not provide a satisfactory explanation for. . . . 
The equation coupling dark to luminous matter I wrote down in all generality in McGaugh (2004) and again in McGaugh et al. (2016). The latter paper is published in Physical Review Letters, arguably the most prominent physics journal, and is in the top percentile of citation rates, so it isn’t some minuscule detail buried in an obscure astronomical journal that might have eluded the attention of particle physicists.

Bonus quote from the comments:

It’s exactly the same crap as with string theory, and supersymmetry, and inflation, and dark sectors, and many other research bubbles in the foundations of physics. It is mathematical fiction; it’s nothing to do with reality any more.
- Sabine Hossenfelder (YouTube link).

Friday, May 3, 2024

A True Garbage Physics Paper

How do we create incentives in High Energy Physics to discourage such utterly garbage-filled advocacy of discredited, dead-end speculation?
At present, the Standard Model (SM) agrees with almost all collider data. Yet, three finetuning issues -- the Higgs mass problem, the strong CP problem and the cosmological constant problem -- all call for new physics. The most plausible solutions at present are weak scale SUSY, the PQWW axion and the string landscape. A re-evaluation of EW finetuning in SUSY allows for a higgsino-like LSP and naturalness upper bounds well beyond LHC limits. Rather general arguments from string theory allow for statistical predictions that m_h~ 125 GeV with sparticles beyond present LHC limits. The most lucrative LHC search channel may be for light higgsino pair production. Dark matter turns out to be a SUSY DFSZ axion along with a diminished abundance of higgsino-like WIMPs.
Howard Baer, "Beyond the Standard Model: An overview" arXiv:2405.00872 (May 1, 2024) (contribution to the 2024 QCD session of the 58th Rencontres de Moriond).

Wednesday, January 17, 2024

New Historical Linguistics Paper Gets IE Languages Badly Wrong

A new paper in Nature Communications on historical linguistics fails disastrously by claiming that the Indo-European languages had a Neolithic dispersal time from Anatolia. This utterly undermines the credibility of the methodology as a whole, and makes it not worth even bothering to read carefully in any other respect. Mountains of work in myriad papers considering ancient DNA, linguistics, and archaeology, done by far more competent researchers, contradict this paper. This paper never should have cleared peer review.

The idiots who wrote this fatally flawed paper are Sizhe Yang, Xiaoru Sun, Li Jin, and Menghan Zhang.

Thursday, September 7, 2023

Evidence Of Warm Dark Matter Annihilation Undermined

A new study fails to replicate the findings of five out of six papers that claim to have seen a 3.5 keV radiation line which arguably is the footprint of dark matter annihilation, using the same underlying data. 

The new study, with multiple authors, argues that the backgrounds were not correctly modeled and also identifies other methodological flaws in those papers. This greatly weakens one line of evidence in support of particle dark matter that can annihilate into ordinary matter or photons through collisions with other dark matter particles. If the data were more solid, that evidence would support a popular version of a warm dark matter particle model.

At least at face value, this is a rather stunning refutation of the work of the authors of the previous papers.

The 3.5 keV line is a purported emission line observed in galaxies, galaxy clusters, and the Milky Way whose origin is inconsistent with known atomic transitions and has previously been suggested to arise from dark matter decay. We systematically re-examine the bulk of the evidence for the 3.5 keV line, attempting to reproduce six previous analyses that found evidence for the line. Surprisingly, we only reproduce one of the analyses; in the other five we find no significant evidence for a 3.5 keV line when following the described analysis procedures on the original data sets. For example, previous results claimed 4σ evidence for a 3.5 keV line from the Perseus cluster; we dispute this claim, finding no evidence for a 3.5 keV line. We find evidence for background mismodeling in multiple analyses. We show that analyzing these data in narrower energy windows diminishes the effects of mismodeling but returns no evidence for a 3.5 keV line. We conclude that there is little robust evidence for the existence of the 3.5 keV line. Some of the discrepancy of our results from those of the original works may be due to the earlier reliance on local optimizers, which we demonstrate can lead to incorrect results. For ease of reproducibility, all code and data are publicly available.
Christopher Dessert, Joshua W. Foster, Yujin Park, Benjamin R. Safdi, "Was There a 3.5 keV Line?" arXiv:2309.03254 (September 6, 2023).

Monday, July 31, 2023

The Antiquity Of Indo-European Languages Is Still Overstated


The language family began to diverge around 8,100 years ago, out of a homeland immediately south of the Caucasus. One migration reached the Pontic-Caspian and Forest Steppe around 7,000 years ago, and from there subsequent migrations spread into parts of Europe around 5,000 years ago, according to P. Heggarty et al., Science (2023).

A pair of new papers are out on the origins of the Indo-European languages. As in previous papers by Russell Gray, the estimate of an 8,100 year origin for the language family is too old. The main problem is a misinterpretation of the genetic and anthropological data, especially related to the Anatolian languages. The notion that the Indo-European languages have a source in the Caucasus, or in Anatolia, or at points south of it doesn't hold water.

As noted by a comment at Language Log:
[T]his article seems similar to the piece in Science with Quentin Atkinson from 2012, corrected in response to some of the criticism by linguists (eg. Romani has apparently been removed from the list of languages), but not citing the book by Pereltsvaig and Lewis which was the best single source for that criticism[.]

Atkinson and Gray are academic colleagues in New Zealand. 

The Anatolian languages look strongly diverged from the other Indo-European languages, and hence look old, because of language contact effects with substrate languages very different from those present in the formative periods of the other Indo-European languages.

Gray's models too heavily emphasize a hypothetical random mutation rate and greatly underestimate how much language change is due to language contact. If you speak a language with very little language contact, like Icelandic, you can still read 10th century Icelandic texts today. If you speak a language that has experienced lots of language contact, like English, anything written before the 15th century is incomprehensible.
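The effect described above can be illustrated with a toy lexical clock. This is a minimal sketch using the classic glottochronology formula t = ln(c) / (2 ln r), not the Bayesian phylogenetic method used in the papers under discussion. The retention rates, the split time, and the function names are all hypothetical values chosen only to show the direction of the bias.

```python
import math

def shared_cognate_fraction(t_millennia, retention_a, retention_b):
    """Fraction of core vocabulary still shared by two sister languages.

    Assumes each lineage independently keeps a word with probability
    retention ** t after t millennia (a deliberate simplification).
    """
    return (retention_a ** t_millennia) * (retention_b ** t_millennia)

def clock_estimate(shared_fraction, assumed_retention):
    """Divergence time in millennia under a single assumed retention rate.

    Classic glottochronology: t = ln(c) / (2 * ln(r)).
    """
    return math.log(shared_fraction) / (2 * math.log(assumed_retention))

TRUE_SPLIT = 4.0   # hypothetical true divergence time, in millennia
BASE_RATE = 0.86   # hypothetical per-millennium retention, little contact
CONTACT_RATE = 0.70  # hypothetical retention under heavy language contact

# Both lineages change at the base rate: the clock recovers the true date.
isolated = clock_estimate(
    shared_cognate_fraction(TRUE_SPLIT, BASE_RATE, BASE_RATE), BASE_RATE)

# One lineage loses vocabulary faster through contact, but the clock
# still assumes the base rate: the split looks older than it is.
contact = clock_estimate(
    shared_cognate_fraction(TRUE_SPLIT, BASE_RATE, CONTACT_RATE), BASE_RATE)

print(f"no contact:    {isolated:.2f} millennia")
print(f"heavy contact: {contact:.2f} millennia")
```

With these made-up numbers, the heavy contact case comes out at about 6.7 millennia even though the true split is 4. This is the sense in which a contact-heavy branch such as Anatolian can look older than it is when contact-driven change is treated as ordinary clock-like change.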

There is no convincing historical, archaeological, or genetic evidence of any Indo-European presence in Anatolia much before 2000 BCE, despite the existence of literate cultures in that region at that time. But, there is strong archaeological and historical evidence of a migration of Indo-European people into Anatolia around 2000 BCE. There is no evidence of a steppe genetic signature in Anatolia on a large scale before then. There is also no relationship between the Indo-European languages and the languages that were present in the alleged region of their origin in the right time frame.

Likewise, there is no convincing archaeological or genetic evidence of an origin for the Indo-European languages in the Caucasus.

The new papers are (hat tip to Language Log): 

"New Insights into the Origin of the Indo-European Languages. Linguistics and genetics combine to suggest a new hybrid hypothesis for the origin of the Indo-European languages." Max-Planck-Gesellschaft (7/27/23).

PhysOrg, July 27, 2023. Discussing "Language Trees with Sampled Ancestors Support a Hybrid Model for the Origin of Indo-European Languages." Heggarty, Paul et al. Science 381, no. 6656 (July 28, 2023): eabg0818.

The front material of one of them is as follows:
Précis

An international team of linguists and geneticists led by researchers from the Max Planck Institute for Evolutionary Anthropology in Leipzig has achieved a significant breakthrough in our understanding of the origins of Indo-European, a family of languages spoken by nearly half of the world’s population.
Introduction

For over two hundred years, the origin of the Indo-European languages has been disputed. Two main theories have recently dominated this debate: the ‘Steppe’ hypothesis, which proposes an origin in the Pontic-Caspian Steppe around 6000 years ago, and the ‘Anatolian’ or ‘farming’ hypothesis, suggesting an older origin tied to early agriculture around 9000 years ago. Previous phylogenetic analyses of Indo-European languages have come to conflicting conclusions about the age of the family, due to the combined effects of inaccuracies and inconsistencies in the datasets they used and limitations in the way that phylogenetic methods analyzed ancient languages.

To solve these problems, researchers from the Department of Linguistic and Cultural Evolution at the Max Planck Institute for Evolutionary Anthropology assembled an international team of over 80 language specialists to construct a new dataset of core vocabulary from 161 Indo-European languages, including 52 ancient or historical languages. This more comprehensive and balanced sampling, combined with rigorous protocols for coding lexical data, rectified the problems in the datasets used by previous studies. 
Indo-European estimated to be around 8100 years old

The team used recently developed ancestry-enabled Bayesian phylogenetic analysis to test whether ancient written languages, such as Classical Latin and Vedic Sanskrit, were the direct ancestors of modern Romance and Indic languages, respectively. Russell Gray, Head of the Department of Linguistic and Cultural Evolution and senior author of the study, emphasized the care they had taken to ensure that their inferences were robust. “Our chronology is robust across a wide range of alternative phylogenetic models and sensitivity analyses”, he stated. These analyses estimate the Indo-European family to be approximately 8100 years old, with five main branches already split off by around 7000 years ago.

These results are not entirely consistent with either the Steppe or the farming hypotheses. The first author of the study, Paul Heggarty, observed that “Recent ancient DNA data suggest that the Anatolian branch of Indo-European did not emerge from the Steppe, but from further south, in or near the northern arc of the Fertile Crescent — as the earliest source of the Indo-European family. Our language family tree topology, and our lineage split dates, point to other early branches that may also have spread directly from there, not through the Steppe.” 
New insights from genetics and linguistics

The authors of the study therefore proposed a new hybrid hypothesis for the origin of the Indo-European languages, with an ultimate homeland south of the Caucasus and a subsequent branch northwards onto the Steppe, as a secondary homeland for some branches of Indo-European entering Europe with the later Yamnaya and Corded Ware-associated expansions. “Ancient DNA and language phylogenetics thus combine to suggest that the resolution to the 200-year-old Indo-European enigma lies in a hybrid of the farming and Steppe hypotheses”, remarked Gray.

Wolfgang Haak, a Group Leader in the Department of Archaeogenetics at the Max Planck Institute for Evolutionary Anthropology, summarizes the implications of the new study by stating, “Aside from a refined time estimate for the overall language tree, the tree topology and branching order are most critical for the alignment with key archaeological events and shifting ancestry patterns seen in the ancient human genome data. This is a huge step forward from the mutually exclusive, previous scenarios, towards a more plausible model that integrates archaeological, anthropological and genetic findings.”

The Supplemental Materials are here. The figure below is from the Supplemental Materials:

Fig. S6.1 Maximum Clade Credibility (MCC) tree for the tree distribution from Model M3.  This MCC tree corresponds to the DensiTree in Fig. 2 in the main text, summarizing the posterior distribution of trees.  Here, uncertainty in tree topology is represented by the posterior probability shown for each node.  Any branch leading to a node with a posterior probability of less than 0.5 is shown as a dashed line.  Branches that end before the right-hand margin represent non-modern languages, used here as date calibrations.  Colors indicate the established clades of Indo-European, following the same color scheme as in the language map and DensiTree figure in the main text. 

Section 7.7 of the Supplemental Materials notes that:

In our main analysis we apply no ancestry constraints, given the known objections to doing so, in particular to forcing ancient, written register languages to be direct ancestors of modern spoken languages.  Nonetheless, a major previously published paper does apply such ancestry constraints (12), and reports a powerful effect in leading to much shallower estimates of the root depth of the Indo-European family.  

Reference 12, which is a much more credible effort to address Indo-European origins, is: W. Chang, C. Cathcart, D. Hall, A. Garrett, "Ancestry-constrained phylogenetic analysis supports the Indo-European steppe hypothesis." Language 91, 194–244 (2015) (link is to the pdf full text). Its abstract states:

The counterfactual conclusions of their model when failing to apply these constraints, noted in Section 7.7 for Latin to Romance languages, classical Armenian as an ancestor of modern Armenian, Mycenaean Greek as an ancestor of Ancient and New Testament Greek, and Old English as an ancestor of modern English, all seriously cast doubt on the soundness of the model. 

Razib Khan has a lengthy commentary at his Gene Expression blog. He too casts doubt on the "Southern Arc" hypothesis that Indo-Europeans were present very early on in Anatolia and the Southern Caucasus:

We have some histories of the Middle Eastern Bronze Age. We know that the area of southwest Iran was dominated by the non-Indo-European Elamites as early as 3000 BC, and these people persisted down into the Common Era. Modern Armenia was dominated by non-Indo-European speaking Urartians after 1000 BC. This language is related to Hurrian, documented 4,000 years ago. Before the Indo-European Hittites ruled Hatti, the Hattians ruled Hatti. And the Hattians were not Indo-European. Judging by the obscure Eteocretan language that persisted into antiquity the Minoans were almost certainly not Indo-European speaking. 
Around 1500 BC it is true the ruling elite of the Hurrians, the Mittani, seem to have had an Indo-Aryan connection, but they were also likely intrusive, and their emergence as mobile warriors suspiciously post-dates the development of the light war chariot and the domestic horse thousands of miles to the north several centuries earlier by the Sintashta. The Assyrian royal annals date the arrival of Persians to the 9th century BC, but the results in the paper imply that the Iranians were already present in the Zagros for thousands of years before this (the south Caucasus being the Indo-European ur-heimat ultimately, the Indo-Iranians moving south and east very early on from that region).
I’m focusing on the Middle East because there is a rich history of textual evidence starting in the third millennium BC. These results imply that Indo-European languages are in fact native to the northern Middle East, in the southern Caucasus. And yet assorted obscure languages like Gutian, Kassite and Kaska, are found where you might expect a stray Indo-European here and there. To me this is curious and weird. Further to the west, these results seem to imply that Greek was brought with Caucasus ancestry, but Minoan was likely not Indo-European. There are all these non-Indo-European languages attested in the textual record…and only a few Indo-European ones (Hittite being the first). . . .
This doesn’t mean the other models don’t have holes, the “Southern Arc” theory is pretty complicated too, and everything would have been “easier” if the Hittites had steppe ancestry, and they do not seem to.

The earliest historically attested appearance of the Hittites is around 2000 BCE in written communications from Akkadian merchants, and this is corroborated by the archaeological evidence. Steppe ancestry appears in Anatolia only after this date. 

With respect to Tocharian he notes:

Tocharian languages were found in the northern and northeast regions of the Tarim basin. Historically, the southern rim of the basin was dominated by Iranian languages. It seems the most likely candidate for the people that gave rise to the Tocharian languages is the Afanasievo culture. The Afanasievo we now know were basically an eastern branch of the Yamnaya that show up in the Altai 3300 BC. This is 5,300 years BP. In the paper, the Tocharian split from other Indo-Europeans 5,400 to 8,600 years BP over a 95% confidence interval. The only way this makes sense to me is if there was deep linguistic structure within the Yamnaya despite overall genetic homogeneity maintained through mate exchange. In the text the authors seem to imply that the Tocharians are an early eastward migration, perhaps from the south Caucasus region. This does not align very well with the ancient DNA. The Afanasievo early on are replica copies of Yamnaya. 

So, again, the linguistic hypothesis of the paper is out of sync with the archaeological and ancient DNA evidence. And there are no well-documented examples in historical linguistics of societies that are otherwise as homogeneous and contiguous as the Yamnaya harboring this kind of deep linguistic substructure for thousands of years.

He also comments skeptically on other aspects of the paper:

One of the major points of this paper that contradicts some theories in historical linguistics is a rejection of the tentative connection between Balto-Slavic and Indo-Iranian. Genetically, the curious aspect of the two language families is that Y chromosomal haplogroup R1a is very frequent in both, but differentiated into two lineages that seem to have diverged 5,500-6,000 years ago. But there is more than just Y chromosomes here; over the past decade autosomal genome analyses show that many South Asians, in particular those in the northwest and upper caste populations are enriched for a minority ancestral component that resembles Eastern Europeans. We now know what happened due to ancient DNA: Genetic ancestry changes in Stone to Bronze Age transition in the East European plain. A branch of the Corded Ware Culture (CWC) migrated eastward, becoming the Fatyanovo Culture, then the Balanovo Culture, then the Abeshevo Culture, and finally the Sintashta Culture. The Sintashta seem to have given rise a group of societies known as Andronovo that are hypothesized to evolved into Iranians and Indo-Aryans.

The result here does away with all this. Rather than Indo-Aryan speech being brought by steppe pastoralists between 3,500 and 4,000 years ago, as genetics would imply, the Indo-Aryan speech was likely present during the Indus Valley Civilization. These results imply that Indo-Aryan arrived in India thousands of years before the intrusion of steppe pastoralists, and it was carried eastward by farmers from the Caucasus. The Vedas and Sanskrit then come down from the IVC. And yet strangely the Vedas do not depict a very complex society like the IVC, but a more simple agro-pastoralist one. And, the sacred language of the IVC people presumably, Sanskrit, was maintained in particular by a Brahmin priestly caste that is notable for having a very high fraction of steppe ancestry, that much arrived later.

Again, this linguistic model is simply wrong. The timing of the archaeological evidence and the ancient DNA is indisputable. The inferences that must be made from this evidence to reach historical linguistic conclusions are easy and small ones. But no narrative consistent with the paper's linguistics-first model, which relies on extremely uncertain and ill-documented assumptions about how languages evolve over time, makes any sense.

Razib notes that there are also serious reasons to doubt the claimed timing of the divergence of the European subfamilies of the Indo-European languages, which archaeological and ancient DNA evidence point to taking place after the Corded Ware Culture expansion into Europe:

A massive issue of this paper is that it makes a hash of a major phenomenon that we know between 3500 and 2500 BC, and that’s the spread of steppe-people in all directions, especially out of the Corded Ware complex
The CWC are notable for having a major admixture of Globular Amphora Culture (GAC) Neolithic ancestry, about 25-35% of their genetics, and then spreading into all directions. As noted by the authors and other observers, ancient DNA suggests that Anatolian, Armenian, and perhaps Greek and Illyrian (Albanian), are exceptions to this, deriving directly from Yamnaya or pre-Yamnaya (in the case of Hittites) Indo-European people (remember, CWC is a mix of Yamnaya and GAC). 
The genetics is very clear that a major wave of post-CWC people went into Asia, and south into the Indian subcontinent and Iran. The Y chromosomes imply this was male mediated, and post-CWC Y chromosomes are found in appreciable quantities as far south as Sri Lanka. But these data place this demographic migration far too late to have been the origin of Sanskrit, which is associated with Arya culture.

As Lazaridis points out on social media, the divergence of European language groups like Germanic, Celtic, Italic and Balto-Slavic also predates the CWC expansion westward. For example, Italic language split off in 3500 BC, 500 years earlier than the expansion of CWC into Eastern Europe, with a 95% lower-bound of 2200 BC, about when steppe ancestry shows up in the Italian peninsula according to ancient DNA. If the dates are true then it seems that the various Indo-European language groups were differentiated already very early on in the Yamnaya, and not later on through their expansion across Europe. In other words, this is a model of “ancient linguistic substructure.”

Rather than letting an unproven and highly uncertain linguistic evolution model set the parameters and trying to shoehorn hard and precise archaeological, historical, and ancient DNA evidence into this model's timeline, we should be using the archaeological, historical, and ancient DNA evidence to calibrate and frame the parameter space of any linguistic evolution model.

We know that Tocharian split from other Indo-European languages around 3300 BCE. We know that Balto-Slavic and Indo-Iranian split around 3000 BCE to 2500 BCE. We know that the Indo-Aryan languages arrived in South Asia from Central Asia sometime after the collapse of the Harappan culture ca. 2500 BCE to 1500 BCE. We know that the European branches of the Indo-European languages began to differentiate sometime during and after the spread of the Corded Ware Culture in Europe between 3500 and 2500 BCE. We know that the Indo-European languages reached the Italian Peninsula around 2200 BCE. We know that the Anatolian languages arrived in Anatolia around 2200 BCE to 2000 BCE, and that Mycenaean Greek arrived in the Aegean at around the same time.

If the linguistic model doesn't match these hard dates then the linguistic model is broken. And the single biggest factor throwing off the linguistic model is the assumption that because Tocharian (which is probably the most conservative of the documented Indo-European languages relative to Proto-Indo-European, because it had very little language contact as it expanded) and the Anatolian languages (which had stronger and more diverged substrate influences) are the most diverged from the other Indo-European languages, they must also have split off earliest. As I explain in a comment at Razib's blog post:
Razib, presumably out of politeness, lays the foundation but doesn’t reach the final punchline, which is that the linguistic model is seriously flawed, and that the narrative that flows from trying to shoehorn the linguistic model’s conclusions into the hard evidence from archaeological evidence, historical accounts, and ancient DNA is likewise just plain wrong. We should be using the hard, precisely dated and placed evidence to calibrate the linguistic model instead of the other way around, because the parameters and assumptions of the linguistic model are profoundly less certain in date and in place.

The single biggest driver of the problem with the linguistic model is the assumption that the Anatolian languages, because they are more diverged from the other Indo-European languages, are also the oldest. All other things being equal, that isn’t an unreasonable assumption, but all other things are not equal.

The Neolithic societies of Europe and South Asia in which Indo-European languages replaced pre-existing Neolithic languages were all in a state of abject collapse when the Indo-European language speaking steppe people swept in, so the pre-existing substrate languages had much less of an impact on the Indo-European languages in those places than in Anatolia. Further, in Europe, all of the substrate Neolithic first farmer languages were part of a single macro-linguistic family derived from the languages of the Western Anatolian source for the first farmers, possibly with one major division between the LBK wave along the Danube and other inland river systems, and a Cardial Pottery wave skirting the Northern Mediterranean coast. Some of what we attribute to Proto-Indo-European or to a very basal split on the European side of the Indo-European languages may actually be shared substrate influences from similar languages in this European Neolithic language family (systematically understating the impact of language contact with these languages).

Likewise, in the East, the Indo-Iranian languages probably shared a common Harappan language family substrate.

In contrast, the Anatolian Indo-European languages saw their speakers, especially the Hittites, conquering a much more sophisticated Eneolithic/Early Bronze Age Hattic society whose linguistic predispositions were not so easily swept aside. Even after Hittite became the dominant secular language of the Hittite empire, the non-Indo-European Hattic language survived as a liturgical language akin to church Latin, post-Sumer Sumerian, and ancient Hebrew, for another thousand years, something that happened nowhere else in the Indo-European linguistic region. And, the languages of Anatolia and the Caucasian and Iranian highlands by the metal ages, were very different from the languages of the Western Anatolian Neolithic ancestors of Europe’s first farmers.

The Anatolian languages are more diverged from other Indo-European languages not because they are older, but because there was a stronger substrate influence and the substrate that was the source of the influence was much different. You can read and hear the extent of the influence by comparing Hittite names and words and sentences to their Hattic counterparts (preserved well in large volumes of royal record keeping) and their Minoan counterparts (preserved in phonetic transcriptions in Egyptian texts and what can be guessed at from Linear B writing).

Tocharian seems older than it really is for the opposite reason. Unlike every single other known Indo-European language, it had little or no substrate influence as it expanded into thinly populated regions en route to and in the Tarim Basin. It is probably the most conservative linguistically, kept pure by reduced contact on the frontier in much the same way that Icelandic on the frontier is the closest Germanic language to Old Norse, the same way that the Appalachian dialect of English is the closest to the English dialect of Shakespeare as it was isolated on the frontier, and the same way that the Spanish dialects spoken by multigenerational natives of Southern Colorado and New Mexico are the only dialects of Spanish that retain some of its Spanish colonial era archaic words and grammatical constructions in living languages. The shared substrate influences of Western Anatolian Neolithic and Harappan language families on languages that were not Anatolian or Tocharian are absent in Tocharian and that is why it seems more diverged.

New Zealand academic Russell Gray in this paper is repeating the sins of his fellow New Zealander Quentin Atkinson in his 2012 paper in Science. Later work by Atkinson recognized that increasing the amount of language evolution attributed to language contact and decreasing the amount of language evolution attributed to random mutation produced more reasonable estimates of the time depths of the various Indo-European languages. But these lessons were lost on the authors of the current paper.
The authors of the current paper, instead, forge an utterly unconvincing “Southern Arc” narrative that has to be riddled with exceptions to principles and inferences about ancient DNA markers of Indo-European languages, about the lack of a reason for archaeologically and genetically homogeneous and geographically compact societies to have deep linguistic divides. They remove the climate and other motives of expansion too.
Another comment seen elsewhere on the paper:
They have a systematic error of branch scaling which elongates branches with excessive borrowing (which is especially typical for Indic languages) or have limited knowledge of synonym pairs representing meanings in their dataset (which is common for many ancient languages). Both problems stem from the same computational simplification. Namely, they treat each cognate responsible for the given meaning as an independent binary value (present or absent) while in reality, presence or absence of synonyms for a given meaning are negatively correlated.

Basically in the languages where coevolving synonyms are well attested, a gain or a loss of a synonym will generally have a change value of 1 (1,1 -> 1,0 or vice versa). But in languages with external borrowing or with unknown synonym pairs, any such change would count as 2 (loss of the original cognate plus gain of a new one).

This scaling problem would have inferred even older split dates have it not been artificially limited by setting the upper bound for the age at 10,000 years. In one of the sensitivity analyses they removed this upper bound and ended up with estimates as old as 11 kya.

There is also an important linguistic consideration for the Northern route and against the South Caucasus urheimat, and it is borrowings from IE to neighboring languages. The oldest layer of IE-derived words in the Finno-Ugric languages is thought to be related to proto-Iranian and dated to ~Sintashta epoch in the Ural Mountains. Conversely, Gamkrelidze and Ivanov assembled a great collection of potentially IE-derived words in Kartvelian and Semitic languages but nothing there is convincingly older than Mitanni age.
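The double-counting described in the comment above can be illustrated with a toy calculation. The sketch below is not the paper's actual model. The function names and cognate labels are hypothetical, and it only counts raw state changes for a single meaning slot under two simplified coding schemes rather than fitting any likelihood model.

```python
# Toy illustration only: counts state changes for one meaning slot under two
# simplified coding schemes. Function names and cognate labels are hypothetical.

def binary_changes(before, after):
    """Treat each cognate set as an independent present/absent trait.

    The number of changes is the number of traits that flip, i.e. the size
    of the symmetric difference between the two sets of cognates.
    """
    return len(set(before) ^ set(after))


def multistate_changes(before, after):
    """Treat the whole meaning slot as one character: any difference is one change."""
    return 0 if set(before) == set(after) else 1


# Well-attested synonyms: two cognates coexist, then one is lost (1,1 -> 1,0).
synonym_loss = ({"cognate_A", "cognate_B"}, {"cognate_A"})

# Borrowing, or an unattested synonym stage: A appears to be replaced by B
# in a single step (1,0 -> 0,1).
replacement = ({"cognate_A"}, {"cognate_B"})

for label, (before, after) in [("synonym loss", synonym_loss),
                               ("replacement", replacement)]:
    print(label,
          "| binary:", binary_changes(before, after),
          "| multistate:", multistate_changes(before, after))
```

Under the binary coding, the replacement case costs two changes while the synonym loss costs one, which is the asymmetry the commenter argues lengthens branches for heavily borrowing or thinly attested languages.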

This issue is discussed in Section 7.10 of the Supplemental Materials which notes that:

Our multistate model produces root age estimates distinctly younger (by 2057 years, or 25.1%) than any produced by the covarion model: 6153 BP (4926–7884 BP). In tree topology, the multistate results show a similar lack of resolution at basal nodes of the phylogeny, although there remains strong support for a European clade of Germanic, Celtic and Italic, and for a nesting of Nuristani within Indic. 
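The figures in this passage can be checked with a few lines of arithmetic. The covarion root age used below is not quoted in the passage. It is inferred here by adding the stated 2057-year gap back onto the multistate estimate, so treat it as an assumption.

```python
# Consistency check on the quoted numbers. The covarion value is inferred
# from the quoted gap, not taken from the Supplemental Materials directly.
multistate_root_bp = 6153   # quoted multistate root age (years BP)
gap_years = 2057            # quoted difference from the covarion estimate

implied_covarion_root_bp = multistate_root_bp + gap_years
percent_younger = 100 * gap_years / implied_covarion_root_bp

print(implied_covarion_root_bp)      # 8210
print(round(percent_younger, 1))     # 25.1
```

The quoted 25.1% is therefore measured relative to the older (covarion) estimate, not the younger one.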

The chart with the difference in methodology (Figure 7.10.2) is:

 

A criticism of the paper at a Substack page can be found here. It opens with this framing:

Advances in genetics, linguistics, and archaeology have eliminated all but two theories of the Indo-European urheimat from consideration. The first theory is Steppe Theory – that the speakers of proto-Indo-European – the last stage of the language prior to its fragmentation into multiple branches – lived on the Pontic Steppe in what is now eastern Ukraine and southern Russia some 5,000 to 6,000 years ago. The second theory is Southern Arc Theory – that proto-Indo-European was spoken within the same time frame, but at an unspecified point between the southeastern Balkans and Azerbaijan.

I haven’t made up my mind on which theory I believe, so I will do my best to lay out their cases and problems. It is notable that they differ little in their understanding of the Bronze Age (roughly 3300 BC and after). Their understandings differ instead in the even more temporally distant Copper Age (roughly 4500 to 3300 BC). . . . 
Most of the Indo-European languages are believed to have fragmented into branches such as Indo-Iranian, Balto-Slavic, Italo-Celtic, Graeco-Armenian, or Germanic during the third millennium BC. However, there were two exceptions. The first is the Tocharian branch, attested in the Tarim Basin of what is now Xinjiang, China in the 1st millennium AD. 
The second is the Anatolian branch - Hittite, Luwian, and Palaic. The Anatolian languages are attested in the second millennium BC in Anatolia – what is now Asiatic Turkey. Both the Anatolian and Tocharian branches appear to have split from the other Indo-European languages prior to 3000 BC – Anatolian possibly even before 4000 BC.

The Anatolian languages have a number of odd features that set them apart from the other early Indo-European languages. They only have a present and a past tense, while other Indo-European languages have as many as six. They lacked a dual, and they had only an animate and neuter case unlike the other Indo-European languages. Additionally, the Hittite (an Anatolian language) word for wheel is not an Indo-European cognate. Wheels were invented and spread in the second half of the fourth millennium BC. As such, the linguistic and archaeological evidence implies that the Anatolian languages diverged from the main Indo-European languages prior to 3500 BC - centuries before the others.

[I note, editorially, that grammatical simplification is a phenomenon commonly seen in bilingual communities with lots of language learners as the new language is formed, so the loss of 1-4 tenses and of the dual number is consistent with a strong substrate influence, as would be a tendency of language learners who speak a substrate language to retain some core vocabulary words from the substrate language, like "wheel". We know that the "wheel" technology originated on the steppe rather than in Anatolia.]

Theories of Indo-European origins must account for the existence of the Anatolian languages and their early split from the rest of the language family. The Steppe theory and the Southern Arc theory address it in different ways.

The Steppe theory argues that the Anatolian languages are the product of a very early migration out of the steppe. Riding horses from the Indo-European urheimat in what is now eastern Ukraine, the earliest Anatolian speakers were rich in steppe ancestry, split from their cousins, and invaded the eastern Balkans and Hungary in the late 5th millennium BC. There, they created the Suvorovo and related cultures, spreading the ancestors of the Anatolian languages. The Anatolian-speaking Suvorovo people migrated south centuries later, eventually becoming part of the Ezero Culture in late 4th millennium BC Bulgaria and Thrace. Then, during the chaotic period of the 34th century BC (which, characterized by the invention of the sail, also saw the unification of Egypt and the massive Minoan invasion of Greece), they migrated into northwestern Anatolia. There, the Anatolian languages fragmented. The Luwians remained in western Anatolia, while another group of speakers conquered central Anatolia in the early 2nd millennium and formed the Hittite realm. At each step of the path, the original Steppe ancestry was diluted to the point where it was barely detectable in Anatolia.

[Of course, I am a steppe origin advocate who rejects this narrative.] 

The Southern Arc theory looks at the Anatolian languages differently. The lack of steppe ancestry in over a hundred ancient DNA samples from the Neolithic to Classical Ages shows that steppe penetrations into Anatolia were too minor and too late to have introduced the Anatolian languages to the region. For instance, Classical Age DNA finds in the city of Gordion are only about 4% Steppe in ancestry - even though the city had been ruled by four separate Indo-European groups. Additionally, the steppe ancestry in the Balkans present in the 3rd millennium BC is entirely from the migrations that occurred earlier that millennium. There is no evidence for any pre-3300 BC steppe-ancestry-rich Indo-European groups in the Balkans surviving to a point where they could have potentially migrated to Anatolia.

[This is an important reason why I argue that the Anatolian languages arrived in Anatolia only in about 2000 BCE.]

The Southern Arc offers another explanation for the Anatolian languages. Instead of originating on the steppe, it argues that the Anatolian languages are the remnants of the Indo-European languages that remained in their urheimat in the Southern Arc - a region from the southern Balkans to Azerbaijan - prior to the language ancestral to all of the other branches of Indo-European spreading north across the Caucasus or Black Sea to the steppe. Increases in Caucasian Hunter-Gatherer and Anatolian and Levantine Farmer ancestry in the steppe population at various points between 4500 and 3300 BC could have been the vector that spread the Indo-European languages from the Caucasus to the steppe peoples. After spreading to the steppe, the non-Anatolian Indo-European languages would have been spread across Europe, Central Asia, Iran, and India.

The Southern Arc theory is a great deal less specific than the Steppe theory, and will need to be fleshed out more. It is possible that the Chaff Faced Ware peoples of the southern Caucasus diffused across the Caucasus in the mid-5th millennium, bringing the Indo-European languages with them. It is also possible that the mighty Maykop people, known to have had cultural contacts with the steppe peoples, could have spread the Indo-European languages to their trading partners as a trade language.

In my opinion, the most likely candidates for introducing the Indo-European languages to the steppe peoples (assuming that the Southern Arc Theory is true) are the mysterious pre-Maykop peoples of the North Caucasus. The pre-Maykop peoples of the North Caucasus interacted with the peoples of the Danube Valley across the Black Sea as well as with the peoples of the steppe in the late 5th millennium BC. Copper from the Carpathians made it to the North Caucasus while boar’s tusk pendants and mace heads from the Caucasus made it to the Danube. However, little is known about them, and it is unlikely that much ever will be known about them. They were apparently destroyed by the Maykop people, and likely have no descendants.

There is a third theory, almost invariably promoted by linguists, which argues for a specifically Anatolian origin of the Indo-European languages. While on the surface it resembles the Southern Arc theory, it’s timing is very different. Rather than a proto-Indo-European language that splits between Anatolian and standard Indo-European in the late 5th millennium BC, it instead places the divergence of the Indo-European languages in the mid-7th millennium BC. It associates the spread of the Indo-European languages with the spread of farming from Anatolia, with the Anatolian Farmers and their European cousins, the Early European Farmers.

The Substack article goes on to debunk this theory. 

Another error in the new paper is that it claims that no southward migration of Steppe_MLBA pastoralists is attested by aDNA from BMAC sites in 2300-1700 BCE by citing Narasimhan et al. 2019 stating:

But, the paper cited actually concludes that Steppe_MLBA pastoralists migrated southward from 2100-1700 BCE. The cited paper actually says in the pertinent part:

This kind of basic misstatement of the conclusion of researchers who are relied upon for a thesis in a leading peer reviewed scientific journal is highly unimpressive and just plain sloppy.

Eurogenes adds criticism here.

Friday, July 21, 2023

The Case For Warning Labels For Physics Articles

Current academic and scientific publishing norms call for certain factors to be called out in something akin to "warning labels" for personal interests of the authors that could bias the result (e.g. drug company sponsorship in research related to a drug), or the lack of peer review for materials released in advance of publication in a peer reviewed scientific journal.

But, there are other known factors which are at least as serious that make conclusions in scientific journal articles less credible. Failure to recognize these risk factors is an important reason that mainstream media science reporting makes misleading reports about new scientific research. One solution to this problem is for these risk factors to be routinely called out in the same manner.

In the spirit of free inquiry and not suppressing minority views, these risk factors should not be used to deny publication to such papers, however. Instead, they should only be used to alert readers with "yellow flags" to the need to be particularly skeptical of the conclusions in the papers rather than accepting them uncritically.

What are risk factors that should trigger warning labels in scientific papers in physics?

    Data Quality

(1) An observation relied upon for new physics has not been replicated by an independent research group. More generally, the paper relies upon a single observation (e.g. a single astronomy observation) to support broad theoretical conclusions. This certainly shouldn't be used to deny publication, since somebody has to be the first to observe anything, but it still calls for a warning label (see, e.g., the OPERA experiment's superluminal neutrino measurement, which turned out to be due to a flaw in the experimental measurement).

(2) The paper does not consider global statistical significance in addition to local statistical significance when consideration of global statistical significance is necessary for a sound evaluation of the notability of the observation. Failure to consider global statistical significance is an utterly predictable and inevitable way to make noise look like a signal.

(3) The paper relies for the statistical significance of its result upon the combined significance of multiple observations that are individually either not globally statistically significant or represent mild (less than 3 sigma) tensions with a null hypothesis, especially if the separate observations are assumed to have uncorrelated uncertainties (which is rarely true, even though it is hard to quantify the correlations). In part, this is because all manner of theories can be devised after the fact to connect what are actually unrelated statistical flukes due to measurement uncertainty.

(4) An observation relied upon has been contradicted by, or is in strong tension with, other observations of the same phenomenon (i.e. the paper relies upon "outlier observations"). This warning should be heightened if the paper relies upon outlier observations to the exclusion of non-outlier observations of the same thing (see, e.g., papers based upon a recent outlier measurement of the W boson mass). Likewise, often there are strong discrepancies between inclusive and exclusive measurements of the same quantity, and considering only one of these without a well reasoned explanation for doing so should trigger a warning label. Several quantities show consistent disparities of this kind between different methods of measuring them.

(5) The observation relied upon is only marginally statistically significant (i.e. less than 3 sigma of global statistical significance), or the statistical significance of the result relied upon has decreased as more measurements have been made of the same thing. Again, there is nothing wrong with publishing these results, but they should be taken with a grain of salt, especially when used to support otherwise ill-motivated new physics (i.e. outcomes that were not already predicted, but so far unobserved, by some long-standing theory that hasn't been strongly disfavored by other evidence).

(6) The group making an observation relied upon has, in the past, reported results that were later found to be incorrect or were contradicted by multiple other observers of the same phenomenon.
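Items (2) and (3) above can be made concrete with a short sketch. It assumes a simple trials-factor correction for global significance (N independent places a fluctuation could have appeared) and an equal-weight Stouffer combination with a single common correlation `rho` between measurements. Both are simplifying assumptions chosen for illustration, the function names are hypothetical, and real analyses use more careful treatments of both effects.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1


def sigma_to_p(z):
    """One-sided tail probability of a z-sigma excess."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def p_to_sigma(p):
    """Inverse of sigma_to_p: the z-score whose one-sided tail is p."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def global_p(local_p, n_trials):
    """Chance of at least one such fluctuation in n_trials independent tries.

    Computes 1 - (1 - local_p)**n_trials in a numerically stable way.
    """
    return -math.expm1(n_trials * math.log1p(-local_p))


def stouffer_z(z_scores, rho=0.0):
    """Equal-weight combined z-score, assuming a common pairwise correlation rho."""
    n = len(z_scores)
    variance_of_sum = n + n * (n - 1) * rho
    return sum(z_scores) / math.sqrt(variance_of_sum)


# Item (2): a 3 sigma local excess, with 100 independent places to look.
p_global = global_p(sigma_to_p(3.0), n_trials=100)
print("global p:", p_global, "global sigma:", p_to_sigma(p_global))

# Item (3): four separate 2 sigma tensions, combined with and without correlation.
print("uncorrelated:", stouffer_z([2.0, 2.0, 2.0, 2.0]))
print("rho = 0.5:   ", stouffer_z([2.0, 2.0, 2.0, 2.0], rho=0.5))
```

In this toy setup the 3 sigma local excess shrinks to roughly a 1 sigma global effect, and the four 2 sigma tensions combine to 4 sigma only if they are truly uncorrelated. With a correlation of 0.5 the combination stays below 3 sigma.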

    Literature Review And Authorship 

(7) Papers should clearly state that they are proposing or relying upon new physics rather than widely accepted existing physics. This warning should be heightened where the paper purports to explain an experimental anomaly with new physics within three weeks of a newly reported anomalous experimental result and is written by someone not from the group that did the experiment (i.e. before adequate time has elapsed for papers offering non-new physics explanations to be released). The heightened warning is necessary to counter the rush to release ill-vetted new physics explanations immediately after a new result is announced, with any necessary corrections made later, in order to get credit for being the first to discover a new physics phenomenon in the unlikely case that the proposal is the correct one.

(8) The paper relies upon a result from a "fake journal".

(9) None of the authors of the paper is a professional academic or researcher with a PhD in the field in which it is written. (Papers with a professional academic or researcher and also a co-author who is a student in the field, or an industry professional, or someone making contributions from outside the field such as a graphic designer who did the especially difficult figures for the paper or a native English language speaker who helped the authors write idiomatically, or an autodidact, however, do not require a warning label).

(10) A proposed new physics explanation of a phenomenon is presented when the phenomenon has purportedly been explained already without new physics in a published paper or preprint.

(11) The paper explains an observation with new physics without first discussing and ruling out possible explanations that do not involve new physics, including the possibility that there are understated or overlooked sources of uncertainty or inaccuracy in the observations relied upon.

(12) The paper does not review the literature, or has identified a paper in its review of the literature that reaches a contrary result and does not contain an analysis engaging with the analysis of prior work that reaches a contrary result. The literature review is inadequate and shouldn't count as a literature review sufficient to avoid a warning label unless it identifies literature critical of an overall new physics research program that the paper advances (e.g. supersymmetry, string theory, or QCD axions) or states that the authors have looked for and not found any such criticisms. Often a simple Wikipedia search would reveal such criticisms, but only a minority of papers advancing new physics include these criticisms in their literature reviews (instead often stating that a theory is "well-motivated" based upon papers written decades earlier before criticisms of the research program had surfaced). It is O.K. to write and publish a "Note" or "Letter" without a full literature review and analysis, but any publication of this kind should come with a warning label to that effect.

A warning label oriented approach would provide an easy checklist for science reporters, for legitimate scientific journals considering sending articles out for peer review, and for peer reviewers. Even when the authors and/or publishers fail to address the points in it, such a checklist could encourage greater quality control in scientific paper writing and would reduce the incentive to publish dubious speculative work.
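To illustrate how mechanical such a checklist could be, here is a rough sketch covering a handful of the items above. Everything in it is hypothetical: the `PaperFacts` fields and the `warning_labels` function are names invented for this example, and a reporter or reviewer would still have to fill in the facts by actually reading the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PaperFacts:
    """Facts a reviewer records about a paper (hypothetical fields)."""
    relies_on_outlier: bool                 # item (4)
    global_significance_sigma: float        # item (5)
    # Days between the anomalous result and this paper, or None if the
    # paper is not a response to a newly reported anomaly. Item (7).
    days_since_anomaly: Optional[int]
    authors_from_experiment: bool           # item (7)
    has_phd_researcher_author: bool         # item (9)
    has_literature_review: bool             # item (12)


def warning_labels(facts: PaperFacts) -> List[str]:
    """Return the warning labels that this subset of the checklist triggers."""
    labels = []
    if facts.relies_on_outlier:
        labels.append("(4) relies upon outlier observations")
    if facts.global_significance_sigma < 3.0:
        labels.append("(5) less than 3 sigma global significance")
    if (facts.days_since_anomaly is not None
            and facts.days_since_anomaly <= 21
            and not facts.authors_from_experiment):
        labels.append("(7) new physics within three weeks of an anomaly")
    if not facts.has_phd_researcher_author:
        labels.append("(9) no professional researcher among the authors")
    if not facts.has_literature_review:
        labels.append("(12) no literature review")
    return labels
```

A paper that trips none of these conditions returns an empty list, while each condition that is met adds one label, so the output can be pasted directly alongside a summary of the paper.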

In a related matter, when considering a scientist's or academic's publications in hiring, or for tenure or promotion, the papers published ought to be flagged in cases where the conclusions made were later refuted by other research or significantly revised due to errors identified by others, the research program did not pan out, or the papers have been cited unfavorably or critically. Current norms in the field call only for disclosure of papers that have been retracted due to academic dishonesty such as faked data or citations to non-existent literature (which are, of course, "red flags" rather than "yellow flags").