Thursday, June 30, 2022

What Drives The Distribution Of Tonal Languages And Correlated Phonetic Features?

This post was originally started and mostly written a few years ago. It is refined, expanded, and published now. (I've now cleared my backlog of draft posts.)

Languages with Complex Tone Systems

Languages With Simple Tone Systems

Languages Without Tone Systems

Languages with labial-velar consonants in yellow; 
languages with clicks in red and black.

Languages with glottal consonants other than ejectives
Purple and yellow have implosives only; 
red and white have glottalized resonants only; 
green and aqua have implosives and glottalized resonants.

Charts via WALS Online.

What is a tonal language?

In a tonal language, tone is the term used to describe the use of pitch patterns to distinguish individual words or the grammatical forms of words, such as the singular and plural forms of nouns or different tenses of verbs.

Tonality appears to be a part of a total phoneme set for a language which also includes a language's inventory of consonants, glottal stops, vowels, and click sounds. 

* The average language with a complex tone system has 26.0 consonants and 7.05 vowels, for a total of 33.05 phonemes.
* The average language with a simple tone system has 23.3 consonants and 6.28 vowels, for a total of 29.58 phonemes.
* The average language with no tone system has 22.1 consonants and 5.58 vowels, for a total of 27.68 phonemes.
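The totals in the list above are just the consonant and vowel averages added together. Here is a minimal sketch that reproduces them and the gaps between categories. It assumes the consonant and vowel averages were computed over the same set of languages, which the figures above do not confirm, and the variable names are mine.

```python
# Average (consonants, vowels) per tone category, copied from the list above.
averages = {
    "complex tone": (26.0, 7.05),
    "simple tone": (23.3, 6.28),
    "no tone": (22.1, 5.58),
}

# Total phoneme inventory = consonants + vowels, rounded to two places
# to hide floating point noise.
totals = {name: round(c + v, 2) for name, (c, v) in averages.items()}

# Gap between the complex tone and no tone categories.
gap = round(totals["complex tone"] - totals["no tone"], 2)

for name, total in totals.items():
    print(f"{name}: {total} phonemes on average")
print(f"complex minus none: {gap}")
```

On these numbers, the average complex tone language has a bit more than five additional phonemes compared with the average language with no tone system.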

Tonal languages are also more likely to have implosive consonants (a type of glottal consonant), glottalized resonant consonants, labial-velar consonants, and linguistic click consonants.

Where Are Tonal Languages Spoken?

As the charts at the top of this post demonstrate, tone languages tend to be far more common in places with tropical (or at least subtropical) climates and very rare elsewhere. 

Languages with simple tone systems show a similar, but less pronounced tendency. All of the tonal languages outside tropical and subtropical areas have only simple tone systems. 

The most controversial cases are arguable simple tone systems in places where tonal languages are rare. A dozen of the languages classified as having simple tone systems are among the most geographically atypical, and are so marginally tonal that they arguably would be more properly classified as non-tonal. These include Norwegian, Japanese, Ainu and Oneida (Iroquoian; New York State).

Much of South Asia, however, has a tropical or subtropical climate and many languages, yet appears to have no languages with a complex tone system among its many Indo-European, Dravidian, or Austroasiatic languages. It does have a handful of Sino-Tibetan languages with simple tone systems in the Himalayan highlands and in the far northeast of the subcontinent. 

There are definitional issues about what constitutes a language with a complex tone system, a simple tone system or no tone system. WALS explains its definitions (emphasis added):
The first distinction made in this chapter is between languages with and languages without tones. For most languages it is easy to determine if the language does or does not make use of tone, but there are surprisingly sharp disagreements in certain cases. 
For example, Dar Fur Daju (Nilo-Saharan; Sudan) is reported as non-tonal in one source but transcribed with three tone levels in another. Ket (Yeniseian; northern Siberia) is described as having none, two, four or eight tones by different authors (there are some differences in the dialects being described, but this does not account for the differences of opinion on the tonal status of the language). Both these languages have been counted as non-tonal in the present chapter since the opinion that they lack tones seems to be the most well-supported (see Thelwall 1981 and Feev 1998 respectively).  
Other languages have clear word-level pitch phenomena but with limited function, or with roles that look more like stress in that they highlight a particular syllable of a word. Norwegian, Japanese, Ainu and Oneida (Iroquoian; New York State) are among languages of this kind. These languages are classified here as tonal, but are perhaps only marginally so.  
Of the 526 languages included in the data used for this chapter, 306 (58.2%) are classified as non-tonal. This probably underrepresents the proportion of the world’s languages which are tonal since the sample is not proportional to the density of languages in different areas. 
For example, from the large Niger-Congo family of Africa there are 68 languages in the sample, 5 of which are nontonal (Swahili, Diola-Fogny, Koromfe, Wolof and Bisa) and the remainder tonal. The Ethnologue (Grimes 2000) lists 1489 Niger-Congo languages, so less than 5% of the Niger-Congo languages are included. 
Of the Indo-European languages of western and central Europe, 16 are included (5 Romance, 3 Germanic, 3 Slavic, 2 Celtic, 1 Baltic, Greek, and Albanian). In these Indo-European groups the Ethnologue lists a total of 145 languages (7 Celtic, 58 Germanic, 48 Italic, 18 Slavic, 7 Greek, 4 Albanian, and 3 Baltic languages), so that over 10% of the Western European languages listed are included, only two of which are tonal or marginally so and the rest non-tonal. 
If, correspondingly, 10% of the Niger-Congo family had been included, 80 additional tone languages would have been included. 
Languages without tones predominate in the western part of the Eurasian landmass, including South Asia, in the more southerly regions of South America, and in the coastal area of northwestern North America. In this last area great genealogical diversity exists among the indigenous languages, but tone is almost entirely absent. In addition, no Australian language has been reported to be tonal.  
The languages with tones are divided into those with a simple tone system — essentially those with only a two-way basic contrast, usually between high and low levels — and those with a more complex set of contrasts. 
About a quarter of the languages (132, or 25.1%) have simple tone systems. This includes 12 languages which appear to meet the definition of being tonal only marginally. With better information a few of these might end up being classed as non-tonal. 
Less than a fifth (88, or 16.7%) have complex tone systems. Tone languages have marked regional distributions. Virtually all the languages in Africa are tonal, with the greater number having only simple tone systems, although more complex systems are not unusual, especially in West Africa. Languages with complex tone systems dominate in an area of East and Southeast Asia. Several clusters of languages with tones occur in South, Central and North America. A number of the languages of New Guinea are also tonal, or at least marginally so.
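The sampling arithmetic in the WALS passage above is easy to check. This is a rough sketch using only the counts quoted in the passage. The variable names are mine, and applying the sampled tonal share to the hypothetical extra Niger-Congo languages is my own simplifying assumption, not something WALS does.

```python
# Counts quoted in the WALS passage above.
sample = {"non-tonal": 306, "simple": 132, "complex": 88}
total = sum(sample.values())

# Percentage share of each category in the sample.
shares = {name: round(100 * n / total, 1) for name, n in sample.items()}

# Coverage of two families relative to the Ethnologue counts quoted above.
nc_sampled, nc_nontonal, nc_listed = 68, 5, 1489
ie_sampled, ie_listed = 16, 145
nc_coverage = nc_sampled / nc_listed
ie_coverage = ie_sampled / ie_listed

# How many more Niger-Congo languages a 10% sample would have included.
extra_languages = round(0.10 * nc_listed) - nc_sampled

# Assumption: the extra languages are tonal at the same rate as the sampled ones.
nc_tonal_share = (nc_sampled - nc_nontonal) / nc_sampled
extra_tonal = extra_languages * nc_tonal_share

print(total, shares)
print(f"Niger-Congo coverage: {nc_coverage:.1%}, Indo-European coverage: {ie_coverage:.1%}")
print(f"Extra languages at 10% coverage: {extra_languages} (about {extra_tonal:.0f} tonal)")
```

The three categories do sum to 526, and a 10% Niger-Congo sample would add 81 languages, which is in line with the "80 additional tone languages" in the passage.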

Tonality Appears To Be Primarily An Areal Rather Than A Language Family Based Property Of Languages

There are language families in which some languages are tonal, while others are not.

As noted above, two Indo-European languages arguably have simple tone systems, although at least one of these is a marginal case with a dubious classification.

Only five of Africa's Niger-Congo languages do not have tone systems.

Within the Afroasiatic language family, tonal languages appear in the Omotic, Chadic, and Cushitic branches of Afroasiatic (the Southern tier of Afroasiatic languages, mostly in Ethiopia and the African Sahel), according to Ehret (1996), but the Semitic, Berber, and Egyptian branches do not use tones phonemically.

Most Austroasiatic languages are tonal, but not the Munda languages of South Asia and not five of the lesser known Austroasiatic languages of Vietnam and Laos.

The Austronesian languages aren't uniform with regard to tonality either: "Unlike in the languages of Mainland Southeast Asia, tonal contrasts are extremely rare in Austronesian languages. Exceptional cases of tonal languages are Moklen and a few languages of the Chamic, South Halmahera–West New Guinea and New Caledonian subgroups."

Places where tonal languages are spoken have temperatures and humidities that influence sound transmission through the air, and terrain features (e.g. tree density) that affect how far spoken words need to carry. So, one theory is that tone languages arise in places where the sound transmission qualities of the air and terrain favor them.

There have been suggestions in the literature that the local climate and ecology can make certain phoneme sets better in some places than in others, that the nature of one part of a phoneme set influences the nature of other parts of the phoneme set, and that there are specific non-random factors that favor particular subtypes of phonemes in particular conditions.
An environmental explanation is supported by the observation that tonality seems to be more of an areal effect than one that tracks language families. A fair amount of circumstantial evidence from patterns of semantic tone use worldwide points this way: neighboring languages that come from different families often share the feature of semantic tonality, while languages within the same language family often differ in their use of it.
Incidentally, the geographic distribution of languages with tone systems is similar, although not identical, to the geographic distribution of languages with glottal consonants. Both are most common in sub-Saharan Africa, Southeast Asia, and the subtropical and tropical regions of the Americas (although the Americas are far from uniform despite all except the Na-Dene and Inuit language families probably having a common ancestor ca. 14kya). But, the Chinese dialect family uses tone, while it does not utilize glottal consonants. 
It could be that the ancestral hominin type ASPM gene correlated with tonal languages “tunes” one’s hearing system to better distinguish sounds in a certain pitch range, in general, in places that have the temperatures, humidities and terrain most conducive to tonal languages, while the derived type ASPM gene loosens the focus of the hearing system so that it isn’t so primed to maximize hearing of sounds in a particular set of conditions, which would be adaptive elsewhere.
This is a more complex hypothesis than the one proposed in the 2007 paper showing a relationship between tonal languages and this gene. That paper's abstract is as follows:
The correlations between interpopulation genetic and linguistic diversities are mostly noncausal (spurious), being due to historical processes and geographical factors that shape them in similar ways. Studies of such correlations usually consider allele frequencies and linguistic groupings (dialects, languages, linguistic families or phyla), sometimes controlling for geographic, topographic, or ecological factors. 
Here, we consider the relation between allele frequencies and linguistic typological features. Specifically, we focus on the derived haplogroups of the brain growth and development-related genes ASPM and Microcephalin, which show signs of natural selection and a marked geographic structure, and on linguistic tone, the use of voice pitch to convey lexical or grammatical distinctions. 
We hypothesize that there is a relationship between the population frequency of these two alleles and the presence of linguistic tone and test this hypothesis relative to a large database (983 alleles and 26 linguistic features in 49 populations), showing that it is not due to the usual explanatory factors represented by geography and history. The relationship between genetic and linguistic diversity in this case may be causal: certain alleles can bias language acquisition or processing and thereby influence the trajectory of language change through iterated cultural transmission.

Still earlier versions of this gene are associated with brain size:
The size of human brain tripled over a period of approximately 2 million years (MY) that ended 0.2-0.4 MY ago. This evolutionary expansion is believed to be important to the emergence of human language and other high-order cognitive functions, yet its genetic basis remains unknown. An evolutionary analysis of genes controlling brain development may shed light on it. ASPM (abnormal spindle-like microcephaly associated) is one of such genes, as nonsense mutations lead to primary microcephaly, a human disease characterized by a 70% reduction in brain size. Here I provide evidence suggesting that human ASPM went through an episode of accelerated sequence evolution by positive Darwinian selection after the split of humans and chimpanzees but before the separation of modern non-Africans from Africans. Because positive selection acts on a gene only when the gene function is altered and the organismal fitness is increased, my results suggest that adaptive functional modifications occurred in human ASPM and that it may be a major genetic component underlying the evolution of the human brain.
The case that an ASPM variant enhances fitness primarily by making one's hearing system better adapted to one's primary environment makes more sense to me in an evolutionary selective fitness sense. If people with the region-appropriate variant hear subtle sound differences better than people who lack it, that could increase the ability of a hunter-gatherer to locate prey, to detect predators, to locate lost children who have wandered far away, to hear enemies approaching, and to detect a fire that has gotten out of control or something underfoot that is about to break. Cumulatively, that could produce a gradual but persistent selective fitness advantage in the evolutionary sense.
I find it harder to believe that the tone language specific application of this trait would have much of a selective fitness effect. An inability to distinguish by sound alone two words that would both make contextual sense in a tonal language that you and the speaker share might tweak one’s social status in the community a little, but it seems less likely to have a big impact on mortality or lifetime reproductive success. It’s not impossible, but it would seem like a weaker explanation.
Against this backdrop, the natural question to ask is one that wouldn’t otherwise be obvious: “why aren’t there more tonal languages in South Asia?” The region has substantial linguistic diversity, and parts of it have climate features very similar to places in Africa, Southeast and East Asia, and the Americas where tonal languages are predominant.
One partial answer to this is that the Indo-Aryan languages developed in places that did not have this climate and didn’t spontaneously pick up this feature upon arriving in the subcontinent. 
Migration can also explain the affirmative presence of these features in arid southern Africa where people speaking these languages probably migrated from more tropical parts of Eastern sub-Saharan Africa.
But this doesn’t explain why we don’t see tonal Munda and Dravidian languages in South Asia. The urheimat of the Austroasiatic languages, of which the Munda languages are a family member, is Southeast Asia (or perhaps southern China), where the vast majority of languages are tonal. And the Dravidian languages, as far as anyone knows, are autochthonous in South Asia.
In both of these exception cases, I think that the likely explanation is a language learner effect.
The Munda languages, at least initially, seem to have had a fairly northerly distribution within South Asia, where hearing well suited to tonality wouldn’t have been advantageous to the local people. Those locals probably accounted for all or most of the women in the community at the time of first contact, when the Munda languages would have been adopted by people integrated into the early Munda communities. If half the people had trouble hearing the tones, that feature, which was almost surely present in an ancestral pre-Munda language, probably didn’t survive.
In the case of the Dravidian languages, which probably have an ancestral version that was tonal under the environmental hypothesis, the pertinent fact is that there are no meaningful communities in Dravidian India that do not have substantial ANI admixture dating to the last 2,000-3,500 years. The language learner effect at the time of ANI-ASI admixture could have stripped the Dravidian languages in existence at the time of their tonal features for the same reasons. The ubiquity of the Hindu religion in Dravidian India, which has clear Indo-Aryan and Harappan synthesis origins, likewise suggests that the language learners were not just anybody; they were culturally influential elites whose language choices tend to influence whole communities by cultural imitation.

A New Post With A Weird Date

I recently resurrected a lengthy draft post that blogger has, for some reason that I have never seen before, given the date of February 2, 2021, which is when the draft was initiated, rather than today, which is when it was finally revised and published. 

It is entitled: "Harappan, Dravidian and Indo-Aryan Legacies" and disputes some conjectures made by Razib Khan about South Asian and Brahui prehistory. 

I encourage you to click through to read it.

Describing The Black Hole At The Center Of The Milky Way

A new paper describes how the black hole at the center of the Milky Way galaxy, Sagittarius A* (SgrA*), could have both its mass and spin measured with great precision by planned space-based gravitational wave detectors.

A precise new mass measurement (at roughly part per million precision), while interesting in its own right, wouldn't represent much of a scientific advance, since the mass has already been measured much more crudely (approximately 4,000,000 times the mass of the Sun). 

But even a measurement of the spin of SgrA* with one significant digit of precision would be a major scientific advancement. This proposed observation method would provide an exquisitely precise measurement of both the mass and the spin of SgrA* as soon as the opportunity presented itself, when a brown dwarf falls into this black hole, which probably isn't all that rare of an event.

While the properties of a spinning black hole have been worked out analytically from first principles in General Relativity, with and without black hole electromagnetic charge, there is almost no solid observational evidence regarding whether black holes actually do have spin, and if so, what the magnitude of that spin of supermassive black holes at the center of galaxies like this one is likely to be.

The paper and its abstract are as follows:

Estimating the spin of SgrA∗ is one of the current challenges we face in understanding the center of our Galaxy. In the present work, we show that detecting the gravitational waves (GWs) emitted by a brown dwarf inspiraling around SgrA∗ will allow us to measure the mass and the spin of SgrA∗ with unprecedented accuracy. 
Such systems are known as extremely large mass-ratio inspirals (XMRIs) and are expected to be abundant and loud sources in our galactic center. We consider XMRIs with a fixed orbital inclination and two scenarios for SgrA∗'s spin (s): A highly spinning scenario where s=0.9 and a low spinning scenario where s=0.1. 
For both cases, we obtain the number of circular and eccentric XMRIs expected to be detected by space-borne GW detectors like LISA and TianQin. We later perform a Fisher matrix analysis to show that by detecting a single XMRI the mass of SgrA∗ can be determined with an accuracy ranging from 0.06 to 3 solar masses while the spin can be measured with an accuracy between 1.5×10^−7 and 4×10^−4.
Veronica Vazquez-Aceves, Yiren Lin, Alejandro Torres-Orjuela, "SgrA∗ spin and mass estimates through the detection of an extremely large mass-ratio inspiral" arXiv:2206.14399 (June 29, 2022).
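To put the abstract's mass accuracy in context, here is a small sketch that converts the quoted absolute errors (0.06 to 3 solar masses) into fractional precision. It assumes a central mass of about 4,000,000 solar masses, the round figure used in this post, and the variable names are mine.

```python
# Assumed central value: roughly 4 million solar masses, as stated in this post.
M_SGRA_SOLAR = 4.0e6

# Absolute mass accuracies quoted in the abstract, in solar masses.
mass_err_best = 0.06
mass_err_worst = 3.0

# Fractional precision = absolute error / central value.
frac_best = mass_err_best / M_SGRA_SOLAR
frac_worst = mass_err_worst / M_SGRA_SOLAR

# The same figures expressed in parts per million.
ppm_best = frac_best * 1e6
ppm_worst = frac_worst * 1e6

print(f"best case:  {frac_best:.1e} ({ppm_best:.3f} ppm)")
print(f"worst case: {frac_worst:.1e} ({ppm_worst:.3f} ppm)")
```

Under that assumption, even the worst case quoted in the abstract comes in a bit under one part per million.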

The introduction of the paper explains that:
SgrA∗, the super-massive black hole (SMBH) at our galactic center, was recently observed by the Event Horizon Telescope obtaining an estimated mass of 4 × 10^6 M⊙, which is in good agreement with previous estimates; in contrast, measuring its spin remains a major challenge. 
In this work we show that detecting a single extremely large mass ratio inspiral (XMRI), i.e., a brown dwarf (BD) inspiraling towards SgrA∗ due to energy loss by gravitational waves (GWs), is enough to determine the spin and mass of SgrA∗ with very good accuracy. When an XMRI forms in our galactic center, its GW emission can be detected by space-borne detectors such as the Laser Interferometer Space Antenna (LISA) and TianQin. Furthermore, its large mass-ratio 𝑞 = 𝑚BD/𝑀SgrA∗ ≈ 10^−8 , allows the BD to spend a large amount of time inspiraling around the SMBH, and the slow evolution of the orbit simplifies the analysis of its gravitational radiation. Understanding its formation channels and evolution is a key part of the description that will lead to accurate templates to identify and extract the information encoded within its gravitational radiation. In a dense stellar system, such as our galactic center, two-body relaxation processes slowly change the orbits of the orbiting objects by diffusion in energy and angular momentum (𝐽). However, as diffusion in 𝐽 is more efficient than diffusion in energy, the eccentricity (𝑒) of the orbits changes faster than the semimajor axis (𝑎). As a consequence, the pericentre (𝑅p) of the orbit is perturbed, allowing objects to reach very close distances to the central black hole. An inspiraling system is formed when a compact object is diffused into an orbit with a small Rp such that after just one pericentre passage the orbit evolves only due to the energy lost by gravitational radiation. 
Recent estimates show that at the time the LISA and TianQin missions will be in space, there could be about 15 eccentric and five circular XMRIs in our galactic center emitting GWs in a detectable range. 
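The mass ratio q ≈ 10^−8 quoted in the introduction can be sanity checked with a few lines. The brown dwarf mass range of roughly 0.013 to 0.08 solar masses used here is my assumption (the commonly cited range for brown dwarfs), not a figure from the paper, and the variable names are mine.

```python
# Mass of SgrA* in solar masses, from the Event Horizon Telescope estimate quoted above.
M_SGRA_SOLAR = 4.0e6

# Assumed brown dwarf mass range in solar masses (not taken from the paper).
BD_MASS_RANGE_SOLAR = (0.013, 0.08)

# Mass ratio q = m_BD / M_SgrA* at each end of the assumed range.
q_low = BD_MASS_RANGE_SOLAR[0] / M_SGRA_SOLAR
q_high = BD_MASS_RANGE_SOLAR[1] / M_SGRA_SOLAR

print(f"q ranges from {q_low:.2e} to {q_high:.2e}")
```

Under that assumption, q spans a few times 10^−9 to a couple times 10^−8, which brackets the order of magnitude given in the paper.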

Monday, June 27, 2022

Events In 1627 CE Worthy Of A Movie

In the grand scheme of history, these events, sometimes called, somewhat misleadingly, the Turkish Abductions, made little difference and were an outlier to the overall trends. But they would be a great foundation for a movie bringing this historical period to life in multiple places and connecting them:

The Turkish Abductions (Icelandic: Tyrkjaránið) were a series of slave raids by pirates from Northwest Africa that took place in Iceland in the summer of 1627.

The pirates came from the cities of Algiers which was part of the Ottoman Empire (in modern-day Algeria) and Salé which was its own city-state known as Republic of Salé a tributary state of Morocco. They raided Grindavík, the East Fjords, and Vestmannaeyjar. About 50 people were killed and close to 400 people were captured and sold in the African slave market. A ransom was eventually paid, 9 to 18 years later, for the return of 50 individuals.

The label "Turkish" does not refer to Turkey; at the time it was a general term for all Muslims in the Mediterranean region since the majority were a part of the Ottoman Empire. During the 17th century, the majority of those called "Turks" in Algeria, were disowned Christians that had converted to Islam. They were mostly Spanish, Italians, and provençaux (French).

Ancient v. Modern Y-DNA R1b-V88


From here.

The ancient DNA "X" in the image is very close geographically to the site of the Bug-Dniester culture.

This is relevant to previous discussions of Y-DNA R1b-V88 at this blog on May 6, 2022 and on September 27, 2017, in which I argue that Y-DNA R1b-V88 bearing Chadic people are derived from migrants who originated in the Bug-Dniester culture of Ukraine, departing between 5400 BCE and 5200 BCE. 

See also earlier analysis at this blog not reaching the full conclusion, including a detailed paleo-climate analysis on March 2, 2014, a wet Sahara post on November 11, 2012, and a post on March 14, 2012 citing:

Friday, June 24, 2022

How Long Were Neanderthals and Modern Humans Neighbors In Europe?

My working estimate had been about 1,000 years of co-existence at any one place (an estimate also in line with estimated periods of co-existence for the first farmers and the prior hunter-gatherers of Europe). This paper's conclusion is a bit longer than that.
Recent fossil discoveries suggest that Neandertals and Homo sapiens may have co-existed in Europe for as long as five to six thousand years. Yet, evidence for their contemporaneity at any regional scale remains elusive. In France and northern Spain, a region which features some of the latest directly-dated Neandertals in Europe, Protoaurignacian assemblages attributed to Homo sapiens appear to replace Neandertal associated Chatelperronian assemblages. 
Using the earliest and latest known occurrences as starting points, Bayesian modelling has provided some indication that these occupations may in fact have been partly contemporaneous. The reality, however, is that we are unlikely to ever identify the first or last appearance of a species or cultural tradition in the archaeological and fossil record. 
Here, we use optimal linear estimation modelling to estimate the first appearance date of Homo sapiens and the extinction date of Neandertals in France and northern Spain by statistically inferring these missing portions of the Protoaurignacian and Chatelperronian archaeological records. Additionally, we estimate the extinction date of Neandertals in this region using a set of directly-dated Neandertal fossil remains. 
The results suggest that the onset of the Homo sapiens occupation of this region likely preceded the extinction of Neandertals and the Chatelperronian by up to 1400-2900 years, raising the possibility of an extended co-existence of these groups during the initial Upper Palaeolithic of this region. Whether or not this co-existence featured some form of direct interaction, however, remains to be resolved.

Wednesday, June 22, 2022

Was There A Paleo-American Ghost Population In Pre-Columbian Mexico?

A new Mexican ancient DNA paper and its abstract are as follows: 
Aridoamerica and Mesoamerica are two distinct cultural areas that hosted numerous pre-Hispanic civilizations between 2,500 BCE and 1,521 CE. The division between these regions shifted southward due to severe droughts ca. 1,100 years ago, allegedly driving demographic changes and population replacement in some sites in central Mexico. 
Here, we present shotgun genome-wide data from 12 individuals and 26 mitochondrial genomes from eight pre-Hispanic archaeological sites across Mexico, including two at the shifting border of Aridoamerica and Mesoamerica. 
We find population continuity spanning the climate change episode and a broad preservation of the genetic structure across present-day Mexico for the last 2,300 years. 
Lastly, we identify a contribution to pre-Hispanic populations of northern and central Mexico from an ancient unsampled ′ghost′ population.
Viridiana Villa-Islas, "Demographic history and genetic structure in pre-Hispanic Central Mexico" bioRxiv (June 20, 2022). (Supplementary materials here).

The Ghost Population Claim

This also isn't the first paper I've seen that mentions a possible "ghost" population in northern and central Mexico (see, e.g., previous discussion here), something which is always intriguing, even though, from the perspective of the typical fairly well educated layman, they may as well all be "ghost populations" since the average person really has no sense of which known non-ghost pre-Columbian populations of Mexico are out there anyway. 

Still, the existence of a "ghost" population implies the extinction of that population, and the death of a whole people is always notable. 

But, even if prior papers on this ghost population made the point that this ghost population was deeply diverged from other founding population derived indigenous Mexicans, I missed it then, and this deep divergence makes the claim much more notable.

The body text discussion of the ghost population below makes the bombshell claim, in a buried lede, that the divergence date of the ghost population from other indigenous Americans descended from the founding population of the Americas is ten thousand years before the main founding population of the Americas arrived!

Ghost population contribution to pre-Hispanic Mexico 

Earlier reports have demonstrated the detection of ancestry from an unsampled group, designated UpopA, among the present-day Mixe from Mexico and Lagoa Santa from Brazil populations. 

We tested the presence of UpopA in the pre-Hispanic individuals, using an admixture graph model with the same topological framework but some modifications. We used Mbuti as an outgroup, Ami as the East Asian, MA1 as the ancient north Eurasian, USR1 as the ancient Beringian, Anzick-1 and Spirit Cave as the NNA, and Athabascan as the SNA. 

Interestingly, the Mummy F9_ST_a from the north and the individuals from CdV from central Mexico have ancestry from a ‘ghost’ population, at 28% and 17%, respectively (|Z-score|: 2.142 434 and 1.655, respectively), consistent with the “ghost” UpopA ancestry reported in Mixe. 

We repeated this test with the Mixe population to verify the contribution of the ghost population in our tested admixture graph structures, and found Mixe to have an ancestry coming from an unsampled UpopA population at 13% (|Z-score|: 2.34), which is similar to previous estimates of 11%. Models including individuals from other archaeological sites had |Z-score| values over 3 and branches with inner zeros, thus we rejected them. . . . 
Furthermore, the qpGraph admixture models explored for the pre-Hispanic populations showed F9_ST_a and the ancient individuals from CdV have ancestry from an unsampled population that we hypothesize is the ghost population previously found named UpopA. 
The contribution from UpopA was identified in present-day Mixe from Mexico and Lagoa Santa, Brazil, through qpGraph models and estimated to have diverged ~24,700 years ago from Native Americans. 
Also, a more recent study through D statistics analysis has suggested that UpopA contributed to present day populations from north Mexico as Akimel O’odham, Guarijo, Rarámuri, Cora, Mexicanero, and Tepehuano, as well as in present-day populations from central and south Mexico as Mixe, Totonac, Nahuas from Puebla, Otomi from Hidalgo, Chocholteco and, Mocho. 
Our study is the first to identify this UpopA ancestry in admixture models involving pre-Hispanic individuals from Mexico. Further aDNA studies from the Americas could help better characterize the source of this UpopA ghost population contributing to many present-day Indigenous populations from Mexico.

If true, the existence of such a ghost population would call for a dramatic paradigm change regarding the founding of the Americas and might also shed light on the Paleo-Asian ancestry seen in South America. But the UpopA ghost population is not described in this paper as having a Paleo-Asian character, and it may be an entirely separate population from the source of the Paleo-Asian DNA.

Other Analysis  

The drought 1,100 years ago mentioned in the paper was roughly of the same magnitude as the one that Colorado and the rest of the arid Southwest United States are facing right now.

I'm skeptical about the extent to which one can make statements as strong as those in the paper's abstract about population continuity over a thousand years across the temporal boundary of this climate event, in a region as large as present day Mexico, from 26 ancient DNA samples, only 12 of which (none from earlier than 600 CE) have autosomal as well as mtDNA, and only 2 of which are from the hypothesized transition area.

The geographical mix of mtDNA and Y-DNA haplogroups found show continuity with the modern population genetics of Mexico:
Even though we identified previously unreported variants in the ancient mitogenomes, the spatial distribution of the haplogroups found, namely A, B, C and D, closely resembles that of present-day Mexico. 
The information on the paternal lineage was limited, since, despite confirming the presence of the Native American chromosomal haplogroup Q in the pre-Hispanic males, there was no clear differential spatial distribution of the two identified sub-haplogroups, Q1a2a1-L54 and Q1a2a1a1a1-M3, in both Aridoamerica and Mesoamerica. Few studies have recovered information of ancient Y chromosome haplogroups across the Americas and, given the sex-biased admixture events after European colonization, the frequency of haplogroup Q in the present-day population is low. 

The Supplemental Materials note that:

In the Sierra Tarahumara, we found two sub-linages of the mitochondrial haplogroup C (C 100%), as previously reported. In contrast, Central Mexico shows a higher diversity of mitochondrial haplogroups. Cañada de la Virgen was the site with the highest number of sub-haplogroups, 5 distributed in B 50%, A 25%, C 12%, and D 12%. Michoacán presented four different subhaplogroups (A 40%, D 40%, and C 20%). Sierra Gorda also presented four different subhaplogroups but corresponding to major lineages A 50%, B 43%, and D 7%, respectively.  

The body of the paper also analyzes effective population size, which it finds was basically constant until a population expansion 5,000 years ago, followed by a population decline starting slowly at 2,000 years ago and accelerating 500 years ago when contact with Europeans occurred (although the contractions still left effective population sizes far larger than those prior to the population expansion which was probably attributable to a shift to sedentary farming of maize, beans and squash).

The date when the contraction begins, 2,000 years ago, coincides with a drought in what is now the American Southwest around the 100s CE that was much more severe than either the current one or the one that occurred around 800 CE.

They propose a model in which the north to south clinal population structure seen in modern and ancient DNA in Mexico arose from "a northern/southern population split between 4,000 and 10,000 years ago followed by multiple waves of admixture events between them. . . . [with] subsequent splits within each of these regions 6,523 and 5,715 years ago for north and south, respectively."

The secondary splits, again, roughly correspond to the adoption of sedentary agriculture, and the estimated timing of the primary northern/southern split doesn't resolve whether it happened contemporaneously with those secondary splits or took place earlier.

Monday, June 20, 2022

Grain Cultivation and Chinese History

One can do a pretty decent job of explaining the larger course of history with an economically deterministic model (which in turn, although not necessarily in the case of the paper below, has a strong climate driven component).
We propose and test empirically a theory describing the endogenous formation and persistence of mega-states, using China as an example. We suggest that the relative timing of the emergence of agricultural societies, and their distance from each other, set off a race between their autochthonous state-building projects, which determines their extent and persistence. 
Using a novel dataset describing the historical presence of Chinese states, prehistoric development, the diffusion of agriculture, and migratory distance across 1° × 1° grid cells in eastern Asia, we find that cells that adopted agriculture earlier and were close to Erlitou the earliest political center in eastern Asia remained under Chinese control for longer and continue to be a part of China today. By contrast, cells that adopted agriculture early and were located further from Erlitou developed into independent states, as agriculture provided the fertile ground for state-formation, while isolation provided time for them to develop and confront the expanding Chinese empire.
Our study sheds important light on why eastern Asia kept reproducing a mega-state in the area that became China and on the determinants of its borders with other states.
James Kai-sing Kung, et al., "Millet, Rice, and Isolation: Origins and Persistence of the World's Most Enduring Mega-State", SSRN IZA Institute of Labor Economics Discussion Paper Series IZA Discussion Paper No. 15348 (June 11, 2022).

Thursday, June 16, 2022

How Often Does The LHC Produce Electroweak Bosons?

A June 11, 2022 blog post by Tommaso Dorigo provides a somewhat prosaic bit of background data that often gets overlooked in the hype of looking at unusual events and collectively demonstrates the overall soundness of the electroweak part of the Standard Model in a way that I haven't previously blogged. 

Look how closely the blue lines that are Standard Model predictions line up with the experimental data except in the last column which uses a pre-Higgs boson discovery mass estimate for the Higgs boson (this line would match up pretty closely too if the experimentally correct Higgs boson mass were used instead).
The graph is old but I can't be bothered with finding the latest-greatest measurement, as it is only for illustrative purposes...

Some explanation of this graph is in order. On the vertical axis you have the production cross section of the different processes, which is basically a rescaled probability - as we said, W production (the leftmost column) takes place once in a million collisions, which is in fact 10^5 picobarns ( a picobarn is 10^-12 barns, and for LHC proton collisions that equates to one in 100 billion collisions). 
On the various columns you get to see how the probability becomes smaller as you require the presence of more and more complex final states - two bosons of different kinds, e.g., or even a W plus one or more jets (left sub-bins in the W boson bin). 
Another thing to register is that all measured values (red points with uncertainty bars) are in agreement with predictions. 
A further note is that this graph was produced in 2011, when the Higgs boson had not been discovered yet - and so the last column, which shows the Higgs decay to a pair of Z bosons, is only showing an upper limit on the cross section!
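
The arithmetic in the quoted explanation can be checked with a short Python sketch. It assumes only what the quote itself implies: if one picobarn corresponds to one in 100 billion collisions, the total proton-proton cross section is taken to be about 10^11 picobarns. The function names are illustrative, not from the original post.

```python
# Rough conversion between a production cross section and the share of
# collisions that produce the process, using only figures from the quote.

# Assumption implied by the quote: 1 pb is one in 100 billion collisions,
# so the total proton-proton cross section is about 1e11 pb (0.1 barn).
TOTAL_CROSS_SECTION_PB = 1e11


def collision_fraction(cross_section_pb: float) -> float:
    """Approximate fraction of collisions that produce the process."""
    return cross_section_pb / TOTAL_CROSS_SECTION_PB


def collisions_per_event(cross_section_pb: float) -> float:
    """Approximate number of collisions needed per produced event."""
    return TOTAL_CROSS_SECTION_PB / cross_section_pb


if __name__ == "__main__":
    w_cross_section_pb = 1e5  # W production, as stated in the quote
    print(f"W fraction: {collision_fraction(w_cross_section_pb):.1e}")
    print(f"Collisions per W: {collisions_per_event(w_cross_section_pb):.1e}")
```

Under that assumption, a 10^5 picobarn cross section works out to about one W boson per million collisions, which matches the figure in the quote.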


Troödons are currently my favorite dinosaur. But for the extraterrestrial impact in the Gulf of Mexico about 65 million years ago that wiped out the dinosaurs, they would have been top candidates to become the dominant intelligent species on Earth.

A Troödon "is a large-brained nocturnal Troodontid theropod, which lived in the latter part of the Cretaceous period. It . . . may have had camouflage markings on its hide, to help it catch the nocturnal mammals which formed a small part of its diet. . . . it was also feathered. Troodon had abnormally large eyes, and these helped Troodon to see after the setting of the sun . . . . Troodon . . . is generally accepted as having been . . . warm - blooded. . . . It lived in what is now North America during the late Cretaceous Period (100-65 mya) and grew to be about 8 feet to 10 feet long and about 110 lbs. in weight. This slender little dinosaur might be the most intelligent dinosaur ever to exist. Troodon ate mammals, small dinosaurs, eggs and juvenile Hadrosaurs. It also specializes in pack hunting to hunt bigger prey."

From the Dinosaur Wiki

Olive Trees Domesticated In Levant Ca. 5000 BCE

This particular domestication of olive and fig trees in the Copper Age in the Jordan Valley isn't very surprising (in contrast, for example, to the belated and complicated domestication of the almond tree or the long distance migration involved in banana domestication). But it is still notable and helps piece together the overall chronology and story of plant and animal domestication over time.
A new study has unraveled the earliest evidence for domestication of a fruit tree, researchers report. The researchers analyzed remnants of charcoal from the Chalcolithic site of Tel Zaf in the Jordan Valley and determined that they came from olive trees. Since the olive did not grow naturally in the Jordan Valley, this means that the inhabitants planted the tree intentionally about 7,000 years ago.

The paper and its abstract are as follows:
This study provides one of the earliest examples of fruit tree cultivation worldwide, demonstrating that olive (Olea europaea) and fig (Ficus carica) horticulture was practiced as early as 7000 years ago in the Central Jordan Valley, Israel. 
It is based on the anatomical identification of a charcoal assemblage recovered from the Chalcolithic (7200–6700 cal. BP) site of Tel Tsaf. Given the site’s location outside the wild olive’s natural habitat, the substantial presence of charred olive wood remains at the site constitutes a strong case for horticulture. 
Furthermore, the occurrence of young charred fig branches (most probably from pruning) may indicate that figs were cultivated too. One such branch was 14C dated, yielding an age of ca. 7000 cal. BP. 
We hypothesize that established horticulture contributed to more elaborate social contracts and institutions since olive oil, table olives, and dry figs were highly suitable for long-distance trade and taxation.
Dafna Langgut, Yosef Garfinkel. 7000-year-old evidence of fruit tree cultivation in the Jordan Valley, Israel. 12(1) Scientific Reports (2022) DOI: 10.1038/s41598-022-10743-6

Wednesday, June 15, 2022

What If Neutrinos Coupled Proportionate To Mass To The Higgs Field?

The Standard Model of Particle Physics (SM) is agnostic about the source and nature of the neutrino masses. Indeed, there is some pedantic debate over whether neutrino mass is really part of the SM, although from a practical perspective, most people would say that it is when they are talking about SM physics predictions.

If neutrinos did couple to the Higgs field (which is the source of the rest mass of all of the other massive fundamental particles in the SM) with a coupling proportional to their rest mass, like other SM fundamental particles, that coupling would be so weak that we would probably never observe Higgs boson decays to neutrinos.

I'll illustrate this conclusion with the following back of napkin class calculations.

The Higgs boson branching fractions for decays to charged lepton pairs are 6.27% for tau-lepton pairs (observed), 0.0218% for muon pairs (observed), and 0.0000005% for electron-positron pairs (not yet observed).

Roughly speaking, ratios of branching fractions are on the same order of magnitude as the squares of the corresponding mass ratios.

An electron has a mass of 511,000 eV, versus something on the order of 0.050 eV for the largest neutrino mass eigenstate, a ratio of about 10^7.

This implies that if neutrinos got their mass via the same SM Higgs mechanism that applies to other SM fermions (whether or not that makes sense for other reasons), the branching fraction for Higgs boson decays to neutrinos would be on the order of 10^14 times smaller than the roughly 5*10^-7% for electrons.

This would imply a Higgs boson branching fraction on the order of not more than 10^-21% for any kind of neutrino, when muon pairs, with a branching fraction of about 2*10^-2%, are at the current experimental detection threshold.
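
The back of napkin scaling above can be reproduced in a few lines of Python. This is only a sketch of the rule of thumb that branching fractions scale with the square of the mass, not a real decay width calculation. The electron and neutrino masses and the branching fractions (in percent) are the ones quoted above; the muon and tau masses are standard reference values added here purely as a sanity check, and the function name is illustrative.

```python
# Back-of-napkin check: scale a known Higgs branching fraction by the
# squared ratio of masses to estimate the branching fraction to neutrinos.

# Masses in eV. Electron and neutrino values are from the text; the muon
# and tau masses are standard reference values (an addition for the check).
MASS_EV = {
    "tau": 1776.86e6,
    "muon": 105.66e6,
    "electron": 511_000.0,
    "neutrino": 0.050,  # rough scale of the heaviest mass eigenstate
}

# Higgs branching fractions in percent, as quoted in the text.
BR_PERCENT = {"tau": 6.27, "muon": 0.0218, "electron": 0.0000005}


def scaled_branching_fraction(known: str, target: str) -> float:
    """Scale the known branching fraction (in percent) by (m_target/m_known)^2."""
    return BR_PERCENT[known] * (MASS_EV[target] / MASS_EV[known]) ** 2


mass_ratio = MASS_EV["electron"] / MASS_EV["neutrino"]
br_muon_from_tau = scaled_branching_fraction("tau", "muon")
br_neutrino = scaled_branching_fraction("electron", "neutrino")

if __name__ == "__main__":
    print(f"electron/neutrino mass ratio: {mass_ratio:.2e}")
    print(f"muon BR predicted from tau:   {br_muon_from_tau:.4f}%")
    print(f"neutrino BR estimate:         {br_neutrino:.1e}%")
```

Scaling the tau figure down to the muon lands within a few percent of the quoted 0.0218%, which is some reassurance that the rule of thumb is reasonable. The same scaling from the electron gives a neutrino branching fraction of a few times 10^-21 percent.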

In addition, a smaller proportion of the neutrinos passing through a detector are actually seen than of the charged leptons passing through it, since neutrinos interact more weakly with other matter. So a Higgs boson decay into neutrinos would not be detectable at the same branching fraction at which a charged lepton Higgs boson decay can be detected.

Realistically, detector precision would have to improve by a factor of about 10^22 or more over current technology to directly observe Higgs boson decays to neutrinos in a statistically significant way.

I don't expect that to happen anytime during the lives of anyone who ever encounters me alive.

Evidence That The Right SM Muon g-2 Calculation Is The One That Matches The Experiments

There are two leading calculations of the Standard Model prediction for the anomalous magnetic moment of the muon, called muon g-2, which is an inherent electromagnetic property of muons (a "heavy" electron) that can, in theory, be calculated exactly in the Standard Model.

This calculation has three main components: the electromagnetic contribution, which is the primary contribution to the overall value and is easy to calculate and extremely precise; the weak force contribution, which is a much smaller contribution that is harder to calculate and not quite as precise (to a great extent because some of the physically measured constants that go into that calculation haven't been measured as precisely); and the strong force (i.e. QCD) contribution, which is, like the weak force contribution, a small part of the total in absolute terms, but is profoundly difficult to calculate and is the source of the lion's share of the uncertainty in the calculation of muon g-2.

The small contribution to muon g-2 from virtual quarks and gluons, which impact the electromagnetic properties of muons when you are being exceedingly rigorous, is hard to determine because QCD calculations are exceedingly difficult (for reasons I've explained in prior posts at this blog) and because the underlying physical constants that go into them (such as the strong force coupling constant and light quark masses) haven't been measured very precisely. The hadronic vacuum polarization (HVP) part of the muon g-2 calculation is a QCD component that is particularly difficult to calculate.

Both theoretical calculations agree on the electromagnetic and weak force contributions, are reasonably close to each other (although the differences are statistically very significant), and match the experimentally measured results at roughly the part per hundred million level.

One of the calculations (by the BMW group) matches the experimental result and is purely theory driven. The other (by the Theory Initiative group) is "data driven" and replaces some of the HVP calculations needed to determine muon g-2 in the Standard Model with experimental data that they think should replicate the theoretical calculations that this approach doesn't calculate directly.

A new (admittedly quite technical) lattice QCD study suggests that the data driven component of the Theory Initiative calculation is, for some reason, inaccurately determining the part of the Standard Model HVP contribution to muon g-2 for which it substitutes experimental data.

Why care?

Muon g-2 is an observable quantity that is sensitive globally to almost everything in the Standard Model, at least at "low energies" at the scale of what can be seen at particle colliders and below, although new physics at the extremely high energies of the time immediately after the Big Bang "decouple" from this observable.

If the correct Standard Model prediction for muon g-2 is consistent with the experimental measurement of muon g-2 (which has now been replicated at high precision), then there is very little room for "low energy" beyond the Standard Model physics (although they aren't entirely ruled out so long as their contributions to muon g-2 net out to zero), and the case for the completeness of the existing Standard Model is greatly heightened.

In contrast, if the correct Standard Model prediction for muon g-2 is significantly different from the experimental measurement of muon g-2, then there have to be "low energy" beyond the Standard Model physics out there to be discovered that give rise to a discrepancy of a well quantified magnitude, and that should be accessible at the energy scales of current or next generation particle accelerators. The energy scales at which these new physics should arise are further constrained by the fact that, apart from potential lepton universality violations, we haven't seen them at the LHC so far.

Thus, if this discrepancy is real, "new physics" are just around the corner and there is a strong scientific motivation for building a next generation collider and for choosing the design more likely to identify the source of a muon g-2 anomaly with a particular magnitude.

This new study suggests that the "muon g-2 anomaly" arises because of a flawed calculation of the Standard Model prediction for this observable, arising from the "data driven" part of that prediction of the HVP contribution to muon g-2, and not new physics that is waiting to be imminently observed.

The paper and its abstract are as follows:
Euclidean time windows in the integral representation of the hadronic vacuum polarization contribution to the muon g−2 serve to test the consistency of lattice calculations and may help in tracing the origins of a potential tension between lattice and data-driven evaluations. 
In this paper, we present results for the intermediate time window observable computed using O(a) improved Wilson fermions at six values of the lattice spacings below 0.1 fm and pion masses down to the physical value. Using two different sets of improvement coefficients in the definitions of the local and conserved vector currents, we perform a detailed scaling study which results in a fully controlled extrapolation to the continuum limit without any additional treatment of the data, except for the inclusion of finite-volume corrections. To determine the latter, we use a combination of the method of Hansen and Patella and the Meyer-Lellouch-Lüscher procedure employing the Gounaris-Sakurai parameterization for the pion form factor. We correct our results for isospin-breaking effects via the perturbative expansion of QCD+QED around the isosymmetric theory. 
Our result at the physical point is a^win(μ)=(237.30 ± 0.79 stat ± 1.22 syst) × 10^−10, where the systematic error includes an estimate of the uncertainty due to the quenched charm quark in our calculation. Our result displays a tension of 3.8σ with a recent evaluation of a^win(μ) based on the data-driven method.
Marco Cè, et al., "Window observable for the hadronic vacuum polarization contribution to the muon g−2 from lattice QCD" arXiv:2206.06582 (June 14, 2022) (report number MITP-22-038, CERN-TH-2022-098).

Monday, June 13, 2022

ΛCDM Fails Again

The observed distribution of radio galaxies and quasars is inconsistent with the Standard Model of Cosmology, which asserts that at a sufficiently large scale the universe should look essentially the same in all directions, aside from a correction based upon our vantage point of observation.

We present the first joint analysis of catalogs of radio galaxies and quasars to determine if their sky distribution is consistent with the standard ΛCDM model of cosmology. This model is based on the cosmological principle, which asserts that the universe is isotropic and homogeneous on large scales, so the observed dipole anisotropy in the cosmic microwave background (CMB) must be attributed to our local peculiar motion. 
We test the null hypothesis that there is a dipole anisotropy in the sky distribution of radio galaxies and quasars consistent with the motion inferred from the CMB, as is expected for cosmologically distant sources. Our two samples, constructed respectively from the NRAO VLA Sky Survey and the Wide-field Infrared Survey Explorer, are systematically independent and have no shared objects. 
Using a completely general, two dimensional definition of the p-value that accounts for correlation between the found dipole amplitude and its directional offset from the CMB dipole, the null hypothesis is independently rejected with p=7.9×10^−3 and p=9.9×10^−6 for the radio galaxy and quasar samples, respectively, corresponding to 2.7σ and 4.4σ significance. The joint significance, using sample size-weighted Z-scores, is 5.2σ. We show that the radio galaxy and quasar dipoles are consistent with each other and find no evidence for any frequency dependence of the amplitude.
Nathan Secrest, et al., "A Challenge To The Standard Cosmological Model", arXiv:2206.05624 (June 11, 2022).
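
For readers who want to see how the p-values in the abstract relate to the quoted sigma levels, here is a minimal Python sketch. It assumes a two-sided Gaussian convention, which is my assumption rather than something the abstract states, although it appears to reproduce the quoted 2.7σ and 4.4σ. The `weighted_z` helper shows the general form of a weighted Z-score combination. The abstract does not give the sample sizes or the exact weighting used, so the weights are left as a parameter and no attempt is made to reproduce the 5.2σ joint figure.

```python
from math import sqrt
from statistics import NormalDist


def p_to_sigma(p_value: float) -> float:
    """Gaussian-equivalent significance, assuming a two-sided convention."""
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)


def weighted_z(z_scores, weights):
    """Weighted Z-score combination (Stouffer's method).

    The weights are left to the caller; how the paper derives them from
    sample sizes is not specified in the abstract.
    """
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


if __name__ == "__main__":
    z_radio = p_to_sigma(7.9e-3)   # about 2.7
    z_quasar = p_to_sigma(9.9e-6)  # about 4.4
    print(f"radio galaxies: {z_radio:.1f} sigma")
    print(f"quasars:        {z_quasar:.1f} sigma")
```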