Thursday, April 18, 2024

The Genetic Origins Of The Indo-Europeans

A major new preprint of a paper on Indo-European genetic origins is out. The genetic information is interesting. The linguistic and historical analysis still isn't great, but this paper, at least, seems to abandon previous flawed Anatolian origin and Neolithic origin theories for the Indo-European languages. 

Hat tip to @Ryan for the link to the new paper.

The Yamnaya archaeological complex appeared around 3300 BCE across the steppes north of the Black and Caspian Seas, and by 3000 BCE reached its maximal extent from Hungary in the west to Kazakhstan in the east. To localize the ancestral and geographical origins of the Yamnaya among the diverse Eneolithic people that preceded them, we studied ancient DNA data from 428 individuals of which 299 are reported for the first time, demonstrating three previously unknown Eneolithic genetic clines. 
First, a "Caucasus-Lower Volga" (CLV) Cline suffused with Caucasus hunter-gatherer (CHG) ancestry extended between a Caucasus Neolithic southern end in Neolithic Armenia, and a steppe northern end in Berezhnovka in the Lower Volga. Bidirectional gene flow across the CLV cline created admixed intermediate populations in both the north Caucasus, such as the Maikop people, and on the steppe, such as those at the site of Remontnoye north of the Manych depression. CLV people also helped form two major riverine clines by admixing with distinct groups of European hunter-gatherers. 
A "Volga Cline" was formed as Lower Volga people mixed with upriver populations that had more Eastern hunter-gatherer (EHG) ancestry, creating genetically hyper-variable populations as at Khvalynsk in the Middle Volga. 
A "Dnipro Cline" was formed as CLV people bearing both Caucasus Neolithic and Lower Volga ancestry moved west and acquired Ukraine Neolithic hunter-gatherer (UNHG) ancestry to establish the population of the Serednii Stih culture from which the direct ancestors of the Yamnaya themselves were formed around 4000 BCE. This population grew rapidly after 3750-3350 BCE, precipitating the expansion of people of the Yamnaya culture who totally displaced previous groups on the Volga and further east, while admixing with more sedentary groups in the west. 
CLV cline people with Lower Volga ancestry contributed four fifths of the ancestry of the Yamnaya, but also, entering Anatolia from the east, contributed at least a tenth of the ancestry of Bronze Age Central Anatolians, where the Hittite language, related to the Indo-European languages spread by the Yamnaya, was spoken. 
We thus propose that the final unity of the speakers of the "Proto-Indo-Anatolian" ancestral language of both Anatolian and Indo-European languages can be traced to CLV cline people sometime between 4400-4000 BCE.
Iosif Lazaridis, et al., "The Genetic Origins of the Indo-Europeans" bioRxiv (April 17, 2024).

The paper's conclusion lays out a narrative, which has some merit, although I'm skeptical of some of their analysis, including the Anatolian part. 
The origin and spread of the first speakers of Indo-Anatolian languages 
Different terminologies exist to designate the linguistic relationship of Anatolian and Indo-European languages. The traditional view includes both within an “Indo-European” (IE) group in which Anatolian languages usually represent the first split. An alternative terminology, which we use here, names the entire linguistic group “Indo-Anatolian” (IA) and uses IE to refer to the set of related non-Anatolian languages such as Tocharian, Greek, Celtic, and Sanskrit. Dates between 4300-3500 BCE have been proposed for the time of IA split predating both the first attestation of the Hittite language in Central Anatolia (post-2000 BCE) and the expansion of the Yamnaya archaeological culture (post-3300 BCE). 
We identify the Yamnaya population as Proto-IE for several reasons. First, the Yamnaya were formed by admixture ~4000 BCE and began their expansion during the middle of the 4th millennium BCE, corresponding to this linguistic split date between IE and Anatolian. Second, the Yamnaya were the source of the Afanasievo migration to the east, a leading candidate for the split of the ancestral form of Tocharian, widely recognized as the second split after that of Anatolian. Third, the Yamnaya can be linked to the languages of Armenia via both autosomal and Y-chromosome ancestry after ~2500 BCE, and to the languages of the Balkans such as Greek. Fourth, the Yamnaya can be linked indirectly to other IE speakers via the demographically and culturally transformative Corded Ware and Beaker archaeological cultures of the 3rd millennium BCE that postdate it by centuries. Most people of the Corded Ware culture of central-northern Europe had about three quarters of Yamnaya ancestry, a close connection within a few generations that can be traced to the late 4th millennium BCE. The Beaker archaeological culture of central-western Europe also shared a substantial amount of autosomal ancestry with the Yamnaya and were also linked to them by their possession of R-M269 Y-chromosomes. The impact of these derivative cultures in Europe leaves no doubt that they were linguistically Indo-European as most later Europeans were; the Corded Ware culture itself can also be tentatively linked via both autosomal ancestry and R-M417 Y-chromosomes with Indo-Iranian speakers via a long migratory route that included Fatyanovo and Sintashta intermediaries. 
A recent study proposed a much deeper origin of IA/IE languages to ~6000 BCE or about two millennia older than our reconstruction and the consensus of other linguistic studies. The technical reasons for these older dates will doubtlessly be debated by linguists. From the point of view of archaeogenetics, we point out that the post-3000 BCE genetic transformation of Europe by Corded Ware and Beaker cultures on the heels of the Yamnaya expansion is hard to reconcile with linguistic split times of European languages consistently >4000 BCE as no major pan-European archaeological or migratory phenomena that are tied to the postulated South Caucasus IA homeland ~6000 BCE can be discerned.
The Yamnaya culture stands as the unifying factor of all attested Indo-European languages. Yet, the homogeneity of the Yamnaya patrilineal community was formed out of the admixture of diverse ancestors, via proximal ancestors from the Dnipro and CLV clines. Yamnaya and Anatolians share ancestry from the CLV Cline, and thus, if the earliest IA language speakers shared any genetic ancestry at all—the possibility of an early transfer of language without admixture must not be discounted—then the CLV Cline is where this ancestry must have come from. On the Anatolian side, we see that ancestry from the southern Caucasus Neolithic end of the CLV Cline was impactful during the Chalcolithic and Bronze Ages and Bronze Age Central Anatolians over the time span of Hittite presence there also had traces of Lower Volga-related ancestry which implies an origin north of the Caucasus. On the steppe side, we see that mixed Lower Volga/Caucasus Neolithic ancestry was present in the Dnipro Cline and maximized in the Yamnaya population along that cline. IBD analysis identifies long (≥30cM) segments shared by Eneolithic individuals from Berezhnovka-2 in the Lower Volga with Khvalynsk, Igren-8 Serednii Stih, and Areni-1 Armenian Chalcolithic populations, providing strong direct evidence for the impact of Lower Volga ancestry on the Middle Volga, Dnipro, and South Caucasus regions, and active gene flow among these regions around the time the sampled individuals lived. The individual from Vonyucka-1 in the North Caucasus, in fact, has an IBD link (15.2cM) with an early Bronze Age Anatolian from Ovaören. Indo-Anatolian languages must have been spread widely by people carrying CLV cline ancestry >4000 BCE. 
However, only two descendant groups transmitted their languages to later groups: the Yamnaya in the Dnipro-Don area, aided by the mobility of their horse-wagon technology, and the Proto-Anatolians in the south, surviving in the diverse linguistic landscape of ancient Western Asia long enough for their languages to be recorded in writing after 2000 BCE. Whatever their deeper origins in time out of the diverse constituents of CLV cline populations, the Indo-Anatolians must have been part of that cline. Genetics has little to say whether within this cline the IA languages were first spoken in the Caucasus end of the cline and spread into the steppe along with the spread of Caucasus ancestry, or vice versa, or even if a linguistic unity uncoupled with ancestry existed within the CLV continuum. DNA has traced back the ancestors of both Anatolian and IE speakers to the part of the CLV Cline that was north of the Caucasus mountains, bringing them into proximity with each other and uncovering their common CLV ancestry. However, it cannot adjudicate, on its own, who among the proximate and diverse distal ancestors of the CLV people were Pre-IA speaking. Future studies of the dynamics and temporality of intra-CLV contacts (to which genetics may add its information) and of the cultures of CLV people (as reconstructed by archaeology and linguistics) may decide who among them were most likely to have been the “original” Indo-Anatolians. 

Linguistic evidence has been advanced in favor of different solutions of the Proto-IE origins problem for more than two centuries and we review some recent proposals relevant to our reconstruction of early IA/IE history.

First, the presence of some cereal terminology in IA languages and even more in IE was suggested to reflect a subsistence strategy that relied in part on agriculture; this was interpreted as providing evidence against a geographic origin of the populations that spread Indo-European languages east of the Dnipro valley, the easternmost point in which agriculture was used (along with foraging and herding) during the Eneolithic. Our genetic findings are consistent with this constraint. If a Caucasus Neolithic population like that at Aknashen spread IA languages to the north (via the CLV cline to the Dnipro-Don area) it would almost certainly have had a cereal vocabulary, and then this vocabulary would have been retained during the Serednii Stih culture of the Eneolithic down to the time of the Yamnaya as agriculture continued to be used there. 

Second, the fact that Anatolian languages are attested largely in western Anatolia has been interpreted as evidence for entry into Anatolia from the west (via the Balkans), and thus we need compelling genetic evidence to provide a strong synthetic case for an eastern route. In fact, however, our genetic data does provide such a strong case, greatly increasing the plausibility of scenarios of an eastern entry of Proto-Anatolian speaking ancestors into Anatolia. This is because we find that Central Anatolian Early Bronze Age people who were plausibly speakers of Anatolian languages based on their archaeological contexts, were striking genetic outliers from their neighbors due to having a minority component of their ancestry from the CLV (plausibly from the people who brought the ancestral form of Anatolian languages to Anatolia), the majority of their ancestry from Mesopotamian Neolithic farmers, and little or no ancestry from the Neolithic and Chalcolithic Anatolians who were overwhelmingly the source populations of other Early Bronze Age Anatolians. Mesopotamian Neolithic ancestry almost certainly had an eastern geographic distribution, while the Central Anatolian Bronze Age people had no evidence of the European farmer or European hunter-gatherer ancestry that CLV people would have encountered if they had migrated to Anatolia from the west, so the genetic data favor an eastern route. 
How then could it be that there is no linguistic evidence of Anatolian speakers in eastern Anatolia? 

We propose that the archaeologically momentous expansion of the Kura-Araxes archaeological culture in the Caucasus and eastern Anatolia after around 3000 BCE may have driven a wedge between steppe and West Asian speakers of IA languages, isolating them from each other and perhaps explaining their survival in western Anatolia into recorded history. That the expansion of the Kura-Araxes archaeological culture could have had a profound enough demographic impact to have pushed out Anatolian-speakers, is attested by genetic evidence showing that in Armenia, the spread of the Kura-Araxes culture was accompanied by the complete disappearance of CLV ancestry that had appeared there in the Chalcolithic. 
The Kura-Araxes culture may not be the only reason for the IA split. The ancestors of the Yamnaya did not only become separated from their Anatolian linguistic relatives but from other steppe populations as well. The homogenization of the Yamnaya ancestral population during the 4th millennium BCE, both in terms of its autosomal ancestry, and in terms of its Y-chromosome lineage, attests to a period of relative isolation and the cessation of admixture. Such isolation would foster linguistic divergence of the languages spoken in the pre-Yamnaya community from those of their linguistic relatives on the steppe. This isolation must have persisted even after the sudden appearance of the Yamnaya archaeological horizon. Mobility and geographical dispersal provided ample opportunities for the resumption of admixture, yet the genetic homogeneity of the “Core Yamnaya” across much of the steppe leaves little room for the absorption of any pre-existing steppe communities: they all seem to disappear in the face of the Yamnaya juggernaut. Did mixing occur between the segment of the Yamnaya population not buried in kurgans and locals they encountered while the kurgan-buried elite largely avoided it with some exceptions? 

The rise of the Yamnaya in the Steppe at the expense of their predecessors was followed by their demise after a thousand years, displaced by descendants of people of the Corded Ware culture. Was this the demise of the kurgan elites of the Yamnaya or of the population as a whole? 

The steppe was dominated by many and diverse groups later still, such as the Scythians and Sarmatian nomads of the Iron Age. These groups are certainly very diverse genetically, but their kurgans scattered across the steppe attest to the persistence of at least some elements of culture that began in the Caucasus-Volga area seven thousand years ago before blooming, in the Dnipro-Don area, into the Yamnaya culture that first united the steppe and impacted most of Eurasia. To what symbolic purpose did the Yamnaya and their precursors erect these mounds we may not ever fully know. If they aimed to preserve the memory of those buried under them, they did achieve their goal, as the kurgans, dotting the landscape of the Eurasian steppe, drew generations of archaeologists and anthropologists to their study, and enabled the genetic reconstruction of their makers’ origins presented here.


Guy said...

Looking at the real meat (Part B of the Supplemental data) I think some strong arguments are made and this paper really summarizes and advances what is known and can reasonably be inferred.

andrew said...

@Guy I'll take a look.

I'm particularly sensitive to the risk of misclassifying the Copper Age expansion of the "Hattic" people as an Indo-European linguistic or cultural expansion. They were a non-Indo-European people from the Eastern Highlands of West Asia, may have had trace CLV autosomal ancestry, and immediately preceded the Hittites in Anatolia. That misclassification makes it seem that the Anatolian language speakers' expansion (which I think was closer to 2000 BCE) happened sooner than it did.

andrew said...

@Guy My concern is related to my long standing concern about the reason that Anatolian languages are greatly diverged from other Indo-European languages. The majority view among linguists is that this is due to a great time depth of the division between the two. My view is that this is due to the outsized effect of language contact in this case with an only modestly less advanced society with a substrate influence that is very different from the substrate influences experienced by other IE languages.

Guy said...

Hi Andrew, With the major finding of a steppe component in BA Central Anatolia and a plausible explanation (KA expansion) for the dilution of CLV/SS ancestry in the South Caucasus I think the field of solutions has narrowed significantly. If one favors Hypothesis A-West or A-East over B, then it's time to sharpen your pencil and do some model building... After writing that I realize the paper changed my default to B from A-East. C is a bit of a cop-out, but... I suppose proto-IA as a koine is a theory. (That can never be disproven, so maybe not a theory in some systems.)