Tuesday, June 13, 2017

Two Waves Of Steppe Migration?

Davidski at his Eurogenes blog, buried in a lot of technical jargon and software discussion, makes a pretty notable observation.

He observes that the steppe component of the autosomal genetics of some South Asian populations shows a greater affinity to ancient Yamnaya genomes, which he attributes to an earlier Indo-Aryan wave of migration (speaking an early dialect of Sanskrit), while the steppe component of some linguistically Iranian (a.k.a. Iranic) populations (the oldest attested ancestral versions of whose languages are Avestan a.k.a. Zend, and Old Persian) shows greater affinity to ancient Andronovo genomes, which he attributes to a later Iranian wave of migration.

Some quotes from his post (quotation updated in this post on June 14, 2017 as indicated, all emphasis mine):
My main model is also a decent statistical fit for at least a number Indian groups, like, for instance one of the Gujarati subpopulations labeled GujaratiD in the Human Origins dataset. But it fails marginally for Pathans, so it's not a robust solution for all of South Asia. Incredibly, using Andronovo instead of Yamnaya in the Pathan model makes it work. Tajiks can also be modeled in this way using Andronovo. I say incredibly, because Pathans and Tajiks are obviously Iranic speakers, and their Iranic ancestors in all likelihood arrived in South Asia from the Eurasian steppe much later than the Indo-Aryan ancestors of the Kalash and most Indians. . . .
So what we might be seeing here is substructure within the steppe-related admixture amongst South Asians, with Indo-Aryan speakers apparently showing Yamnaya-related (Catacomb?) ancestry, and Iranic speakers, as well as possibly groups with significant Iranic ancestry, showing a preference for later Andronovo-related ancestry.
Update 14/06/2017: I've now had the chance to test many more Indo-Aryan and Iranic groups with my model. Most of these groups show a slight, non-significant, preference for Yamnaya_Samara as the steppe reference population. However, those that show a slight, and again non-significant, preference for Andronovo are usually Iranic, such as the Balochi in the graphs below. I'm not claiming that this proves anything, but I do think that it hints at something, and I'll try testing a few different hypothesis in the near future[.] 
In fairness to Davidski, should this not pan out, he has made clear that this analysis is preliminary and is among his first uses of a new software tool with which he is not yet fully familiar. Also, as an editorial note, if I recall correctly, the GujaratiD population consists of ethnically Gujarati immigrants from India in Houston, Texas. (Similarly, one of the major reference sets of Chinese genomes, CHD, consists of Han Chinese immigrants from Denver, and one of the major reference sets for Northern Europeans, CEU, consists of white people from Utah.)

This is very plausible in historical context. It would also explain why it is hard to develop a well-fitting model that involves just a single wave of steppe migration to South and West Asia, which has been pretty much the default assumption so far.

ANI v. ASI Puzzles That May Or May Not Be Related

It isn't clear if this has anything to do with an analysis of contemporary South Asian genomes using linkage disequilibrium dating (a methodology that is heavily biased towards the most recent date of any admixture). That analysis appears to reveal that places in North India, which should have encountered Indo-Aryans first, show the youngest dates of most recent Ancestral North Indian (ANI) admixture, suggesting that these areas experienced two waves of ANI admixture, rather than one. But it is quite possible that this is unrelated and that the ANI in both waves of admixture was basically the same genetic population.
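The bias of linkage disequilibrium dating towards the most recent admixture can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions, not the method used in the analysis mentioned above: admixture LD from a single pulse n generations ago is taken to decay as exp(-n * d) with genetic distance d in Morgans, two pulses are modeled as an equal-weight sum of two such curves, and the date is estimated with a plain log-linear fit. The function name, the pulse dates (150 and 50 generations), and the distance range are all made up for illustration.

```python
import numpy as np

def fitted_generations(dist_morgans, ld):
    """Estimate one admixture date (in generations) from an LD decay curve.

    Assumes ld ~ A * exp(-n * d), so ln(ld) is linear in d with slope -n.
    """
    slope, _intercept = np.polyfit(dist_morgans, np.log(ld), 1)
    return -slope

# Genetic distances from 0.5 cM to 20 cM, expressed in Morgans (illustrative range).
d = np.linspace(0.005, 0.20, 40)

# Sanity check: a single pulse 80 generations ago is recovered by the fit.
est_single = fitted_generations(d, np.exp(-80.0 * d))

# Two hypothetical pulses, 150 and 50 generations ago, contributing equally.
two_pulse = 0.5 * np.exp(-150.0 * d) + 0.5 * np.exp(-50.0 * d)
est_two = fitted_generations(d, two_pulse)

print(f"single pulse estimate: {est_single:.1f} generations")
print(f"two pulse estimate:    {est_two:.1f} generations")
```

In this toy setup the older pulse's signal decays away quickly with distance, so the single-date estimate for the two-pulse curve lands well below the midpoint of the two true dates, closer to the younger one. That is the sense in which a second, later wave can make a region look as if it was admixed more recently than its neighbors.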

I should also note that while ASI (Ancestral South Indian) genetics are probably autochthonous and from a single source, the ANI component is probably composed of multiple layers: a Paleolithic North Indian layer with a clinal relationship to ASI (possibly pre-dating the Last Glacial Maximum ca. 18,000 BCE, given the affinity of ASI to Onge DNA); a Harappan layer from West Asian Neolithic migrants (ca. 7000 BCE given the date of the Indus River Valley Neolithic), possibly including additional migration, for example, in connection with an Uruk expansion from Mesopotamia that occurs around the time that Harappan civilization, with which it had trade ties, emerges; and at least one subsequent layer of Indo-Aryan steppe migrant contributions (ca. 2000 BCE to 1500 BCE), but quite possibly two layers.

Uruk Expansion

Razib Khan has been dropping hints in several posts that he is interested in the explanatory possibilities of the Uruk expansion from Mesopotamia ca. 3600 BCE, which could be an important genetic and cultural source for the people in the nearby highlands of Anatolia, the Caucasus and West Asia, and ultimately perhaps also for cultures influenced secondarily by those cultures, such as the Minoans and maybe even the Yamnaya people.

I agree, and think that this could be at the root of the ergative languages of this region and perhaps also Basque, although that root is perhaps too deep in time to be discerned definitively given the quality of the available linguistic data.

About Kashmiri

Another interesting aside, from a post at the Brown Pundits blog, is that the Kashmiri language is closer in many respects to Sanskrit than other Indo-Aryan languages are.

