Some Southern Central Asians Have About 50% Steppe Ancestry

Davidski at Eurogenes discusses some Southern Central Asian populations in a recent post. This post expands on his rather raw genetic analysis to put his conclusions in historical, anthropological, and linguistic context.

Pamiri children via Wikipedia. There are about 350,000 Pamiris in their region, 
who are mostly Shia Muslims.

A map showing where Pamiri people live, from the same Wikipedia page. 
This region, called Badakhshan, is divided among Tajikistan in the north, 
Afghanistan and Pakistan in the south, and China's Xinjiang to the east.

Pamiri Tajik Genetics

Davidski demonstrates with convincing autosomal and uniparental genetic evidence that 50% or more of the Pamiri Tajik gene pool is derived from the Pontic-Caspian steppe in Europe (i.e. basically modern Ukraine), with the balance mostly from Iranian farmers and about 8% directly from East Asians. This is consistent with their phenotype, but contrary to the oral history of the origins of their people.
Several South Central Asian populations have a reputation for producing individuals who look surprisingly European, even the lighter shade sort of European from Eastern and Northern Europe. This is especially true of the Pamiri Tajiks, and it's unlikely to be a coincidence, because these people probably do harbor a lot of ancient Eastern European ancestry. 
My own estimates, using various ancestry modeling methods, suggest that Pamiri Tajiks derive ~50% of their genome-wide genetic ancestry from populations closely related to, and probably derived from, Eneolithic/Early Bronze Age pastoralists from the Pontic-Caspian steppe of Eastern Europe, such as the Sredny Stog and Yamnaya peoples.
In his autosomal genetic model he is "using the mostly Yamnaya-derived Iron Age Sarmatians from Pokrovka, Russia, in far Eastern Europe, to illustrate the point. Note that Sarmatians were East Iranic-speakers, which is what Tajiks are."
Many South Central Asian groups, and especially Indo-European-speakers, like the Tajiks, show moderate to high frequencies of two Y-chromosome haplogroups typical of Bronze Age Eastern Europeans: R1a-M417 and R1b-M269. This is old news to the regular visitors here and its implications are obvious, so if you still think that these haplogroups expanded from South Central Asia to Eastern Europe, rather than the other way around, then please update yourself (for some pointers, see here and here). 
And now, courtesy of Peng et al. 2017, we also have a much better understanding of ancient European maternal input into the gene pool of Pamiri groups (see here). The paper doesn't specifically cover the topic of European admixture in South Central Asia, but it nevertheless demonstrates it unequivocally.
In Peng et al. (2017), many mtDNA lineages are: 
shared between Europeans and Central and South Asians; quite a few of these lineages are rooted in Eastern Europe, as shown by both modern-day and ancient DNA, so they strongly imply gene flow, and indeed considerable maternal gene flow, from Eastern Europe deep into Asia. Worthy of note are the lineages belonging to such relatively young (likely post-Neolithic) haplogroups as U5a1a1, U5a1d2b, U5a2a1, and U5b2a1, all of which have already been found in ancient remains from the Pontic-Caspian steppe.
The History of the Pamiri People

The Badakhshan region has had a Silk Road trade-based economy, attested by archaeological evidence dating to before 4000 BCE. 

The legendary origin story of the Pamiri people, or at least of the homegrown dynasty of rulers of Badakhshan, traces the dynasty's origins to the reign of Alexander the Great in the 4th century BCE, although no linguistic, genetic, or cultural evidence supports this legendary Iron Age Hellenic history. Still, it isn't impossible, because the eastern part of Alexander the Great's empire was a conquered Persian population, and the conquest may not have involved many Greek and Anatolian migrants below the highest levels of the ruling class, apart from soldiers who did not relocate permanently. It is one of several historical instances in which conquerors adopted, to a great extent, the sophisticated culture of the land they conquered.

The ethnically Persian local poet and religious scholar Nasir Khusraw led the conversion of the ancestors of the modern Pamiri people to Shia Islam in the 11th century CE.

The kingdom of Badakhshan is historically well attested and flourished from the 15th century CE to 1641 CE, and is mentioned in the Marco Polo story (whose historical accuracy is admittedly sometimes questionable). For the next two centuries, until 1839 CE, control of the region alternated between conquering outsider Emirs and Khanates and homegrown princes. After that, British, Russian, and Chinese colonial powers had their way with various parts of the region in succession, and not always peacefully. Today, the region is split between Afghanistan and Pakistan (the successors to the British colonial rulers), Tajikistan (the successor to the Russian and then Soviet colonial rulers), and an autonomous region on the far western frontier of China.


At a minimum, Alexander the Great's Iron Age empire may have been one important source of the Iranian Neolithic ancestry of the Pamiri people (ancestry that itself carries significant Indo-Iranian steppe admixture dating to sometime between ca. 2500 BCE and 1500 BCE or so), but it was almost surely not the primary source of the substantial and distinctively steppe ancestry found in the Pamiri people relative to Iron Age Iranians. The Pamiri also apparently show little South Asian proper genetic admixture, either in their whole genomes or in their uniparental haplogroups.

This suggests that in the Bronze Age, East Iranian steppe people may have mostly replaced the previous population, which had traded there for 1500 to 2000 years or so by the time these Indo-Europeans arrived.

Part of what makes the ancestry of the Pamiris interesting is that they have not just autosomal genetic and mtDNA similarity to the Yamnaya people of Ukraine, the way the Corded Ware people did, but also significant amounts of Y-DNA R1b-M269, associated with the Southern Steppe, in addition to Y-DNA R1a-M417, associated with the Northern Steppe (which was the predominant Y-DNA haplogroup among both Corded Ware people and people with Indo-Aryan ancestry in South Asia).

Indeed, the fact that the two Y-DNA haplogroups are mixed here, unlike in nearby regions adjacent to the steppe, suggests a plausible two-step migration, with Y-DNA R1b Yamnaya people forming the first wave that cleared away older populations of the region, and Indo-Iranians (who were predominantly Y-DNA R1a) forming a second wave, perhaps not that many centuries afterwards, who gave the Pamiris their East Iranian language, Y-DNA R1a, and an additional dollop of steppe autosomal genetics and steppe mtDNA.

The Y-DNA R1b Yamnaya people, while similar in autosomal genetics and mtDNA to steppe people further north with mostly Y-DNA R1a (whom we can characterize much better linguistically), are almost surely ancestral to Western and Northern European populations where Y-DNA R1b-M269 is predominant. But the story of their long westward journey, and of the near disappearance of Y-DNA R1b-M269 men from the steppe they originally inhabited before the Indo-Iranian and Corded Ware expansions (demonstrated by a stark transition in Y-DNA types in ancient DNA from their own homeland), is much more of a mystery.

The linguistic connections between their language(s) and the languages spoken in the parts of Europe and Asia where their descendants are now most common are also more obscure. It would be plausible for them to have been linguistically Indo-European, but it isn't a foregone conclusion either, because anthropological and linguistic evidence suggests that the Celtic languages, which are the oldest attested languages of the regions where their descendants settled, have far too little time depth (1300 BCE or so at most) to have arrived when the descendants of the Yamnaya people arrived in Western and Northern Europe (ca. 2500 BCE), a dozen centuries earlier.

The Pamiri people are testimony to the fact that the collapsing Yamnaya people had spread to the East on the Silk Road as well as to the West, before or in connection with their collapse.

We don't know if the Y-DNA R1b in the Pamiri is a relict of a formerly greater Yamnaya range or if it arrived as part of an exodus from Ukraine. But the fact that they speak an Indo-Iranian language associated with Y-DNA R1a steppe people denies us much insight into the linguistic character of the Y-DNA R1b steppe people who preceded them, unless substrate influences can be discerned in their East Iranian languages that are not shared by predominantly Y-DNA R1a Indo-Iranian populations.

It would be particularly notable, as we try to reconstruct Yamnaya linguistics, if substrate influences found in the Pamiri languages were also shared by Indo-European languages in the Balkans, where there is also a mix of Y-DNA R1a and R1b, and by Celtic languages, but were absent in other Balto-Slavic languages and other Indo-Iranian languages. But those are very fragile ties to establish, and doing so is more linguistic art than linguistic science. Alas, in the case of the Pamiris, their historic role as Silk Road traders makes it particularly hard to distinguish substrate influences in their languages from loan words picked up from travelers and fellow Silk Road merchants.


Could any of this just be modern Russian admixture? The Russians have been all over this region for 200 years or so. It was part of the Russian Empire before being incorporated into the Soviet Union.

We already know that Scythians and Sarmatians had Yamnaya-related ancestry.

Yes. Davidski has in general been peddling this inaccurate and "ideologically motivated" view, using non-factual terminology like ANE ancestry, which is totally repudiated. Underhill has already identified the probable source of R1a* between eastern Anatolia and northern Iran, and R1b unfortunately suffers the same fate of Asiatic origin given much older haplogroups in West Asia. Even more so, scholars have recently posited that Yamnaya genetics does in fact derive from southward Asiatic groups from Iran and Central Asia. Perhaps recent population history as proposed by Saiden comes from steppe groups like Late Iron Age Scythians, but as a whole R1a haplogroups are more diverse in Asia - R1a1 Z282 being found in Arabia, Central Asia and Armenia of all places - and R1b clearly originated in Asia before moving into Europe. Davidski simply confuses the genetic influx that resulted from recent historical migrations from the steppe with the older basal layers that have an Asiatic origin in his non-rigorous layman analyses.

Meanwhile, outside of Hindunationaliststan, the oldest R1a samples have been found nowhere near India or Iran:

The upcoming South Asian papers aren't going to confirm your Hindu Nationalist Revisionism. Are you going to riot over it?

I do not have nearly the tolerance that Davidski does for personal attacks back and forth in the comments.