Ancient DNA from the Carpathian Basin (5700 BCE to 3900 BCE)

A new study reviews ancient DNA results from six successive archaeological cultures from the terminal culture of pre-Neolithic hunter-gatherers to the middle Neolithic era in Western Hungary.  I was slightly influenced by it when writing my recent R1b post, but it deserves separate mention.

Bernard's blog provides the source data on mtDNA haplogroups, Y-DNA, the geographic context of the cultures, and PCA analysis based upon haplogroup frequencies.

While the results aren't paradigm busting, a few observations about the data are in order:

The Ancient DNA Sample

* There are a remarkable 336 ancient DNA samples in the study from which at least mtDNA results are obtained.  The data I've seen from the study do not discuss autosomal DNA, although it must be present for at least the samples from which Y-DNA haplogroups were obtained.

The Cultures

* There are six Neolithic cultures studied.  The earliest is Starcevo, which precedes the Vinca culture in essentially the same territory.  The Vinca culture, LBK culture, Sopot culture and Lengyel culture are neighbors geographically and essentially contemporaneous with each other.  The Vinca is the most Southeastern of the four contemporaneous cultures.  The Lasinja culture post-dates the Vinca, LBK and Lengyel cultures and is contemporaneous with the late Sopot culture.  The Sopot culture is in a small area basically due north of the Vinca culture.

* Starcevo may be a first wave Neolithic culture, with the Vinca culture as the culture that follows after the first major farming bubble bust in the Neolithic era with a resurgence of hunter-gatherer DNA.  Alternatively, the resurgence of hunter-gatherer DNA in Vinca samples may be due to the "economic stresses caused by decreasing soil fertility" after two millennia of intensive farming that this culture itself experienced in its declining years.  But, the Neolithic cultures further north and west don't experience the same resurgence of hunter-gatherer population genetics.

* Historically, there has been debate in anthropology between those who see Vinca as a local evolution of Starcevo, and those who see it as a result of migration from Anatolia.  The genetic data seem to favor the former scenario, but suggest that the LBK, Sopot and Lengyel cultures may involve more new migration from Anatolia or elsewhere.

* The Vinca culture is notable as one of the earliest in Europe in which proto-writing and commercial seals, similar to those found in the Harappan civilization, appear.  It pre-dates the use of the Indus Script of the Harappan civilization by more than 700 years (seals were also used in the Copper and Bronze Age civilizations of Anatolia, Sumer, Assyria and Egypt).  Per Wikipedia on Vinca culture:
The Vinča culture occupied a region of Southeastern Europe (i.e. the Balkans) corresponding mainly to modern-day Serbia, and Kosovo, but also parts of Romania, Bulgaria, Bosnia, Montenegro, Macedonia, and Greece. This region had already been settled by farming societies of the First Temperate Neolithic, but during the Vinča period sustained population growth led to an unprecedented level of settlement size and density along with the population of areas that were bypassed by earlier settlers. Vinča settlements were considerably larger than any other contemporary European culture, in some instances surpassing the cities of the Aegean and early Near Eastern Bronze Age a millennium later. One of the largest sites was Vinča-Belo Brdo, it covered 29 hectares and had up to 2,500 people.

* All 19 of the hunter-gatherer samples have one of five different mtDNA U haplogroups (counting 2 mtDNA U not further specified due to degradation of the sample as a separate category).

* All of the Neolithic samples have substantial mtDNA diversity (at least six mtDNA top level haplogroups represented).  Between them, the Neolithic samples include mtDNA haplogroups not found in the hunter-gatherer sample, including H, HV, V, J, K, N1a, T1, T2, U3, U8, W and X.  This tends to suggest that at least the maternal origins of the first wave Neolithic peoples of Europe were part of a culture that fused populations that were distinct before the Neolithic revolution into a melting pot whole.  As I note below, there was far less Y-DNA diversity, a fact that casts some doubt on the anthropological conception of the archaeological cultures of "Old Europe" as matrilineal.

* A quite significant share of the Vinca samples are also mtDNA U (about 22.5%), but it is quite rare in the other Neolithic samples.  This is a resurgence in mtDNA U frequencies from the previous Starcevo culture.

* There is a significant mtDNA mix difference between Starcevo and Vinca on one hand, and the other Neolithic cultures on the other, including a significantly higher frequency of mtDNA H and a significantly lower frequency of mtDNA U in the other cultures.  The Starcevo culture has only 6.9% mtDNA H and the subsequent Vinca culture has even less, but the levels of mtDNA H are much higher in the other Neolithic cultures in the sample (LBK, Sopot, Lengyel and Lasinja), which have percentages of mtDNA H from 16.67% to 28.54%, still lower than modern levels, but much higher than Starcevo and Vinca.


* There is no hunter-gatherer ancient Y-DNA in the sample.  Elsewhere in Europe, ancient hunter-gatherer Y-DNA has almost universally been haplogroup I1.

* The Neolithic samples (pre-Bronze Age) include Y-DNA G2a, F*, I2, E-M78, J2 and surprisingly C.  There is also one sample of Y-DNA I1 from the LBK culture (possibly representing a hunter-gatherer male integrated into farmer society at some point).  There are Y-DNA samples from all six of the Neolithic cultures studied.

It bears noting that this is not a "Near Eastern" mix, a description often ascribed to the genetics of first farmers in Europe.  Y-DNA G2a, F* and J2 are what we would expect from a source in Anatolia or the Caucasus Mountains or the highlands of Iran, not the Levant or Arabia or even the lowlands of Mesopotamia (although J2 would surely be found in Mesopotamia in significant proportions).  Y-DNA I2 suggests Paleolithic variation in European hunter-gatherer populations that happened to crop up in the Balkans (and later re-emerged as a post-Nordic Neolithic revolution Scandinavian Y-DNA clade).  Y-DNA E-M78 is the only Y-DNA clade in the first farmer population of Europe that evokes Near Eastern origins and this makes up only about 6.67% of the Neolithic ancient Y-DNA population sample in this study and could have been picked up by the populations where it is found en route from their place of origin to the places that they settled in Europe in the Neolithic era.

The first European farmers probably emerged from the highlands that form the Southern boundaries of Europe and West Asia, rather than from what we would conventionally think of as the "Near East" proper.

* The Starcevo and Vinca cultures combined have 4 examples of Y-DNA G2a, 3 of F*, and one of I2a1.  The absence of Y-DNA E-M78 is notable, because this is present at quite high levels today in the part of Europe where the Vinca culture was present.  Instead, the two E-M78 samples in the study come from further north, where E-M78 is found only in moderate frequencies today.

* But, one shouldn't read too much into a sample size of just eight men, when other evidence suggests that there is more to the story.  If the N=8 sample were random and independent (which it isn't), a Neolithic era proportion of Y-DNA E-M78 of about 33% (the peak modern European percentage, found in Albania) would be just barely ruled out at the 95% confidence level, while moderately lower modern proportions of E-M78 in the Starcevo and Vinca regions are not excluded in a statistically significant way.  A Neolithic proportion of E-M78 of 17%, for example, would be perfectly consistent with a failure to find any examples of it in a sample of N=8 which is random and independent (which these samples are not).
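The arithmetic behind that claim can be sketched in a few lines of Python.  This is only an illustration under the same idealized assumption of random, independent sampling that the paragraph above notes does not really hold for ancient DNA; the function name `prob_zero_hits` is my own and does not come from the study.

```python
def prob_zero_hits(p: float, n: int) -> float:
    """Chance of drawing no carriers of a haplogroup in n independent
    samples when its true population frequency is p."""
    return (1.0 - p) ** n


# Frequencies discussed above: 33% (modern Albanian peak) and 17%.
for p in (0.33, 0.17):
    chance = prob_zero_hits(p, 8)
    verdict = "excluded" if chance < 0.05 else "not excluded"
    print(f"true frequency {p:.0%}: P(0 of 8) = {chance:.3f} ({verdict} at 95%)")
```

A frequency is "ruled out at the 95% confidence level" here when the chance of seeing zero carriers falls below 5%.  At 33% that chance is a little over 4%, and at 17% it is well above 20%.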

The Starcevo and Vinca cultures are in just the right place at just the right time to be the source of the high levels of Y-DNA E-M78 in the region in modern Europe (and for that matter to be the place where E-M78 introgressed into other early Neolithic cultures' gene pools).  The explosion in the region's population relative to the Starcevo culture upon the appearance of the Vinca culture's much more successful agricultural society also creates a moment in time where founder effects might cause a moderate underlying level of E-M78 (and in particular, E-V13) to be dramatically amplified relative to the Starcevo background in a way that would leave a permanent imprint on the region's population genetics.

Everything we know about Bronze Age migration into the region disfavors that migration period as a major source of E-M78 in this region.  So, one is left with a choice between a Neolithic source and an Iron Age or later source for this part of the Y-DNA gene pool in the Balkans.  In the latter scenario, perhaps E-V13 arrived in the Balkans during the time period when it was part of the Ottoman empire, since some accounts assert that there was significant population replacement and mass resettlement of Ottoman populations into its borderland.  But, given the antiquity of E-M78 in Europe (e.g. in a 7000 year old Spanish individual), the Neolithic possibility still seems more likely, despite the lack of Hungarian ancient DNA to support this hypothesis in this small sample.  The Neolithic theory also has support from the time depth of the E-V13 subclade of E-M78, the predominant Y-DNA E clade in the Balkans, which is about 8,100 years, right around the time of the Neolithic revolution and the Vinca population explosion there.

UPDATE (July 2, 2015): Wikipedia recalls a literary reference that could be relevant to the question of Y-DNA E in the Balkans:
In Aeschylus's play, The Suppliants, the Danaids fleeing from Egypt seek asylum from King Pelasgus of Argos, which he says is on the Strymon including Perrhaebia in the north, the Thessalian Dodona and the slopes of the Pindus mountains on the west and the shores of the sea on the east; that is, a territory including but somewhat larger than classical Pelasgiotis. The southern boundary is not mentioned; however, Apis is said to have come to Argos from Naupactus "across" (peras), implying that Argos includes all of east Greece from the north of Thessaly to the Peloponnesian Argos, where the Danaids are probably to be conceived as having landed. He claims to rule the Pelasgians and to be the "child of Palaichthon (or 'ancient earth') whom the earth brought forth".

The Danaids call the country the "Apian hills" and claim that it understands the karbana audan (accusative case, and in the Dorian dialect), which many translate as "barbarian speech" but Karba (where the Karbanoi live) is in fact a non-Greek word. They claim to descend from ancestors in ancient Argos even though they are of a "dark race" (melanthes ... genos). Pelasgus admits that the land was once called Apia but compares them to the women of Libya and Egypt and wants to know how they can be from Argos on which they cite descent from Io.

In a lost play by Aeschylus, Danaan Women, he defines the original homeland of the Pelasgians as the region around Mycenae.
The original legend is also set out more fully here, where it is notable that Pliny the Elder's "Natural History" is another source for the legend, and it is conjectured that:
Even a cautious reading of the subtext as a vehicle for legendary history suggests that a Pelasgian kingship in archaic Argos was overcome, not without violence, by seafarers out of Egypt (compare the Sea Peoples), whose leaders then intermarried with the local dynasty. The descendants of Danaus' "blameless" daughter Hypermnestra, through Danaë, led to Perseus, founder of Mycenae, thus suggesting that Argos had a claim to be the "mother city" of Mycenae.
Naively, this story suggests that, rather than being European hunter-gatherers who were assimilated into the first wave of Neolithic farmers in the Balkans, the Pelasgians may have been the first wave Neolithic farmers in the Balkans (who probably arrived ca. 7000 BCE-6000 BCE), and that a Y-DNA E-M78 people called the Danaids arrived in a post-Neolithic folk migration from Egypt or Libya after the Pelasgians were established, but before the arrival of the Indo-European Greeks (who probably started to appear around 2000 BCE).  The fact that this migration was not lost to the tradition when Aeschylus was writing in classical Iron Age Greece, and that Egypt may have been a meaningful polity at the time, suggests that the Danaids may have arrived in the Eneolithic era (i.e. Copper Age) ca. 4000 BCE - 2000 BCE in this part of Europe.

The comparison to the Sea Peoples (who are associated with Bronze Age collapse ca. 1200 BCE and some of whom were ethnically Indo-European Greek), however, poses the intriguing possibility that there may have been a similar phenomenon in the wake of the similar empire-felling climate event around 2000 BCE, placing the Danaids in a very specific time and place, and having them arrive on the scene very shortly before the arrival of the Mycenaeans, whose language became Ancient Greek, just as in the narrative.

Obviously, legendary history alone shouldn't be taken at face value, but Aeschylus gives us a general time frame and a very specific geographical pointer regarding where to look for archaeological evidence that the Danaid migration ever happened.  One could point to a pyramid-like structure in Argos, whose age and interpretation are disputed, as a sign of an Egyptian connection to Argos through the Danaids. END UPDATE.

* Y-DNA R1b does not appear until the Bronze Age, where there are 2 Y-DNA R1b samples and there is one Y-DNA I1 sample.  The study does not appear to include Bronze Age ancient mtDNA samples.  Naively, I would have expected Y-DNA R1a rather than R1b in the Bronze Age this far to the east in Europe.  Could it be that R1b made its way to Western Europe via the Balkans and then across the Southern part of Europe to Iberia, perhaps originating in the Yamnaya people?

As Maju has noted at this blog and in comments on one of my recent posts, it isn't sufficient to look merely at crude top level Y-DNA clades to resolve these questions, because what may look the same at a gross level may imply an impossible phylogeny when more detailed Y-DNA clade descriptions are considered.  We are looking for a trail of parent to daughter clades from East to West until we get to the root of Western European Y-DNA R1b clades.

As noted above, small samples can be misleading, but the total absence of any Y-DNA R in any of the 30 pre-Bronze Age samples from the region, while it may not rule out the possibility that there was some Y-DNA R in the region during the Neolithic, does seriously disfavor the possibility that Y-DNA R was present at anything approaching its modern frequency in the region during the Neolithic era.  Modern populations in this region have from 20%-60% Y-DNA R1, with significant amounts of both R1a and R1b in almost all of these populations.

If the N=30 sample were random and independent (which it isn't), a proportion of R1 in the Neolithic era of more than about 10% is disfavored at the 95% confidence level, and the expected proportion of R1 would be not more than about 5% given the data in the sample that we do have and ignoring our Bayesian priors from our wider knowledge of ancient DNA.  The nature of sample size statistics is that even the slight improvement from N=8 to N=30 dramatically improves statistical power, even though this improvement shows diminishing returns as N increases (to the point where, for example, the difference in statistical power between a sample size of 800 and 3000 is quite modest).
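To make the sample size point concrete, here is a rough sketch of the largest true frequency that remains compatible, at the 95% level, with finding zero carriers in N samples.  It solves (1 - p)^N = 0.05 for p, again assuming an idealized random and independent sample; the helper name `upper_bound_zero_hits` is mine, not the study's.

```python
def upper_bound_zero_hits(n: int, alpha: float = 0.05) -> float:
    """Largest true frequency p for which seeing zero carriers in n
    independent samples still has probability of at least alpha.
    Solves (1 - p) ** n = alpha for p."""
    return 1.0 - alpha ** (1.0 / n)


# Sample sizes mentioned in the text above.
for n in (8, 30, 800, 3000):
    bound = upper_bound_zero_hits(n)
    print(f"N={n}: frequencies above about {bound:.2%} are disfavored")
```

Going from N=8 to N=30 pulls the bound down from roughly 31% to roughly 9.5%, while going from N=800 to N=3000 only moves it by a fraction of a percentage point, which is the diminishing returns pattern described above.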

Also, knowing what I know now about the fact that Slavic peoples made their way into the Balkans from the North and had a significant demic component (unlike many other migration period "barbarians"), it isn't implausible that at least some of the Y-DNA R1a in the Balkan region today arrived as late as the Iron Age from Slavic migration (which would have been predominantly Y-DNA R1a relative to Y-DNA R1b), and that the ratio of R1b to R1a in the Balkan region was higher in the Bronze Age than it is today.  And, the blending proportions of R1b with R1a in the Balkans, unlike in many other parts of Europe, are not a good fit to the historical boundary between the Bell Beaker cultural area and the Corded Ware cultural area that succinctly describes the relative proportion of these Y-DNA types in most of Europe.  So there is a window in time in the Bronze Age during which R1b could plausibly have been dominant in a population that migrated through Western Hungary on to Western Europe.

