Thursday, October 27, 2016

mtDNA U3a1

A blog post by Maju calls attention to the interesting mtDNA clade, U3a1, an example of which turned up in ancient Basque mtDNA from ca. 1300 BCE, as part of a quite small sample of ancient mtDNA from Basque Country.

A page on mtDNA U3 from Family Tree DNA reveals this about the clade (emphasis added):
mtDNA haplogroup U3 is present in low percentages throughout Europe and Western Asia. It is an ancient haplogroup arising over 30,000 year ago from the very old haplogroup U. It rises to its greatest frequencies in the Near East and Southern Caucasus, that is the mountainous area of Western Iran, Georgia, Armenia, Azerbaijan, Turkey, Syria, Jordan and Iraq, where the percentages vary between 4 and 8 percent. Currently U3 can be divided into three subclades, U3a, U3b and U3c. The latter is a subclade of the original U3ac and split off from U3a 1000's of years ago. All three subclades occur in the above mentioned areas with the exception of Turkey and Armenia, where U3c appears to be absent and U3a is very rare, those countries being dominated by U3b. 
In Europe U3 is still common in Bulgaria and the eastern most islands of the Mediterranean Sea, Cyprus, Rhodes, Crete, where percentages rise to 3 or 4 percent, but becomes rarer and rarer as one moves west with one exception. Again Bulgaria, the Greek mainland and Etruscan Italy are dominated by U3b, whereas the Mediterranean Islands and the rest of Italy have all three subclades. 
The one exception is one sub-branch of subclade U3a called U3a1 which appears to have originated in Europe. At least at this point no instance of this clade has been observed in the Mid-East. This sub-branch dominates U3 in Western Europe especially along the Atlantic coast making up over 60% of U3 in these areas with frequencies rising to as high as 1% of the total population in Scotland and Wales and as high as 3 or 4% in Iceland. Also well over half of this sub-subclade is made up of one version of U3a1 called U3a1c, with a change at 16356, and which accounts for most of the distribution along the coastline from Norway to Northern Portugal. 
U3 also occurs along the North African coast which borders the Mediterranean. The subclade U3b dominates the eastern countries including Egypt and Ethiopia, whereas both U3a (in its older form found in the Near East) and U3b occur in the Berber occupied areas from Libya through Morocco, where again the percentage rises to as high as 1% of the total population. 
There are isolated pockets in the Near East where U3 occurs at a very high percentage of the population; U3 makes up 16% of the Adegei in the Northern Caucasus, about 18% of Iraqi Jews around Baghdad, 39% of Jordaneans of the Dead Sea Valley, 11% of the Qashqai in Southwest Iran (note these people speak a dialect closely related to Azerbaijani of the Caucasus) 17% in one study of Luri in the Western Zagros, 12% on the Greek island of Rhodes and also among the Romani (Gypsies) of Poland, Lithuania and Spain where percentages vary from nearly 40% to as high as 55%. 
Most of these groups show little variance with one or two sub-haplotypes dominating and appear to be due to founder effects and genetic drift in a small population. This is especially true of the Adegei and the Romani. Again the one exception is the Western Iranian mountains (Zagros) where both the Luri and the Qashqai show not only high percentages but also a high degree of variance including both U3a and U3b with U3c occurring nearby. . . .  
Although the distinction is largely made in the coding region, the subclades U3a, U3b and U3c can usually be distinguished in the HVR1 Region. U3a can almost always be recognised by the presence of 16390A (or simply "390A"). And U3c can almost always be recognised by the presence of 16193T and 16249C. Whereas U3b will have neither of these patterns. The common European subclade U3a1 can only be determined through a full test including the coding region, although U3a1c along the Atlantic coast can usually be recognised by a change at 16356C along with 16390A. Sometimes U3a1c will still be erroneously labeled as U4 by FTDNA and in some early studies, as 16356C is the defining marker of U4, but U3a1c can be distinguished from U4 by the presence of 16343G along with 16356C. 
It is interesting that wherever U3 can be observed both U3a and U3b is also generally observed in the same area although as pointed out above the ratios vary considerably. U3 can be observed as far east as Western Siberia and Southeast into Pakistan, Afghanistan and Northern India but in very low frequencies. U3c, an older version of U3ac, covers quite a surprising wide area given its extreme rarity. It can be observed from South of the Caspian Sea all the way over through the Mediterranean countries to the West Coast of Europe including Scotland. Currently it is believed that U3 entered Europe for the first time during the Neolithic movement of peoples 8 or 9 thousand years ago probably mainly through the Eastern Mediterranean but also up the Danube Valley from the Black sea. It appears to have been embedded within the larger population located somewhere in the mountainous areas surrounding the Levant from before 30,000 ybp and when these populations began to increase and spread about 18 or 20 thousand years ago there was left from the constant appearance of new clades and the going extinct of those already existing only 2 versions of U3: U3b and U3ac, which had been diverging from each other for 1000's of years. 
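The HVR1 rules of thumb in the quoted passage can be sketched as a small decision procedure. This is only an illustration, assuming the sample is already known to be U3 and that variants are written as strings such as "16390A"; the function names are hypothetical and nothing here is FTDNA's actual method.

```python
def classify_u3_hvr1(variants):
    """Guess a U3 subclade from HVR1 variants.

    Assumption: the sample is already known to belong to U3.
    The rules follow the quoted text and are "almost always"
    right at best, not definitive.
    """
    v = set(variants)
    if "16390A" in v:
        # 16356C together with 16390A points to the Atlantic-coast U3a1c.
        # Plain U3a1 cannot be told apart from U3a without the coding region.
        return "U3a1c (probable)" if "16356C" in v else "U3a"
    if {"16193T", "16249C"} <= v:
        return "U3c"
    # Neither pattern present: U3b by elimination.
    return "U3b"


def looks_like_u3_not_u4(variants):
    """16356C alone suggests U4; 16343G alongside it suggests U3 instead."""
    v = set(variants)
    return "16356C" in v and "16343G" in v
```

For example, `classify_u3_hvr1(["16343G", "16356C", "16390A"])` falls into the U3a1c branch, while a sample with only "16343G" is classified as U3b by elimination.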
The distribution of mtDNA U3a1 shows a great deal of similarity with the distribution of mtDNA V, both of which are found in Berbers, Iberians and Scandinavians and generally track close to the Atlantic Coast. This is suggestive of a possible Mesolithic (or Epi-Paleolithic, if you will) period of expansion after the Last Glacial Maximum and before the Neolithic Revolution, with U3a1 as a minor component of a population that also included mtDNA V.

This migration would have been separate from the main, perhaps Neolithic, stream that crossed the Balkans and spread further into Italy with the Etruscans, which had both U3a and U3b.

The alternative hypothesis (and, of course, both could have happened; they aren't necessarily inconsistent) is that mtDNA U3a1 was spread during the Bell Beaker era, which is a good fit to its high frequency in Scotland, Wales and Iceland.

There seems to be an implication, however, that the mtDNA U3a in Berbers is not U3a1, let alone U3a1c, which would imply that U3a1 could have been an Iberian mutation spread by Bell Beakers, with the remaining U3 clades perhaps tracing to the earlier Mesolithic era.

Of course, my tendency to associate a shared Berber and European presence with the Mesolithic era, driven in part by the high frequency of mtDNA V in the Saami of Finland, may not actually be justified.

After all, based on the phylogeny of the leading Berber Y-DNA clade and linguistic evidence, the ethnogenesis of the Berber language and people may date only to 3700 BCE, making it one of the youngest Afro-Asiatic languages. The mtDNA clades in the Berber people would likely have older origins than its Y-DNA clade, since male dominated conquests, as the Berber one seems to have been, tend to have much more modest impacts on mtDNA clades in the conquered population. Berber ethnogenesis is not much older than Bell Beaker ethnogenesis, so Bell Beaker influences could still have penetrated the Berber founding population before it had completed its expansion. If Berbers originated on the eastern side of the Sahara around the time that it was becoming arid due to the climate change event that brought it to its current state, and then migrated west, the Berbers of far northwest Africa may have arrived quite a few centuries after Berber ethnogenesis, although obviously this would not be true if Berber migration was instead from west to east.

In the bigger picture, based upon haplotype diversity, the Western Zagros mountains look like a likely place of origin for mtDNA U3. The distribution of U3a also suggests that it made its way to Iberia and beyond to the Atlantic Coast via a maritime route that stopped on Mediterranean islands and Southern Italy.

The fact that the subclade associated with the Atlantic Coast is predominantly the very specific mtDNA U3a1c also suggests that this clade has a much shallower time depth than mtDNA U3a's split thousands of years ago.

10 comments:

Maju said...

They report it as U3a and not U3a1. Granted that it might be U3a1 or even U3a1a, as they are only using (quite probably, not specified) HVS-I for "identification" of haplogroups, but this is by no means certain.

Maju said...

Also re. chronologies, I would not dare to consider U3c as more recent than U3a. The tree is as follows (root at U3ac):

→ C6518T A10506G C13934T (G16390A) → U3a
→ T12843C A15613G (C16193T T16249C G16526A) → U3c

So U3a took THREE coding region mutations, while U3c took only TWO (in brackets are the HVS-I transitions, which are much less reliable for clock purposes, because they seem to "compensate" changes in the coding region, at least partly). So IMO U3c looks older than U3a, one "tick" older.

Maju said...

More complete, albeit also speculative, chronological reconstruction (coding region only):

→ A1811G → U2'3'4'7'8'9 (probably c. 40-45 Ka old)
·→ A14139G T15454C → U3
··→ A2294G T4703C G9266A → U3ac
···→ C6518T A10506G C13934T → U3a
····→ G3010A → U3a1
·····→ G7521A! → U3a1a
···→ T12843C A15613G → U3c
·→ T9698C → U8
··→ A3480G → U8b'c
···→ C7031T A10398G! → U8c (branch details not fully confirmed)

U in general is (unlike H) a fast-evolving haplogroup. But it's good to have calibration points anyhow, dates that act as terminus ante quem and that can only come from the archaeogenetic databases. We know, to begin with, that some U2 and other U lineages (U8c, U6, U5, U*) are found in the earliest sequenced European Upper Paleolithic (between 39 and 31 Ka BP), and we infer from its geo-structure that U in general must have expanded within the West Eurasian Upper Paleolithic c. 50-45 Ka BP. From this we can infer that U is about 50 Ka old and U2'3'4'7'8'9 about 40-45 Ka old.

To be more precise: the oldest known U2 is dated to 38 Ka BP (Kostenki), U8c c. 32 Ka BP (Paglicci), U6 c. 33 Ka BP (Pestera Muierilor) and U5 to 31 Ka BP (Krems). I added U8c to the tree above so we can compare: we know that U8c is four c.r. mutations downstream of U2'3'4'7'8'9 and that it is at least 32 Ka old. Let's say "35 Ka" for safety.

So, if U2'3'4'7'8'9 is c. 45 Ka old (older seems unlikely) and U8c c. 35 Ka old, each mutation took some 2.5 Ka to complete with the "molecular clock" method, maybe a bit more. Stretching the time lapse at the extreme, U is 50 Ka old and U8c 30 Ka old (most recent possible date is 30.6 Ka calBP), so 5 mutations divided by 20 Ka is 4 Ka per mutation. Let's take this last longest possible chronology, that would make:

→ U2'3'4'7'8'9: 46 Ka
·→ U3: 42 Ka
··→ U3ac: 30 Ka
···→ U3a: 18 Ka
····→ U3a1: 14 Ka
·····→ U3a1a: 10 Ka
···→ U3c: 22 Ka

So let me remain extremely skeptical about the claims of the "hyper-recent" origin of U3c, most probably caused by lack of sufficient knowledge of the relatively rare haplogroup. Being unsuccessful or non-European does not equal being "recent".
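The arithmetic behind these guesstimates can be sketched in a few lines. This is only an illustration of the reasoning in the comment, assuming a constant rate per coding-region mutation and taking the 30 Ka figure for U3ac from the list above as given; the function and variable names are hypothetical.

```python
def ka_per_mutation(older_ka, younger_ka, mutations):
    """Average time per coding-region mutation between two dated nodes."""
    return (older_ka - younger_ka) / mutations


# The two calibrations discussed in the comment.
short_rate = ka_per_mutation(45, 35, 4)  # U2'3'4'7'8'9 -> U8c
long_rate = ka_per_mutation(50, 30, 5)   # U -> U8c, stretched to the extreme

# Coding-region branches below U3ac: node -> (parent, mutations on the branch).
BRANCHES = {
    "U3a": ("U3ac", 3),
    "U3a1": ("U3a", 1),
    "U3a1a": ("U3a1", 1),
    "U3c": ("U3ac", 2),
}


def node_ages(root, root_age_ka, rate):
    """Subtract (mutations * rate) at each step down from the root."""
    ages = {root: root_age_ka}

    def age(node):
        if node not in ages:
            parent, n_mutations = BRANCHES[node]
            ages[node] = age(parent) - n_mutations * rate
        return ages[node]

    for node in BRANCHES:
        age(node)
    return ages


ages = node_ages("U3ac", 30, long_rate)
```

With the long chronology of 4 Ka per mutation, this gives 18 Ka for U3a and 22 Ka for U3c, so U3c comes out one "tick" older than U3a, as argued above.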

andrew said...

To be clear, I understand that the find is U3a and not necessarily U3a1, but it is U3a1 and especially U3a1c that is the really interesting one in terms of distribution.

Also, I absolutely recognize that U3c is old. It is U3a1c that I would suspect has relatively recent time depth.

Maju said...

"It is U3a1c that I would suspect has relatively recent time depth."

It seems I got that wrong. Sorry.

Anyhow, have you even checked PhyloTree before making that claim? Because U3a1c is only separated from U3a1 by a HVS-I mutation, T16356C, so it should be about the same age as U3a1a and U3a1b (both accumulating one coding region mutation at the stem), maybe even older. Of course the molecular clock is not at all an exact method but there's nothing else we can use anyhow.

andrew said...

Absolute dating is hard. And, there seems to be a case that U3a1 arose about 9 kya, which would suggest that U3a1c arose later, although admittedly if that calibration of the mtDNA clock is right, it would be a stretch for it to be ca. 3700 BCE or later, rather than Mesolithic or early Neolithic.

The trouble is, it could very well have come into being somewhere in Western Europe (probably SW Europe) in the early Neolithic (and hence be absent from other places where U3a is found) and then been present in a gene pool that wasn't expanding very rapidly until an expansion in connection with Bell Beaker.

Maju said...

"Absolute dating is hard."

Absolutely! All the dates I mentioned should be considered very approximative "educated hunches". However I do have a very radical criticism about the usual dates with the usual "molecular clock" methods, very particularly when applied to mtDNA, because they are almost invariably absurd. This is because they use methods devised some two decades ago, proven wrong once and again, but still reused because it is what you do when you are trapped in the University system, which demands scholasticism quite apparently, at least in population genetics: you just cite the old papers with no creativity nor criticism whatsoever and that seems to make you "right". Well, no: you could equally cite the Bible or Kant's "Critical Reason" (both equally junk but revered) as the basis for your conclusions, it's junk in: junk out. Revered junk is just useless junk all the same.

My reasoning about my guesstimates is explicit, clear and debatable, the reasoning behind "academic" versions of the "molecular clock" is hidden, obscure and beyond debate apparently. I'd really appreciate if people would stop believing what they are told and began thinking on their own, really, because science is not about "beliefs" nor about "authority" but about proven facts and sound logic.

Just to make sure, I'm not at any moment questioning your hypothesis about U3a1a, just questioning pseudoscientific albeit "academic" molecular-clock-o-logy mentioned a bit too happily, acritically, in your article.

andrew said...

Absolutely fair.

Chris Davies said...

mtDNA haplogroup U3 is found in Chad in Arabs and Buduma, although subclade not specified (also U4 in Shuwa). http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0018682

andrew said...

Thanks for the tip.