Thursday, December 8, 2016

Sardinian Population Genetics

Why Care About Sardinian Genetics?

The population genetics of Sardinia are notable mainly because modern Sardinians are the closest living match to the Neolithic first farmers of Europe.

In large part because Sardinia is an island in the relatively warm Mediterranean Sea, it was basically unpopulated prior to the Neolithic Revolution, and it experienced far less Middle Neolithic to Early Bronze Age population change due to migration from the Steppe than Continental Europe did.

This migration came initially from Indo-Europeans in the form of the Corded Ware Culture, then from the Bell Beaker culture (of uncertain linguistic affiliation), and later from linguistically Indo-European and Uralic migrations into Europe.

Findings Of The New Large Sample Size Study

A new study with a larger sample size than its predecessors has found that the mountainous regions of the island have been particularly stable in population genetic terms since the island was settled by first wave farmers of the Neolithic era.

As in most cases in Europe, new migration to Sardinia from the mainland appears to have been male biased, even before the Bronze Age.
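
The study infers this sex bias from X versus autosome comparisons. A minimal sketch of the reasoning, assuming a simple textbook admixture model that is not taken from the paper: women carry two X chromosomes and men one, so the X chromosome weights female ancestors two to one, while the autosomes weight both sexes equally. The function name and the input fractions are hypothetical.

```python
def expected_incoming_ancestry(female_frac, male_frac):
    """Long-run expected share of incoming ancestry under a simple model.

    female_frac / male_frac: assumed fraction of mothers / fathers drawn
    from the incoming population (illustrative inputs, not study values).
    """
    # Autosomes are inherited equally from mothers and fathers.
    autosome = (female_frac + male_frac) / 2
    # Two thirds of X chromosomes in a population sit in females.
    x_chrom = (2 * female_frac + male_frac) / 3
    return autosome, x_chrom


# Hypothetical male-biased migration: more incoming fathers than mothers.
autosome, x_chrom = expected_incoming_ancestry(female_frac=0.1, male_frac=0.3)
print(f"autosomal share: {autosome:.3f}, X share: {x_chrom:.3f}")
```

When the incoming share on the X chromosome comes out lower than on the autosomes, the older ancestry is correspondingly enriched on the X, which is the pattern the paper reports.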

Why Is There A Basque Affinity In Sardinian Population Genetics?

It isn't entirely clear if the stronger genetic affinity of this Sardinian highlands population to the Basque population is due to a greater proportion of hunter-gatherer and Neolithic ancestry, or is due to demic Bell Beaker impacts on Sardinia. 

One way to model the population genetics of Western Europe is to see four primary layers: a Mesolithic layer of Western European hunter-gatherers, a first wave Neolithic farmer layer, a Bell Beaker layer (dating to the early Bronze Age), and an Italo-Celtic-Germanic Indo-European layer (dating to the early Iron Age).

In this model, the Basque have contributions from the first three layers, while the rest of Western Europe has contributions from all four layers. A reduced "Bronze Age Steppe" component in Sardinians and Basque people could reflect the absence of the fourth layer but not the third, or could reflect a reduced third and fourth layer contribution.
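
The ambiguity can be shown with a toy calculation. This is only a sketch: every number below is invented for illustration, and the assumption that the third and fourth layers each carry the same share of Steppe ancestry is mine, not the study's.

```python
# Assumed share of Steppe ancestry carried by each layer (hypothetical).
STEPPE_SHARE = {
    "mesolithic": 0.0,
    "neolithic": 0.0,
    "bell_beaker": 0.5,
    "iron_age_ie": 0.5,
}


def steppe_component(layers):
    """Total Steppe fraction implied by a mix of layers that sums to 1."""
    return sum(share * STEPPE_SHARE[name] for name, share in layers.items())


# Scenario A: fourth layer absent, third layer present (hypothetical mix).
scenario_a = {"mesolithic": 0.2, "neolithic": 0.6,
              "bell_beaker": 0.2, "iron_age_ie": 0.0}
# Scenario B: third and fourth layers both present but reduced.
scenario_b = {"mesolithic": 0.2, "neolithic": 0.6,
              "bell_beaker": 0.1, "iron_age_ie": 0.1}

print(steppe_component(scenario_a), steppe_component(scenario_b))
```

Both scenarios yield the same total Steppe component, so an autosomal Steppe estimate alone cannot tell them apart.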

This study, however, does not appear to really consider this possible model and instead assumes a three layer model without distinct Bronze Age and Iron Age Steppe migrations, even though the Y-DNA data seems to support the four layer model.

The body of the paper states regarding the Basque affinity:
Due to its smaller long-term effective population size (Figure 5A), Sardinia is expected to have undergone accelerated rates of genetic drift. To correct for this when measuring similarity to other mainland populations, we used “shared drift” outgroup-f3 statistics (Raghavan et al. 2014), which are robust to population-specific drift. Using this metric, we find the Basque are the most similar to Sardinia, even more so than neighboring mainland Italian populations such as Tuscany and Bergamo (Figure S6A, S6B). 
This relationship is corroborated by identity-by-descent (“IBD”) tract length sharing, where among mainland European populations, French Basque showed the highest median length of shared segments (1.525 cM) with Arzana (Figure S7).
We also tested the affinity between Sardinians and Basque with the D-statistics of the form D(Outgroup, Sardinia; Bergamo or Tuscan, Basque). In this formulation, significant allele sharing between Sardinia and Basque, relative to sharing between Sardinia and Italian populations, will result in positive values for the D-statistic. We find that Sardinia consistently showed increased sharing with the Basque populations compared to mainland Italians (|Z| > 4; Figure S6C), and the result was stronger when using the Arzana than Cagliari sample (D_ARZ = 0.008 and 0.0096, D_CAG = 0.0072 and 0.0087 for French Basque and Spanish Basque, respectively). In contrast, sharing with other Spanish samples in our dataset was generally weaker and not significant (|Z| < 3.5; Figure S6C), suggesting the shared drift with the Basque is not mediated through Spanish ancestry. . . .
We generally find that Sardinians have the highest observed levels of shared drift with early Neolithic farming cultures and low levels of shared drift with earlier hunter-gatherer samples (Figure 6A, Figure S9, Table S5), consistent with previous reports. Surprisingly though, when examining allele sharing within Sardinia, we found that both ancient Neolithic farmer ancestry and pre-Neolithic ancestry are enriched in the Gennargentu-region.
First, we find that shared drift with Neolithic farmers and with pre-Neolithic hunter-gatherers is significantly correlated with the proportion of “Gennargentu-region” ancestral component estimated from ADMIXTURE analysis, while shared drift with Steppe pastoralists has a weak negative correlation with Gennargentu-region ancestry (Figure 6B). 
Second, using supervised estimation of ancestry proportion based on aDNA (Haak et al. 2015), we estimate higher levels of Neolithic and pre-Neolithic ancestries in the Gennargentu region and higher levels of Steppe Pastoralist ancestry outside the region (Figure S10). Finally, calculations with Patterson’s D-statistics of the form D(Outgroup, Ancient, Ogliastra, Non-Ogliastra) also support increased sharing with Neolithic and pre-Neolithic individuals, but not post-Neolithic individuals from the Steppe, in the Ogliastra samples (D = -0.0037 and -0.0042, |Z| = 7.4 and 7.9 when aDNA sample = Stuttgart and Loschbour, respectively; D = -0.0009, |Z| is not significant, when aDNA sample = Yamnaya).
Together, these results suggest that while at the regional-level Sardinia appears to harbor the highest amounts of Neolithic farmer ancestry and very little of the pre-Neolithic hunter-gatherer or Bronze Age pastoralists ancestries, there exists within-island variation. Specifically, we find that with increasing level of isolation (represented by increasing level of the Gennargentu ancestry), there are greater Neolithic farmer and pre-Neolithic hunter-gatherer ancestry, while the less isolated Sardinians have a stronger signal of ancestry from the Steppe pastoralist source. . . . 
We also found Sardinians show an impressive signal of shared ancestry with the Basque, in terms of identity-by-descent tracts and the outgroup f3 shared-drift metric. Such a connection is consistent with long-held arguments of a connection between the two populations, including claims of Basque-like non-Indo-European language words among Sardinian placenames (Blasco Ferrer 2010). More recently the Basque have been shown to be enriched for Neolithic farmer ancestry (Lazaridis et al. 2014, Gunther et al. 2015) and Indo-European languages have been associated with Steppe population expansions in the Bronze Age (Allentoft et al. 2015, Haak et al. 2015). These results support a model in which Sardinians and the Basque may both retain a legacy of pre-Indo-European, Neolithic ancestry (Gunther et al. 2015). . . . 
The high frequency of particular Y-chromosome haplogroups (particularly I2a1a2 and R1b1a2) that are not commonly affiliated with Neolithic ancestry is one challenge to a model in which Sardinian principally has Neolithic ancestry. Whether such haplogroup frequencies are due to simple genetic drift and/or a signal of sex-biased demographic processes has been an open question. By carrying out X versus autosome comparisons we uncovered evidence of sex-biased patterns of ancestry in Sardinia, and found an enrichment of Neolithic and pre-Neolithic ancestry on the X-chromosome across all groups on the island.
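
For readers unfamiliar with the two statistics quoted above, here is a rough sketch of how an outgroup-f3 and a Patterson's D value can be computed from per-SNP allele frequencies. It is a simplified illustration, not the paper's pipeline: it omits the finite-sample bias correction and the block jackknife used to obtain Z-scores, and the allele frequencies are invented toy values.

```python
import numpy as np


def outgroup_f3(outgroup, pop_a, pop_b):
    """Shared drift of pop_a and pop_b relative to an outgroup.

    Simplified estimator: mean over SNPs of (o - a) * (o - b).
    """
    o, a, b = (np.asarray(v, dtype=float) for v in (outgroup, pop_a, pop_b))
    return float(np.mean((o - a) * (o - b)))


def d_statistic(w, x, y, z):
    """Patterson's D(W, X; Y, Z) from allele frequencies.

    Positive values indicate excess allele sharing between X and Z
    (or between W and Y).
    """
    w, x, y, z = (np.asarray(v, dtype=float) for v in (w, x, y, z))
    numerator = np.sum((w - x) * (y - z))
    denominator = np.sum((w + x - 2 * w * x) * (y + z - 2 * y * z))
    return float(numerator / denominator)


# Invented toy frequencies at four SNPs (not real data).
outgroup = [0.5, 0.5, 0.5, 0.5]
sardinia = [0.9, 0.1, 0.8, 0.2]
basque = [0.8, 0.2, 0.7, 0.3]
italian = [0.6, 0.4, 0.5, 0.5]

f3_basque = outgroup_f3(outgroup, sardinia, basque)
f3_italian = outgroup_f3(outgroup, sardinia, italian)
d_value = d_statistic(outgroup, sardinia, italian, basque)
print(f3_basque > f3_italian, d_value > 0)
```

In this toy data the Sardinian frequencies track the Basque ones more closely than the Italian ones, so the outgroup-f3 is larger for the Basque pairing and D(Outgroup, Sardinia; Italian, Basque) is positive, matching the sign convention described in the quoted passage.
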

The Abstract and Citation For The Paper

The population of the Mediterranean island of Sardinia has made important contributions to genome-wide association studies of traits and diseases. The history of the Sardinian population has also been the focus of much research, and in recent ancient DNA (aDNA) studies, Sardinia has provided unique insight into the peopling of Europe and the spread of agriculture. 
In this study, we analyze whole-genome sequences of 3,514 Sardinians to address hypotheses regarding the founding of Sardinia and its relation to the peopling of Europe, including examining fine-scale substructure, population size history, and signals of admixture. 
We find the population of the mountainous Gennargentu region shows elevated genetic isolation with higher levels of ancestry associated with mainland Neolithic farmers and depleted ancestry associated with more recent Bronze Age Steppe migrations on the mainland. Notably, the Gennargentu region also has elevated levels of pre-Neolithic hunter-gatherer ancestry and increased affinity to Basque populations. 
Further, allele sharing with pre-Neolithic and Neolithic mainland populations is larger on the X chromosome compared to the autosome, providing evidence for a sex-biased demographic history in Sardinia. These results give new insight to the demography of ancestral Sardinians and help further the understanding of sharing of disease risk alleles between Sardinia and mainland populations.
Chiang et al., Population history of the Sardinian people inferred from whole-genome sequencing, bioRxiv, Posted December 7, 2016, doi: via Eurogenes.


Chris Davies said...

"Why Is There A Basque Affinity In Sardinian Population Genetics?"

The HLA haplotype A30-Cw5-B18-DR3-DQ2 is instructive. This is the #1 HLA haplotype in Sardinia. It appears to have formed in northern Africa, migrated to Sardinia early on, reached high frequency in Sardinia due to founder effect / drift, and then entered mainland Europe with second highest European frequency in Basques. The full haplotype, or variations on it, can be found in Maghrebi Berbers, Senegalese Mandenka, Ghanaians, Chadic speakers in north Cameroon, Sudanese, and Kenyan Luo [although Africa is badly under-sampled].

Italy Sardinia Pop.3 - 12.50% [highest-frequency class I HLA haplotype in Sardinia]
Spain Gipuzkoa Basque - 6.10% [second-highest frequency class I HLA haplotype in Basques]
Portugal South - 4.10%
Spain Murcia - 3.50%
Spain Majorca & Minorca - 2.20%
France Corsica Island - 2.00%
Spain Catalonia Girona - 1.70%
Spanish expats/migrants in Germany - 1.57%
Portugal Beja - 1.50%
Italy Pop. 5 - 1.26%
Portugal Faro - 1.20%
Portugal North - 1.10%
Italian expats/migrants in Germany - 0.95%
Portuguese expats/migrants in Germany - 0.81%
Romanian expats/migrants in Germany - 0.41%
Albanian Pop.2 - 0.12%
Greek expats/migrants in Germany - 0.11%
Russia Moscow Pop.2 - 0.10%
Germany Pop.7 - 0.10%
Poland DKMS - 0.09%

andrew said...

I wonder if there is any way to date when those entered the genomes in question. I can think of plausible narratives for later introductions in both Basque and Sardinia in more recent times, although different for each.

Chris Davies said...

Older introduction to Sardinia as haplotype is considerably more equilibrated. Entry probably in the order of 7 or 8 kya+ minimum. Haplotype is much less equilibrated in Basques, so entry might be more like <5kya. On the Wikipedia entry the author states that the Basque and Sardinian haplotypes had a different Cw allele, but looking on one can see that they are definitely the same.