Wednesday, October 16, 2024

Papuan Demographic History From Modern Genomes

A new pre-print at bioRxiv disputes the status of Papuans (presumably together with Aboriginal Australians) as an outgroup to both European and Asian populations. Instead, it positions them as a sister population of other Asian populations.
The demographic history of the Papua New Guinean population is a subject of significant interest due to its early settlement in New Guinea, at least 50 thousand years ago, and its relative isolation compared to other out of Africa populations. This isolation, combined with substantial Denisovan ancestry, contributes to the unique genetic makeup of the Papua New Guinean population. Previous research suggested the possibility of admixture with an early diverged modern human population, but the extent of this contribution remains debated. 
This study re-examines the demographic history of the Papua New Guinean population using newly published samples and advanced analytical methods. Our findings demonstrate that the observed shifts in relative cross coalescent rate curves are unlikely to result from technical artefacts or contributions from an earlier out of Africa population. Instead, they are likely due to a significant bottleneck and slower population growth rate within the Papua New Guinean population. Our analysis positions the Papua New Guinean population as a sister group to other Asian populations, challenging the notion of Papua New Guinean as an outgroup to both European and Asian populations.
This study provides new insights into the complex demographic history of the Papua New Guinean population and underscores the importance of considering population-specific demographic events in interpreting relative cross coalescent rate curves.
Mayukh Mondal, et al., "Resolving out of Africa event for Papua New Guinean population using neural network" bioRxiv (September 23, 2024) https://doi.org/10.1101/2024.09.19.613861

The introduction to the paper explains that:
The Papua New Guinean (PNG) population is among the most fascinating in the world, owing to its unique demographic history. Following the Out Of Africa (OOA) event, modern humans populated New Guinea at a remarkably early date, at least 50 thousand years ago. Since then, the population has remained relatively isolated compared to other OOA populations (such as European and Asian populations) and has gone through a strong bottleneck. The substantial Denisovan ancestry within the PNG population and the strong correlation between Denisovan and Papuan ancestries contribute to the genetic distinctiveness of the PNG population.
Researchers have suggested that the genomes of PNG populations contain evidence of admixture with a modern human population that might have diverged from African populations around 120 thousand years ago, much earlier than the proclaimed primary divergence between African and OOA populations. However, the extent to which this early diverged population contributed to the genome of PNG populations remains a subject of ongoing debate. Interestingly, this early migration hypothesis is more widely accepted by archeologists.
Pagani et al. support this hypothesis, notably through Relative Cross-Coalescent Rate (RCCR) analysis. This RCCR analysis suggests that the PNG population diverged from African populations significantly earlier than other OOA populations. They argued that this earlier divergence indicated by the RCCR curve might reflect a contribution from an earlier OOA population specific to PNG. While this shift in the RCCR curve is well-documented, some researchers attribute it to technical artefacts such as low sample sizes and phasing errors rather than genuine demographic events.
The origins of the primary lineage of the PNG population have also been contested. Some researchers propose that the PNG population is closely related to the Asia-Pacific populations and serves as a sister group to other Asian populations. Conversely, other researchers argue that the PNG population is an outgroup to both European and Asian populations. 
Recent advancements in analytical methods may provide new insights into these debates. For example, Approximate Bayesian Computation with Deep learning and sequential Monte Carlo (ABC-DLS) allows for the use of any summary statistics derived from simulations to train neural networks, which can then predict the most likely demographic models and parameters based on empirical data. Additionally, the Relate software enhances RCCR analysis by employing a modified version of the hidden Markov model, initially used in the Multiple Sequentially Markovian Coalescent (MSMC) method, allowing for the analysis of thousands of individuals with greater robustness. 
In this paper, we re-examine the demographic history of the PNG population using newly published samples combined with data from the 1000 Genome Project and cutting edge methods. This approach has enabled us to address these longstanding questions with greater precision. We first generate new empirical RCCR curves and demonstrate that the previously observed shift is unlikely to be the result of low sample size or phasing errors. Through simulations, we further show that the PNG population is indeed a sister group to other Asian populations and this shift is probably not due to contributions from an earlier OOA population. Instead, it is likely a consequence of a significant bottleneck and slower population growth in the PNG population.
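
To make the Approximate Bayesian Computation idea mentioned in the introduction concrete, here is a minimal sketch of plain rejection ABC. It is not the paper's ABC-DLS method, which adds neural networks and sequential Monte Carlo on top of the same basic idea. The simulator, the parameter `theta`, the prior bounds, and the tolerance are all hypothetical stand-ins chosen for illustration.

```python
import random


def simulate_summary(theta, rng, n=50):
    """Toy simulator (hypothetical): mean of n noisy draws centred on theta."""
    draws = [rng.gauss(theta, 1.0) for _ in range(n)]
    return sum(draws) / len(draws)


def abc_rejection(observed, prior_low, prior_high, n_draws=5000, tol=0.1, seed=0):
    """Keep prior draws whose simulated summary lands within tol of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)  # sample a parameter from the prior
        if abs(simulate_summary(theta, rng) - observed) <= tol:
            accepted.append(theta)  # accepted draws approximate the posterior
    return accepted


posterior = abc_rejection(observed=3.0, prior_low=0.0, prior_high=10.0)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), round(posterior_mean, 2))
```

The accepted values cluster around the parameter that best reproduces the observed summary. In a real demographic analysis the simulator would be a coalescent simulation and the summary would be a vector of statistics rather than a single mean.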

The paper then describes, in broad strokes, the demographic models it analyzed:

To explore the demographic processes causing the observed RCCR shift, we tested five plausible demographic scenarios labelled A, O, M, AX and OX. In Model A, the PNG and East Asian populations are sister groups. Model O positions the PNG population as an outgroup to both European and East Asian populations. Model M combines elements of both A and O, suggesting that the PNG population arose from admixture between a sister group of the Asian population and an outgroup of European and Asian populations. Model AX postulates that the PNG population is a sister group to the Asian population but received input from an earlier OOA population. Finally, in Model OX, the PNG population receives a contribution from an earlier OOA population, while the remaining ancestry came from an outgroup to the European and East Asian populations. . . .
The best-fitting parameters for Model A largely correspond with the previously established OOA model, with some deviations specific to the inclusion of the PNG population. 

Our model suggests that all OOA populations, including PNG, diverged from African populations (represented by Yoruba) around 62.4 (62-62.8) thousand years ago, experiencing a significant bottleneck. Approximately 52 (51.6-52.8) thousand years ago, Neanderthals contributed around 3.7% (3.59-3.85%) of the genome to these OOA populations. Shortly thereafter, Europeans and East Asians diverged from the PNG populations around 51.2 (50.8-51.6) and 46.2 (45.9-46.5) thousand years ago, respectively. The PNG population then mixed with Denisovans around 31.2 (31.1-31.5) thousand years ago, with Denisovans contributing approximately 3.16% (3.05-3.21%) to the genome of PNG.

Our analysis also shows that the PNG population experienced a more severe bottleneck (effective population size of 674 [663-689]) than other OOA populations (Europeans 3512 [3423-3589] and East Asians 1771 [1730-1799]), with growth rates significantly lower than those of other OOA populations, consistent with previously published data.

While our parameter inference is generally robust within the individual model, substantial changes occur when the underlying model is altered. Given that determining the precise demographic model for human populations is an ongoing effort, parameter estimates should be considered supplementary to the model rather than independent results. 
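
The difference between the two tree-like models, A and O, can be sketched as nested tuples. This is an illustration only: the population labels (`YRI`, `EUR`, `EAS`, `PNG`) and the helper functions are hypothetical shorthand, and the admixture models M, AX and OX are not trees, so they are left out.

```python
# Rooted trees as nested pairs; Yoruba (YRI) stands in for African populations.
MODEL_A = ("YRI", ("EUR", ("EAS", "PNG")))  # PNG is sister to East Asians
MODEL_O = ("YRI", ("PNG", ("EUR", "EAS")))  # PNG is an outgroup to Europeans and East Asians


def leaves(tree):
    """Return the set of population labels below a node."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def sister_of(tree, pop):
    """Return the labels in the sibling subtree of pop, or None if pop is absent."""
    if isinstance(tree, str):
        return None
    left, right = tree
    if left == pop:
        return leaves(right)
    if right == pop:
        return leaves(left)
    return sister_of(left, pop) or sister_of(right, pop)


print(sister_of(MODEL_A, "PNG"))
print(sister_of(MODEL_O, "PNG"))
```

Under Model A the sister of PNG is East Asians alone, while under Model O it is the whole European plus East Asian clade.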

The concluding discussion of the results notes that:

We successfully replicated the shift observed by Pagani et al., confirming its presence in both physically mapped and statistically phased sequences, which involved over 100 PNG samples. This consistency suggests that the shift is reproducible, though its underlying cause may differ from the original interpretation of Pagani et al.

Our analysis using ABC-DLS supports a simpler demographic model for PNG populations, proposing them as a sister group to Asians with no substantial detectable contribution from an earlier OOA population. Interestingly, our simulated models reveal that a stronger bottleneck with a lower growth rate could produce a similar shift in RCCR analysis and potentially be misinterpreted as a signal of an earlier population separation. While RCCR is a valuable proxy for estimating the separation time between populations, it is not without biases. The shift could result from various factors, including earlier divergence times, admixture with earlier diverged populations, or even a bottleneck in one of the populations, as demonstrated in our study. Interestingly this demographic history of stronger bottleneck with slower growth rate was also experienced by the Andamanese population, which explains the shift found in the Andamanese population as well. Thus, using RCCR analysis to rebuild the tree of divergence might need to be revised.

The observed shift in the RCCR curve suggests that a recent bottleneck can impact estimates of effective population size in the distant past. Notably, in our simulations, the Papua New Guinean bottleneck occurred much later (around 46.2 thousand years ago) than the observed shift (peaking around 100 thousand years ago) with a population (Yoruba) that separated a long time ago. This finding implies that the estimation of effective population size and cross-coalescent rates may not be entirely independent, potentially affecting RCCR analysis in its current form. Further analysis suggests that the estimation of coalescent rate was affected earlier than true changes of effective population size, which shifts the RCCR curve as RCCR is a ratio of coalescent rates. Additionally, this shift was absent in simulations involving populations that separated 300 thousand years ago, akin to the San population, indicating that the bottleneck effect diminishes over longer separation times.

Our results also reveal that when the contribution from an earlier OOA population is between 1% and 5%, our ABC-DLS analysis tends to misclassify Model AX as Model A at a higher rate. A similar issue arises with Model M, where a low contribution (less than 5%) from an outgroup Eurasian population can still be misclassified as Model A. Thus our analysis does not work for less than 5% contribution from these unknown ghost populations, though Model OX does not show a similar phenomenon with Model A misclassification. While we cannot completely rule out the possibility of a small contribution from these populations, our analysis suggests that such models are not necessary to explain the RCCR shift as previously proposed.

Interestingly, our results position PNG as a sister group to Asian populations rather than an outgroup of European and Asian populations. The primary difference between those models and ours lies in the migration rates between populations. Previous models that incorporated significant migration rates between populations were found to have confounded results, leading us to avoid including migration rates in our models. Without migration, our Model O closely resembles the previous models of PNG. Given that those models used substantial migration rates, they are not directly comparable to our models without migration rates. Indeed, with high migration rates, our approach failed to distinguish between Model A and O with high certainty. Still, our work suggests that the main lineage of PNG is coming from a sister group of Asia, a result that was not confounded by convoluted migration rate patterns between populations.

Our parameter estimation suggests that the PNG population separated from other populations around 46.2 (45.9 - 46.5) thousand years ago, a timeline that aligns with archaeological estimates of when the ancestors of PNG reached the ancient continent of Sahul, the landmass that once connected New Guinea and Australia. 

Additionally, our Relate analysis indicates that the separation time between PNG and European populations was the longest observed among OOA populations. However, as our model suggests, this is likely a bias caused by the bottleneck of PNG. This bottleneck may lead to an overestimation of the separation time, particularly in RCCR analysis. In reality, it is more likely that PNG and East Asian populations separated later than the divergence between PNG and European populations. 

In conclusion, our study provides compelling evidence that the unique demographic events—specifically, a significant bottleneck and slower population growth—within the PNG population are key factors influencing the observed shifts in RCCR curves. These findings not only refine our understanding of PNG's demographic history but also emphasise the necessity of accounting for population-specific demographic events when interpreting RCCR curves. 
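
The discussion's point that RCCR is a ratio of coalescent rates can be illustrated with a small sketch. It assumes the definition commonly used in the MSMC literature, twice the cross-population rate divided by the sum of the two within-population rates; whether the paper's Relate-based pipeline uses exactly this form is an assumption here. The rate values are made up for illustration and are not taken from the paper.

```python
def rccr(within_a, within_b, cross):
    """Relative cross-coalescent rate: 2 * cross / (within_a + within_b).

    Values near 1 suggest one shared population; values near 0 suggest
    the two populations have fully separated.
    """
    return 2.0 * cross / (within_a + within_b)


# Hypothetical time bin before the true split: all three rates are equal.
unbiased = rccr(within_a=5e-5, within_b=5e-5, cross=5e-5)

# Same bin, but population B's within-population rate is overestimated,
# as could happen if a later bottleneck bleeds into older time bins.
biased = rccr(within_a=5e-5, within_b=1.5e-4, cross=5e-5)

print(unbiased, biased)
```

With the cross-population rate held fixed, inflating one within-population rate alone pulls the ratio well below 1, which on an RCCR curve would look like an earlier separation even though no separation has occurred in that time bin.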
