Tuesday, August 14, 2018

Sparrows Co-Evolved With Human Agriculture To Digest Grains More Efficiently

In an ideal world, more papers on subjects like the emergence of agriculture would incorporate corroborating data points like this one, which make the narrative richer and more robust.
House sparrows (Passer domesticus) are a hugely successful anthrodependent species; occurring on nearly every continent. Yet, despite their ubiquity and familiarity to humans, surprisingly little is known about their origins. We sought to investigate the evolutionary history of the house sparrow and identify the processes involved in its transition to a human-commensal niche. 
We used a whole genome resequencing dataset of 120 individuals from three Eurasian species, including three populations of Bactrianus sparrows, a non-commensal, divergent house sparrow lineage occurring in the Near East. 
Coalescent modelling supports a split between house and Bactrianus sparrow 11 Kya and an expansion in the house sparrow at 6 Kya, consistent with the spread of agriculture following the Neolithic revolution. Commensal house sparrows therefore likely moved into Europe with the spread of agriculture following this period. Using the Bactrianus sparrow as a proxy for a pre-commensal, ancestral house population, we performed a comparative genome scan to identify genes potentially involved with adaptation to an anthropogenic niche. We identified potential signatures of recent, positive selection in the genome of the commensal house sparrow that are absent in Bactrianus populations. The strongest selected region encompasses two major candidate genes; COL11A—which regulates craniofacial and skull development and AMY2A, part of the amylase gene family which has previously been linked to adaptation to high-starch diets in humans and dogs. Our work examines human-commensalism in an evolutionary framework, identifies genomic regions likely involved in rapid adaptation to this new niche and ties the evolution of this species to the development of modern human civilization.
Mark Ravinet, et al., "Signatures of human-commensalism in the house sparrow genome" Proceedings of the Royal Society B (August 8, 2018) via this tweet (hat tip Razib Khan).

Unsurprising But Concerning

We find that at least 31.2% of the citations to retracted articles happen a year after the article has been retracted. And that 91.4% of these post-retraction citations are approving.
From here (hat tip Marginal Revolution).

Contemporary Genetics in Singapore Confirm Paradigms

A new paper predictably sees the three main ethnicities of Singapore (Chinese, Malay and Indian), with some admixture between these populations (with the Chinese and Indian populations arriving in the historic era as immigrants). The Malay component of ancestry is absent in the 1000 Genomes Project data. There are also genetic signatures of a separate mainland-route Neolithic Austroasiatic wave of migration from South China (ca. 4000 years ago), and an island-based Austronesian wave of migration from South China (ca. 2000 years ago).


Asian populations are currently underrepresented in human genetics research. Here we present whole-genome sequencing data of 4,810 Singaporeans from three diverse ethnic groups: 2,780 Chinese, 903 Malays, and 1,127 Indians. 
Despite a medium depth of 13.7X, we achieved essentially perfect (>99.8%) sensitivity and accuracy for detecting common variants and good sensitivity (>89%) for detecting extremely rare variants with <0.1% allele frequency. We found 89.2 million single-nucleotide polymorphisms (SNPs) and 9.1 million small insertions and deletions (INDELs), more than half of which have not been cataloged in dbSNP. In particular, we found 126 common deleterious mutations (MAF>0.01) that were absent in the existing public databases, highlighting the importance of local population reference for genetic diagnosis. 
We describe fine-scale genetic structure of Singapore populations and their relationship to worldwide populations from the 1000 Genomes Project. In addition to revealing noticeable amounts of admixture among three Singapore populations and a Malay-related novel ancestry component that has not been captured by the 1000 Genomes Project, our analysis also identified some fine-scale features of genetic structure consistent with two waves of prehistoric migration from south China to Southeast Asia. Finally, we demonstrate that our data can substantially improve genotype imputation not only for Singapore populations, but also for populations across Asia and Oceania. These results highlight the genetic diversity in Singapore and the potential impacts of our data as a resource to empower human genetics discovery in a broad geographic region.

The juicy bit of the discussion section reads as follows:
Malay represents indigenous people in Southeast Asia and contributes a novel ancestry component that was not captured by the 1000 Genomes Project. We observed a clear north-south clinal pattern of genetic variation in both South Asia and East/Southeast Asia, except for two recent migrant populations--the SG Chinese and SG Indian, which is consistent with previous studies that suggest a strong role of geography in producing human population structure. 
Moreover, we found noticeable amounts of admixture among the three major populations in Singapore. 
In addition, we identified two closely related ancestral components (components 4 and 5 in Figure 2E) that are prevalent in East and Southeast Asian populations, suggestive of their ancient origins. Based on the geographic distributions of these two components, we speculate that they might reflect two waves of prehistoric migration from south China to Southeast Asia through a mainland route (component 5) and an island route (component 4). This hypothesis is consistent with a complex peopling history of Southeast Asia depicted by a recent ancient DNA study. The study suggested that an expansion from East Asia into mainland Southeast Asia occurred about 4,000 years ago during the Neolithic transition to farming, and that an island route migration corresponding to the Austronesian expansion into Philippines and Indonesia took place about 2,000 years ago. 
So, the new study adds a few unsurprising but important data points to the mix, without providing any major new insights.

Wednesday, August 8, 2018

Contemporary Corsican Y-DNA Sheds Light On Southern European Pre-History

Historical Background

Bernard's blog provides a capsule history of Corsica and then explores a new paper on its genetics:
The most widely accepted hypothesis is the colonization of the Corsican-Sardinian bloc from Tuscany during different ice ages. Thus the first colonization of Corsica can go back to the Mesolithic between 18,000 and 15,000 years. The oldest archeological evidence is the Mesolithic collective burial of Campo Stefano located in the south of Corsica. It is 8940 years old. Other Mesolithic sites are identified in the south-west Filitosa and southern Corsica, as well as in Sardinia. 
A major demographic change occurred in the Neolithic from the sixth millennium BC. The archaeological remains are carved stones and pottery of printed ceramics, cardial or Campaniform. 
Corsican prehistory ends when the Greeks settle on the island, building the city of Alalia in 565 BC. The Greeks were followed by Romans, Vandals and Byzantines.
There were also early Iron Age Greek colonies in Italy and Southern France before the full ascendancy of the Western Roman Empire.

The genetic data recounted below suggest that the Romans, and still less the Vandals and Byzantines, did not have much of a demic impact on Corsica. Corsica was eventually claimed by France, the country which controls it today, and the demic impact of subsequent Northern Italian and French rulers also appears to have been modest, although broad similarities between Northern Italy, Southern France and Corsica could obscure these sources of admixture.
After being ruled by the Republic of Genoa since 1284, Corsica was briefly an independent Corsican Republic from 1755 until it was officially ceded by the Republic of Genoa to Louis XV as part of a pledge for debts in 1768. Due to Corsica's historical ties with the Italian peninsula, the island retains to this day many Italian cultural elements: the native tongue is recognised as a regional language by the French government. Corsica was ruled by various powers over the course of its history, but had several brief periods of self-government. 
Napoleon was born in 1769 in the Corsican capital of Ajaccio. His ancestral home, Maison Bonaparte, is today a significant visitor attraction and museum.

The Genetics of Corsica and its Vicinity

Bernard summarizes the findings of a new genetics paper on Corsica as follows:
Y chromosome DNA in Corsica shows several waves of populations. The oldest is characterized by the arrival of haplogroup I2 in the Mesolithic. Deep demographic changes in the Neolithic are identified by the presence of haplogroup G. The Copper Age sees the arrival of haplogroup R on the island. The difference in distribution of the two subclades R1b-U152 and R1b-U106 may correspond to the two groups of statue-menhirs erected in the Bronze Age and distributed to the north and south. The settlement of the Greek city of Alalia seems to correspond to the maximum frequency of haplogroup E1b-V13. 
Regarding relations between Corsica and Sardinia, the results of this study suggest two different genetic histories, Nuragic and Torréenne. The distribution of haplogroup G also suggests a continuity between southern Corsica and Sardinia, while that of haplogroup I suggests a distinction. Indeed the Corso-Sardinian block is characterized by a climatic contrast. The glacial sediments in northern Corsica suggest three glacial episodes, whereas these sediments are absent in Sardinia. It is reasonable to think that the first Mesolithic people arrived in Sardinia, where the climate is more favorable, before reaching Corsica once temperatures had become milder.
From Bernard's blog via Google Translate from French (emphasis and link re Alalia added) discussing Julie Di Cristofaro, et al., "Prehistoric migrations through the Mediterranean basin shaped Corsican Y-chromosome diversity" PLOS (2018). 

I largely concur with Bernard's analysis above, although the estimated date of Y-DNA R's arrival may be a bit early for what could have been an early Bronze Age arrival instead.

Some of the interesting points in the raw data pertain to Y-DNA in Provence and Tuscany, which are used for comparison purposes. (I've interlineated editorial commentary in brackets and added emphasis to some of Bernard's translated blog text. I've also made some minor translation corrections.)
Haplogroup R is the most common in Corsica with a value of 51.8%. This haplogroup reaches 90% in Provence and 45.3% in Tuscany. The subclade R1b-U152 is predominant, especially in North Corsica. Nevertheless the subclade R1b-U106 is present in South Corsica. In Europe, R1b-U152 is the most common in Switzerland, Italy, France and Western Poland. Ancient DNA studies have shown that haplogroup R spread in Western Europe in the Copper Age and the Bronze Age. 
Haplogroup G has a frequency of 21.7% in Corsica and 13.3% in Tuscany. It is absent in Provence. The subclade G2a-L91 reaches 11.3% in Corsica and is absent in Tuscany. G2a-L91 and G2a-PF3147 reach their highest frequency in Sardinia and Southern Corsica. Ancient DNA studies have shown that haplogroup G spread in Europe with Neolithic farmers. 
Haplogroup J shows an intermediate frequency in Corsica (11.8%) between those of Provence (6.6%) and Tuscany (17.6%). The subclade J2a-M67 is homogeneous on the island with a TMRCA of 2380 years. [Ed. i.e. 430 B.C.E., which is in the early Iron Age.] Subclade J2a-Page55 is present in northwestern Corsica. 
Haplogroup E is mainly represented by its subclade E1b-V13. Its frequency in Corsica (5.5%) is intermediate between those of Provence  (3%) and Tuscany (10.4%). The diffusion of E1b-V13 is supposed to be related to the Neolithic expansion.  [Ed. E1b-V13, which is my Y-DNA clade, is the predominant Y-DNA E clade in Europe and probably arrived via Greece and the Balkans.]
Haplogroup I is present in Corsica under its two clades I1 (0.3%) and I2 (2.4%). It is absent in Provence and present in Tuscany: I1 (0.3%) and I2 (6.3%). Clade I1 is mainly present in Northern Europe [Ed. with a Neolithic era expansion.], while clade I2 is mainly divided into two subclades: I2-P37 and I2-M436. The latter is present mainly in the Balkans. Ancient DNA studies have shown us that I2 is associated with the Mesolithic in Europe. It is also present in the Neolithic especially in the south of France. The subclade I2-M26 found in 30% of the samples in Sardinia is very little present in Corsica. 
Haplogroup Q is present in Corsica with a frequency of 2.4%. It is absent in Provence and has a frequency of 0.6% in Tuscany. [Ed. A frequency of Y-DNA Q that high in Corsica is probably a founder effect. Y-DNA Q is quite rare in Europe.]
Bernard doesn't discuss the distribution of Y-DNA T-M70 which was also present, but this would appear to be a good candidate for a Cardial Pottery Neolithic source, based upon its distribution within Corsica and where Y-DNA T is found elsewhere.
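
The frequencies quoted above can be tabulated to check which haplogroups are "intermediate" in Corsica and how much of the Corsican sample is left over for clades not discussed, such as Y-DNA T. This is only a rough sketch: the numbers are copied from the translated text, and it assumes (my assumption, not the paper's) that the E figures quoted for E1b-V13 stand in for all of haplogroup E and that haplogroup I is the sum of I1 and I2.

```python
# Y-DNA haplogroup frequencies (percent), copied from the translated text above.
# Assumptions: E uses the figures quoted for E1b-V13; I is the sum of I1 and I2.
FREQ = {
    "R": {"Corsica": 51.8, "Provence": 90.0, "Tuscany": 45.3},
    "G": {"Corsica": 21.7, "Provence": 0.0, "Tuscany": 13.3},
    "J": {"Corsica": 11.8, "Provence": 6.6, "Tuscany": 17.6},
    "E": {"Corsica": 5.5, "Provence": 3.0, "Tuscany": 10.4},
    "I": {"Corsica": 0.3 + 2.4, "Provence": 0.0, "Tuscany": 0.3 + 6.3},
    "Q": {"Corsica": 2.4, "Provence": 0.0, "Tuscany": 0.6},
}


def is_intermediate(haplogroup):
    """True if the Corsican frequency lies strictly between Provence and Tuscany."""
    row = FREQ[haplogroup]
    low, high = sorted((row["Provence"], row["Tuscany"]))
    return low < row["Corsica"] < high


def listed_total(region):
    """Sum of the listed haplogroup frequencies for a region, in percent."""
    return round(sum(row[region] for row in FREQ.values()), 1)


intermediate = [h for h in FREQ if is_intermediate(h)]
corsica_remainder = round(100.0 - listed_total("Corsica"), 1)

print("Intermediate in Corsica:", intermediate)
print("Corsican share not covered by the listed haplogroups (%):", corsica_remainder)
```

Under those assumptions, the six listed haplogroups leave only a few percent of the Corsican sample for Y-DNA T and any other clades.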

The Y-DNA R1b-U152 distribution probably implies a Bell Beaker related source in these areas, which is notable as I had been unclear on the extent of Y-DNA R1b distributions in Switzerland, Southern France and Northern Italy (I admit that I had never really even wondered about its presence in Corsica).

As far back as attested history goes, Northern Italy was Italic (probably due to early Iron Age migrations), with an intrusive Etruscan migration as well in that time period. The earliest attested linguistic data for Southern France indicate that it was ruled by Celtic tribes, although there may have been Vasconic speakers in the far Southwest.

Bernard and I both think that Greek colonization in the early Iron Age is a more likely source of Y-DNA E1b-V13 in these populations than Neolithic era expansions, something that also fits the relative frequencies in Provence and Tuscany. Y-DNA J and E1b-V13 likely arrived at around the same time, even though Y-DNA J is more widely distributed in Corsica than Y-DNA E1b-V13. This Greek colonization event could also be the source of some of the Y-DNA I2 in the sample.

Inferred Dark Matter Distributions In Galaxy Clusters Also Track Baryonic Matter

We study the total and dark matter (DM) density profiles as well as their correlations for a sample of 15 high-mass galaxy clusters by extending our previous work on several clusters from Newman et al. Our analysis focuses on 15 CLASH X-ray-selected clusters that have high-quality weak- and strong-lensing measurements from combined Subaru and Hubble Space Telescope observations. The total density profiles derived from lensing are interpreted based on the two-phase scenario of cluster formation. 
In this context, the brightest cluster galaxy (BCG) forms in the first dissipative phase, followed by a dissipationless phase where baryonic physics flattens the inner DM distribution. This results in the formation of clusters with modified DM distribution and several correlations between characteristic quantities of the clusters. 
We find that the central DM density profiles of the clusters are strongly influenced by baryonic physics as found in our earlier work. The inner slope of the DM density for the CLASH clusters is found to be flatter than the Navarro-Frenk-White profile, ranging from α=0.30 to 0.79. We examine correlations of the DM density slope α with the effective radius R_e and stellar mass M_e of the BCG, finding that these quantities are anti-correlated with a Spearman correlation coefficient of 0.6. We also study the correlation between R_e and the cluster halo mass M_500, and the correlation between the total masses inside 5 kpc and 100 kpc. We find that these quantities are correlated with Spearman coefficients of 0.68 and 0.64, respectively. These observed correlations are in support of the physical picture proposed by Newman et al.

This is not a smoking gun for any theory, but it reaffirms the lesson of inferred dark matter distributions in galaxies: inferred dark matter distributions closely track the distribution of ordinary baryonic matter. This means that an appropriate modified gravity theory can probably also explain the dark matter phenomena of clusters, even though the simple toy model formula of MOND is not adequate to do so.

The failure of the NFW distribution to describe dark matter in clusters, as well as in galaxies (as has been previously shown), also strongly supports the conclusion that if there are indeed dark matter particles, they interact with ordinary matter more strongly than by gravity alone, and hence that dark matter cannot be truly collisionless.
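
To make "flatter than NFW" concrete, here is a minimal sketch comparing the inner logarithmic slope of the standard NFW profile with that of the cored Burkert profile. The functional forms are the usual textbook ones; the scale densities and radii are set to 1 in arbitrary units purely for illustration, and the helper names are my own. With the convention that the inner slope is α = −d ln ρ / d ln r, NFW has α of about 1 near the center and Burkert has α of about 0, so the α = 0.30 to 0.79 range quoted above falls in between.

```python
import math


def nfw_density(r, rho_s=1.0, r_s=1.0):
    """NFW profile: rho_s / [(r/r_s) * (1 + r/r_s)^2] (illustrative units)."""
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)


def burkert_density(r, rho_0=1.0, r_0=1.0):
    """Burkert profile: rho_0 / [(1 + r/r_0) * (1 + (r/r_0)^2)] (illustrative units)."""
    x = r / r_0
    return rho_0 / ((1.0 + x) * (1.0 + x * x))


def log_slope(density, r, h=1e-4):
    """Numerical d ln(rho) / d ln(r) via a central difference in log radius."""
    upper = math.log(density(r * math.exp(h)))
    lower = math.log(density(r * math.exp(-h)))
    return (upper - lower) / (2.0 * h)


inner_nfw = log_slope(nfw_density, 1e-3)          # close to -1 (cusp)
inner_burkert = log_slope(burkert_density, 1e-3)  # close to 0 (core)
outer_nfw = log_slope(nfw_density, 1e3)           # close to -3 at large radii

print(f"NFW inner slope:     {inner_nfw:.3f}")
print(f"Burkert inner slope: {inner_burkert:.3f}")
print(f"NFW outer slope:     {outer_nfw:.3f}")
```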

As an aside, some of these authors were also authors of a recent paper claiming that the data do not disclose a fundamental acceleration scale, which is simply shoddy work (see critiques spelling out the flaws in their analysis here and here). So a grain of salt may be required when looking at their data analysis and conclusions.

On the other hand, a 2017 paper by many of the same authors confirms, as others have found, that the NFW inferred dark matter halo shape is a poor fit to more than 75% of galaxies and an indifferent fit to the remainder, even in large galaxies where the usual justifications for deviations from the NFW shaped halo expected for collisionless dark matter are weaker. Its abstract is as follows, emphasis added:
We develop and apply new techniques in order to uncover galaxy rotation curves (RC) systematics. Considering that an ideal dark matter (DM) profile should yield RCs that have no bias towards any particular radius, we find that the Burkert DM profile satisfies the test, while the Navarro-Frenk-White (NFW) profile has a tendency of better fitting the region between one and two disc scale lengths than the inner disc scale length region. Our sample indicates that this behaviour happens to more than 75% of the galaxies fitted with an NFW halo. Also, this tendency does not weaken by considering "large" galaxies, for instance those with M* ≳ 10^10 M⊙. Besides the tests on the homogeneity of the fits, we also use a sample of 62 galaxies of diverse types to perform tests on the quality of the overall fit of each galaxy, and to search for correlations with stellar mass, gas mass and the disc scale length. In particular, we find that only 13 galaxies are better fitted by the NFW halo; and that even for the galaxies with M* ≳ 10^10 M⊙ the Burkert profile either fits as good as, or better than, the NFW profile. This result is relevant since different baryonic effects important for the smaller galaxies, like supernova feedback and dynamical friction from baryonic clumps, indicate that at such large stellar masses the NFW profile should be preferred over the Burkert profile. Hence, our results either suggest a new baryonic effect or a change of the dark matter physics.
A 2016 paper by one of the authors and a co-author summarizes the small scale issues of ΛCDM:
The ΛCDM model, or concordance cosmology, as it is often called, is a paradigm at its maturity. It is clearly able to describe the universe at large scale, even if some issues remain open, such as the cosmological constant problem, the small-scale problems in galaxy formation, or the unexplained anomalies in the CMB. ΛCDM clearly shows difficulty at small scales, which could be related to our scant understanding, from the nature of dark matter to that of gravity; or to the role of baryon physics, which is not well understood and implemented in simulation codes or in semi-analytic models. At this stage, it is of fundamental importance to understand whether the problems encountered by the ΛCDM model are a sign of its limits or a sign of our failures in getting the finer details right. In the present paper, we will review the small-scale problems of the ΛCDM model, and we will discuss the proposed solutions and to what extent they are able to give us a theory accurately describing the phenomena in the complete range of scale of the observed universe.

Tuesday, August 7, 2018

Musings On General Relativity Degrees of Freedom

The Einstein Field Equations in four dimensions consist of ten equations, each corresponding to an independent element of a symmetric tensor. One can show that, given certain identities, these leave just two propagating degrees of freedom.

This matches the number of degrees of freedom of a massless spin-2 graviton, which is also two.
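
The counting behind this claim can be sketched as follows. This is the standard textbook bookkeeping rather than a derivation: the symmetric metric has ten independent components, four are removed by the freedom to choose coordinates, and four more are fixed by constraint equations (which is where the contracted Bianchi identity comes in). The second line is the same count in D spacetime dimensions.

```latex
\begin{align*}
\underbrace{10}_{\text{independent components of } g_{\mu\nu}}
 \;-\; \underbrace{4}_{\text{coordinate (gauge) freedom}}
 \;-\; \underbrace{4}_{\text{constraints, cf. } \nabla_\mu G^{\mu\nu} = 0}
 \;&=\; 2, \\
\frac{D(D+1)}{2} - 2D \;=\; \frac{D(D-3)}{2}
 \quad&\Rightarrow\quad 2 \ \text{ for } D = 4 .
\end{align*}
```

On the particle side, a massless spin-2 field has only the two helicity states ±2, which is the "two" being compared.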

But, is this correspondence sensible?

The stress-energy tensor, which the Einstein Field Equations treat as the source of the tensor on the left hand side from which gravitational effects are derived, has the following sixteen components (only ten of which are independent, because the tensor is symmetric):

1. Mass (strictly speaking, energy density): one component.
2. Energy flux (including electromagnetic energy flux) in three spatial dimensions.
3. Linear momentum density in three spatial dimensions.
4. Pressure in three spatial dimensions.
5. Shear stress in the six off-diagonal spatial components.
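
For reference, the conventional textbook layout of the stress-energy tensor can be sketched as a 4×4 matrix. The symbols are my own labels, in units with c = 1: ε is the energy density, S_i the energy flux, g_i the momentum density, p_i the pressures, and σ_ij the shear stresses.

```latex
T^{\mu\nu} =
\begin{pmatrix}
\varepsilon & S_x & S_y & S_z \\
g_x & p_x & \sigma_{xy} & \sigma_{xz} \\
g_y & \sigma_{yx} & p_y & \sigma_{yz} \\
g_z & \sigma_{zx} & \sigma_{zy} & p_z
\end{pmatrix},
\qquad
T^{\mu\nu} = T^{\nu\mu}
\;\Rightarrow\;
g_i = S_i, \quad \sigma_{ij} = \sigma_{ji} .
```

Symmetry leaves 1 + 3 + 3 + 3 = 10 independent entries out of the sixteen: one energy density, three energy flux (equivalently momentum density) components, three pressures, and three independent shear stresses.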

This made all sorts of sense in the early 1900s when those were the only sources of mass and energy known.

But, a century later, this is weird and we know better. 

The Einstein Field Equations completely disregard both the strong force and the weak force, which we now believe to be fundamental. Meanwhile, we no longer think of pressure or shear stress as fundamental quantities, so what are they doing in a fundamental law of Nature?

For that matter, we usually think of electromagnetic flux in terms of photons at a fundamental level, not as two interrelated classical fields (a distinction with a difference without which, for example, transistors wouldn't work).

Also, does it continue to make sense to formulate this with a hydrodynamic conception of the flow of matter and energy, when all of the other comparable laws of Nature are formulated in terms of point masses, which makes a certain amount of sense when you have a Standard Model made up of point masses?

Viewed in this light, why should a basically fundamental quantity, the degrees of freedom of a spin-2 massless graviton, have any correspondence to the Einstein Field Equations which omit entirely reference to some things that we think of as fundamental that gravitate, while elevating other quantities that we otherwise don't think of as fundamental to fundamental status?

And, if the components of the Einstein Field Equations seem rather arbitrary by modern standards, why should we assume that they really correspond to a spin-2 massless graviton?

Monday, August 6, 2018

The Genetics Of The Modern Pygmy Population Of Flores

The island of Flores, which is the source of archaic hominin remains for a species colloquially known as "Hobbits", also has a modern pygmy population that is about half a meter taller than the hobbits. They live in Rampasasa, Flores, Indonesia, which is right next to the Liang Bua cave where Homo floresiensis remains were found.

Analysis of their genomes reveals that they cluster genetically with the Lebbo people of Borneo, previously discussed at this blog, who are a mix of Southeast Asian ancestry and Papuan ancestry, and the Mamanwa of the Philippines. The pygmies of Flores have Denisovan ancestry proportionate only to their Papuan ancestry, with no signs of enhanced Denisovan ancestry from additional Flores-based gene flow.

More On Ancient Australian Astronomy

As the abstract makes clear, this paper corroborates previous results, which were reported on in a December 8, 2017 post at this blog.
Recently, a widely publicized claim has been made that the Aboriginal Australians discovered the variability of the red star Betelgeuse in the modern Orion, plus the variability of two other prominent red stars: Aldebaran and Antares. This result has excited the usual healthy skepticism, with questions about whether any untrained peoples can discover the variability and whether such a discovery is likely to be placed into lore and transmitted for long periods of time. 
Here, I am offering an independent evaluation, based on broad experience with naked-eye sky viewing and astro-history. I find that it is easy for inexperienced observers to detect the variability of Betelgeuse over its range in brightness from V = 0.0 to V = 1.3, for example in noticing from season-to-season that the star varies from significantly brighter than Procyon to being greatly fainter than Procyon. Further, indigenous peoples in the Southern Hemisphere inevitably kept watch on the prominent red star, so it is inevitable that the variability of Betelgeuse was discovered many times over during the last 65 millennia. 
The processes of placing this discovery into a cultural context (in this case, put into morality stories) and the faithful transmission for many millennia is confidently known for the Aboriginal Australians in particular. So this shows that the whole claim for a changing Betelgeuse in the Aboriginal Australian lore is both plausible and likely. Given that the discovery and transmission is easily possible, the real proof is that the Aboriginal lore gives an unambiguous statement that these stars do indeed vary in brightness, as collected by many ethnographers over a century ago from many Aboriginal groups. So I strongly conclude that the Aboriginal Australians could and did discover the variability of Betelgeuse, Aldebaran, and Antares.
Bradley E. Schaefer, "Yes, Aboriginal Australians Can and Did Discover the Variability of Betelgeuse" (June 26, 2018).  Journal reference: 21(1) Journal of Astronomical History and Heritage 7-12 (2018).
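
To get a feel for how large the quoted range of V = 0.0 to V = 1.3 is, the magnitude difference can be converted into a brightness (flux) ratio with the standard Pogson relation, in which a difference of 5 magnitudes is a factor of 100 in flux. This is just a minimal sketch; the function name is my own.

```python
def brightness_ratio(mag_faint, mag_bright):
    """Flux ratio (bright / faint) from Pogson's relation: 10 ** (0.4 * delta_m)."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))


# Range quoted in the abstract for Betelgeuse: V = 0.0 (brightest) to V = 1.3 (faintest).
ratio = brightness_ratio(1.3, 0.0)
print(f"Brightest / faintest flux ratio: {ratio:.2f}")
```

A change of more than a factor of three in brightness is consistent with the abstract's claim that the variability is easy to notice with the naked eye.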

Saturday, August 4, 2018

Harappan (a.k.a. Meluhha) Was Neither Dravidian Nor Munda

I agree entirely with Razib Khan's case that the Harappan language of the Indus River Valley people was neither a Dravidian nor a Munda language. (There is also overwhelming evidence that it was not Indo-European.)

UPDATED August 9, 2018

Michael Witzel has described the linguistic substrate in Rig Vedic Sanskrit, evidenced by about 300 words, with the unfortunate appellation "Para-Munda", which is misleading at best, although he does reference the historically correct name for the language of the Indus Valley Civilization's people, Meluhha, which we know from Sumerian accounts of its IVC expatriate neighborhood that captured the name phonetically.

I am very comfortable in concluding, however, that he is right that Meluhha was not a member of the Dravidian language family or the Indo-European language family.

To the extent that there are any commonalities between the Meluhha language and the Munda languages of India, they are probably either mere coincidences or derive from a Meluhha adstrate/substrate or Meluhha borrowings into the Munda languages.

I am also inclined to think that Witzel's timeline for Indo-Aryan migration into South Asia is probably a bit late. I tend to see the evidence as favoring a date closer to 1950 BCE (based upon evidence from Cemetery H of cremation, upon traces of new metallurgy technologies and upon historical climate factors), while he tends to favor a date as late as 1300 BCE.

On the other hand, there is something to Witzel's inference, based upon the kind of vocabulary that ended up having substrate influences, that Meluhha's substrate influences on early Vedic Sanskrit arose at a post-collapse phase of Harappan civilization and reflect rural agricultural village life rather than the grand urban civilization, involved in long distance international trade, that preceded it in the Indus River Valley.

His conclusion that Harappan seals were probably not a full-fledged alphabet or written language and were instead a system of logos or symbols (albeit with some very limited additional information in a small character set) is probably accurate, however.

I also tend to agree that there is some merit in Witzel's claim that the Vedic religion was predominantly Indo-European rather than having a great deal of continuity with the Harappan religion, although I would be inclined to think that he underestimates the extent to which indigenous Harappan and Dravidian religious elements may have subsequently resurfaced and influenced the Hindu religious tradition that evolved and emerged from the early Vedic religion as recounted in the Rig Veda.