Thursday, July 20, 2017

Understanding The Lightest Scalar Mesons

The introduction to a new paper on the photo-production of the two lightest scalar mesons (the f0(500) and the f0(980)) explains the still unsolved mystery of these composite hadrons:
Understanding the structure of low-lying scalar mesons has been one of the most challenging issues in hadronic physics. Their internal structure is still under debate. That the f0(500) scalar meson, which is also known as σ, is not an ordinary meson consisting of a quark and an anti-quark is more or less in consensus. Recent studies suggest that these scalar mesons may belong to the flavor SU(3) non-qq̄ nonet (see reviews [1, 2], a “note on scalar mesons below 2 GeV” in Ref. [3], and references therein. A recent review provides also various information on the structure of the scalar meson [4], including a historical background of the σ meson). The f0(500) is also interpreted as one of the glueballs or gluonia, mixed with the q̄q state [5–7], though this idea is criticized because the same analysis is rather difficult to be applied to explaining the strange scalar meson K*0(800) or κ, which is also considered as a member of the nonet. The f0(500) is often regarded as a tetraquark state in a broad sense [8]. The f0(500) as a tetraquark state has a multiple meaning: It can be described as a diquark-antidiquark correlated state [9, 10], q̄qqq̄ state [11], or correlated 2π state [12, 13] arising from ππ scattering. This non-q̄q feature was employed in various theoretical approaches such as QCD sum rules [14], effective Lagrangians [15], and lattice QCD [16–18].
The scalar mesons were also extensively studied phenomenologically. There are two scalar-isoscalar mesons (I^G(J^PC) = 0^+(0^++)) below 1 GeV, that is, the lowest-lying f0(500) (or σ) and the first excited f0(980). Both the f0(500) and the f0(980) exist in ππ scattering and their pole positions were investigated based on many different processes, for example, such as πN → ππN reactions [19–21], Kl4 decay [22, 23], D → 3π [24, 25], J/ψ → ωππ [26], ψ(2S) → π+π−J/ψ [27], γγ → ππ [28], pp scattering [29], and so on (for details, we refer to Refs. [3, 4]). While the mass and the width of the f0(980) are more or less known to be mf0 = 990±20 MeV and Γ = 40−100 MeV, those of f0(500) are still far from consensus (see Ref. [3]). The upper bound of the f0(500) mass is given in the large Nc limit in terms of the Gasser-Leutwyler low-energy constant [30], which suggests that the f0 mass is quite possibly smaller than 700 MeV.

Wednesday, July 19, 2017

Humans Were Present In Australia 65,000 Years Ago.

A new article at Nature definitively dates evidence of human occupation of Australia to 65,000 years ago.

This has many implications. These include the following:

* It puts a latest possible date on an Out of Africa migration. This tends to disprove the hypothesis, advanced by many, that an Out of Africa migration that failed ca. 75,000 years BP was followed by a second Out of Africa migration that stuck ca. 50,000 years BP. The recent trend has been to attribute all early evidence of modern humans outside of Africa (such as Altai Neanderthal DNA showing admixture with modern humans ca. 100,000 years BP and archaeological traces in the Levant, the interior of Arabia and South Asia) to a stillborn Out of Africa migration.

* It puts a latest possible date on Denisovan admixture with the ancestors of Australian Aboriginal people who, together with Papuans and Negrito populations from the Philippines, have significant levels of Denisovan admixture today.

* It tends to make the megafauna extinction that followed the arrival of modern humans in Australia less immediate, with as much as a 20,000 year gap between first contact and the extinction of some Australian megafauna species.

* It puts a latest possible date on behavioral modernity in anatomically modern humans. Almost all of the kinds of evidence used to put the start of the Upper Paleolithic era at around 50,000 years ago have subsequently been found at dates closer to 70,000 years ago.

Tuesday, July 18, 2017

How Extreme Were Caste Founder Events In South Asia?

Endogamy, estimated to have lasted about 2,000 years, has produced genetically distinguishable differences between the varna, which are the several main layers of the Indian caste system (with broad regional and linguistic divides as well). It has also produced distinctive levels of recessive genetic disease risk in a significant number of jati, which are far more specific sub-castes with more specific occupational and ethnic associations within a varna. That risk is associated with endogamy in a population with a small founding population (not true inbreeding).

Razib emphasizes the fact that this is inconsistent with the notion that caste only becomes significant around the time of the British colonial period as one leading account suggests. My own priors, based on what I was taught in the relevant classes in college, meanwhile, are for endogamy to have arisen not long after the Bronze Age, so 2,000 years seems young to me.
The more than 1.5 billion people who live in South Asia are correctly viewed not as a single large population, but as many small endogamous groups. We assembled genome-wide data from over 2,800 individuals from over 260 distinct South Asian groups. We identify 81 unique groups, of which 14 have estimated census sizes of more than a million, that descend from founder events more extreme than those in Ashkenazi Jews and Finns, both of which have high rates of recessive disease due to founder events. We identify multiple examples of recessive diseases in South Asia that are the result of such founder events. This study highlights an under-appreciated opportunity for reducing disease burden among South Asians through the discovery of and testing for recessive disease genes.
Nathan Joel Nakatsuka, et al., "The promise of disease gene discovery in South Asia" (June 6, 2017) doi:

The paper estimates that there are 4,600 populations in South Asia which are defined by endogamy and comparable to the 260 groups examined in this study.

I have also rarely seen a paper with so little body text. It relegates almost all of its detailed conclusions to supplementary materials. There is almost no discussion of the populations studied from an interdisciplinary perspective, and there is almost no effort to see a larger context and to root these findings in a meaningful narrative that can be fit to historical reality as a cross-check to confirm that the findings are robust.

Another recent paper takes on South Asian population structure as its primary focus:
India represents an intricate tapestry of population substructure shaped by geography, language, culture and social stratification operating in concert. To date, no study has attempted to model and evaluate how these evolutionary forces have interacted to shape the patterns of genetic diversity within India. Geography has been shown to closely correlate with genetic structure in other parts of the world. However, the strict endogamy imposed by the Indian caste system, and the large number of spoken languages add further levels of complexity. 
We merged all publicly available data from the Indian subcontinent into a data set of 835 individuals across 48,373 SNPs from 84 well-defined groups. Bringing together geography, sociolinguistics and genetics, we developed COGG (Correlation Optimization of Genetics and Geodemographics) in order to build a model that optimally explains the observed population genetic sub-structure. 
We find that shared language rather than geography or social structure has been the most powerful force in creating paths of gene flow within India. Further investigating the origins of Indian substructure, we create population genetic networks across Eurasia. We observe two major corridors towards mainland India; one through the Northwestern and another through the Northeastern frontier with the Uygur population acting as a bridge across the two routes. Importantly, network, ADMIXTURE analysis and f3 statistics support a far northern path connecting Europe to Siberia and gene flow from Siberia and Mongolia towards Central Asia and India.
Aritra Bose, et al., "Dissecting Population Substructure in India via Correlation Optimization of Genetics and Geodemographics" bioRxiv (July 17, 2017) doi:

Monday, July 17, 2017

Facebook Traffic

I appear to be getting a lot of hits at this blog from links on Facebook posts recently, probably mostly to a couple of moderately old posts at this blog, probably including one linking New World and Altaic genetics, but my limited analytics make it hard to tell what Facebook posts are linking to posts on this blog. 

If you are aware of Facebook links to this blog, I'd appreciate a note in a comment with a link to the source post, just out of simple curiosity (I certainly wouldn't ask anyone to take down a post).

A Review Of The Altaic Language Hypothesis

Linguist George Starostin published an Oxford Research Encyclopedia entry on the Altaic Language hypothesis in April of 2016, which is a good quick introduction to the subject from one of its leading proponents. In the barest nutshell (from the beginning of this entry):
“Altaic” is a common term applied by linguists to a number of language families, spread across Central Asia and the Far East and sharing a large, most likely non-coincidental, number of structural and morphemic similarities. 
At the onset of Altaic studies, these similarities were ascribed to the one-time existence of an ancestral language—“Proto-Altaic,” from which all these families are descended; circumstantial evidence and glottochronological calculations tentatively date this language to some time around the 6th–7th millennium bc, and suggest Southern Siberia or adjacent territories (hence the name “Altaic”) as the original homeland of its speakers. 
However, since the mid-20th century the dominant view in historical linguistics has shifted to that of an “Altaic Sprachbund” (diffusion area), implying that the families in question have not sprung from a common source, but rather have acquired their similarities over a long period of mutual linguistic contact. 
The bulk of “Altaic” has traditionally included such uncontroversial families as Turkic, Mongolic, and Manchu-Tungusic; additionally, Japanese (Japonic) and Korean are also frequently seen as potential members of the larger Altaic family (the entire five branches are sometimes referred to as “Macro-Altaic”).
Basically, the winds of scholarship seem to be drifting from a "hard Altaic" position, in which the member languages emerged tree-like from a common proto-language, to a "soft Altaic" position that sees the similarities between members of the language family as possibly due to borrowings between geographically adjacent language families.

A New Paper On Asian Negrito Genetics

Studies that don't involve ancient DNA, like this one, have become passé, but they still provide some useful insights.
Human presence in Southeast Asia dates back to at least 40,000 years ago, when the current islands formed a continental shelf called Sundaland. In the Philippine Islands, Peninsular Malaysia and Andaman Islands, there exist indigenous groups collectively called Negritos whose ancestry can be traced to the ‘First Sundaland People’. 
To understand the relationship between these Negrito groups and their demographic histories, we generated genome wide Single Nucleotide Polymorphism (SNP) data in the Philippine Negritos and compared them with existing data from other populations. 
Phylogenetic tree analyses show that Negritos are basal to other East and Southeast Asians, and that they diverged from West Eurasians at least 38,000 years ago. We also found relatively high traces of Denisovan admixture in the Philippine Negritos, but not in the Malaysian and Andamanese groups, suggesting independent introgression and/or parallel losses involving Denisovan introgressed regions. Shared genetic loci between all three Negrito groups could be related to skin pigmentation, height, facial morphology and malarial resistance. These results show the unique status of Negrito groups as descended from the First Sundaland People.
Timothy A. Jinam, et al., "Discerning the origins of the Negritos, First Sundaland Peoples: deep divergence and archaic admixture" Genome Biol Evol evx118. (July 11, 2017)

PNAS also has a new paper out on the population genetics of Madagascar but it is currently closed access.

Seven New Mesolithic Scandinavian Hunter-Gatherer Genomes

A new bioRxiv preprint analyzes ancient autosomal DNA from seven Mesolithic Scandinavian hunter-gatherers.

Eurogenes does an excellent job of hitting the high points of the paper, which models Mesolithic Scandinavian hunter-gatherers (SHG) as an admixture of Western hunter-gatherers (WHG) and Eastern hunter-gatherers (EHG), with some additional variation no longer present in Europe and strong selective effects associated with Northern latitudes. In three of his particularly notable bullet points (not all of which are highlighted in the abstract to the paper), he notes that:
- EHG probably dispersed across Scandinavia in a counter-clockwise direction via an ice-free route along the Atlantic coast in what is now Norway, because SHG samples from northern and western Scandinavia show more EHG ancestry than those from eastern and southern Scandinavia
- at least 17% of the SNPs that are common in SHG are not found in present-day Europeans, suggesting that a large part of European variation has been lost since the Mesolithic 
- although it's unlikely that SHG made a significant contribution to the present-day Northern European gene pool, some gene-variants common in SHG that appear to be associated with metabolic, cardiovascular, developmental and psychological traits are carried at high frequencies by present-day Northern Europeans, especially compared to present-day Southern Europeans, probably due to strong selective pressures specific to northern latitudes in Europe
The comments at that post also make some notable observations. For example:

AWood notes that while the EHG mix is unexpected: "The WHG moved north-east from modern Germany/Poland to the Baltic/Gotland/Sweden as we've seen from other results."

MaxT notes that: "3 out 6 SHG carried alleles for EDAR gene - gene associated with "shovel-shaped teeth and hair thickness phenotype" in Asians. 

"For rs3827760, within the EDAR gene, the derived G allele is associated with shovel-shaped teeth and hair thickness phenotype in East Asians. In the novel SHGs in this study, only the ancestral A allele is present (SF12 is homozygote AA). The derived variant was reported in three of the six Motala SHGs which are younger than most other SHGs in this study. It is clear that the variant was present among SHGs, and it is possible that it has a continuous (but varying) distribution from Scandinavia to East Asia during the Mesolithic, and that the very low sample size of EHGs has failed to pick up the variant. It is also possible that the derived rs3827760 variant was brought to Scandinavia by migration in the Late Mesolithic, perhaps related to the specific Motala group."

Only population who could have brought this is EHG-like population coming from Russia, more sampling for EHG will solve this over time."

A tweeted quote quantifies the atypical EHG ancestry distribution: 55% in the Northwest v. 35% in Eastern and South-Central Scandinavia.

There are indications in the data based upon an effective population size measure that these groups had a common source population ca. 50,000 to 70,000 years ago, in the early Upper Paleolithic era that coincides with a possible Out of Africa (or perhaps Out of Arabia) event (as the existing paradigm would predict).

The paper and its abstract are as follows (the paragraph breaks and emphasis are mine):
Scandinavia was one of the last geographic areas in Europe to become habitable for humans after the last glaciation. However, the origin(s) of the first colonizers and their migration routes remain unclear. 
We sequenced the genomes, up to 57x coverage, of seven hunter-gatherers excavated across Scandinavia and dated to 9,500-6,000 years before present. Surprisingly, among the Scandinavian Mesolithic individuals, the genetic data display an east-west genetic gradient that opposes the pattern seen in other parts of Mesolithic Europe. This result suggests that Scandinavia was initially colonized following two different routes: one from the south, the other from the northeast. The latter followed the ice-free Norwegian north Atlantic coast, along which novel and advanced pressure-blade stone-tool techniques may have spread. These two groups met and mixed in Scandinavia, creating a genetically diverse population, which shows patterns of genetic adaptation to high latitude environments. These adaptations include high frequencies of low pigmentation variants and a gene-region associated with physical performance, which shows strong continuity into modern-day northern Europeans. 
Finally, we were able to compute a 3D facial reconstruction of a Mesolithic woman from her high-coverage genome, giving a glimpse into an individual's physical appearance in the Mesolithic.
Günther et al., Genomics of Mesolithic Scandinavia reveal colonization routes and high-latitude adaptation, bioRxiv (July 17, 2017), doi:

Life On Earth-Like Planets Is Extremely Resilient

Astrophysical events that kill off many species of life on Earth or another Earth-like planet do happen. We have had several on Earth in the last 4 billion years. 

But astrophysical events so severe that they would sterilize Earth or an Earth-like planet, killing off all life of all kinds on it, are extremely rare according to a new study of the topic.
Much attention has been given in the literature to the effects of astrophysical events on human and land-based life. However, little has been discussed on the resilience of life itself. Here we instead explore the statistics of events that completely sterilise an Earth-like planet with planet radii in the range 0.5−1.5R Earth and temperatures of ∼300K, eradicating all forms of life. 
We consider the relative likelihood of complete global sterilisation events from three astrophysical sources -- supernovae, gamma-ray bursts, large asteroid impacts, and passing-by stars. To assess such probabilities we consider what cataclysmic event could lead to the annihilation of not just human life, but also extremophiles, through the boiling of all water in Earth's oceans. 
Surprisingly we find that although human life is somewhat fragile to nearby events, the resilience of Ecdysozoa such as Milnesium tardigradum renders global sterilisation an unlikely event.
David Sloan, Rafael Alves Batista and Abraham Loeb, "The Resilience of Life to Astrophysical Events" (July 13, 2017).

The Irish-South Slavic Connection

There are a great many words in Irish Gaelic that seem to have a stronger etymological connection to Slavic (and in particular South Slavic) words than they do to words in other Indo-European languages (many other such links are noted at the same blog). There are archaeological and oral history suggestions of these links as well.

Conventional linguistic family trees don't suggest such a connection. There is almost surely an important lesson about Western European pre-history hidden in there somewhere, but a really articulate story, providing either a direction of transmission or, better yet, a historical narrative of this connection, is lacking.

My intuition is that the pre-Slavic language of the region may have been pre-proto-Celtic, that a significant migration from this region may have reached its terminus in Ireland, and that this substrate may have influenced the adoption of the Slavic family of languages in the region. Notably, frontier regions often end up having more pure transmission of the source language and culture of their founding population than intermediate ones, which would help explain why Irish Gaelic might have more pure connections to South Slavic than, for example, Welsh.

Thursday, July 13, 2017

Working With Multiple Margins of Error

There are a couple of tasks that come up when you are looking at experimental measurements with margins of error that can be solved with simple equations that are nonetheless not widely known by people who have only taken introductory statistics. This post is mostly based upon a question and answer about the subject at physics stack exchange.

I intentionally omit a derivation of these formulas and a discussion of the limitations on when these formulas apply. In the real world, some of these limitations do not hold and the formulas below are merely good approximations.

I used these formulas to produce, for example, the results reported in my recent post about new measurements of the Higgs boson mass with different margins of error in studies by the independent ATLAS and CMS experiments, which were not combined in the source. A truly correct analysis would probe more deeply into the sources of systematic error in the underlying measurements conducted by each experiment, since some of those systematic errors may be perfectly correlated with each other because both ATLAS and CMS use the same apparatus, the Large Hadron Collider, to conduct their experiments. But the impact of that more rigorous analysis is slight.

A more serious systemic issue, which practicing scientists in the area ignore except qualitatively, is that there is pretty good accumulated empirical evidence that the errors in the kinds of measurements that are assumed to be Gaussian are, in fact, non-Gaussian and have fatter tails than a Gaussian distribution would suggest. It is ignored because accounting for it would make the calculations much harder and wouldn't necessarily provide useful information, since we don't know the truly correct calculation.

Thus, while the standard method of computing standard errors in experimental measurements in high energy physics that assumes a Gaussian error distribution is a valid way to determine the degree of error in one experiment relative to another in a physical unit free manner, the absolute value of the likelihood that the true result is more than X sigma away from the measured value based upon a Gaussian distribution is systematically understated in an amount that is particularly large for high values of sigma. This is why physicists consider only a 5 sigma result to be a true discovery, even though something close to a 3 sigma result ought to be sufficiently certain if error were really distributed in a Gaussian manner.
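To put rough numbers on the 3 sigma versus 5 sigma comparison, here is a minimal sketch of the tail probabilities that a purely Gaussian error distribution would imply. The function name is my own, and the two-sided convention (a deviation in either direction) is an assumption; particle physicists often quote one-sided values, which are half as large.

```python
import math

def gaussian_tail_probability(n_sigma):
    """Two-sided probability that a Gaussian error exceeds n_sigma standard deviations."""
    # For a standard normal variable Z, P(|Z| > n) = erfc(n / sqrt(2)).
    return math.erfc(n_sigma / math.sqrt(2.0))

# Print the implied chance of a fluctuation at least this large.
for n in (1, 2, 3, 5):
    print(f"{n} sigma: {gaussian_tail_probability(n):.3g}")
```

If real-world errors have fatter tails than this, the true probabilities at 3 sigma and beyond are larger than the ones this sketch prints.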

1. Combining different kinds of error in the same measurement to obtain a total margin of error.

You will often see an experimental measurement in a research paper reported in the form:

Measurement (k) +/- Statistical Sampling Error (Δk1) +/- Systematic Error (Δk2) +/- Theoretical Error (Δk3), where Δk1, Δk2 and Δk3 are in the same units as k and set forth the magnitude of one standard deviation of error of the type described, assuming that the error is distributed on a Gaussian basis (i.e. that it follows a "normal distribution").

But, often what you need to know is Measurement (k) +/- Combined Error (Δk).

How do you do this?

You square each of the independent sources of error, add these squares together, and take the square root, which gives you the total combined error from all sources. For different types of error Δk1 and Δk2 the formula is:

Δk = √((Δk1*Δk1) + (Δk2*Δk2))

Usually, this formula will result in a total margin of error which is a bit more than the largest source of error, but is much less than the sum of the different margins of error. Intuitively, this makes sense, because having multiple possible independent sources of error should increase the total margin of error, as this formula does, but not linearly (i.e. by just adding up the potential sources of error) because error from one source will often be partially cancelled out by an error in the opposite direction from another independent source.
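The square, add, and take the square root rule described above can be sketched in a few lines of Python. The function name and the example numbers are hypothetical, chosen only for illustration, and the sketch assumes the error sources really are independent.

```python
import math

def combine_in_quadrature(*errors):
    """Combine independent one-standard-deviation errors into one total error."""
    # Square each error, add the squares, and take the square root.
    return math.sqrt(sum(e * e for e in errors))

# Hypothetical measurement: 125.0 +/- 0.3 (statistical) +/- 0.4 (systematic).
total = combine_in_quadrature(0.3, 0.4)
print(f"125.0 +/- {total:.2f}")
```

In this made-up case the combined error is larger than the biggest single source (0.4) but smaller than the simple sum (0.7), as the paragraph above says it should be.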

2. Combining two independent measurements with different margins of error.

Suppose that more than one experiment independently measures the same quantity and each experiment gets a result. The first paper measures a physical constant k, to be k1 +/- Δk1. The second paper measures the physical constant k, in the same physical units, to be k2 +/- Δk2.

The best estimate of the true value of k given this information is calculated using an error weighted mean, weighting by the reciprocals of the squares of the respective individual uncertainty values. An accurate measurement must contribute more to the best value than an inaccurate measurement.

Thus, the formula is that for X=((1/(Δk1*Δk1))*k1)+((1/(Δk2*Δk2))*k2)

and Y=(1/(Δk1*Δk1))+(1/(Δk2*Δk2)), the error weighted mean value of k = X/Y.
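As a minimal sketch of the error weighted mean, the following Python weights each measurement by the reciprocal of its squared uncertainty. The function name and the sample values are hypothetical, and the sketch assumes independent measurements with Gaussian errors.

```python
def error_weighted_mean(measurements):
    """Return the error weighted mean of (value, error) pairs."""
    # X: sum of each value times the reciprocal of its squared error.
    x = sum(value / (error * error) for value, error in measurements)
    # Y: sum of the reciprocals of the squared errors (the weights).
    y = sum(1.0 / (error * error) for _, error in measurements)
    return x / y

# Hypothetical results from two papers: 10.0 +/- 1.0 and 12.0 +/- 2.0.
k = error_weighted_mean([(10.0, 1.0), (12.0, 2.0)])
print(f"k = {k:.2f}")
```

The result lands closer to 10.0 than to 12.0, because the first hypothetical measurement is the more accurate of the two and so carries four times the weight.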

3. Combining margins of error from two combined independent measurements.

Suppose that more than one experiment independently measures the same quantity and each experiment gets a result. The first paper measures a physical constant k, to be k1 +/- Δk1. The second paper measures the physical constant k, in the same physical units, to be k2 +/- Δk2.

What is the combined margin of error Δk for the combined error weighted mean measurement k?
Intuitively, a very uncertain value must make little contribution. The uncertainty in k must always be less than or equal to the smallest of the individual uncertainties. Also, multiple, equally accurate measurements must decrease uncertainty.
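One standard result that has all of these properties is the inverse-variance combination, in which the combined uncertainty is the reciprocal of the square root of the sum of the reciprocal squared uncertainties. The sketch below assumes that formula along with independent Gaussian errors; the function name and sample numbers are hypothetical.

```python
import math

def combined_uncertainty(errors):
    """Uncertainty of an error weighted mean of independent measurements."""
    # Sum the weights 1/error^2, then take 1/sqrt of that sum.
    weight_sum = sum(1.0 / (e * e) for e in errors)
    return 1.0 / math.sqrt(weight_sum)

# A precise measurement combined with a very uncertain one barely improves.
print(combined_uncertainty([1.0, 10.0]))
# Two equally accurate measurements shrink the error by a factor of sqrt(2).
print(combined_uncertainty([1.0, 1.0]))
```

Under these assumptions the combined uncertainty can never exceed the smallest individual uncertainty, and every additional measurement, however imprecise, reduces it at least slightly.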