Monday, November 28, 2022

Hmong-Mien Historical Population Genetics

It will take me a while to really process this paper, but it is interesting. It suggests a surprisingly recent population history for today's Hmong-Mien people. 

The big-picture linguistic context of the region, and how the Hmong-Mien have been understood to fit into it so far, is discussed in a previous post at this blog.

The linguistic, historical, and subsistent uniqueness of Hmong-Mien (HM) speakers offers a wonderful opportunity to investigate how these factors impact the genetic structure. Nevertheless, the genetic differentiation among HM-speaking populations and the formation process behind are far from well characterized in previous studies. Here, we generated genome-wide data from 67 Yao ethnicity samples and analyzed them together with published data, particularly by leveraging haplotype-based methods. 
We identify that the fine-scale genetic substructure of HM-speaking populations corresponds better to linguistic classification than to geography, while the parallel of serial founder events and language differentiations can be found in West Hmongic speakers. Multiple lines of evidence indicate that ~500-year-old GaoHuaHua individuals are most closely related to West Hmongic-speaking Bunu. The excessive level of the genetic bottleneck of HM speakers, especially Bunu, is in agreement with their long-term practice of slash-and-burn agriculture. The inferred admixture dates in most of the HM-speaking populations overlap the reign of the Ming dynasty (1368-1644 CE). Besides the common genetic origin of HM speakers, their external ancestry majorly comes from neighboring Han Chinese and Kra-Dai speakers in South China.
Conclusively, our analysis reveals the recent isolation and admixture events that contribute to the fine-scale genetic formation of present-day HM-speaking populations underrepresented in previous studies.
Zi-Yang Xia, Xingcai Chen, Chuan-Chao Wang, Qiongying Deng, "Reconstructing the formation of Hmong-Mien genetic fine-structure" bioRxiv (November 24, 2022).

Friday, November 25, 2022


Dark matter models are no better than MOND at describing the Milky Way in the face of detailed observations, despite having more free parameters. Moffat's modified gravity doesn't work as well, despite its broader range of applicability.
Aiming at discriminating different gravitational potential models of the Milky Way, we perform tests based on the kinematic data powered by the Gaia DR2 astrometry, over a large range of (R,z) locations. Invoking the complete form of Jeans equations that admit three integrals of motion, we use the independent R- and z-directional equations as two discriminators (TR and Tz). We apply the formula for spatial distributions of radial and vertical velocity dispersions proposed by Binney et al., and successfully extend it to azimuthal components, σθ(R,z) and Vθ(R,z); the analytic form avoids the numerical artifacts caused by numerical differentiation in Jeans-equations calculation given the limited spatial resolutions of observations, and more importantly reduces the impact of kinematic substructures in the Galactic disk. 
It turns out that whereas the current kinematic data are able to reject Moffat's Modified Gravity (let alone the Newtonian baryon-only model), Milgrom's MOND is still not rejected. In fact, both the carefully calibrated fiducial model invoking a spherical dark matter (DM) halo and MOND are equally consistent with the data at almost all spatial locations (except that probably both have respective problems at low-|z| locations), no matter which a tracer population or which a meaningful density profile is used. Because there is no free parameter at all in the quasi-linear MOND model we use, and the baryonic parameters are actually fine-tuned in the DM context, such an effective equivalence is surprising, and might be calling forth a transcending synthesis of the two paradigms.
Yongda Zhu, et al., "How Close Dark Matter Halos and MOND Are to Each Other: Three-Dimensional Tests Based on Gaia DR2" arXiv:2211.13153 (November 23, 2022) (accepted for publication in MNRAS).
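
For readers who want the equations behind the two discriminators, the standard steady-state, axisymmetric Jeans equations in cylindrical coordinates are sketched below in textbook form. The notation is an assumption made for illustration: ν is the tracer number density, Φ is the gravitational potential, and angle brackets denote velocity moments. The abstract does not give the paper's exact definitions of TR and Tz, which may be arranged or normalized differently.

```latex
\begin{align}
% R-direction Jeans equation
\frac{\partial\left(\nu\langle v_R^2\rangle\right)}{\partial R}
  + \frac{\partial\left(\nu\langle v_R v_z\rangle\right)}{\partial z}
  + \nu\left(\frac{\langle v_R^2\rangle - \langle v_\theta^2\rangle}{R}
  + \frac{\partial\Phi}{\partial R}\right) &= 0 \\
% z-direction Jeans equation
\frac{1}{R}\frac{\partial\left(R\,\nu\langle v_R v_z\rangle\right)}{\partial R}
  + \frac{\partial\left(\nu\langle v_z^2\rangle\right)}{\partial z}
  + \nu\frac{\partial\Phi}{\partial z} &= 0 \\
% azimuthal second moment in terms of dispersion and mean rotation
\langle v_\theta^2\rangle &= \sigma_\theta^2 + V_\theta^2
\end{align}
```

In a test of this kind, the kinematic terms come from the Gaia data, while each candidate gravity model (dark matter halo, MOND, or Moffat's modified gravity) supplies its own potential gradients. A model is disfavored where the two sides fail to balance.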

Tuesday, November 22, 2022

Resolving The Tevatron W Boson Mass Anomaly While Saving Face

A group reworking old Tevatron collider measurements of the W boson mass announced a result in April of 2022 that was anomalously high. It was so high that it is basically inconsistent with all other measurements of the W boson mass (including previous Tevatron measurements) and with a global Standard Model fit.

Now, a collaboration between Tevatron scientists and Large Hadron Collider (LHC) scientists has published a paper that explains, as obliquely as possible so as to save face for the Tevatron scientists, why their previous announcement was too high. Everyone except desperate theorists has basically known this since the outset, even if many of them didn't say so out loud. The new effort allows the embarrassing debacle of the initial results announcement to be salvaged to provide something of value to science.

The body text of the conference paper, however, suggests that the first step will only reduce the Tevatron W boson mass measurement by about 10 MeV, which is still too little, although it helps ease the tension, and there is apparently a second installment of adjustments still to come.
We present the current status of the W-boson mass averaging project, an ongoing effort aimed at combining Tevatron and LHC measurements. Methods are presented to accurately evaluate the effect of PDFs and other modelling variations on existing measurements. Based on this approach, the measurements can be corrected to a common modelling reference and to the same PDFs, and subsequently combined accounting for PDF correlations in a quantitative way. We discuss the combination procedure, and the impact of improvements in the theoretical description of W-boson production and decay.
Simone Amoroso (on behalf of the Tevatron/LHC W-mass combination Working Group), "Status of the W-boson mass averaging project" arXiv:2211.12365 (November 22, 2022) (Contribution to the proceedings of the 41st International Conference on High Energy Physics (ICHEP2022), 6-13 July 2022, Bologna, Italy).
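
To put the roughly 10 MeV adjustment in perspective, here is a minimal sketch of the tension arithmetic. The input numbers are not stated in this post. They are the values recalled from the April 2022 CDF publication (a measurement of 80,433.5 ± 9.4 MeV against a Standard Model expectation of 80,357 ± 6 MeV) and should be treated as assumptions. The function name is hypothetical.

```python
import math

# Assumed inputs (recalled from the April 2022 CDF result; not given in this post).
CDF_MASS_MEV = 80433.5   # Tevatron (CDF II) W boson mass
CDF_ERR_MEV = 9.4        # its total uncertainty
SM_MASS_MEV = 80357.0    # Standard Model expectation
SM_ERR_MEV = 6.0         # its uncertainty
SHIFT_MEV = 10.0         # approximate downward correction discussed above


def tension_sigma(measured, measured_err, predicted, predicted_err):
    """Difference between two values in units of their combined uncertainty.

    The two uncertainties are treated as independent and added in quadrature.
    """
    combined_err = math.hypot(measured_err, predicted_err)
    return (measured - predicted) / combined_err


before = tension_sigma(CDF_MASS_MEV, CDF_ERR_MEV, SM_MASS_MEV, SM_ERR_MEV)
after = tension_sigma(CDF_MASS_MEV - SHIFT_MEV, CDF_ERR_MEV, SM_MASS_MEV, SM_ERR_MEV)

print(f"tension before shift: {before:.1f} sigma")
print(f"tension after shift:  {after:.1f} sigma")
```

Under these assumptions, the shift moves the discrepancy from roughly seven standard deviations to roughly six, which is why a 10 MeV correction alone does not resolve the anomaly.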

A Lepton Flavor Universality Violation Fix?

A new paper proposes a solution to the apparent violation of lepton flavor universality that has been observed in semi-leptonic B meson decays. It refines the Standard Model prediction with a new calculation methodology and improves the precision of one of the physical constants that is important to the calculation, rather than resorting to beyond the Standard Model physics.

While this is a preliminary result, it is a promising suggestion that one of the most intractable apparent experimental anomalies in high energy physics may be possible to resolve. 

Monday, November 21, 2022

The Toda and Kov People Of India

In anthropology, outlier groups of people are often critical to understanding the big picture.

The Toda people (previous coverage here) are one such people. They are an ethnicity with a current census population of about two thousand, living in a hill region of the Indian state of Tamil Nadu, on the far southern tip of the subcontinent. The following excerpts from Wikipedia summarize them:
Toda people are a Dravidian ethnic group who live in the Indian states of Tamil Nadu. Before the 18th century and British colonisation, the Toda coexisted locally with other ethnic communities, including the Kota, Badaga and Kurumba, in a loose caste-like society, in which the Toda were the top ranking. During the 20th century, the Toda population has hovered in the range 700 to 900. Although an insignificant fraction of the large population of India, since the early 19th century the Toda have attracted "a most disproportionate amount of attention because of their ethnological aberrancy" and "their unlikeness to their neighbours in appearance, manners, and customs". The study of their culture by anthropologists and linguists proved significant in developing the fields of social anthropology and ethnomusicology.

The Toda traditionally live in settlements called mund, consisting of three to seven small thatched houses, constructed in the shape of half-barrels and located across the slopes of the pasture, on which they keep domestic buffalo. Their economy was pastoral, based on the buffalo, whose dairy products they traded with neighbouring peoples of the Nilgiri Hills. Toda religion features the sacred buffalo; consequently, rituals are performed for all dairy activities as well as for the ordination of dairymen-priests. The religious and funerary rites provide the social context in which complex poetic songs about the cult of the buffalo are composed and chanted.

Fraternal polyandry in traditional Toda society was fairly common; however, this practice has now been totally abandoned, as has female infanticide. 
During the last quarter of the 20th century, some Toda pasture land was lost due to outsiders using it for agriculture or afforestation by the State Government of Tamil Nadu. This has threatened to undermine Toda culture by greatly diminishing the buffalo herds. Since the early 21st century, Toda society and culture have been the focus of an international effort at culturally sensitive environmental restoration. The Toda lands are now a part of The Nilgiri Biosphere Reserve, a UNESCO-designated International Biosphere Reserve; their territory is declared UNESCO World Heritage Site. . . .

Physical anthropologist Egon Freiherr von Eickstedt, in 1934, described the Todas as being of the North Indid division of the Indid type, therefore connecting them to ancient Proto-Dravidian population. DNA studies in the early 21st century showed that the Toda and Kota share genes which distinguish them from the other Nilgiri Hill Tribes. . . .

Their sole occupation is cattle-herding and dairy-work. Holy dairies are built to store the buffalo milk.

They once practiced fraternal polyandry, a practice in which a woman marries all the brothers of a family, but no longer do so. All the children of such marriages were deemed to descend from the eldest brother. The ratio of females to males is about three to five. The culture historically practiced female infanticide. In the Toda tribe, families arrange contracted child marriage for couples. . . .
The forced interaction with other peoples with technology has caused a lot of changes in the lifestyle of the Todas. They used to be primarily a pastoral people but now, they are increasingly venturing into agriculture and other occupations. They used to be strict vegetarians but now, some people eat meat. . . .

Registrar of Geographical Indication gave GI status for this unique embroidery, a practice which has been passed on to generations. 
Genetically, the Toda people are the closest modern match to the pre-steppe invasion Harappan people of the Indus River Valley, as we can confirm by comparing Harappan ancient DNA to modern Toda genomes. Razib Khan notes at the link that the Toda population "resembles the IVC population in having less AASI but not too much (if any) steppe."

Their lack of steppe ancestry could just be coincidental. But their low level of AASI (ancient ancestral South Indian) ancestry at the far southern tip of India, together with "their unlikeness to their neighbours in appearance, manners, and customs", strongly suggests that they are long-distance migrants who might very well be basically unadmixed descendants of the Harappans.

Hinduism is generous and inclusive of considerable internal diversity, but the traditional religion of the Toda people nonetheless appears to be atypical of conventional Hindu practice. About 88% of ethnic Toda people follow the Toda religion, about 11% have converted to Christianity (about 220 people), about 1% are Muslim (about 20 people), and one Toda individual identifies as Buddhist.
According to the Toda Religion, the goddess Teikirshy and her brother first created the sacred buffalo and then the first Toda man and woman. Many rites feature the buffalo, its milk and other products form the basis of their diet.

The Toda religion exalted high-class men as holy milkmen, giving them sacred status as priests of the holy dairy. According to Sir James Frazer in 1922 (see quote below from Golden Bough), the holy milkman was prohibited from walking across bridges while in office. He had to ford rivers by foot, or by swimming. The people are prohibited from wearing shoes or any type of foot covering.

Toda temples are distinct from Hindu temples and are constructed in a circular pit lined with stones. They are similar in appearance and construction to Toda huts. Women are not allowed to enter or go close to these huts that are designated as temples.

Linguistically, their Dravidian dialect is atypical, which could indicate a substrate Harappan language influence:

The Toda language is a member of the Dravidian family. The language is typologically aberrant and phonologically difficult. Linguists have classified Toda (along with its neighbour Kota) as a member of the southern subgroup of the historical family proto-South-Dravidian. It split off from South Dravidian, after Kannada, but before Malayalam. In modern linguistic terms, the aberration of Toda results from a disproportionately high number of syntactic and morphological rules, of both early and recent derivation, which are not found in the other South Dravidian languages (save Kota, to a small extent.)

Their language is noted for its many fricatives and trills.

Other relict populations of the IVC (Indus Valley Civilization) may have blended into the regions of India to which they migrated before jati endogamy took hold, while this singular outlier population may have managed to endure with only the minimal accommodation to the local language and the synthesized Hindu religious practices necessary to persist. Places that are highlands and economically marginal for other purposes, like the Nilgiris mountain range in Tamil Nadu, India, are common places to find relict peoples.

It is notable that while they were historically pastoralists, they herded buffalo (true buffalo, unlike the American bison inaccurately called buffalo), rather than the horses, cows, sheep and goats that steppe herders may have herded. Again, this shows a lack of connection to the Indo-Aryans, who were known for their horses and cows.

Fraternal polyandry and infanticide are classic cultural reactions to persistent poverty in a society where inheriting assets is critical. This could flow from their self-imposed isolation from modernity and the larger culture. 

Their kindred Kota people are also quite distinct, although they have embraced the outside world more fully than the Toda people:

Kotas, also Kothar or Kov by self-designation, are an ethnic group who are indigenous to the Nilgiris mountain range in Tamil Nadu, India. They are one of the many tribal people indigenous to the region. (Others are the Todas, Irulas and Kurumbas). Todas and Kotas have been subject to intense anthropological, linguistic and genetic analysis since the early 19th century. Study of Todas and Kotas has also been influential in the development of the field of anthropology. Numerically Kotas have always been a small group not exceeding 1,500 individuals spread over seven villages for the last 160 years. They have maintained a lifestyle as a jack of all trades such as potters, agriculturalist, leather workers, carpenters, and black smiths and as musicians for other groups. 
Since the British colonial period they have availed themselves of educational facilities and have improved their socio-economic status and no longer depend on the traditional services provided to make a living. Some anthropologists have considered them to be a specialized caste as opposed to a tribe or an ethnic group.

Kotas have their own unique language that belongs to the Dravidian language family but diverged from South Dravidian sub family at some time in BCE. Their language was studied in detail by Murray Barnson Emeneau, a pioneer in the field of Dravidian linguistics. Their social institutions were distinct from mainstream Indian cultural norms and had some similarities to Todas and other tribal people in neighbouring Kerala and the prominent Nair caste. It was informed by a fraternal polygyny where possible. Kota religion was unlike Hinduism and believed in non-anthropomorphic male deities and a female deity. 
Since the 1940s, many mainstream Hindu deities also have been adopted into the Kota pantheon and temples of Tamil style have been built to accommodate their worship. They’ve had specialized groups of priests to propitiate their deities on behalf of the group.
. . .

The Indian government considers them to be a scheduled tribe (ST), a designation reserved for indigenous tribal communities throughout India that are usually at a lower socio-economic status than mainstream society. They also have a special status as a Primitive Tribal Group (PTG) based on certain socio economic and demographic indicators. But the Kotas are a relatively successful group that makes its living as agriculturalists, doctors, post masters and availing themselves of any government and private sector employment.
. . .

Although many theories have been put forth as to the origins of Kotas and Todas, none have been confirmed as factual. What linguists and anthropologists agree is that ancestors of both Kotas and Todas may have entered the Nilgiris massif from what is today Kerala or Karnataka in centuries BCE and developed in isolation from the rest of the society. According to F. Metz, a missionary, Kotas had a tradition that alluded to them coming over from a place called Kollimale in Karnatakas. They seem to have displaced the previous Kurumba inhabitants from the higher altitudes to lower altitude infested with malarial mosquitoes.

The Kota tribe shows the maternal haplogroup M frequency of 97%, which is one of the highest in India. Within M haplogroup, M2 lineages are common amongst Dravidian speaking populations of South India. They also demonstrate very low admixture rate from other neighbouring groups. The studies on the hematological parameters of Kota showed that they have a low MCV (mean corpuscular volume) even though there is no trace of anemia. However it is also suggested that there is a probability for G6PD deficiency among this group.

At some point in their history they developed a symbiotic economic relationship with their buffalo rearing Toda neighbours as service providers in return for Todas' buffalo milk, hides, ghee, and meat. They also had a trading and ritual relationship with Kurumba and Irula neighbours who were cultivators and hunter-gatherers. They specifically used the Kurumbas as their sorcerers and as village guards. Origin myth of Kotas postulates that Kotas, Todas, and Kurumbas were all placed in the Nilagiris area at once as brothers by the Kota god. This symbiotic relationship survived until disturbed by the British colonial officers starting in the early 19th century.

Since the early 19th century, missionaries, British bureaucrats, anthropologists and linguists of both Western and Indian kind have spent an enormous amount of time studying the different ethnic and tribal groups; of all, the Todas were the most studied, followed by Kotas. Other groups such as Irulas and various groups of Kurumbars were least studied. The study of the ethnic groups of Nilgiris was instrumental in the early development of the field of Anthropology. Although most groups lived in peace with each other and had developed a symbiotic relationship, taboos and cultural practices were developed to maintain social distance. According to F. Metz, as the original settlers of the highland, Kurumbars were subject to continuous violence including occasional massacres by the Todas and Badagas. According to Kota informants, they had supplied battle drums during periods of war.

Kotas were observed to be domiciled in seven relatively large nuclear villages intercepted between Toda and Badaga settlements. Six villages were within the Nilgiris district in Tamil Nadu and the seventh one in the Wayanad district in Kerala. Kota villages are known as Kokkal in their language and as Kotagiri by outsiders. Each village had three exogamous clans of similar name. Each clan settled in a street called Kerr or Kerri. Clan members were prohibited from marrying within each within the same village but could from the same clan or any other clan from non-native village. The relationship between similarly named clans was unknown and no social hierarchy was evident amongst the inter and intra village clans.

Women had a greater say in choosing their marriage partners than in any mainstream Indian villages and also helped out in many economic activities. They had the right to divorce. They were also exclusively engaged in making pottery. According to early western observers, unlike Toda women, who were friendly towards visitors to their villages, Kota women maintained their distance from outsiders. Wives of Kota priests played equally important ritual role in religious functions. Women who became possessed to flute music are called Pembacol and were consulted during important village decisions. Women also had specialized roles associated with cultivation, domestic chores and social functions.

Unlike Todas, Kotas ate meat and were in good physical condition when met by early anthropologists. Their traditional food is a type of Italian millet known as vatamk. Vatamk is served in almost all ceremonial occasions but rice is the preferred daily food. It is served with udk, a sambar like item made of locally available pulses, vegetables and tamarind juice. Beef is seldom eaten but eggs, chicken and mutton are consumed, when available, along with locally grown vegetables and beans.

Prior to colonial era penetration of the Kota area there was very little if not no formal relationships between neighbouring political entities and Kota villages. It is assumed that political entities from Karnataka made forays into the highland but their control was not permanent. Kotas are the head of the nilgiri. There was no formal differentiation that existed within and outside the village level. Each village had an expectation to meet. The village of Thichgad is famous for its women's song and dance, the funerals are well known in Menad, and the Kamatraya festival and instrumental music are famous in Kolmel. Kota village is led by a village headman called Goethgarn. The Goethgarn from Menad was the head of all the seven villages. Whenever a dispute arose, the Goethgarn will call a meeting known as a kuttim with the village elders and decide the solutions. Within a village, the Goethga-rn and elders decide when festivals are to be held and how to solve problems in the community. Although regular justice is handled through the Indian judicial system, local decisions of Kota cultural requirements are handled by the village kut. 
Kota religion and culture revolved around the smithy. Their major deities are A-yno-r also divided into big or Doda-ynor or small or kuna-yno-r, a father god and Amno-r or mother goddess. Father god is also called Kamati-cvara or Kamatra-ya in some villages. Although there were two male gods, there was only one version of the female goddess. These gods were worshiped in the form of Silver disks at specific temples. Historian Joyti Sahi and L. Dumont notes these deities may have roots in proto-Shaivism and proto-Shaktism.

Kotas had a number of religious festivals during the colonial precontact and immediately after the colonial contact period. It ranged from praying to their rain god Kannatra-ya or titular deity Kamatra-ya. During the seed sowing ceremony, they used to build a forge and a furnace within the main temple and smiths would make avocatory item like axes or gold ornaments to the deity. The head priest mundika-no-n and headmen gotga-rn usually belong to the particular family (kuyt) and it was passed from father to son. Mundika-no-n is assisted by the te-rka-ran, through whom the god (so-ym) communicates with the people while being possessed. Te-rka-ran could come from any family but mundika-no-n comes from a specific family in a village.

Kota funeral rites consist of two ceremonies. The first one is called Green funeral and concerns cremation of the body. The second ceremony is called a dry funeral and involves exhumation of buried bones from the first ceremony, followed by sacrifice of semi wild buffaloes. The second funeral is no longer practiced widely. Kota temples are unique in being run by a variety of people not restricted to original priestly families.

Kotas speak the Kota language or Ko-v Ma-nt and it is closely related to Toda language. It was identified as an independent Dravidian language in as early as 1870s by Robert Caldwell. It diverged from the common South Dravidian stock in years BCE. Kota language speakers became isolated and the language developed certain unique characteristics that were studied in detail and by the prominent Dravidian linguists Murray Emeneau. It is informed by maintenance of words that shows strong affinity to Archaic Tamil. According to Emeneau, Kotas have been living isolated since their separation from the mainstream Tamil speakers in years BCE. Emeneau dates the split to the 2nd century BCE as terminus ante quem ("limit before which") and was unable to date the period earlier than this in which the split may have happened, but it happened after Kannada split from the common Tamil–Kannada stock.

All Nilgiri languages show areal influence and show affinity to each other. Since the reorganization of linguistic states in India, most Kota children study in Tamil at schools and are bilingual in Kota and Tamil. Previously Kotas were multilingual in Kota, Toda and Badaga languages.

Thursday, November 17, 2022

There Are Hundreds Of Chinese Languages

Chinese is a language family, rather than a single language, that is often written with a common set of non-phonetic logographic characters.

How Many Languages Are There in China?

At least three hundred. . . .

Two suggestions from me:

1. The speakers are still calling all these languages / varieties "dialects", which means they must be mutually intelligible and / or at least closely related, which is far, far from the truth. . . .

A lect (often a regional or minority language) as part of a group or family of languages, especially if they are viewed as a single language, or if contrasted with a standardized idiom that is considered the 'true' form of the language (for example, Cantonese as contrasted with Mandarin Chinese or Bavarian as contrasted with Standard German). . . .

Why must these local lects be stigmatized?

Let's use the neutral, linguistically exact term "topolect", calqued on Sinitic "fāngyán 方言" (NOT "dialect").

2. Instead of referring to all of the many languages of China as "Chinese", I propose that they be divided into two main groups, "Sinitic" and "non-Sinitic". Sinitic includes all the languages of China that fall under the designation Hànyǔ 漢語, not just Mandarin, and especially not just Modern Standard Mandarin (MSM) (Pǔtōnghuà 普通話). Non-Sinitic would include Mongolic, Tungusic, Turkic, and the scores of other languages that are unrelated to Hànyǔ 漢語 ("Sinitic").

China is not a nation of linguistic uniformity, as is often falsely alleged.

From Language Log (the commentary above discusses a 13:54 YouTube video that is linked there). The author is a linguist with at least one specialty in Chinese languages.

The Wikipedia redirect from Hànyǔ above contains the following centering introduction (emphasis mine):

Chinese (simplified Chinese: 汉语; traditional Chinese: 漢語; pinyin: Hànyǔ or also 中文; Zhōngwén, especially for the written language) is a group of languages that form the Sinitic branch of the Sino-Tibetan languages family, spoken by the ethnic Han Chinese majority and many minority ethnic groups in Greater China. About 1.3 billion people (or approximately 16% of the world's population) speak a variety of Chinese as their first language.

The spoken varieties of Chinese are usually considered by native speakers to be variants of a single language. However, their lack of mutual intelligibility means they are sometimes considered separate languages in a family. Investigation of the historical relationships among the varieties of Chinese is ongoing. Currently, most classifications posit 7 to 13 main regional groups based on phonetic developments from Middle Chinese, of which the most spoken by far is Mandarin (with about 800 million speakers, or 66%), followed by Min (75 million, e.g. Southern Min), Wu (74 million, e.g. Shanghainese), and Yue (68 million, e.g. Cantonese). These branches are unintelligible to each other, and many of their subgroups are unintelligible with the other varieties within the same branch (e.g. Southern Min). There are, however, transitional areas where varieties from different branches share enough features for some limited intelligibility, including New Xiang with Southwest Mandarin, Xuanzhou Wu with Lower Yangtze Mandarin, Jin with Central Plains Mandarin and certain divergent dialects of Hakka with Gan (though these are unintelligible with mainstream Hakka). All varieties of Chinese are tonal to at least some degree, and are largely analytic.
What does it mean to say that Chinese is largely analytic? This is something that Chinese has in common to a great extent with English.
In linguistic typology, an analytic language is a language that conveys relationships between words in sentences primarily by way of helper words (particles, prepositions, etc.) and word order, as opposed to using inflections (changing the form of a word to convey its role in the sentence). 
For example, the English-language phrase "The cat chases the ball" conveys the fact that the cat is acting on the ball analytically via word order. 
This can be contrasted to synthetic languages, which rely heavily on inflections to convey word relationships (e.g., the phrases "The cat chases the ball" and "The cat chased the ball" convey different time frames via changing the form of the word chase). 
Most languages are not purely analytic, but many rely primarily on analytic syntax.

When a language is "tonal", the pitch (tone) with which the syllables of a word are pronounced determines which word is being used, as opposed, for example, to merely expressing feelings about the content of what is being said, or merely serving as punctuation (e.g. the rising intonation at the end of a sentence in English to indicate that it is a question rather than a statement).
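
A minimal sketch of what lexical tone means in practice, using the familiar Mandarin syllable "ma" as the example (the example words and the function name are illustrative additions, not taken from the sources quoted in this post). The same consonant and vowel map to different words depending only on the tone.

```python
# Lexicon keyed by (syllable, tone number); Mandarin tones 1-4 on the syllable "ma".
LEXICON = {
    ("ma", 1): ("mā", "妈", "mother"),
    ("ma", 2): ("má", "麻", "hemp"),
    ("ma", 3): ("mǎ", "马", "horse"),
    ("ma", 4): ("mà", "骂", "to scold"),
}


def gloss(syllable, tone):
    """Return the English gloss for a syllable pronounced with a given tone."""
    _pinyin, _character, meaning = LEXICON[(syllable, tone)]
    return meaning


# The segments (consonant + vowel) never change; only the tone does.
for (syllable, tone), (pinyin, character, meaning) in sorted(LEXICON.items()):
    print(f"{syllable} + tone {tone} -> {pinyin} {character} = {meaning}")
```

In a non-tonal language such as English, changing the pitch on a word like "horse" can make it sound like a question, but it never turns it into a different word.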

Tonality tends to be an areal feature, common to all languages in a geographic area regardless of the ancestral languages from which they are derived, and it tends to be more common in more tropical areas.

Tones also tend to substitute for phonemes. Tonal languages tend to have fewer distinct consonants and vowels than non-tonal languages do, with tones making up for that shortage of sounds from which to make words.

As Wikipedia further explains:

Jerry Norman estimated that there are hundreds of mutually unintelligible varieties of Chinese. These varieties form a dialect continuum, in which differences in speech generally become more pronounced as distances increase, though the rate of change varies immensely. Generally, mountainous South China exhibits more linguistic diversity than the North China Plain. In parts of South China, a major city's dialect may only be marginally intelligible to close neighbors. For instance, Wuzhou is about 190 kilometres (120 mi) upstream from Guangzhou, but the Yue variety spoken there is more like that of Guangzhou than is that of Taishan, 95 kilometres (60 mi) southwest of Guangzhou and separated from it by several rivers. In parts of Fujian the speech of neighboring counties or even villages may be mutually unintelligible.

Until the late 20th century, Chinese emigrants to Southeast Asia and North America came from southeast coastal areas, where Min, Hakka, and Yue dialects are spoken. The vast majority of Chinese immigrants to North America up to the mid-20th century spoke the Taishan dialect, from a small coastal area southwest of Guangzhou.


Local varieties of Chinese are conventionally classified into seven dialect groups, largely on the basis of the different evolution of Middle Chinese voiced initials:
  • Mandarin
  • Wu
  • Gan
  • Xiang
  • Min
  • Hakka
  • Yue
The classification of Li Rong, which is used in the Language Atlas of China (1987), distinguishes three further groups:
  • Jin, previously included in Mandarin.
  • Huizhou, previously included in Wu.
  • Pinghua, previously included in Yue.
Some varieties remain unclassified, including Danzhou dialect (spoken in Danzhou, on Hainan Island), Waxianghua (spoken in western Hunan) and Shaozhou Tuhua (spoken in northern Guangdong).

A fair amount of the differentiation of the Chinese languages is attested in the historic era, and much of what is not attested dates to late Chinese prehistory:

At the end of the 2nd millennium BC, a form of Chinese was spoken in a compact area around the lower Wei River and middle Yellow River. From there it expanded eastwards across the North China Plain to Shandong and then south into the valley of the Yangtze River and beyond to the hills of south China. As the language spread, it replaced formerly dominant languages in those areas, and regional differences grew. Simultaneously, especially in periods of political unity, there was a tendency to promote a central standard to facilitate communication between people from different regions.

The first evidence of dialectal variation is found in texts from the Spring and Autumn period (771–476 BC). At that time, the Zhou royal domain, though no longer politically powerful, still defined standard speech. The Fangyan (early 1st century AD) is devoted to differences in vocabulary between regions. Commentaries from the Eastern Han period (first two centuries AD) contain much evidence of local differences in pronunciation. The Qieyun rhyme book (601 AD) noted wide variation in pronunciation between regions, and set out to define a standard pronunciation for reading the classics. This standard, known as Middle Chinese, is believed to be a diasystem based on the reading traditions of northern and southern capitals.

The North China Plain provided few barriers to migration, leading to relative linguistic homogeneity over a wide area in northern China. In contrast, the mountains and rivers of southern China have spawned the other six major groups of Chinese languages, with great internal diversity, particularly in Fujian.

Standard Chinese

Until the mid-20th century, most Chinese people spoke only their local language. As a practical measure, officials of the Ming and Qing dynasties carried out the administration of the empire using a common language based on Mandarin varieties, known as Guānhuà (官話/官话, literally 'speech of officials'). Knowledge of this language was thus essential for an official career, but it was never formally defined.

In the early years of the Republic of China, Literary Chinese was replaced as the written standard by written vernacular Chinese, which was based on northern dialects. In the 1930s a standard national language was adopted, with its pronunciation based on the Beijing dialect, but with vocabulary also drawn from other Mandarin varieties. It is the official spoken language of the People's Republic of China and of the Republic of China (Taiwan), and one of the official languages of Singapore.

Standard Mandarin Chinese now dominates public life in mainland China, and is much more widely studied than any other variety of Chinese. Outside China and Taiwan, the only varieties of Chinese commonly taught in university courses are Standard Mandarin and Cantonese.

Comparison with Europe

Chinese has been likened to the Romance languages of Europe, the modern descendants of Latin. In both cases, the ancestral language was spread by imperial expansion over substrate languages 2000 years ago, by the Qin–Han empire in China and the Roman Empire in Europe. In Western Europe, Medieval Latin remained the standard for scholarly and administrative writing for centuries, and influenced local varieties, as did Literary Chinese in China. In both Europe and China, local forms of speech diverged from the written standard and from each other, producing extensive dialect continua, with widely separated varieties being mutually unintelligible.

On the other hand, there are major differences. In China, political unity was restored in the late 6th century (by the Sui dynasty) and has persisted (with relatively brief interludes of division) until the present day. Meanwhile, Europe remained fragmented and developed numerous independent states. Vernacular writing, facilitated by the alphabet, supplanted Latin, and these states developed their own standard languages. In China, however, Literary Chinese maintained its monopoly on formal writing until the start of the 20th century.

The morphosyllabic writing, read with varying local pronunciations, continued to serve as a source of vocabulary and idioms for the local varieties. The new national standard, Vernacular Chinese, the written counterpart of spoken Standard Chinese, is also used as a literary form by literate speakers of all varieties.

Dialectologist Jerry Norman estimated that there are hundreds of mutually unintelligible varieties of Chinese. These varieties form a dialect continuum, in which differences in speech generally become more pronounced as distances increase, although there are also some sharp boundaries.

However, the rate of change in mutual intelligibility varies immensely depending on region. For example, the varieties of Mandarin spoken in all three northeastern Chinese provinces are mutually intelligible, but in the province of Fujian, where Min varieties predominate, the speech of neighbouring counties or even villages may be mutually unintelligible.

Dialect groups

Classifications of Chinese varieties in the late 19th century and early 20th century were based on impressionistic criteria. They often followed river systems, which were historically the main routes of migration and communication in southern China. The first scientific classifications, based primarily on the evolution of Middle Chinese voiced initials, were produced by Wang Li in 1936 and Li Fang-Kuei in 1937, with minor modifications by other linguists since. The conventionally accepted set of seven dialect groups first appeared in the second edition of Yuan Jiahua's dialectology handbook (1961):  
  • Mandarin: This is the group spoken in northern and southwestern China and has by far the most speakers. This group includes the Beijing dialect, which forms the basis for Standard Chinese, called Putonghua or Guoyu in Chinese, and often also translated as "Mandarin" or simply "Chinese". In addition, the Dungan language of Kyrgyzstan and Kazakhstan is a Mandarin variety written in the Cyrillic script.
  • Wu: These varieties are spoken in Shanghai, most of Zhejiang and the southern parts of Jiangsu and Anhui. The group comprises hundreds of distinct spoken forms, many of which are not mutually intelligible. The Suzhou dialect is usually taken as representative, because Shanghainese features several atypical innovations. Wu varieties are distinguished by their retention of voiced or murmured obstruent initials (stops, affricates and fricatives).
  • Gan: These varieties are spoken in Jiangxi and neighbouring areas. The Nanchang dialect is taken as representative. In the past, Gan was viewed as closely related to Hakka because of the way Middle Chinese voiced initials became voiceless aspirated initials as in Hakka, and were hence called by the umbrella term "Hakka–Gan dialects".
  • Xiang: The Xiang varieties are spoken in Hunan and southern Hubei. The New Xiang varieties, represented by the Changsha dialect, have been significantly influenced by Southwest Mandarin, whereas Old Xiang varieties, represented by the Shuangfeng dialect, retain features such as voiced initials.
  • Min: These varieties originated in the mountainous terrain of Fujian and eastern Guangdong, and form the only branch of Chinese that cannot be directly derived from Middle Chinese. It is also the most diverse, with many of the varieties used in neighbouring counties (and, in the mountains of western Fujian, even in adjacent villages) being mutually unintelligible. Early classifications divided Min into Northern and Southern subgroups, but a survey in the early 1960s found that the primary split was between inland and coastal groups. Varieties from the coastal region around Xiamen have spread to Southeast Asia, where they are known as Hokkien (named from a dialectal pronunciation of "Fujian"), and Taiwan, where they are known as Taiwanese. Other offshoots of Min are found in Hainan and the Leizhou Peninsula, with smaller communities throughout southern China.
  • Hakka: The Hakka (literally "guest families") are a group of Han Chinese living in the hills of northeastern Guangdong, southwestern Fujian and many other parts of southern China, as well as Taiwan and parts of Southeast Asia such as Singapore, Malaysia and Indonesia. The Meixian dialect is the prestige form. Most Hakka varieties retain the full complement of nasal endings, -m -n -ŋ and stop endings -p -t -k, though there is a tendency for Middle Chinese velar codas -ŋ and -k to yield dental codas -n and -t after front vowels.
  • Yue: These varieties are spoken in Guangdong, Guangxi, Hong Kong and Macau, and have been carried by immigrants to Southeast Asia and many other parts of the world. The prestige variety and by far most commonly spoken variety is Cantonese, from the city of Guangzhou (historically called "Canton"), which is also the native language of the majority in Hong Kong and Macau. Taishanese, from the coastal area of Jiangmen southwest of Guangzhou, was historically the most common Yue variety among overseas communities in the West until the late 20th century. Not all Yue varieties are mutually intelligible. Most Yue varieties retain the full complement of Middle Chinese word-final consonants (/p/, /t/, /k/, /m/, /n/ and /ŋ/) and have rich inventories of tones.
The Language Atlas of China (1987) follows a classification of Li Rong, distinguishing three further groups: 
  • Jin: These varieties, spoken in Shanxi and adjacent areas, were formerly included in Mandarin. They are distinguished by their retention of the Middle Chinese entering tone category.
  • Huizhou: The Hui dialects, spoken in southern Anhui, share different features with Wu, Gan and Mandarin, making them difficult to classify. Earlier scholars had assigned to them one or other of these groups, or to a group of their own.
  • Pinghua: These varieties are descended from the speech of the earliest Chinese migrants to Guangxi, predating the later influx of Yue and Southwest Mandarin speakers. Some linguists treat them as a mixture of Yue and Xiang.
Some varieties remain unclassified, including the Danzhou dialect of northwestern Hainan, Waxiang, spoken in a small strip of land in western Hunan, and Shaozhou Tuhua, spoken in the border regions of Guangdong, Hunan, and Guangxi. This region is an area of great linguistic diversity but has not yet been conclusively described.

Most of the vocabulary of the Bai language of Yunnan appears to be related to Chinese words, though many are clearly loans from the last few centuries. Some scholars have suggested that it represents a very early branching from Chinese, while others argue that it is a more distantly related Sino-Tibetan language overlaid with two millennia of loans.

Relationships between groups

Jerry Norman classified the traditional seven dialect groups into three larger groups: Northern (Mandarin), Central (Wu, Gan, and Xiang) and Southern (Hakka, Yue, and Min). He argued that the Southern Group is derived from a standard used in the Yangtze valley during the Han dynasty (206 BC – 220 AD), which he called Old Southern Chinese, while the Central group was transitional between the Northern and Southern groups. Some dialect boundaries, such as between Wu and Min, are particularly abrupt, while others, such as between Mandarin and Xiang or between Min and Hakka, are much less clearly defined.

Scholars account for the transitional nature of the central varieties in terms of wave models. Iwata argues that innovations have been transmitted from the north across the Huai River to the Lower Yangtze Mandarin area and from there southeast to the Wu area and westwards along the Yangtze River valley and thence to southwestern areas, leaving the hills of the southeast largely untouched.

Quantitative similarity

A 2007 study compared fifteen major urban dialects on the objective criteria of lexical similarity and regularity of sound correspondences, and subjective criteria of intelligibility and similarity. Most of these criteria show a top-level split with Northern, New Xiang, and Gan in one group and Min (samples at Fuzhou, Xiamen, Chaozhou), Hakka, and Yue in the other group. The exception was phonological regularity, where the one Gan dialect (Nanchang Gan) was in the Southern group and very close to Meixian Hakka, and the deepest phonological difference was between Wenzhounese (the southernmost Wu dialect) and all other dialects.

The study did not find clear splits within the Northern and Central areas: 
  • Changsha (New Xiang) was always within the Mandarin group. No Old Xiang dialect was in the sample.
  • Taiyuan (Jin or Shanxi) and Hankou (Wuhan, Hubei) were subjectively perceived as relatively different from other Northern dialects but were very close in mutual intelligibility. Objectively, Taiyuan had substantial phonological divergence but little lexical divergence.
  • Chengdu (Sichuan) was somewhat divergent lexically but very little on the other measures.
The two Wu dialects (Wenzhou and Suzhou) occupied an intermediate position, closer to the Northern/New Xiang/Gan group in lexical similarity and strongly closer in subjective intelligibility but closer to Min/Hakka/Yue in phonological regularity and subjective similarity, except that Wenzhou was farthest from all other dialects in phonological regularity. The two Wu dialects were close to each other in lexical similarity and subjective similarity but not in mutual intelligibility, where Suzhou was actually closer to Northern/Xiang/Gan than to Wenzhou.

In the Southern subgroup, Hakka and Yue grouped closely together on the three lexical and subjective measures but not in phonological regularity. The Min dialects showed high divergence, with Min Fuzhou (Eastern Min) grouped only weakly with the Southern Min dialects of Xiamen and Chaozhou on the two objective criteria and was actually slightly closer to Hakka and Yue on the subjective criteria. 

Local varieties from different areas of China are often mutually unintelligible, differing at least as much as different Romance languages and perhaps even as much as Indo-European languages as a whole.  
These varieties form the Sinitic branch of the Sino-Tibetan language family (with Bai sometimes being included in this grouping). Because speakers share a standard written form, and have a common cultural heritage with long periods of political unity, the varieties are popularly perceived among native speakers as variants of a single Chinese language, and this is also the official position.  
Conventional English-language usage in Chinese linguistics is to use dialect for the speech of a particular place (regardless of status) while regional groupings like Mandarin and Wu are called dialect groups.  
ISO 639-3 follows the Ethnologue in assigning language codes to eight of the top-level groups listed above (all but Min and Pinghua) and five subgroups of Min. Other linguists choose to refer to the major groupings as languages. Sinologist David Moser stated that the Chinese authorities refer to them as "dialects" as a way to reinforce China as being a single nation.

In Chinese, the term fāngyán is used for any regional subdivision of Chinese, from the speech of a village to major branches such as Mandarin and Wu. Linguists writing in Chinese often qualify the term to distinguish different levels of classification. All these terms have customarily been translated into English as dialect, a practice that has been criticized as confusing. The neologisms regionalect and topolect have been proposed as alternative renderings of fāngyán.

The only varieties usually recognized as languages in their own right are Dungan and Taz. This is mostly due to political reasons as they are spoken in the former Soviet Union and are usually not written in Han characters but in Cyrillic. Dungan is in fact a variety of Mandarin, with high although asymmetric mutual intelligibility with Standard Mandarin. 
Various mixed languages, particularly those spoken by ethnic minorities, are also referred to as languages such as Tangwang, Wutun, and E. The Taiwanese Ministry of Education uses the terms "Minnan language" and "Taiwan Minnan language".

Quote Of The Day

One red flag that a theory might be in trouble is when one has to invoke tooth fairies to preserve it. These are what the philosophers of science more properly call auxiliary hypotheses: unexpected elements that are not part of the original theory that we have been obliged to add in order to preserve it.
- Stacy McGaugh at his Triton Station blog.

Tuesday, November 15, 2022

The State Of Viable Supersymmetry Parameter Spaces

There are still plenty of high energy physics theorists clinging to supersymmetry theories (a.k.a. SUSY), in part because many theorists believe that SUSY theories are necessarily the low energy approximations of all string theory vacua. In other words, if SUSY is falsified, so is string theory in the eyes of many theorists. An abstract of a report from a group of physicists trying to make the case for the continued viability of SUSY is below.

A few observations:

1. The new anomalous CDF W boson mass measurement is disfavored even in a SUSY context. Also, like most physicists advancing BSM theories, the authors ignore the fact that the new CDF W boson mass measurement is far out of step with all of the prior W boson mass measurements around the world, which also have to be explained by any theory. The only plausible explanation is that the anomalous CDF W boson mass recalculation is wrong.

A new global electroweak fit, likewise, confirms that the anomalous CDF W boson mass recalculation is completely inconsistent with all other data.

2. SUSY needs the muon g-2 anomaly to fit the data in any low energy parameter space. But, if the BMW calculations of the Standard Model value of muon g-2 and alternative calculations of the hadronic light by light contribution to muon g-2 are correct, as increasingly appears to be the case, while the Theory Initiative "data driven" hybrid calculations of the Standard Model value of muon g-2 are wrong, then low energy SUSY simply can't be fit to the experimental data. This strongly suggests that SUSY is not "just around the corner" waiting to be discovered at a next generation higher energy particle collider.

3. Needless to say, there is no direct experimental evidence of any SUSY particles to date at the LHC or anywhere else, which means that any SUSY particles have to have masses above the electroweak mass scale it was designed to address. Direct exclusions of SUSY push its particles close to and beyond the TeV scale. No version of SUSY with parameters consistent with experimental constraints solves the hierarchy "problem" that it was originally devised to address.

4. A thermal freeze out scenario for SUSY dark matter candidates can produce the right amount of dark matter inferred in the universe in a dark matter particle paradigm. 

But the authors conveniently ignore the direct dark matter detection tests that basically rule out a lightest neutralino dark matter candidate, because the cross-sections of interaction with ordinary matter are weaker than they should be in a SUSY theory. The cross-section of interaction of the lightest neutralino can be calculated precisely from theory in SUSY theories and should be comparable to the cross-section of interaction of ordinary neutrinos. The direct dark matter detection experiments place constraints on the cross-section of interaction which are many orders of magnitude smaller over a very large mass range.

Other constraints disfavor heavy thermal freeze out SUSY WIMP dark matter particle candidates, which would, among other things, lead to more galaxy scale structure in the universe than we observe.
This is a brief overview on the low energy supersymmetry in light of current experiments including the LHC searches, the dark matter (DM) detections, the muon g-2 and the CDF II measurement of the W-boson mass. We focus on the minimal framework of supersymmetry, namely the minimal supersymmetric model (MSSM), and obtain the following conclusions: (i) The MSSM can survive all current experiments, albeit suffering from the little hierarchy problem due to the heavy stops pushed up by the LHC searches; (ii) The DM relic density can be readily achieved by the thermal freeze-out of the lightest neutralino and the null results of DM direct detections are typically driving the parameter space to the bino-like lightest neutralino region; (iii) The muon g-2 anomaly reported by FNAL and BNL can be explained at 2-sigma level, which indicates light sleptons and electroweakinos possibly accessible at the HL-LHC; (iv) The CDF II measurement of the W-boson mass can be marginally explained, but requires light stops near TeV which may soon be covered by the LHC searches.
Jin Min Yang, Pengxuan Zhu, Rui Zhu, "A brief survey of low energy supersymmetry under current experiments" arXiv:2211.06686 (November 12, 2022) (Proceedings of the LHCP 2022 conference).