Tuesday, January 29, 2013

Munda As Intrusive To India

The Origins of the Munda People

Razib Khan confirms at the Gene Expression blog, with independent DIY autosomal genetic analysis, what previous linguistic, archaeological and Y-DNA evidence had already strongly suggested: the Austro-Asiatic language subfamily of South Asia, called the Munda languages, now spoken by about 9 million people in the Northeastern part of South Asia, was intrusive to South Asia from Southeast Asia, rather than the other way around.  The Austro-Asiatic languages are best known for the Vietnamese and Khmer (Cambodian) languages.  These languages probably have their roots in the hills of southern Yunnan in China, between 4000 BCE and 2000 BCE. (As an aside, recent Y-DNA phylogeny evidence also supports the proposition that people who speak the Hmong-Mien languages are descended from the population that now speaks Austroasiatic Mon-Khmer languages.)

This is particularly notable because the speakers of the Munda languages are predominantly "tribal" peoples, and there has been a tendency to associate tribal peoples of South Asia with the most ancient layer of South Asia's population history.  But, while there may be an ancient South Asian substrate in linguistically Munda populations of South Asia that is particularly strong in the matrilineally inherited mtDNA of the Munda peoples and to a still strong extent in the autosomal DNA of these peoples, there is a clear Y-DNA and autosomal contribution from Southeast Asia.

The Austro-Asiatic languages are believed to have expanded with agriculture, and in particular with early rice farming, so their likely time of arrival can be associated with the earliest evidence for rice farming in South Asia.

According to the Wikipedia account:
The earliest remains of the grain in the Indian subcontinent have been found in the Indo-Gangetic Plain and date from 7000–6000 BC though the earliest widely accepted date for cultivated rice is placed at around 3000–2500 BC with findings in regions belonging to the Indus Valley Civilization. Perennial wild rices still grow in Assam and Nepal. It seems to have appeared around 1400 BC in southern India after its domestication in the northern plains. It then spread to all the fertile alluvial plains watered by rivers.

The "tribal" character of the current populations probably reflects language shift among populations that continued to be sedentary rice farmers (mostly to either the Indo-Aryan languages, or to the Tibeto-Burman languages, although the genetic evidence suggests that most of the latter languages were transmitted mostly demically rather than via language shift, particularly on the Y-DNA side); and language conservation among populations that "reverted" to a hunting-gathering or pastoralist means of subsistence in areas that became unsuitable for rice farming.  The less sedentary populations were less easily subjugated by the Indo-Aryans.

As discussed below, this puts the arrival of the Munda languages in India earlier than the Indo-European languages (by 1000 to 1500 years), even earlier than the Tibeto-Burman languages (by considerably more centuries), and roughly contemporaneously with the proto-language era of Dravidian immediately prior to its division into subfamilies.

The Indo-European Languages of South Asia

It is already widely agreed that the Indo-European languages of South Asia (mostly the Indo-Aryan languages, which are derived from Sanskrit, but also the Iranian languages and Nuristani languages found in Western Pakistan) are intrusive to South Asia and arrived around 2000 BCE to 1500 BCE.

Of course, the widely spoken Indo-European lingua franca of South Asia, which is English, is a legacy of colonial rule by the British, starting in the 1700s via the British East India Company and ending in 1947, when South Asia gained its independence from the United Kingdom.

The Tibeto-Burman Languages Of South Asia

Genetic evidence, cultural evidence and historical evidence likewise point to the Tibeto-Burman languages spoken on South Asia's Northern and Eastern borders as intrusive more recently than either the Austro-Asiatic languages or the Indo-European languages.  And, these languages are spoken only on the very fringe of South Asia, rather than penetrating deeply into it.

The Dravidian Languages

The case of the Dravidian languages is less settled.  With the exception of a pocket of Dravidian language speakers who speak the Brahui language of Pakistan, these languages are concentrated in the South and East of India.  Moreover, while the evidence of place names suggests that Dravidian languages were once spoken over a much wider geographic range in Southern India and the far Southeast of Pakistan than they are today (presumably displaced by Indo-Aryan languages ca. 1500 BCE in these places), per Wikipedia:

The Brahui, Kurukh and Malto have myths about external origins. The Kurukh have traditionally claimed to be from the Deccan Peninsula, more specifically Karnataka. The same tradition has existed of the Brahui. They call themselves immigrants. Many scholars hold this same view of the Brahui such as L. H. Horace Perera and M. Ratnasabapathy.   
Linguistic and genetic evidence, taken together, support an origin for Brahui sometime not long after 1000 CE, mostly via language shift rather than a demic migration of large numbers of Dravidian speakers from South Asia.

The 73 Dravidian languages are fairly closely related, and there are attested versions of source languages for two of the family's main subfamilies by the 4th and 5th centuries BCE respectively (Old Tamil ca. 300 BCE and Old Telugu ca. 400 BCE).  There was probably a single proto-Dravidian language that was ancestral to all modern Dravidian languages around 1500 BCE-2500 BCE.  This linguistically estimated date coincides with the advent of farming in Southern India, an event known as the South Indian Neolithic.

Thus, whether Dravidian was an indigenous language of Paleolithic South Indians, or was instead intrusive like India's other major languages, the particular Proto-Dravidian language that is ancestral to all of the modern Dravidian languages probably did not start to expand more than about a thousand years before the Indo-Aryans began to make their way into Southern India.

Autosomal genetics and most uniparental genetic markers point to ancestral South Indian genetic origins for the core of the linguistically Dravidian people (in some cases with an Indo-European demic infusion into the highest caste populations that underwent language shift) that is far, far older than the proto-Dravidian language. 

Any superstrate population associated with the Dravidian language must have been thin, and the only suggestive trace that I have identified as a possible proto-Dravidian marker would be Y-DNA haplogroup T, which is most common in India geographically right where proto-Dravidian should have originated, although haplogroup T is now found in populations with various linguistic affiliations.  Y-DNA haplogroup T is found in appreciable frequencies among Somalis, Egyptians and Mesopotamians, and at low frequencies throughout the early Neolithic area, but isn't particularly common in the Indus River Valley area where Y-DNA haplogroup T's sister clade, Y-DNA haplogroup L, is more common.

The linguistic origins of Dravidian are unclear.  Some of the more plausible linkages to Dravidian linguistically have been to the Elamite language, to the Uralic languages, and to the fringe members of the Niger-Congo linguistic family that show the simplifying impact of linguistic neighbors that speak languages from other language families.

The hypothesis that Proto-Dravidian was once the language of the Indus River Valley Civilization (mostly in modern Pakistan and the adjacent deserts of India which were once fertile river beds) has very little support in the modern distillation of the linguistic, genetic, and archaeological evidence.  The Indus River Valley civilization area and the Dravidian area are genetically distinct (the deep Ancestral North Indian v. Ancestral South Indian divide).  The archaeological evidence supports only a few border trading posts between the two civilizations.  There is no real evidence of a Dravidian substrate in the earliest Rig Vedic Sanskrit texts (Dravidian influence comes only later through word borrowing and areal influences).  And, the characteristic crops of the Dravidian linguistic area seem to be derived from Sahel African crops rather than from Fertile Crescent crops, as they are better adapted to the local seasons.  This is one of the reasons that Fertile Crescent agriculture didn't immediately spread to South India once it arrived in the Indus River Valley area about four thousand years before the South Indian Neolithic.  There are cultural links in addition to crops to the culture of Sahel farmers.

This also tends to disfavor the Elamo-Dravidian theory of this language's origins, since that theory generally assumes that Dravidian reached India as an extension of the Harappan culture of the Indus River Valley which was, unlike South India, adjacent to Southern Iran where the Elamite language was spoken into the early historically attested era in the region.

Bottom Line

There are a few tiny, near moribund possible exceptions: some language isolates in the Andaman Islands (which show genetic linkage to South India) and the highlands of the Himalayas, sometimes grouped into a conjectural and residual Indo-Pacific language family.

But, otherwise, all of the languages of South Asia arrived there from outside South Asia within the last five thousand years or so, with the possible exception of one geographically tiny pre-Neolithic Dravidian language family dialect if that language is indigenous rather than intrusive within the last ten thousand years or so.

The languages of Paleolithic India as of about 8000 BCE are probably entirely lost, as is the Harappan language, unless one takes the minority view that proto-Indo-European was Harappan rather than the majority view that proto-Indo-European was a language of the Pontic-Caspian steppe.  The minority view is far less crazy than it is often decried as being, but still lacks the multidisciplinary evidence to support it that the Kurgan hypothesis of Indo-European language origins does.

The main argument for Harappan as proto-Indo-European is the absence of any easily discernible substrate component to the early Sanskrit writings, against a presumed proto-Indo-European background validated with early West Eurasian languages and Tocharian.

But, this should not be misconstrued to say that the arrival of the major languages of South Asia involved demographic replacement.  Genetically, South Asia's indigenous Paleolithic peoples appear to be more strongly represented than almost anywhere else in the world, as evidenced by private Y-DNA lineages, private mtDNA lineages and the nature and distributions of South Asia's autosomal genetics in relation to those elsewhere in Asia and Europe.  The case that a majority of non-African modern humans lived in South Asia for much of the period from ca. 75,000 years ago until perhaps 30,000 years ago is quite strong.  There are clear genetic superstrate populations and there is no longer any pure ancestral South Indian population (the Andamanese probably come closest).  But, the ASI percentage of ancestry in much of India is quite high.

9 comments:

Maju said...

What most called my attention is that (always in the context of an algorithm of dubious usefulness) the Munda's main stem hangs from the main South Asian trunk, what implies that most of their genetics should be native and not imported.

Otherwise I agree that AA languages and culture, and some genetics but not all, especially not most of the mtDNA, is immigrant from the Neolithic period.

andrew said...

Point well taken. While the charts in Razib's post don't really allow for a direct comparison, they visually convey the impression that the Munda superstrate was a lot thinner than the Indo-Aryan superstrate.

terryt said...

"the Austro-Asiatic language subfamily of South Asia, called the Munda languages, now spoken by about 9 million people in the Northeastern part of South Asia were intrusive to South Asia from Southeast Asia, rather than the other way around".

Great to have that 'proved'. The Y-DNA involved is obviously O2 but interestingly the mt-DNA R-derived haplogroups R7 and R8 are both said to be especially associated with Munda-speaking populations.

"most of their genetics should be native and not imported".

But you know as well as the rest of us that languages are more mobile than are haplogroups.

"The Austro-Asiatic languages are believed to have expanded with agriculture and in particular with early rice farming agriculture, so its likely time of arrival can be associated with the earliest evidence for rice farming in South Asia".

As can the whole Y-DNA O expansion.

"Genetic evidence, cultural evidence and historical evidence likewise point to the Tibeto-Burman languages spoken in South Asia's Northern and Eastern borders as intrusive more recently than either the Austro-Asiatic languages or the Indo-European languages. And, thes languages are spoken only on the very fringe of South Asia, rather than penetrating deeply into it"

Tibeto-Burman in India looks most likely associated with Y-DNAs O3 and D.

andrew said...

"But you know as well as the rest of us that languages are more mobile than are haplogroups."

I've been brewing in my head the outlines of a post on just that issue, i.e. the relative malleability of different cultural and genetic indicators in relation to time depth.

"Tibeto-Burman in India looks most likely associated with Y-DNAs O3 and D."

If I had more time, I would have tracked down the reference to the relevant published papers. D would have long predated the existence of Tibeto-Burman but would indeed have possibly been in the mix of peoples expanding with that language. Indeed, the near total absence of D in the Han Chinese argues strongly for a lowland rather than highland urheimat for the broader Sino-Tibeto-Burman language family, with the highlands as a receiving refugia and point of dispersal rather than an origin for the family (with the high diversity there explained by the tendency of mountain populations to balkanize and form insular microenvironments).

terryt said...

"D would have been long predated the existence of Tibeto-Burman but would indeed have possibly been in the mix of peoples expanding with that language".

Exactly. It is certainly not associated with Tibeto-Burman in the Andamans for example. But I think the evidence is pretty convincing that D's spread into Northeast India was associated with the Tibeto-Burman expansion.

"the near total absence of D in the Han Chinese argues strongly from a lowland rather than highland urheimat for the Broader Sino-Tibeto-Burman language family, with the highlands as a receiving refugia and point of dispersal rather than an origin for the family"

I'm very sure you are correct, and it has been a source of constant disagreement between Maju and myself. I'm sure that Tibeto-Burman's spread is associated with the expansion of the Chinese Neolithic. In fact the whole Y-DNA O expansion is associated with the Chinese Neolithic.

andrew said...

"But I think the evidence is pretty convincing that D's spread into Northeast India was associated with the Tibeto-Burman expansion."

In my view the evidence is ambivalent on the point. The Y-DNA subhaplogroups of D in Tibet and those in India and the Andamans are all of a piece phylogenetically and relatively distinct from those in Japan, in contrast. But, the directionality of the gene migration is not at all obvious.

terryt said...

"relatively distinct from those in Japan"

Yes. D in Japan is quite distinct (D2). D1 is Tibetan. D3 is Central Asian but is present in Tibet, Tajikistan and the Pamirs. The Andamans may be yet a third haplogroup (D*) as may be the D* in Central Asia. The D in India is quite probably the same haplogroup as in The Andaman Islands, D*.

"But, the directionality of the gene migration is not at all obvious".

From the distribution of the various D haplogroups I'd say somewhere near Tibet was the obvious source. From Wikipedia to remind you:

http://en.wikipedia.org/wiki/Haplogroup_D-M174_(Y-DNA)

"This paragroup is found with high frequency among Andaman Islanders and 0%-65% in Northeast India in adivasi tribes"

'Adivasi' is a general term but most in Northeast India, where Indian D is concentrated, speak Tibeto-Burman and Austro-Asiatic languages. As far as I'm aware D has not been found in Austro-Asiatic-speaking people.

Maju said...

There's a key study on the haplotype structure of D that finds quite good support for SE Asian origin of D. I'm now at the tablet so I don't have my bookmarks at hand but Terry should know by now which is it, yet he insists on speculating for his pet hypothesis without basis...

terryt said...

"Terry should know by now which is it"

Yes. It placed D's origin somewhere around the border region between Tibet and China. Maju chose to see it as being SE Asia but as he has recently shown his knowledge of the geography of the region is sketchy at best, so we can safely ignore his belief in a SE Asian origin.