Wednesday, January 17, 2018

Ten Observations About The Harappan Language

The best evidence for the Harappan language comes from the substratum influences in Vedic (all quotes not otherwise attributed are from this link) supplemented by genetic data and archaeology in close cases. 

What can we conclude?

1. Retroflex phonemes were part of this language.

Why? 
Retroflex phonemes are now found throughout the Burushaski, Nuristani, Dravidian and Munda families. . . . They are not reconstructible for either Proto-Indo-European or Proto-Indo-Iranian, and they are also not found in Mitanni–Indo-Aryan loanwords. . . . There is a clear predominance of retroflexion in the Northwest (Nuristani, Dardic, Khotanese Saka, Burushaski), involving affricates, sibilants and even vowels (in Kalasha), compared to other parts of the subcontinent. It has been suggested that this points to the regional, northwestern origin of the phenomenon in RigVedic.
The Harappan civilization was initially predominantly in the Indus River Valley and now extinct Sarasvati River basin. When the Sarasvati River declined, the people who lived there migrated to the Ganges River basin. This is consistent with the Northwestern subcontinent scope of this feature. 

The appearance of retroflex phonemes in Dravidian and Munda languages could be an adstratum effect (i.e. a borrowing from a prestige language), since Harappan civilization was "more advanced", the core of both of these language families is not in the Northwest, and these phonemes are less common in non-Northwestern representatives of these languages.

The Northwestern distribution also suggests that Harappan linguistic influence may have coincided with a BMAC and eastern Iranian trade influence that is documented archaeologically. This is also supported by the existence of 55 loan words shared by Sanskrit and Old Iranian, and by the relevance of some of the words to the BMAC culture.

2. Substrate vocabulary topics are mostly what you would expect.

The case that there was a separate Harappan language that was a substrate to Indo-Aryan languages is supported by the number and nature of the likely substrate words in early Sanskrit.

The lion's share of Vedic substrate words are not Dravidian and instead cannot be attributed to any well attested linguistic source, hence they are probably Harappan.
Kuiper identified 383 specifically Rigvedic words as non-Indo-Aryan — roughly 4% of its vocabulary. Oberlies prefers to consider 344–358 "secure" non-Indo-European words in the Rigveda. Even if all local non-Indo-Aryan names of persons and places are subtracted from Kuiper's list, that still leaves some 211–250 "foreign" words, around 2% of the total vocabulary of the Rigveda. 
These loanwords cover local flora and fauna, agriculture and artisanship, terms of toilette, clothing and household. Dancing and music are particularly prominent, and there are some items of religion and beliefs. They only reflect village life, and not the intricate civilization of the Indus cities, befitting a post-Harappan time frame. In particular, Indo-Aryan words for plants stem in large part from other language families, especially from the now-lost substrate languages.

3. Harappan had a distinctive animate prefix k-
Mayrhofer identified a "prefixing" language as the source of many non-Indo-European words in the Rigveda, based on recurring prefixes like ka- or ki-, that have been compared by Michael Witzel to the Munda prefix k- for designation of persons, and the plural prefix ki seen in Khasi, though he notes that in Vedic, k- also applies to items merely connected with humans and animals.
This is important for a couple of reasons. 

First, prefixing languages appear in only geographically limited areas (mostly sub-Saharan Africa and island Southeast Asia with isolated examples in the Americas and in the Ket language).

Second, languages which sort nouns into categories such as "animate" are relatively uncommon. In some places these systems are called "genders" and in other places they are called "noun classes", but they seem to represent the same phenomenon.

Most of the language families that would naively seem the most plausible relatives of Harappan are not prefixing and do not have a gender or noun class system with categories like "animate", which are found in some African languages, some South Asian languages, and some Australian languages. 

I suspect that the presence of this prefix in Munda is more likely to be an adstratum effect of Harappan on Munda, or a substratum effect of a hunter-gatherer language of India, than vice versa, as this feature seems to be absent in other Austro-Asiatic languages from which Munda is primarily derived.

4. Harappan is not a Dravidian language.
There are an estimated thirty to forty Dravidian loanwords in Vedic. However, Witzel finds Dravidian loans only from the middle Rigvedic period, suggesting that linguistic contact between Indo-Aryan and Dravidian speakers only occurred as the Indo-Aryans expanded well into and beyond the Punjab. . . . 
While Dravidian languages are primarily confined to the South of India today, there is a striking exception: Brahui (which is spoken in parts of Baluchistan). . . . some Dravidian place-names are found in now Indo-Aryan regions of central India, and possibly even as far northwest as Sindh. However, it is now argued by Elfenbein that the Brahui could only have migrated to Balochistan from central India after 1000 CE, because of the lack of any older Iranian (Avestan) loanwords in Brahui. The main Iranian contributor to Brahui vocabulary, Balochi, is a western Iranian language like Kurdish, and moved to the area from the west only around 1000 CE. 
As noted above, retroflex phonemes in early Indo-Aryan cannot identify the donor language as specifically Dravidian. Krishnamurti argues the Dravidian case for other features: "Besides, the Ṛg Veda has used the gerund, not found in Avestan, with the same grammatical function as in Dravidian, as a nonfinite verb for 'incomplete' action. Ṛg Vedic language also attests the use of iti as a quotative clause complementizer." However, such features are also found in the indigenous Burushaski language of the Pamirs and cannot be attributed only to Dravidian influence on the early Rigveda.
Archaeology, and to a lesser extent genetics, also argue against Harappan and Dravidian belonging to the same language family: the cultures had only thin trade relations with each other, and there is an ANI v. ASI distinction in genetics that was probably sharper earlier on.

5. Harappan is not a Munda language.
Kuiper identified one of the donor languages to Indo-Aryan as Proto-Munda. Munda linguist Gregory D. Anderson states: "It is surprising that nothing in the way of quotations from a Munda language turned up in (the hundreds and hundreds of) Sanskrit and middle-Indic texts. There is also a surprising lack of borrowings of names of plants/animal/bird, etc. into Sanskrit (Zide and Zide 1976). Much of what has been proposed for Munda words in older Indic (e.g. Kuiper 1948) has been rejected by careful analysis. Some possible Munda names have been proposed, for example, Savara (Sora) or Khara, but ethnonymy is notoriously messy for the identification of language groups, and a single ethnonym may be adopted and used for linguistically rather different or entirely unrelated groups".
This is also an archaeological and geographic fit.

Domesticated Indian rice was consumed in the late Harappan period, and its appearance probably coincides with the arrival of the Munda people. But it would not have been present when the Harappans were adopting a language, and it would not have had such a great impact on this civilization as to result in language shift, since even then domesticated rice was not a dominant staple of the Harappan diet.

6. Harappan was not an Indo-European language.

The evidence from linguistics, genetics and archaeology that Indo-Aryan was invasive to South Asia sometime ca. 2500 BCE to 1500 BCE is pretty overwhelming. Ancient DNA results expected to be released any day now are likely to further confirm this.

7. Harappan seals probably weren't a full written language.

More on this in some other post. For a minority view that these seals were a language and were the source of the Brahmi script, see here.

8. The evidence for an expected Mesopotamian language family connection is weak, but it may have been ergative.


Harappan agriculture was derived from the Fertile Crescent Neolithic package via early Caucasian/Iranian farmers. There was also maritime trade between the Harappans and the Sumerians in the Eneolithic and Bronze Ages via the Persian Gulf.


Realistically, even if Harappan was the language of early Iranian farmers and displaced the local hunter-gatherer languages of the Indus River Valley when they arrived, ca. 6000 BCE, there were 2500 years between then and ca. 3500 BCE, when they engaged in maritime trade with Sumeria, during which a great deal of linguistic divergence could have emerged, making any language family resemblance largely unrecognizable.

Historical records of Sumerians discussing Harappan traders suggest that the Sumerians found the Harappan language to be impenetrable.

I once suspected strongly that the Harappan language was ergative, although now I have some doubts about this proposition.

One clue: "A lot of Indo-Iranian languages are partially ergative as well, with the Kurdish languages, Pashto, and Baloch". This is suggestive of an ergative substrate in the Harappan trade zone, which makes sense given that Elamite of West Asia is ergative and that the Harappans had trade ties to Sumeria and probably arrived as West Asian farmers, coming from the Fertile Crescent, where their agricultural package was developed, across Persia to the Indus River Valley. (Notably, there appears to be genetic continuity between early West Asian farmers and modern Iranian Zoroastrians.) It is possible, even likely, that this was also a feature of the language of the early Caucasian farmers even pre-Uruk.

But it may be impossible to ever know for sure, and the lack of strong linguistic connections of other kinds between Harappan and the Mesopotamian and Iranian languages muddies that inference. So does the fact that ergativity seems to be absent from the core Northwest subcontinent region, even though ergativity is known to persist as a substrate influence elsewhere, for example, in some Berber languages and in Kurdish. If Harappan were ergative, this substrate influence really should have shown up in some Indo-Aryan language of the Northwest subcontinent, and it does not.

9. Pre-Indo-European Harappan culture contributed significantly to the resulting Indo-Aryan culture.

Some general discussion of these issues is here. For example, the culinary dish "curry" was a pre-Indo-Aryan Harappan contribution to South Asian culture.

10. There was probably a single Harappan language, at least for Harappa proper.

The Indus River Valley civilization, unlike the Mesopotamian civilization, appears to have been a single polity unified by trade for much of its history, rather than a sequence of warring independent city-states. This suggests that it probably had a single language as well.

If there was a language family that included Harappan, most likely the Indus River Valley/Harappan civilization proper spoke one language, and its trading partners in eastern Iran and the BMAC may have spoken related languages in the family derived from it.

2 comments:

Ryan said...

My guess is that Harappan is not closely related to the languages of Mesopotamia or Iran. If the DNA rumours are true, they were pretty genetically distinct from their Iranian neighbours and dominated by an indigenous Y-chromosome haplogroup (L), so my guess would be their language would be similarly distinct and local.

andrew said...

@Ryan I'm starting to come around to your point of view.