Monday, May 20, 2024

Four Waves Of New World Languages?

The "four wave" idea is that there were four waves of people in the founding population, from two eras. These were in turn followed by two much later waves that aren't included in the four: a Na-Dene wave, followed much later by an Inuit wave. There is indeed genetic evidence of a two-component founding population, although no previous study has claimed sufficient evidence to make a corresponding linguistic division based on linguistic features that persisted to the present.

The abstract of a new historical linguistics paper states:
The known languages of the Americas comprise nearly half of the world's language families and a wide range of structural types, a level of diversity that required considerable time to develop. This paper proposes a model of settlement and expansion designed to integrate current linguistic analysis with other prehistoric research on the earliest episodes in the peopling of the Americas. 
Diagnostic structural features from phonology and morphology are compared across 60 North American languages chosen for coverage of geography and language families and adequacy of description. 
Frequency comparison and graphic cluster analysis are applied to assess the fit of linguistic types and families with late Pleistocene time windows when entry from Siberia to North America was possible. The linguistic evidence is consistent with two population strata defined by early coastal entries ~24,000 and ~15,000 years ago, then an inland entry stream beginning ~14,000 ff. and mixed coastal/inland ~12,000 ff. 
The dominant structural properties among the founder languages are still reflected in the modern linguistic populations. The modern linguistic geography is still shaped by the extent of glaciation during the entry windows. Structural profiles imply that two linguistically distinct and internally diverse ancient Siberian linguistic populations provided the founding American populations.
The final two paragraphs of the paper conclude:
The results are compatible with the two-origin analysis (Two Main Biological Components, or 2MBC) of Walter Neves and various colleagues (Neves & Pucciarelli, 1991; Neves et al., 2007; Hubbe et al., 2010; Hubbe et al., 2020, and several others; overviews in Rothhammer & Dillehay, 2009; Hubbe, 2015; von Cramon-Taubadel et al., 2017), which infer a distinction of early versus later populations in South and Central America from craniometric data; the early population has Australasian/Melanesian affinities while the later one has Siberian affinities. Skoglund et al., 2016 find South American genomic evidence for two populations and the same affinities. These findings are a good match for the distinction here of early versus late strata. 
Whether the physical diversity matches the linguistic diversity is not yet clear. The start times implied are too late, but this is probably not essential. The geographically patterned structural distribution is not addressed as Neves and colleagues deal only with Central and South America, and Skoglund et al., 2016 find too little data for North America. Pitblado, 2011 reviews other evidence for two origins.

Thus the modern facts speak for a structurally differentiated linguistic population in ancient Siberia. Its descendants entered through a chronologically and geographically sequenced series of openings beginning in the peak LGM, first coastal and then inland, rapidly moving well to the south and well inland. Whether the early and late South American populations continue the North American early and late strata, and whether the North American geographical patterning continues to South America, warrant further work.

The introduction explains that:
Archeological evidence shows that there were human entries to the Americas well before the end of the last glaciation (Dillehay, 1997; Dillehay et al., 2008, 2015) and quite possibly by the peak of glaciation c. 24,000 years ago. The linguistic evidence requires entries during or before the last glacial maximum (LGM) (Nichols, 2015a, 2015b), enough to give rise to the ~200 independent language families of the Americas and their structural diversity. (Before recent extinctions the Americas hosted close to half of the known linguistic lineages.) Glaciation blocked entry to North America, but recent paleoceanographic data and models show two periods during the terminal Pleistocene when coastal entries were possible: ~24.5–22 ka and ~16.4–14.8 ka (Praetorius et al., 2023). Overland entries to intermontane North America via the ice-free corridor between the shrinking Cordilleran and Laurentide glaciers became possible in the same general time frame as the second of these openings (Perego et al., 2009, Anderson & Bissett, 2015:79–80), and movement into the interior via the Columbia and Snake to the upper Missouri drainage and via the Colorado and Gila was always possible (Anderson et al., 2013; Anderson & Bissett, 2015).

Drawing on these sources, this paper proposes a two-stratum, four-opening model for the origin and expansion of the American linguistic population. The first and second openings (~24,000 and ~15,000 years ago, following Praetorius et al., 2023) resulted in coastal entries south of the ice sheets (i.e. south of the Columbia River). The resultant linguistic populations have mingled in Oregon and California; their descendants cannot so far be clearly distinguished on linguistic evidence, and will be jointly called the early stratum. The third opening was the formation of the ice-free corridor ~14,000 years ago, resulting in colonization of the North American interior by people from Beringia and unglaciated northwestern Alaska; this produced a later stratum which gave rise inter alia to the Clovis phenomenon and whose linguistic descendants are structurally quite distinct from the early stratum found in California and Oregon. The fourth opening was the retreat of the coastal ice sheet, making possible coastal and near-coastal settlement of the Pacific Northwest ~12,000 years ago. Descendants of fourth-opening languages share many of the distinctive properties of third-opening languages but can be distinguished on other salient properties. Still later came the entries of two known language families: Athabaskan-Eyak-Tlingit (AET, a.k.a. Na-Dene), mostly interior-oriented, and Eskimo-Aleut, primarily coastal.

Inland Northeastern Asia during the terminal Pleistocene must have been a linguistic spread zone like attested recent arctic and subarctic regions. Relevant modern examples are non-food-producing arctic regions (North America; northern Eurasia prior to the introduction of large-scale reindeer herding) and also non-food-producing desert regions (southern Africa, the Australian arid interior, and part of the Great Basin). A spread zone usually has a sparse and often mobile human population and a linguistic population shaped by convergence and diffusion with occasional large spreads altering the genealogical and/or typological profile of the region. In the roughly 8000–10,000 years between the two coastal openings there must have been the normal gradual grammatical and lexical changes in the northeastern Asian languages, and occasional immigration, emigration, or extinction of languages, over time substantially changing the linguistic typological profile of the area. (Note that the linguistic comparative method, used to establish relatedness and reconstruct ancestral states, can rarely reach back more than ~6000 years, so daughters from an 8000-year-old split can be as unrecognizably different as two unrelated languages.) A few representatives of the Siberian linguistic population as it changed over time can be assumed to have made viable entries to subglacial North America.

With deglaciation after the third and fourth openings, occasional further entries must have occurred. A total of about 12 entries at a mean of 2000-year intervals, beginning about 25,000 years ago, would generate the genealogical and structural diversity attested in the Americas and allow time for expansion as far south as Monte Verde, Chile, a site with clear evidence of human habitation at over 14,000 years ago (for the expansion rate, Nichols, 2008; for Monte Verde, Dillehay et al., 2015; Erlandson et al., 2008; Dillehay, 1997). The most recent two postglacial entries are identifiable with specific language families: the AET family, which dispersed in Alaska at least 5000 years ago and has well-argued likely connections to the Yeniseian family of north central Siberia (Vajda, 2010, 2018, Fortescue & Vajda, 2022), age of entry unknown but likely early postglacial (Kari, 2019 gives some hydronymic evidence), and the ~4000-year-old Eskimo-Aleut family of coastal Siberia, Alaska, Canada, and Greenland, whose Eskimoan and Aleut branches split in Siberia, spread eastward separately to Alaska, and later recontacted there (Berge, 2017, 2018). The AET and Eskimo-Aleut families are different in structural type from each other and from most of the other families in North America. Geographically, I put them in the Pacific Northwest group.

The first entrants to North America expanded the human frontier, and as the continent was settled the frontier expanded east and south. Behind the gradually advancing frontier was probably a dynamic and variable ethnolinguistic situation. Though the sociolinguistics of interaction between small, mobile groups in very sparsely inhabited land with an open frontier is unknown, the model sketched here attempts to capture one element of the picture: language viability and spreads from the entry points. In this model an entering group settles in an attractive resource-rich area (the term of Beaton, 1991 is megapatch) not far from the entry point, and there they prosper, adapt to the environment and resources, and expand demographically. By the time another entrant appears the first settlers are well entrenched, more numerous than the newcomers, and in a position to fend off or absorb the small group of newcomers. The newcomers move on, skirting earlier entrants. Offshoots from the first settlement also move away and found new colonies. Thus the early settlement functions as a staging area (using the term of Anderson & Gillam, 2000, Anderson et al., 2013; those staging areas are archeologically supported while those proposed here are supported by linguistic intangibles). Gradually the human frontier expands far to the south and east, but it continues to be the case that descendants of the earlier settlers remain entrenched in or near the staging area more often than they move on.

This paper is an exploratory study showing that there is enough linguistic evidence to support the model and raise research questions and hypotheses. It is intended to help geneticists, archeologists, and linguists formulate research questions and hypotheses relevant to each other's work. Meanwhile, pending larger-scale and more definitive work, the results are used here as an analytic framework.
The paper is: 

Johanna Nichols, "Founder effects identify languages of the earliest Americans" American Journal of Biological Anthropology (March 30, 2024) (open access) via Language Log.

4 comments:

Gregory B said...

Andrew,

I have enjoyed reading your comments since we were on Dienekes’s blog. This is my first post here, though I have been reading your blog for years.

Regarding American settlement patterns, I think that there were 2 main groups settling the Americas: a Western group and an Eastern one.
The Eastern one was less Mongoloid and more long-headed; the Western group was more Mongoloid and more short-headed.
I think that the Eastern one came first, and that the Western one came later and pushed the Eastern one out of the West (hence my terminology), although there was much penetration into the East by the Westerners, with some conquest and some spreading of language (and the Q male haplogroup).

I think that the tail end of the Eastern migration was cut off by the Western migration, but, later, this tail end, the Nadene (or, perhaps better, AET) speakers entered and cut off the tail end of the Western migration; that tail end, the Eskimo-Aleut speakers, came in later, behind the Nadene.

As to the languages they spoke, Sapir’s classification of the languages north of Mexico seems reasonable, while Greenberg’s classification of the Latin American languages is hard to evaluate, as he apparently did not attempt a general characterization of his language groups. I have read some discussion of these by the owner of the website “Emekur: Sprachen, Karten and Sprachkaren” at emekurnet.wordpress.com. He seems to characterize what Greenberg called “Andean” and presumably “Macro-Chibchan” as relying more on suffixes than on prefixes, and to characterize the ones in eastern South America as relying more on prefixes than on suffixes (these are what Greenberg called “Equatorial” and “Ge-Pano-Carib”, though Emekur and others think that Panoan is more closely related to Arawakan, an Equatorial language group, and that Tupian, placed in Equatorial close to Arawakan by Greenberg, is closer to Ge and Carib than to Arawakan).

Now this pattern of having languages relying more on suffixes in the west and languages relying more on prefixes in the east in South America is mostly repeated in North America. None of the other features discussed by Emekur are mentioned by Sapir. Nonetheless, though the role of affixes by itself is thin evidence, I am inclined to think that suffixing being stronger in the west on both continents and prefixing being stronger in the east is not just a coincidence. And Swadesh thought that at least Macro-Chibchan belonged in his Macro-Penutian group, along with Penutian, Aztec-Tanoan, and Macro-Oto-Manguean.
I will have to continue this in my next post.

Gregory Browne

Gregory B said...

Focusing on the North American languages and on Sapir’s classification of them, it seems that we can sort them into two groups:
1. Eskimo-Aleut and Algonkian-Wakashan, and Penutian and Aztec-Tanoan
2. Nadene and Hokan-Siouan
(Sapir seemed to support group 1, except that he thought that Aztec-Tanoan appeared to be like a Penutian substrate overlaid by Hokan.)

Group 1 languages are characterized by:
--using suffixes more than prefixes
--having cases
--having reduplication (except for Eskimo-Aleut)

Group 1 languages minus Aztec-Tanoan are characterized by those features plus:
--having, in addition to reduplication, other inner stem changes (except for Eskimo-Aleut)
--not compounding stems
--not practicing nominal incorporation (except for a moderate amount in Algonkian-Wakashan)
--having inflection

Group 2 languages are characterized by:
--not using suffixes more than prefixes (either using prefixes more, as in Hokan-Siouan, or using something like both, as in Nadene)
--not having cases
--not using reduplication or other inner stem changes
--compounding stems
--practicing nominal incorporation
--not having inflection or inflecting weakly at best

Regarding their haplogroups, while male Q seems to prevail in all areas (except among the Nadene, where male C is strongest), they diverge in regard to the female haplogroups.
In areas where Hokan-Siouan and some of the languages of eastern South America are found, female haplogroup C is strong, whereas female haplogroup D is also common in eastern South America. Both of these female haplogroups are descended from female haplogroup M. So in parts of the Americas where Group 2, the prefixers, are found, descendants of haplogroup M prevail among the females.
In contrast, in parts of the Americas where Group 1, the suffixers, are found, the female haplogroups are A and B, which are both descendants of female haplogroup N (A being directly descended from N, and B being descended from R, which is descended from N). I am inclined to think that A was originally associated with Eskimo-Aleut and Algonkian-Wakashan speakers and that B was originally associated with Macro-Penutian and Andean speakers.
Now female haplogroup A is also strong in eastern North America, but the Eskimo-Aleut and Algonkian-Wakashan also extend far into the east, all the way to the Atlantic.
Among the Nadene, I think it was just C males (perhaps a lost war party) conquering some A females (A prevailing among females on both sides of the Nadene: the Algonkian-Wakashan and the Eskimo-Aleut), as the speakers are round-headed and more Mongoloid than their linguistic relatives in the eastern parts of the Americas, which features probably came from the conquered females of group A.

andrew said...

@GregoryB

I appreciate your thoughtful comments.

I'm not sure why one would conflate the late arriving Eskimo-Aleut and NaDene language families with the far earlier founding population rooted language families that have two to four subgroups, into "Group 1" and "Group 2".

Are you suggesting that the older language families were influenced by contact with the later arriving languages?

Gregory B said...

I think that the language groups are similar because they have a common ancestor, and I think that most speakers of each group are closely related to most of the speakers of the language groups that are next of kin to their language group.
I think that few of their similarities are due to borrowing, because I think that few of the similarities between most pairs of neighboring language groups are due to borrowing. I think many words are borrowed, and a few phonemes are borrowed, but I think that it is much rarer that complex grammatical features (syntactic and morphological) are borrowed, because it is much more difficult; I won’t say it never happens, but I think it is rare.