Showing posts with label Europe. Show all posts

Thursday, September 11, 2025

Where Does Language Complexity Evolve?

I'm not entirely sold on this article's conclusion about the circumstances in which language complexity evolves, at least not based upon the data presented, even though there is good evidence that large numbers of language learners tend to reduce grammatical complexity.

Polysynthetic languages are very common in the New World and just across the Beringian land bridge from it (which could also be connected to the Paleo-Siberian cases), suggesting that most instances of polysynthesis could derive from a common source, creating a founder effect.

Also, while these languages are small and isolated now, this hasn't always been the case. Athabaskan-Eyak-Tlingit and Nahuatl, for example, historically were quite expansive and had lots of contact with other language families.

The second cluster is from Aboriginal Australians and Papuans who derive from the first wave of modern human migration Out of Africa and into Asia around the time of the Toba eruption. 

There is a third cluster in the Caucasus, involving just one of the language families there, which is associated with early migrants from the first wave of Fertile Crescent agriculture in the highlands of West Asia.

There are apparently other instances in South Asia and East Asia.

A handful of major language expansions, none of which happened to be polysynthetic, make up a huge share of all languages spoken today. These include the Indo-European language family, the Afro-Asiatic languages, the Bantu language family, the Dravidian languages, and the arguable Altaic language family. European and Chinese colonial empires further reduced the size of many languages (long after their features were well established), for reasons unrelated to language complexity.

There are polysynthetic or borderline polysynthetic languages listed in the following major language families: the Sino-Tibetan languages (seemingly all in the Tibeto-Burman branch), the Austroasiatic languages (3 Munda languages in Northeast India), and the Austronesian languages (5 borderline languages, seemingly all Formosan).

If only a minority of pre-expansion languages were polysynthetic, it wouldn't be surprising if the main expanding languages didn't end up including them.

Significance

A global test reveals statistically robust support for the hypothesis that complex word forms are more likely to develop in isolated languages. Polysynthesis, where words are built from many units to convey complex meanings, is more likely to occur in smaller populations and less likely to occur with many languages in contact. By building a global database of polysynthetic languages and analyzing in a phylospatial framework, this study highlights the potential for macroevolutionary methods to test hypotheses about language evolution and contribute to long-standing debates in linguistics.

Abstract

Evolution of complexity in human languages has been vigorously debated, including the proposal that complexity can build in small, isolated populations but is often lost in situations of language contact. If it is generally true that small, isolated languages can build morphological complexity over time, but complexity tends to be lost in situations of language contact, then we should find that forms of language complexity that have evolved multiple times will tend to be associated with population size, isolation, and language age. 
We test this hypothesis by focusing on one particular form of morphological complexity, polysynthesis, where words built from many parts embody complex phrases. By assembling a global database of polysynthetic languages and conducting phylospatial analyses, we show that languages with highly complex word morphology are more likely to have small population sizes, less likely to occur with many other languages in direct contact, and have a greater tendency to be on long phylogenetically isolated lineages. 
These findings are consistent with the hypothesis that languages that evolve in isolation for long periods may be more likely to accrue morphological complexity. Polysynthetic languages also tend to have higher levels of endangerment. Our results provide phylogenetically informed evidence that one particular form of complex language morphology is more likely to occur in small, isolated languages and is prone to loss in contact.
Lindell Bromham, Keaghan Yaxley, Oscar Wilson, and Xia Hua, "Macroevolutionary analysis of polysynthesis shows that language complexity is more likely to evolve in small, isolated populations" 122(24) PNAS e2504483122 (June 12, 2025) (pay per view; but open access supplemental materials).

The languages counted as polysynthetic and borderline, according to the Supplemental Materials, are below the fold.

Tuesday, September 9, 2025

The Huns Were Paleo-Siberian, Not Linguistically Turkic (Also Slavic Origins)

A new paper makes a strong case that the Huns, a group of "barbarians" (in the eyes of Roman historians) who made multiple attempts to invade the Roman Empire, spoke a Paleo-Siberian language (to which the Na-Dene languages of North America, such as Navajo, are distantly related), rather than a Turkic language, as the conventional wisdom in historical linguistics had wrongly held before this paper.



The Xiōng-nú were a tribal confederation who dominated Inner Asia from the third century BC to the second century AD. Xiōng-nú descendants later constituted the ethnic core of the European Huns. It has been argued that the Xiōng-nú spoke an Iranian, Turkic, Mongolic or Yeniseian language, but the linguistic affiliation of the Xiōng-nú and the Huns is still debated. 
Here, we show that linguistic evidence from four independent domains does indeed suggest that the Xiōng-nú and the Huns spoke the same Paleo-Siberian language and that this was an early form of Arin, a member of the Yeniseian language family. This identification augments and confirms genetic and archaeological studies and inspires new interdisciplinary research on Eurasian population history.
Svenja Bonmann et al, "Linguistic Evidence Suggests that Xiōng‐nú and Huns Spoke the Same Paleo‐Siberian Language," Transactions of the Philological Society (June 16, 2025). DOI: 10.1111/1467-968X.12321

A news report about the paper spells out this hypothesis at greater length:
New linguistic findings show that the European Huns had Paleo-Siberian ancestors and do not, as previously assumed, originate from Turkic-speaking groups. The joint study was conducted by Dr. Svenja Bonmann at the University of Cologne's Department of Linguistics and Dr. Simon Fries at the Faculty of Classics and the Faculty of Linguistics, Philology and Phonetics at the University of Oxford.

The results of the research, "Linguistic evidence suggests that Xiōng-nú and Huns spoke the same Paleo-Siberian language," have been published in the journal Transactions of the Philological Society.

On the basis of various linguistic sources, the researchers reconstructed that the ethnic core of the Huns—including Attila and his European ruling dynasty—and their Asian ancestors, the so-called Xiongnu, shared a common language. This language belongs to the Yeniseian language family, a subgroup of the so-called Paleo-Siberian languages. These languages were spoken in Siberia before the invasion of Uralic, Turkic and Tungusic ethnic groups. Even today, small groups who speak a Yeniseian language still reside along the banks of the Yenisei River in Russia.

From the 3rd century BCE to the 2nd century CE, the Xiongnu formed a loose tribal confederation in Inner Asia. A few years ago, during archaeological excavations in Mongolia, a city was discovered that is believed to be Long Cheng, the capital of the Xiongnu empire. The Huns, in turn, established a relatively short-lived but influential multi-ethnic empire in southeastern Europe from the 4th to 5th centuries CE.

Research has shown that they came from Inner Asia, but their ethnic and linguistic origins have been disputed until now, as no written documents in their own language have survived. A great deal of what we know about the Huns and the Xiongnu is therefore based on written documents about them in other languages; for example, the term "Xiōng-nú" derives from Chinese.

 

[Based on the "World Topographic Map" by Esri. Sources: Esri, HERE, Garmin, Intermap, INCREMENT P, GEBCO, USGS, FAO, NPS, NRCAN, GeoBase, IGN, Kadaster NL, Ordnance Survey, Esri Japan, METI, Esri China (Hong Kong), OpenStreetMap contributors, GIS User Community, Simon Fries. Created with QGIS 3.36.]. Credit: Transactions of the Philological Society (2025). DOI: 10.1111/1467-968X.12321

From the 7th century CE, Turkic peoples expanded westwards. It was therefore assumed that the Xiongnu and the ethnic core of the Huns, whose own westward expansion dates back to the 4th century CE, also spoke a Turkic language. However, Bonmann and Fries have found various linguistic indications that these groups spoke an early form of Arin, a Yeniseian language, in Inner Asia around the turn of the millennium.

"This was long before the Turkic peoples migrated to Inner Asia and even before the splitting of Old Turkic into several daughter languages. This ancient Arin language even influenced the early Turkic languages and enjoyed a certain prestige in Inner Asia. This implies that Old Arin was probably the native language of the Xiongnu ruling dynasty," says Bonmann.

Bonmann and Fries analyzed linguistic data based on loan words, glosses in Chinese texts, proper names of the Hun dynasty as well as place and water names. Taken by itself, the data on each of these aspects would have comparatively little significance, but taken together it is hard to argue with the conclusion that both the ruling dynasty of the Xiongnu and the ethnic core of the Huns spoke Old Arin.

The findings of the study also made it possible for the first time to reconstruct how the Huns came to settle in Europe: For the two researchers, place and water names still prove today that an Arin-speaking population once left its mark on Inner Asia and migrated westwards from the Altai-Sayan region. Attila the Hun probably also bears an ancient Arin name: Until now, "Attila" was thought to be a Germanic nickname ("little father"), but according to the new study, "Attila" could also be interpreted as a Yeniseian epithet, which roughly translates as "swift-ish, quick-ish."

The new linguistic findings support earlier genetic and archaeological findings that the European Huns are descendants of the Xiongnu. "Our study shows that alongside archaeology and genetics, comparative philology plays an essential role in the exploration of human history. We hope that our findings will inspire further research into the history of lesser-known languages and thereby contribute further to our understanding of the linguistic evolution of mankind," concludes Fries.

In the body text, a section of the paper explores the previous conventional wisdom and its difficulties:

Although direct evidence is lacking, Iranian, Turkic and Mongolic languages have all been proposed as the language of the ruling dynasty of the Xiōng-nú (cf. e.g. Shiratori 1900; Benzing 1959; Pritsak 1982; Bailey 1985; Dybo 2007; Janhunen 2010; Beckwith 2018; Beckwith 2022) and of the Huns (cf. e.g. Doerfer 1973; Pritsak 1982; Savelyev 2020; Savelyev & Jeong 2020), because in the 1st millennium AD languages from these three families were spoken in Inner Asia. Inscriptions dating between the 4th and 9th century AD demonstrate that Iranian languages (Sogdian, early 4th to 6th century AD, Sims-Williams 2011; Vovin 2018) and Mongolic ones (Khüis Tolgoi and Bugut inscriptions of the 5th–6th centuries AD, Vovin 2018) as well as, much later, Turkic languages (isolated Turkish phrases in Bactrian manuscripts of the 7th century AD, Orkhon and Yenisei Kirgiz inscriptions between the early 8th and 9th century AD, Erdal 2004: 4–8) were spoken in the territory between the Yenisei River in the West, the Tian Shan range in the South and Mongolia in the East. Other Indo-European languages were spoken in oasis cities along the northern and southern ridges of the Takla Makan desert in the 1st millennium AD including Indo-Iranian (Iranian Khotanese and Tumshuqese Saka, Bactrian, Indo-Aryan Prakrit, Sanskrit) and ‘Tocharian’ languages (Agnean and Kuchean).

However, this linguistic situation of a coexistence of Iranian, Turkic and Mongolic in Inner Asia can only be reliably established as such for the late 1st millennium AD. Hypotheses on an Iranian, Mongolic or Turkic identity of the Xiōng-nú primarily rest on written sources post-dating the Xiōng-nú era.
While the theoretical possibility of a Mongolic or Turkic presence in Inner Asia already at the beginning of the common era cannot be ruled out a priori, it is important to note that there is, on the other hand, also no robust evidence – especially from textual sources – that could directly imply or prove a Turko-Mongolic presence in this area at such an early date. 
The earliest sources from the Tarim Basin and the territories alongside the Oxus River/Amu Darya (Chorasmia, Sogdia, Bactria) only document Indo-European languages from the Indo-Iranian and ‘Tocharian’ branches (to which might be added, as a cultural import, also Ancient Greek in Macedonian colonies). Judging by more indirect evidence – especially loanwords in other languages, toponyms, etc. – other Iranian languages, namely different Sakan varieties (Tremblay 2005) and ‘Old Steppe Iranian’ (Bernard 2023), must have been spoken in the steppe corridor from the Kazakh steppe to Dzungaria, and perhaps even to Gansu (see Beckwith 2022). It is only centuries later, namely in the Migration Period of the 5th–6th centuries AD, that a (Para-)Mongolic language might be attested in Inner Asia (Vovin 2018), and fragments of this (Para-)Mongolic language, in turn, are still much earlier documented than the earliest secure Turkic words dating from the 7th century AD.

There is thus neither direct nor indirect evidence supporting the claim of a Mongolic or Turkic presence in Inner Asia between the 3rd century BC and the 2nd century AD, and the hypothesis of a Mongolic or Turkic identity of the ethnic core of the Xiōng-nú (as proposed by Benzing 1959, Pritsak 1982; Tenišev 1997; Dybo 2007; Janhunen 2010; Savelyev 2020) is thus rather unlikely from the outset, as is the hypothesis of a completely unknown or unclassifiable language without any living descendants (as proposed by Doerfer 1973). The same applies to the Huns: there is a complete lack of evidence supporting claims of a Turkic presence among the Huns.1 On the other hand, an Iranian component in the Xiōng-nú Empire is possible, and indeed quite likely, although, as we intend to point out with the present study, such Indo-European ethnicity must not necessarily have been shared by the ruling dynasty or ethnic core of the Xiōng-nú (pace Bailey 1985; Beckwith 2022) or the Huns.

Concerning such an Iranian component, (Beckwith 2018, 2022) has argued recently that Xiōng-nú words preserved in Chinese texts are indicative of an Iranian language, which he calls ‘East Scythian’. However, his interpretation depends on a reconstruction of the Old and Middle Chinese pronunciation of Chinese signs which significantly differs from established reconstructions such as the classic one of Pulleyblank, and which has also been criticised by Vovin et al. (2016: 129–30). In addition to this, his Iranian etymologies must be met with serious doubts. For instance, the ethnonym ‘Aryan’, which is amply attested in many Indo-Iranian languages, is given by Beckwith with a word-initial laryngeal sound (discussion in Beckwith 2022: 183–86, cf. particularly p. 186): ‘East Scythian *ḥarya [ɣa.rya] “noble, royal; Scythian” → Old Chinese *ḥaryá 夏/*ḥâryá 華 “royal; Chinese, China”’. This would indeed be a remarkable Iranian word form, because no Indo-Iranian language points to an initial laryngeal (†Hā̆ri̯a- vel sim.): A word-initial laryngeal should have left direct traces in Persianide languages (see Kümmel 2018), but Old Persian <ariy-> /ariya-/ or inscriptional Middle Persian ēr ‘Iranian’ do not preserve such a sound. The hypothetical (East) Scythian would be the only Iranian language to preserve it, and independent evidence for this is entirely lacking. Other etymologies equally rest upon highly questionable ad hoc assumptions on Iranian historical phonology and must accordingly be dismissed (e. g. the etymology of Old Turkic täŋri ‘heaven’ that Beckwith 2022: 195, 203 wants to derive from an East Scythian *tagri through the application of an alleged Scythian syllable contact law of nasalization completely unheard of in the specialist literature and remaining without any reliable parallel; on this word rather cf. Georg 2001).

It must therefore be conceded that, while it is a priori likely that Iranian tribes were one factor among others in the ethnolinguistic melting pot of the eastern Eurasian steppe some 2000 years ago (the Sakan languages would be a good starting point for further research in this direction), the evidence adduced by scholars in favour of a dominant role of Iranian groups and their languages in the Xiōng-nú empire so far does not follow the rigorous methodological standards of Historical-Comparative Linguistics and is therefore insufficient to allow for any reliable inferences.

Etymological analyses of Xiōng-nú glosses in Chinese sources (collected by Pulleyblank 1962, criticised and reanalysed by Dybo 2007), complemented by the interpretation of the so-called Jié couplet, the only short text preserved in the Xiōng-nú language,2 have led to a more promising alternative hypothesis. This hypothesis acknowledges both the multi-ethnic composition of the Xiōng-nú empire as such and the presence of Indo-European and specifically Iranian languages in Inner Asia at the beginning of the common era, yet adds to the complexity the idea that the native language of the ruling dynasty of the Xiōng-nú empire might have been a Yeniseian one (Ligeti 1950; Pulleyblank 1962; Dul'zon 1966; Dul'zon 1968; Vovin 2000; Vovin 2003; Vovin 2007; Werner 2014; Vovin 2020). Yeniseian languages are usually considered remnants or survivors of the original linguistic diversity of Siberia, historically spoken in retreat areas as the result of several waves of superimposition or displacement by expanding Uralic/Samoyedic, Turkic and Tungusic languages. Therefore, Yeniseian languages are also known as Paleo-Siberian languages.3 Several different Yeniseian languages were spoken in the 18th century AD alongside the middle reaches of the Yenisei River and some of its tributaries, yet this probably reflects a northward migration from a point of departure further south, around the headwaters of the Yenisey, the Ob and the Irtyš rivers (see Dul'zon 1959a; Dul'zon 1959b; Dul'zon 1964; Maloletko 1992; Maloletko 2000; Vajda 2019: 194–95; cf. also Janhunen 2020: 167). From the six historically attested Yeniseian languages Ket, Yugh, Kott, Assan, Arin and Pumpokol, it has so far been suggested that Ket/Yugh (Ligeti 1950; Pulleyblank 1962) or Pumpokol (Vovin 2000, 2003, 2007, 2020; Vovin et al. 2016) may have been the native language of the Xiōng-nú ruling dynasty.

Adding value to this hypothesis is the fact that the northward migration of Yeniseian-speaking groups, as reflected in toponyms, from the Altai-Sayan area would well agree with detailed historical studies considering Indic, Iranian and Chinese written sources (de la Vaissière 2005; de la Vaissière 2014). These studies indicate that, following the eventual demise of their steppe empire, remnants of the Xiōng-nú migrated to the north of the Altai-Sayan Mountain ranges in the mid-2nd century AD and that this retreat area was the starting point of a secondary expansion of Xiōng-nú descendants roughly two hundred years later, between ca. 350–370 AD. This expansion occurred in three directions: One migratory trajectory led northward and left traces in the form of toponyms. This population movement downstream of the major rivers Yenisey, Ob and Irtyš perfectly explains the linguistic situation as documented for the first time in the 18th century and provides a direct link between Yeniseian languages and the Xiōng-nú. Another migratory route led to southern Asia and involved groups known from Iranian and Indic sources as Chionites, Kidarites, Hephthalites, Alchons as well as the so-called Huṇa (cf. Pfisterer 2013). A third migratory trajectory led westward, into Europe and involved the Huns who appeared in Eastern Europe in 370 and posed a threat to Roman hegemony until Attila's death in 453, the Battle of Nedao shortly afterwards and the ensuing disintegration of their confederation (cf. e.g. Heather 1996; Bóna 2002; Halsall 2007; Schmauder 2009; Maas 2014; Pohl 2022).

Several nomadic groups of late Antiquity that originated in Inner Asia and migrated to the southern and western peripheries of the Eurasian landmass apparently used the same ethnonymic constituent (Chion-ites – Al-chon – Huṇa – Huns; cf. de la Vaissière 2005; de la Vaissière 2014, but see Atwood 2012), and the traditional hypothesis of a link between the ethnic core of the European Huns of the 4th–5th centuries AD and the Inner Asian Xiōng-nú of the 3rd century BC–2nd century AD, first proposed by the French scholar Joseph de Guignes in the 18th century, has, strictly speaking, never been falsified (de la Vaissière 2005: 15). 
A genetic connection between the Xiōng-nú and the Huns is usually considered unlikely in modern archaeological and historical scholarship (e.g. Beckwith 2009: 72; Savelyev & Jeong 2020; Pohl 2022; Maenchen-Helfen 1944–1945; Maenchen-Helfen 1955; Maenchen-Helfen 1973; Schmauder 2009), partly because of the large chronological gap between the dissolution of the Xiōng-nú empire in the 2nd century AD and the appearance of the Huns in the 4th century AD, and partly because only two archaeological features render a connection likely: large bronze cauldrons of a certain type and artificially deformed or elongated skulls (Pohl 2022: 147).

Despite the prevailing scepticism of historians and archaeologists, the hypothesis of a connection between the Xiōng-nú and the Huns has been corroborated recently by previously unknown and unavailable genetic data analysed by Gnecchi-Ruscone et al. (2025): ‘(…) long-shared genomic tracts provide compelling evidence of genetic lineages directly connecting some individuals of the highest Xiongnu-period elite with 5th to 6th century AD Carpathian Basin individuals, showing that some European Huns descended from them’.
On the provision that there was indeed some continuation between the ethnic core of the European Huns and the former Xiōng-nú, the ruling classes of both multi-ethnic confederations may have spoken the same language in two different diachronic stages (an older form and a younger one), implying that the identification of the linguistic affiliation of one of these groups probably also means identifying the native language of the other group.
In the following, we will discuss previously unknown linguistic evidence from four domains independently supporting such a connection and thus corroborating the recent archaeological and genetic findings: (1) loanwords, (2) glosses, (3) anthroponyms and (4) toponyms/hydronyms.

This analysis, which moves the Turkic and Tungusic migrations several centuries later in history than previously believed, is also relevant to the Altaic linguistic hypothesis and our understanding of these ethnic mass migrations more generally.

Close in time and space: Slavic ethnogenesis

The Slavic people emerged around the same time as the fall of the Roman Empire and the demise of the short-lived Hunnic Kingdom in the Balkans, but before the Magyar conquest of what is now called Hungary and before the appearance of Gypsies in Europe. This period was traditionally called the "Dark Ages" in Europe. There are some historical records, however, that suggest Slavic origins several centuries earlier (quoted here from Wikipedia):

Ancient Roman sources refer to the Early Slavic peoples as "Veneti", who dwelt in a region of central Europe east of the Germanic tribe of Suebi and west of the Iranian Sarmatians in the 1st and 2nd centuries AD, between the upper Vistula and Dnieper rivers. Slavs – called Antes and Sclaveni – first appear in Byzantine records in the early 6th century AD. Byzantine historiographers of the era of the emperor Justinian I (r. 527–565), such as Procopius of Caesarea, Jordanes and Theophylact Simocatta, describe tribes of these names emerging from the area of the Carpathian Mountains, the lower Danube and the Black Sea to invade the Danubian provinces of the Eastern Empire.

Jordanes, in his work Getica (written in 551 AD), describes the Veneti as a "populous nation" whose dwellings begin at the sources of the Vistula and occupy "a great expanse of land". He also describes the Veneti as the ancestors of Antes and Slaveni, two early Slavic tribes, who appeared on the Byzantine frontier in the early-6th century.

Procopius wrote in 545 that "the Sclaveni and the Antae actually had a single name in the remote past; for they were both called Sporoi in olden times". The name Sporoi derives from Greek σπείρω ("to sow"). He described them as barbarians, who lived under democracy and believed in one god, "the maker of lightning" (Perun), to whom they made sacrifice. They lived in scattered housing and constantly changed settlement. In war, they were mainly foot soldiers with shields, spears, bows, and little armour, which was reserved mainly for chiefs and their inner circle of warriors. Their language is "barbarous" (that is, not Greek), and the two tribes are alike in appearance, being tall and robust, "while their bodies and hair are neither very fair or blond, nor indeed do they incline entirely to the dark type, but they are all slightly ruddy in color. And they live a hard life, giving no heed to bodily comforts..."

Jordanes describes the Sclaveni as having swamps and forests for their cities. Another 6th-century source refers to them living among nearly-impenetrable forests, rivers, lakes, and marshes.

Menander Protector mentions Daurentius (r. c. 577 – 579) who slew an Avar envoy of Khagan Bayan I for asking the Slavs to accept the suzerainty of the Avars; Daurentius declined and is reported as saying: "Others do not conquer our land, we conquer theirs – so it shall always be for us as long as there are wars and weapons".

The Slavic languages are a relatively recent offshoot of the Indo-European Baltic languages, which in turn may be the most direct descendants of the language(s) of the Corded Ware culture (ca. 3000 BCE to 2350 BCE).

Eurogenes reports at his blog on new discoveries drawn from the earliest ancient Slavic DNA.

A paper dealing with the origin of Slavic speakers, titled Ancient DNA connects large-scale migration with the spread of Slavs, was just published at Nature by Gretzinger et al. (see here).

The dataset from the paper includes ten fascinating ancient samples from Gródek upon the Bug River in Southeastern Poland. These individuals are dated to the so called Tribal Period (8th – 9th centuries), and, as far as I know, they represent the earliest Slavic speakers in the ancient DNA record.

The really interesting thing about these early Slavs is that they already show some Germanic and other Western European-related ancestries. Nine of the samples made it into my G25 analysis (see here). In the Principal Component Analysis (PCA) plots . . . five of them cluster near present-day Ukrainians, while the rest are shifted towards present-day Northwestern and Western Europeans. . . .  GRK015, a female belonging to Western European-specific mtDNA haplogroup H1c, shows Scandinavian ancestry. On the other hand, GRK014, a female belonging to the West Asian-specific mtDNA haplogroup U3b, probably has Southern European ancestry.
These results aren't exactly shocking, because the people who preceded the early Slavs in the Gródek region were Scandinavian-like and associated with the Wielbark archeological culture. In other words, they were probably Goths who also had significant contacts with the Roman Empire.

However, it's not a given that the ancestors of the Tribal Period Slavs mixed with local Goths. It's also possible that they brought the western admixture, or at least some of it, from the Slavic homeland, wherever that may have been.

That's because the early Slavs who migrated deep into what is now Russia also showed Western European-related admixture. This is what Gretzinger et al. say on page 74 of their supplementary info (emphasis is mine):
The only deviation from this pattern is observed for ancient samples from the Russian Volga-Oka region, where we measure higher genetic affinity between present-day Southern/Western Europeans and the SP population compared to the pre-SP population (Fig. S17). This agrees with the pattern observed in PCA and ADMIXTURE that, in contrast to the Northwestern Balkan, Eastern Germany, and Poland-Northwestern Ukraine, the arrival of Slavic-associated culture in Northwestern Russia was associated with a shift in PCA space to the West, a decrease of BAL [Baltic] ancestry, and the introduction of Western European ancestries such as CNE [Continental North European] and CWE [Continental Western European].
Thus, it's highly plausible that the Tribal Period Slavs from Gródek were very similar, perhaps even practically identical, to the proto-Slavs who lived in the original Slavic homeland. Hopefully we won't have to wait too long to discover whether that's true or not. More Migration period and Slavic period samples from the border regions of Belarus, Poland and Ukraine are needed to sort that out.

Eurogenes goes on to criticize a passage in the supplemental materials to the Slavic ancient DNA paper, which states that

Scythian groups from Ukraine show varying fractions of South Asian ancestry (between 5% and 12%), a component present in many ancient individuals from Moldova, Ukraine, Western Russia, and the Caucasus, but (nearly) absent in the SP genomes from Central and East-Central Europe (<5%). [Ed. references to specific samples showing this omitted.]

Eurogenes, rightly, explains that the data are really showing European introgression into South Asia arising from the Indo-Aryan invasion of the region in the Bronze Age, and before that from Iran. 

Thursday, August 28, 2025

Some Linguistic Hypotheses

* I think that it is very likely that the Korean language family and the Japanese language family are related, even if it is challenging to find "smoking gun" evidence of it today. Japanese may also have some Manchurian linguistic influence. The broader Altaic hypothesis has less strong support, but there may be something to it.

* I think that it is very likely that the Dravidian language family was influenced by an African language family, with the vectors of that transmission probably being people from the Horn of Africa who also brought some key African Sahel domesticates to Southern India around the time of the South Indian Neolithic ca. 2500 BCE. 

* The Harappan language is almost surely not Indo-European, not Dravidian, and not Munda as a language family. It could conceivably have some connection to language isolates in the general region known as Indo-Pacific languages, or it might not. It is probably the main substrate influence on Sanskrit and through Sanskrit on the other Indo-European languages of India. The script associated with it was probably a proto-script, like a set of emojis or trademarks, and not a full written language. The same is true of the early Vinca script used in the Neolithic Balkans.

* I think that it is very likely that Indo-Aryans (Sanskrit speakers and peoples derived from them) conquered almost all of India sometime in pre-history, imposing their language and the Hindu religion (although not as faithfully to some of its tenets, like vegetarianism). The exception was a small last stronghold, more or less in the vicinity of the modern city of Visakhapatnam, whose people then reconquered territory from the Indo-Aryans, restoring their dialect of the Dravidian language, but not effectively displacing the Hindu religion that the Indo-Aryan conquerors brought with them. This is why the Dravidian language family seems younger than it really is; its historic linguistic diversity was wiped out at this point, with most of its variants extinguished. As I noted in a post at Wash Park Prophet:

[A]reas that are linguistically Indo-Aryan are more likely to be vegetarian than areas that are linguistically Dravidian, Munda or Tibeto-Burmese. Meat eating may reflect a thinner Indo-Aryan influence even in places that experienced a language shift to Indo-Aryan languages. Vegetarianism may alternatively reflect a stronger influence from the pre-Indo-Aryan Harappan society.

* Brahui, a Dravidian language pocket found far from the geographic range of the other Dravidian languages, probably was not within the historic range of the Dravidian languages. Instead, it is probably a result of language shift through elite dominance around 1000 CE or so, by some foreign Dravidian warlord or king.

* Sometime around the Copper Age (a.k.a. the Eneolithic) in Anatolia, people from the eastern highlands brought the Hattic language (which preceded the Hittite language) to Anatolia. It is related to Kassite, other Iranian highland languages, and more remotely to most of the Caucasian languages (which are related to each other even if the connections are hard to establish), to Sumerian, and probably to Elamite. It is also probably related to Minoan. One of the litmus tests of all of these languages is that they were ergative. 

Hattic probably replaced the Neolithic language(s) of Anatolia, which were very different from Hattic. These included the Western Anatolian Neolithic language, which spread across Europe in two main branches: the Linear Pottery culture (LBK), along the rivers of the north, and the Cardial Pottery culture, more or less along the Mediterranean coast. The Western Anatolian Neolithic languages were the substrate languages for the Indo-European languages in most of Europe, but not in Anatolia, where the Hattic language was the substrate. Hattic substrate influence is the reason that Anatolian Indo-European languages like Hittite seem so diverged from other Indo-European languages, because Hattic society was much healthier when the Indo-Europeans arrived than the Neolithic societies elsewhere, which the Indo-Europeans conquered in a state of collapse. The most basal branch of Indo-European was probably that spoken in the Tarim Basin, which was on a frontier with almost no substrate influence.

* It is very likely that the languages of the European hunter-gatherers are completely lost. The Uralic languages arrived much later. In the Americas and Japan and Australia, we know that indigenous hunter-gatherer language substrates had very little impact on the food producing conqueror languages, even when indigenous peoples made a large genetic contribution to the people speaking the food producer languages.

* Basque, therefore, is very unlikely to be an indigenous European hunter-gatherer language. It could be the last survivor of the language family of the first farmers of Europe, rooted in Western Anatolia from around 6000 BCE to 4000 BCE, or it could reflect a very distant outpost of a Copper Age language, probably in the same language family as Hattic and Minoan. I probably lean towards the Neolithic hypothesis, as the corpora of Hattic (which remained a written liturgical language for a thousand years after the Hittites took over) and of Basque are both large enough that linguists would have established a connection by now if one were present. Both are ergative languages, however, and the rarity of ergative languages outside the West Asian highlands, ancient Mesopotamia, and places to the east of that favors a Copper Age origin for Basque. The Paleo-Hispanic languages may have all been a coherent group, and Tartessos in Southwest Iberia was metal rich and a strong candidate for the source for Plato's Atlantis story. The "Tartessian culture was born around the 9th century B.C. as a result of hybridization between the Phoenician settlers and the local inhabitants. Scholars refer to the Tartessian culture as 'a hybrid archaeological culture'."

* We know that Etruscan, Raetic, and Lemnian are also not Indo-European languages and pre-date Indo-European. Together they are called the Tyrsenian languages, which is an areal designation: while the connection between Etruscan and Raetic is pretty solid, the linguistic family connection to Lemnian is not. Camunic possibly belongs with them as well, although it could also be related to Celtic:

  • Etruscan: 13,000 inscriptions, the overwhelming majority of which have been found in Italy; the oldest Etruscan inscription dates back to the 8th century BC, and the most recent one is dated to the 1st century AD.
  • Raetic: 300 inscriptions, the overwhelming majority of which have been found in the Central Alps; the oldest Raetic inscription dates back to the 6th century BC.
  • Lemnian: 2 inscriptions plus a small number of extremely fragmentary inscriptions; the oldest Lemnian inscription dates back to the late 6th century BC.
  • Camunic: may be related to Raetic; about 170 inscriptions found in the Central Alps; the oldest Camunic inscription dates back to the 5th century BC.

Ergative substrate influence probably explains the presence of ergativity in Indo-European Pashto, the Kurdish languages and the Indo-Aryan languages, a feature that is shared with Basque and is absent from most Indo-European languages. It suggests that Harappan was probably ergative. The Tyrsenian languages' apparently non-ergative character suggests that they aren't part of the same language family as Basque, and tends to favor a Copper Age origin for Basque rather than a Neolithic origin for it.

But we haven't deciphered the Tyrsenian languages very well, since the corpus of those writings has mostly been lost, and what we have left is mostly monolingual and short. We can't even say with complete confidence that they were all in the same language family, although ancient Raetic, spoken to the north of Etruscan (and not linguistically related to the similarly named modern Indo-European minority language of Switzerland), was probably in the same language family as Etruscan. Somewhat conflicting historical evidence suggests that Lemnians were migrants from the Alps and/or northern Italy, probably during the Greek dark ages after Bronze Age collapse had run its course.

We also don't know much about the substrate language that influenced Mycenaean Greek.

Tuesday, July 8, 2025

Steppe Ancestry In Italy


Blonde hair percentages, at a population statistics level, are a good proxy for Indo-European steppe ancestry levels (they are not as good as autosomal DNA, but the sample size and amount of fine grained geographic detail are much better). You'd need an estimate for the amount of steppe ancestry in Italians to calibrate this litmus test, however.

The first farmers of Europe had essentially 0% blonde hair, much like modern Sardinians, who are their closest genetic match. Blonde hair in Europe arrived more or less exclusively via steppe migration in the late Neolithic to early Bronze Age from an ultimate homeland in the vicinity of modern Ukraine, although plenty of migration happened within Europe after this migration and not all steppe migrants had blonde hair. It is also possible to have very little steppe ancestry while still having the blonde hair gene. 

The chart shows the percentage of blonde haired people in the regions shown on the map in (or near) Italy. Overall, about 8% of Italians are naturally blonde (another estimate suggests 15%). It suggests that Indo-European migration to Italy was largely north to south (with exceptions for urban centers) and reached southern Italy in far smaller proportions than northern Italy, although it is hard to know how much of the migration was modern, how much was medieval, how much was from the Roman era, and how much dates to pre-history.

Until the 1860s (with unification completed in 1870), Italy was not a unified country, with Southern Italy belonging to the poorer Kingdom of the Two Sicilies with a more agricultural economy, and Northern Italy belonging to a number of smaller and more prosperous states with more mercantile economies, which could have impacted migration patterns by increasing migration from areas with more blonde people. 


From Reddit.

In the medieval era Northern Europeans, including the Normans and Vikings and Germanic tribes, had greater interactions with Northern Italy than with Southern Italy, as well. 

In the Roman era, migration to the Roman capital and its major cities from North Africa, Egypt, and the Levant might have diluted the percentage of people with steppe ancestry.

Shortly before the classical Roman era, there were a number of Greek colonies in Italy, which could be reflected in the purple regions on the map (about 4% of Greeks are naturally blonde), with some blurring out due to admixture with regions near former Greek colonies.


From Wikipedia.

Monday, June 16, 2025

Population Replacement In The Colombian Highlands

In Europe, the first farmers, derived from Western Anatolian farmers, largely replaced Europe's original hunter-gatherers (who actually show continuity between the periods before and immediately after the Last Glacial Maximum), and in turn, received very substantial genetic admixture from late Copper Age/early Bronze Age Indo-Europeans from more or less where Ukraine is today. This diluted both first farmer ancestry, and the already highly diluted European hunter-gatherer ancestry that was admixed into those first farmer populations. In some places, like Britain, the population replacement of first farmers by Indo-Europeans was nearly complete.

Something similar apparently happened in East and Southeast Asia.

A new study established that the Americas did not break from this pattern, with some of its early agriculturists replacing pre-existing hunter-gatherer populations in a similarly genocidal pattern. If anything, this replacement was even more complete.

Sometime between 4000 BCE and 0 CE in the Colombian highlands, a clade of indigenous South American hunter-gatherers (with ancestry dating back to the initial wave of human settlement of South America) was replaced by a different group of indigenous South Americans. The replacement probably coincided with a new archaeological culture whose artifacts appear around 1000 BCE to 800 BCE, a millennium after maize cultivation began around 1800 BCE, but it possibly happened before the full blown ceramic culture emerged.

The 1800 BCE date is from A. Gómez, et al., "A Holocene pollen record of vegetation change and human impact from Pantano de Vargas, an intra-Andean basin of Duitama, Colombia." 145 Rev. Palaeobot. Palynol. 143–157 (2007) (full paper available here), and really only definitively points to deforestation and amaranth cultivation at that point in the highlands of Colombia.

From Wikipedia.

The population that replaced them, which is genetically linked to the speakers of Chibchan languages and probably originated in Central America, has remained the dominant population of the region in genetic continuity with their ancestors since this population replacement occurred, although later populations admixed with them and brought new languages in some parts of the region. 

There is no evidence that anyone from the pre-agricultural, pre-ceramic culture that was replaced in the Colombian highlands survived, or even significantly admixed with surviving populations.

The new agriculturalist culture did not really come into its own archaeologically until 1000 BCE to 800 BCE, so we can't know for sure if the replacement took place suddenly (although the lack of admixture between the new and old populations suggests that it did) or more gradually. Nor can we know how long the population replacement happened after maize cultivation began, a thousand years before this culture's pots appeared. 

Conservatively, it happened in some short time period between 1800 BCE and 800 BCE (about 3,000 to 4,000 years after it happened in Europe). Realistically, it probably happened on the later side of that time range when other components of the emerging farmer culture, like pottery and possibly other key domesticated plants and/or animals, joined with improved maize cultivation to give rise to a technologically dominant new culture.

The introduction and discussion sections of a new study released May 28, 2025 in the journal Science Advances by Kim-Louise Krettek, et al., explain that:

Genetic studies on ancient and present-day Indigenous populations have substantially contributed to the understanding of the settlement of the Americas. Those studies revealed that the population ancestral to non-Arctic Native Americans derives from a genetic admixture between ancient East Asian and Siberian groups somewhere in North-East Asia before 20,000 years before the present (yr B.P.). Around 16,000 yr B.P., after its arrival in North America, this genetic ancestry split into two lineages known as northern Native American and southern Native American. While northern Native American ancestry is largely confined to ancient and current populations of North America, the southern Native American lineage expanded further south and constitutes the main ancestry component of ancient and present-day Indigenous South Americans. 
Southern Native American ancestry diversified within North America into at least three sublineages, i.e., one related to the Clovis-associated Anzick-1 individual from western Montana (USA), one found in ancient California Channel Islands individuals and the last one representing the main ancestry source of modern-day Central and South Americans.  
Each of these sublineages provided a wave of ancestry into the gene pool of ancient South Americans. Individuals from Chile and Brazil dating back to around 12,000 and 10,000 yr B.P., respectively, were more genetically related to the Anzick-1 genome than individuals from the eastern Southern American coast, Southern Cone and the Andes from 10,000 yr B.P. onward. In addition, the California Channel Islands ancestry was found in the Central Andes by 4200 yr B.P. and became widespread in the region thereafter. However, the exact timing of these population movements into the southern subcontinent remains largely unsolved to date. 
The Isthmo-Colombian area, stretching from the coast of Honduras to the northern Colombian Andes, is critical to understanding the peopling of the Americas. Besides being the land bridge between North and South America, it is at the center of the three major cultural regions of Mesoamerica, Amazonia, and the Andes. At the time of European contact, the region was inhabited by a complex mosaic of human populations, mainly speakers of Chibchan, Chocoan, Carib, and Arawakan languages. 
Among those populations, those who were speakers of Chibchan languages were the most widespread in the region in terms of demography, cultural diversity, and territorial distribution. Chibchan is a language family with multiple, highly distinct branches, many of which are still spoken today in different regions of the Isthmo-Colombian area. The homeland and antiquity of the Proto-Chibchan language and the ancestor of all Chibchan languages remain subjects of debate. High intrafamily variation in terms of lexicon and grammar suggests that the language family is ancient and began diversifying several thousand years ago. The locus of that incipient diversification, however, is still uncertain. Most scholars believe that this protolanguage began to diversify in Lower Central America, where the largest number of these languages is spoken today. However, some evidence suggests that Proto-Chibchan might have originated in South America and then diversified in Central America at a much later date. 
Genetic studies of ancient and present-day Isthmo-Colombian Indigenous populations revealed a distinctive ancestry component primarily associated with speakers of Chibchan languages. However, whereas mitochondrial DNA (mtDNA) studies suggested a migration of Chibchan-related ancestry from Central America into Colombia and Venezuela, genome-wide studies favored an opposite, south-to-north population movement. According to the latter model, speakers of Chibchan languages from Central America are not direct descendants of the first settlers in the region but, instead, derive from a more recent back migration from South to Central America. 
The southernmost region of the Isthmo-Colombian area is the Altiplano Cundiboyacense (hereafter Altiplano). This plateau with an average altitude of 2600 m in the Eastern Cordillera of the Colombian Andes was inhabited by ancient hunter-gatherer groups from the Late Pleistocene. During the Early and Middle Holocene phases of the Preceramic period (~11,500 to 4000 yr B.P.), populations on the Altiplano underwent multiple cultural transformations, most notably increased sedentism and a transition from a hunter-gatherer subsistence to the introduction of horticultural practices and forest management. However, it was not until the early Late Holocene, ~3800 yr B.P., that the first clear evidence of maize cultivation appeared. 
During the subsequent Formative period (~3000 to 1000 yr B.P.), a distinct type of pottery emerged on the Altiplano that is referred to as the Herrera ceramic complex, also known in the literature as the Herrera period (2800 to 1200 yr B.P.). It is still highly debated whether Herrera-associated groups on the Altiplano derived from an in situ development of local hunter-gatherers or were a consequence of population dispersals into the region. 
Around 1200 yr B.P., a cultural phase, known as the Muisca period, began on the Altiplano and lasted until the imposition of the Hispanic Colonial regime in the mid-16th century. Most available evidence is suggestive of population continuity with the preceding Herrera period. The Muisca period is characterized by a relatively continuous process of demographic growth, development of agriculture and trade, and social and political complexification. These factors played a considerable role in shaping the Muisca culture and gave rise to the Chibchan-speaking population that dominated the Altiplano until European colonization. 
While several studies have reported mtDNA data from ancient Colombian individuals, genome-wide data from this region are still entirely lacking to date. In this study, we generated mtDNA and genome-wide data of 21 ancient individuals from two areas of the Altiplano (Bogotá plateau and Los Curos). Our data, spanning a time transect between around 6000 and 500 yr B.P., provide an opportunity to explore several key questions: 
(i) Which southern Native American genetic ancestry do Preceramic individuals from the Altiplano derive from? 
(ii) Were the cultural transformations associated with the Herrera and Muisca periods accompanied by migrations and demographic changes? 
(iii) How is the genetic ancestry observed in speakers of Chibchan languages related to that of ancient individuals from the Altiplano? 
(iv) What are the genetic relationships between the generated ancient genomes and the existing genomic data of present-day Indigenous communities from Colombia and neighboring regions?

In this study, we generated genome-wide data from 21 individuals spanning a time transect of almost 6000 years from the Altiplano, which represents the southern edge of the Isthmo-Colombian area. Our findings contribute to a better understanding of the population history of this area, a key region in the peopling process of South America. We show that the hunter-gatherer population from the Altiplano dated to around 6000 yr B.P. lack the genetic ancestry related to the Clovis-associated Anzick-1 genome and to ancient California Channel Island individuals, suggesting their affiliation to the southern Native American lineage that became the primary source of ancestry of South Americans by 9000 yr B.P. 
However, unlike ancient genomes from the Andes and the Southern Cone that are associated with the same wave of ancestry, the analyzed Preceramic individuals from Colombia do not share distinct affinity with any ancient or modern-day population from Central and South America studied to date. Colombia_Checua_6000BP can thus be modeled as a previously undescribed distinct lineage deriving from the radiation event that gave rise to multiple populations across South America during its initial settlement. 
The cultural transition between the Preceramic and Herrera periods is associated with a seemingly complete replacement of the local genetic profile. This challenges the model where local hunter-gatherers developed in situ as suggested by morphometric studies and an ancient mtDNA time transect. Instead, our study provides evidence for a major genetic turnover on the Altiplano occurring after 6000 yr B.P. but before 2000 yr B.P. Since the mechanisms and precise temporal scale of this replacement event remain uncertain, we cannot directly associate it with the emergence of maize cultivation ~3800 yr B.P. However, our data do support the archaeological hypothesis that the introduction of pottery associated with the Herrera ceramic complex was mediated through population dispersals. 
Our results show that the incoming genetic ancestry on the Altiplano is related to ancient and present-day populations speaking Chibchan languages from Central America. This can be explained most parsimoniously by Chibchan-related migrations from Lower Central America to South America, rather than back-migration to the isthmus. 
A separate study found evidence for a previously unknown south-to-north expansion of Chibchan-related ancestry from Lower Central America into the Mayan territories of Belize by 5600 yr B.P. Therefore, rather than modeling Central American populations associated with Chibchan languages as deriving from a mixture between North and South American ancestries, these results are consistent with an origin of Chibchan-related ancestries in Lower Central America, followed by bidirectional gene flow toward both Meso- and South America. This model of an original “Chibchan homeland” in Central America is supported not only by mtDNA studies on present-day populations who speak Chibchan languages but also from linguistic observations, indicating that the isthmus region exhibits the highest diversity within this language family. 
From an archaeological perspective, the Chibchan-related ancestry is first identified in 2000-year-old individuals associated with Herrera ceramics. In addition, previously sequenced Ceramic-associated individuals from Venezuela dated to 2400 yr B.P. also showed a high affinity to Central American populations speaking Chibchan languages. Despite the similar ancestry pattern and temporal frame, the two populations do not appear to form a simple sister group. This could be in line with linguistic evidence that suggests multiple, distinct Chibchan language expansions into South America, but additional studies will be necessary to further clarify this issue. 
After the arrival of the Chibchan-related ancestry, which completely reshaped the genetic landscape of the region, we find evidence of a long period of genetic continuity in the genetic profile of the local populations for over 1500 years (from at least 2000 to 500 yr B.P.). The stability in genetic ancestry encompasses the end of the Herrera period and the beginning of the Muisca period. This points to a scenario in which populations speaking languages from the Chibchan lineage would have settled the Altiplano before the emergence of traits normally associated with the Muisca culture, and it shows that this cultural transition took place without a substantial migration from regions with a distinct genetic ancestry composition. In addition, such a genetic continuity extends through different cultural phases within the Muisca period and persists until the Spanish colonization. Colonial linguistic documentation established that Muisca people spoke a now extinct Chibchan language. Our findings not only confirm their genetic link with speakers of Chibchan languages from Central America but also suggest that ancestral Chibchan languages, possibly basal to the Magdalenic branch that gave rise to the documented Muisca language, might have already been spoken on the Altiplano during the pre-Muisca Herrera period. 
While the representation of Indigenous populations in our dataset is certainly not exhaustive, the observed spatial pattern in the genetic affinity of post-2000 yr B.P. ancient Colombians with present-day Indigenous populations raises questions regarding the uneven distribution of populations speaking Chibchan languages across the Isthmo-Colombian area at the time of the Hispanic colonization, also referred to as a Chibchan “archipelago”. 
One possible explanation is that this distribution resulted from separate dispersals from Central America to different locations of northern South America rather than a single expansion wave, as suggested by the internal branching pattern of the Chibchan language family. However, it is also possible that the initial spread was more widespread and got later fragmented by post-Chibchan migration and admixture events. The observation that Chibchan-affiliated populations from northern Colombia have a significantly reduced genetic affinity to post-2000–yr B.P. ancient Colombians than to Lower Central Americans supports the role of population admixture in shaping the genetic diversity of northern South America.

Also, while the earlier South American hunter-gatherer clade that went extinct probably dated to the founding wave of modern humans in South America, they did not have notable Australasian or Melanesian ancestry. This disfavors the existence of a dramatically genetically distinct founding population of the Americas that preceded the main founding wave of modern humans and had Australasian or Melanesian genetic affinities.

Wednesday, April 30, 2025

Prehistoric Germans And Finns

Razib Khan has a short little recap of what ancient DNA and other sources tell us about the ancestors and linguistic origins of the Germanic and Finnish people. It is summed up in the following image:


He opens with this summary:

A 2023 preprint out of David Reich’s lab seems to have come close to pinpointing the origin of the Baltic’s Finnic peoples, while a 2025 preprint from his rival Eske Willerslev’s group may have uncovered the proto-Germanic tribes’ ancestral homeland in the most unexpected locale. Because whereas the Finnic tribes’ destination was the eastern Baltic, that same zone now appears to have been the proto-Germanics’ and their ancestors’ long mysterious origination point. In a general sense, the Finns’ and Estonians’, and their proto-Uralic ancestors’ more than 1,000-year journey from one end of Eurasia to the other is little surprise, just a refinement whose precise details linguists, archaeologists and now geneticists had long quested to pin down. But the very suggestion that what became Finland and Estonia were meanwhile the mysterious homeland of the earliest proto-Germanic-speaking people comes straight out of left field. Disciplines like archaeology have barely had time to come to grips with the ramifications of this possibility, with early scholarly response thus far amounting to little more than stunned silence.

He also provides a map of the range of the Uralic language family:


Both narratives make sense to me. 

The oldest historically attested homeland of the Germanic peoples, where proto-Germanic (the ancestor of Old Norse) was spoken, is in the vicinity of Denmark, Southern Sweden, and Southern Norway. But there was good reason, based upon historic archaeological cultures, the proto-Indo-European homeland, and genetics, to assume that their ancestors were among the Corded Ware people further to the east.

Likewise, there has never been any real doubt that the Uralic homeland was not in Finland, or that it was someplace further east, either in the Ural Mountains themselves, or in the Northeastern Asian region loosely described as Siberia.

To some extent, the arrival of the linguistically Uralic proto-Finns may have provided a motive for the exodus of the proto-Germans.

The genetics also tell us that the proto-Finns' arrival followed a pattern familiar in demographic history. A male dominated group arrived, conquered, and took local wives. As Razib explains:
Finnic mtDNA did not differ from that of their Scandinavian neighbors to the west, Finnic Y chromosomes were markedly distinct. About 60% of Finnish men carried haplogroup N, as compared to 7% of Swedish males, 3% of Norwegian men and 1% of male Danes (while N is basically wholly absent from Western and Southern Europe). 
Interestingly, haplogroup N, and Finland’s particular sublineage, is also found in populations to the east, from European Russia all the way out to Siberia’s Pacific coast. In Russia’s frigid far north, half of men carry this lineage, while among the Finnic-speaking ethnicities of the Russian Urals, the Udmurts and Mari, its frequency hovers around 30-50%. Among the Samoyed tribes, over 50% of men carry N. Finally, among the northeasternmost Turkic-speaking people in the world: the Yakuts of eastern Siberia, 80-95% of men are N. 
Though haplogroup N’s ambit is more extensive than the map of Uralic languages today, save for Hungarians, all Uralic-speaking populations harbor N in high numbers. If you are a man who carries N, you may not be Uralic, but if you are a (non-Hungarian) Uralic male, odds are good that you carry N.

We know why Hungary is different. Their Uralic language arrived in central Europe around 1000 CE, when it is historically attested that Magyar conquerors arrived, and didn't admix much with the locals, but through elite dominance, converted the local central European peasants to their language. We even know that these Magyar conquerors ventured west because Turkic speaking tribes of Huns pushed them out.

We also know about the peoples who came before the ancestors of the Germans, and then of the Finns: the first farmers of Europe, Linear Pottery Neolithic people derived from Western Anatolia, and the European hunter-gatherers who preceded them. 

The European hunter-gatherers who preceded the first farmers of Europe started from a clean slate in the Mesolithic era, because most of Northern Europe was either under glaciers or too frigid to be habitable around the time of the Last Glacial Maximum (LGM) around 20,000 years ago, and these glaciers had to retreat for several thousand years before the region could be repopulated. 

Neanderthals don't appear to have ever managed to populate regions that far north and were extinct many thousands of years prior to the LGM. Some Cro-Magnon (i.e. modern human) European hunter-gatherers, who started to arrive in Europe about 40,000 years ago, lived quite far north in Europe before the ice age that gave rise to the LGM. They were genetically rather similar to the European hunter-gatherers who repopulated Europe after the LGM. But the pre-LGM European hunter-gatherers either retreated to one of three main refugia in Southern Europe during that ice age, or died.

Slavic people replaced Finns in much of what is now Russia as they expanded, between their ethnogenesis in the historic era, around the time of the fall of the Roman empire, and sometime around the early modern period in Europe, which started around the time of the Renaissance and the Protestant Reformation (and later expanded into Northeast Asia).