A new article explores the genetic and linguistic origins of the Albanian people in the context of known historical events and ancient DNA.
The origins of the Albanian people have vexed linguists and historians for centuries, as Albanians first appear in the historical record in the 11th century CE, while their language is one of the most enigmatic branches of the Indo-European family.
To identify the populations that contributed to the ancestry of Albanians, we undertake a genomic transect of the Balkans over the last 8000 years, where we analyse more than 6000 previously published ancient genomes using state-of-the-art bioinformatics tools and algorithms that quantify spatiotemporal human mobility.
We find that modern Albanians descend from Roman era western Balkan populations, with additional admixture from Slavic-related groups. Remarkably, Albanian paternal ancestry shows continuity from Bronze Age Balkan populations, including those known as Illyrians. Our results provide an unprecedented understanding of the historical and demographic processes that led to the formation of modern Albanians and help locate the area where the Albanian language developed.
doi: https://doi.org/10.1101/2023.06.05.543790
The body text of the paper explains that:
During the Iron Age (1100 BCE–150 CE), the Balkans were characterised by remarkable cultural, linguistic, and genetic heterogeneity. In the western Balkans, “Celtic” cultures such as Hallstatt and La Tène, interacted for centuries with local groups referred to as the “Illyrians” and “Dalmatians”. Deep in the Balkan heartland, heterogenous populations named by classical authors as “Dacians”, “Dardanians”, “Moesians”, and “Paeonians” bordered nomadic cultures from the Pontic-Caspian steppe known as the “Scythians”, while the southeastern part of the peninsula was inhabited by “Thracians” and “Greeks”. Balkan peoples also expanded beyond the confines of the peninsula, with “Messapians” migrating to southeast Italy at least since 600 BCE.
The linguistic and cultural diversity of the Balkans was considerably homogenised during the Hellenistic, Roman, and especially the Migration Period, when Germanic and Slavic-speaking groups massively settled in the region. These events ultimately led to the extinction of all palaeo-Balkan languages except Greek and Albanian. The latter is one of the most enigmatic branches of the Indo-European language family, having vexed linguists for more than two centuries.
Tracing the origins of Albanians and their language is challenging for several reasons. Only a handful of historical sources comment on the ethnic and linguistic composition of the southwest Balkans during the transition from classical antiquity to Medieval times (500-1000 CE), and none of them mention an Albanian-speaking population from the territory of modern Albania. Speakers of Slavic languages are reported to have inhabited what is now southern Albania in the 8th century CE, where the frequency of Slavic toponyms also peaks, while the same region is characterised by the presence of Greek-speakers at least since the Medieval period. The urbanised Medieval populations of the northwest, referred to by contemporary historians as the Romani/Ῥωμᾶνοι, are thought to have spoken a variant of vulgar Latin known as West Balkan Romance that persisted at least until the 13th century CE. The demographic and linguistic situation in the mountainous interior is unknown, and it is only in the 11th century CE that Albanians appear in the historical record, while the earliest surviving written document of their language dates to 1462 CE.
A number of linguistic hypotheses have attempted to identify the affinities of the Albanian language and to locate the region where it developed, yet no definitive conclusions have been drawn. The most prominent, mutually exclusive hypotheses can be divided into those arguing for a local west Balkan origin from an Illyrian or Messapic background [which may or may not have been distinct languages], and those proposing a non-local origin from a Daco-Moesian-Thracian background or an unattested Balkan language, whose speakers entered Albania from the central-east Balkans sometime after 400 CE.
The validity of these hypotheses, although hotly debated, is hard to test, as these ancient languages are poorly recorded, being known only from fragmentary inscriptions, toponyms, and a handful of historical sources.
Furthermore, all of the ethnonyms of ancient Balkan peoples, such as “Illyrian” and “Thracian”, are likely artificial labels that were coined by ancient and modern authors, and may include several related languages with largely obscure geographical limits, intelligibility, and emic identities of their speakers. The most recent linguistic hypotheses propose a sister-group relationship of Albanian to Greek or to the Greek-Armenian clade, which firmly places the origin of the language in the Balkans but does not pinpoint the location of the proto-Albanian homeland within the peninsula and its potential affiliation to historically attested populations.
Archaeological data on Albania’s Medieval cultures are also inconclusive, especially for the Komani-Kruja complex (ca. 600-800 CE), which has been interpreted as the cultural expression of either a Romanised population (local or intrusive) or an indigenous Albanian-speaking group.
Due to the challenges associated with linking archaeological, literary, and linguistic evidence, an archaeogenetic approach may offer novel insights into the origin of the Albanians, their biological relationships to ancient people, and the affinities of their language. Although gene flow is not always accompanied by language shifts [as in the case of Basque and Etruscan], migration is one of the primary vectors of cultural change, of which language dissemination is a frequent outcome. Recent years have witnessed a surge in the palaeogenomic sampling of the Balkan peninsula, yet the resulting datasets have not been mined to help us understand how migration led to the emergence and spread of new material cultures, communities, and languages in the territory of modern Albania. . . .
Our genomic transect of the population of Albania from the Neolithic to the modern era reveals fluctuations in genetic ancestry over a period of 8000 years. In contrast to the southeastern Balkans, where the arrival of Pontic-Caspian steppe ancestry and the associated Indo-European cultural package during the EBA did not lead to a lasting genetic turnover, we show that contemporary populations in Albania were genetically transformed both in autosomal and paternal ancestry. We find that more than a millennium later, BA-IA Balkan populations with high levels of steppe ancestry (30-40%) formed a distinct genetic cluster that extended from northwestern Greece, North Macedonia and the Adriatic coast (including Albania) and transcended archaeological and linguistic boundaries. This genetic continuum was broken down across the Balkans during the Roman and Migration period, due to mass settlement of Germanic and Slavic-speaking groups in the region.
However, in agreement with linguistic studies, we find that Albanians likely descend from a surviving West palaeo-Balkan population that experienced significant demographic increase approximately between 500-800 CE, perhaps after a population bottleneck. We show that in contrast to the rest of the Balkans, the Medieval samples from both North and South Albania experienced little to no contribution from surrounding Slavic populations and maintained high levels of BA-IA West Balkan ancestry.
Remarkably, the same genetic profile persisted 500-800 years later in most of the post-Medieval samples from Bardhoc, as shown both by the PCA, qpAdm analyses, and IBD data, which indicate significant genetic continuity from the Medieval populations of Albania. However, qpAdm models cannot exclude the possibility of additional admixture with currently unsampled neighbouring late Roman-early Medieval palaeo-Balkan groups with a similar ancestry profile. Based on linguistic data, the area of modern Kosovo and southeastern Serbia may have been such a source.
Despite being largely unaffected by the demographic changes that took place during the Migration period, the historical Albanians did not emerge in isolation. At the peak of the Migration Period, the Medieval population of Albania displayed genetic links as far as Pannonia, while in post-Medieval times we detected the presence of individuals likely related to modern Roma people. Furthermore, two of the post-Medieval samples exhibit significant admixture with South Slavic populations, and modern Albanians display highly variable levels of Slavic ancestry. This indicates complex historical interactions with South Slavic populations, as suggested by toponymy and linguistics.
We reveal that a significant proportion of the paternal ancestry of modern Albanians derives from groups ultimately descending from the BA-IA West Balkans, including those traditionally known as “Illyrians”, which reflects our findings on autosomal ancestry. However, inferring the language spoken by the Medieval samples from Albania is challenging, as Greek, South Slavic and West Balkan Romance are the only recorded languages of the region, while there is no indication of the survival of “Illyrian” following the first centuries of Roman rule. Furthermore, Albanian displays Latin loans from both the Western and Eastern Balkans, which attests to linguistic influences beyond the confines of modern Albania. Testing the Messapic hypothesis for Albanian was not possible due to the low coverage of said samples. Although the presence of haplogroups J2b-L283, I-M223, and R1b-Z2103 among the Messapians suggests a West Balkan origin, whether a related language persisted in the Balkans during Medieval times is unknown.
Even though Eastern Roman historians were unfamiliar with Albanians, we cannot exclude the possibility that proto-Albanians interacted with populations speaking Greek, Aromanian, or Slavic in what is now southern Albania during Medieval times. Given that genetic data strongly suggest a predominantly local origin for Albanians, their Medieval ancestors may have inhabited a geographically restricted area [possibly the region of Mat in central Albania], only occasionally venturing towards the south. These movements may have increased in scale over time, finally attracting the attention of Greek-speaking historians in the 11th century.
While the quest for the origins of the Albanian language will certainly continue, we expect that the present study will shape these debates and provide the necessary framework for more extensive research on the genetic ancestry of the ancient and modern inhabitants of Albania.