Wednesday, June 3, 2020

Quebec As A Natural Experiment

The National Geographic channel (which is included in the newly launched Disney+ streaming service) has a new historical fiction TV drama series called "Barkskins" about the Filles du Roi (aka King's Daughters), who were sent to New France by King Louis XIV (who died in 1715 CE) to correct the gender imbalance in the colony.

Mostly, they ended up in Quebec.

Quebec is a hot spot for international genetics research on the multi-generational inheritance of genetic traits. It serves as a long-running natural experiment: observed inheritance there can be benchmarked against theoretical models to test how reliable those models are.

This is because it has high quality and very complete birth, death and marriage records for about 400 years, with additional descriptive data because lots of people journaled or wrote vivid letters to family that were preserved, going all of the way back to this generation of men and women.

Also, there was no massive migration to or from Europe after this era, or to or from the rest of North America (except a pretty much one-time, one-directional migration to Louisiana), in part due to language barriers, since no new Francophone territories were left after the Louisiana Purchase in 1803. So, while this is certainly an oversimplification (as a comprehensive capsule history of Quebec below the fold points out in detail), you still have a comparatively isolated gene pool pretty much until the 20th century, for a dozen generations or so after the women depicted in the new drama. And, because the habitable part of Quebec has essentially no natural geographic barriers and few strict endogamy-inducing cultural divisions, Quebec can be studied as a first approximation of a single, unstructured, closed gene pool, which is what lots of widely used population genetic models assume.

For example, in the case of many rare genes found among people in Quebec, it is possible to identify by name the specific women from this founding population group who introduced the trait into the gene pool of Quebec, women about whom we know something beyond mere tombstone information.
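To make the idea of tracing a trait back to a named founder concrete, here is a minimal sketch in Python. It assumes a toy pedigree stored as a child-to-parents mapping; the pedigree, the names in it, and the helper functions `ancestors` and `candidate_founders` are all invented for illustration and are not taken from any real Quebec genealogical database. The logic is simply to intersect the ancestor sets of all known carriers and keep the shared ancestors who have no recorded parents.

```python
# Hypothetical toy pedigree: child -> (mother, father). All names are invented.
PEDIGREE = {
    "carrier_a": ("m1", "f1"),
    "carrier_b": ("m2", "f2"),
    "m1": ("founder_x", "founder_y"),
    "f2": ("founder_x", "founder_z"),
}


def ancestors(person, pedigree):
    """Return the set of every recorded ancestor of `person`."""
    found = set()
    stack = [person]
    while stack:
        current = stack.pop()
        # People without an entry in the pedigree have no recorded parents.
        for parent in pedigree.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def candidate_founders(carriers, pedigree):
    """Founders (no recorded parents) who are ancestors of every carrier."""
    if not carriers:
        raise ValueError("need at least one carrier")
    shared = set.intersection(*(ancestors(c, pedigree) for c in carriers))
    return {person for person in shared if person not in pedigree}


founders = candidate_founders(["carrier_a", "carrier_b"], PEDIGREE)
print(founders)
```

In a real study a shared ancestor is only a candidate; molecular evidence such as a shared haplotype would be needed to confirm that she actually carried the allele.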

This makes it possible, for instance, to quantify exactly the statistical impact of a gene on the fertility and mortality of those who carried it, and to precisely quantify mutation rates of genes present in the founding populations by looking at all descendants of that person who inherited either the original version or a mutated version of the original gene.
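As a rough illustration of the kind of counting involved, the sketch below estimates a per-transmission mutation rate and a carrier-versus-non-carrier fertility ratio from pedigree-style records. The function names (`mutation_rate`, `relative_fertility`) and all of the numbers are made up for the example; real analyses would also need confidence intervals and corrections for incomplete records.

```python
def mutation_rate(transmissions):
    """Fraction of parent-to-child transmissions in which the allele changed.

    `transmissions` is a list of (parent_allele, child_allele) pairs.
    """
    if not transmissions:
        raise ValueError("need at least one observed transmission")
    mutated = sum(1 for parent, child in transmissions if parent != child)
    return mutated / len(transmissions)


def relative_fertility(carrier_children, noncarrier_children):
    """Mean number of children of carriers divided by that of non-carriers."""
    if not carrier_children or not noncarrier_children:
        raise ValueError("both groups need at least one individual")
    carrier_mean = sum(carrier_children) / len(carrier_children)
    noncarrier_mean = sum(noncarrier_children) / len(noncarrier_children)
    return carrier_mean / noncarrier_mean


# Invented data: 10,000 observed transmissions, 2 of which show a changed allele.
observed = [("A", "A")] * 9998 + [("A", "a")] * 2
print(mutation_rate(observed))

# Invented family sizes for carriers and non-carriers.
print(relative_fertility([6, 4, 5], [5, 5, 5, 5]))
```

The point of a complete genealogy is that the denominator (every transmission, every family) is known rather than estimated.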

There are a few other places where that can be done because the records capture essentially a complete and isolated gene pool (e.g. Iceland and Korea have 500+ year time series of accurate vital statistics records at the individual by individual level), but not many. In lots of other places that were keeping good written vital statistics records from the late 17th century or earlier, wars, fires and natural disasters destroyed the records at some point.

So you can use the example of Quebec to compare mathematically idealized multi-generational population genetic models with reality, and to see which departures from the mathematical ideal (from the fact that there is no such thing as a fractional child in real life, to the fact that "panmixia" isn't a realistic assumption even in a very weakly structured population) matter enough that they need to be included in a model that accurately describes reality.
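The "no fractional children" point can be sketched with a standard Wright-Fisher drift simulation, shown here in Python as a minimal illustration rather than as the method used in any of the papers cited in this post. In an idealized, infinitely large, randomly mating population with no selection, a neutral allele's frequency would stay constant from one generation to the next; once each generation consists of a whole number of gene copies sampled at random, the frequency wanders. The function name `wright_fisher` and the parameter values are assumptions chosen for the example.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=None):
    """Allele frequency trajectory under neutral drift with random mating.

    The population has `pop_size` diploid individuals, i.e. 2 * pop_size
    gene copies per generation. Returns generations + 1 frequencies.
    """
    rng = random.Random(seed)
    copies = 2 * pop_size
    count = round(start_freq * copies)
    trajectory = [count / copies]
    for _generation in range(generations):
        freq = count / copies
        # Each gene copy in the next generation is drawn independently from
        # the parental pool, so the allele count is always a whole number.
        count = sum(1 for _copy in range(copies) if rng.random() < freq)
        trajectory.append(count / copies)
    return trajectory


# A rare allele (5%) in a small founding group, followed for a dozen generations.
for run in range(3):
    path = wright_fisher(pop_size=100, start_freq=0.05, generations=12, seed=run)
    print([round(f, 3) for f in path])
```

Each run starts from the same frequency but ends somewhere different, and an allele that reaches 0 or 1 stays there. This sketch still assumes panmixia and a constant population size, which are exactly the assumptions that real genealogical data let you test.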

Selected references from this body of scientific literature, and a detailed capsule history of Quebec from the body text of one of those articles, appear below the fold.


One of the earlier examples of this genetics literature on Quebec is the following article, the abstract of which is provided below:

The population of Quebec, Canada (7.3 million) contains ∼6 million French Canadians; they are the descendants of ∼8500 permanent French settlers who colonized Nouvelle France between 1608 and 1759. Their well-documented settlements, internal migrations, and natural increase over four centuries in relative isolation (geographic, linguistic, etc.) contain important evidence of social transmission of demographic behavior that contributed to effective family size and population structure. This history is reflected in at least 22 Mendelian diseases, occurring at unusually high prevalence in its subpopulations. Immigration of non-French persons during the past 250 years has given the Quebec population further inhomogeneity, which is apparent in allelic diversity at various loci. The histories of Quebec's subpopulations are, to a great extent, the histories of their alleles. Rare pathogenic alleles with high penetrance and associated haplotypes at 10 loci (CFTR, FAH, HBB, HEXA, LDLR, LPL, PAH, PABP2, PDDR, and SACS) are expressed in probands with cystic fibrosis, tyrosinemia, β-thalassemia, Tay-Sachs, familial hypercholesterolemia, hyperchylomicronemia, PKU, oculopharyngeal muscular dystrophy, pseudo vitamin D deficiency rickets, and spastic ataxia of Charlevoix-Saguenay, respectively) reveal the interpopulation and intrapopulation genetic diversity of Quebec. Inbreeding does not explain the clustering and prevalence of these genetic diseases; genealogical reconstructions buttressed by molecular evidence point to founder effects and genetic drift in multiple instances. Genealogical estimates of historical meioses and analysis of linkage disequilibrium show that sectors of this young population are suitable for linkage disequilibrium mapping of rare alleles. How the population benefits from what is being learned about its structure and how its uniqueness could facilitate construction of a genomic map of linkage disequilibrium are discussed.
Charles R. Scriver, "Human Genetics: Lessons From Quebec Populations" 2 Annual Review of Genomics and Human Genetics 69-101 (September 2001) (open access).

The introductory material and discussion of Quebec-specific history in this paper are unusually well done for a journal article on genetics; such articles often gloss over these important background facts:
Patterns of migration and admixture over the last 100,000 years are determinants of the recent microevolution of human populations; 500 years ago, range expansion and colonization began to be important processes by which Europeans established new populations overseas. The large-scale analyses by Cavalli-Sforza et al. and the further reflections by Fix provide depth and breadth to our understanding of human migrations over the past many millennia; their work serves as a palimpsest for this selective analysis of a New World population. However, any attempt to study the Quebec example of human population genetics without taking into consideration associated cultural features is likely to lead to confusion; accordingly, how colonization, immigration, migration, and expansion by natural increase occurred in Quebec is described in some detail here because it will explain why there are subpopulations. 
Modern continental Europeans have population structures that resemble networks more than trees, and the population of France has a particularly dense network configuration. In the seventeenth century, emigration to overseas colonies occurred from particular regions of France, with selective sampling of the French population structure. From these initial expansions were created two separate populations of French settlers in the New World. The founders of Nouvelle France (which was later incorporated into Lower Canada and then became Quebec) established French enclaves along the banks of the St. Lawrence River. Another population of settlers from France established an independent North American colony in the seventeenth century; it was situated on the southern shore of the Bay of Fundy. The latter were the Acadians; they flourished until their colony was dispersed by the British in the 1750s. The Acadians retained a lively culture and identity, and their forced diaspora had genetic consequences in the twentieth century, notably, clustering of Mendelian disorders among the Acadians, among those who settled anew in Louisiana (the Cajuns), and those who resettled in Canada, some 4000 of them in Quebec. 
Migration is a recurrent theme of this review. The “unit of migration,” as it is perceived by Alan Fix and discussed elsewhere, was probably small, particularly in the geographic loci of settlement in the New World by the original French colonizers of the seventeenth and eighteenth centuries. Again it was small when some of the French Canadians began a second internal migration in the nineteenth century from Charlevoix County (on the northern shore of the St. Lawrence River downstream from Quebec City) into the region of Saguenay Lac St. Jean in northeastern Quebec. A kin-structured migration, involving shared genetic ancestry, was highly probable in the event and would have increased the chance of founder effect and genetic drift. It follows that awareness of the cultural and social behavior of the settlers and their descendants in the New World is necessary if we are to explain the notable clusterings of Mendelian diseases in contemporary Quebec. 
The Acadian theme receives little further attention here, but on behalf of modern-day Acadians in Canada and in Louisiana, it deserves extensive formal treatment at some future time. The genetic contribution by the Native Americans is also a regretted omission; they were significant in the success of the early colonists of Nouvelle France. 
Themes and Approach 

The histories of populations and the histories of their alleles can reflect each other. In the particular case of French Canadians, demographic and genetic histories may begin to provide the history of a genome. 
The colonization of Nouvelle France by (largely) French settlers in the seventeenth and eighteenth centuries, their subsequent natural increase from founders in relative genetic isolation, and their own interregional migrations underlie the present-day regional clustering of at least 22 hereditary diseases in Quebec. Migration to Quebec continued after the French colony became British in 1759; it mainly comprised non-French persons. These latter day immigrants brought other rare alleles that now account for some clusterings of genetic diseases in Quebec. These unique demographic and social histories and the relative youth of this New World population account for differences in its genetic structure compared with that of Finland, for example (see J. Kere, Human Population Genetics: Lessons from Finland, this Volume).


Alleles are not randomly distributed in the geographic regions of Quebec and they reveal both intrapopulation and interpopulation distributions that reflect the patterns of migration and population growth. 
The high prevalence of several autosomal recessive diseases in Quebec might be a result of inbreeding, although the evidence indicates otherwise. The degree of inbreeding is not exceptional, as shown by Catholic records of dispensation, measures of homozygosity at the PAH locus, and by analysis of consanguinity and kinship in a non-Mendelian genetic disease (Down syndrome). The more likely explanation for certain disease prevalences and clustering in Quebec is founder effect and/or genetic drift. 
Highly penetrant pathogenic alleles, usually rare in frequency elsewhere, appear at elevated frequencies in Quebec and manifest themselves as autosomal recessive or dominant diseases. Although the origins of these alleles apparently lie in founder effects and/or genetic drift (during rapid expansion in relative isolation), allelic heterogeneity is observed repeatedly in the disease clusters. Phenotypic heterogeneity also occurs in several of the Quebec diseases, which could be the result of the allelic heterogeneity but also the result of modifier loci (in which case, studies of genotype/phenotype correlations in Quebec families could be informative). Whether the Quebec model will be equally informative for genomic searches of prevalent alleles with low penetrance and associated with common multifactorial disease is a question yet unanswered and certainly of timely interest. 
To know how Quebec was peopled and how its population increased over 400 years is to appreciate events reflected in the distribution of some rare penetrant alleles, as it were, into subpopulations. Ten loci are used to develop the themes as follows: The PAH locus illustrates the effects of demography on interpopulation and regional allelic diversity; the CFTR locus also has those features but also shows a founder effect in Ch-SLSJ. Two dyslipidemias illustrate (through the LDLR and LPL loci) intrapopulation allelic heterogeneity and regional diversity among French Canadians. Two loci (HEXA, HBB) remind us that the appearance of a typical ethnic disease (e.g. Tay-Sachs, β-thalassemia) in a nontypical population, as a transpopulation event, need not result from its introduction by someone from the typical community. Four loci (FAH, PDDR, SACS, and PABP2) reflect founder effects and/or genetic drift and show how populations in eastern Quebec, in which the number of historical meioses is very large, can serve the mapping of genes by linkage disequilibrium, even in a young population. 
The Peopling of Quebec 

The province of Quebec is a vast modern geopolitical entity. It occupies a region of the North American continent easily visible from space. In 1998, there were 7.3 million persons in the Province, 80% distributed in the St. Lawrence Valley, with 3.4 million living in the Montreal region and 1.62 million in the northeast (administrative regions 01, 02, 03, and 09). The 6 million French Canadians of Quebec consider themselves to be descendants of the original colonists. The anglophone population, comprising descendants from earlier settlers and more recent immigrants, is only 0.6 million; the Native American population is smaller still, ∼72,000. The balance of the population is allophone, reflecting non-French, non-British immigration, comprising 225,000 of Italian origin, 100,000 Jewish, 90,000 German, 60,000 Greek, 40,000 Portuguese, and similar numbers of Polish and Arab-speaking nationals. 
With the exception of its Native Americans, all citizens of Quebec today are either immigrants themselves or descendants of recent (within the past 400 years) immigrants. With only modern exceptions, those settlers and immigrants are of European origin, and they account for population growth and changes in its components during the past four centuries. 
Modern Quebec, known as Nouvelle France before 1759 and as Lower Canada after 1841, became a province of Canada at Confederation in 1867. By the mid-nineteenth century, the formerly French (and francophone) colony had significantly changed by virtue of non-French immigration. The anglophone population (221,000 persons) became 25% of the total, itself comprising people of Irish (60%), English (29%), and Scottish (11%) origins. 
The large non-French immigration to Quebec mainly took place after 1820, the majority settling in the Montreal region. However, the ethnic origin could be indistinct; for example, among the Catholic Irish, intermarriage with French Canadians was frequent, and when the mother was Irish, the offspring would be counted as French in the Quebec census. By 1931, one in seven Quebecois of Irish origin spoke French as the mother tongue. Such details are important when inferring, for example, the origins of particular PAH alleles in Quebec families today. 
In the 1890s, Canada developed an immigration policy that has remained relatively constant. In the late twentieth century, Quebec was accepting 25,000 immigrants annually. A total of 600,000 new citizens arrived from Europe, Africa, Latin America, and Asia during that century. 

In 1535, Jacques Cartier and his crew wintered over at Stadacona near what would later become Quebec City. Native tales of gold in the Kingdom of Saguenay persuaded Francis I, King of France, to send Cartier back in 1541 to form a colony. The venture failed. In the 1580s, fur trade became a new focus of interest and traders reached the Island of Montreal in 1600 where they built a post and wintered over for the year. At the time, excepting the few possible survivors of Raleigh's Virginia Colony, these were the only Europeans to live through a winter on the American continent north of Florida.
In 1603, Samuel de Champlain sailed the St. Lawrence River and then returned to France. The following year he returned, settled temporarily on the Island of St. Croix off the coast of New Brunswick, and in 1605 explored the coast that would soon host the Puritan colony at Plymouth. (Champlain did not claim this land for France; the “what-if” school of history pauses to reflect.) Champlain returned to the St. Lawrence River in 1608 and established a permanent post on the future site of Quebec City. The French colony of Nouvelle France had begun. 
A post at Trois-Rivières was established in 1634 where the St. Francis river from the south and the St. Maurice River from the north meet the St. Lawrence; the site would divide Quebec into Western and Eastern halves of settlement. Champlain died in his colony in 1635; at the time, some 200 settlers were established in Nouvelle France. In 1642, a religious impulse, led by the soldier Sieur de Maisonneuve and the dynamic Ursuline nun Jeanne Mance, created a permanent settlement at Montreal. 
The settlers of Nouvelle France and their descendants are among the best documented populations in the world. The archives, which cover births, marriages, and deaths in the Catholic parishes and reflect civic censuses, reveal who came as a colonist, from where in France, who returned permanently from the New World to France, who settled where in the St. Lawrence Valley, and how the population grew by immigration and natural increase. There were fewer than 10,000 permanent settlers [some say only 8483 settlers actually contributed to population expansion] in the 150 years of this European colonization, but by the mid-eighteenth century, natural increase had brought the French Canadian population to 60,000 inhabitants. The records also describe settlement, first along the northern shore of the river east of Quebec City in the region of Charlevoix from where, because of population expansion, there was migration up the Saguenay fiord followed by settlement after 1832 in the Saguenay Lac St. Jean region (SLSJ). The latter, a form of kin-structured migration, introduced a second bottleneck and potential for founder effect. The pertinent archives for the Charlevoix and SLSJ regions now exist as a computerized genealogical database. 
The Seven Years War (1756–1763) engulfed the European world and its colonies. After a long siege, on September 12, 1759, British troops led by Wolfe climbed the cliffs of Quebec at night; in the morning, they faced Montcalm on the Plains of Abraham. By evening, the colony of Nouvelle France had become a British possession; the Treaty of Paris in 1763 confirmed this. Yet, by 1776, the British Crown held only a foothold in North America, and this only by virtue of its acquired Canadian colony. It had lost what would become the United States of America, and it left the continent to its Native Americans, Spanish colonists, English religious dissidents, and other European emigrants. 

The demography of Quebec changed after the Treaty of Paris in 1763. French immigration essentially ceased and was replaced in the eighteenth century by settlers from Great Britain and by Loyalist emigrants leaving the newly created United States of America. 
By the time of the Canadian Confederation in 1867, 93% of the inhabitants of Lower Canada were Canadian-born and comprised three groups: Native Americans, French Canadians, and descendants either of Loyalists from the United States or British settlers. While the Native American population was in decline, the French Canadians were still expanding. In spite of almost no new francophone immigration to Quebec (the 4000 Acadians were a small exception), the French Canadian population doubled every 30 years between 1750 and 1875. The increase was attributed to the high birth and fertility rates of women who married early. Expansion of the French Canadian population took place largely in rural areas, yet it also required spatial expansion to accommodate it in the Charlevoix region, which ultimately led to internal migration and the founding of the SLSJ enclave centered on Chicoutimi. There was also migration to the cities and, during the economic depression of 1873, movement to the United States, with implications for a diaspora of French Canadian alleles inside and beyond Quebec. 

The twentieth century saw the trials and triumphs of materialism in Canada. The prospects were sufficient in the first decade for its Prime Minister to propose that this was to be “Canada's Century.” Immigrants came in large waves in the first three decades and again after World War II. In the final four decades of the twentieth century, the Canadian population more than doubled, with one fifth of the population (30.7 million in the year 2000) as new immigrants. Between 1946 and 1971, of the 3.5 million people who entered Canada, 15% settled in Quebec; only 5% came from francophone countries. The immigration policy fractured the traditional identity of Canada (and Quebec), and the country became “a polity in search of a nation”. The lingering effect of that policy may now be contributing to francophone Quebec's renewed search to retain its identity through political independence. Immigration brought to Quebec a collective genetic heritage quite different from the one initiated four centuries earlier. Each part of that heritage now echoes in the spectrum of alleles segregating in Quebec's population.
Another example of this literature is:

Knowledge of the genetic demography of Quebec is useful for gene mapping, diagnosis, treatment, community genetics and public health. The French‐Canadian population of Quebec, currently about 6 million people, descends from about 8500 French settlers who arrived in Nouvelle‐France between 1608 and 1759. The migrations of those settlers and their descendants led to a series of regional founder effects, reflected in the geographical distribution of genetic diseases in Quebec. This review describes elements of population history and clinical genetics pertinent to the treatment of French Canadians and other population groups from Quebec and summarizes the cardinal features of over 30 conditions reported in French Canadians. Some were discovered in French Canadians, such as autosomal recessive ataxia of the Charlevoix–Saguenay (MIM 270550), agenesis of corpus callosum and peripheral neuropathy (MIM 218000) and French‐Canadian‐type Leigh syndrome (MIM 220111). Other conditions are particularly frequent or have special genetic characteristics in French Canadians, including oculopharyngeal muscular dystrophy, hepatorenal tyrosinaemia, cystic fibrosis, Leber hereditary optic neuropathy and familial hypercholesterolaemia. Three genetic diseases of Quebec First Nations children are also discussed: Cree encephalitis (MIM 608505), Cree leukoencephalopathy (MIM 603896) and North American Indian childhood cirrhosis (MIM 604901).

A-M Laberge, et al., "Population history and its impact on medical genetics in Quebec" 68(4) Clinical Genetics 287-301 (August 19, 2005)

More recently, for example, this article contributed to the historical genetics literature based upon data from Quebec:
Genome-wide association studies (GWAS) have had a tremendous success in the identification of common DNA sequence variants associated with complex human diseases and traits. However, because of their design, GWAS are largely inappropriate to characterize the role of rare and low-frequency DNA variants on human phenotypic variation. Rarer genetic variation is geographically more restricted, supporting the need for local whole-genome sequencing (WGS) efforts to study these variants in specific populations. Here, we present the first large-scale low-pass WGS of the French-Canadian population. Specifically, we sequenced at ~5.6× coverage the whole genome of 1970 French Canadians recruited by the Montreal Heart Institute Biobank and identified 29 million bi-allelic variants (31% novel), including 19 million variants with a minor allele frequency (MAF) <0.5%. Genotypes from the WGS data are highly concordant with genotypes obtained by exome array on the same individuals (99.8%), even when restricting this analysis to rare variants (MAF <0.5, 99.9%) or heterozygous sites (98.9%). To further validate our data set, we showed that we can effectively use it to replicate several genetic associations with myocardial infarction risk and blood lipid levels. Furthermore, we analyze the utility of our WGS data set to generate a French-Canadian-specific imputation reference panel and to infer population structure in the Province of Quebec. Our results illustrate the value of low-pass WGS to study the genetics of human diseases in the founder French-Canadian population.
Cécile Low-Kam, et al., "Whole-genome sequencing in French Canadians from Quebec", 135 Human Genetics 1213-1221 (July 4, 2016).
