The distinctions are subtle, but with enough data the modern Han Chinese can be deconstructed to reveal distinct ancestral groups that are dissolving into each other in the biggest cities of China.
Han Chinese is the most populated ethnic group across the globe with a comprehensive substructure that resembles its cultural diversification. Studies have constructed the genetic polymorphism spectrum of Han Chinese, whereas high-resolution investigations are still missing to unveil its fine-scale substructure and trace the genetic imprints for its demographic history. Here we construct a haplotype network consisted of 111,000 genome-wide genotyped Han Chinese individuals from direct-to-consumer genetic testing and over 1.3 billion identity-by-descent (IBD) links. We observed a clear separation of the northern and southern Han Chinese and captured 5 subclusters and 17 sub-subclusters in haplotype network hierarchical clustering, corresponding to geography (especially mountain ranges), immigration waves, and clans with cultural-linguistic segregation. We inferred differentiated split histories and founder effects for population clans Cantonese, Hakka, and Minnan-Chaoshanese in southern China, and also unveiled more recent demographic events within the past few centuries, such as Zou Xikou and Chuang Guandong. The composition shifts of the native and current residents of four major metropolitans (Beijing, Shanghai, Guangzhou, and Shenzhen) imply a rapidly vanished genetic barrier between subpopulations. Our study yields a fine-scale population structure of Han Chinese and provides profound insights into the nation's genetic and cultural-linguistic multiformity.

Ao Lan, et al., "Fine-scale Population Structure and Demographic History of Han Chinese Inferred from Haplotype Network of 111,000 Genomes" bioRxiv (July 3, 2020) doi: https://doi.org/10.1101/2020.07.03.166413