Iran's population is fairly distinctive genetically within Southwest Asia, and five of the eight ethnicities studied form one big supercluster. Within that supercluster, the Arab ethnicity breaks out distinguishably from the other four ethnicities, which overlap to a great extent; the separation shows up mostly on Principal Component 2 of the data. PC 1 and PC 3 are pretty much an overlapping jumble for the five core Iranian populations.
In the top two PCA charts, light purple is Turkmen, light blue is Persian Gulf Islanders, and orange is Baluch. In all four PCA charts, dark blue is Arab, red is Lur, green is Azeri, yellow is Persian, and dark purple is Kurd. The close-up is of the PCA chart labeled "C", which compares PC 1 to PC 2 for the five core Iranian ethnicities only.
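For readers unfamiliar with how charts like these are produced, here is a minimal sketch of principal component analysis on a genotype matrix. It is not the paper's pipeline. The data are synthetic, the two groups "A" and "B" are hypothetical, and the function name `pca_scores` and all parameters (40 samples per group, 500 variants, the allele-frequency shift) are assumptions chosen only for illustration.

```python
import numpy as np


def pca_scores(genotypes, n_components=3):
    """Project samples onto the top principal components using an SVD."""
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each variant (column) on its mean
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates
    explained = (S ** 2) / np.sum(S ** 2)  # share of variance per component
    return scores, explained[:n_components]


# Synthetic example (assumption): two hypothetical groups whose allele
# frequencies differ slightly at each of 500 variants.
rng = np.random.default_rng(0)
n_variants = 500
freq_a = rng.uniform(0.1, 0.9, n_variants)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, n_variants), 0.01, 0.99)

# Genotypes are counts (0, 1 or 2) of the alternate allele per sample.
group_a = rng.binomial(2, freq_a, size=(40, n_variants))
group_b = rng.binomial(2, freq_b, size=(40, n_variants))
genotypes = np.vstack([group_a, group_b])
labels = np.array(["A"] * 40 + ["B"] * 40)

scores, explained = pca_scores(genotypes)

# Compare the two groups along the first principal component.
pc1_a = scores[labels == "A", 0]
pc1_b = scores[labels == "B", 0]
gap = abs(pc1_a.mean() - pc1_b.mean())
within_sd = np.sqrt((pc1_a.var() + pc1_b.var()) / 2)
print("variance explained by PC1-PC3:", np.round(explained, 3))
print("PC1 gap between group means:", round(gap, 2), "within-group sd:", round(within_sd, 2))
```

Plotting one column of `scores` against another, colored by `labels`, gives a scatter chart of the same kind as those described above. Groups that overlap on a given component, as the five core Iranian ethnicities do on PC 1 and PC 3, would show a gap between group means that is small relative to the within-group spread.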
Considering the application of human genome variation databases in precision medicine, population-specific genome projects are continuously being developed. However, the Middle Eastern population is underrepresented in current databases. Accordingly, we established Iranome database (www.iranome.ir/com) by performing whole exome sequencing on 800 individuals from eight major Iranian ethnic groups representing the second largest population of Middle East. We identified 1,575,702 variants of which 308,311 were novel (19.6%). Also, by presenting higher frequency for 37,384 novel or known rare variants, Iranome database can improve the power of molecular diagnosis. Moreover, attainable clinical information makes this database a good resource for classifying pathogenicity of rare variants. Principal components analysis indicated that, apart from Baluchs, Turkmen and Persian Gulf Islanders, who form their own clusters, rest of the population were genetically linked, forming a super-population. Furthermore, only 0.6% of novel variants showed counterparts in "Greater Middle East Variome Project", emphasizing the value of Iranome at national level by releasing a comprehensive catalogue of Iranian genomic variations and also filling another gap in the catalogue of human genome variations at international level. We introduce Iranome as a resource which may also be applicable in other countries located in neighboring regions historically called Greater Iran (Persia).
Fattahi Z et al., Iranome: A catalogue of genomic variations in the Iranian population. Hum Mutat. (July 25, 2019). doi: 10.1002/humu.23880. [Epub ahead of print]
Supplemental information including population descriptions and PCA plots is here (MS Word).