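The international journal PLoS ONE recently published online the latest research from Professor Li Jin and colleagues at the Institute of Computational Biology, Shanghai Institutes for Biological Sciences: "A Map of Copy Number Variations in Chinese Populations". The work constructs the first copy number variation map that covers both Han Chinese and ethnic minorities. It offers new perspectives and reference data for studying genomic diversity, population differentiation and environmental adaptation in Chinese populations, and for mapping the genes that underlie complex traits.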
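Copy number variation (CNV) has been a focus of human genomics and genetics research in recent years. So far, studies of CNVs in Chinese populations, whether large international collaborations or independent domestic projects, have concentrated mostly on Han Chinese, just as almost all genome-wide association studies (GWAS) have been carried out in Han Chinese. However, most of the genetic diversity of Chinese populations resides in the ethnic minorities. In addition, CNVs have a much larger effect on gene function and phenotype than other variants such as single nucleotide polymorphisms (SNPs), but they also occur at lower frequencies, so genome-wide searches for CNVs associated with a specific phenotype or disease often require a reference database. Using Affymetrix array technology and the information from nearly one million CNV probes, the study detected genome-wide CNVs in population samples of Han, Tibetan, Dong, Yao, Zhuang, Li and Uyghur individuals. It then systematically compared genomic diversity and population differences between the minorities and Han Chinese, and between Chinese populations and populations from other continents. The study found that as many as 35% of the CNV regions identified in minority groups are not shared with Han Chinese. Compared with results from European populations, the transferability of tag CNVs among Chinese populations is much lower, which points to the need for a comprehensive study of CNVs in Chinese populations. Further analysis suggests that population-specific CNVs may be related to long-term adaptation of each population to its particular living environment.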
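The work was carried out by PhD student Haiyi Lou under the supervision of Professor Li Jin and Professor Shuhua Xu, in collaboration with researchers from Fudan University and Harvard's pediatric hospital. The research was supported by grants from the National Natural Science Foundation of China, the Science and Technology Commission of Shanghai Municipality, the Chinese Academy of Sciences, the Max Planck Society of Germany, and the K. C. Wong Education Foundation of Hong Kong, among others. (Bioon.com)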
doi:10.1371/journal.pone.0027341
A Map of Copy Number Variations in Chinese Populations
Haiyi Lou1#, Shilin Li2#, Yajun Yang2, Longli Kang2, Xin Zhang2, Wenfei Jin1, Bailin Wu2,3, Li Jin1,2*, Shuhua Xu1*
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires the knowledge of locations, sizes and frequency distribution of them within and between populations. However, CNV study of Chinese minorities, which harbor the majority of genetic diversity of Chinese populations, has been underrepresented considering the same efforts in other populations. Here we constructed, to our knowledge, a first CNV map in seven Chinese populations representing the major linguistic groups in China with 1,440 CNV regions identified using Affymetrix SNP 6.0 Array. Considerable differences in distributions of CNV regions between populations and substantial population structures were observed. We showed that ~35% of CNV regions identified in minority ethnic groups are not shared by Han Chinese population, indicating that the contribution of the minorities to genetic architecture of Chinese population could not be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI containing CCL3L1 gene reported in previous studies was also the highest differentiated regions between Tibetan and other populations. Besides, by jointly analyzing CNVs and SNPs, we found a CNV region containing gene CTDSPL were in almost perfect linkage disequilibrium between flanking SNPs in Tibetan while not in other populations except HapMap CHD. Furthermore, we found the SNP taggability of CNVs in Chinese populations was much lower than that in European populations. 
Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource in further evolutionary and medical studies.
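The abstract's statement about the SNP taggability of CNVs can be illustrated with a small calculation. The following is a minimal sketch, not the authors' method: it assumes each CNV is coded as an integer copy number per individual and each flanking SNP as a 0/1/2 allele count, and it calls a CNV "tagged" when some SNP reaches a squared correlation of at least 0.8 with it. The 0.8 threshold is a commonly used convention that is assumed here, and the function names `pearson_r2` and `best_tag` and the toy genotypes are hypothetical.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic marker carries no information about the other one.
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def best_tag(cnv_copies, snp_genotypes, threshold=0.8):
    """Return (snp_id, r2, tagged) for the SNP most correlated with the CNV.

    cnv_copies    -- copy number per individual (assumed coding)
    snp_genotypes -- dict mapping SNP id to 0/1/2 genotypes per individual
    threshold     -- assumed r^2 cutoff for calling the CNV "tagged"
    """
    best_id, best_r2 = None, 0.0
    for snp_id, genotypes in snp_genotypes.items():
        r2 = pearson_r2(cnv_copies, genotypes)
        if r2 > best_r2:
            best_id, best_r2 = snp_id, r2
    return best_id, best_r2, best_r2 >= threshold


# Toy data for six individuals (invented for illustration only).
cnv = [2, 2, 1, 1, 0, 2]
snps = {
    "snp_a": [0, 0, 1, 1, 2, 0],  # tracks the deletion allele closely
    "snp_b": [0, 1, 0, 1, 0, 1],  # only weakly related to the CNV
}
tag_id, tag_r2, is_tagged = best_tag(cnv, snps)
print(tag_id, round(tag_r2, 3), is_tagged)
```

Under these assumptions, the taggability of a population would be the fraction of its CNVs for which `best_tag` reports a tagging SNP. A lower fraction means fewer CNVs can be inferred indirectly from SNP genotypes.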