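Recently, Dr. Hongzhu Qu, an assistant researcher in the project group of "Hundred Talents Program" investigator Xiangdong Fang at the Laboratory of Major Disease Genomics and Individualized Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, and colleagues reported major progress in an international collaborative study on genome-wide association analysis of human disease. The resulting paper, Systematic Localization of Common Disease-Associated Variation in Regulatory DNA, was published in Science in September 2012. The study used next-generation high-throughput sequencing to extend genome-wide association study (GWAS) analysis to the epigenome level.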
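Dr. Qu worked with researchers in the laboratory of Dr. John A. Stamatoyannopoulos, director of the NIH Northwest Reference Epigenome Mapping Center at the University of Washington and associate professor in the university's Department of Genome Sciences. By analyzing genome-wide DNase I maps from 349 human cell and tissue samples together with existing GWAS SNP data, the team found that about 93% of disease- and trait-associated SNPs lie in noncoding sequence and are concentrated in DNase I hypersensitive sites (DHSs). Eighty-eight percent of the SNP-containing DHSs are active during fetal development, and the SNPs within these DHSs are associated with gestational exposure-related phenotypes. In addition, the distant target genes linked to SNP-containing DHSs (most of these genes lie more than 100 kb from the DHS) have functions that resemble the disease phenotype associated with the same SNP. This link extends the genome-level association between diseases and traits and provides a pool of potential causal genes that may explain it. Of the disease-associated SNPs within DHSs, 93.2% also fall within transcription factor recognition sequences and affect local chromatin structure. These transcription factors in turn form complex networks that regulate the expression of disease-related genes. The study is notable for carrying out GWAS analysis at the epigenome level: guided by systems biology and using an integrated bioinformatics strategy, it builds a regulatory network model of the associations between diseases and biological traits, offering a new scientific perspective and a powerful research tool for clarifying the relationship between common human diseases and genetic traits.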
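The core bookkeeping step in this kind of analysis, checking which SNPs fall inside DHS intervals, can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes BED-style half-open coordinates (`start` inclusive, `end` exclusive), and the function names `build_index`, `in_dhs`, and `fraction_in_dhs` are hypothetical names chosen for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(dhs_intervals):
    """Group (chrom, start, end) intervals by chromosome, then sort and merge overlaps.

    Assumption: coordinates are half-open, as in BED files.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in dhs_intervals:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or touching the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        starts = [s for s, _ in merged]
        index[chrom] = (starts, merged)
    return index


def in_dhs(index, chrom, pos):
    """Return True if position `pos` on `chrom` lies inside any merged DHS."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    # Find the last interval whose start is <= pos, then check its end.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and merged[i][0] <= pos < merged[i][1]


def fraction_in_dhs(index, snps):
    """Fraction of (chrom, pos) SNPs that fall inside a DHS."""
    if not snps:
        return 0.0
    hits = sum(in_dhs(index, chrom, pos) for chrom, pos in snps)
    return hits / len(snps)
```

With made-up intervals such as `[("chr1", 100, 200), ("chr1", 150, 300)]`, the two overlapping sites are merged into one region before lookup, so each query costs a single binary search. A real analysis would also need a matched background set of SNPs to judge whether the observed fraction represents enrichment.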
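The main goal of a genome-wide association study (GWAS) is to search the entire human genome for sequence variants associated with disease, namely single-nucleotide polymorphisms (SNPs). A GWAS genotypes SNP sites across the genomes of patients with a given disease and compares them with a control population, screening the allele frequencies of all variants. Unlike the candidate-gene strategy, it does not require causal genes to be assumed in advance, and so it provides more leads for studying the mechanisms of complex diseases. (Bioon.com)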
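The case-control comparison described above can be sketched for a single SNP as a Pearson chi-square test on a 2x2 table of allele counts. This is a simplified illustration, not the method of any particular study: real GWAS pipelines add quality control, population-structure correction, and multiple-testing thresholds. The function name `allelic_chi_square` and the counts in the usage note are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Returns (chi2, p_value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)

    # A zero margin means the test is undefined; report no evidence.
    if total == 0 or 0 in row_sums or 0 in col_sums:
        return 0.0, 1.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `allelic_chi_square(60, 40, 40, 60)` gives a chi-square statistic of 8.0 and a p-value a little under 0.005. In a genome-wide scan the same test is repeated for every SNP, which is why a much stricter significance threshold is normally applied.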
doi: 10.1126/science.1222794
Systematic Localization of Common Disease-Associated Variation in Regulatory DNA
Matthew T. Maurano, Richard Humbert, Eric Rynes, Robert E. Thurman, Eric Haugen, Hao Wang, Alex P. Reynolds, Richard Sandstrom, Hongzhu Qu, Jennifer Brody, Anthony Shafer, Fidencio Neri, Kristen Lee, Tanya Kutyavin, Sandra Stehling-Sun, Audra K. Johnson, Theresa K. Canfield, Erika Giste, Morgan Diegel, Daniel Bates, R. Scott Hansen, Shane Neph, Peter J. Sabo, Shelly Heimfeld, Antony Raubitschek, Steven Ziegler, Chris Cotsapas, Nona Sotoodehnia, Ian Glass, Shamil R. Sunyaev, Rajinder Kaul, John A. Stamatoyannopoulos
Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure–related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn’s disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.