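The Human Genome Project showed that only a small fraction of the 3 billion base pairs in the human genome encodes protein. What, then, does the rest of the genome do? Is it merely a leftover of past genetic events?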
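American scientists recently found that the process by which genetic information is converted into protein leaves genetic "fingerprints," and that these fingerprints appear even in sequences that take no part in encoding protein. The researchers estimate that the fingerprints affect at least one third of the genome, which suggests that although most DNA does not encode protein, much of it plays a biological role important enough to be preserved over the course of evolution. The paper was published online on April 7 in the Proceedings of the National Academy of Sciences (PNAS).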
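Biologists hold that some related sequences differ greatly between the genomes of different species because of mutations accumulated during evolution, whereas other related sequences remain similar across species. The latter are called conserved sequences; mutations in them would leave the organism unable to survive. Biologists therefore treat sequence conservation as a marker of biological importance.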
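To detect conservation, researchers must find matching sequences in two species. This is relatively straightforward for coding sequences but much harder for noncoding ones. Even within a single gene, the sequence that encodes the target protein is usually interrupted by introns, which are cut out before the protein is made.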
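As a rough illustration of what "finding matching sequences" means in practice, the sketch below scores how similar two sequences are once they have already been aligned. It is not the method used in the study; the function name `percent_identity` and the convention of skipping gap characters (`-`) are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of aligned, non-gap positions at which two sequences agree.

    Assumes both inputs come from the same pairwise alignment, so they
    have equal length and use '-' to mark gaps (an illustrative convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # a gap in either sequence is not a comparable position
        compared += 1
        if a == b:
            matches += 1

    # With no comparable positions there is nothing to call conserved.
    return matches / compared if compared else 0.0


# Hypothetical aligned fragments from two species.
print(percent_identity("ACGT-A", "ACCTTA"))
```

A high score over a long stretch is what would be read as conservation. Noncoding regions are harder mainly because producing a trustworthy alignment in the first place is difficult, which this sketch takes as given.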
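Previously, scientists assumed that mutations in introns do not affect the final protein and therefore simply accumulate. In the new study, however, researchers at Cold Spring Harbor Laboratory and the University of Chicago found that even in these regions evolution rejects certain kinds of mutations. The study's leader, Cold Spring Harbor Laboratory professor Michael Zhang, said that although the selection is weak, in terms of its effect on survival "introns are not neutral."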
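The study went on to find that certain signal sequences must be present in an intron for it to be spliced out correctly; otherwise the consequences can be lethal. Some other sequences are likewise preserved within conserved regions.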
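The researchers also found that introns and coding regions prefer different bases. Together these regions make up more than one third of the genome and are subject to evolutionary selection pressure. The finding supports results from other studies suggesting that although most DNA does not encode protein, much of it is biologically important.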
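Besides showing how splicing shapes genetic evolution, the study identified a number of candidate signal sequences, some already known and others entirely new. Co-author Adrian Krainer, a professor at Cold Spring Harbor Laboratory, said: "The exciting part will be testing experimentally whether these predicted elements are real." (ScienceNet; compiled by Mei Jin)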
Original source recommended by Bioon:
PNAS, doi:10.1073/pnas.0801692105, Chaolin Zhang, Michael Q. Zhang
RNA landscape of evolution for optimal exon and intron discrimination
Chaolin Zhang*, Wen-Hsiung Li, Adrian R. Krainer*, and Michael Q. Zhang*
*Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724; Department of Biomedical Engineering, State University of New York, Stony Brook, NY 11794; and Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
Contributed by Wen-Hsiung Li, February 20, 2008 (sent for review January 8, 2008)
Abstract
Accurate pre-mRNA splicing requires primary splicing signals, including the splice sites, a polypyrimidine tract, and a branch site, as well as other splicing-regulatory elements (SREs). The SREs include exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs), intronic splicing enhancers (ISEs), and intronic splicing silencers (ISSs), which are typically located near the splice sites. However, it is unclear to what extent splicing-driven selective pressure constrains exonic and intronic sequences, especially those distant from the splice sites. Here, we studied the distribution of SREs in human genes in terms of DNA strand-asymmetry patterns. Under a neutral evolution model, each mononucleotide or oligonucleotide should have a symmetric (Chargaff's second parity rule), or weakly asymmetric yet uniform, distribution throughout a pre-mRNA transcript. However, we found that large sets of unbiased, experimentally determined SREs show a distinct strand-asymmetry pattern that is inconsistent with the neutral evolution model and reflects their functional roles in splicing. ESEs are selected in exons and depleted in introns, and vice versa for ESSs. Surprisingly, this trend extends into deep intronic sequences, accounting for one third of the genome. Selection is detectable even at the mononucleotide level, so that the asymmetric base compositions of exons and introns are predictive of ESEs and ESSs. We developed a method that effectively predicts SREs based on strand asymmetry, expanding the current catalog of SREs. Our results suggest that human genes have been optimized for exon and intron discrimination through an RNA landscape shaped during evolution.
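The idea of strand asymmetry in the abstract can be made concrete with a small sketch. Chargaff's second parity rule says that, on a single strand, an oligonucleotide and its reverse complement should occur about equally often, so a departure from that balance is the kind of signal the abstract describes. The code below is only an illustration under stated assumptions: the score `(forward - reverse) / (forward + reverse)` and the names `reverse_complement`, `kmer_counts`, and `strand_asymmetry` are choices made for this example, not the statistic or code used in the paper.

```python
from collections import Counter

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer on the given strand."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def strand_asymmetry(seq: str, k: int = 1) -> dict:
    """Score each observed k-mer against its reverse complement.

    Assumed score: (fwd - rev) / (fwd + rev), ranging from -1 to 1.
    A value near 0 is what the parity rule predicts; k-mers containing
    characters other than A, C, G, T are skipped.
    """
    counts = kmer_counts(seq, k)
    scores = {}
    for kmer, fwd in counts.items():
        if set(kmer) - set("ACGT"):
            continue
        rev = counts.get(reverse_complement(kmer), 0)
        scores[kmer] = (fwd - rev) / (fwd + rev)
    return scores


# Toy sequence: three A's against one T on the same strand.
print(strand_asymmetry("AAAT", k=1))
```

With `k=1` this captures the mononucleotide-level base preference mentioned in the abstract; larger `k` values extend the same comparison to oligonucleotides. Comparing scores computed separately on exon and intron sequences would be one simple way to explore the exon versus intron contrast, though real analyses would need far longer sequences and a proper statistical test.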