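Researchers at the University of Missouri have recently found identical DNA sequences in entirely different regions of the genomes of several plant species. "No one had been able to carry out a study on this scale before," said Dmitry Korkin, an assistant professor of computer science and lead author of the paper. The findings were published in PNAS.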
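When the White House Office of Science and Technology Policy announced its "Big Data Research and Development Initiative," the analysis of large volumes of data officially became a national priority. A multidisciplinary team at the University of Missouri has met the big-data challenge: using a pioneering computational algorithm, the team found identical DNA sequences across different plant and animal species and thereby resolved a major biological question.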
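Gavin Conant, a co-author of the study and assistant professor of animal sciences, said, "Our findings help explain some of the mysteries of plant evolution. Basic research on plant genomes provides raw material and improved techniques for drug and crop development."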
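Earlier studies had found long stretches of identical code in the DNA of different animals. Before this MU study, however, computer programs were not powerful enough to find identical sequences in plant DNA, because the identical segments do not sit at the same positions in the different genomes.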
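In the study, the genomes of six animals (dog, chicken, human, mouse, macaque, and rat) were compared with one another. Likewise, the genomes of six plants (Arabidopsis, soybean, rice, cottonwood, sorghum, and grape) were compared with one another. Completing these sequence comparisons took 48 computers, each capable of 1 million searches per hour, running for four weeks, for a total of about 32 billion searches.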
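As a quick sanity check, the figures quoted above are consistent with one another. The short calculation below assumes the 48 machines ran around the clock for the full four weeks, which the article does not state explicitly; the variable names are illustrative only.

```python
# Figures quoted in the article
computers = 48
searches_per_hour = 1_000_000   # per computer
weeks = 4

# Assumption: continuous, round-the-clock operation for the whole period
hours = weeks * 7 * 24
total_searches = computers * searches_per_hour * hours

print(f"{total_searches:,}")    # 32,256,000,000, i.e. roughly 32 billion
```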
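Although the researchers found that plant species share identical sequences just as animal species do, they say these sequences evolved in different ways.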
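Conant said, "One might expect to see convergent evolution, but we do not. Plants and animals are both complex multicellular organisms that must cope with many of the same environmental conditions, such as taking in air and water and dealing with changes in the weather, yet their genomes encode the solutions to these challenges in different ways."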
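The MU team's work lays the groundwork for future studies of why plants and animals developed different genetic mechanisms and how those mechanisms operate. This basic research also lays a foundation for discoveries that may improve human life. Beyond increasing the potential of genetic science to fight disease, the computer program used to analyze the genetic code could itself aid the development of new drugs.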
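Korkin said, "The same algorithm can be used to find identical sequence patterns in an organism's entire set of proteins. That could help identify new targets for existing drugs or help study those drugs' side effects." (Bioon.com)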
doi:10.1073/pnas.1121356109
Long identical multispecies elements in plant and animal genomes
Jeff Reneker, Eric Lyons, Gavin C. Conant, J. Chris Pires, Michael Freeling, Chi-Ren Shyu, and Dmitry Korkin
Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
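The abstract does not spell out the alignment-free approach, so the following is only a minimal sketch of the underlying idea, not the authors' method: find exact substrings of at least a given length that two sequences share, regardless of where they sit in each sequence. It swaps in a plain hash index of fixed-length seeds, handles the forward strand only, and keeps everything in memory, so it would not scale to whole genomes. The function name `shared_exact_matches` and the toy sequences are hypothetical.

```python
from collections import defaultdict

def shared_exact_matches(seq_a, seq_b, min_len=100):
    """Return the set of maximal substrings of length >= min_len found in both sequences.

    Illustrative sketch only: forward strand, in-memory hash index of seeds.
    """
    # Index every window of length min_len in seq_a by its start positions
    index = defaultdict(list)
    for i in range(len(seq_a) - min_len + 1):
        index[seq_a[i:i + min_len]].append(i)

    matches = set()
    for j in range(len(seq_b) - min_len + 1):
        for i in index.get(seq_b[j:j + min_len], ()):
            # Skip seeds that are not the left end of a maximal match
            if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]:
                continue
            # Extend to the right while the two sequences stay identical
            end = min_len
            while (i + end < len(seq_a) and j + end < len(seq_b)
                   and seq_a[i + end] == seq_b[j + end]):
                end += 1
            matches.add(seq_a[i:i + end])
    return matches

# Toy example: the shared element sits at different offsets in each sequence
genome_a = "TTTT" + "ACGTACGGTCA" + "GGGG"
genome_b = "CC" + "ACGTACGGTCA" + "AAAT"
print(shared_exact_matches(genome_a, genome_b, min_len=8))  # {'ACGTACGGTCA'}
```

Because the seeds are looked up by content rather than by position, the match is found even though the element is not in a syntenic location, which is the property that distinguishes LIMEs from classical UCEs in the abstract. With the paper's threshold, `min_len` would be 100.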