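Cloud computing is a computing model that delivers dynamically scalable, virtualized resources as services over the Internet. With the rapid development of high-throughput sequencing technologies, bioinformatics has entered the era of big data, and the resulting problems of storing and analyzing massive multi-omics biological data urgently call for cloud-based solutions.
Recently, Professor Zhang Zhang, a "Hundred Talents Program" researcher at the Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, in collaboration with King Abdullah University of Science and Technology (Saudi Arabia), Beijing Institute of Technology, and the IBM China Systems and Technology R&D Center, published a paper titled "Bioinformatics clouds for big data manipulation" in the journal Biology Direct. The paper reviews existing cloud computing services in bioinformatics (bioinformatics clouds for short) and, based on their service characteristics, proposes for the first time a classification into four categories: Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS).
Across these four categories, bioinformatics clouds provide services that address the storage, retrieval, and analysis of massive biological data. The paper also discusses the outlook for cloud computing in bioinformatics and identifies several problems that need to be solved: bioinformatics clouds should store both data and software in the cloud; support big data transfer by combining the latest high-speed transfer, P2P, and data compression technologies; provide a lightweight cloud-based programming environment; and be built as open bioinformatics cloud platforms. (Bioon.com)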
doi:10.1186/1745-6150-7-43
Bioinformatics clouds for big data manipulation
Lin Dai, Xin Gao, Yan Guo, Jingfa Xiao and Zhang Zhang
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.