The explosive growth of next-generation sequencing data has produced ultra-large-scale datasets and the computational problems that come with them. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution to this computational challenge is cloud computing, in which CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present Bio-Express, a cloud computing-based system that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ the predefined pipelines or create a new pipeline to analyze their own omics data. We also developed several web-based services to facilitate downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.
Bibliographical note
Funding Information:
This work was supported by grants from the National Research Foundation of Korea (NRF-2014M3C9A3064552, NRF-2014M3C9A3065221, NRF-2014M3C9A3064548, NRF-2014M3C9A3068554, NRF-2014M3C9A3068822, and NRF-2019M3C9A5069653). A portion of the data used for this study was obtained from the Genome-InfraNet (IDs: 1711041199, 1711057837, 1711031849, 1711042674) at the Korea Bioinformation Center.
© 2020, Korea Genome Organization.
- Analysis pipeline
- Cloud computing
- Genomic data
- Web server
- Workflow system