GEMiCCL: Mining genotype and expression data of cancer cell lines with elaborate visualization

Inhae Jeong, Namhee Yu, Insu Jang, Yukyung Jun, Min Seo Kim, Jinhyuk Choi, Byungwook Lee, Sanghyuk Lee

Research output: Contribution to journal > Article > peer-review

6 Scopus citations

Abstract

Cancer cell lines are essential components of biomedical research. However, choosing appropriate cell lines for an experiment is often difficult because genotype and/or expression data are missing or scattered across diverse resources. Here, we report Gene Expression and Mutations in Cancer Cell Lines (GEMiCCL), an online database of human cancer cell lines that provides genotype and expression information. We collected mutation, gene expression and copy number variation (CNV) data from three representative cell line databases: the Cancer Cell Line Encyclopedia, the Catalogue of Somatic Mutations in Cancer and NCI60. In total, GEMiCCL includes 1406 cell lines from 185 cancer types and 29 tissues. Gene expression, mutation and CNV information are available for 1304, 1334 and 1365 cell lines, respectively. We re-processed the entire gene expression and SNP chip data and removed batch effects due to different microarray platforms using the ComBat software. Cell line names and clinical information were standardized using Cellosaurus from ExPASy. Our user interface supports cell line search, gene search, browsing for specific molecular characteristics and complex queries based on Boolean logic rules. We also implemented many interactive features and user-friendly visualizations. Because it provides both molecular characteristics and clinical information, we believe that GEMiCCL will be a valuable resource for functional and screening studies in biomedical research.

Original language: English
Journal: Database
Volume: 2018
Issue number: 2018
State: Published - 1 Jan 2018

Bibliographical note

Publisher Copyright:
© The Author(s) 2018. Published by Oxford University Press.
