Comprehensive landscape of subtype-specific coding and noncoding RNA transcripts in breast cancer

Trung Nghia Vu, Setia Pramana, Stefano Calza, Chen Suo, Donghwan Lee, Yudi Pawitan

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


Molecular classification of breast cancer into clinically relevant subtypes helps improve prognosis and adjuvant-treatment decisions. The aim of this study is to better characterize the molecular subtypes by providing a comprehensive landscape of subtype-specific isoforms, including coding, long non-coding RNA and microRNA transcripts. Isoform-level expression of all coding and non-coding RNAs is estimated from RNA-sequencing data of 1168 breast samples obtained from The Cancer Genome Atlas (TCGA) project. We then search the whole transcriptome systematically for subtype-specific isoforms using a novel algorithm based on a robust quasi-Poisson model. We discover 5451 isoforms specific to single subtypes. A total of 27% of the subtype-specific isoforms classify the intrinsic subtypes more accurately than their corresponding genes. We find three subtype-specific miRNAs and 707 subtype-specific long non-coding RNAs. The isoforms from long non-coding RNAs also separate the Luminal A and Luminal B subtypes well, with an AUC of 0.97 in the discovery set and 0.90 in the validation set. In addition, we discover 1500 isoforms preferentially co-expressed in two subtypes, including 369 isoforms co-expressed in both the Normal-like and Basal subtypes, which are commonly considered to have distinct estrogen receptor (ER) status. Finally, analyses at the protein level reveal four subtype-specific proteins and two subtype co-expressed proteins that successfully validate results from the isoform level.
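The two quantitative ideas in the abstract, a quasi-Poisson comparison of one subtype against the rest and an AUC for how well an isoform separates two groups, can be illustrated with a minimal sketch. This is not the authors' robust algorithm: it assumes a plain two-group quasi-Poisson fit with a Pearson dispersion estimate, and the function names (`quasi_poisson_z`, `auc`) and the toy counts are hypothetical.

```python
import math


def quasi_poisson_z(counts, in_subtype):
    """Wald z-statistic for the log fold change of one subtype versus the rest.

    Assumption: an ordinary (non-robust) two-group quasi-Poisson model.
    Both groups must have a positive mean count.
    """
    y1 = [y for y, g in zip(counts, in_subtype) if g]
    y0 = [y for y, g in zip(counts, in_subtype) if not g]
    mu1 = sum(y1) / len(y1)
    mu0 = sum(y0) / len(y0)
    # For a two-group design the Poisson fit reproduces the group means,
    # so the Pearson statistic can be computed around them directly.
    pearson = sum((y - mu1) ** 2 / mu1 for y in y1)
    pearson += sum((y - mu0) ** 2 / mu0 for y in y0)
    phi = pearson / (len(counts) - 2)  # dispersion; 2 fitted parameters
    log_fc = math.log(mu1 / mu0)
    # Poisson variance of the log ratio, scaled by the dispersion.
    se = math.sqrt(phi * (1.0 / sum(y1) + 1.0 / sum(y0)))
    return log_fc / se


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical isoform counts: four samples in the subtype, six outside it.
toy_counts = [30, 28, 35, 31, 5, 7, 4, 6, 8, 5]
toy_in_subtype = [True] * 4 + [False] * 6
z_score = quasi_poisson_z(toy_counts, toy_in_subtype)
separation = auc(toy_counts, toy_in_subtype)
```

A large positive `z_score` would flag the isoform as over-expressed in the subtype, and `separation` close to 1 would mean its expression alone ranks subtype samples above the others. A real analysis would also need normalization across samples and multiple-testing control, which this sketch omits.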

Original language: English
Pages (from-to): 68851-68863
Number of pages: 13
Issue number: 42
State: Published - Oct 2016


  • Breast cancer
  • Non-coding RNAs
  • RNA sequencing
  • Subtype co-expression
  • Subtype-specific isoforms

