TY - JOUR
T1 - Characterization of histone modification patterns and prediction of novel promoters using functional principal component analysis
AU - Kim, Mijeong
AU - Lin, Shili
N1 - Funding Information:
Funding: This work was supported in part by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (url: https://nrf.
Publisher Copyright:
© 2020 Kim, Lin. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - Characterization of distinct histone methylation and acetylation binding patterns in promoters and prediction of novel regulatory regions remain an important area of genomic research, as it is hypothesized that distinct chromatin signatures may specify unique genomic functions. However, methods that have been proposed in the literature are either descriptive in nature or are fully parametric and hence more restrictive in pattern discovery. In this article, we propose a two-step non-parametric statistical inference procedure to characterize unique histone modification patterns and apply it to analyzing the binding patterns of four histone marks, H3K4me2, H3K4me3, H3K9ac, and H4K20me1, in human B-lymphoblastoid cells. In the first step, we used a functional principal component analysis method to represent the concatenated binding patterns of these four histone marks around the transcription start sites as smooth curves. In the second step, we clustered these curves to reveal several unique classes of binding patterns. These uncovered patterns were used in turn to scan the whole genome to predict novel and alternative promoters. Our analyses show that there are three distinct promoter binding patterns of active genes. Further, 19654 regions not within known gene promoters were found to overlap with human ESTs, CpG islands, or common SNPs, indicative of their potential role in gene regulation, including being potential novel promoter regions.
AB - Characterization of distinct histone methylation and acetylation binding patterns in promoters and prediction of novel regulatory regions remain an important area of genomic research, as it is hypothesized that distinct chromatin signatures may specify unique genomic functions. However, methods that have been proposed in the literature are either descriptive in nature or are fully parametric and hence more restrictive in pattern discovery. In this article, we propose a two-step non-parametric statistical inference procedure to characterize unique histone modification patterns and apply it to analyzing the binding patterns of four histone marks, H3K4me2, H3K4me3, H3K9ac, and H4K20me1, in human B-lymphoblastoid cells. In the first step, we used a functional principal component analysis method to represent the concatenated binding patterns of these four histone marks around the transcription start sites as smooth curves. In the second step, we clustered these curves to reveal several unique classes of binding patterns. These uncovered patterns were used in turn to scan the whole genome to predict novel and alternative promoters. Our analyses show that there are three distinct promoter binding patterns of active genes. Further, 19654 regions not within known gene promoters were found to overlap with human ESTs, CpG islands, or common SNPs, indicative of their potential role in gene regulation, including being potential novel promoter regions.
UR - http://www.scopus.com/inward/record.url?scp=85085538544&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0233630
DO - 10.1371/journal.pone.0233630
M3 - Article
C2 - 32459819
AN - SCOPUS:85085538544
SN - 1932-6203
VL - 15
JO - PLoS ONE
JF - PLoS ONE
IS - 5
M1 - e0233630
ER -