Identifying genomic signatures of n-gram nucleotide sequences to classify the chromatin states of broad histone track

Kyung Eun Lee, Hyun Seok Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Much of human noncoding DNA was long thought to have no biological function. However, unprecedented technical advances have begun to convert unannotated noncoding DNA into highly annotated functional regions. In this paper, the frequencies of n-grams in regional DNA sequences from the fifteen chromatin states of the Broad Histone Track are analyzed by applying biological language modelling to n-grams. The analysis shows that a few particular n-grams are abundant in one chromatin state but occur very rarely in the others, and thereby serve as chromatin state signatures. We discuss the significance of the patterns found, as well as their potential use in the specialized statistical models of nucleotide sequences needed to develop algorithms for the computational analysis of functional units in noncoding DNA regions.
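The idea of an n-gram acting as a chromatin state signature can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give their scoring procedure, so the ratio test, the `min_ratio` threshold, the `pseudo` pseudocount, and the function names below are all assumptions made for illustration. The sketch pools overlapping n-gram counts per state and flags an n-gram when its relative frequency in one state is at least `min_ratio` times its highest frequency in any other state.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_ngrams(sequence, n):
    """Count overlapping n-grams, skipping windows with non-ACGT symbols (e.g. N)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if set(gram) <= VALID_BASES:
            counts[gram] += 1
    return counts


def state_frequencies(state_sequences, n):
    """Relative n-gram frequencies per state, pooled over that state's sequences."""
    freqs = {}
    for state, sequences in state_sequences.items():
        counts = Counter()
        for seq in sequences:
            counts.update(count_ngrams(seq, n))
        total = sum(counts.values())
        freqs[state] = {g: c / total for g, c in counts.items()} if total else {}
    return freqs


def signature_ngrams(state_sequences, n, min_ratio=5.0, pseudo=1e-6):
    """Hypothetical criterion: an n-gram is a signature of a state when its
    frequency there is at least min_ratio times its highest frequency in any
    other state. The pseudocount avoids division by zero."""
    freqs = state_frequencies(state_sequences, n)
    signatures = {}
    for state, table in freqs.items():
        hits = []
        for gram, freq in table.items():
            other_max = max(
                (freqs[other].get(gram, 0.0) for other in freqs if other != state),
                default=0.0,
            )
            if (freq + pseudo) / (other_max + pseudo) >= min_ratio:
                hits.append(gram)
        signatures[state] = sorted(hits)
    return signatures


if __name__ == "__main__":
    # Toy sequences and state labels, invented purely for illustration.
    toy = {
        "state_1": ["CGCGCGCG"],
        "state_2": ["ATATATAT"],
    }
    print(signature_ngrams(toy, n=2))
```

In practice the input would be the regional sequences extracted for each of the fifteen chromatin states, and both `n` and the threshold would need tuning. A statistical test would also be a more defensible criterion than a fixed ratio.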

Original language: English
Title of host publication: ACM IMCOM 2015 - Proceedings
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450333771
State: Published - 8 Jan 2015
Event: 9th International Conference on Ubiquitous Information Management and Communication, ACM IMCOM 2015 - Bali, Indonesia
Duration: 8 Jan 2015 to 10 Jan 2015

Publication series

Name: ACM IMCOM 2015 - Proceedings

Conference

Conference: 9th International Conference on Ubiquitous Information Management and Communication, ACM IMCOM 2015
Country/Territory: Indonesia
City: Bali
Period: 8/01/15 to 10/01/15

Keywords

  • Computational epigenetics
  • GC-contents
  • Methylation states
  • Noncoding DNA
  • Nucleotide frequency patterns
