A sequential clustering algorithm with applications to gene expression data

Jongwoo Song, Dan L. Nicolae

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Clustering algorithms are used in the analysis of gene expression data to identify groups of genes with similar expression patterns. These algorithms group genes with respect to a predefined dissimilarity measure without using any prior classification of the data. Most clustering algorithms require the number of clusters as input and usually assign every object in the dataset to one of the clusters. We propose a clustering algorithm that finds clusters sequentially and allows for sporadic objects, that is, objects that are not assigned to any cluster. The proposed sequential clustering algorithm has two steps. First, it finds candidates for cluster centers; multiple candidates are used to make the search for clusters more efficient. Second, it conducts a local search around the candidate centers to find the set of objects that defines a cluster. The candidate clusters are compared using a predefined score, the best cluster is removed from the data, and the procedure is repeated. We investigate the performance of this algorithm using simulated data, and we apply the method to analyze gene expression profiles in a study on the plasticity of dendritic cells.
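
The abstract does not specify how candidate centers are chosen, how the local search proceeds, or which score is used, so the following is only a minimal sketch of the general loop it describes, not the authors' method. It assumes Euclidean distance as the dissimilarity, picks candidates as the points with the most neighbours within a fixed radius, refines each by repeatedly re-centering on the mean of the points within that radius, and scores a candidate cluster by its size. The function name and the parameters `radius`, `n_candidates`, `min_size`, and `max_iter` are all hypothetical.

```python
import numpy as np


def sequential_clusters(X, radius=1.0, n_candidates=3, min_size=5, max_iter=20):
    """Illustrative sequential clustering; label -1 marks sporadic objects.

    Assumptions (not from the paper): Euclidean dissimilarity, density-based
    candidate centers, mean-shift-like local search, cluster size as score.
    """
    X = np.asarray(X, dtype=float)
    labels = np.full(len(X), -1)        # everything starts as sporadic
    remaining = np.arange(len(X))       # indices not yet assigned
    next_label = 0

    while len(remaining) >= min_size:
        P = X[remaining]
        # Pairwise distances among the objects that are still unassigned.
        D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)

        # Step 1: candidate centers = points with the most close neighbours.
        counts = (D <= radius).sum(axis=1)
        candidates = np.argsort(-counts)[:n_candidates]

        best_members, best_score = None, -np.inf
        for c in candidates:
            # Step 2: local search around the candidate center.
            center = P[c]
            members = np.zeros(len(P), dtype=bool)
            for _ in range(max_iter):
                new_members = np.linalg.norm(P - center, axis=1) <= radius
                if not new_members.any() or np.array_equal(new_members, members):
                    break
                members = new_members
                center = P[members].mean(axis=0)

            # Compare candidate clusters with a simple score (here: size).
            score = members.sum()
            if score > best_score:
                best_members, best_score = members, score

        # Stop when even the best candidate is too small to count as a cluster.
        if best_members is None or best_members.sum() < min_size:
            break

        # Remove the best cluster from the data and repeat.
        labels[remaining[best_members]] = next_label
        next_label += 1
        remaining = remaining[~best_members]

    return labels
```

Objects still labelled -1 when the loop ends play the role of the sporadic objects mentioned in the abstract. The number of clusters is not an input; in this sketch it is implied by `radius` and `min_size`.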

Original language: English
Pages (from-to): 175-184
Number of pages: 10
Journal: Journal of the Korean Statistical Society
Volume: 38
Issue number: 2
State: Published - Jun 2009

Bibliographical note

Funding Information:
This research was supported in part by Burroughs Wellcome Fund Interfaces grant 1001774 (Song) and by The National Science Foundation grant DMS-0072510 (Nicolae).

Keywords

  • 62L12
  • 91C20
  • Clustering algorithm
  • Clustering score
  • Microarrays
  • Sequential clustering
  • primary
  • secondary
