Model-based clustering with dissimilarities: A Bayesian approach

Man Suk Oh, Adrian E. Raftery

Research output: Contribution to journal › Article › peer-review

34 Scopus citations


A Bayesian model-based clustering method is proposed for clustering objects on the basis of dissimilarities. This combines two basic ideas. The first is that the objects have latent positions in a Euclidean space, and that the observed dissimilarities are measurements of the Euclidean distances with error. The second idea is that the latent positions are generated from a mixture of multivariate normal distributions, each one corresponding to a cluster. We estimate the resulting model in a Bayesian way using Markov chain Monte Carlo. The method carries out multidimensional scaling and model-based clustering simultaneously, and yields good object configurations and good clustering results with reasonable measures of clustering uncertainties. In the examples we study, the clustering results based on low-dimensional configurations were almost as good as those based on high-dimensional ones. Thus, the method can be used as a tool for dimension reduction when clustering high-dimensional objects, which may be useful especially for visual inspection of clusters. We also propose a Bayesian criterion for choosing the dimension of the object configuration and the number of clusters simultaneously. This is easy to compute and works reasonably well in simulations and real examples.
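The two ideas in the abstract can be illustrated by simulating data from a generative model of that shape. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function name `simulate_dissimilarities`, the spherical unit-variance clusters, the Gaussian measurement error, and the reflection at zero that keeps dissimilarities non-negative are all choices made here for simplicity. The abstract does not specify the error distribution or the covariance structure, and the sketch does not perform the MCMC estimation.

```python
import numpy as np


def simulate_dissimilarities(n=60, dim=2, n_clusters=3, noise_sd=0.3, seed=0):
    """Simulate noisy dissimilarities from clustered latent positions.

    Hypothetical helper for illustration only; all distributional
    choices below are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Idea 2: latent positions come from a mixture of multivariate normals,
    # one component per cluster (assumed: equal weights, identity covariance).
    means = rng.normal(scale=4.0, size=(n_clusters, dim))
    labels = rng.integers(n_clusters, size=n)
    positions = means[labels] + rng.normal(size=(n, dim))

    # Idea 1: observed dissimilarities are Euclidean distances plus error.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Draw one error per unordered pair so the matrix stays symmetric.
    noise = np.triu(rng.normal(scale=noise_sd, size=(n, n)), k=1)
    noise = noise + noise.T

    # Assumed: reflect at zero so that dissimilarities are non-negative.
    dissim = np.abs(dist + noise)
    np.fill_diagonal(dissim, 0.0)
    return positions, labels, dissim


positions, labels, dissim = simulate_dissimilarities()
print(dissim.shape, np.bincount(labels, minlength=3))
```

A method of the kind described would take only `dissim` as input and try to recover both a low-dimensional configuration resembling `positions` (up to rotation, reflection, and translation) and the cluster memberships in `labels`.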

Original language: English
Pages (from-to): 559-585
Number of pages: 27
Journal: Journal of Computational and Graphical Statistics
Issue number: 3
State: Published - Sep 2007


  • Hierarchical model
  • Markov chain Monte Carlo
  • Mixture models
  • Multidimensional scaling

