On high-dimensional two sample mean testing statistics: a comparative study with a data adaptive choice of coefficient vector

Soeun Kim, Jae Youn Ahn, Woojoo Lee

Research output: Contribution to journal, Article, peer-review

2 Scopus citations

Abstract

The key issues in two-sample tests for high-dimensional problems arise from the large dimension of the mean vector relative to a small sample size. Recently, Wang et al. (Stat Sin 23:667–690, 2013) proposed a jackknife empirical likelihood test that works under weak assumptions on the dimension of the variables (p), and showed that the test statistic has a chi-square limit regardless of whether p is finite or diverges. However, the sufficient condition required for this statistic is still restrictive. In this paper, we significantly relax the sufficient condition for the asymptotic chi-square limit, using models that allow flexible dependence structures, and derive simpler alternative statistics for testing the equality of two high-dimensional means. The limiting distribution of each proposed statistic is either a chi-square distribution or the distribution of the maximum of two independent chi-square statistics, and the asymptotic results hold for either finite or divergent p. We also propose a data-adaptive method to select the coefficient vector, and compare the various methods in simulation studies. In the simulations, the proposed choice of coefficient vector substantially increases power.
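As a rough illustration of what a coefficient vector does in this setting, the sketch below projects each p-dimensional observation onto a fixed vector and applies a Welch-type statistic to the resulting scalars, using a chi-square reference distribution with one degree of freedom. This is a generic textbook construction and not the statistic proposed in the paper or by Wang et al.; the function name `projected_two_sample_test`, the equal-weight vector `a`, and the simulated data are all assumptions made for illustration, and the paper's data-adaptive choice of coefficient vector is not reproduced here.

```python
import math
import random


def projected_two_sample_test(x, y, a):
    """Welch-type test of equal means after projecting each row onto `a`.

    x, y: lists of observations (each a list of p floats).
    a: hypothetical coefficient vector of length p, fixed in advance.
    Returns (statistic, p_value) using a chi-square(1) reference distribution.
    """
    # Reduce each p-dimensional observation to the scalar a^T x.
    u = [sum(ai * xi for ai, xi in zip(a, row)) for row in x]
    v = [sum(ai * yi for ai, yi in zip(a, row)) for row in y]
    n1, n2 = len(u), len(v)
    mean_u, mean_v = sum(u) / n1, sum(v) / n2
    # Unbiased sample variances of the projected data.
    var_u = sum((ui - mean_u) ** 2 for ui in u) / (n1 - 1)
    var_v = sum((vi - mean_v) ** 2 for vi in v) / (n2 - 1)
    stat = (mean_u - mean_v) ** 2 / (var_u / n1 + var_v / n2)
    # Survival function of chi-square with 1 df: P(Z^2 > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    rng = random.Random(0)
    p, n1, n2 = 200, 30, 40
    a = [1.0] * p  # equal weights, an illustrative choice only
    x = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n1)]
    y = [[rng.gauss(0.3, 1.0) for _ in range(p)] for _ in range(n2)]
    print(projected_two_sample_test(x, y, a))
```

Because the projection collapses the data to one dimension, the statistic stays well defined even when p exceeds the sample sizes, but its power depends heavily on how well `a` aligns with the true mean difference, which is the motivation for choosing the coefficient vector from the data.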

Original language: English
Pages (from-to): 451-464
Number of pages: 14
Journal: Computational Statistics
Volume: 31
Issue number: 2
State: Published - 1 Jun 2016

Bibliographical note

Funding Information:
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning (NRF-2013R1A1A1061332), and by an INHA UNIVERSITY Research Grant.

Publisher Copyright:
© 2015, Springer-Verlag Berlin Heidelberg.

Keywords

  • Coefficient vector
  • Data adaptive
  • High dimension
  • Two sample mean test
