The logistic regression (LR) procedure for testing differential item functioning (DIF) typically relies on asymptotic sampling distributions. The likelihood ratio test (LRT) refers its statistic to an asymptotic chi-square distribution. The Wald test rests on the asymptotic normality of the maximum likelihood (ML) estimator, and the Wald statistic is likewise referred to an asymptotic chi-square distribution. In small samples, however, these asymptotic approximations may not hold well. Penalized maximum likelihood (PML) estimation removes the first-order finite-sample bias of the ML estimator, and the bootstrap method constructs an empirical sampling distribution instead of relying on an asymptotic one. This study compares LR procedures based on the LRT, the Wald test, the penalized likelihood ratio test (PLRT), and the bootstrap likelihood ratio test (BLRT) in terms of statistical power and Type I error rate for testing uniform and non-uniform DIF. The simulation results show that the LRT with the asymptotic chi-square distribution works well even in small samples.
- differential item functioning
- logistic regression
- penalized maximum likelihood
- small samples
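
For orientation, the LR approach to DIF is commonly written in the following form. This is a sketch of the standard formulation, not a specification taken from this study: the symbols $X_i$ (a matching variable such as the total test score), $G_i$ (a group indicator), and the coefficients $\beta_0, \dots, \beta_3$ are notational assumptions introduced here.

```latex
% Full LR model for examinee i on the studied item (assumed notation)
\operatorname{logit} P(Y_i = 1 \mid X_i, G_i)
  = \beta_0 + \beta_1 X_i + \beta_2 G_i + \beta_3 X_i G_i

% Uniform DIF:      H_0 : \beta_2 = 0  (model without the interaction term)
% Non-uniform DIF:  H_0 : \beta_3 = 0

% Likelihood ratio statistic comparing nested reduced and full models
G^2 = -2 \left[ \ln L_{\text{reduced}} - \ln L_{\text{full}} \right]
  \;\xrightarrow{d}\; \chi^2_{q}

% Wald statistic for a single coefficient \beta_k
W = \frac{\hat{\beta}_k^{\,2}}{\widehat{\operatorname{Var}}(\hat{\beta}_k)}
  \;\xrightarrow{d}\; \chi^2_{1}
```

Here $q$ is the number of parameters constrained under the null hypothesis. Both chi-square references are large-sample results, which is why their behavior in small samples is the question at issue.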