This work develops a general method for identifying cheaters in MOOCs that does not assume a particular method of cheating. To that end, we build a classification model whose input is a set of features that operationalize performance and behavioral parameters known to be associated with cheating. These include students' ability, their level of interaction with the course resources, solving time, and Item Response Theory (IRT) person-fit parameters. We start with a list of six candidate features and, after a feature selection process, retain four. We use these to build a probabilistic classifier (logistic regression) that yields an Area Under the Curve (AUC) of 0.826. Our data come from an Introductory Physics MOOC. The features are computed using data-mining and standard IRT packages. We consider only the users who received a certificate in the course, and each of these users serves as an example for the classifier. The positive examples are the users who were detected as "using multiple accounts to harvest solutions" by a different algorithm that was reported in a previous publication.
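To make the pipeline described in the abstract concrete, the following is a minimal sketch of a logistic regression classifier scored by AUC. It is not the authors' implementation: the data are synthetic, the four feature names are hypothetical stand-ins for the retained features, and the gradient-descent fit and its settings (`lr`, `epochs`) are assumptions chosen for illustration.

```python
import math
import random


def predict_proba(w, x):
    """Logistic model: probability that example x is a positive (bias is w[0])."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent on the log-loss (assumed settings)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data: hypothetical feature names, made-up distributions.
FEATURES = ["ability", "interaction", "solving_time", "person_fit"]
rng = random.Random(0)


def make_user(is_positive):
    # Positives are shifted by one standard deviation on every feature.
    shift = 1.0 if is_positive else 0.0
    return [rng.gauss(shift, 1.0) for _ in FEATURES]


labels = [1] * 100 + [0] * 100
users = [make_user(y) for y in labels]

# Even indices train the model, odd indices are held out for evaluation.
train = range(0, 200, 2)
held_out = range(1, 200, 2)

weights = fit_logistic([users[i] for i in train], [labels[i] for i in train])
held_out_auc = auc(
    [labels[i] for i in held_out],
    [predict_proba(weights, users[i]) for i in held_out],
)
print(f"held-out AUC on synthetic data: {held_out_auc:.3f}")
```

The AUC is computed here as a pairwise ranking statistic, so it depends only on how the classifier orders positive and negative users, not on any particular decision threshold.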
|Journal||CEUR Workshop Proceedings|
|State||Published - 2016|
|Event||24th ACM Conference on User Modeling, Adaptation and Personalisation, UMAP 2016 - Halifax, Canada, 13 Jul 2016 → 16 Jul 2016|
- Academic dishonesty
- Item Response Theory
- Learning analytics