Evaluating the Robustness of Learning Analytics Results Against Fake Learners

Giora Alexandron, José A. Ruipérez-Valiente, Sunbok Lee, David E. Pritchard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review



Massive Open Online Courses (MOOCs) collect large amounts of rich data. A primary objective of Learning Analytics (LA) research is to study these data in order to improve the pedagogy of interactive learning environments. Most studies assume that the data represent truthful and honest learning activity. However, previous studies showed that MOOCs can have large cohorts of users who break this assumption and achieve high performance through behaviors such as Cheating Using Multiple Accounts or unauthorized collaboration; we therefore denote them fake learners. Because of their aberrant behavior, fake learners can bias the results of LA models. The goal of this study is to evaluate the robustness of LA results when the data contain a considerable number of fake learners. Our methodology follows the rationale of ‘replication research’. We challenge the results reported in a well-known paper, one of the first LA/pedagogic-efficacy MOOC studies, by replicating its results with and without the fake learners (identified using machine learning algorithms). The results show that fake learners exhibit very different behavior compared to true learners. However, even though they are a significant portion of the student population (~15%), their effect on the results is not dramatic: it does not change the trends. We conclude that the LA study that we challenged was robust against fake learners. While these results carry an optimistic message about the trustworthiness of LA research, they rely on data from one MOOC. We believe that this issue should receive more attention within the LA research community, and that it can explain some ‘surprising’ research results in MOOCs.
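The with/without comparison described in the abstract can be pictured with a minimal sketch. This is not the authors' pipeline: the record fields (`id`, `time_on_task`, `score`), the toy data, and the use of a Pearson correlation as the "result" being re-checked are all assumptions made for illustration, and `flagged_ids` merely stands in for the output of whatever detection algorithm is used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def robustness_check(learners, flagged_ids, x_key, y_key):
    """Compute the same statistic on all learners and on unflagged learners only.

    Returns both values and whether the trend (sign) is preserved.
    """
    kept = [row for row in learners if row["id"] not in flagged_ids]
    r_all = pearson([row[x_key] for row in learners],
                    [row[y_key] for row in learners])
    r_kept = pearson([row[x_key] for row in kept],
                     [row[y_key] for row in kept])
    return {
        "all": r_all,
        "without_flagged": r_kept,
        "same_sign": (r_all > 0) == (r_kept > 0),
    }


# Hypothetical toy records: five ordinary learners plus one flagged account
# that reaches a high score with very little activity.
toy_learners = [
    {"id": 1, "time_on_task": 10, "score": 0.50},
    {"id": 2, "time_on_task": 20, "score": 0.60},
    {"id": 3, "time_on_task": 30, "score": 0.70},
    {"id": 4, "time_on_task": 40, "score": 0.80},
    {"id": 5, "time_on_task": 50, "score": 0.90},
    {"id": 6, "time_on_task": 5, "score": 0.95},
]
toy_flagged = {6}

result = robustness_check(toy_learners, toy_flagged, "time_on_task", "score")
```

In this toy case the flagged account weakens the correlation without reversing its sign, which is the kind of outcome the abstract describes as an effect that "does not change trends". A real replication would rerun the original study's full analysis on both populations rather than a single statistic.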

Original language: English
Title of host publication: Lifelong Technology-Enhanced Learning - 13th European Conference on Technology Enhanced Learning, EC-TEL 2018, Proceedings
Editors: Raymond Elferink, Hendrik Drachsler, Viktoria Pammer-Schindler, Mar Perez-Sanagustin, Maren Scheffel
Publisher: Springer Verlag
Number of pages: 14
ISBN (Print): 9783319985718
State: Published - 2018
Event: 13th European Conference on Technology Enhanced Learning, EC-TEL 2018 - Leeds, United Kingdom
Duration: 3 Sep 2018 to 6 Sep 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11082 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 13th European Conference on Technology Enhanced Learning, EC-TEL 2018
Country/Territory: United Kingdom

Bibliographical note

Publisher Copyright:
© 2018, Springer Nature Switzerland AG.


  • Educational data mining
  • Fake learners
  • IRT
  • Learning analytics
  • MOOCs
  • Reliability

