Surveying the MOOC Data Set Universe - Citegraph

Paper Info

Title
Surveying the MOOC Data Set Universe

Abstract
This paper is a survey of the availability of open data sets generated from Massive Open Online Courses (MOOCs). This log data allows researchers to analyze and predict student performance. Often, the goal of the analysis is to focus on at-risk students who are not likely to finish a course. There is a growing gap between the average researcher (who does not have access to proprietary data) and the ready availability of data sets for analysis. Most research studying and predicting student performance in MOOCs is conducted on proprietary data sets that are not anonymized (de-identified) or released for general study. There are no standardized tools that provide a gateway to usable data sets; instead, the researcher must navigate a maze of sites with different data structures and varying data access policies. To our knowledge, no open data sets are currently being produced, and none have been since 2016. The authors survey the history of MOOC data sharing, identify the few available open data sets, and discuss a path forward to increase the reproducibility of MOOC research.

Year	DOI	Venue
2019	10.1109/LWMOOCS47620.2019.8939594	2019 IEEE Learning With MOOCS (LWMOOCS)
Keywords	DocType	ISBN
MOOC, weblog, analysis, edx2bigquery, Google BigQuery, anonymized data set, de-identification, MOOCdb, Moodle, Educational Data Mining, Learning Analytics, Learning at Scale, Limeade	Conference	978-1-7281-2550-3
Citations	PageRank	References
0	0.34	0
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
James J. Lohse	1	0	0.34
Christine A. McManus	2	0	0.34
David Joyner	3	9	8.40
