Improved Automated Classification of Sentences in Data Science Exercises - Citegraph

Paper Info

Title
Improved Automated Classification of Sentences in Data Science Exercises

Abstract
Artificial intelligence has proved useful for automating the grading process, especially when the assessment involves a large number of students. The general problem we address is the automated grading of assignments whose solutions are composed of a list of commands, their outputs, and possible comments. In this paper, we focus on the automated classification of the comments as "right" or "wrong". In particular, we investigated the effect of different features (i.e., fastText, BERT, distance-based and custom features), fed to several classifiers (i.e., Logistic Regression, Support Vector Machines, Random Forest, Multi-Layer Perceptron (MLP)), to select the best combination in terms of balanced accuracy. In the experiment carried out, the best result was obtained by the MLP classifier using the fastText embeddings. When fed with BERT embeddings instead, MLP obtained a slightly lower accuracy and F1 score, although it remained the best option compared with the other classifiers. Furthermore, we tested the classifier on comments from different assignments (of the same structure), written by different students and evaluated by a different professor. In this case too, we achieved a relatively good accuracy and F1 score.
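
The abstract ranks classifiers by balanced accuracy and also reports F1. As a minimal sketch, assuming the standard definitions (balanced accuracy as the mean of the per-class recalls, F1 as the harmonic mean of precision and recall on the "right" class), the two metrics could be computed as below. The function names and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as "right" and 0 as "wrong"."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of comments classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the recall on the "right" class and the recall on the "wrong" class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    recall_right = tp / (tp + fn) if (tp + fn) else 0.0
    recall_wrong = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_right + recall_wrong) / 2


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 on the "right" class: 2*tp / (2*tp + fp + fn)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up labels: six "right" comments and two "wrong" ones (imbalanced).
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0]

print(accuracy(y_true, y_pred))           # 0.875
print(balanced_accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))
```

On imbalanced data such as this toy set, plain accuracy is inflated by the majority class, while balanced accuracy weights the errors on "wrong" comments equally. That is one plausible reason to prefer it as a selection criterion.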

Year	2021
DOI	10.1007/978-3-030-86618-1_2
Venue	METHODOLOGIES AND INTELLIGENT SYSTEMS FOR TECHNOLOGY ENHANCED LEARNING
Keywords	TEL, Automated grading, NLP, ML, fastText, BERT, Multi-layer perceptron
DocType	Conference
Volume	326
ISSN	2367-3370
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Anna Maria Angelone	1	0	1.35
Alessandra Galassi	2	0	0.68
Pierpaolo Vittorini	3	57	18.62
