Title
Grammatical Error Correction: More Data with More Context
Abstract
Grammatical Error Correction (GEC) suffers severely from a scarcity of data, both annotated and unannotated, as humans do not intentionally make grammatical errors. To address this, we exploit the plentiful unlabeled plain text that is available: we augment the training data with artificial noise and pre-train our model as a denoising autoencoder (DAE), an intuitive data augmentation approach for GEC. In a novel step, we enhance our DAE, a Transformer model, with a cross-document context mechanism, using a parallel encoder to encode the cross-document context and fusing the two encoders' representations in the decoder. Driven by document similarity metrics over any unlabeled plain text, this mechanism offers a new way to equip a GEC model with supplemental context, allowing it to glean grammatical information from a separate plain-text corpus. We evaluate our model on the CoNLL-2014 GEC Shared Task, achieving results that approach the state of the art for single models and showing strong potential given the abundance of available plain text.
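The DAE pre-training idea in the abstract, corrupting clean text with artificial noise and training the model to restore the original, can be sketched as follows. The specific noise operations and rates here (random token deletion and adjacent-token swaps) are illustrative assumptions, not the paper's exact recipe:

```python
import random

def add_noise(tokens, p_drop=0.1, p_swap=0.1, rng=None):
    """Corrupt a clean token sequence to build a synthetic (noisy, clean)
    training pair for DAE-style pre-training. Noise types and rates are
    assumptions for illustration."""
    if rng is None:
        rng = random.Random(0)
    noisy = []
    i = 0
    while i < len(tokens):
        r = rng.random()
        if r < p_drop:
            i += 1  # delete this token
        elif r < p_drop + p_swap and i + 1 < len(tokens):
            noisy.extend([tokens[i + 1], tokens[i]])  # swap adjacent tokens
            i += 2
        else:
            noisy.append(tokens[i])  # keep the token unchanged
            i += 1
    return noisy

clean = "she goes to the store every day".split()
noisy = add_noise(clean, rng=random.Random(42))
# (noisy, clean) is one synthetic pair: the model learns to map noisy -> clean
```

In practice such pairs would be generated over a large plain-text corpus, letting the unlabeled data stand in for scarce human-annotated corrections.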
Year
2020
DOI
10.1109/IALP51396.2020.9310498
Venue
2020 International Conference on Asian Language Processing (IALP)
Keywords
grammatical error correction, transformer, data augmentation
DocType
Conference
ISSN
2159-1962
ISBN
978-1-7281-7690-1
Citations
0
PageRank
0.34
References
0
Authors
3
Name           Order  Citations  PageRank
Kevin Parnow   1      0          2.03
Zuchao Li      2      0          0.34
Hai Zhao       3      960        113.64