Paper Info

Title
The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction.

Abstract
This paper introduces the freely available WikEd Error Corpus. We describe the data mining process from Wikipedia revision histories, corpus content and format. The corpus consists of more than 12 million sentences with a total of 14 million edits of various types. As one possible application, we show that WikEd can be successfully adapted to improve a strong baseline in a task of grammatical error correction for English-as-a-Second-Language (ESL) learners' writings by 2.63%. Used together with an ESL error corpus, a composed system gains 1.64% when compared to the ESL-trained system.

Year	2014
DOI	10.1007/978-3-319-10888-9_47
Venue	Advances in Natural Language Processing
Keywords	error corpus, Wikipedia revision histories, grammatical error correction
Field	Computer science, Error detection and correction, Speech recognition, Artificial intelligence, Natural language processing
DocType	Conference
Volume	8686
ISSN	0302-9743
Citations	2
PageRank	0.39
References	18
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Roman Grundkiewicz	1	109	11.75
Marcin Junczys-Dowmunt	2	312	24.24
