Title
CodeSwitch-Reddit: Exploration of Written Multilingual Discourse in Online Discussion Forums
Abstract
In contrast to many decades of research on oral code-switching, the study of written multilingual productions has only recently enjoyed a surge of interest. Many open questions remain regarding the sociolinguistic underpinnings of written code-switching, and progress has been limited by a lack of suitable resources. We introduce a novel, large, and diverse dataset of written code-switched productions, curated from topical threads of multiple bilingual communities on the Reddit discussion platform, and explore questions that were mainly addressed in the context of spoken language thus far. We investigate whether findings in oral code-switching concerning content and style, as well as speaker proficiency, are carried over into written code-switching in discussion forums. The released dataset can further facilitate a range of research and practical activities.
Year: 2019
DOI: 10.18653/v1/D19-5558
Venue: International Joint Conference on Natural Language Processing
Field: World Wide Web, Computer science, Natural language processing, Artificial intelligence, Online discussion
DocType: Conference
Volume: D19-1
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name | Order | Citations | PageRank
Ella Rabinovich | 1 | 0 | 1.01
Masih Sultani | 2 | 0 | 0.34
Suzanne Stevenson | 3 | 566 | 64.31