Corpus studies in word prediction - Citegraph

Paper Info

Title
Corpus studies in word prediction

Abstract
Word prediction can be used to enhance the communication rate of people with disabilities who use Augmentative and Alternative Communication (AAC) devices. We use statistical methods in a word prediction system, train them on a corpus, and then measure the efficacy of the resulting system by calculating the theoretical keystroke savings on held-out data. Ideally, training and testing should be done on a large corpus of AAC text covering a variety of topics, but no such corpus exists. We discuss training and testing on a wide variety of corpora meant to approximate text from AAC users. We show that training on a combination of in-domain and out-of-domain data is often more beneficial than training on either data set alone, and that advanced language modeling such as topic modeling is portable even when applied to very different text.

Year	DOI	Venue
2007	10.1145/1296843.1296877	ASSETS
Keywords	Field	DocType
aac user, resulting system, corpus study, in-domain data, ideally training, aac text, topic modeling, word prediction, large corpus, out-of-domain data, different text, approximate text, language modeling, language model, corpora	Computer science, Keystroke logging, Human–computer interaction, Artificial intelligence, Natural language processing, Topic model, Language model, Augmentative and alternative communication, Prediction system	Conference
Citations	PageRank	References
11	0.79	12
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Keith Trnka	1	97	7.51
Kathleen F. McCoy	2	671	93.90
