Title
Corpus studies in word prediction
Abstract
Word prediction can be used to enhance the communication rate of people with disabilities who use Augmentative and Alternative Communication (AAC) devices. Our word prediction system uses statistical methods that are trained on a corpus, and we measure the efficacy of the resulting system by calculating the theoretical keystroke savings on held-out data. Ideally, training and testing should be done on a large corpus of AAC text covering a variety of topics, but no such corpus exists. We discuss training and testing on a wide variety of corpora meant to approximate text from AAC users. We show that training on a combination of in-domain and out-of-domain data is often more beneficial than training on either data set alone, and that advanced language modeling such as topic modeling is portable even when applied to very different text.
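The keystroke savings measure mentioned in the abstract can be illustrated with a small sketch. This is not the authors' system: it assumes a plain unigram frequency model, a prediction list of `window` words shown after every keystroke, and one keystroke to select a prediction (which also supplies the trailing space). All function names and the toy corpus are hypothetical.

```python
from collections import Counter


def train_unigram(corpus_words):
    """Count word frequencies in the training corpus (assumed unigram model)."""
    return Counter(corpus_words)


def predict(counts, prefix, window=5):
    """Return up to `window` most frequent training words starting with `prefix`."""
    candidates = [w for w in counts if w.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for determinism.
    candidates.sort(key=lambda w: (-counts[w], w))
    return candidates[:window]


def keystrokes_for_word(counts, word, window=5):
    """Keystrokes needed to enter `word` plus its trailing space."""
    for typed in range(len(word) + 1):
        if word in predict(counts, word[:typed], window):
            # Letters typed so far, plus one key to select the prediction.
            return typed + 1
    # Word never predicted: every letter plus the space.
    return len(word) + 1


def keystroke_savings(counts, test_words, window=5):
    """Percentage of keystrokes saved on held-out text relative to typing it in full."""
    baseline = sum(len(w) + 1 for w in test_words)
    if baseline == 0:
        return 0.0
    with_prediction = sum(keystrokes_for_word(counts, w, window) for w in test_words)
    return 100.0 * (baseline - with_prediction) / baseline


if __name__ == "__main__":
    counts = train_unigram("the cat sat on the mat".split())
    held_out = ["the", "cat", "dog"]
    print(f"{keystroke_savings(counts, held_out, window=1):.1f}% keystrokes saved")
```

Training on a different corpus only changes the `counts` passed in, so the same held-out text can be scored against in-domain, out-of-domain, or combined training data.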
Year
2007
DOI
10.1145/1296843.1296877
Venue
ASSETS
Keywords
aac user,resulting system,corpus study,in-domain data,ideally training,aac text,topic modeling,word prediction,large corpus,out-of-domain data,different text,approximate text,language modeling,language model,corpora
Field
Computer science,Keystroke logging,Human–computer interaction,Artificial intelligence,Natural language processing,Topic model,Language model,Augmentative and alternative communication,Prediction system
DocType
Conference
Citations
11
PageRank
0.79
References
12
Authors
2
Name, Order, Citations, PageRank
Keith Trnka, 1, 97, 7.51
Kathleen F. McCoy, 2, 671, 93.90