Title
Syntactically-informed models for comma prediction
Abstract
Providing punctuation in speech transcripts not only improves readability but also helps downstream text processing such as information extraction and machine translation. In this paper, we improve the accuracy of comma prediction in English broadcast news by 7% by introducing syntactic features inspired by the role of commas as described in linguistics studies. We analyze the impact of those features on other subsets of features (prosody, words…) when combined through CRFs. The syntactic cues can help characterize large syntactic patterns, such as appositions and lists, which are not necessarily marked by prosody.
Year
2009
DOI
10.1109/ICASSP.2009.4960679
Venue
ICASSP
Keywords
english broadcast news, providing punctuation, large syntactic pattern, speech transcript, syntactically-informed model, speech processing, machine learning, comma prediction, linguistics study, downstream text, syntactic cue, machine translation, punctuation, information extraction, indexing terms, information model, text analysis, linguistics, decision trees, speech, neural networks, computer science, testing, natural language processing, speech recognition, data mining, feature extraction, probability density function, broadcasting, boosting, predictive models
Field
Speech processing, Prosody, Computer science, Machine translation, Speech recognition, Information extraction, Artificial intelligence, Natural language processing, Syntax, Punctuation, CRFs, Text processing
DocType
Conference
ISSN
1520-6149
Citations
3
PageRank
0.41
References
6
Authors
3
Name
Order
Citations
PageRank
Benoit Favre11338.58
Dilek Hakkani-Tür228217.30
Elizabeth Shriberg33057325.64