The ICSI-SRI-UW metadata extraction system - Citegraph

Paper Info

Title
The ICSI-SRI-UW metadata extraction system

Abstract
Both human and automatic processing of speech require recognizing more than just the words. We describe a state-of-the-art system for automatic detection of "metadata" (information beyond the words) in both broadcast news and spontaneous telephone conversations, developed as part of the DARPA EARS Rich Transcription program. System tasks include sentence boundary detection, filler word detection, and detection/correction of disfluencies. To achieve best performance, we combine information from different types of language models (based on words, part-of-speech classes, and automatically induced classes) with information from a prosodic classifier. The prosodic classifier employs bagging and ensemble approaches to better estimate posterior probabilities. We use confusion networks to improve robustness to speech recognition errors. Most recently, we have investigated a maximum entropy approach for the sentence boundary detection task, yielding a gain over our standard HMM approach. We report results for these techniques on the official NIST Rich Transcription metadata tasks.

Year	Venue	Field
2004	INTERSPEECH	Metadata, Pattern recognition, Computer science, Robustness (computer science), Speech recognition, NIST, Artificial intelligence, Principle of maximum entropy, Classifier (linguistics), Hidden Markov model, Sentence, Language model
DocType	Citations	PageRank
Conference	6	0.84
References	Authors
12	7

Authors (7 rows)

Name	Order	Citations	PageRank
Elizabeth Shriberg	1	3057	325.64
Andreas Stolcke	2	6690	712.46
Dustin Hillard	3	410	26.56
Mari Ostendorf	4	2462	348.75
Barbara Peskin	5	176	18.45
Mary P. Harper	6	609	66.92
Yang Liu	7	491	116.11
