Parsing Speech: a Neural Approach to Integrating Lexical and Acoustic-Prosodic Information. - Citegraph

Paper Info

Title
Parsing Speech: a Neural Approach to Integrating Lexical and Acoustic-Prosodic Information.

Abstract
In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing spoken utterances, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and prosodic features. We find that different types of acoustic-prosodic features are individually helpful, and together give statistically significant improvements in parse and disfluency detection F1 scores over a strong text-only baseline. For this study with known sentence boundaries, error analyses show that the main benefit of acoustic-prosodic features is in sentences with disfluencies, that attachment decisions are most improved, and that transcription errors obscure gains from prosody.

Year	Venue	Field
2018	NAACL-HLT	Computer science, Artificial intelligence, Natural language processing, Parsing
DocType	Citations	PageRank
Conference	2	0.37
References	Authors
0	6

Authors (6 rows)

Name	Order	Citations	PageRank
Trang Tran	1	8	2.50
Shubham Toshniwal	2	19	4.12
Mohit Bansal	3	871	63.19
Kevin Gimpel	4	1545	79.71
Karen Livescu	5	1254	71.43
Mari Ostendorf	6	2462	348.75
