Title
Automatic Summarization of Highly Spontaneous Speech.
Abstract
This paper addresses the summarization of highly spontaneous speech. Speech is converted into text using automatic speech recognition (ASR) and then segmented into tokens. Human-made and automatic, prosody-based tokenization are compared. The resulting sentence-like units are analysed by a syntactic parser to support automatic sentence selection for the summary. The preprocessed sentences are ranked based on thematic terms and sentence position. The thematic term score is computed in two ways: TF-IDF and Latent Semantic Indexing. The sentence score is calculated as a linear combination of the thematic term score and a sentence position score. To generate the summary, the top 10 candidates for the most informative/best summarizing sentences are selected. The system showed comparable performance with the prosody-based tokenization approach (recall: 0.62, precision: 0.79, F-measure: 0.68). A subjective test is also carried out on a Likert scale.
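The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the TF-IDF variant (no Latent Semantic Indexing, no syntactic parsing), and the function names, the whitespace tokenizer, the position score `1 - i/n`, and the weight `alpha` are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def tfidf_sentence_scores(sentences):
    """Sum of TF-IDF weights per sentence (each sentence is one 'document')."""
    tokenized = [s.lower().split() for s in sentences]  # assumed tokenizer
    n = len(tokenized)
    # Document frequency: in how many sentences does each word occur?
    df = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        scores.append(
            sum((tf[w] / len(toks)) * math.log(n / df[w]) for w in tf)
        )
    return scores


def summarize(sentences, k=10, alpha=0.7):
    """Pick the k best sentences by a linear combination of two scores.

    alpha weights the thematic (TF-IDF) score against the position score;
    its value and the position function are assumptions, not from the paper.
    """
    n = len(sentences)
    thematic = tfidf_sentence_scores(sentences)
    top = max(thematic, default=0.0) or 1.0  # avoid division by zero
    combined = []
    for i, t in enumerate(thematic):
        position = 1.0 - i / n  # assumed: earlier sentences score higher
        combined.append(alpha * (t / top) + (1.0 - alpha) * position)
    # Take the k highest-scoring indices, then restore document order.
    best = sorted(range(n), key=lambda i: combined[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]
```

Calling `summarize(units, k=10)` on a list of sentence-like units returns at most ten of them in their original order, mirroring the top-10 selection mentioned in the abstract.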
Year
2016
DOI
10.1007/978-3-319-43958-7_16
Venue
Lecture Notes in Computer Science
Keywords
Speech summarization, Latent semantic indexing, Spontaneous speech
Field
Tokenization (data security), Prosody, Automatic summarization, Ranking, Computer science, Speech recognition, Natural language processing, Artificial intelligence, Parsing, Sentence, Syntax, Recall
DocType
Conference
Volume
9811
ISSN
0302-9743
Citations
1
PageRank
0.36
References
8
Authors
2
Name, Order, Citations, PageRank
András Beke1225.51
György Szaszák25113.21