Title
Incorporating paragraph embeddings and density peaks clustering for spoken document summarization
Abstract
Representation learning has emerged as an active research subject in many machine learning applications because of its excellent performance. As one instantiation, word embedding has been widely used in natural language processing. However, as far as we are aware, relatively few studies have investigated paragraph embedding methods for extractive text or speech summarization. Extractive summarization aims at selecting a set of indicative sentences from a source document to express the most important theme of the document. There is a general consensus that relevance and redundancy are both critical issues for users in a realistic summarization scenario. However, most existing methods focus on determining only the degree of relevance between each sentence and a given document, while redundancy is handled in a post-processing step. Based on these observations, this paper makes three contributions. First, we comprehensively compare word and paragraph embedding methods for spoken document summarization. Second, we propose a novel summarization framework that takes both relevance and redundancy information into account simultaneously, so that a set of representative sentences can be selected automatically in a one-pass process. Third, we plug paragraph embedding methods into the proposed framework to further enhance summarization performance. Experimental results demonstrate the effectiveness of the proposed methods compared to existing state-of-the-art methods.
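The abstract describes one-pass selection that weighs relevance and redundancy together, but this record does not give the paper's formulas. The following is only a minimal sketch of the generic density peaks clustering idea applied to sentence vectors, under stated assumptions: local density (rho) stands in for relevance, distance to the nearest denser sentence (delta) stands in for non-redundancy, and the two are combined as rho * delta. The function names, the Gaussian kernel, the cosine distance, and the median cutoff are all illustrative choices, not taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; a zero vector is treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (nu * nv))


def density_peaks_select(embeddings, k, cutoff=None):
    """Return indices of the k sentences with the largest rho * delta score.

    embeddings: one vector per sentence (hypothetical input; any embedding
    method could produce them). cutoff: kernel width, an assumed parameter.
    """
    n = len(embeddings)
    dist = [[0.0 if i == j else cosine_distance(embeddings[i], embeddings[j])
             for j in range(n)] for i in range(n)]

    if cutoff is None:
        # Assumed heuristic: use the (upper) median pairwise distance.
        pairs = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
        cutoff = pairs[len(pairs) // 2] if pairs else 1.0
    cutoff = max(cutoff, 1e-12)

    # rho: Gaussian-kernel local density, used here as a relevance proxy.
    rho = [sum(math.exp(-(dist[i][j] / cutoff) ** 2)
               for j in range(n) if j != i) for i in range(n)]

    # delta: distance to the nearest sentence with higher density. A sentence
    # close to a denser one is treated as redundant and gets a small delta.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    # One pass: rank every sentence by the combined score and keep the top k.
    order = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return order[:k]
```

Calling `density_peaks_select(vectors, k=2)` on a list of sentence vectors returns two sentence indices. In this sketch, near-duplicates of an already dense sentence score low without any separate redundancy-filtering step, which is the property the abstract emphasizes.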
Year: 2015
DOI: 10.1109/ASRU.2015.7404796
Venue: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Keywords: Spoken document, summarization, embedding, relevance, redundancy
Field: Computer science, Paragraph, Redundancy (engineering), Artificial intelligence, Natural language processing, Word embedding, Cluster analysis, Automatic summarization, Multi-document summarization, Embedding, Pattern recognition, Speech recognition, Feature learning
DocType: Conference
Citations: 1
PageRank: 0.35
References: 31
Authors: 5
Name           Order  Citations  PageRank
Kuan-Yu Chen   1      450        55.78
Kai-Wun Shih   2      1          0.35
Shih-Hung Liu  3      66         14.53
Berlin Chen    4      151        34.59
Hsin-min Wang  5      1201       129.62