Title | ||
---|---|---|
Automatic Broadcast News Summarization Via Rank Classifiers And Crowdsourced Annotation |
Abstract | ||
---|---|---|
Extractive speech summarization methods generally operate as binary classifiers that decide whether a sentence belongs to the summary. However, it is well known that even human annotators do not agree on the selection of most summary sentences. In this paper, we take a probabilistic view of the summarization ground truth and assume that sentences selected more frequently by annotators are of higher importance. Using a large summary dataset obtained through crowdsourcing, we empirically show that a sentence's selection frequency is inversely related to its summarization rank. Consequently, we model the relative importance of sentences using a rank-based classifier. Additionally, we utilize an extended paralinguistic feature set that has not previously been used for speech summarization, along with lexical and structural features. A Support Vector Machine (SVM) is used for both the baseline binary classifier and the rank classifier. Experimental evaluations show that the proposed approach outperforms traditional binary classifiers on various ROUGE summarization metrics across different summarization compression ratios (CR). |
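
The link between annotator selection frequency and sentence rank described in the abstract can be illustrated with a small sketch. The abstract does not spell out the paper's actual procedure, so everything here is an assumption for illustration: the function names, the toy annotations, the dense tie-sharing ranking, and the pairwise-preference form (a common input format for ranking SVMs) are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def selection_frequency(annotations, num_sentences):
    """Fraction of annotators who selected each sentence index."""
    counts = Counter(idx for chosen in annotations for idx in set(chosen))
    return [counts[i] / len(annotations) for i in range(num_sentences)]


def ranks_from_frequency(freqs):
    """Rank 1 is the most frequently selected sentence; ties share a rank."""
    distinct = sorted(set(freqs), reverse=True)
    rank_of = {f: r for r, f in enumerate(distinct, start=1)}
    return [rank_of[f] for f in freqs]


def preference_pairs(ranks):
    """Pairs (i, j) meaning sentence i should be ranked above sentence j."""
    pairs = []
    for i, j in combinations(range(len(ranks)), 2):
        if ranks[i] < ranks[j]:
            pairs.append((i, j))
        elif ranks[j] < ranks[i]:
            pairs.append((j, i))
        # Equal ranks carry no preference, so tied pairs are skipped.
    return pairs


# Hypothetical toy data: 4 annotators each pick sentences from a 5-sentence story.
annotations = [[0, 2], [0, 3], [0, 2], [2, 4]]
freqs = selection_frequency(annotations, num_sentences=5)
ranks = ranks_from_frequency(freqs)
pairs = preference_pairs(ranks)

print(freqs)  # selection frequency per sentence
print(ranks)  # higher frequency gives a smaller (better) rank number
print(pairs)  # pairwise preferences a rank classifier could be trained on
```

A binary classifier would collapse these frequencies into in-summary or not-in-summary labels, whereas the pairwise form keeps the relative ordering between sentences.
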
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/ICASSP.2015.7178974 | 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) |
Keywords | Field | DocType |
Spoken document summarization, paralinguistic features, crowdsourcing | Automatic summarization, Pattern recognition, Binary classification, Computer science, Crowdsourcing, Support vector machine, Artificial intelligence, Natural language processing, Probabilistic logic, Classifier (linguistics), Sentence, Binary number | Conference |
ISSN | Citations | PageRank |
1520-6149 | 2 | 0.36 |
References | Authors | |
14 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
S. Parthasarthy | 1 | 60 | 5.25 |
Taufiq Hasan | 2 | 216 | 13.77 |