Abstract |
---|
Multi-document summarization (MDS) aims to produce a brief summary of a cluster of related documents. In this paper, we cast the MDS task as an optimization problem whose objective function is a novel measure named soaking capacity. Our method originates from a classic hypothesis: the components of a summary are the sinks of information diffusion. We point out that this hypothesis only specifies the role of a summary and does not address how well a summary fulfills that role. To fill this gap, we formally define soaking capacity to quantify the ability of a summary to soak up information. We explicitly demonstrate its fitness as an indicator for both the saliency and the diversity goals of MDS. To solve the optimization problem, we propose a greedy algorithm named Soap, which adopts a surrogate of soaking capacity to accelerate the computation. Experiments on MDS datasets across various domains show the great potential of Soap compared with state-of-the-art MDS systems.
|
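The abstract frames summarization as greedy maximization of a set objective. The sketch below illustrates only that general pattern and is not the Soap algorithm: the abstract does not define soaking capacity or its surrogate, so `coverage_objective` is a hypothetical stand-in, and `greedy_select`, `lengths`, `budget`, and the toy similarity matrix are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Set


def greedy_select(
    n_items: int,
    objective: Callable[[Set[int]], float],
    lengths: Sequence[int],
    budget: int,
) -> List[int]:
    """Repeatedly add the item with the largest positive gain that fits the budget."""
    selected: List[int] = []
    chosen: Set[int] = set()
    used = 0
    current = objective(chosen)
    while True:
        best_item, best_gain = None, 0.0
        for i in range(n_items):
            if i in chosen or used + lengths[i] > budget:
                continue
            gain = objective(chosen | {i}) - current
            if gain > best_gain:
                best_item, best_gain = i, gain
        if best_item is None:  # nothing fits or nothing improves the objective
            break
        selected.append(best_item)
        chosen.add(best_item)
        used += lengths[best_item]
        current += best_gain
    return selected


def coverage_objective(sim: Sequence[Sequence[float]]) -> Callable[[Set[int]], float]:
    """Hypothetical placeholder objective, NOT soaking capacity:
    sums, over all items, the best similarity to any chosen item."""
    def score(chosen: Set[int]) -> float:
        if not chosen:
            return 0.0
        return sum(max(row[j] for j in chosen) for row in sim)
    return score


# Toy data (assumed): items 0 and 1 are near-duplicates, item 2 is distinct.
sim = [
    [1.0, 0.75, 0.25],
    [0.75, 1.0, 0.25],
    [0.25, 0.25, 1.0],
]
summary = greedy_select(3, coverage_objective(sim), lengths=[1, 1, 1], budget=2)
print(summary)
```

With this placeholder objective, the second pick skips the near-duplicate and takes the distinct item, which is the kind of saliency and diversity trade-off the abstract attributes to its own measure.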
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3340531.3411909 | CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 2020 |

DocType | ISBN | Citations |
---|---|---|
Conference | 978-1-4503-6859-9 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 12 | 3 |

Name | Order | Citations | PageRank |
---|---|---|---|
Kexiang Wang | 1 | 103 | 6.35 |
Baobao Chang | 2 | 445 | 46.85 |
Zhifang Sui | 3 | 172 | 39.06 |