Abstract |
---|
Data-driven approaches to sentence compression define the task as dropping any subset of words from the input sentence while retaining important information and grammaticality. We show that only 16% of the observed compressed sentences in the domain of subtitling can be accounted for in this way. We argue that this is partly due to the lack of appropriate evaluation material and estimate that a deletion model is in fact compatible with approximately 55% of the observed data. We analyse the remaining cases, in which deletion alone failed to provide the required level of compression, and conclude that word order changes and paraphrasing are crucial there. We therefore argue for more elaborate sentence compression models that include paraphrasing and word reordering. We report preliminary results of applying a recently proposed, more powerful compression model in the context of subtitling for Dutch. |
Year | DOI | Venue
---|---|---
2010 | 10.1007/978-3-642-15573-4_3 | Empirical Methods in Natural Language Generation

Keywords | Field | DocType
---|---|---
observed data, input sentence, elaborate sentence compression model, preliminary result, deletion model, word order change, powerful compression model, word reordering, appropriate evaluation material, important information, word order | Word order, Parse tree, Computer science, Computational linguistics, Speech recognition, Sentence compression, Compression ratio, Natural language processing, Artificial intelligence, Grammaticality, Linguistics, Sentence | Conference

Volume | ISSN | ISBN
---|---|---
5790 | 0302-9743 | 3-642-15572-3

Citations | PageRank | References
---|---|---
6 | 0.45 | 26
Authors |
---|
4 |
Name | Order | Citations | PageRank
---|---|---|---
Erwin Marsi | 1 | 543 | 46.13 |
Emiel Krahmer | 2 | 866 | 110.30 |
Iris Hendrickx | 3 | 285 | 30.91 |
Walter Daelemans | 4 | 2019 | 269.73 |