Title
Combining hierarchical clustering and machine learning to predict high-level discourse structure
Abstract
We propose a novel method to predict the interparagraph discourse structure of text, i.e. to infer which paragraphs are related to each other and form larger segments on a higher level. Our method combines a clustering algorithm with a model of segment "relatedness" acquired in a machine learning step. The model integrates information from a variety of sources, such as word co-occurrence, lexical chains, cue phrases, punctuation, and tense. Our method outperforms an approach that relies on word co-occurrence alone.
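The general idea of pairing a clustering algorithm with a segment "relatedness" score can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a greedy bottom-up merge of adjacent segments, and it uses a hypothetical word-overlap function (`jaccard_relatedness`) as a stand-in for the learned model, which according to the abstract also draws on lexical chains, cue phrases, punctuation, and tense. All function names here are illustrative.

```python
from typing import Callable, List, Tuple, Union

# A tree is either a paragraph index (leaf) or a pair of subtrees (merged segment).
Tree = Union[int, Tuple["Tree", "Tree"]]


def tokens(text: str) -> set:
    """Lowercased word set; a crude stand-in for real feature extraction."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def jaccard_relatedness(a: str, b: str) -> float:
    """Hypothetical relatedness score based on word overlap only.

    A learned model would replace this function; it is an assumption
    made purely for illustration.
    """
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_paragraphs(
    paragraphs: List[str],
    relatedness: Callable[[str, str], float] = jaccard_relatedness,
) -> Tree:
    """Greedily merge the most related pair of ADJACENT segments until one remains."""
    if not paragraphs:
        raise ValueError("need at least one paragraph")
    # Each segment is (tree, concatenated text of the paragraphs it covers).
    segments = [(i, text) for i, text in enumerate(paragraphs)]
    while len(segments) > 1:
        scores = [
            relatedness(segments[i][1], segments[i + 1][1])
            for i in range(len(segments) - 1)
        ]
        # Index of the best-scoring adjacent pair (first one wins on ties).
        best = max(range(len(scores)), key=scores.__getitem__)
        (left_tree, left_text), (right_tree, right_text) = (
            segments[best],
            segments[best + 1],
        )
        segments[best:best + 2] = [
            ((left_tree, right_tree), left_text + " " + right_text)
        ]
    return segments[0][0]


if __name__ == "__main__":
    demo = [
        "cats purr softly",
        "cats purr loudly",
        "stocks fell sharply",
        "stocks fell again",
    ]
    print(cluster_paragraphs(demo))  # ((0, 1), (2, 3))
```

Restricting merges to adjacent segments keeps the resulting tree consistent with the linear order of the text. Swapping in a different `relatedness` callable is the only change needed to try a richer scoring model.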
Year
2004
DOI
10.3115/1220355.1220362
Venue
COLING
Keywords
higher level, larger segment, novel method, high-level discourse structure, cue phrase, hierarchical clustering, lexical chain, interparagraph discourse structure, clustering algorithm, word co-occurrence
Field
Hierarchical clustering, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Natural language processing, Cluster analysis, Punctuation, Machine learning, Discourse structure
DocType
Conference
Volume
C04-1
Citations
7
PageRank
0.67
References
9
Authors
2
Name, Order, Citations, PageRank
Caroline Sporleder, 1, 453, 31.84
Alex Lascarides, 2, 503, 68.41