Abstract |
---|
We formulate the problem of classifying text as belonging to a document or a meta-document and propose application areas for it. An algorithm is proposed for document classification tasks where word counts are insufficient to differentiate between such abstract classes of text as metalanguage and object-level language. We extend the parse tree kernel method from the level of individual sentences to the level of paragraphs, based on anaphora, rhetorical structure relations and communicative actions linking phrases in different sentences. A tree kernel learning technique is applied to these extended trees to leverage the additional discourse-related information. We evaluate our approach in the domain of action-plan documents. |
Year | Venue | Field |
---|---|---|
2015 | RANLP | Document classification, Parse tree, Computer science, Rhetoric, Tree kernel, Metalanguage, Artificial intelligence, Natural language processing, Kernel method, Discourse structure |
DocType | Citations | PageRank |
---|---|---|
Conference | 1 | 0.34 |
References | Authors |
---|---|
16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Boris Galitsky | 1 | 248 | 37.81 |
Dmitry I. Ilvovsky | 2 | 14 | 7.38 |
Sergei O. Kuznetsov | 3 | 1630 | 121.46 |