Abstract | ||
---|---|---|
We present a method for structuring a document according to the information present in its different organizational tables: the table of contents, tables of figures, etc. The method follows a two-step approach that leverages both functional and formal (layout-based) knowledge. The functional definition of an organizational table, based on five properties, provides a first solution, which a second step improves by automatically learning the form of the table of contents. We also report on the robustness and performance of the method and illustrate its use in a real conversion case. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1007/s10032-009-0078-8 | International Journal on Document Analysis and Recognition |
Keywords | DocType | Volume |
organizational table, real conversion case, different organizational table, two-step approach, document structuring, table of contents recognition, functional approach, machine learning, functional definition, information present | Journal | 12 |
Issue | ISSN | Citations |
1 | 1433-2825 | 15 |
PageRank | References | Authors |
1.26 | 13 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hervé Déjean | 1 | 377 | 48.52 |
Jean-Luc Meunier | 2 | 243 | 39.36 |