Title
A Rectangle Mining Method for Understanding the Semantics of Financial Tables
Abstract
Financial statements report crucial information in tables with complex semantic structure, which are desirable, yet challenging, to interpret automatically. For example, in such tables a row of data cells is often explained by the headers of other rows. In a departure from prior art, we propose a rectangle mining framework for understanding complex tables, which considers rectangular regions rather than individual cells or pairs of cells in a table. We instantiate this framework with ReMine, an algorithm for extracting the row header semantics of tables, and show that it significantly outperforms prior pair-wise classification approaches on two datasets: (i) a set of manually labeled financial tables from multiple companies, and (ii) the ICDAR 2013 Table Competition dataset.
Year: 2017
DOI: 10.1109/ICDAR.2017.52
Venue: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
Keywords: complex tables, row header semantics, rectangle mining method, complex semantic structure, data cells, financial tables, ICDAR 2013 Table Competition dataset, financial statements, ReMine algorithm
Field: Row, Computer science, Rectangle, Feature extraction, Prediction algorithms, Header, Finance, Semantics
DocType: Conference
Volume: 01
ISSN: 1520-5363
ISBN: 978-1-5386-3587-2
Citations: 0
PageRank: 0.34
References: 15
Authors: 5
Name                      Order  Citations  PageRank
Xilun Chen                1      38         7.71
Laura Chiticariu          2      10         1.51
Marina Danilevsky         3      0          0.68
Alexandre V. Evfimievski  4      501        41.76
Prithviraj Sen            5      837        38.24