Abstract |
---|
State-of-the-art methods for document image classification rely on visual features extracted by deep convolutional neural networks (CNNs). These methods do not utilize rich semantic information present in the text of the document, which can be extracted using Optical Character Recognition (OCR). We first study the performance of state-of-the-art text classification approaches when applied to noisy text obtained from OCR. We then show that fusing this textual information with visual CNN methods produces state-of-the-art results on the RVL-CDIP classification dataset. |
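The abstract does not say how the textual and visual signals are fused, so the following is only a minimal sketch of one common option, late fusion by weighted averaging of per-class probabilities. It is not the authors' method. The function names (`softmax`, `late_fusion`, `predict`), the `text_weight` parameter, and the three-class example vectors are all hypothetical and chosen for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(
    visual_probs: Sequence[float],
    text_probs: Sequence[float],
    text_weight: float = 0.5,  # hypothetical mixing weight, not from the paper
) -> List[float]:
    """Weighted average of class probabilities from two classifiers."""
    if len(visual_probs) != len(text_probs):
        raise ValueError("both classifiers must score the same set of classes")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    return [
        (1.0 - text_weight) * v + text_weight * t
        for v, t in zip(visual_probs, text_probs)
    ]


def predict(probs: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Illustrative values only: the image model slightly prefers class 0,
# while the OCR-text model strongly prefers class 1.
visual = [0.5, 0.4, 0.1]
text = [0.1, 0.8, 0.1]
fused = late_fusion(visual, text, text_weight=0.5)
label = predict(fused)
```

In this toy case the text classifier's confident vote outweighs the image classifier's weak preference, which is the kind of complementary signal the abstract attributes to OCR text. In practice the mixing weight would be tuned on held-out data, and fusing intermediate features rather than output probabilities is another possible design.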
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICDAR.2019.00021 | 2019 International Conference on Document Analysis and Recognition (ICDAR) |
Keywords | Field | DocType |
Classification, Document Image, Multimodal | Pattern recognition, Computer science, Convolutional neural network, Textual information, Noisy text, Optical character recognition, Semantic information, Artificial intelligence, Contextual image classification | Conference |
ISSN | ISBN | Citations |
1520-5363 | 978-1-7281-3015-6 | 1 |
PageRank | References | Authors |
0.35 | 0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rajiv Jain | 1 | 4 | 5.16 |
Curtis Wigington | 2 | 6 | 3.17 |