Title
CONFIRM - Clustering of noisy form images using robust matching.
Abstract
• A clustering framework is proposed for clustering noisy form images.
• Novel algorithms for matching text lines and rule lines are introduced.
• We show a 44% improvement over the state-of-the-art on 5 datasets of historical forms.
• Sampling and bootstrapping are employed for scalability to large datasets.
Year: 2019
DOI: 10.1016/j.patcog.2018.10.004
Venue: Pattern Recognition
Keywords: Form processing, Document analysis, Document image clustering, Historical document processing, Clustering
Field: Fuzzy clustering, Canopy clustering algorithm, CURE data clustering algorithm, Clustering high-dimensional data, Correlation clustering, Pattern recognition, Artificial intelligence, Brown clustering, Cluster analysis, Mathematics, Visual Word
DocType: Journal
Volume: 87
Issue: 1
ISSN: 0031-3203
Citations: 0
PageRank: 0.34
References: 36
Authors: 2
Name, Order, Citations, PageRank
Chris Tensmeyer, 1, 20, 4.83
Tony R. Martinez, 2, 1364, 100.44