Abstract | ||
---|---|---|
Segmentation of a document image plays an important role in automatic document processing. In this paper, we propose a consensus-based clustering approach for document image segmentation. In this method, the foreground regions of a document image are grouped into a set of primitive blocks, and a set of features is extracted from each block. Similarities among the blocks are computed for each feature using a hypothesis test-based similarity measure. The primitive blocks are then clustered based on the consensus of these per-feature similarities. This clustering approach is used iteratively with a classifier to label each primitive block. Experimental results show the effectiveness of the proposed method and further show that the dependence of classification performance on the training data is significantly reduced. |
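
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which hypothesis test or consensus rule is used, so this example assumes Welch's t statistic with a fixed threshold `t_crit`, a simple vote count `min_agree` across features, and connected components as the clustering step. All function names, feature names, and sample values are hypothetical.

```python
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two samples (each needs >= 2 values)."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    if se2 == 0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / se2 ** 0.5


def similar(a, b, t_crit=2.0):
    """Treat two samples as similar when the test fails to reject equal means.

    t_crit is an illustrative fixed threshold, not a value from the paper.
    """
    return welch_t(a, b) < t_crit


def consensus_clusters(blocks, min_agree):
    """Group blocks that are similar on at least min_agree features.

    blocks: list of dicts mapping feature name -> list of measurements.
    Returns one cluster label per block (connected components, an assumption).
    """
    parent = list(range(len(blocks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(blocks)), 2):
        # Each feature casts one vote on whether blocks i and j are similar.
        votes = sum(similar(blocks[i][f], blocks[j][f]) for f in blocks[i])
        if votes >= min_agree:
            parent[find(i)] = find(j)

    roots = [find(i) for i in range(len(blocks))]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


# Hypothetical primitive blocks: two text-like blocks and one heading-like block.
blocks = [
    {"stroke_width": [2.0, 2.1, 1.9, 2.0], "height": [10, 11, 10, 12]},
    {"stroke_width": [2.1, 2.0, 2.2, 1.9], "height": [11, 10, 12, 11]},
    {"stroke_width": [6.0, 6.2, 5.9, 6.1], "height": [40, 42, 41, 39]},
]
labels = consensus_clusters(blocks, min_agree=2)
print(labels)
```

Here the first two blocks agree on both features and share a label, while the third block differs on both and is placed in its own cluster. The iterative use of a classifier described in the abstract is not shown.
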
Year | DOI | Venue |
---|---|---|
2016 | 10.1007/s10032-016-0275-1 | International Journal on Document Analysis and Recognition |
Keywords | Field | DocType |
Document analysis, Segmentation, Clustering, Hypothesis testing, Stroke width | Fuzzy clustering, Data mining, Scale-space segmentation, Computer science, Document clustering, Segmentation-based object categorization, Image segmentation, Consensus clustering, Artificial intelligence, Cluster analysis, Pattern recognition, Correlation clustering, Machine learning | Journal |
Volume | Issue | ISSN |
19 | 4 | 1433-2825 |
Citations | PageRank | References |
1 | 0.36 | 33 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Soumyadeep Dey | 1 | 12 | 3.00 |
Jayanta Mukherjee | 2 | 378 | 56.06 |
Shamik Sural | 3 | 1008 | 96.36 |