Abstract | ||
---|---|---|
This paper presents a new method for the functional classification of text blocks in a document. It is based on texture analysis and unsupervised classification. Texture is used here to define different classes of text blocks in the document and to suggest a possible order of exploration, from the most eye-catching data to the least significant text block. The typographical properties of blocks are characterized by two main discriminating primitives: the complexity of the text drawing and the structural relief of the block. This analysis is the starting point of a three-class categorization into functional families (main headings, sub-headings and text paragraphs). Each block of text is described and classified through a labeling process based on a 3D feature space using the two previous features (complexity and structural relief) and a third one chosen among pattern primitives, block size and location in the document. This method allows a first approach to a global context-free classification of documents. |
Year | DOI | Venue |
---|---|---|
2002 | 10.1109/ICPR.2002.1047835 | Pattern Recognition, 2002. Proceedings. 16th International Conference |
Keywords | Field | DocType |
document image processing,image classification,pattern clustering,statistical analysis,text analysis,3D-feature space,complexity,discriminating primitives,functional classification,global context free classification,heterogeneous grey level documents,main headings,most eye-catching data,pattern primitives,structural relief,sub-headings,text blocks,text drawing,text entities,text paragraphs,texture analysis,three-classes categorization,typographical properties,unsupervised clustering | Categorization,Text mining,Pattern recognition,Document image processing,Pattern clustering,Computer science,Artificial intelligence,Cluster analysis,Contextual image classification,Gray (horse),Statistical analysis | Conference |
Volume | ISSN | ISBN |
3 | 1051-4651 | 0-7695-1695-X |
Citations | PageRank | References |
1 | 0.39 | 5 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Stéphane Bres | 1 | 127 | 14.42 |
Véronique Eglin | 2 | 131 | 19.67 |
Antoine Gagneux | 3 | 1 | 0.39 |