Abstract |
---|
Word segmentation refers to the process of defining the word regions of a text line. It is a critical stage towards word and character recognition as well as word spotting, and it mainly comprises three basic stages: preprocessing, distance computation and gap classification. In this paper, we propose a novel word segmentation method which uses the Student's-t distribution for the gap classification stage. The main advantage of the Student's-t distribution is its robustness to outliers. To test the efficiency of the proposed method, we used the four benchmarking datasets of the ICDAR/ICFHR Handwriting Segmentation Contests as well as a historical typewritten dataset of Greek polytonic text. It is observed that the use of mixtures of Student's-t distributions for word segmentation outperforms other gap classification methods in terms of Recognition Accuracy and F-Measure. Moreover, across all examined benchmarks, the Student's-t is shown to produce a perfect segmentation result in significantly more cases than the state-of-the-art Gaussian mixture model. |
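
The gap classification stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-component 1-D mixture of Student's-t distributions fitted by EM, with the degrees of freedom fixed at `nu=3` and a quantile-based initialization, both chosen here only for simplicity. The function names `fit_t_mixture` and `classify_gaps` and the sample gap values are hypothetical.

```python
import math

def t_logpdf(x, mu, var, nu):
    """Log-density of a 1-D Student's-t with location mu, squared scale var, nu d.o.f."""
    d2 = (x - mu) ** 2 / var
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi * var)
            - (nu + 1) / 2 * math.log1p(d2 / nu))

def fit_t_mixture(gaps, nu=3.0, iters=200):
    """EM for a two-component Student's-t mixture (nu is fixed: an assumption)."""
    xs = sorted(gaps)
    n = len(xs)
    # Assumed initialization: lower and upper quartile as the two locations.
    mus = [xs[n // 4], xs[(3 * n) // 4]]
    mean = sum(xs) / n
    var0 = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    variances = [var0, var0]
    pis = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities r and outlier-downweighting factors u.
        resp, u = [], []
        for x in gaps:
            logs = [math.log(max(pis[k], 1e-12)) + t_logpdf(x, mus[k], variances[k], nu)
                    for k in range(2)]
            m = max(logs)
            w = [math.exp(l - m) for l in logs]
            s = sum(w)
            resp.append([wk / s for wk in w])
            u.append([(nu + 1) / (nu + (x - mus[k]) ** 2 / variances[k])
                      for k in range(2)])
        # M-step: weighted updates; points far from a component get small u.
        for k in range(2):
            rk = max(sum(r[k] for r in resp), 1e-12)
            ruk = max(sum(r[k] * uu[k] for r, uu in zip(resp, u)), 1e-12)
            mus[k] = sum(r[k] * uu[k] * x for r, uu, x in zip(resp, u, gaps)) / ruk
            variances[k] = max(
                sum(r[k] * uu[k] * (x - mus[k]) ** 2
                    for r, uu, x in zip(resp, u, gaps)) / rk,
                1e-6)
            pis[k] = rk / n
    return pis, mus, variances

def classify_gaps(gaps, pis, mus, variances, nu=3.0):
    """Return True for gaps assigned to the component with the larger location."""
    big = 0 if mus[0] > mus[1] else 1
    labels = []
    for x in gaps:
        logs = [math.log(max(pis[k], 1e-12)) + t_logpdf(x, mus[k], variances[k], nu)
                for k in range(2)]
        labels.append(logs[big] > logs[1 - big])
    return labels

if __name__ == "__main__":
    # Hypothetical gap distances in pixels: small intra-word gaps,
    # larger inter-word gaps, and one outlier (60).
    gaps = [2, 3, 2, 4, 3, 2, 3, 4, 2, 3, 3, 2, 18, 20, 22, 19, 21, 20, 60]
    pis, mus, variances = fit_t_mixture(gaps)
    print(mus, classify_gaps(gaps, pis, mus, variances))
```

The factor `u` in the E-step is where the robustness mentioned in the abstract shows up in this sketch: a gap far from a component's location receives a small weight, so a single extreme distance has limited influence on the estimated location and scale.
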
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/DAS.2016.35 | 2016 12th IAPR Workshop on Document Analysis Systems (DAS) |
Keywords | Field | DocType |
Word Segmentation, Student's-t Distribution, Finite mixture models, Robust models | Scale-space segmentation, Pattern recognition, Computer science, Segmentation, Student's t-distribution, Outlier, Segmentation-based object categorization, Speech recognition, Text segmentation, Image segmentation, Artificial intelligence, Mixture model | Conference |
Citations | PageRank | References |
1 | 0.37 | 12 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Georgios Louloudis | 1 | 81 | 9.54 |
G. Sfikas | 2 | 155 | 14.23 |
Nikolaos Stamatopoulos | 3 | 20 | 2.79 |
Basilis Gatos | 4 | 773 | 43.34 |