Title
Word Segmentation Using the Student's-t Distribution
Abstract
Word segmentation is the process of delimiting the word regions of a text line. It is a critical step towards word and character recognition as well as word spotting, and it typically comprises three stages: preprocessing, distance computation and gap classification. In this paper, we propose a novel word segmentation method that uses the Student's-t distribution in the gap classification stage. The main advantage of the Student's-t distribution is its robustness to outliers. To evaluate the proposed method, we used the four benchmarking datasets of the ICDAR/ICFHR Handwriting Segmentation Contests as well as a historical typewritten dataset of Greek polytonic text. Mixtures of Student's-t distributions outperform other gap classification methods in terms of Recognition Accuracy and F-Measure. Moreover, across all examined benchmarks, the Student's-t mixture produces a perfect segmentation result in significantly more cases than the state-of-the-art Gaussian mixture model.
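As a rough illustration of the gap classification idea described in the abstract, the following sketch fits a two-component, one-dimensional Student's-t mixture to a list of gap distances with EM and labels each gap as inter-word or intra-word. It is not the authors' implementation: the function names, the fixed degrees of freedom (nu = 3), the quartile-based initialization and the toy gap values are all assumptions made for this example.

```python
import math


def student_t_pdf(x, mu, sigma, nu):
    """Density of a location-scale Student's-t distribution at x."""
    z = (x - mu) / sigma
    norm = math.gamma((nu + 1) / 2) / (
        math.gamma(nu / 2) * math.sqrt(nu * math.pi) * sigma
    )
    return norm * (1 + z * z / nu) ** (-(nu + 1) / 2)


def fit_t_mixture(gaps, nu=3.0, iters=200):
    """EM for a two-component 1-D Student's-t mixture (hypothetical helper).

    nu is kept fixed (an assumption); only weights, locations and scales
    are re-estimated.
    """
    xs = sorted(gaps)
    n = len(xs)
    # Assumed initialization: lower and upper quartile as starting locations.
    mu = [xs[n // 4], xs[(3 * n) // 4]]
    spread = max((mu[1] - mu[0]) / 2, 1.0)
    sigma = [spread, spread]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities and the per-point robustness weights u.
        resp, u = [], []
        for x in gaps:
            dens = [pi[k] * student_t_pdf(x, mu[k], sigma[k], nu) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
            # Points far from a component's location get a small weight,
            # which is what limits the influence of outliers.
            u.append([(nu + 1) / (nu + ((x - mu[k]) / sigma[k]) ** 2) for k in range(2)])
        # M-step: weighted updates of mixing weights, locations and scales.
        for k in range(2):
            r_sum = sum(r[k] for r in resp)
            ru_sum = sum(r[k] * w[k] for r, w in zip(resp, u))
            pi[k] = r_sum / n
            mu[k] = sum(r[k] * w[k] * x for r, w, x in zip(resp, u, gaps)) / ru_sum
            var = sum(
                r[k] * w[k] * (x - mu[k]) ** 2 for r, w, x in zip(resp, u, gaps)
            ) / r_sum
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor to avoid collapse
    return pi, mu, sigma


def classify_gaps(gaps, pi, mu, sigma, nu=3.0):
    """Return True for gaps assigned to the component with the larger location."""
    word = 0 if mu[0] > mu[1] else 1
    labels = []
    for x in gaps:
        dens = [pi[k] * student_t_pdf(x, mu[k], sigma[k], nu) for k in range(2)]
        labels.append(dens[word] > dens[1 - word])
    return labels


# Toy gap distances (made up): small intra-word gaps, larger inter-word gaps,
# and one very large outlier.
gaps = [2, 3, 2, 4, 3, 2, 3, 12, 14, 13, 15, 60]
pi, mu, sigma = fit_t_mixture(gaps)
is_word_gap = classify_gaps(gaps, pi, mu, sigma)
```

In this sketch the weight `u` shrinks for points far from a component's location, so a single very large gap has little pull on the estimated inter-word location. A Gaussian mixture has no such down-weighting, which is the robustness property the abstract refers to.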
Year
2016
DOI
10.1109/DAS.2016.35
Venue
2016 12th IAPR Workshop on Document Analysis Systems (DAS)
Keywords
Word Segmentation, Student's-t Distribution, Finite mixture models, Robust models
Field
Scale-space segmentation, Pattern recognition, Computer science, Segmentation, Student's t-distribution, Outlier, Segmentation-based object categorization, Speech recognition, Text segmentation, Image segmentation, Artificial intelligence, Mixture model
DocType
Conference
Citations
1
PageRank
0.37
References
12
Authors
4
Name                    Order  Citations  PageRank
Georgios Louloudis      1      81         9.54
G. Sfikas               2      155        14.23
Nikolaos Stamatopoulos  3      20         2.79
Basilis Gatos           4      773        43.34