Abstract | ||
---|---|---|
Scene text detection and scene segmentation are meaningful tasks in the computer vision field. Could semantic scene segmentation assist scene text detection to any degree? For example, can we expect the probability of a region being text to be low if its surrounding segment, i.e., its context, is labeled as sky? In this paper, we answer positively by constructing a scene context-based text detection model. In this model, we use texton features and a fully-connected conditional random field (CRF) to estimate pixel-level scene-class probabilities, which serve as the image's context feature. Meanwhile, maximally stable extremal regions (MSERs) are extracted, integrated and extended as image patches of character candidates. Then, each image patch is fed to a simple two-layer convolutional neural network (CNN) to automatically extract its character feature. The context feature averaged over the corresponding patch is taken as the patch's context feature. The character feature and context feature are fused as the input to a support vector machine for text/non-text determination. Finally, as post-processing, neighboring text regions are grouped hierarchically. The performance evaluation on the ICDAR2013 and SVT databases, as well as a preliminary evaluation on a patch-level database, shows that scene context can improve the performance of scene text detection. Moreover, the comparative study with state-of-the-art methods demonstrates the top-level performance of our method. Highlights: It provides a new perspective for scene text detection. The semantic scene segmentation can assist scene text detection. The character feature and context feature are fused for text/non-text classification. Four controlled-variable comparison experiments are run on a patch-level dataset. |
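
The patch-level context averaging and feature fusion described in the abstract can be illustrated with a minimal sketch. It assumes the per-pixel scene-class probabilities are already available as a nested list of shape height x width x classes, and that fusion is plain concatenation, which the abstract does not specify. The names `average_context_feature`, `fuse_features`, and the box layout are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def average_context_feature(
    prob_map: Sequence[Sequence[Sequence[float]]], box: Box
) -> List[float]:
    """Average the per-pixel scene-class probabilities inside one patch."""
    top, left, bottom, right = box
    n_classes = len(prob_map[0][0])
    sums = [0.0] * n_classes
    count = 0
    for row in prob_map[top:bottom]:
        for pixel in row[left:right]:
            for k, p in enumerate(pixel):
                sums[k] += p
            count += 1
    if count == 0:
        raise ValueError("patch is empty")
    return [s / count for s in sums]


def fuse_features(
    char_feature: Sequence[float], context_feature: Sequence[float]
) -> List[float]:
    """Fuse by concatenation (an assumption; the abstract only says 'fused')."""
    return list(char_feature) + list(context_feature)


# Toy 2x2 probability map with two scene classes, e.g. (sky, building).
prob_map = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.5, 0.5]],
]
context = average_context_feature(prob_map, (0, 0, 1, 2))  # top row only
fused = fuse_features([0.1, 0.2], context)  # placeholder character feature
```

In the pipeline the abstract describes, a fused vector like this would be the input to the support vector machine; the character feature here is a placeholder standing in for the CNN output.
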
Year | DOI | Venue |
---|---|---|
2016 | 10.1016/j.patcog.2016.04.011 | Pattern Recognition |
Keywords | Field | DocType |
Scene text detection,Fully connected CRF,Convolutional neural network,Character feature,Context feature | Convolutional neural network,Computer science,Artificial intelligence,Conditional random field,Computer vision,Pattern recognition,Texton,Feature (computer vision),Support vector machine,Maximally stable extremal regions,Scene segmentation,Machine learning,Text detection | Journal |
Volume | Issue | ISSN |
58 | C | 0031-3203 |
Citations | PageRank | References |
6 | 0.43 | 41 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Anna Zhu | 1 | 14 | 3.64 |
Renwu Gao | 2 | 6 | 0.77 |
Seiichi Uchida | 3 | 790 | 105.59 |