Automatic Feature Selection with Applications to Script Identification of Degraded Documents - Citegraph

Paper Info

Title
Automatic Feature Selection with Applications to Script Identification of Degraded Documents

Abstract
Current approaches to script identification rely on hand-selected features and often require processing a significant part of the document to achieve reliable identification. We present an approach that applies a large pool of image features to a small training sample and uses subset feature selection techniques to automatically select a subset with the most discriminating power. At run time we use a classifier coupled with an evidence accumulation engine to report a script label once a preset likelihood threshold has been reached. We apply the system to a diverse corpus of printed Russian and English documents that suffer from common degradation problems. Our validation study shows promising results both in terms of the script identification accuracy and the ability to identify script on the scale of individual words and text lines.
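
The two stages named in the abstract can be illustrated with a minimal Python sketch. Nothing below is taken from the paper itself: greedy forward selection stands in for the unspecified "subset feature selection techniques", and a running log-odds sum with equal priors and independent per-word evidence stands in for the unspecified evidence accumulation engine. The names `forward_select`, `accumulate_evidence`, `score`, and `word_probs` are hypothetical.

```python
import math


def forward_select(n_features, score, k):
    """Greedy forward selection (an assumed stand-in, not the paper's method).

    `score` is a hypothetical callable that rates a list of feature indices,
    for example by cross-validated accuracy on the small training sample.
    """
    selected = []
    remaining = set(range(n_features))
    while remaining and len(selected) < k:
        # Add the feature that most improves the score of the current subset.
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(remaining), key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def accumulate_evidence(word_probs, threshold=0.95,
                        labels=("english", "russian")):
    """Accumulate per-word evidence until a likelihood threshold is reached.

    `word_probs` yields P(labels[0] | word) from a hypothetical per-word
    classifier. Assumes equal priors and independent words, so the posterior
    log-odds is the sum of the per-word log-odds.
    Returns (label, words_used), or (None, words_used) if undecided.
    """
    bound = math.log(threshold / (1.0 - threshold))
    log_odds = 0.0
    used = 0
    for used, p in enumerate(word_probs, start=1):
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the logarithm finite
        log_odds += math.log(p / (1.0 - p))
        if log_odds >= bound:
            return labels[0], used
        if log_odds <= -bound:
            return labels[1], used
    return None, used


if __name__ == "__main__":
    weights = [0.1, 0.9, 0.5]  # toy per-feature usefulness, illustration only
    print(forward_select(3, lambda subset: sum(weights[f] for f in subset), 2))
    print(accumulate_evidence([0.8, 0.8, 0.8]))
```

In this sketch a decision can be reported after only a few words when the per-word evidence is consistent, which corresponds to the abstract's goal of identifying script without processing a significant part of the document.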

Year	DOI	Venue
2003	10.1109/ICDAR.2003.1227762	ICDAR-1
Keywords	Field	DocType
from common degradation problem,degraded documents,english document,script identification,reliable identification,current approach,script identification accuracy,automatic feature selection,evidence accumulation,diverse corpus of,script label,individual word,engines,feature selection,image features,degradation,image quality,pattern recognition,testing	Computer vision,Pattern recognition,Feature selection,Computer science,Image quality,Speech recognition,Artificial intelligence,Classifier (linguistics),Text recognition	Conference
ISSN	ISBN	Citations
1520-5363	0-7695-1960-1	11
PageRank	References	Authors
0.70	10	2

Authors (2 rows)

Name	Order	Citations	PageRank
Vitaly Ablavsky	1	84	7.16
Mark R. Stevens	2	75	8.93
