Robust Math Formula Recognition in Degraded Chinese Document Images - Citegraph

Paper Info

Title
Robust Math Formula Recognition in Degraded Chinese Document Images

Abstract
In this paper, we study the problem of math formula recognition (MFR) in degraded Chinese document images. Compared to traditional optical character recognition (OCR), MFR brings new challenges in character segmentation and structural analysis, especially in degraded images. To tackle these issues, we propose an over-segmentation strategy that splits and recognizes adhesive (i.e., touching) formula elements using a convolutional neural network (CNN). In addition, we propose a hierarchical framework for formula structure analysis that builds the formula in a top-down manner, iteratively splitting regions into recognizable units. Because the community lacks degraded Chinese document images with math formulas, we also collect a diverse ground-truth dataset of 100 images submitted by users of our system. Extensive experiments demonstrate the effectiveness and robustness of the proposed method in comparison with state-of-the-art methods.

Year	DOI	Venue
2017	10.1109/ICDAR.2017.27	2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
Keywords	Field	DocType
Math Formula Recognition,Chinese Document Image,Convolutional Neural Network	Structure analysis,Pattern recognition,Character recognition,Convolutional neural network,Segmentation,Computer science,Optical character recognition,Image segmentation,Robustness (computer science),Artificial intelligence,Text recognition	Conference
Volume	ISSN	ISBN
01	1520-5363	978-1-5386-3587-2
Citations	PageRank	References
1	0.35	6
Authors
7

Authors (7 rows)

Name	Order	Citations	PageRank
Ning Liu	1	88	31.20
Dongxiang Zhang	2	743	43.89
Xing Xu	3	764	62.73
Long Guo	4	11	3.67
Lijiang Chen	5	304	23.22
Wenju Liu	6	214	39.32
Dengfeng Ke	7	12	6.51
