Abstract |
---|
There is a significant need for a realistic dataset on which to evaluate layout analysis methods and examine their performance in detail. This paper presents a new dataset (and the methodology used to create it) based on a wide range of contemporary documents. Strong emphasis is placed on comprehensive and detailed representation of both complex and simple layouts, and on colour originals. In-depth information is recorded both at the page and region level. Ground truth is efficiently created using a new semi-automated tool and stored in a new comprehensive XML representation, the PAGE format. The dataset can be browsed and searched via a Web-based front end to the underlying database and suitable subsets (relevant to specific evaluation goals) can be selected and downloaded. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/ICDAR.2009.271 | ICDAR-1 |
Keywords | Field | DocType |
comprehensive xml representation,layout analysis,new comprehensive xml representation,contemporary document,region classification,xml,page format,online front-ends,ground truth,detailed representation,realistic dataset,page segmentation,software performance evaluation,new dataset,datasets,in-depth information,ground truth format,document layout analysis,new semi-automated tool,performance evaluation,colour original,document handling,web-based front end,contemporary documents,pattern recognition,text analysis,pattern analysis,image recognition,front end,layout,databases,quality control,image analysis,biomedical imaging,data engineering | Front and back ends,Data mining,Information retrieval,XML,Computer science,Pattern analysis,Document layout analysis,Ground truth,Information engineering,Document handling | Conference |
ISSN | ISBN | Citations |
1520-5363 | 978-0-7695-3725-2 | 34 |
PageRank | References | Authors |
2.27 | 4 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Apostolos Antonacopoulos | 1 | 378 | 36.45 |
David Bridson | 2 | 105 | 9.12 |
Christos Papadopoulos | 3 | 58 | 4.06 |
Stefan Pletschacher | 4 | 216 | 20.78 |