Abstract |
---|
The Marsden Collection is an important source of great interest to historians of New Zealand, but most of the pages are untranscribed and therefore keyword search is impossible. Stacked hourglass networks are a new architecture that has been shown to work on tasks that are relevantly similar to keyword search - they are capable of precisely locating semantic parts of objects (in the case of keyword search, the objects are words and the semantic parts are characters). We implemented a stacked hourglass network that we trained on simplified data generated out of individual characters drawn from the Marsden Collection. It learned to not just identify, but locate the characters in the image. Despite training on simplified data, our network can also correctly identify letters in unseen documents with 59% accuracy, which rises to 80% when considering the top 5 hypotheses. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/IVCNZ.2018.8634694 | IVCNZ |
Keywords | Field | DocType |
Training,Heating systems,Task analysis,Character recognition,Keyword search,Semantics,Optical character recognition software | Architecture,Pattern recognition,Character recognition,Hourglass,Task analysis,Computer science,Keyword search,Artificial intelligence,Natural language processing,Semantics,Optical character recognition software | Conference |
ISSN | ISBN | Citations |
2151-2191 | 978-1-7281-0125-5 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hannah Clark-Younger | 1 | 0 | 0.34 |
Steven Mills | 2 | 41 | 17.74 |
Lech Szymanski | 3 | 28 | 6.78 |