Title
Visual Descriptor Extraction from Patent Figure Captions: A Case Study of Data Efficiency Between BiLSTM and Transformer
Abstract
Technical drawings used to illustrate designs are ubiquitous in patent documents, especially design patents. Unlike natural images, these drawings are usually made of black strokes with little color information, making it challenging for models trained on natural images to recognize the depicted objects. To facilitate indexing and search, we propose an effective and efficient visual descriptor model that extracts object names and aspects from patent captions to annotate benchmark patent figure datasets. We compared two state-of-the-art named entity recognition (NER) models and found that, with a limited number of annotated samples, the BiLSTM-CRF model outperforms the Transformer model by a significant margin, achieving an overall F1 = 96.60%. We further conducted a data efficiency study by varying the number of training samples and found that the BiLSTM consistently beats the Transformer model on our task. The proposed model is used to annotate a benchmark patent figure dataset.

CCS Concepts: • Computing methodologies → Information extraction.
Year: 2022
DOI: 10.1145/3529372.3533299
Venue: 2022 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
Keywords: NLP, NER, big data, entity recognition, deep learning
DocType: Conference
ISSN: 2575-7865
ISBN: 978-1-6654-9155-6
Citations: 0
PageRank: 0.34
References: 7
Authors: 4
Name           Order  Citations  PageRank
Xin Wei        1      0          0.34
Jian Wu        2      0          2.37
Kehinde Ajayi  3      0          0.34
Diane Oyen     4      0          0.68