Title: Distinguishing Unseen from Seen for Generalized Zero-shot Learning
Abstract
Generalized zero-shot learning (GZSL) aims to recognize samples whose categories may not have been seen during training. Misclassifying unseen classes as seen ones, or vice versa, is a major source of poor GZSL performance, so distinguishing the seen and unseen domains is a natural yet challenging route to improvement. In this paper, we present a novel method that leverages both visual and semantic modalities to distinguish seen from unseen categories. Specifically, our method deploys two variational autoencoders to generate latent representations for the visual and semantic modalities in a shared latent space, where we align the latent representations of the two modalities by Wasserstein distance and reconstruct each modality from the representations of the other. To learn a clearer boundary between seen and unseen classes, we propose a two-stage training strategy that exploits both seen and unseen semantic descriptions and searches for a threshold to separate seen from unseen visual samples. Finally, a seen expert and an unseen expert perform the final classification. Extensive experiments on five widely used benchmarks verify that the proposed method significantly improves GZSL results. For instance, our method correctly recognizes more than 99% of samples when separating the domains and improves the final classification accuracy from 72.6% to 82.9% on AWA1.
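The abstract mentions two concrete ingredients: aligning Gaussian latent representations with a Wasserstein distance, and searching for a threshold that separates seen from unseen samples. As a rough illustration only (the function names, the diagonal-Gaussian assumption, and the scoring scheme are assumptions for this sketch, not the authors' implementation), these two pieces can be sketched as:

```python
import numpy as np

def gaussian_w2(mu1, var1, mu2, var2):
    """Closed-form 2-Wasserstein distance between two diagonal Gaussians
    N(mu1, diag(var1)) and N(mu2, diag(var2)), e.g. the visual and
    semantic latent distributions in a shared latent space."""
    mean_term = np.sum((mu1 - mu2) ** 2)
    cov_term = np.sum((np.sqrt(var1) - np.sqrt(var2)) ** 2)
    return np.sqrt(mean_term + cov_term)

def best_threshold(seen_scores, unseen_scores):
    """Grid-search a score threshold separating seen from unseen samples,
    maximizing the mean of the two per-domain accuracies. The 'scores'
    here stand in for whatever seen-ness score the model produces."""
    candidates = np.sort(np.concatenate([seen_scores, unseen_scores]))
    best_t, best_acc = candidates[0], 0.0
    for t in candidates:
        acc_seen = np.mean(seen_scores >= t)     # seen samples kept above t
        acc_unseen = np.mean(unseen_scores < t)  # unseen samples fall below t
        acc = 0.5 * (acc_seen + acc_unseen)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```

With well-separated scores the search recovers a threshold achieving perfect domain separation, after which each sample can be routed to the seen or unseen expert.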
Year: 2022
DOI: 10.1109/CVPR52688.2022.00773
Venue: IEEE Conference on Computer Vision and Pattern Recognition
Keywords: Image and video synthesis and generation, Computer vision theory, Machine learning, Transfer/low-shot/long-tail learning
DocType: Conference
Volume: 2022
Issue: 1
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name | Order | Citations | PageRank
Hongzu Su | 1 | 0 | 1.01
Jingjing Li | 2 | 597 | 44.26
Zhi Chen | 3 | 0 | 0.68
Lei Zhu | 4 | 854 | 51.69
Ke Lu | 5 | 279 | 18.85