Title
Inferred box harmonization and aggregation for degraded face detection in crowds
Abstract
Because objects usually remain at some distance from a surveillance camera, small object detection is a practical issue, and it remains one of the open challenges in the computer vision community. Current detectors usually leverage a more robust backbone network, build one or more multi-scale feature pyramids, or define more precise anchor-box screening criteria. However, distinguishable features are scarce because of appearance degradation and low resolution. In this paper, we leverage high-level context to enhance the capability of anchor-based detectors for small and crowded face detection. We first define a face co-occurrence prior based on density maps (FCP-DM) to explore extensive high-level contextual information. We then propose a score-size-specific non-maximum suppression (S3NMS) to replace the traditional non-maximum suppression at the end of anchor-based detectors. Our approach is plug-and-play and model-independent, and it can be attached to existing anchor-based face detectors without extra learning. Compared with the prior art on the WIDER FACE hard set, our method improves Average Precision by 0.1%-1.3%, while on Crowd Face, a dataset we built for testing small and crowded face detection, it improves Average Precision by 1%-6%. The code and dataset are available online.
Year
2022
DOI
10.1007/s11042-022-12319-y
Venue
Multimedia Tools and Applications
Keywords
Object detection, Degraded face, Video surveillance
DocType
Journal
Volume
81
Issue
24
ISSN
1380-7501
Citations
0
PageRank
0.34
References
4
Authors
5
Name
Order
Citations
PageRank
Liang Dong132652.32
Geng Qixiang200.34
Sun Han300.34
Huiyu Zhou41303111.91
Shun'ichi Kaneko523035.34