Title
3D human pose estimation with cross-modality training and multi-scale local refinement
Abstract
This paper studies two major research problems in 3D human pose estimation from depth data. First, we seek an effective way to apply an RGB pre-trained 2D CNN model to 3D pose estimation, so as to transfer large-scale RGB annotation information to the depth domain. In particular, we propose a cross-modality CNN training strategy whose key idea is to set a partial Batch Normalization (BN) layer within the RGB pre-trained 2D CNN model to weaken the distribution divergence between RGB and depth data during training. To involve richer 3D descriptive cues, the normal vector map is appended to the raw depth data. Second, although coarse-to-fine human pose estimation with local refinement helps to enhance performance, how to set the optimal local observation scale is not well addressed. To tackle this problem, we propose to fuse multi-scale local information jointly. A multi-scale local refinement network is proposed accordingly, in which the small local region focuses on capturing fine information while the large local region contains richer semantic contextual information. Experiments on two 3D human pose estimation datasets with depth data verify the effectiveness and real-time running capacity of our proposition.
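As a rough illustration of appending a normal vector map to raw depth data, the sketch below estimates per-pixel normals from finite-difference depth gradients and stacks them with the depth channel. It is not the authors' implementation: the function name `depth_with_normals` is hypothetical, and the gradients are taken in pixel coordinates, ignoring camera intrinsics, which is a simplifying assumption.

```python
import numpy as np

def depth_with_normals(depth):
    """Stack a depth map with a per-pixel unit normal map -> array of shape (H, W, 4).

    Hypothetical helper for illustration; assumes pixel-grid spacing of 1
    and does not use camera intrinsics.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # Finite-difference depth gradients along rows (y) and columns (x).
    dz_dy, dz_dx = np.gradient(depth)
    # For a surface z = f(x, y), an unnormalized normal is (-dz/dx, -dz/dy, 1).
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    # The z component is 1, so the norm is at least 1 and the division is safe.
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    # Channel 0 is the raw depth; channels 1..3 are the unit normal.
    return np.concatenate([depth[..., None], normals], axis=-1)
```

For a perfectly flat depth map, every normal produced by this sketch points along the z axis, (0, 0, 1). The resulting 4-channel array is one plausible input layout for a 2D CNN.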
Year
2022
DOI
10.1016/j.asoc.2022.108950
Venue
Applied Soft Computing
Keywords
3D human pose estimation,Deep convolutional neural network,Batch normalization,Normal vector,Multi-scale local refinement
DocType
Journal
Volume
122
ISSN
1568-4946
Citations
0
PageRank
0.34
References
28
Authors
7
Name, Order, Citations, PageRank
Boshen Zhang, 1, 5, 0.82
Yang Xiao, 2, 237, 26.58
Fu Xiong, 3, 0, 0.34
Cunlin Wu, 4, 0, 0.34
Zhiguo Cao, 5, 314, 44.17
Ping Liu, 6, 359, 16.70
Joey Tianyi Zhou, 7, 354, 38.60