Title
Preterm infants' pose estimation with spatio-temporal features.
Abstract
Objective: Preterm infants’ limb monitoring in neonatal intensive care units (NICUs) is of primary importance for assessing infants’ health status and motor/cognitive development. Herein, we propose a new approach to preterm infants’ limb-pose estimation that exploits spatio-temporal information to detect and track limb joints from depth videos with high reliability.
Methods: Limb-pose estimation is performed using a deep-learning framework consisting of a detection and a regression convolutional neural network (CNN) for rough and precise joint localization, respectively. The CNNs are implemented to encode connectivity in the temporal direction through 3D convolution. The proposed framework is assessed through a comprehensive study with sixteen depth videos acquired in actual clinical practice from sixteen preterm infants (the babyPose dataset).
Results: When applied to pose estimation, the median root mean square distance, computed among all limbs, between the estimated and the ground-truth pose was 9.06 pixels, outperforming approaches based on spatial features only (11.27 pixels).
Conclusion: Results showed that the spatio-temporal features had a significant influence on the pose-estimation performance, especially in challenging cases (e.g., homogeneous image intensity).
Significance: This article significantly advances the state of the art in automatic assessment of preterm infants’ health status by introducing the use of spatio-temporal features for limb detection and tracking, and by being the first study to use depth videos acquired in actual clinical practice for limb-pose estimation. The babyPose dataset has been released as the first annotated dataset for infants’ pose estimation.
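The abstract reports its results as a median root mean square distance, in pixels, between estimated and ground-truth poses. The sketch below shows one way such a figure could be computed. It assumes that joints are 2D pixel coordinates, that the RMS distance is taken over the joints of a single frame, and that the median is taken over frames; the abstract does not spell out these details, and the function names `rms_distance` and `median_rms_distance` are hypothetical.

```python
import math
from statistics import median


def rms_distance(estimated, ground_truth):
    """Root mean square Euclidean distance (pixels) between matched joints.

    Both arguments are equal-length lists of (x, y) pixel coordinates.
    Assumption: joints are matched by position in the list.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("joint lists must be non-empty and of equal length")
    squared = [
        (xe - xg) ** 2 + (ye - yg) ** 2
        for (xe, ye), (xg, yg) in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def median_rms_distance(frames):
    """Median of per-frame RMS distances.

    `frames` is an iterable of (estimated, ground_truth) pairs.
    Assumption: the median is taken across frames.
    """
    return median(rms_distance(est, gt) for est, gt in frames)
```

For example, three single-joint frames with errors of 5, 0, and 10 pixels give a median RMS distance of 5 pixels under these assumptions.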
Year: 2020
DOI: 10.1109/TBME.2019.2961448
Venue: IEEE Transactions on Biomedical Engineering
Keywords: Pediatrics, Videos, Feature extraction, Monitoring, Pose estimation, Hip
DocType: Journal
Volume: 67
Issue: 8
ISSN: 0018-9294
Citations: 1
PageRank: 0.37
References: 0
Authors: 4
Name                 Order  Citations  PageRank
Sara Moccia          1      38         9.44
Lucia Migliorelli    2      1          0.37
Virgilio Carnielli   3      1          0.37
Emanuele Frontoni    4      248        47.04