Title
Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN.
Abstract
Automatically detecting surgical tools in recorded surgery videos is an important building block for further content-based video analysis. In ophthalmology, the results of such methods can support training and teaching of operation techniques and enable investigation of medical research questions on a dataset of recorded surgery videos. Whereas previous methods used frame-based classification techniques to predict the presence of surgical tools but did not localize them, we apply a recent deep-learning segmentation method (Mask R-CNN) to localize and segment surgical tools used in ophthalmic cataract surgery. We add ground-truth annotations for multi-class instance segmentation to two existing datasets of cataract surgery videos and make the resulting datasets publicly available for research purposes. In the absence of comparable results in the literature, we tune and evaluate the Mask R-CNN approach on these datasets for instrument segmentation and localization and achieve promising results (61% mean average precision at an intersection-over-union threshold of 50% for instance segmentation, with even better results for bounding-box detection and binary segmentation), establishing a reasonable baseline for further research. Moreover, we experiment with common data augmentation techniques and analyze the achieved segmentation performance per instrument class, providing evidence for future improvements of this approach.
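The reported figure rests on an intersection-over-union (IoU) criterion: a predicted instance mask typically counts as a correct detection only if its IoU with a ground-truth mask reaches 50%. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code. The function names `mask_iou` and `is_match` and the toy masks are hypothetical, and a full mean-average-precision computation would additionally rank predictions by confidence and average precision over classes.

```python
def mask_iou(pred, truth):
    """IoU of two binary masks given as equal-sized 2D lists of 0/1 values."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumption: two empty masks are scored 0.0 rather than raising an error.
    return intersection / union if union else 0.0


def is_match(pred, truth, threshold=0.5):
    """True if the predicted mask overlaps the ground truth at or above the threshold."""
    return mask_iou(pred, truth) >= threshold


# Toy example: the prediction covers 4 of the 6 ground-truth pixels and nothing else.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 1],
         [0, 1, 1, 1],
         [0, 0, 0, 0]]

print(mask_iou(pred, truth))   # intersection 4, union 6
print(is_match(pred, truth))   # compared against the 0.5 threshold
```

In this toy case the intersection is 4 pixels and the union is 6, so the prediction passes a 50% threshold but would fail a stricter one such as 70%.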
Year: 2020
DOI: 10.1109/CBMS49503.2020.00112
Venue: CBMS
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name               Order  Citations  PageRank
Markus Fox         1      0          0.34
Mario Taschwer     2      76         9.39
Klaus Schoeffmann  3      509        63.01