Simulation-to-Real domain adaptation with teacher-student learning for endoscopic instrument segmentation
Purpose: Segmentation of surgical instruments in endoscopic videos is
essential for automated surgical scene understanding and process modeling.
However, relying on fully supervised deep learning for this task is challenging
because manual annotation consumes the valuable time of clinical experts.
Methods: We introduce a teacher-student learning approach that learns jointly
from annotated simulation data and unlabeled real data to tackle the erroneous
learning problem of the current consistency-based unsupervised domain
adaptation framework.
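The joint objective described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the losses or how the teacher is obtained, so the exponential-moving-average teacher (`ema_update`), the 0.5 pseudo-label threshold, the binary cross-entropy terms, and the names `joint_loss`, `alpha`, and `weight` are all hypothetical choices borrowed from common teacher-student setups.

```python
import math

def ema_update(teacher, student, alpha=0.99):
    """Assumed teacher update: exponential moving average of student weights."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean per-pixel binary cross-entropy; probs are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(probs)

def joint_loss(student_sim, sim_labels, student_real, teacher_real, weight=1.0):
    """Supervised loss on annotated simulation pixels plus a pseudo-label
    loss on unlabeled real pixels (hypothetical formulation)."""
    supervised = binary_cross_entropy(student_sim, sim_labels)
    # Assumption: the teacher's thresholded prediction is the target on real data.
    pseudo_labels = [1.0 if p >= 0.5 else 0.0 for p in teacher_real]
    unsupervised = binary_cross_entropy(student_real, pseudo_labels)
    return supervised + weight * unsupervised

# Toy usage: flattened per-pixel instrument probabilities.
loss = joint_loss(
    student_sim=[0.9, 0.2], sim_labels=[1.0, 0.0],
    student_real=[0.7, 0.4], teacher_real=[0.8, 0.1],
)
teacher_weights = ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.9)
```

In this sketch the student is the only model trained by gradient descent; the teacher merely tracks it, which is one common way to obtain more stable targets on the unlabeled real frames.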
Results: Empirical results on three datasets highlight the effectiveness of
the proposed framework over current approaches for the endoscopic instrument
segmentation task. Additionally, we provide an analysis of the major factors affecting
the performance on all datasets to highlight the strengths and failure modes of
our approach.
Conclusion: We show that our proposed approach can successfully exploit the
unlabeled real endoscopic video frames and improve generalization performance
over pure simulation-based training and the previous state-of-the-art. This
takes us one step closer to effective segmentation of surgical tools in the
annotation-scarce setting.