m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks
Autonomous surgical procedures, in particular minimally invasive surgeries, are
the next frontier for Artificial Intelligence research. However, the existing
challenges include precise identification of the human anatomy and the surgical
settings, and modeling the environment for training an autonomous agent. To
address the identification of human anatomy and the surgical settings, we
propose a deep learning based semantic segmentation algorithm to identify and
label the tissues and organs in the endoscopic video feed of the human torso
region. We present an annotated dataset, m2caiSeg, created from endoscopic
video feeds of real-world surgical procedures. Overall, the data consists of
307 images, each of which is annotated for the organs and different surgical
instruments present in the scene. We propose and train a deep convolutional neural network for the semantic segmentation task. To compensate for the limited amount of annotated data, we use unsupervised pre-training and data augmentation.
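As a rough illustration, a minimal encoder-decoder network of the kind used for such pixel-wise segmentation could look as follows in PyTorch; the layer configuration, channel widths, and class count are assumptions made for this sketch, not the paper's exact architecture.

```python
# Minimal encoder-decoder segmentation sketch (illustrative only; the paper's
# actual architecture, channel widths, and label count may differ).
import torch
import torch.nn as nn

class SegSketch(nn.Module):
    def __init__(self, num_classes=20):  # hypothetical number of labels
        super().__init__()
        # Encoder: progressively downsample the RGB frame into features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Decoder: upsample back to input resolution, one score map per class.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):
        # Input: (N, 3, H, W) with H, W divisible by 8.
        return self.decoder(self.encoder(x))  # (N, num_classes, H, W) logits

x = torch.randn(1, 3, 256, 256)      # dummy endoscopic frame
mask = SegSketch()(x).argmax(dim=1)  # per-pixel class prediction, (1, 256, 256)
```

One common way to realize the unsupervised pre-training step is to first train the encoder inside an autoencoder that reconstructs unlabeled frames, and then attach and fine-tune the segmentation decoder on the annotated images.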
The trained model is evaluated on an independent test set of the proposed dataset. Using all the labeled categories for the semantic segmentation task, we obtained an F1 score of 0.33. Additionally, we grouped all instrument classes into a single 'Instruments' superclass to evaluate the model's ability to discern the various organs, and obtained an F1 score of 0.57.
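For concreteness, the sketch below shows one way a per-class F1 evaluation and the 'Instruments' superclass remapping could be computed; the label ids, class count, and instrument id range are hypothetical placeholders rather than the dataset's actual label map.

```python
# Illustrative evaluation sketch: per-class F1 over pixel-level label maps,
# plus a remapping that collapses individual instrument labels into a single
# 'Instruments' superclass. The label ids below are hypothetical placeholders.
import numpy as np

def f1_per_class(pred, target, num_classes):
    """F1 score for each class over flattened prediction/ground-truth maps."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (target == c))
        fp = np.sum((pred == c) & (target != c))
        fn = np.sum((pred != c) & (target == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return scores

def to_superclass(labels, instrument_ids, superclass_id):
    """Collapse all instrument label ids into one superclass id."""
    out = labels.copy()
    out[np.isin(out, instrument_ids)] = superclass_id
    return out

# Stand-in arrays; in practice these come from the model and the annotations.
pred = np.random.randint(0, 20, size=(256, 256))
target = np.random.randint(0, 20, size=(256, 256))
print(np.mean(f1_per_class(pred, target, 20)))      # all categories
pred_s = to_superclass(pred, range(10, 20), 10)     # hypothetical instrument ids
target_s = to_superclass(target, range(10, 20), 10)
print(np.mean(f1_per_class(pred_s, target_s, 11)))  # organs + superclass
```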
We propose a new dataset and a deep learning method for pixel-level identification of various organs and instruments in an endoscopic surgical
scene. Surgical scene understanding is one of the first steps towards
automating surgical procedures.
Authors
Salman Maqbool, Aqsa Riaz, Hasan Sajid, Osman Hasan