Weakly Supervised Convolutional LSTM Approach for Tool Tracking in Laparoscopic Videos (IPCAI 2019)
C.I. Nwoye, D. Mutter, J. Marescaux, N. Padoy
This repository contains the inference demo and evaluation scripts.
This model is a re-implementation of the deep learning model for tracking surgical tools in laparoscopic videos using a weakly supervised Convolutional LSTM approach. The network learns tool detection, localization and tracking from image-level labels. The approach combines a CNN with a Convolutional LSTM (ConvLSTM), trained end-to-end but weakly supervised on binary tool presence labels only. We use the ConvLSTM to model the temporal dependencies in the motion of the surgical tools and leverage its spatio-temporal ability to smooth the class peak activations in the localization heat maps (Lh-maps). Remarkably, the model learns the spatio-temporal information from binary presence labels alone. The model achieved state-of-the-art performance on tool detection, localization and tracking among weakly supervised models.
Several variants of the tracker are described in the paper, but this GitLab repo provides code only for the R+CL+C variant, which can easily be modified to obtain the other variants (see the published paper).
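To make the weak supervision idea concrete, here is a minimal NumPy sketch of how per-class localization heat maps can yield both an image-level presence score and a tool position. It assumes one heat map per tool class and a global spatial max as the pooling step; the pooling used in the paper may differ, and the function name and the zero threshold are hypothetical.

```python
import numpy as np

def presence_and_peaks(lh_maps):
    """Pool per-class heat maps into presence scores and peak positions.

    lh_maps: array of shape (H, W, C), one localization heat map per class.
    Returns (scores, peaks): scores[c] is the spatial maximum of map c,
    peaks[c] is the (row, col) where that maximum occurs.
    Assumption: global max pooling; the published model may pool differently.
    """
    h, w, c = lh_maps.shape
    flat = lh_maps.reshape(h * w, c)      # one column per class
    idx = flat.argmax(axis=0)             # flat index of each class peak
    scores = flat.max(axis=0)
    peaks = [(int(i // w), int(i % w)) for i in idx]
    return scores, peaks

# Toy example with two classes on a 4x5 grid.
maps = np.full((4, 5, 2), -3.0, dtype=np.float32)
maps[1, 3, 0] = 2.5    # class 0 has a strong peak
maps[2, 0, 1] = -1.0   # class 1 stays below the threshold
scores, peaks = presence_and_peaks(maps)
present = scores > 0.0  # hypothetical decision threshold on the logits
```

During training only the pooled scores would be compared with the binary presence labels; the peak positions come for free at inference time.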
Code list:
- model.py: the model implementation (with all the necessary libs)
- evaluation.ipynb: the evaluation and demo script (jupyter notebook)
- lib: the model depends on three utility files: resnet, resnet_utils and convlstm_cell. The convlstm_cell file is a modified version of the original ConvLSTM library, patched to work around bugs in the contrib release (TensorFlow may fix these in a future release). The one-shot state initialization and the subsequent state propagation between batches are implemented in model.py, so frame seek information (the frame index number) is needed to enforce this continuity in the video data.
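The role of the frame index in state continuity can be sketched as follows. This is only an illustration of the idea, not the logic in model.py: it assumes that frames inside a batch are consecutive, that a batch continues the previous one when its first index follows the previous last index, and that the state is re-initialized otherwise. All names here are hypothetical.

```python
def needs_state_reset(prev_last_index, batch_first_index):
    """Return True when the recurrent state should be re-initialized."""
    if prev_last_index is None:          # very first batch: one-shot init
        return True
    return batch_first_index != prev_last_index + 1

def run_batches(batches, step_fn, init_state):
    """Run step_fn over batches, carrying the state across contiguous ones.

    batches: iterable of (frame_indices, frames) pairs.
    step_fn: callable (frames, state) -> (outputs, new_state).
    init_state: callable returning a fresh initial state.
    """
    state, prev_last = None, None
    for frame_indices, frames in batches:
        if needs_state_reset(prev_last, frame_indices[0]):
            state = init_state()
        outputs, state = step_fn(frames, state)
        prev_last = frame_indices[-1]
        yield outputs

# Toy stand-in for the network: the "state" counts frames seen since reset.
def count_step(frames, state):
    state = state + len(frames)
    return state, state

batches = [
    ([0, 1, 2], ["f0", "f1", "f2"]),
    ([3, 4], ["f3", "f4"]),        # continues the first batch
    ([0, 1], ["g0", "g1"]),        # index jumps back: new video, reset
]
results = list(run_batches(batches, count_step, lambda: 0))
```

In the toy run the second batch inherits the state of the first, while the third starts from a fresh state because its frame index does not follow on.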
The model is trained on Cholec80, the largest public endoscopic video dataset to date. To reproduce this model on a custom dataset, users can write their own training script. All the necessary hyperparameters are given in the published paper and can be tuned to suit your task and dataset. The model can be trained on sequential frames or on video data (with frame-wise binary presence labels).
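For a custom training script, frame-wise binary presence labels can be held as one multi-hot vector per annotated frame. The sketch below parses a whitespace-separated table with a header row naming the seven Cholec80 tools; the exact file layout and the class order expected by this model are assumptions, so check them against your data and the paper. The function name is hypothetical.

```python
import numpy as np

# The seven tool classes annotated in Cholec80. The order the model
# expects is an assumption here.
TOOLS = ["Grasper", "Bipolar", "Hook", "Scissors",
         "Clipper", "Irrigator", "SpecimenBag"]

def parse_presence(text):
    """Map frame index -> multi-hot presence vector (float32, length 7).

    Assumes a header row containing 'Frame' followed by tool names, then
    one row per annotated frame with 0/1 flags.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    cols = [header.index(t) for t in TOOLS]   # tolerate reordered columns
    labels = {}
    for ln in lines[1:]:
        fields = ln.split()
        labels[int(fields[0])] = np.array(
            [int(fields[c]) for c in cols], dtype=np.float32)
    return labels

sample = """Frame Grasper Bipolar Hook Scissors Clipper Irrigator SpecimenBag
0 1 0 0 0 0 0 0
25 1 0 1 0 0 0 0
"""
labels = parse_presence(sample)
```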
No data preprocessing is required.
If you clone this repo, running the evaluation script will automatically download the model weights (into the ckpt directory) and a shortened sample test video (into the data directory) for model testing. If you redirect these downloads to a different location, also change their paths in the Variables & Device setup section of the jupyter notebook. Feel free to try the model on other laparoscopic videos.
The model depends on the following libraries:
- TensorFlow (1.3 < tf < 2.0)
- ffmpeg
- OpenCV
- imageio
- imageio-ffmpeg
- matplotlib
- Python >= 2.7
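A quick way to check the TensorFlow constraint before running the notebook is sketched below. It compares only the major and minor version numbers with strict inequalities, so it treats 1.3.x as unsupported; that is one reading of "1.3 < tf < 2.0" and is an assumption. The helper name is hypothetical.

```python
import re

def tf_version_supported(version):
    """Return True if major.minor lies strictly between 1.3 and 2.0."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major_minor = (int(match.group(1)), int(match.group(2)))
    return (1, 3) < major_minor < (2, 0)

# Typical use, once TensorFlow is installed:
#   import tensorflow as tf
#   assert tf_version_supported(tf.__version__)
```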
The code has been tested on the Linux operating system. It runs on both CPU and GPU. Equivalents of basic OS commands such as unzip, cd, wget, etc. will be needed to run it on Windows or Mac OS.
The test results are contained in the published paper. Qualitative results can be found on the CAMMA ICube YouTube channel: video 1 and video 2
If you use a whole or part of the code, data, model weights or any idea contained herein in your research, please cite this paper:
Nwoye, C.I., Mutter, D., Marescaux, J. and Padoy, N., 2019. Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos. International journal of computer assisted radiology and surgery, 14(6), pp.1059-1067.
@article{nwoye2019weakly,
title={Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos},
author={Nwoye, Chinedu Innocent and Mutter, Didier and Marescaux, Jacques and Padoy, Nicolas},
journal={International journal of computer assisted radiology and surgery},
volume={14},
number={6},
pages={1059--1067},
year={2019},
publisher={Springer}
}
This code, the models, and the datasets are available for non-commercial scientific research purposes under the CC BY-NC-SA 4.0 license, attached as the LICENSE file. By downloading and using this code you agree to the terms in the LICENSE. Third-party code is subject to its respective licenses.
This repo is maintained by CAMMA. Comments are welcome. Check back for updates.