This repository is based on VoVNet-v2.
We measure the inference time of all models with batch size 1 on the same RTX 2080 Ti GPU machine, with:
- PyTorch 1.4.0
- CUDA 10.2
- cuDNN 7.3
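The batch-1 latency numbers in the tables below follow the usual warmup-then-average pattern. A minimal sketch of that measurement loop is shown here; `model` is a stand-in callable (on a real GPU you would also call `torch.cuda.synchronize()` before reading the clock, which this sketch omits):

```python
import time

def measure_latency(model, inputs, warmup=10, runs=100):
    """Return average per-call latency in seconds at batch size 1."""
    for _ in range(warmup):      # warm up caches / lazy initialization
        model(inputs)
    start = time.perf_counter()
    for _ in range(runs):
        model(inputs)
    return (time.perf_counter() - start) / runs

# Stand-in "model": identity function on a dummy input.
latency = measure_latency(lambda x: x, None)
print(latency)
```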
Backbone | Param. | lr sched | inference time (s) | AP | AP75 | AP50 | download |
---|---|---|---|---|---|---|---|
MobileNetV2-0.5-64 | N/A | 1x | 0.033 | 43.31 | 44.66 | 78.08 | model \| metrics |
MobileNetV2-0.5 | N/A | 1x | 0.037 | 42.93 | 44.27 | 77.31 | model \| metrics |
MobileNetV2 | 3.5M | 3x | 0.031 | 52.11 | 58.72 | 85.98 | model \| metrics |
MobileNetV2 | 3.5M | 1x | 0.031 | 51.20 | 56.93 | 85.71 | model \| metrics |
MobileNetV2-FLGC | N/A | 1x | 0.030 | 50.59 | 56.05 | 85.21 | model \| metrics |
ShuffleNetV2-0.5 | N/A | 1x | 0.039 | 48.24 | 52.95 | 82.10 | model \| metrics |
ShuffleNetV2 | N/A | 1x | 0.028 | 52.60 | 59.55 | 86.19 | model \| metrics |
V2-19 | 11.2M | 1x | 0.034 | 41.46 | 44.97 | 71.32 | model \| metrics |
V2-19-DW | 6.5M | 1x | N/A | N/A | N/A | N/A | model \| metrics |
V2-19-Slim | 3.1M | 1x | 0.027 | 47.68 | 51.47 | 82.36 | model \| metrics |
V2-19-Slim-DW | 1.8M | 3x | N/A | N/A | N/A | N/A | model \| metrics |
- `-64` denotes `FPN.OUT_CHANNELS = 64`.
- `DW` and `Slim` denote depthwise separable convolution and a thinner model with half the channel size, respectively.
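To see why the `DW` and `Slim` variants have far fewer parameters, compare the parameter counts of a standard 3x3 convolution, its depthwise separable replacement, and the same layer at half width. The channel sizes below are illustrative only, not taken from the actual VoVNet-v2 configs:

```python
def conv_params(c_in, c_out, k=3):
    """Parameters of a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k=3):
    """Depthwise k x k conv (one filter per channel) + pointwise 1x1 conv."""
    return c_in * k * k + c_in * c_out

c_in, c_out = 256, 256
print(conv_params(c_in, c_out))              # 589824
print(dw_separable_params(c_in, c_out))      # 67840  (DW: ~9x smaller)
print(conv_params(c_in // 2, c_out // 2))    # 147456 (Slim: half channels -> 1/4 params)
```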
Backbone | Param. | lr sched | inference time (s) | AP | AP75 | AP50 | download |
---|---|---|---|---|---|---|---|
V2-19-FPN | 37.6M | 3x | N/A | N/A | N/A | N/A | model \| metrics |
R-50-FPN | 51.2M | 3x | N/A | N/A | N/A | N/A | model \| metrics |
V2-39-FPN | 52.6M | 3x | 0.071 | 51.47 | 57.5 | 85.5 | model \| metrics |
Inference time is measured using this command with `--num-gpus 1`:

```
python /path/to/sku110/train_net.py --config-file /path/to/sku110/configs/<config.yaml> --eval-only --num-gpus 1 MODEL.WEIGHTS <model.pth>
```
As this repository is implemented as an extension (in the style of detectron2/projects) on top of detectron2, you only need to install detectron2 by following INSTALL.md.
Prepare the SKU-110K dataset:
- To download the dataset, please visit here.
- Extract the downloaded images to `datasets/sku110/images`.
- Extract `datasets/sku110/Annotations.zip`; it contains two folders, `Annotations` and `ImageSets`.
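After extraction, a quick way to confirm the layout matches what the steps above describe is to check for the three expected directories. This is a hedged sketch, not part of the repository; adjust `DATA_ROOT` if your datasets directory lives elsewhere:

```python
from pathlib import Path

DATA_ROOT = Path("datasets/sku110")

def check_layout(root=DATA_ROOT):
    """Return a {name: exists} map for the directories the README expects."""
    expected = ["images", "Annotations", "ImageSets"]
    return {name: (root / name).is_dir() for name in expected}

print(check_layout())  # all values should be True after extraction
```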
To train a model, run:

```
python /path/to/sku110/train_net.py --config-file /path/to/sku110/configs/<config.yaml>
```
For example, to launch end-to-end Faster R-CNN training with the VoVNetV2-39 backbone on 8 GPUs, one should execute:

```
python /path/to/sku110/train_net.py --config-file /path/to/sku110/configs/faster_rcnn_V_39_FPN_3x.yaml --num-gpus 8
```
Model evaluation can be done similarly:

```
python /path/to/sku110/train_net.py --config-file /path/to/sku110/configs/faster_rcnn_V_39_FPN_3x.yaml --eval-only MODEL.WEIGHTS <model.pth>
```
To visualize the results, run:

```
python /path/to/sku110/demo.py --config-file /path/to/sku110/configs/faster_rcnn_V_39_FPN_3x.yaml --input image.jpg --output image.jpg MODEL.WEIGHTS <model.pth>
```
If you use VoVNet, please cite it with the following BibTeX entry:
```
@inproceedings{lee2019energy,
  title = {An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection},
  author = {Lee, Youngwan and Hwang, Joong-won and Lee, Sangrok and Bae, Yuseok and Park, Jongyoul},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops},
  year = {2019}
}
```