chainer-hand-pose

  • This is a Chainer [12] implementation of 3D hand pose estimation.

  • Our algorithm adopts a top-down pipeline consisting of two stages, a Detector and a PoseEstimator:

    • Detector: MobileNetV2 [9]-based SSD [4] (insize=224x224).
    • PoseEstimator: MobileNetV2-based Pose Proposal Networks [10] (insize=224x224).
  • First, the Detector is applied to the input image to generate a set of bounding boxes surrounding human hands. The input image is then cropped to each bounding box, and the cropped patches serve as input to the PoseEstimator, which estimates the 2D joint locations as in [10] and the 3D joint vectors as in [5], [7] for each hand.
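The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration and not the code in this repository: `crop_and_resize` and `run_pipeline` are hypothetical helper names, the `detector` and `pose_estimator` arguments are placeholder callables standing in for the trained models, and the `(y_min, x_min, y_max, x_max)` box layout and nearest-neighbour resizing are assumptions made for the example.

```python
import numpy as np

INSIZE = 224  # input size stated above for both networks


def crop_and_resize(image, bbox, insize=INSIZE):
    """Crop `image` (H, W, C) to `bbox` and resize the patch to (insize, insize, C).

    `bbox` is assumed to be (y_min, x_min, y_max, x_max) in pixels.
    Resizing uses plain nearest-neighbour sampling to keep the sketch small.
    """
    height, width = image.shape[:2]
    y_min, x_min, y_max, x_max = bbox
    # Clamp the box to the image so slicing never goes out of range.
    y_min, y_max = max(0, int(y_min)), min(height, int(y_max))
    x_min, x_max = max(0, int(x_min)), min(width, int(x_max))
    if y_max <= y_min or x_max <= x_min:
        raise ValueError("empty bounding box after clamping")
    patch = image[y_min:y_max, x_min:x_max]
    # Map every output pixel back to its nearest source pixel in the patch.
    ys = np.arange(insize) * patch.shape[0] // insize
    xs = np.arange(insize) * patch.shape[1] // insize
    return patch[ys[:, None], xs[None, :]]


def run_pipeline(image, detector, pose_estimator):
    """Top-down flow: detect hands, crop each one, estimate its pose.

    `detector(image)` is assumed to return an iterable of bounding boxes, and
    `pose_estimator(patch)` a (joints_2d, joints_3d) pair; both are placeholders.
    """
    results = []
    for bbox in detector(image):
        patch = crop_and_resize(image, bbox)
        joints_2d, joints_3d = pose_estimator(patch)
        results.append({"bbox": bbox, "joints_2d": joints_2d, "joints_3d": joints_3d})
    return results
```

With stub callables in place of the real models, `run_pipeline(image, detector, pose_estimator)` returns one dictionary per detected hand. The point of the top-down design is that the PoseEstimator only ever sees a fixed-size patch containing a single hand.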

Directory structure

$ cd path/to/root/of/repository
$ tree -L 2 -d
.
├── docs
│   └── imgs
├── experiments
│   ├── docker
│   ├── notebooks
│   └── test_images
├── result
│   └── release
└── src
    ├── demo
    ├── detector
    └── pose

How to use

Setup environment

Clone this repository

$ cd /path/to/your/working/directory
$ git clone https://github.com/Idein/chainer-hand-pose.git

Prepare Docker & NVIDIA Container Toolkit

  • For simplicity, we use the idein/chainer Docker image, which includes Chainer, ChainerCV, and other utilities together with the CUDA driver. This saves time when setting up a development environment.

Prepare dataset

Train/Predict Detector

Train/Predict PoseEstimator

Run demo (naive implementation)

  • After the Detector and PoseEstimator have been trained, the trained models are stored in the result directory. We provide a demo script that runs inference with them.
  • You can also use our pre-trained models. See our release page.

Run without docker

  • Just run
$ cd src
$ python3 demo.py ../result/release ../result/release

Run with docker

  • You can also use Docker (on an Ubuntu machine with a GPU).

Build the Docker image from the Dockerfile

$ cd path/to/root/of/repository
$ docker build -t hand_demo experiments/docker/demo/gpu/

After the Docker image has been built, run src/run_demo.sh:

$ cd src
$ bash run_demo.sh

Appendix

Run demo using the Actcast framework (a.k.a. actfw)

See src/demo/README.md

References

  • [1] Dollár, Piotr et al. “Cascaded pose regression.” CVPR (2010).
  • [2] Garcia-Hernando, Guillermo et al. “First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations.” CVPR (2017).
  • [3] Gomez-Donoso, Francisco et al. “Large-scale Multiview 3D Hand Pose Dataset.” Image Vision Comput. (2017).
  • [4] Liu, Wei et al. “SSD: Single Shot MultiBox Detector.” ECCV (2016).
  • [5] Luo, Chenxu et al. “OriNet: A Fully Convolutional Network for 3D Human Pose Estimation.” BMVC (2018).
  • [6] Mueller, Franziska et al. “Real-Time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor.” ICCV (2017).
  • [7] Mueller, Franziska et al. “GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB.” CVPR (2018).
  • [8] Niitani, Yusuke et al. “ChainerCV: a Library for Deep Learning in Computer Vision.” ACM Multimedia (2017).
  • [9] Sandler, Mark et al. “MobileNetV2: Inverted Residuals and Linear Bottlenecks.” CVPR (2018).
  • [10] Sekii, Taiki. “Pose Proposal Networks.” ECCV (2018).
  • [11] Simon, Tomas et al. “Hand Keypoint Detection in Single Images Using Multiview Bootstrapping.” CVPR (2017).
  • [12] Tokui, Seiya et al. “Chainer: a Next-Generation Open Source Framework for Deep Learning.” NIPS (2015).
  • [13] Tompson, Jonathan et al. “Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks.” ACM Trans. Graph. 33 (2014).
  • [14] Zhang, Jiawei et al. “3D Hand Pose Tracking and Estimation Using Stereo Matching.” ArXiv (2016).
  • [15] Zimmermann, Christian and Thomas Brox. “Learning to Estimate 3D Hand Pose from Single RGB Images.” ICCV (2017).
  • [16] Zimmermann, Christian et al. “FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images.” ArXiv (2019).