H3WB: Human3.6M 3D WholeBody Dataset and Benchmark

This is the official repository for the paper "H3WB: Human3.6M 3D WholeBody Dataset and Benchmark" (ICCV'23). The repo contains Human3.6M 3D WholeBody (H3WB) annotations proposed in this paper.

For the 3D whole-body benchmark and results please refer to benchmark.md.

What is H3WB

H3WB is a large-scale dataset for 3D whole-body pose estimation. It extends the Human3.6M dataset with 133 whole-body keypoint annotations (17 body, 6 feet, 68 face and 42 hand keypoints) on 100K images. The skeleton layout is the same as in the COCO-WholeBody dataset. Extensions to other popular 3D pose estimation datasets are ongoing, and we already have annotations for Total Capture. If you want your favorite multi-view dataset to get whole-body 3D annotations, let us know!

Example annotations:

Layout from COCO-WholeBody: Image source.

H3WB Dataset

Download

  • Images can be downloaded from the official site of the Human3.6M dataset. We provide a data preparation script to compile the Human3.6M videos into images, which establishes the correct correspondence between images and annotations.

  • The annotations for H3WB can be downloaded from here; by default they are placed under datasets/json/.

  • The annotations for T3WB can be downloaded from here.

  • You can also download the H3WB dataset here in a format commonly used for 3D pose estimation tasks, together with an accompanying data preparation class. We highly recommend this format for your experiments. The util files camera.py, mocap_dataset.py and skeleton.py are taken directly from the VideoPose repository.

Annotation format

Every JSON file follows the structure below, but not every file contains all of these fields. See the Tasks section.

XXX.json --- sample id --- 'image_path'
                        |
                        |- 'bbox' --- 'x_min'
                        |          |- 'y_min'
                        |          |- 'x_max'
                        |          |- 'y_max'
                        |
                        |- 'keypoint_2d' --- joint id --- 'x'
                        |                              |- 'y'
                        |
                        |- 'keypoint_3d' --- joint id --- 'x'
                                                       |- 'y'
                                                       |- 'z'

We also provide a script to load json files.
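Independently of that script, the structure above can be read with a few lines of plain Python. This is a sketch based on the structure diagram; check the field spellings against the actual annotation files, since not every file contains every field.

```python
import json

def load_h3wb(json_path):
    """Load an H3WB annotation file into a {sample_id: record} dict."""
    with open(json_path) as f:
        return json.load(f)

def keypoints_to_list(record, key="keypoint_3d"):
    """Return the joints of one sample ordered by numeric joint id.

    `key` must match the spelling used in the annotation files
    (the 2D or 3D keypoint field per the structure diagram above).
    Joint ids are stored as string keys, so sort them numerically
    rather than lexicographically.
    """
    joints = record[key]
    return [joints[j] for j in sorted(joints, key=int)]
```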

Pretrained models

H3WB comes with pretrained models that were used to create the dataset. Model implementations can be found in the models/ folder. Checkpoints are listed in the table below:

Dataset | Completion | Diffusion Hands | Diffusion Face
--------|------------|-----------------|---------------
H3WB    | ckpt       | ckpt            | ckpt

Pretrained models for the different tasks of the benchmark can be found in benchmark.md.

Tasks

We propose 3 different tasks along with the 3D WholeBody dataset:

2D → 3D: 2D complete whole-body to 3D complete whole-body lifting

  • Use 2Dto3D_train.json for training and validation. It contains 80k 2D and 3D keypoints.

  • Use 2Dto3D_test_2d.json for testing on the leaderboard. It contains 10k 2D keypoints.
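To make the input/output shapes of the lifting task concrete, here is a toy fully connected baseline: 133 2D keypoints in, 133 3D keypoints out. The layer sizes, initialization and architecture are illustrative assumptions, not the models used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer lifting network (illustrative sizes):
# input 133 * 2 = 266 values, output 133 * 3 = 399 values.
W1 = rng.standard_normal((133 * 2, 256)) * 0.01
b1 = np.zeros(256)
W2 = rng.standard_normal((256, 133 * 3)) * 0.01
b2 = np.zeros(133 * 3)

def lift(kpts_2d):
    """Map a (133, 2) 2D pose to a (133, 3) 3D pose."""
    h = np.maximum(kpts_2d.reshape(-1) @ W1 + b1, 0.0)  # ReLU hidden layer
    return (h @ W2 + b2).reshape(133, 3)
```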

I2D → 3D: 2D incomplete whole-body to 3D complete whole-body lifting

  • Use 2Dto3D_train.json for training and validation. It contains 80k 2D and 3D keypoints.

  • Please apply the masking yourself during training. The official masking strategy is as follows:

    • With 40% probability, each keypoint has a 25% chance of being masked,
    • with 20% probability, the face is entirely masked,
    • with 20% probability, the left hand is entirely masked,
    • with 20% probability, the right hand is entirely masked.
  • Use I2Dto3D_test_2d.json for testing on the leaderboard. It contains 10k 2D keypoints. Note that this test set differs from 2Dto3D_test_2d.json.
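The masking strategy above can be sketched as follows. This is a hedged example, not the official implementation: the index ranges assume the COCO-WholeBody layout (0-indexed: body 0-16, feet 17-22, face 23-90, left hand 91-111, right hand 112-132), and masked keypoints are filled with NaN here; adapt the fill value to your model.

```python
import numpy as np

# Keypoint group index ranges, assuming the COCO-WholeBody layout (0-indexed).
FACE = slice(23, 91)
LEFT_HAND = slice(91, 112)
RIGHT_HAND = slice(112, 133)

def mask_keypoints(kpts_2d, rng=None):
    """Apply the masking strategy described above to a (133, 2) array.

    The four branches are mutually exclusive and their probabilities
    (40% + 20% + 20% + 20%) cover every sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = kpts_2d.astype(float).copy()
    p = rng.random()
    if p < 0.4:                  # 40%: each keypoint masked with 25% chance
        drop = rng.random(133) < 0.25
        out[drop] = np.nan
    elif p < 0.6:                # 20%: mask the face entirely
        out[FACE] = np.nan
    elif p < 0.8:                # 20%: mask the left hand entirely
        out[LEFT_HAND] = np.nan
    else:                        # 20%: mask the right hand entirely
        out[RIGHT_HAND] = np.nan
    return out
```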

RGB → 3D: Image to 3D complete whole-body prediction

  • Use RGBto3D_train.json for training and validation. It contains 80k image_path, bounding box and 3D keypoint entries.
  • It contains the same samples as 2Dto3D_train.json, so you can also access the 2D keypoints if needed.
  • Use RGBto3D_test_img.json for testing on the leaderboard. It contains 20k image_path and bounding box entries.
  • Note that the test sample ids are not aligned with those of the previous two tasks.

Evaluation

Validation

We do not provide a validation set. We encourage researchers to report 5-fold cross-validation results with mean and standard deviation values.
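One way to set this up, as a sketch (how you assign samples to folds is up to you; the shuffle seed here is arbitrary):

```python
import numpy as np

def five_fold_splits(sample_ids, seed=0):
    """Shuffle sample ids and yield (train_ids, val_ids) for 5 folds."""
    ids = np.array(list(sample_ids))
    rng = np.random.default_rng(seed)
    rng.shuffle(ids)
    folds = np.array_split(ids, 5)
    for k in range(5):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, val

# Report the mean and standard deviation over the 5 folds, e.g.:
# scores = [evaluate(train_model(tr), va) for tr, va in five_fold_splits(ids)]
# print(np.mean(scores), np.std(scores))
```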

Evaluation on test set

We have released the test sets of the H3WB dataset.

Both the 2D → 3D and I2D → 3D test sets contain 10k triplets of {image, 2D coordinates, 3D coordinates}. Note that, to prevent cheating on the I2D → 3D task, the two test sets contain different samples.
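The standard lifting metric, MPJPE (mean per-joint position error), can be sketched as follows. This is a plain illustration of the metric; whether the leaderboard applies pelvis alignment or other normalization is defined by the official evaluation script.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error, in the same units as the inputs.

    pred, gt: arrays of shape (N, 133, 3) holding predicted and
    ground-truth 3D whole-body keypoints.
    """
    # Euclidean distance per joint, averaged over joints and samples.
    return np.linalg.norm(pred - gt, axis=-1).mean()
```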

Visualization

We provide a function to visualize the 3D whole-body keypoints, as well as the evaluation function for the leaderboard, in this script.

Benchmark

Please refer to benchmark.md for the benchmark results.

Terms of Use

  1. This project is released under the MIT License.

  2. We do not own the copyright of the images. Use of the images must abide by the Human3.6m License agreement.

How to cite

If you find the H3WB dataset useful for your project, please cite our paper as follows.

Yue Zhu, Nermin Samet, David Picard, "H3WB: Human3.6M 3D WholeBody Dataset and Benchmark", ICCV, 2023.

BibTeX entry:

@InProceedings{Zhu_2023_ICCV,
    author    = {Zhu, Yue and Samet, Nermin and Picard, David},
    title     = {H3WB: Human3.6M 3D WholeBody Dataset and Benchmark},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {20166-20177}
}

Please also consider citing the following works.

@article{h36m_pami,
 author = {Ionescu, Catalin and Papava, Dragos and Olaru, Vlad and Sminchisescu, Cristian},
 title = {Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments},
 journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
 publisher = {IEEE Computer Society},
 year = {2014}
} 
 
@inproceedings{IonescuSminchisescu11,
 author = {Catalin Ionescu and Fuxin Li and Cristian Sminchisescu},
 title = {Latent Structured Models for Human Pose Estimation},
 booktitle = {International Conference on Computer Vision},
 year = {2011}
}