This is the project repository for the research study Skeleton-Split Framework using Spatial Temporal Graph Convolutional Networks for Action Recognition by Motasem S. Alsawadi and Miguel Rio.
Skeleton extraction from a UCF-101 sample
- 1. Download UCF-101 skeleton dataset
- 2. Download HMDB skeleton dataset
- 3. Storage info
- Citation
- Acknowledgements
The UCF-101 dataset provides a total of 13,320 clips classified into 101 action classes. The skeleton layout has been extracted using the OpenPose system. Because the duration of the clips varies, a fixed length of 300 frames has been adopted. Therefore, if a video clip has fewer than 300 frames, we repeat its initial frames until the required length is reached; if it exceeds 300 frames, we trim it. As a consequence, the spatio-temporal skeleton information of each video sample can be represented as a tensor of shape (18, 3, 300). An independent JSON file has been exported for each video sample, so the outcome of this process is 13,320 JSON files containing the skeleton information of the UCF-101 dataset. The UCF-101 skeleton dataset can be downloaded here.
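The pad-or-trim step described above can be sketched as follows. This is a minimal illustration only, not the repository's actual preprocessing code; the function name and the (frames, joints, channels) array layout are assumptions for the example:

```python
import numpy as np

T_FIXED = 300  # fixed clip length in frames
V = 18         # OpenPose joint count
C = 3          # (x, y, confidence) per joint

def pad_or_trim(skeleton: np.ndarray, t_fixed: int = T_FIXED) -> np.ndarray:
    """Repeat the initial frames of a (T, V, C) sequence until it reaches
    t_fixed frames, or trim it if it is longer."""
    t = skeleton.shape[0]
    if t >= t_fixed:
        # clip is long enough: keep only the first t_fixed frames
        return skeleton[:t_fixed]
    # clip is too short: tile it from the beginning, then cut to length
    reps = int(np.ceil(t_fixed / t))
    return np.concatenate([skeleton] * reps, axis=0)[:t_fixed]

# example: a 120-frame clip becomes exactly 300 frames
clip = np.random.rand(120, V, C)
fixed = pad_or_trim(clip)
print(fixed.shape)  # (300, 18, 3)
```

Note that the tensor shape (18, 3, 300) quoted above is the same information with the axes ordered as (joints, channels, frames); the sketch keeps frames first only for readability.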
The HMDB-51 dataset provides a total of 6,766 video clips of 51 different classes. As with the UCF-101 skeleton dataset, the skeleton layout has been extracted using the OpenPose system with the same fixed length of 300 frames, and an independent JSON file has been exported for each video sample. Therefore, 6,766 JSON files containing the skeleton information of the HMDB-51 dataset have been generated. The HMDB-51 skeleton dataset can be downloaded here.
The UCF-101 and HMDB-51 skeleton datasets are provided in .zip format.
The compressed UCF-101 skeleton dataset is 991 MB; once uncompressed, it occupies 2.56 GB of storage.
The compressed HMDB-51 skeleton dataset is 581 MB; once uncompressed, it occupies 1.62 GB of storage.
Please cite the following paper if you use this repository in your research:
@article{alsawadi2021skeleton,
title={Skeleton-Split Framework using Spatial Temporal Graph Convolutional Networks for Action Recognition},
author={Alsawadi, Motasem and Rio, Miguel},
journal={arXiv preprint arXiv:2111.03106},
year={2021}
}
The training code is written by Motasem S. Alsawadi and it is based upon the skeleton extraction guidelines provided in Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition, Sijie Yan, Yuanjun Xiong and Dahua Lin, AAAI 2018. [Arxiv Preprint]