SSD-based upper-body and head detectors

Image showing upper-body detections

By Pablo Medina-Suarez and Manuel J. Marin-Jimenez.

UPDATE: we have released a new Keras+TensorFlow version of our head detector model. Give it a try!

This repository contains two MatConvNet models for people detection in images: an upper-body detector and a head detector. These models are based on the Single Shot MultiBox Detector (SSD), as described in:

SSD: Single Shot MultiBox Detector
Authors: Liu, Wei; Anguelov, Dragomir; Erhan, Dumitru; Szegedy, Christian; Reed, Scott; Fu, Cheng-Yang; Berg, Alexander C. 

Both models have been trained on the Hollywood Heads Dataset, using the MatConvNet implementation of SSD developed by Samuel Albanie.

Quick start

Cloning the repository

Downloading the models provided in this repository requires git and git-lfs. To install them and clone the repository, run the following commands in a terminal:

Install git:     
    sudo apt-get install git
Install git-lfs:
    sudo apt-get install git-lfs
Set up git-lfs:
    git lfs install
Clone ssd_people from GitHub using the method of your choice: 
    git clone https://github.com/AVAuco/ssd_people.git (HTTPS)
    git clone git@github.com:AVAuco/ssd_people.git (SSH)

You can verify the download by checking that each file under the models directory is approximately 90 MB.
Alternatively, you can use the following direct download link: download
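
When git-lfs is not set up, Git leaves small text pointer files in place of the real model weights, so a size check is a quick way to catch an incomplete clone. The following is a minimal sketch, not part of this repository: the helper name check_model_sizes and the 1 MB default threshold are assumptions chosen for illustration. It only reports which files in a directory fall below the threshold.

```shell
#!/bin/sh
# Hypothetical helper (not part of ssd_people): report files that look like
# Git LFS pointers instead of real model weights.
# Usage: check_model_sizes <directory> [min_bytes]
check_model_sizes() {
    dir=$1
    min_bytes=${2:-1000000}   # assumed threshold: 1 MB, far below the ~90 MB models
    status=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue            # skip subdirectories and unmatched globs
        size=$(wc -c < "$f" | tr -d ' ')   # file size in bytes
        if [ "$size" -lt "$min_bytes" ]; then
            echo "TOO SMALL: $f ($size bytes)"
            status=1
        else
            echo "OK: $f ($size bytes)"
        fi
    done
    return $status
}
```

From the repository root, check_model_sizes models should print one OK line per model file. If any file is reported as too small, running git lfs install followed by git lfs pull inside the clone usually fetches the real content.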

Running the demo code

Demo code is provided in ssd_people_demo.m. This script runs the selected model on three sample images and displays the detections on screen.

To run this script, start MATLAB and set up MatConvNet with its contrib modules:

% Add MatConvNet to MATLAB's PATH:
addpath /usr/local/matconvnet-25/matlab     % Adapt this path to your setup
vl_setupnn

% Set up mcnSSD and its dependencies
vl_contrib('setup', 'mcnSSD')

% Set up ssd_people
cd <root_ssd_people>    % Replace with the path to your clone of ssd_people
addpath(genpath(pwd))   % Just in case

% Run demo code on CPU
ssd_people_demo;    % Runs the upper-body detector
ssd_people_demo('model','head');    % Runs the head detector

Software requirements

Minimal requirements to run the models on the CPU:

  • MATLAB (tested on R2016b and R2017a).
    • Demo code requires Parallel Computing, Computer Vision System and Image Processing toolboxes.
  • MatConvNet (version >= 1.0-beta25 is recommended because of vl_contrib).
  • mcnSSD.

The following code installs mcnSSD and its dependencies via vl_contrib:

vl_contrib('install', 'mcnSSD');
vl_contrib('compile', 'mcnSSD');
vl_contrib('setup', 'mcnSSD');

vl_contrib('install','autonn');
vl_contrib('setup','autonn');

vl_contrib('install','mcnExtraLayers');
vl_contrib('setup','mcnExtraLayers');

Additional, recommended requirements to run the detectors on the GPU:

  • NVIDIA CUDA Toolkit (tested on v8.0 GA2, v9.2 and v10.0).
  • Optional: an NVIDIA cuDNN version matching the installed NVIDIA CUDA Toolkit version.
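
Before compiling MatConvNet with GPU support, it can help to confirm which CUDA Toolkit release and GPU driver are visible from the terminal. The sketch below is illustrative only: the function name report_cuda_setup is an assumption, and it relies solely on the standard nvcc and nvidia-smi command-line tools. It does not check the cuDNN version.

```shell
#!/bin/sh
# Illustrative helper (an assumption, not part of ssd_people): print one line
# for the CUDA compiler and one for the GPU driver found on this machine.
report_cuda_setup() {
    if command -v nvcc >/dev/null 2>&1; then
        # nvcc --version includes a line containing "release <major>.<minor>"
        echo "nvcc: $(nvcc --version | grep -i 'release' | head -n 1)"
    else
        echo "nvcc: not found on PATH"
    fi
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Query the name and driver version of the first GPU, if any
        gpu=$(nvidia-smi --query-gpu=name,driver_version --format=csv,noheader 2>/dev/null | head -n 1) || gpu=""
        echo "gpu: ${gpu:-nvidia-smi present but no GPU reported}"
    else
        echo "gpu: nvidia-smi not found on PATH"
    fi
}
```

Calling report_cuda_setup prints two lines. The release reported by nvcc can then be compared against the toolkit versions listed above.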

Performance

Both the upper-body and head detectors have a 512x512 input size, favoring precision over speed. Nonetheless, these models run at an average of 35 Hz on an NVIDIA GTX 1080, allowing real-time detection.

Qualitative results

We show some results of both the head (left) and upper-body (right) detectors on the UCO-LAEO dataset in the following videos. No temporal smoothing or any other kind of post-processing has been applied to the output of the detectors.

Citation

If you find these models useful, please consider citing the following paper:

@inproceedings{marin19cvpr,
  author    = {Mar\'in-Jim\'enez, Manuel J. and Kalogeiton, Vicky and Medina-Su\'arez, Pablo and Zisserman, Andrew},
  title     = {{LAEO-Net}: revisiting people {Looking At Each Other} in videos},
  booktitle = {CVPR},
  year      = {2019},
}

Acknowledgements

We thank the authors of the images used in the demo code, which are licensed under a CC BY 2.0 license: