CLIP4Clip v2: A Modified CLIP4Clip codebase for Video Clip Retrieval

A modified implementation of the paper CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval. (The official repo can be found here.)

CLIP4Clip is a video-text retrieval model based on CLIP (ViT-B). This work investigates three similarity calculation approaches: parameter-free, sequential, and tight. The model achieves state-of-the-art (SOTA) results on MSR-VTT, MSVD, LSMDC, ActivityNet, and DiDeMo.
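As a rough illustration of the parameter-free type, the sketch below assumes it amounts to mean-pooling per-frame embeddings into one video embedding and scoring it against the text embedding with cosine similarity. This README does not spell out the computation, so treat that as an assumption. The function names (`mean_pool`, `cosine`, `parameter_free_similarity`) are hypothetical and are not taken from this codebase. Plain Python lists are used so the example has no dependencies.

```python
import math


def mean_pool(frames):
    """Average a list of per-frame embedding vectors into one video vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def parameter_free_similarity(text_embs, videos):
    """Return a text-by-video similarity matrix (hypothetical helper).

    text_embs: list of text embedding vectors.
    videos: list of videos, each a list of frame embedding vectors.
    Assumption: "parameter-free" means mean pooling plus cosine similarity.
    """
    video_embs = [mean_pool(frames) for frames in videos]
    return [[cosine(t, v) for v in video_embs] for t in text_embs]


if __name__ == "__main__":
    texts = [[1.0, 0.0]]
    videos = [
        [[1.0, 0.0], [3.0, 0.0]],  # frames aligned with the text direction
        [[0.0, 2.0], [0.0, 4.0]],  # frames orthogonal to the text direction
    ]
    print(parameter_free_similarity(texts, videos))
```

No learned weights appear anywhere in this computation, which is what the name "parameter-free" suggests. The sequential and tight types would presumably add trainable components on top of the frame and text embeddings.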

CLIP4Clip.png

Requirements

# From CLIP
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
pip install ftfy regex tqdm
pip install opencv-python boto3 requests pandas
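After running the commands above, a quick way to confirm the environment is complete is to check that each package can be found by the interpreter. The snippet below is a minimal sketch, not part of this codebase. The helper name `missing_modules` is hypothetical, and the list uses import names, which can differ from install names (for example, `opencv-python` is imported as `cv2`, and the `pytorch` conda package as `torch`).

```python
import importlib.util

# Import names for the packages installed above.
REQUIRED = [
    "torch", "torchvision", "ftfy", "regex", "tqdm",
    "cv2", "boto3", "requests", "pandas",
]


def missing_modules(names=REQUIRED):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules are discoverable. It does not verify versions or whether CUDA is usable.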