【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Updated Nov 29, 2024 - Python
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
A Survey on video and language understanding.
Video Graph Transformer for Video Question Answering (ECCV'22)
[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 2023 Spotlight)
[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
The champion solution for Ego4D Natural Language Queries Challenge in CVPR 2023
The official GitHub page for the survey paper "Self-Supervised learning for Videos: A survey"
A repository of Video Language papers, code and datasets.