An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Updated Jul 25, 2024 - Python
A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.
Video-Text Representation Learning via Differentiable Weak Temporal Alignment (CVPR 2022)
Text from the video is extracted and saved into a .docx file in the form of notes.
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian (Bahasa Indonesia).
A full-screen video-behind-text animation built with Next.js.
Capstone project for UPSchool AI First Developer Program