Code for ALBEF: a new vision-language pre-training method
Updated Sep 20, 2022 - Python
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Data release for the ImageInWords (IIW) paper.
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
The largest multilingual image-text classification dataset. It contains fashion products.
Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment
A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
Wrapper for PHP's GD Library for easy image manipulation. Support for scaling multi-line text, shapes, filters and smart resize.
WWDC22: Enabling Live Text interactions with images in SwiftUI
A server powering LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
Download flickr8k, flickr30k image caption datasets
An Interactive Game-based Vision Planning benchmark
This project is a FastAPI-based web application designed to analyze Cambridge IELTS PDFs (Books 1-18) for the most and least repeated words. It can handle both regular text-based PDFs and scanned image-based PDFs by converting them to images and extracting text using OCR (Optical Character Recognition).
Contrastive Learning Representations for Images and Text Pairs. Colab implementation of ConVIRT for transfer learning with insufficient data volume.
Caption generator using LAVIS and argostranslate
Image Captioning With MobileNet-LLaMA 3