PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (Jupyter Notebook; updated Aug 5, 2024)
Deep Modular Co-Attention Networks for Visual Question Answering
FiLM: Visual Reasoning with a General Conditioning Layer
Recent papers on neural-symbolic reasoning, logical reasoning, visual reasoning, planning, and other topics connecting deep learning and reasoning
RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
PyTorch implementation of "Explainable and Explicit Visual Reasoning over Scene Graphs"
[CVPR 2022 (oral)] Bongard-HOI: a benchmark for few-shot visual reasoning
[ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
Image captioning using Python and BLIP
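The BLIP captioning entry above can be tried without the original repository via the Hugging Face `transformers` port of the model. This is a minimal sketch, assuming `transformers`, `torch`, and `Pillow` are installed; the image path `photo.jpg` is a hypothetical placeholder, and the checkpoint is downloaded from the Hub on first use.

```python
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

# Public BLIP captioning checkpoint on the Hugging Face Hub.
MODEL_ID = "Salesforce/blip-image-captioning-base"


def caption_image(path: str, max_new_tokens: int = 30) -> str:
    """Generate a natural-language caption for the image at `path` with BLIP."""
    processor = BlipProcessor.from_pretrained(MODEL_ID)
    model = BlipForConditionalGeneration.from_pretrained(MODEL_ID)

    image = Image.open(path).convert("RGB")          # BLIP expects RGB input
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(out[0], skip_special_tokens=True)


if __name__ == "__main__":
    # "photo.jpg" is a placeholder; first run requires network access
    # to download the model weights.
    print(caption_image("photo.jpg"))
```

Swapping `MODEL_ID` for `Salesforce/blip-image-captioning-large` trades speed for caption quality; the rest of the code is unchanged.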
Visual Question Reasoning on General Dependency Tree
Learning Perceptual Inference by Contrasting
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
An alternative EQA (Embodied Question Answering) paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)
📄 A curated list of visual reasoning papers.
ACRE: Abstract Causal REasoning Beyond Covariation
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding