Nankai University
- http://implus.github.io/
Stars
This repo contains the code for a 1D tokenizer and generator
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Official repository for LPFP&PCLN [TIP 2024], SPG [ECCV 2024], and ...
Event stream based visual object tracking using Mamba/State Space Model
[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
This repository contains the official implementation for the paper "From Words to Worth: Newborn Article Impact Prediction with LLM".
CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition -- ICCV 2023
StyleGAN - Official TensorFlow Implementation
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
Denoising Diffusion Probabilistic Models
[NeurIPS'21] Projected GANs Converge Faster
Official PyTorch implementation for paper: Diffusion-GAN: Training GANs with Diffusion
PyTorch implementations of Generative Adversarial Networks.
Resources and Implementations of Generative Adversarial Nets: GAN, DCGAN, WGAN, CGAN, InfoGAN
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Object Recognition as Next Token Prediction (CVPR 2024 Highlight)
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
[AAAI 2024] Official implementation of "SQLdepth: Generalizable Self-Supervised Fine-Structured Monocular Depth Estimation", and more.
Faster depthwise convolutions for PyTorch
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
[BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…