- Stony Brook University
- NY
- www3.cs.stonybrook.edu/~kkahatapitiy/
- @kkahatapitiy
Stars
FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling.
Adaptive Caching for Faster Video Generation with Diffusion Transformers
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Clarity: A Minimalist Website Template for AI Research
Language Repository for Long Video Understanding
SGLang is a fast serving framework for large language models and vision language models.
Unofficial Implementation of "Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs"
Official inference repo for FLUX.1 models
Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video understanding capability.
Latte: Latent Diffusion Transformer for Video Generation.
[ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings
Official repo for AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI
VideoSys: An easy and efficient system for video generation
GIF encoder based on libimagequant (pngquant). Squeezes maximum possible quality from the awful GIF format.
[NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Open-Sora: Democratizing Efficient Video Production for All
Lumina-T2X is a unified framework for Text to Any Modality Generation
Stable Diffusion web UI
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts