

Official repository for LTX-Video

Python 544 27 Updated Nov 24, 2024

FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling.

Python 29 2 Updated Jul 8, 2024

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Python 103 3 Updated Nov 5, 2024

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Python 168 8 Updated Nov 12, 2024

The best OSS video generation models

Python 2,108 212 Updated Nov 23, 2024

Clarity: A Minimalist Website Template for AI Research

CSS 58 3 Updated Oct 28, 2024

Language Repository for Long Video Understanding

Python 29 3 Updated Jun 17, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 6,202 528 Updated Nov 24, 2024

Unofficial Implementation of "Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs"

Python 3 Updated Oct 29, 2024
Python 2 Updated Oct 13, 2023

Official inference repo for FLUX.1 models

Python 17,111 1,223 Updated Nov 21, 2024

Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"

Python 82 3 Updated Aug 6, 2024

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

Python 592 28 Updated Nov 22, 2024
Python 18 1 Updated Jul 31, 2024

Theia: Distilling Diverse Vision Foundation Models for Robot Learning

Python 184 7 Updated Oct 10, 2024

Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with good general video understanding capability.

Python 148 8 Updated Nov 5, 2024

Latte: Latent Diffusion Transformer for Video Generation.

Python 1,710 178 Updated Sep 28, 2024

[ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings

Python 4 Updated Oct 7, 2024

Official repo for AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI

Python 33 Updated Jan 30, 2024

VideoSys: An easy and efficient system for video generation

Python 1,783 123 Updated Nov 24, 2024

GIF encoder based on libimagequant (pngquant). Squeezes maximum possible quality from the awful GIF format.

Rust 4,844 143 Updated Nov 3, 2024

[NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

Python 546 60 Updated Mar 27, 2022

Let us control diffusion models!

Python 30,488 2,742 Updated Feb 25, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 11,579 1,026 Updated Nov 23, 2024

LLaRA: Large Language and Robotics Assistant

Python 156 3 Updated Oct 2, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,327 2,182 Updated Nov 20, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,089 88 Updated Aug 6, 2024

Stable Diffusion web UI

Python 143,383 27,006 Updated Nov 24, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,397 571 Updated May 31, 2024

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Python 298 21 Updated Jul 17, 2024