mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 1,607 101 Updated Sep 28, 2024

Interface for OuteTTS models.

Python 410 25 Updated Nov 6, 2024

No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images

Python 504 16 Updated Nov 8, 2024

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision

Python 490 22 Updated Nov 11, 2024

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,594 188 Updated Nov 6, 2024

Autonomous Agents (LLMs) research papers. Updated Daily.

522 28 Updated Nov 19, 2024

A comprehensive tool for processing and analyzing video footage, producing detailed insights into gameplay and player performance to support game understanding and performance evaluation.

Jupyter Notebook 78 16 Updated Oct 9, 2024

🔥🕷️ Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper

HTML 16,399 1,205 Updated Nov 22, 2024

An open-source multimodal large language model that can hear and talk while thinking. Features real-time, end-to-end speech input and streaming audio output for conversation.

Python 3,113 278 Updated Nov 5, 2024

🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide better code/research plans 🧰 OpenAI, Anthropic, Ollama, etc s…

Python 1,098 49 Updated Nov 19, 2024

OpenAI Whisper ASR Webservice API

Python 2,126 380 Updated Oct 6, 2024

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Python 3,555 372 Updated Oct 31, 2024

An MIT rewrite of YOLOv9

Python 661 69 Updated Nov 22, 2024

A permissively licensed implementation of YOLOv9.

Python 7 1 Updated Apr 18, 2024

🐙 Guides, papers, lectures, notebooks and resources for prompt engineering

MDX 50,425 4,890 Updated Nov 20, 2024

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 9,746 696 Updated Jul 25, 2024
Jupyter Notebook 7,764 547 Updated Jun 16, 2024

Japanese instruction data (日本語指示データ)

Python 22 Updated Jul 13, 2023

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 45,411 5,401 Updated Nov 11, 2024

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows, CVPR 2022

Python 548 77 Updated Nov 1, 2023

UI for your AI. Open Source Tailwind components tailored for your GPT, generative AI, and LLM projects.

HTML 2,496 129 Updated Jul 10, 2024

Code review powered by LLMs (OpenAI GPT4, Sonnet 3.5) & Embeddings ⚡️ Improve code quality and catch bugs before you break production 🚀 Lives in your GitHub/GitLab/Azure DevOps CI

TypeScript 1,606 162 Updated Nov 4, 2024
Python 20 Updated Jul 16, 2023

An open source implementation of OpenAI's ChatGPT Code interpreter

Python 3,562 446 Updated Mar 20, 2024

📋 A list of open LLMs available for commercial use.

11,224 734 Updated Jul 5, 2024

chat to visualization with LLM

Python 211 29 Updated Nov 19, 2023

A demo of a GPT-based agent existing in an RPG-like environment

JavaScript 983 108 Updated May 3, 2023

Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.

JavaScript 35,207 4,197 Updated Nov 23, 2024

Interactively explore unstructured datasets from your dataframe.

TypeScript 1,127 83 Updated Nov 18, 2024

Rembg is a tool to remove image backgrounds

Python 17,088 1,881 Updated Nov 20, 2024