hukenovs
🐢
hi ._.

Organizations

@ai-forever


Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning etc.)

Python 9 1 Updated Nov 6, 2024

MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundamental models.

Jupyter Notebook 58 8 Updated Oct 7, 2024

nanoGPT style version of Llama 3.1

Python 1,236 60 Updated Aug 8, 2024

Python 17 2 Updated Sep 2, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 12,285 1,128 Updated Oct 14, 2024

MINT-1T: A one trillion token multimodal interleaved dataset.

773 20 Updated Jul 31, 2024

Framework agnostic sliced/tiled inference + interactive ui + error analysis plots

Python 4,093 591 Updated Aug 27, 2024

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Python 4,289 1,248 Updated Aug 14, 2024

LLM101n: Let's build a Storyteller

30,090 1,643 Updated Aug 1, 2024

YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]

Python 9,913 982 Updated Sep 26, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 2,111 145 Updated Sep 3, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 15,046 1,387 Updated Oct 15, 2024

Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

C++ 1,668 256 Updated Nov 10, 2024

Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)

Python 120 4 Updated Nov 13, 2023

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Python 318 15 Updated Oct 8, 2024

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 2,072 195 Updated Apr 24, 2024

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 8,953 566 Updated Apr 16, 2024

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Python 8,977 1,422 Updated Aug 9, 2024

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention

Python 786 62 Updated Jun 2, 2024

MiVOLO age & gender transformer neural network

Python 329 55 Updated Aug 5, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,654 452 Updated Nov 5, 2024

Mixture-of-Experts for Large Vision-Language Models

Python 1,979 126 Updated May 15, 2024

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 2,977 200 Updated Sep 19, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 30,094 4,550 Updated Nov 13, 2024

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,298 49 Updated Oct 2, 2024

We write your reusable computer vision tools. 💜

Python 24,078 1,792 Updated Nov 12, 2024

Paper list on sign language, including sign language recognition (SLR), sign language translation (SLT), and other interesting work. Quick-start your awesome work with us! 🤟🤟🤟

88 1 Updated Oct 12, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,160 2,551 Updated Nov 9, 2024

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,480 107 Updated Jul 5, 2024