EmbeddedLLM

LLM_Sizing_Guide Public Forked from qoofyk/LLM_Sizing_Guide
A calculator to estimate the memory footprint, capacity, and latency on NVIDIA, AMD, and Intel hardware

Python 0 2 0 0 Updated Nov 24, 2024
infinity-executable Public Forked from michaelfeil/infinity
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.

Python 0 MIT 116 0 0 Updated Nov 22, 2024
JamAIBase Public
The collaborative spreadsheet for AI. Chain cells into powerful pipelines, experiment with prompts and models, and evaluate LLM responses in real-time. Work together seamlessly to build and iterate on AI applications.

Python 462 Apache-2.0 17 1 0 Updated Nov 22, 2024
vllm Public Forked from vllm-project/vllm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs

Python 89 Apache-2.0 4,744 4 0 Updated Nov 22, 2024
SageAttention-rocm Public Forked from thu-ml/SageAttention
ROCm Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

Cuda 0 Apache-2.0 27 0 0 Updated Nov 21, 2024
torchac_rocm Public Forked from LMCache/torchac_cuda
ROCm Implementation of torchac_cuda from LMCache

Cuda 0 1 0 0 Updated Nov 17, 2024
axolotl-amd Public Forked from axolotl-ai-cloud/axolotl
Go ahead and axolotl questions

Python 0 Apache-2.0 888 0 0 Updated Nov 16, 2024
Liger-Kernel Public Forked from linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training

Python 0 BSD-2-Clause 213 0 0 Updated Nov 16, 2024
LMCache-ROCm Public Forked from LMCache/LMCache
ROCm support for Ultra-Fast and Cheaper Long-Context LLM Inference

Python 0 Apache-2.0 25 0 0 Updated Nov 14, 2024
jamaibase-ts-docs Public
TypeScript documentation of JamAISDK

HTML 0 0 0 0 Updated Nov 14, 2024
