Add support for batched decoding api #795

Merged
abetlen merged 11 commits into main from add-support-for-llama-batch on Nov 3, 2023
Conversation

@abetlen (Owner)

@abetlen abetlen commented Oct 5, 2023

llama.cpp recently moved to a new api which supports batching (both for single sequences with multiple outputs and multiple separate streams) and streaming support. This new api based on llama_decode supersedes the now deprecated llama_eval api. This means that the current api should be migrated anyway, regardless of the new features, but we'll see how easy it is to implement along the way.
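To make the idea concrete, here is a minimal conceptual sketch (not the real llama.cpp bindings; the `Batch` class and its fields are illustrative assumptions) of how the llama_decode-style API groups tokens from multiple independent sequences into a single batch, with per-token sequence ids and a flag marking which positions need logits:

```python
# Illustrative sketch only -- not the actual llama.cpp / llama-cpp-python API.
# It mimics how a llama_decode-style batch carries tokens from several
# separate streams at once, distinguished by sequence id.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Batch:
    tokens: List[int] = field(default_factory=list)
    positions: List[int] = field(default_factory=list)
    seq_ids: List[int] = field(default_factory=list)   # which stream each token belongs to
    logits: List[bool] = field(default_factory=list)   # request logits only where needed

    def add(self, token: int, pos: int, seq_id: int, want_logits: bool) -> None:
        self.tokens.append(token)
        self.positions.append(pos)
        self.seq_ids.append(seq_id)
        self.logits.append(want_logits)

# Two independent prompts decoded in one call: each token carries its own
# sequence id, so the backend can keep the KV-cache entries for the two
# streams separate while still processing them together.
batch = Batch()
for seq_id, prompt in enumerate([[1, 15, 27], [1, 99]]):
    for pos, tok in enumerate(prompt):
        # only the last token of each prompt needs logits for sampling
        batch.add(tok, pos, seq_id, want_logits=(pos == len(prompt) - 1))

print(len(batch.tokens))  # 5 tokens across both sequences
print(batch.logits)       # [False, False, True, False, True]
```

In the real C API the same information lives in the fields of a `llama_batch` struct passed to `llama_decode`; the point here is just that one decode call can advance several streams at once, which the old one-sequence `llama_eval` could not.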

@flexorRegev

Is there a way to help with this one?

@zpzheng

zpzheng commented Oct 26, 2023

Is this feature live yet? Why can't I run batch tasks locally?

@abetlen abetlen marked this pull request as ready for review November 3, 2023 00:12
@abetlen abetlen merged commit ab028cb into main Nov 3, 2023
@abetlen abetlen deleted the add-support-for-llama-batch branch November 14, 2023 20:07
