
Tags: leixy76/llama.cpp


b3454

Build Llama SYCL Intel with static libs (ggml-org#8668)

Ensure SYCL CI builds both static & dynamic libs for testing purposes

Signed-off-by: Joe Todd <[email protected]>

b3439

*.py: Stylistic adjustments for python (ggml-org#8233)

* Superfluous parens in conditionals were removed.
* Unused args in function were removed.
* Replaced unused `idx` var with `_`
* Initializing file_format and format_version attributes
* Renaming constant to capitals
* Preventing redefinition of the `f` var

Signed-off-by: Jiri Podivin <[email protected]>
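
The cleanups listed above are generic Python style fixes; a minimal before/after sketch of the kind of change involved (illustrative only, not the actual code touched by this commit):

```python
# Illustrative only -- not the actual code changed by the PR.

# Before: superfluous parens, unused loop variable, lowercase "constant".
max_tokens = 512

def count_nonblank_before(path):
    total = 0
    with open(path) as f:
        for idx, line in enumerate(f):  # idx is never used
            if (line.strip()):          # superfluous parens
                total += 1
    return total

# After: parens dropped, unused variable replaced with `_`, constant capitalized.
MAX_TOKENS = 512

def count_nonblank_after(path):
    total = 0
    with open(path) as f:
        for _, line in enumerate(f):
            if line.strip():
                total += 1
    return total
```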

b3432

flake.lock: Update (ggml-org#8610)

b3414

server: use relative routes for static files in new UI (ggml-org#8552)

* server: public: fix api_url on non-index pages

* server: public: use relative routes for static files in new UI
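
The practical effect of relative routes is that the web UI keeps working when the server is reached under a path prefix (for example behind a reverse proxy). The change itself is in the server's bundled JavaScript; a small Python illustration of the URL-resolution rule involved, using urllib.parse:

```python
from urllib.parse import urljoin

# A page served from a sub-path resolves a relative route under that same
# sub-path, while a root-absolute route always points at the host root.
page = "http://example.com/llama/index.html"  # hypothetical proxied location

print(urljoin(page, "completion"))    # http://example.com/llama/completion
print(urljoin(page, "/completion"))   # http://example.com/completion
```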

b3409

CONTRIBUTING.md : remove mention of noci (ggml-org#8541)

b3405

make/cmake: add missing force MMQ/cuBLAS for HIP (ggml-org#8515)

b3389

llama : fix Gemma-2 Query scaling factors (ggml-org#8473)

* 9B - query_pre_attn_scalar = 256 not 224

See google/gemma_pytorch@03e6575

Gemma 9b should use 256 and not 224 (224 = self.config.hidden_size // self.config.num_attention_heads)

* llama : fix Gemma-2 Query scaling factor

ggml-ci

---------

Co-authored-by: Daniel Han <[email protected]>
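
For context, a small arithmetic sketch of the scaling change above, assuming the published Gemma-2 9B config values (hidden_size 3584, 16 attention heads); the actual fix lives in the llama.cpp Gemma-2 code, not shown here:

```python
import math

# Assumed Gemma-2 9B config values (not taken from this repo's sources):
hidden_size = 3584
num_attention_heads = 16

old_scalar = hidden_size // num_attention_heads  # 224 -- the value being replaced
new_scalar = 256                                 # corrected query_pre_attn_scalar

# Query scores are scaled by 1/sqrt(query_pre_attn_scalar), so the fix
# changes the attention scaling from 1/sqrt(224) to 1/sqrt(256).
print(old_scalar, 1 / math.sqrt(old_scalar))  # 224 0.0668...
print(new_scalar, 1 / math.sqrt(new_scalar))  # 256 0.0625
```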

b3384

server : handle content array in chat API (ggml-org#8449)

* server : handle content array in chat API

* Update examples/server/utils.hpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
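
For reference, a sketch of the request shape this change accepts: on the server's OpenAI-compatible /v1/chat/completions endpoint, a message's "content" may now be an array of text parts instead of a single string (the local URL and port below are the usual defaults, assumed here):

```python
import json
import urllib.request

# "content" given as an array of parts instead of a plain string.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this:"},
                {"type": "text", "text": "llama.cpp added content-array support."},
            ],
        },
    ],
}

# Assumes a llama.cpp server listening locally on its default port.
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())
```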

b3372

gitignore : deprecated binaries

b3369

Initialize default slot sampling parameters from the global context. (ggml-org#8418)
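
A minimal sketch of the pattern this describes, in Python for illustration: each slot's sampling parameters start from the globally configured defaults, and only request-supplied values override them (names and default values here are hypothetical; the real implementation is in the C++ server code):

```python
# Hypothetical illustration of per-slot defaults, not llama.cpp's C++ code.
GLOBAL_SAMPLING_DEFAULTS = {"temperature": 0.8, "top_k": 40, "top_p": 0.95}

def slot_sampling_params(request_params: dict) -> dict:
    # Start from the global context's defaults, then apply request overrides.
    params = dict(GLOBAL_SAMPLING_DEFAULTS)
    params.update(request_params)
    return params

print(slot_sampling_params({}))                    # all defaults
print(slot_sampling_params({"temperature": 0.2}))  # one override
```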