llm-api

Web API and websocket for Large Language Models in C++

Build

Clone the repo and cd into it:

git clone https://github.com/monatis/llm-api.git && cd llm-api

Install asio for the web API:

apt install libasio-dev

Note: You can also run scripts/install-dev.sh to install asio, along with websocat for testing the websocket from the terminal.

Build with cmake and make:

mkdir build && cd build
cmake -DLLM_NATIVE=ON ..
make -j4

The executable is at ./bin/llm-api.

Run

Download the gpt4all-j model if you haven't already:

wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O ./bin/ggml-gpt4all-j.bin

Run the executable:

./bin/llm-api

Note: You can pass the model path with the -m argument if the model is located elsewhere. See below for more options.

Options

./bin/llm-api -h

usage: ./bin/llm-api [options]

options:
  -h, --help            show this help message and exit
  -v, --verbose         log generation in stdout (default: disabled)
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  --port PORT           port to listen on (default: 8080)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -n N, --n_predict N   number of tokens to predict (default: 200)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.9)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: ggml-gpt4all-j.bin)
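
As a rough sketch of how these options combine, the snippet below starts the server with a custom model path, port, and thread count. The flags (-m, --port, -t) come from the help output; the model location and the chosen values are placeholders for illustration, not project defaults.

```shell
# Placeholder values for illustration; adjust them to your setup.
MODEL_PATH="$HOME/models/ggml-gpt4all-j.bin"   # hypothetical model location
PORT=9000                                      # overrides the default 8080
THREADS=8                                      # overrides the default 4

# Start the server only if both the binary and the model file are present.
if [ -x ./bin/llm-api ] && [ -f "$MODEL_PATH" ]; then
  ./bin/llm-api -m "$MODEL_PATH" --port "$PORT" -t "$THREADS"
else
  echo "llm-api binary or model file not found; check the paths above" >&2
fi
```

Run it from the directory that contains bin/, since the binary path is relative.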

Roadmap

Improve multi-user experience.
Integrate StableLM model.
Add embedding endpoint.
Provide a chain mechanism.
Integrate a chat UI.
Add Docker support.
Extend README and docs.

Credits

ggerganov for his awesome work on ggml.
Nomic AI for their continuous contributions to truly open-source LLMs.
