LLaMA Server
LLaMA Server combines the power of LLaMA C++ (via PyLLaMACpp) with the beauty of Chatbot UI.
🦙LLaMA C++ (via 🐍PyLLaMACpp) ➕ 🤖Chatbot UI ➕ 🔗LLaMA Server 🟰 😊
UPDATE: Greatly simplified implementation thanks to the awesome Pythonic APIs of PyLLaMACpp 2.0.0!
UPDATE: Now supports better streaming through PyLLaMACpp!
UPDATE: Now supports streaming!
Demo
- Better Streaming
- Streaming
- Non-streaming
Setup
- Get your favorite LLaMA models:
  - Download from 🤗Hugging Face;
  - Or follow the instructions at LLaMA C++;
  - Make sure the models are converted and quantized;
- Create a models.yml file to provide your model_home directory and add your favorite South American camelids, e.g.:

    model_home: /path/to/your/models
    models:
      llama-7b:
        name: LLAMA-7B
        path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path

  See models.yml for an example.
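The `path` entry above may be either relative to `model_home` or absolute. A minimal sketch of that resolution rule (illustrative only; the helper name is hypothetical and llama-server's actual logic may differ):

```python
# Sketch of resolving a model path from models.yml: relative paths are
# joined onto model_home, absolute paths are kept as-is.
# (Illustrative helper, not llama-server's real implementation.)
from pathlib import Path

def resolve_model_path(model_home: str, path: str) -> Path:
    p = Path(path)
    return p if p.is_absolute() else Path(model_home) / p

if __name__ == "__main__":
    # On Linux/macOS this prints: /path/to/your/models/7B/ggml-model-q4_0.bin
    print(resolve_model_path("/path/to/your/models", "7B/ggml-model-q4_0.bin"))
```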
- Set up a Python environment:
    conda create -n llama python=3.9
    conda activate llama
- Install LLaMA Server:
  - From PyPI:
      python -m pip install llama-server
  - Or from source:
      python -m pip install git+https://github.com/nuance1979/llama-server.git
- Start LLaMA Server with your models.yml file:
    llama-server --models-yml models.yml --model-id llama-7b
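Because Chatbot UI talks to LLaMA Server as if it were the OpenAI API, you can also query the server directly. A hedged sketch: the port (8000 here) and the `/v1/chat/completions` route are assumptions, not confirmed by this README — check the server's startup output for the actual address:

```python
# Sketch: send an OpenAI-style chat completion request to a locally
# running llama-server. Base URL and route are assumptions.
import json
from urllib import request

def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build an OpenAI-style chat completion request (assumed endpoint)."""
    payload = {
        "model": "llama-7b",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    req = build_chat_request("Why do llamas hum?")
    with request.urlopen(req) as resp:  # requires a running llama-server
        print(json.load(resp)["choices"][0]["message"]["content"])
```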
- Check out my fork of Chatbot UI and start the app:
    git clone https://github.com/nuance1979/chatbot-ui
    cd chatbot-ui
    git checkout llama
    npm i
    npm run dev
- Open http://localhost:3000 in your browser;
- Click "OpenAI API Key" at the bottom left corner and enter your OpenAI API key;
  - Or follow the instructions at Chatbot UI to put your key into a .env.local file and restart:
      cp .env.local.example .env.local
      # edit .env.local to add your OPENAI_API_KEY
- Enjoy!
More
- Try a larger model if you have one:
    llama-server --models-yml models.yml --model-id llama-13b  # or any `model_id` defined in `models.yml`
- Try non-streaming mode by restarting Chatbot UI:
    export LLAMA_STREAM_MODE=0  # 1 to enable streaming
    npm run dev
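When streaming is on, OpenAI-compatible backends typically emit server-sent-events-style `data: {...}` lines ending with `data: [DONE]`. A sketch of consuming such a stream — the wire format is an assumption about this server, not something the README confirms:

```python
# Sketch: consume a streamed chat response line by line. Assumes SSE-style
# "data: {...}" lines with OpenAI-style "delta" chunks; llama-server's
# exact wire format is an assumption here.
import json

def iter_stream_chunks(resp):
    """Yield content fragments from an SSE-style byte stream (file-like)."""
    for raw in resp:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data: "):
            continue  # skip blank lines / keep-alives
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

With a real response object (e.g. from `urllib.request.urlopen`) you would pass it straight in: `for chunk in iter_stream_chunks(resp): print(chunk, end="")`.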
Fun facts
I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more Stack Overflow.