# LLaMA Server
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
🦙LLaMA C++ ➕ 🤖Chatbot UI ➕ 🔗LLaMA Server 🟰 😊
UPDATE: Now supports Windows through pyllamacpp! And better streaming!
UPDATE: Now supports streaming!
## Demo
- Streaming
- Non-streaming
## Setup
- Get your favorite LLaMA models:
  - Download them from 🤗Hugging Face;
  - Or follow the instructions at LLaMA C++;
  - Make sure the models are converted and quantized;
  - Copy the models into the folders with the correct names, e.g., `models/7B/ggml-model-q4_0.bin`;
- Set up the Python environment:

  ```shell
  conda create -n llama python=3.9
  conda activate llama
  python -m pip install -r requirements.txt
  ```
- Start LLaMA Server:

  ```shell
  export LLAMA_SERVER_HOME=$(git rev-parse --show-toplevel)  # path to this repo
  python -m llama_server
  ```
- Check out my fork of Chatbot UI and start the app:

  ```shell
  git clone https://github.com/nuance1979/chatbot-ui
  cd chatbot-ui
  git checkout llama
  npm i
  npm run dev
  ```
- Open the link http://localhost:3000 in your browser;
- Click "OpenAI API Key" at the bottom left corner and enter your OpenAI API Key;
- Or follow the instructions at Chatbot UI to put your key into a `.env.local` file and restart:

  ```shell
  cp .env.local.example .env.local
  # edit .env.local to add your OPENAI_API_KEY
  ```
- Enjoy!
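The model layout from the setup steps above can be sanity-checked with a short script. The `models/<size>/ggml-model-q4_0.bin` path comes straight from the copy-models step; the helper function name here is just for illustration:

```python
from pathlib import Path

# Layout from the Setup step above: models/<size>/ggml-model-q4_0.bin
MODEL_DIR = Path("models")

def model_path(size: str, quant: str = "q4_0") -> Path:
    """Build the expected path for a converted + quantized model."""
    return MODEL_DIR / size / f"ggml-model-{quant}.bin"

path = model_path("7B")
if path.is_file():
    print(f"Found model: {path}")
else:
    print(f"Missing model: {path} -- convert/quantize it and copy it here first.")
```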
## More
- Try a larger model if you have it:

  ```shell
  export LLAMA_MODEL_ID=llama-13b  # llama-7b/llama-33b/llama-65b
  python -m llama_server
  ```
- Try streaming mode by restarting Chatbot UI:

  ```shell
  export LLAMA_STREAM_MODE=1  # 0 to disable streaming
  npm run dev
  ```
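Because Chatbot UI speaks the OpenAI chat-completions protocol, LLaMA Server presumably accepts the same request shape. Here is a minimal sketch of such a payload; the endpoint URL and port are assumptions for illustration, not taken from this README:

```python
import json

# Hypothetical endpoint -- adjust host/port to wherever llama_server listens.
URL = "http://localhost:8000/v1/chat/completions"  # assumed, not from the README

payload = {
    "model": "llama-13b",  # mirrors the LLAMA_MODEL_ID values above
    "stream": True,        # mirrors LLAMA_STREAM_MODE=1
    "messages": [{"role": "user", "content": "Hello, llama!"}],
}
print(json.dumps(payload, indent=2))
```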
## Limitations
- "Regenerate response" is currently not working;
- IMHO, the prompt/reverse-prompt mechanism of LLaMA C++'s interactive mode needs an overhaul. I tried very hard to dance around it, but the whole thing is still a hack.
## Fun facts
I am not fluent in JavaScript at all but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more StackOverflow.