Llama serve
Serve llama models locally.
- ⬇️ Downloads weights from S3
- 📦 Unpacks
- 🚀 Serves via a local OpenAI-compatible server
Prerequisites
- Python 3.12
Configuration
- Create a .env file with the details you have been provided:
MODEL=
WEIGHTS_ID=
WEIGHTS_KEY=
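A missing or empty value in .env is easy to overlook, so a quick check before starting the server may help. The sketch below is only an illustration: `read_env_file` and `missing_keys` are hypothetical helper names, the parser handles plain KEY=VALUE lines only, and nothing here is taken from how llamaserve itself reads the file.

```python
from pathlib import Path

# The three keys listed in the Configuration section above.
REQUIRED_KEYS = ("MODEL", "WEIGHTS_ID", "WEIGHTS_KEY")


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper: a simplified stand-in, not the package's own loader.
    """
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found in the current directory")
    else:
        missing = missing_keys(read_env_file(env_path))
        print("Missing values: " + ", ".join(missing) if missing else ".env looks complete")
```

Running the script from the directory that contains .env prints the keys that still need a value.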
Installation
- (Recommended) Create a virtual environment and activate it:
python -m venv .venv
source .venv/bin/activate
- Install this package:
pip install londonaicentre-llama-serve
Usage
CLI
- Note the command-line arguments:

| Argument | Description |
|---|---|
| -v, --verbose | Enable debug output (optional) |

- Start the server as follows:
llamaserve [args]
Clients
OpenAI
- Interact with the server using the OpenAI client in Python:
from openai import OpenAI

# Point the client at the local server instead of api.openai.com.
# The client also requires an API key: set OPENAI_API_KEY or pass api_key=...
client = OpenAI(
    base_url="http://localhost:4000",
)

response = client.chat.completions.create(
    model="<model>",
    messages=[
        {"role": "system", "content": "You are an LLM named gpt-4o"},
        {"role": "user", "content": "Hello"},
    ],
)

# Print the assistant's reply.
print(response.choices[0].message.content)
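The client example assumes a server is already listening on localhost:4000. Since the server first has to download and unpack weights, a small readiness check can avoid connection errors in scripts. This is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, the default timeout is an arbitrary choice, and an open port only shows that something is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host="localhost", port=4000, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False after `timeout` seconds.

    Hypothetical helper: port 4000 matches the base_url in the client example.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port():
        print("Something is listening on localhost:4000")
    else:
        print("Nothing answered on localhost:4000 within the timeout")
```

Calling `wait_for_port()` before constructing the OpenAI client lets a script fail early with a clear message instead of a connection error in the middle of a request.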
License
This project uses the CC BY-NC-ND 4.0 license (see LICENSE).
The contents of this repository are designed for NHS organisations to use on private data.
Download files
File details
Details for the file londonaicentre_llama_serve-1.0.0.tar.gz.
File metadata
- Download URL: londonaicentre_llama_serve-1.0.0.tar.gz
- Upload date:
- Size: 12.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 45206f51f14e2eded4d708cfcebe1fc1eeb7db4fbbc1a73479e4ed0120be0288 |
| MD5 | 6ee78d00fc11614b7b494b551173291b |
| BLAKE2b-256 | 9b58d78714f06e4f19f6743cb7c8bf4aab608a83625188a12a074d0b34386a29 |
File details
Details for the file londonaicentre_llama_serve-1.0.0-py3-none-any.whl.
File metadata
- Download URL: londonaicentre_llama_serve-1.0.0-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 494494456038be3f3be2c15533e931144cc1efc1c55595879483e4f3e0f66ead |
| MD5 | 48fd5e787611feae4903165ee1114fa6 |
| BLAKE2b-256 | 0d0b9b81455dda43fe22033e4d0c35ae05b485a5f5acf0d7ca65aa1754951eb0 |
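The SHA256 digests listed above can be used to check a manually downloaded file before installing it. The following is a minimal sketch with Python's standard hashlib module; `sha256_of_file` and `matches_published_hash` are hypothetical helper names, and the file is assumed to be in the current directory under the name shown in the listing.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # File name and digest copied from the source distribution entry above.
    archive = Path("londonaicentre_llama_serve-1.0.0.tar.gz")
    published = "45206f51f14e2eded4d708cfcebe1fc1eeb7db4fbbc1a73479e4ed0120be0288"
    if archive.exists():
        print("match" if matches_published_hash(archive, published) else "MISMATCH")
    else:
        print(f"{archive} not found in the current directory")
```

The same check works for the wheel by swapping in its file name and SHA256 digest.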