Serve llama models locally

Project description

Llama serve

Serve llama models locally.

⬇️ Downloads weights from S3
📦 Unpacks
🚀 Serves via a local OpenAI-compatible server

Prerequisites

Software

Python 3.12

Hardware

A GPU with >=24GB VRAM (tested on NVIDIA A30)
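The prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of this package: it assumes `nvidia-smi` is on the PATH, and the function names, the lenient 24,000 MiB threshold, and the reading of "Python 3.12" as exactly version 3.12 are all assumptions made for illustration.

```python
import subprocess
import sys

# Assumption: a lenient reading of ">=24GB"; drivers report slightly different totals.
MIN_VRAM_MIB = 24_000


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one total-memory value (in MiB) per non-empty line."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram_mib(result.stdout)


def has_enough_vram(vram_mib: list[int], minimum: int = MIN_VRAM_MIB) -> bool:
    """Return True if at least one GPU meets the minimum."""
    return any(value >= minimum for value in vram_mib)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Assumption: only the documented version (3.12) is treated as supported."""
    return tuple(version_info[:2]) == (3, 12)
```

Calling `has_enough_vram(query_vram_mib())` on the target machine gives a quick yes/no answer. Other Python versions may well work; the check only reflects what is listed here.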

Configuration

Create a file called .env in the directory from which you will run this package, and populate it with the details you have been provided, in the following format:

MODEL_NAME=
WEIGHTS_ID=
WEIGHTS_KEY=
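A quick way to catch an incomplete .env before starting the server is sketched below. This is an illustrative helper, not part of the package: the function names are hypothetical, and it assumes plain KEY=VALUE lines. The package's own parsing of quotes, `export` prefixes, or comments may differ.

```python
from pathlib import Path

# The three keys listed in the configuration format above.
REQUIRED_KEYS = ("MODEL_NAME", "WEIGHTS_ID", "WEIGHTS_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def check_env_file(path: str = ".env") -> list[str]:
    """Read the file at `path` and report which required keys still need a value."""
    return missing_keys(Path(path).read_text(encoding="utf-8"))
```

An empty list from `check_env_file()` means all three keys have a value; it says nothing about whether the values are valid.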

Installation

(Recommended) Create a virtual environment and activate it:

python -m venv .venv
source .venv/bin/activate

Install this package: pip install londonaicentre-llama-serve.

Usage

CLI

Start the server as follows: llamaserve [args].

Command line arguments:

Argument	Description
-v, --verbose	Enable debug output (optional)
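When the server is launched from a script, it helps to wait until it accepts connections before sending requests. The sketch below is one way to do that, under stated assumptions: port 5000 is taken from the client example in this document, the helper names and timeouts are hypothetical, and an open port only shows that the process is listening, not that the model has finished loading.

```python
import socket
import subprocess
import time


def start_server(verbose: bool = False) -> subprocess.Popen:
    """Launch the documented `llamaserve` command, optionally with --verbose."""
    command = ["llamaserve"] + (["--verbose"] if verbose else [])
    return subprocess.Popen(command)


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); pause before retrying.
            time.sleep(interval)
    return False


# Example usage (assumes the package is installed and .env is in place):
#   server = start_server(verbose=True)
#   ready = wait_for_port("localhost", 5000, timeout=600.0)
```

Downloading and unpacking weights may take far longer than the default timeout, so a generous value is sensible on first run.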

Clients

OpenAI (example)

Interact with the server using the OpenAI client in Python:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # the local server's address
    api_key="blank",  # placeholder value; the client requires a non-empty key
)

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[
        {"role": "system", "content": "You are an LLM named gpt-4o"},
        {"role": "user", "content": "Hello"}
    ]
)

print(response.choices[0].message.content)
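Because the server is OpenAI-compatible, the same call can be made without the OpenAI client. The sketch below uses only the standard library and posts to the chat completions route that the client example above implies (base URL plus `/chat/completions`). The function names are hypothetical, and the response shape is assumed to follow the OpenAI chat completions format.

```python
import json
import urllib.request


def build_chat_request(
    model: str,
    user_message: str,
    base_url: str = "http://localhost:5000/v1",
) -> urllib.request.Request:
    """Build a POST request equivalent to client.chat.completions.create(...)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer blank",  # same placeholder key as the client example
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the first choice's message content."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("<MODEL_NAME>", "Hello"))` should return the same kind of text as the client example.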

License

This project uses the CC BY-NC-ND 4.0 license (see LICENSE).

The contents of this repository are designed for NHS organisations to use on private data.

Project details

Release history

1.2.0: Jan 22, 2026
1.1.3: Dec 22, 2025
1.1.2: Dec 6, 2025
1.1.1: Dec 5, 2025
1.1.0: Nov 21, 2025 (this version)
1.0.0: Nov 21, 2025

Download files

Download the file for your platform.

Source Distribution

londonaicentre_llama_serve-1.1.0.tar.gz (12.5 kB)

Uploaded Nov 21, 2025 Source

Built Distribution

londonaicentre_llama_serve-1.1.0-py3-none-any.whl (12.5 kB)

Uploaded Nov 21, 2025 Python 3

File details

Details for the file londonaicentre_llama_serve-1.1.0.tar.gz.

File metadata

Download URL: londonaicentre_llama_serve-1.1.0.tar.gz
Upload date: Nov 21, 2025
Size: 12.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for londonaicentre_llama_serve-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`041b44bbc37b87188265c179c44b6653bb6ca8762377c553598580a4c65de76e`
MD5	`b42edc4dbb25fe2bf37b36213d6b288f`
BLAKE2b-256	`277c88690ab0d46ded774b7bbaa28c1c14359b3f687f322b568e235d8078901d`
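A downloaded file can be checked against the SHA256 digest listed above. The sketch below uses only the standard library; the function names are hypothetical, and the path passed in is wherever the file was saved locally.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example usage with the source distribution's published digest:
#   matches_published_hash(
#       "londonaicentre_llama_serve-1.1.0.tar.gz",
#       "041b44bbc37b87188265c179c44b6653bb6ca8762377c553598580a4c65de76e",
#   )
```

The same functions apply to the wheel, using the SHA256 value listed for that file.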

Provenance

The following attestation bundles were made for londonaicentre_llama_serve-1.1.0.tar.gz:

Publisher: llamaserve-build-and-publish.yml on londonaicentre/GenoLlama

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: londonaicentre_llama_serve-1.1.0.tar.gz
- Subject digest: 041b44bbc37b87188265c179c44b6653bb6ca8762377c553598580a4c65de76e
- Sigstore transparency entry: 714012802
- Sigstore integration time: Nov 21, 2025
Source repository:
- Permalink: londonaicentre/GenoLlama@25cffd207224a98a2d757a1dda906afb6e5f500f
- Branch / Tag: refs/tags/llamaserve-v1.1.0
- Owner: https://github.com/londonaicentre
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: llamaserve-build-and-publish.yml@25cffd207224a98a2d757a1dda906afb6e5f500f
- Trigger Event: release

File details

Details for the file londonaicentre_llama_serve-1.1.0-py3-none-any.whl.

File metadata

Download URL: londonaicentre_llama_serve-1.1.0-py3-none-any.whl
Upload date: Nov 21, 2025
Size: 12.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for londonaicentre_llama_serve-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cc3e7dbacc0df3e4a511ec96adabd05b0d2f3f64fe73c9d6dd484ae00fe354f0`
MD5	`1411fc61b027c903a4a0660cb998ddbf`
BLAKE2b-256	`0f7209126b21afd83b0013c489c9e34fe6cb2c1f8f6dd1bba1c24d3724ad1692`

Provenance

The following attestation bundles were made for londonaicentre_llama_serve-1.1.0-py3-none-any.whl:

Publisher: llamaserve-build-and-publish.yml on londonaicentre/GenoLlama

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: londonaicentre_llama_serve-1.1.0-py3-none-any.whl
- Subject digest: cc3e7dbacc0df3e4a511ec96adabd05b0d2f3f64fe73c9d6dd484ae00fe354f0
- Sigstore transparency entry: 714012803
- Sigstore integration time: Nov 21, 2025
Source repository:
- Permalink: londonaicentre/GenoLlama@25cffd207224a98a2d757a1dda906afb6e5f500f
- Branch / Tag: refs/tags/llamaserve-v1.1.0
- Owner: https://github.com/londonaicentre
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: llamaserve-build-and-publish.yml@25cffd207224a98a2d757a1dda906afb6e5f500f
- Trigger Event: release
