Optimising LLM proxy
Handle: boost
URL: http://localhost:34131/
boost is an optimising LLM proxy with an OpenAI-compatible API.
Documentation
- Features
- Starting
- Configuration
- API
- Environment Variables Reference
- Built-in Modules Reference
- Custom Modules Guide
- Standalone Usage Guide
- Boost Starter repo
Features
OpenAI-compatible API
Acts as a drop-in proxy for OpenAI APIs, compatible with most LLM providers and clients. Boost can be used as a "plain" proxy to combine multiple LLM backends behind a single endpoint with a single API key.
POST http://localhost:34131/v1/chat/completions
{
"model": "llama3.1",
"messages": [{ "role": "user", "content": "Tell me about LLMs" }]
}
Modules
Run custom code within or instead of a chat completion to fetch external data, improve reasoning, trace inference, and more.
POST http://localhost:34131/v1/chat/completions
{
"model": "klmbr-llama3.1",
"messages": [{ "role": "user", "content": "Suggest me a random color" }]
}
Boost ships with many built-in modules covering various functions. You can use them directly or as a base for your own creations.
- markov
- concept
- nbs
- dnd
- promx
- dot
- klmbr
- r0
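Each enabled module is exposed as its own model: as the `klmbr-llama3.1` request above shows, you select a module by prefixing the base model ID with the module name. The helper below is only an illustrative sketch of that naming scheme (it is not part of boost; `serve_base` mirrors the `HARBOR_BOOST_BASE_MODELS` option):

```python
def boosted_model_ids(modules, base_models, serve_base=True):
    """Build the model IDs a boost-like proxy would advertise:
    one '<module>-<model>' entry per module/model pair, plus the
    unmodified base models when serve_base is enabled."""
    ids = list(base_models) if serve_base else []
    for module in modules:
        for model in base_models:
            ids.append(f"{module}-{model}")
    return ids

ids = boosted_model_ids(["klmbr", "rcn"], ["llama3.1:8b"])
# ids == ["llama3.1:8b", "klmbr-llama3.1:8b", "rcn-llama3.1:8b"]
```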
Scripting
Creating custom modules is a first-class feature and one of the main use-cases for Harbor Boost.
# Simplest echo module replies back
# with the last message from the input
async def apply(llm, chat):
    await llm.emit_message(prompt=chat.tail.content)
See the Custom Modules guide for more information on how to create your own modules and an overview of the available interfaces.
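To get a feel for the `apply` interface without running the full service, you can drive an echo-style module with stub objects. The stubs below only mimic the `llm.emit_message` and `chat.tail` shapes used above; they are illustrative stand-ins, not boost's real classes:

```python
import asyncio
from dataclasses import dataclass, field

# Stub stand-ins for boost's real interfaces, just enough
# to exercise an apply()-style module outside the service.
@dataclass
class StubMessage:
    role: str
    content: str

@dataclass
class StubChat:
    messages: list

    @property
    def tail(self):
        # Last message in the conversation
        return self.messages[-1]

@dataclass
class StubLLM:
    emitted: list = field(default_factory=list)

    async def emit_message(self, prompt):
        self.emitted.append(prompt)

# The echo module from above
async def apply(llm, chat):
    await llm.emit_message(prompt=chat.tail.content)

llm = StubLLM()
chat = StubChat([StubMessage("user", "Tell me about LLMs")])
asyncio.run(apply(llm, chat))
# llm.emitted == ["Tell me about LLMs"]
```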
Starting
Start with Harbor
# [Optional] pre-build the image
harbor build boost
# Start the service
harbor up boost
- Harbor connects boost with:
  - all included LLM backends (ollama, llamacpp, vllm, etc.)
  - optillm as a backend
  - webui and dify frontends
# Get the URL for the boost service
harbor url boost
# Open default boost endpoint in the browser
harbor open boost
Start standalone
docker run \
-e "HARBOR_BOOST_OPENAI_URLS=http://172.17.0.1:11434/v1" \
-e "HARBOR_BOOST_OPENAI_KEYS=sk-ollama" \
-e "HARBOR_BOOST_MODULES=dot;klmbr;promx;autotemp;markov;" \
-e "HARBOR_BOOST_BASE_MODELS=true" \
-e "HARBOR_BOOST_API_KEY=sk-boost" \
-p 34131:8000 \
ghcr.io/av/harbor-boost:latest
See standalone usage guide below.
Configuration
Configuration can be performed via Harbor CLI, harbor config, harbor env or the .env file.
All of the above ways are interchangeable and result in setting environment variables for the service.
Harbor CLI
Specific options can be set using harbor CLI:
# Enable/Disable a module
harbor boost modules add <module>
harbor boost modules rm <module>
# Set a parameter
harbor boost <module> <parameter>
harbor boost <module> <parameter> <value>
# See boost/module help entries
# for more info
harbor boost --help
harbor boost klmbr --help
harbor boost rcn --help
harbor boost g1 --help
# Additional OpenAI-compatible APIs to boost
harbor boost urls add http://localhost:11434/v1
harbor boost urls rm http://localhost:11434/v1
harbor boost urls rm 0 # by index
harbor boost urls ls
# Keys for the OpenAI-compatible APIs to boost. Semicolon-separated list.
# ⚠️ These are index-matched with the URLs. Even if the API doesn't require a key,
# you still need to provide a placeholder for it.
harbor boost keys add sk-ollama
harbor boost keys rm sk-ollama
harbor boost keys rm 0 # by index
harbor boost keys ls
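The index-matching constraint between URLs and keys can be pictured as a positional zip over the two semicolon-separated lists. This helper is only an illustration of the rule, not boost's internal code:

```python
def pair_backends(urls, keys):
    """Pair each downstream URL with its API key by position.
    Every URL needs a key entry, even if just a placeholder."""
    url_list = [u for u in urls.split(";") if u]
    key_list = [k for k in keys.split(";") if k]
    if len(url_list) != len(key_list):
        raise ValueError(
            f"{len(url_list)} URLs but {len(key_list)} keys: "
            "the lists must be index-matched"
        )
    return dict(zip(url_list, key_list))

backends = pair_backends("http://localhost:11434/v1", "sk-ollama")
# backends == {"http://localhost:11434/v1": "sk-ollama"}
```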
Harbor Config
More options are available via harbor config.
# See all available options
harbor config ls boost
# Some of the available options
harbor config set boost.host.port 34131
harbor config set boost.api.key sk-boost
harbor config set boost.api.keys "sk-user1;sk-user2;sk-user3"
Below are additional configuration options that do not have an alias in the Harbor CLI, so you need to use harbor config directly. For example: harbor config set boost.intermediate_output true.
Environment Variables
The most comprehensive way to configure boost is via environment variables. You can set them in the .env file or via harbor env.
# Using harbor env
harbor env boost HARBOR_BOOST_API_KEY_MISTRAL sk-mistral
# Or open one of these in your text editor
open $(harbor home)/.env
open $(harbor home)/boost/override.env
See all supported environment variables in the Environment Variables Reference.
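For example, a minimal override.env combining variables shown elsewhere on this page (the values are illustrative):

```
# Downstream APIs (index-matched URL/key lists)
HARBOR_BOOST_OPENAI_URLS=http://localhost:11434/v1
HARBOR_BOOST_OPENAI_KEYS=sk-ollama
# Enabled modules and the proxy's own API key
HARBOR_BOOST_MODULES=klmbr;rcn;g1
HARBOR_BOOST_API_KEY=sk-boost
```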
API
boost works as an OpenAI-compatible API proxy. It'll query configured downstream services for which models they serve and provide "boosted" wrappers in its own API.
See the http catalog entry for some sample requests.
Authorization
When configured to require an API key, you can provide the API key in the Authorization header.
<!-- All three versions are accepted -->
Authorization: sk-boost
Authorization: bearer sk-boost
Authorization: Bearer sk-boost
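All three header forms carry the same key; a tolerant parser for them can be sketched as follows. This is an illustration of the accepted formats, not boost's actual implementation:

```python
def extract_api_key(header_value):
    """Accept 'sk-x', 'bearer sk-x' and 'Bearer sk-x' alike."""
    value = header_value.strip()
    prefix, _, rest = value.partition(" ")
    if prefix.lower() == "bearer" and rest:
        return rest.strip()
    return value

for header in ("sk-boost", "bearer sk-boost", "Bearer sk-boost"):
    assert extract_api_key(header) == "sk-boost"
```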
GET /v1/models
List boosted models. boost serves additional models according to the enabled modules. For example:
[
{
// Original, unmodified model proxy
"id": "llama3.1:8b"
// ...
},
{
// LLM with klmbr technique applied
"id": "klmbr-llama3.1:8b"
// ...
},
{
// LLM with rcn technique applied
"id": "rcn-llama3.1:8b"
// ...
}
]
POST /v1/chat/completions
Chat completions endpoint.
- Proxies all parameters to the downstream API, so custom payloads are supported out of the box, for example the json format parameter for Ollama
- Supports streaming completions and tool calls
POST http://localhost:34131/v1/chat/completions
{
"model": "llama3.1:8b",
"messages": [
{ "role": "user", "content": "Suggest me a random color" }
],
"stream": true
}
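With "stream": true, the response follows the standard OpenAI server-sent-events shape: data: lines carrying JSON chunks, terminated by data: [DONE]. A minimal client-side parser might look like this (the sample lines are made up for illustration):

```python
import json

def parse_sse_chunks(lines):
    """Collect the content deltas from OpenAI-style SSE lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)

sample = [
    'data: {"choices": [{"delta": {"content": "Cerulean"}}]}',
    'data: {"choices": [{"delta": {"content": " blue"}}]}',
    "data: [DONE]",
]
# parse_sse_chunks(sample) == "Cerulean blue"
```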
GET /events/:stream_id
Listen to a specific stream of events (associated with a single completion workflow). The stream ID is a unique identifier of the LLM instance processing the request (you may decide to advertise/pass it to the client in the workflow's code).
GET /health
Health check endpoint. Returns { status: 'ok' } if the service is running.
Standalone usage
You can run boost as a standalone Docker container. See harbor-boost package in GitHub Container Registry.
# [Optional] pre-pull the image
docker pull ghcr.io/av/harbor-boost:latest
# Start the container
# 172.17.0.1 is the default IP of the host when running on Linux,
# so the example below points to a local Ollama instance.
# HARBOR_BOOST_MODULES and HARBOR_BOOST_KLMBR_PERCENTAGE configure the boost modules.
# The -v mount is optional and adds a folder with custom modules.
docker run \
  -e "HARBOR_BOOST_OPENAI_URLS=http://172.17.0.1:11434/v1" \
  -e "HARBOR_BOOST_OPENAI_KEYS=sk-ollama" \
  -e "HARBOR_BOOST_MODULES=klmbr;rcn;g1" \
  -e "HARBOR_BOOST_KLMBR_PERCENTAGE=60" \
  -v /path/to/custom_modules/folder:/app/custom_modules \
  -p 8004:8000 \
  ghcr.io/av/harbor-boost:latest
# In a separate terminal (or after detaching the container)
curl http://localhost:8004/health
curl http://localhost:8004/v1/models
See the boost-starter repo for a minimal example to get started.
File details
Details for the file harbor_boost-0.3.19.tar.gz.
File metadata
- Download URL: harbor_boost-0.3.19.tar.gz
- Upload date:
- Size: 76.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6a489a29d68183e0244fad74211bf00c487747900ce5a21dc4909ea47b2ea0a1 |
| MD5 | eade3bba8e6b920fe252f03f257caadc |
| BLAKE2b-256 | dd4163cf9ef9f7521fce340b2425aa3f489acb6356c2c2b984dcb2eced9ffa78 |
File details
Details for the file harbor_boost-0.3.19-py3-none-any.whl.
File metadata
- Download URL: harbor_boost-0.3.19-py3-none-any.whl
- Upload date:
- Size: 97.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e8d280bbaffb5f8a8e62e914e25798bf2ea08d341a14ff07041fe9c19378a67f |
| MD5 | d3000f7016403e410e44a7dc5e9410af |
| BLAKE2b-256 | c961b203643c3d32687bc7ee7af8c4cca1895b4cd2dbec25137fa9958ebdb745 |