A lightweight Python SDK for using local and OpenAI-compatible LLMs.
llmbridge
llmbridge is a lightweight Python SDK and CLI for using local and
OpenAI-compatible LLMs. It connects to runtimes you already run, such as Ollama,
LM Studio, vLLM, llama.cpp server, LocalAI, or another OpenAI-compatible API.
llmbridge does not ship model files. You install and run the model runtime yourself, then use llmbridge as a small developer-friendly bridge.
Features
- Local Ollama provider
- Generic OpenAI-compatible provider
- CLI commands for setup checks, model listing, chat, ask, pull, and config
- Streaming responses
- Local config at ~/.llmbridge/config.toml
- Prompt templates
- Structured JSON output with Pydantic validation and retry
- Typed response models
Installation
pip install llmbridge-sdk
The PyPI distribution is named llmbridge-sdk. The Python import and CLI command
remain llmbridge.
Requirements
- Python 3.10+
- Ollama for the Ollama provider, or an already-running OpenAI-compatible server
- No bundled LLM model files
Ollama Quickstart
Install Ollama from https://ollama.com, start it locally, then pull a model:
ollama pull llama3.1:latest
Check your setup:
llmbridge doctor
llmbridge serve-check
llmbridge models
Ask a question:
llmbridge ask "Explain FastAPI in simple words"
Set your default model:
llmbridge config set model llama3.1:latest
OpenAI-Compatible Quickstart
Use an OpenAI-compatible server such as LM Studio, vLLM, llama.cpp server, or
LocalAI. The base_url should point to the API root, usually ending in /v1.
llmbridge ask "Explain FastAPI" \
--provider openai_compatible \
--model local-model \
--base-url http://localhost:1234/v1
List models:
llmbridge models \
--provider openai_compatible \
--base-url http://localhost:1234/v1
The OpenAI-compatible provider does not download or manage models. Start your server with the model you want before calling llmbridge.
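As a rough illustration of why base_url should be the API root: OpenAI-compatible servers conventionally expose routes such as /chat/completions and /models beneath it. The helper below is a hypothetical sketch (endpoint_url is not part of llmbridge, and llmbridge's own URL handling may differ); it only shows how a root ending in /v1 combines with those routes.

```python
def endpoint_url(base_url: str, path: str) -> str:
    """Join an API root such as http://localhost:1234/v1 with a route.

    Hypothetical helper for illustration; not an llmbridge function.
    """
    # Strip stray slashes so the result has exactly one separator.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base_url = "http://localhost:1234/v1"
print(endpoint_url(base_url, "chat/completions"))
print(endpoint_url(base_url, "models"))
```

If base_url omitted the /v1 suffix, the joined routes would point outside the API root on most of these servers, which is a common cause of 404 responses.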
CLI Usage
Use the configured default model:
llmbridge ask "Explain FastAPI"
Override the model:
llmbridge ask "Explain FastAPI" --model gemma4:e4b
Adjust temperature:
llmbridge ask "Explain FastAPI" --temperature 0.2
Run chat with an explicit model:
llmbridge chat llama3.1:latest "Explain PostgreSQL in simple words"
Run chat against an OpenAI-compatible server:
llmbridge chat local-model "Hello" \
--provider openai_compatible \
--base-url http://localhost:1234/v1
Pull an Ollama model:
llmbridge pull llama3.1:latest
Streaming Usage
llmbridge chat llama3.1:latest "Explain Docker" --stream
llmbridge ask "Explain Docker" --stream
Streaming chunks are printed as they arrive. Non-streaming CLI output is trimmed before printing.
Python streaming:
from llmbridge import LLM

llm = LLM(model="llama3.1:latest")
for chunk in llm.stream("Explain Docker Compose"):
    print(chunk.text, end="")
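If you also need the complete reply after streaming, one option is to accumulate the chunks while printing them. The sketch below runs without a server by using a stand-in generator: FakeChunk and fake_stream are hypothetical and model only the .text attribute shown above. With a real client you would pass llm.stream(...) to collect instead.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeChunk:
    """Stand-in for a streamed chunk; only the .text attribute is modeled."""
    text: str


def fake_stream() -> Iterator[FakeChunk]:
    # Hypothetical replacement for llm.stream(...), so no server is needed.
    for piece in ["Docker ", "Compose ", "runs ", "multi-container apps."]:
        yield FakeChunk(piece)


def collect(chunks: Iterable[FakeChunk]) -> str:
    """Print each chunk as it arrives and return the joined text."""
    parts = []
    for chunk in chunks:
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = collect(fake_stream())
```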
Config Usage
llmbridge stores local CLI defaults in:
~/.llmbridge/config.toml
Supported config keys:
- provider
- model
- base_url
- api_key
- temperature
- timeout
Commands:
llmbridge config show
llmbridge config set provider ollama
llmbridge config set model llama3.1:latest
llmbridge config set base_url http://localhost:11434
llmbridge config set api_key local-secret
llmbridge config set temperature 0.2
llmbridge config set timeout 120
llmbridge config reset
For OpenAI-compatible servers:
llmbridge config set provider openai_compatible
llmbridge config set base_url http://localhost:1234/v1
llmbridge config set model local-model
llmbridge config set api_key local-secret
llmbridge config show masks stored API keys.
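The exact masking format is not documented here. As a minimal sketch of the idea, assuming a scheme that keeps only the last few characters visible (mask_secret and its visible parameter are hypothetical, not llmbridge's implementation):

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret.

    Hypothetical helper; llmbridge may mask keys differently.
    """
    if len(value) <= visible:
        # Very short secrets are hidden completely.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


print(mask_secret("local-secret"))  # prints ********cret
```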
For llmbridge ask, model resolution order is:
1. --model
2. model in ~/.llmbridge/config.toml
3. LLMBRIDGE_DEFAULT_MODEL
4. llama3.1:latest
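The resolution order (the --model flag, then model in the config file, then the LLMBRIDGE_DEFAULT_MODEL environment variable, then llama3.1:latest) can be read as a first-non-empty chain. The function below is a hypothetical sketch of that logic, not llmbridge's code; resolve_model, its parameters, and FALLBACK_MODEL are illustrative names.

```python
import os
from typing import Mapping, Optional

FALLBACK_MODEL = "llama3.1:latest"


def resolve_model(
    cli_model: Optional[str],
    config: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first non-empty model name in the documented order.

    Hypothetical sketch; `config` stands in for the parsed config file.
    """
    return (
        cli_model                              # 1. --model flag
        or config.get("model")                 # 2. config file
        or env.get("LLMBRIDGE_DEFAULT_MODEL")  # 3. environment variable
        or FALLBACK_MODEL                      # 4. built-in default
    )


print(resolve_model(None, {"model": "mistral:latest"}, env={}))
```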
Python Usage
Ollama:
from llmbridge import LLM
llm = LLM(
    provider="ollama",
    model="llama3.1:latest",
)
response = llm.chat("Explain FastAPI in simple words")
print(response.text)
OpenAI-compatible:
from llmbridge import LLM
llm = LLM(
    provider="openai_compatible",
    model="local-model",
    base_url="http://localhost:1234/v1",
)
response = llm.chat("Explain FastAPI in simple words")
print(response.text)
Message format:
response = llm.chat(
    [
        {"role": "system", "content": "You are a helpful backend architect."},
        {"role": "user", "content": "Explain PostgreSQL indexes."},
    ]
)
PromptTemplate Usage
Use PromptTemplate for small reusable prompts with named variables:
from llmbridge import LLM, PromptTemplate
template = PromptTemplate("Explain {topic} for a {audience}.")
prompt = template.format(topic="FastAPI", audience="backend developer")
llm = LLM(model="llama3.1:latest")
response = llm.chat(prompt)
print(response.text)
If a required variable is missing, llmbridge raises PromptTemplateError.
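To illustrate the missing-variable check without depending on llmbridge internals, here is a standard-library sketch of the same idea. MissingVariableError, required_variables, and render are hypothetical stand-ins; the real PromptTemplate and PromptTemplateError may behave differently in detail.

```python
from string import Formatter


class MissingVariableError(KeyError):
    """Stand-in for an error like PromptTemplateError (hypothetical)."""


def required_variables(template: str) -> set[str]:
    """Collect the named {placeholders} that appear in a template string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render(template: str, **values: str) -> str:
    """Format the template, failing early if any named variable is absent."""
    missing = required_variables(template) - set(values)
    if missing:
        raise MissingVariableError(f"missing variables: {sorted(missing)}")
    return template.format(**values)


print(render("Explain {topic} for a {audience}.", topic="FastAPI", audience="backend developer"))
```

Checking the placeholders up front gives one error that names every missing variable, rather than failing on the first one that str.format encounters.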
Structured Output Usage
LLM.structured() asks the model for JSON, validates it with a Pydantic schema,
and returns a typed object:
from pydantic import BaseModel
from llmbridge import LLM

class TaskResult(BaseModel):
    title: str
    priority: str

llm = LLM(model="llama3.1:latest")
result = llm.structured(
    "Create a task for fixing a login bug",
    schema=TaskResult,
)
print(result.title)
print(result.priority)
Structured output depends on the model following instructions. llmbridge asks for
JSON matching your schema, extracts JSON from the response, validates it with
Pydantic, and retries when the output is invalid. If the final response still
cannot be parsed or validated, llmbridge raises StructuredOutputError.
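The extract, validate, and retry steps described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not llmbridge's implementation: a hand-written validate function stands in for Pydantic, InvalidOutput stands in for StructuredOutputError, and ask is a hypothetical callable that returns the model's raw text.

```python
import json
from typing import Callable


class InvalidOutput(Exception):
    """Stand-in for an error like StructuredOutputError (hypothetical)."""


def extract_json(text: str) -> dict:
    """Pull the outermost {...} span out of a reply that may include prose."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    return json.loads(text[start:end + 1])


def validate(data: dict) -> dict:
    """Minimal schema check standing in for Pydantic validation."""
    for field in ("title", "priority"):
        if not isinstance(data.get(field), str):
            raise ValueError(f"field {field!r} must be a string")
    return data


def structured(ask: Callable[[str], str], prompt: str, retries: int = 2) -> dict:
    """Call the model until its reply parses and validates, or give up."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return validate(extract_json(ask(prompt)))
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            last_error = exc
    raise InvalidOutput(f"giving up after {retries + 1} attempts: {last_error}")


# Simulated model: the first reply has no JSON, the second one is valid.
replies = iter(["Sure! Here you go.", 'Result: {"title": "Fix login bug", "priority": "high"}'])
result = structured(lambda prompt: next(replies), "Create a task for fixing a login bug")
print(result)
```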
SQL plan example:
from pydantic import BaseModel
from llmbridge import LLM

class SQLPlan(BaseModel):
    sql: str
    explanation: str
    tables_used: list[str]

llm = LLM(model="llama3.1:latest")
plan = llm.structured(
    "Create a SQL plan to list the latest 10 paid invoices. Do not execute SQL.",
    schema=SQLPlan,
)
print(plan.sql)
This returns a structured SQL plan only. llmbridge does not execute SQL.
Examples
Runnable examples live in the examples/ folder:
python examples/basic_chat.py
python examples/streaming_chat.py
python examples/list_models.py
python examples/custom_options.py
python examples/ask_style_usage.py
python examples/prompt_template.py
python examples/structured_output.py
python examples/structured_sql_plan.py
Troubleshooting
If Ollama is not running, you may see:
Ollama is not running at http://localhost:11434. Start Ollama and run: ollama pull llama3.1
Start Ollama and pull the selected model:
ollama pull llama3.1:latest
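To confirm from Python whether Ollama is reachable, a small probe against its HTTP API can help. The sketch below requests /api/tags, the Ollama endpoint that lists locally installed models; ollama_is_running is a hypothetical helper and is independent of how llmbridge doctor performs its own checks.

```python
import http.client
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at base_url.

    Hypothetical helper; it probes Ollama's /api/tags endpoint.
    """
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or malformed URL/response.
        return False


print("Ollama reachable:", ollama_is_running())
```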
If the CLI says a model is missing:
Model 'llama3.1:latest' is not installed.
Run:
llmbridge pull llama3.1:latest
Pull it:
llmbridge pull llama3.1:latest
If your Ollama server uses a different URL:
llmbridge config set base_url http://localhost:11434
Or pass it for one command:
llmbridge ask "Explain FastAPI" --base-url http://localhost:11434
Roadmap
- More provider integrations
- Better structured-output controls
- Tool calling
- Embeddings and RAG support
- Higher-level application workflows
Local Development
git clone https://github.com/iwasbugged/llmbridge.git
cd llmbridge
python3 -m pip install -e ".[dev]"
Run tests:
python3 -m pytest
Run linting:
python3 -m ruff check .
python3 -m ruff format --check .
Author
Rahul Kumar iamrahul.rk4@gmail.com
License
MIT License. See LICENSE.