A lightweight Python SDK for using local and OpenAI-compatible LLMs.

llmbridge

llmbridge is a lightweight Python SDK and CLI for using local and OpenAI-compatible LLMs. It connects to runtimes you already run, such as Ollama, LM Studio, vLLM, llama.cpp server, LocalAI, or another OpenAI-compatible API.

llmbridge does not ship model files. You install and run the model runtime yourself, then use llmbridge as a small developer-friendly bridge.

Features

Local Ollama provider
Generic OpenAI-compatible provider
CLI commands for setup checks, model listing, chat, ask, pull, and config
Streaming responses
Local config at ~/.llmbridge/config.toml
Prompt templates
Structured JSON output with Pydantic validation and retry
Typed response models

Installation

pip install llmbridge-sdk

The PyPI distribution is named llmbridge-sdk. The Python import and CLI command remain llmbridge.
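Because the distribution name and the import name differ, it can help to confirm which one is installed. The following is a minimal sketch using only the standard library. The helper name installed_version is hypothetical and not part of llmbridge.

```python
from __future__ import annotations

from importlib import metadata


def installed_version(distribution: str = "llmbridge-sdk") -> str | None:
    """Return the installed version of a distribution, or None if it is absent.

    Note that importlib.metadata looks up the *distribution* name
    (llmbridge-sdk), not the import name (llmbridge).
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version())
```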

Requirements

Python 3.10+
Ollama for the Ollama provider, or an already-running OpenAI-compatible server
No bundled LLM model files

Ollama Quickstart

Install Ollama from https://ollama.com, start it locally, then pull a model:

ollama pull llama3.1:latest

Check your setup:

llmbridge doctor
llmbridge serve-check
llmbridge models

Ask a question:

llmbridge ask "Explain FastAPI in simple words"

Set your default model:

llmbridge config set model llama3.1:latest

OpenAI-Compatible Quickstart

Use an OpenAI-compatible server such as LM Studio, vLLM, llama.cpp server, or LocalAI. The base_url should point to the API root, usually ending in /v1.

llmbridge ask "Explain FastAPI" \
  --provider openai_compatible \
  --model local-model \
  --base-url http://localhost:1234/v1

List models:

llmbridge models \
  --provider openai_compatible \
  --base-url http://localhost:1234/v1

The OpenAI-compatible provider does not download or manage models. Start your server with the model you want before calling llmbridge.
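To illustrate why base_url should point at the API root, here is a rough standard-library sketch of how endpoint URLs are typically derived from it. It assumes the server follows the usual OpenAI-style layout (GET /v1/models returning a "data" list). The helper names are hypothetical and this is not llmbridge's own implementation.

```python
from __future__ import annotations

import json
import urllib.request


def endpoint_url(base_url: str, path: str) -> str:
    """Join an API root such as http://localhost:1234/v1 with a relative path."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def parse_model_ids(payload: dict) -> list[str]:
    """Extract ids from an OpenAI-style list response: {"data": [{"id": ...}]}."""
    return [item["id"] for item in payload.get("data", [])]


def list_models(base_url: str, api_key: str | None = None, timeout: float = 10.0) -> list[str]:
    """Call GET <base_url>/models on a running server and return the model ids."""
    request = urllib.request.Request(endpoint_url(base_url, "models"))
    if api_key:
        # Many local servers ignore the key, but sending it is harmless.
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(json.load(response))


# With the API root ending in /v1, every endpoint is a simple suffix.
print(endpoint_url("http://localhost:1234/v1", "models"))
print(endpoint_url("http://localhost:1234/v1/", "/chat/completions"))
```

If base_url omitted the /v1 segment, the joined URLs would miss it too, which is a common cause of 404 responses from these servers.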

CLI Usage

Use the configured default model:

llmbridge ask "Explain FastAPI"

Override the model:

llmbridge ask "Explain FastAPI" --model gemma4:e4b

Adjust temperature:

llmbridge ask "Explain FastAPI" --temperature 0.2

Run chat with an explicit model:

llmbridge chat llama3.1:latest "Explain PostgreSQL in simple words"

Run chat against an OpenAI-compatible server:

llmbridge chat local-model "Hello" \
  --provider openai_compatible \
  --base-url http://localhost:1234/v1

Pull an Ollama model:

llmbridge pull llama3.1:latest

Streaming Usage

llmbridge chat llama3.1:latest "Explain Docker" --stream
llmbridge ask "Explain Docker" --stream

Streaming chunks are printed as they arrive. Non-streaming CLI output is trimmed before printing.

Python streaming:

from llmbridge import LLM

llm = LLM(model="llama3.1:latest")

for chunk in llm.stream("Explain Docker Compose"):
    # flush=True shows each chunk immediately instead of waiting for a newline
    print(chunk.text, end="", flush=True)
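If you also need the complete text once streaming finishes, one option is to collect the chunks while printing them. The sketch below runs without a model server: FakeChunk and fake_stream are hypothetical stand-ins, and the only thing assumed about real chunks is the .text attribute used above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeChunk:
    """Stand-in for a streamed chunk; only the .text attribute is assumed."""

    text: str


def fake_stream(text: str, size: int = 8) -> Iterator[FakeChunk]:
    """Yield the text in small pieces, imitating a streaming response."""
    for start in range(0, len(text), size):
        yield FakeChunk(text[start:start + size])


def print_and_collect(chunks: Iterable[FakeChunk]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = print_and_collect(fake_stream("Docker Compose runs multi-container apps."))
```

With a real model, passing llm.stream(...) in place of fake_stream(...) should work the same way, provided the chunks expose .text.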

Config Usage

llmbridge stores local CLI defaults in:

~/.llmbridge/config.toml

Supported config keys:

provider
model
base_url
api_key
temperature
timeout

Commands:

llmbridge config show
llmbridge config set provider ollama
llmbridge config set model llama3.1:latest
llmbridge config set base_url http://localhost:11434
llmbridge config set api_key local-secret
llmbridge config set temperature 0.2
llmbridge config set timeout 120
llmbridge config reset

For OpenAI-compatible servers:

llmbridge config set provider openai_compatible
llmbridge config set base_url http://localhost:1234/v1
llmbridge config set model local-model
llmbridge config set api_key local-secret

llmbridge config show masks stored API keys.
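The exact masking format is not documented here. As an illustration of the general idea only, a masking helper might look like the sketch below. The function name mask_secret and the choice to reveal the last four characters are assumptions, not llmbridge behavior.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret.

    Short secrets are masked completely so that nothing useful leaks.
    """
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


print(mask_secret("local-secret"))
```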

For llmbridge ask, model resolution order is:

1. --model
2. model in ~/.llmbridge/config.toml
3. LLMBRIDGE_DEFAULT_MODEL
4. llama3.1:latest
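The resolution order above can be sketched as a small function. This is an illustration of the documented order, not llmbridge's actual code. It assumes LLMBRIDGE_DEFAULT_MODEL is an environment variable and that the config file has already been loaded into a dict. The name resolve_model is hypothetical.

```python
from __future__ import annotations

import os
from typing import Mapping

FALLBACK_MODEL = "llama3.1:latest"


def resolve_model(
    cli_model: str | None,
    config: Mapping[str, str],
    env: Mapping[str, str] | None = None,
) -> str:
    """Return the first model that is set, in the documented priority order."""
    env = os.environ if env is None else env
    candidates = (
        cli_model,                           # 1. --model on the command line
        config.get("model"),                 # 2. model in config.toml
        env.get("LLMBRIDGE_DEFAULT_MODEL"),  # 3. environment variable (assumed)
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return FALLBACK_MODEL                    # 4. built-in default


print(resolve_model(None, {"model": "mistral:latest"}, env={}))
```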

Python Usage

Ollama:

from llmbridge import LLM

llm = LLM(
    provider="ollama",
    model="llama3.1:latest",
)

response = llm.chat("Explain FastAPI in simple words")
print(response.text)

OpenAI-compatible:

from llmbridge import LLM

llm = LLM(
    provider="openai_compatible",
    model="local-model",
    base_url="http://localhost:1234/v1",
)

response = llm.chat("Explain FastAPI in simple words")
print(response.text)

Message format:

response = llm.chat(
    [
        {"role": "system", "content": "You are a helpful backend architect."},
        {"role": "user", "content": "Explain PostgreSQL indexes."},
    ]
)
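For multi-turn conversations, the message list is usually assembled in code. The helper below is a hypothetical convenience, not part of llmbridge. It only builds the list of role/content dicts shown above, which you could then pass to llm.chat(...).

```python
from __future__ import annotations


def build_messages(
    user: str,
    system: str | None = None,
    history: list[dict[str, str]] | None = None,
) -> list[dict[str, str]]:
    """Build a chat message list: optional system prompt, prior turns, new user turn."""
    messages: list[dict[str, str]] = []
    if system:
        messages.append({"role": "system", "content": system})
    # Copy earlier turns so the caller's list is never mutated.
    messages.extend(dict(turn) for turn in (history or []))
    messages.append({"role": "user", "content": user})
    return messages


messages = build_messages(
    "Explain PostgreSQL indexes.",
    system="You are a helpful backend architect.",
)
print(messages)
```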

PromptTemplate Usage

Use PromptTemplate for small reusable prompts with named variables:

from llmbridge import LLM, PromptTemplate

template = PromptTemplate("Explain {topic} for a {audience}.")
prompt = template.format(topic="FastAPI", audience="backend developer")

llm = LLM(model="llama3.1:latest")
response = llm.chat(prompt)
print(response.text)

If a required variable is missing, llmbridge raises PromptTemplateError.
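To see which variables a template needs before formatting it, you can inspect the placeholders with the standard library. This sketch assumes the template uses str.format-style {name} placeholders, as in the example above. It does not use or reproduce llmbridge's PromptTemplate internals, and the helper names are hypothetical.

```python
from string import Formatter


def required_variables(template: str) -> set[str]:
    """Return the named placeholders found in a str.format-style template."""
    return {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # skip literal-only segments and empty {} fields
    }


def missing_variables(template: str, **values: str) -> set[str]:
    """Return the placeholders that were not supplied as keyword arguments."""
    return required_variables(template) - set(values)


template = "Explain {topic} for a {audience}."
print(sorted(required_variables(template)))
print(sorted(missing_variables(template, topic="FastAPI")))
```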

Structured Output Usage

LLM.structured() asks the model for JSON, validates it with a Pydantic schema, and returns a typed object:

from pydantic import BaseModel

from llmbridge import LLM


class TaskResult(BaseModel):
    title: str
    priority: str


llm = LLM(model="llama3.1:latest")
result = llm.structured(
    "Create a task for fixing a login bug",
    schema=TaskResult,
)

print(result.title)
print(result.priority)

Structured output depends on the model following instructions. llmbridge asks for JSON matching your schema, extracts JSON from the response, validates it with Pydantic, and retries when the output is invalid. If the final response still cannot be parsed or validated, llmbridge raises StructuredOutputError.
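The extract, validate, and retry steps described above can be sketched with the standard library alone. This is an illustration of the general technique, not llmbridge's implementation: a plain validation function stands in for the Pydantic schema, a canned list of replies stands in for the model, and the names InvalidOutput, extract_json, and structured are hypothetical.

```python
import json
from typing import Any, Callable


class InvalidOutput(Exception):
    """Raised when no attempt produced valid output (hypothetical stand-in)."""


def extract_json(text: str) -> Any:
    """Return the first JSON object embedded in text, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace, try the next one
        return value
    raise ValueError("no JSON object found in model output")


def validate_task(data: Any) -> dict:
    """Minimal validator: require string 'title' and 'priority' fields."""
    title, priority = data["title"], data["priority"]
    if not isinstance(title, str) or not isinstance(priority, str):
        raise TypeError("title and priority must be strings")
    return {"title": title, "priority": priority}


def structured(ask: Callable[[str], str], prompt: str,
               validate: Callable[[Any], Any], retries: int = 2) -> Any:
    """Ask, extract JSON, validate, and retry on failure up to `retries` times."""
    last_error = None
    for _attempt in range(retries + 1):
        reply = ask(prompt)
        try:
            return validate(extract_json(reply))
        except (ValueError, KeyError, TypeError) as exc:
            last_error = exc
    raise InvalidOutput(f"gave up after {retries + 1} attempts") from last_error


# Canned replies: the first is malformed, the second contains valid JSON.
replies = iter([
    "Sure! {title: Fix login bug}",
    'Here you go:\n{"title": "Fix login bug", "priority": "high"}\nDone.',
])
result = structured(lambda prompt: next(replies), "Create a task", validate_task)
print(result)
```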

SQL plan example:

from pydantic import BaseModel

from llmbridge import LLM


class SQLPlan(BaseModel):
    sql: str
    explanation: str
    tables_used: list[str]


llm = LLM(model="llama3.1:latest")
plan = llm.structured(
    "Create a SQL plan to list the latest 10 paid invoices. Do not execute SQL.",
    schema=SQLPlan,
)

print(plan.sql)

This returns a structured SQL plan only. llmbridge does not execute SQL.

Examples

Runnable examples live in the examples/ folder:

python examples/basic_chat.py
python examples/streaming_chat.py
python examples/list_models.py
python examples/custom_options.py
python examples/ask_style_usage.py
python examples/prompt_template.py
python examples/structured_output.py
python examples/structured_sql_plan.py

Troubleshooting

If Ollama is not running, you may see:

Ollama is not running at http://localhost:11434. Start Ollama and run: ollama pull llama3.1

Start Ollama and pull the selected model:

ollama pull llama3.1:latest

If the CLI says a model is missing:

Model 'llama3.1:latest' is not installed.
Run:
  llmbridge pull llama3.1:latest

Pull it:

llmbridge pull llama3.1:latest

If your Ollama server uses a different URL, set base_url to that address (the example shows the default, http://localhost:11434):

llmbridge config set base_url http://localhost:11434

Or pass it for one command:

llmbridge ask "Explain FastAPI" --base-url http://localhost:11434
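To check from Python whether an Ollama server is reachable at a given URL, a small standard-library probe can help. This sketch calls Ollama's /api/tags endpoint, which lists locally installed models. The function name ollama_is_reachable is hypothetical and not part of llmbridge.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/api/tags answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_reachable())
```

A False result points to the server not running or to a wrong base_url, not to a missing model.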

Roadmap

More provider integrations
Better structured-output controls
Tool calling
Embeddings and RAG support
Higher-level application workflows

Local Development

git clone https://github.com/iwasbugged/llmbridge.git
cd llmbridge
python3 -m pip install -e ".[dev]"

Run tests:

python3 -m pytest

Run linting:

python3 -m ruff check .
python3 -m ruff format --check .

Author

Rahul Kumar iamrahul.rk4@gmail.com

License

MIT License. See LICENSE.

Project details

Maintainers

iwasbugged

Release history

This version

0.1.0

Jun 11, 2026

Download files

Source Distribution

llmbridge_sdk-0.1.0.tar.gz (21.5 kB)

Uploaded Jun 11, 2026 Source

Built Distribution

llmbridge_sdk-0.1.0-py3-none-any.whl (19.1 kB)

Uploaded Jun 11, 2026 Python 3

File details

Details for the file llmbridge_sdk-0.1.0.tar.gz.

File metadata

Download URL: llmbridge_sdk-0.1.0.tar.gz
Upload date: Jun 11, 2026
Size: 21.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llmbridge_sdk-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`90e59c2bec82289ec991d857215f834cedb9348b7a19a95947f0e7b3eadb1ab5`
MD5	`20a5d969510d6d0c0f0e6dfc2c3176b1`
BLAKE2b-256	`a9958c9221b3560608dd76edc57dbe6b64699d909d53f73403c2fd5b14f516b3`
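To compare a downloaded file against the SHA256 digest listed above, you can compute the hash locally. The following is a minimal sketch with hashlib. The helper names are hypothetical, and the file path in the usage note is only an example of where a download might be saved.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_digest("llmbridge_sdk-0.1.0.tar.gz", "90e59c2bec82289ec991d857215f834cedb9348b7a19a95947f0e7b3eadb1ab5") should return True for an unmodified download of the source distribution.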

Provenance

The following attestation bundles were made for llmbridge_sdk-0.1.0.tar.gz:

Publisher: publish.yml on iwasbugged/llmbridge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmbridge_sdk-0.1.0.tar.gz
- Subject digest: 90e59c2bec82289ec991d857215f834cedb9348b7a19a95947f0e7b3eadb1ab5
- Sigstore transparency entry: 1787542247
- Sigstore integration time: Jun 11, 2026
Source repository:
- Permalink: iwasbugged/llmbridge@35bed082214e8d24233288775ae4d1f59a86b821
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/iwasbugged
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@35bed082214e8d24233288775ae4d1f59a86b821
- Trigger Event: release

File details

Details for the file llmbridge_sdk-0.1.0-py3-none-any.whl.

File metadata

Download URL: llmbridge_sdk-0.1.0-py3-none-any.whl
Upload date: Jun 11, 2026
Size: 19.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llmbridge_sdk-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`777460f4a73e67724eb2a238ab4fb6bdafb20bf910ec778c98eb0f93b5592598`
MD5	`82a0aa6d747c13c7037e31c1f166afa8`
BLAKE2b-256	`0ec3399cb30f173be1f36aed82a793c8e3b7eee5cb372c45a6766685c19cbc37`

Provenance

The following attestation bundles were made for llmbridge_sdk-0.1.0-py3-none-any.whl:

Publisher: publish.yml on iwasbugged/llmbridge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmbridge_sdk-0.1.0-py3-none-any.whl
- Subject digest: 777460f4a73e67724eb2a238ab4fb6bdafb20bf910ec778c98eb0f93b5592598
- Sigstore transparency entry: 1787542290
- Sigstore integration time: Jun 11, 2026
Source repository:
- Permalink: iwasbugged/llmbridge@35bed082214e8d24233288775ae4d1f59a86b821
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/iwasbugged
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@35bed082214e8d24233288775ae4d1f59a86b821
- Trigger Event: release
