Python SDK for Fallbakit local-first chat completions.

Fallbakit Python SDK

Python SDK for Fallbakit's OpenAI-compatible, local-first chat completions API.

Fallbakit routes to the customer's local Ollama, oMLX, or vLLM runner through the open-source tunnel agent first, then falls back to configured BYOK cloud providers when local inference is unavailable and fallback is allowed.
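The routing order described above can be pictured with a small sketch. This is not the router's implementation; `route_request`, `local_available`, and `fallback_allowed` are hypothetical names used only to illustrate the local-first decision.

```python
def route_request(local_available: bool, fallback_allowed: bool = True) -> str:
    """Return where a request would be served under a local-first policy."""
    if local_available:
        # Local runner (Ollama, oMLX, or vLLM) reached through the tunnel agent.
        return "local"
    if fallback_allowed:
        # Configured BYOK cloud provider.
        return "cloud"
    raise RuntimeError("local inference unavailable and fallback not allowed")


print(route_request(local_available=True))
print(route_request(local_available=False))
```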

Install

pip install fallbakit

For local development from this repository:

python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[test,openai]"

Configuration

export FALLBAKIT_API_KEY=or_your_application_key
export FALLBAKIT_BASE_URL=https://api.fallbakit.com

Keep FALLBAKIT_API_KEY in your environment or a secret manager rather than in source code, then pass it explicitly when constructing the client. The base_url argument defaults to https://api.fallbakit.com; for local development, pass base_url="http://localhost:8080".
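One way to gather these values in application code is sketched below. `load_settings` and `DEFAULT_BASE_URL` are hypothetical names, not part of the SDK; the sketch only assumes the two environment variables and the default URL mentioned above.

```python
import os
from typing import Mapping

DEFAULT_BASE_URL = "https://api.fallbakit.com"


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect client arguments from environment-style variables."""
    api_key = env.get("FALLBAKIT_API_KEY")
    if not api_key:
        raise RuntimeError("FALLBAKIT_API_KEY is not set")
    return {
        "api_key": api_key,
        # Fall back to the documented default when no override is set.
        "base_url": env.get("FALLBAKIT_BASE_URL", DEFAULT_BASE_URL),
    }


# A plain dict stands in for os.environ so the sketch runs anywhere.
settings = load_settings({"FALLBAKIT_API_KEY": "or_example"})
print(settings["base_url"])
```

The resulting dictionary could then be passed along as `Fallbakit(**settings)`, since the client accepts `api_key` and `base_url`.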

For local examples in this repository:

cp .env.example .env.local

.env.example is set up for the local developer stack:

FALLBAKIT_API_KEY=or_your_generated_api_key
FALLBAKIT_BASE_URL=http://localhost:8080
FALLBAKIT_OPENAI_BASE_URL=http://localhost:8080/v1
FALLBAKIT_MODEL=llama3.2
FALLBAKIT_FALLBACK_PROVIDER=openai
FALLBAKIT_FALLBACK_MODEL=gpt-4o-mini
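Scripts that cannot rely on the shell having sourced the file could read these KEY=VALUE lines directly. The following is a minimal sketch, not part of the SDK; `parse_env` is a hypothetical helper, and it deliberately ignores quoting, `export` prefixes, and variable expansion.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


sample = """
# local developer stack
FALLBAKIT_BASE_URL=http://localhost:8080
FALLBAKIT_MODEL=llama3.2
"""
print(parse_env(sample)["FALLBAKIT_MODEL"])
```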

Basic Chat

import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)

print(response["choices"][0]["message"]["content"])

Official OpenAI SDK

Fallbakit also works with the official OpenAI Python SDK because the router exposes POST /v1/chat/completions. Use a Fallbakit application API key and include /v1 in the OpenAI SDK base URL.

python -m pip install -e ".[openai]"
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FALLBAKIT_API_KEY"],
    base_url=os.environ.get("FALLBAKIT_OPENAI_BASE_URL", "http://localhost:8080/v1"),
)

completion = client.chat.completions.create(
    model=os.environ.get("FALLBAKIT_MODEL", "llama3.2"),
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)

print(completion.choices[0].message.content)

Streaming

import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])

for chunk in client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Stream a short answer."}],
    stream=True,
):
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)
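When the full text is needed after streaming finishes, the same chunk shape can be accumulated instead of printed. This is a sketch under the assumption that every chunk is a dict shaped like the ones the loop above reads; `collect_stream_text` is a hypothetical helper, and the sample chunks are made up so the code runs without a router.

```python
from typing import Iterable


def collect_stream_text(chunks: Iterable[dict]) -> str:
    """Join the content deltas of a streamed response into one string."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # tolerate chunks that carry no choices
        delta = choices[0].get("delta", {})
        # Content may be missing or None on some chunks.
        parts.append(delta.get("content") or "")
    return "".join(parts)


# Stand-in chunks; a real call would pass the iterator returned with stream=True.
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},
]
print(collect_stream_text(fake_chunks))
```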

Cloud Fallback Controls

# Reuses the client created in Basic Chat.
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Summarize hybrid inference."}],
    fallback_provider="openai",
    fallback_model="gpt-4o-mini",
)

Supported Fallbakit controls:

fallback: allow cloud fallback when local cannot serve. Defaults to True.
force_local: require local routing.
fallback_provider: request a specific configured BYOK fallback provider.
fallback_model: request a specific cloud model for fallback.
cloud_model_only: skip local routing and call the selected cloud provider.
local_model_only: prevent cloud fallback.
extra_body: pass additional OpenAI-compatible or Fallbakit-specific request body fields.
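The controls listed above can be grouped into a few common request shapes. The helper below is a hypothetical sketch, not part of the SDK: the mode names are invented for illustration, and it assumes `cloud_model_only` is a boolean flag, which the list does not state explicitly.

```python
def fallback_controls(mode: str, provider: str = "openai", model: str = "gpt-4o-mini") -> dict:
    """Build keyword arguments for one of three illustrative routing modes."""
    if mode == "local_first":
        # Default behaviour, with an explicit fallback target.
        return {"fallback": True, "fallback_provider": provider, "fallback_model": model}
    if mode == "local_only":
        # Disallow cloud fallback entirely.
        return {"fallback": False}
    if mode == "cloud_only":
        # Assumption: cloud_model_only accepts True to skip local routing.
        return {"cloud_model_only": True, "fallback_provider": provider, "fallback_model": model}
    raise ValueError(f"unknown mode: {mode}")


print(fallback_controls("local_only"))
```

The returned dictionary is intended to be unpacked into a request, for example `client.chat.completions.create(model="llama3.2", messages=messages, **fallback_controls("local_first"))`.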

Timeouts

import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"], timeout=30)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=10,
)

Per-request timeout overrides the client default.

Errors

API failures raise FallbakitAPIError with:

status_code
code
response

import os

from fallbakit import Fallbakit, FallbakitAPIError

try:
    Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"]).chat.completions.create(
        model="llama3.2",
        messages=[{"role": "user", "content": "Hello"}],
    )
except FallbakitAPIError as error:
    print(error.status_code, error.code, error)

Development

pytest
python examples/minimal_chat.py
python examples/streaming_chat.py
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py

Local Verification

Use this flow to confirm the package installs, the unit tests pass, and the examples can talk to a local Fallbakit router.

Start a local model runtime in a separate terminal:

ollama serve
ollama pull llama3.2

Start the local Fallbakit stack from the repository root:

cp configs/local-infra.env.example configs/local-infra.env
cp configs/api.env.example configs/api.env
scripts/dev-up.sh
# In the dashboard, create a runner and export its generated RUNNER_* values.
(
  cd open-source/tunnel-agent
  go run ./cmd/fallbakit-agent \
    -api-key="$FALLBAKIT_RUNNER_API_KEY" \
    -runner-id="$FALLBAKIT_RUNNER_ID" \
    -base-url=http://localhost:8080 \
    -local-provider=ollama \
    -local-base-url=http://localhost:11434
)

For vLLM local verification, start vLLM on localhost:8000, create a vLLM runner in the dashboard, then use -local-provider=vllm -local-base-url=http://localhost:8000. Direct OpenAI clients use http://localhost:8000/v1; the Fallbakit agent should use the origin without /v1.
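Because the two URL forms are easy to mix up, a small helper can derive one from the other. This is a sketch using only the standard library; `agent_origin` and `openai_base_url` are hypothetical names, not SDK or agent functions.

```python
from urllib.parse import urlsplit, urlunsplit


def agent_origin(url: str) -> str:
    """Strip any path (such as /v1) so only scheme://host:port remains."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "", "", ""))


def openai_base_url(url: str) -> str:
    """Return the origin with /v1 appended, the form OpenAI-style clients use."""
    return agent_origin(url) + "/v1"


print(agent_origin("http://localhost:8000/v1"))
print(openai_base_url("http://localhost:8000"))
```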

In a new terminal, install the SDK in editable mode and load the example environment:

cd open-source/python-sdk
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools wheel
python -m pip install -e ".[test,openai]"
cp .env.example .env.local
set -a
source .env.local
set +a

Replace the FALLBAKIT_API_KEY value in .env.local with the application API key you created in the dashboard, then source the file again so the shell picks up the new value.

Run the package tests:

pytest

Run the local smoke tests against http://localhost:8080:

python examples/minimal_chat.py
python examples/streaming_chat.py

If your local stack has a fallback provider configured, run the fallback example too:

python examples/local_first_with_fallback.py

When database-backed auth is enabled, application allowlist rules are enforced per application_id, so use an application API key created from an enabled dashboard application when running the examples.

If those commands return model output, the local package setup is working correctly. When you are done, stop the dev stack with scripts/dev-down.sh from the repository root.

Publishing

Update version in pyproject.toml.
Run pytest.
Build artifacts:

python -m pip install build twine
python -m build
twine check dist/*

Publish:

twine upload dist/*

Release history

0.1.1 (Jun 25, 2026)
0.1.0 (Jun 25, 2026), this version

Download files

Source Distribution

fallbakit-0.1.0.tar.gz (11.0 kB)

Uploaded Jun 25, 2026 Source

Built Distribution

fallbakit-0.1.0-py3-none-any.whl (9.8 kB)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file fallbakit-0.1.0.tar.gz.

File metadata

Download URL: fallbakit-0.1.0.tar.gz
Upload date: Jun 25, 2026
Size: 11.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for fallbakit-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e603e0711d7d6f64030e984808d2cb50190ce9e9f9a09a0ea8ded6ac406f8f7b`
MD5	`9bf6f9fc6918838975cdac4347f84131`
BLAKE2b-256	`48d4ceb87cc84029cd058e5c79e4c519cd6e33cb8870f135d6cdaa35504bc76d`

File details

Details for the file fallbakit-0.1.0-py3-none-any.whl.

File metadata

Download URL: fallbakit-0.1.0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 9.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for fallbakit-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0042c792455749c0c0aae151a28aae1969984ee47025e422cba84193eb8d80f3`
MD5	`8945bf9a68b9ceca37315043520ad627`
BLAKE2b-256	`62fd5b793a94e9078138e2e3b597d94cb88b1fff1ace5c7cd543e4bde1a78d1c`
