Python SDK for Fallbakit local-first chat completions.
Fallbakit Python SDK
Python SDK for Fallbakit's OpenAI-compatible, local-first chat completions API.
Fallbakit routes to the customer's local Ollama, oMLX, or vLLM runner through the open-source tunnel agent first, then falls back to configured BYOK cloud providers when local inference is unavailable and fallback is allowed.
Install
pip install fallbakit
For local development from this repository:
python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[test,openai]"
Configuration
export FALLBAKIT_API_KEY=or_your_application_key
export FALLBAKIT_BASE_URL=https://api.fallbakit.com
Keep FALLBAKIT_API_KEY in your environment or secret manager for security, then pass it explicitly when constructing the client. base_url defaults to https://api.fallbakit.com. For local development, pass base_url="http://localhost:8080".
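The configuration rules above can be collected in one small helper. This is a minimal sketch, not part of the SDK: `client_settings` is a hypothetical name, and it assumes only what this README states, namely that the client accepts `api_key` and `base_url` and that `base_url` defaults to https://api.fallbakit.com.

```python
import os
from typing import Mapping, Optional

# Default stated in this README; used when FALLBAKIT_BASE_URL is unset or empty.
DEFAULT_BASE_URL = "https://api.fallbakit.com"


def client_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for Fallbakit(...) from environment-style values.

    Hypothetical helper: it fails early when the API key is missing instead of
    letting the first request fail.
    """
    env = os.environ if env is None else env
    api_key = env.get("FALLBAKIT_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("FALLBAKIT_API_KEY is not set")
    base_url = env.get("FALLBAKIT_BASE_URL", "").strip() or DEFAULT_BASE_URL
    return {"api_key": api_key, "base_url": base_url}
```

With the SDK installed, the result could be passed along as `Fallbakit(**client_settings())`.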
For local examples in this repository:
cp .env.example .env.local
.env.example is set up for the local developer stack:
FALLBAKIT_API_KEY=or_your_generated_api_key
FALLBAKIT_BASE_URL=http://localhost:8080
FALLBAKIT_OPENAI_BASE_URL=http://localhost:8080/v1
FALLBAKIT_MODEL=llama3.2
FALLBAKIT_FALLBACK_PROVIDER=openai
FALLBAKIT_FALLBACK_MODEL=gpt-4o-mini
Basic Chat
import os
from fallbakit import Fallbakit
client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)
print(response["choices"][0]["message"]["content"])
Official OpenAI SDK
Fallbakit also works with the official OpenAI Python SDK because the router exposes POST /v1/chat/completions. Use a Fallbakit application API key and include /v1 in the OpenAI SDK base URL.
python -m pip install -e ".[openai]"
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["FALLBAKIT_API_KEY"],
    base_url=os.environ.get("FALLBAKIT_OPENAI_BASE_URL", "http://localhost:8080/v1"),
)
completion = client.chat.completions.create(
    model=os.environ.get("FALLBAKIT_MODEL", "llama3.2"),
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)
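Because the OpenAI SDK needs the `/v1` suffix while the Fallbakit client takes the bare origin, it can help to derive one URL from the other. The helper below is a sketch under that assumption; `openai_base_url` is a hypothetical name, not an SDK function.

```python
def openai_base_url(router_origin: str) -> str:
    """Return the base URL for the OpenAI SDK given a Fallbakit router origin.

    Appends /v1 exactly once, tolerating a trailing slash or an origin that
    already ends in /v1.
    """
    trimmed = router_origin.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed
    return trimmed + "/v1"


print(openai_base_url("http://localhost:8080"))
```

The result can be passed as `base_url=` when constructing `OpenAI(...)`.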
Streaming
import os
from fallbakit import Fallbakit
client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])
for chunk in client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Stream a short answer."}],
    stream=True,
):
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)
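When the full text is needed after streaming, the chunks can be accumulated instead of printed. This sketch assumes chunks are dictionaries shaped as in the loop above (`choices[0]["delta"]["content"]`); `collect_stream` is a hypothetical helper, and the skipping of empty `choices` lists and `None` content is a defensive assumption rather than documented SDK behavior.

```python
from typing import Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping]) -> str:
    """Join the content deltas of a chat completion stream into one string."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some streams emit chunks with no choices
        delta = choices[0].get("delta") or {}
        content = delta.get("content")
        if content:  # role-only or final chunks carry no text
            parts.append(content)
    return "".join(parts)


# Stand-in chunks for illustration; a real stream comes from create(..., stream=True).
sample = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(collect_stream(sample))
```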
Cloud Fallback Controls
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Summarize hybrid inference."}],
    fallback_provider="openai",
    fallback_model="gpt-4o-mini",
)
Supported Fallbakit controls:
- fallback: allow cloud fallback when local cannot serve. Defaults to True.
- force_local: require local routing.
- fallback_provider: request a specific configured BYOK fallback provider.
- fallback_model: request a specific cloud model for fallback.
- cloud_model_only: skip local routing and call the selected cloud provider.
- local_model_only: prevent cloud fallback.
- extra_body: pass additional OpenAI-compatible or Fallbakit-specific request body fields.
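Some of these controls pull in opposite directions, so a client-side guard can catch mistakes before a request is sent. The sketch below is illustrative only: `routing_kwargs` is a hypothetical helper, it assumes `local_model_only` and `cloud_model_only` are boolean flags, and the conflict rules are an assumption (the SDK or router may resolve such combinations differently).

```python
from typing import Optional


def routing_kwargs(
    *,
    local_only: bool = False,
    cloud_only: bool = False,
    fallback_provider: Optional[str] = None,
    fallback_model: Optional[str] = None,
) -> dict:
    """Build routing keyword arguments for chat.completions.create(...)."""
    # Assumed conflict: local-only and cloud-only cannot both hold.
    if local_only and cloud_only:
        raise ValueError("local_only and cloud_only are mutually exclusive")
    # Assumed conflict: fallback targets are pointless when fallback is prevented.
    if local_only and (fallback_provider or fallback_model):
        raise ValueError("fallback settings have no effect with local_only")

    kwargs = {}
    if local_only:
        kwargs["local_model_only"] = True
    if cloud_only:
        kwargs["cloud_model_only"] = True
    if fallback_provider:
        kwargs["fallback_provider"] = fallback_provider
    if fallback_model:
        kwargs["fallback_model"] = fallback_model
    return kwargs


print(routing_kwargs(fallback_provider="openai", fallback_model="gpt-4o-mini"))
```

The returned dictionary could then be unpacked into a request, for example `client.chat.completions.create(model=..., messages=..., **routing_kwargs(local_only=True))`.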
Timeouts
import os
from fallbakit import Fallbakit
# Client-wide default timeout.
client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"], timeout=30)
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=10,  # applies to this request only
)
Per-request timeout overrides the client default.
Errors
API failures raise FallbakitAPIError with:
- status_code
- code
- response
import os
from fallbakit import Fallbakit, FallbakitAPIError
try:
    Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"]).chat.completions.create(
        model="llama3.2",
        messages=[{"role": "user", "content": "Hello"}],
    )
except FallbakitAPIError as error:
    print(error.status_code, error.code, error)
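Since the error exposes `status_code`, callers can choose to retry transient failures. The sketch below is one possible approach, not SDK behavior: `call_with_retry` is a hypothetical helper, the set of retryable status codes is an assumption, and the exception class is passed in so the example runs without the SDK installed.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")

# Assumption: these HTTP statuses are worth retrying; adjust for your setup.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(
    fn: Callable[[], T],
    error_type: Type[Exception],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff on retryable API errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except error_type as error:
            status = getattr(error, "status_code", None)
            # Re-raise non-retryable errors and the final failed attempt.
            if status not in RETRYABLE_STATUS or attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With the SDK installed, a call could look like `call_with_retry(lambda: client.chat.completions.create(model="llama3.2", messages=messages), FallbakitAPIError)`.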
Development
pytest
python examples/minimal_chat.py
python examples/streaming_chat.py
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py
Local Verification
Use this flow to confirm the package installs, the unit tests pass, and the examples can talk to a local Fallbakit router.
- Start a local model runtime in a separate terminal:
ollama serve
ollama pull llama3.2
- Start the local Fallbakit stack from the repository root:
cp configs/local-infra.env.example configs/local-infra.env
cp configs/api.env.example configs/api.env
scripts/dev-up.sh
# In the dashboard, create a runner and export its generated RUNNER_* values.
(
cd open-source/tunnel-agent
go run ./cmd/fallbakit-agent \
-api-key="$FALLBAKIT_RUNNER_API_KEY" \
-runner-id="$FALLBAKIT_RUNNER_ID" \
-base-url=http://localhost:8080 \
-local-provider=ollama \
-local-base-url=http://localhost:11434
)
For vLLM local verification, start vLLM on localhost:8000, create a vLLM runner in the dashboard, then use -local-provider=vllm -local-base-url=http://localhost:8000. Direct OpenAI clients use http://localhost:8000/v1; the Fallbakit agent should use the origin without /v1.
- In a new terminal, install the SDK in editable mode and load the example environment:
cd open-source/python-sdk
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools wheel
python -m pip install -e ".[test,openai]"
cp .env.example .env.local
set -a
source .env.local
set +a
Replace FALLBAKIT_API_KEY in .env.local with the application API key you created in the dashboard.
- Run the package tests:
pytest
- Run the local smoke tests against http://localhost:8080:
python examples/minimal_chat.py
python examples/streaming_chat.py
- If your local stack has a fallback provider configured, run the fallback example too:
python examples/local_first_with_fallback.py
When database-backed auth is enabled, application allowlist rules are enforced per application_id, so use an application API key created from an enabled dashboard application when running the examples.
If those commands return model output, the local package setup is working correctly. When you are done, stop the dev stack with scripts/dev-down.sh from the repository root.
Publishing
- Update version in pyproject.toml.
- Run pytest.
- Build artifacts:
python -m pip install build twine
python -m build
twine check dist/*
- Publish:
twine upload dist/*