Python SDK for Fallbakit local-first chat completions.

Fallbakit Python SDK

Python SDK for Fallbakit's OpenAI-compatible, local-first chat completions API.

Fallbakit routes to the customer's local Ollama, oMLX, or vLLM runner through the open-source tunnel agent first, then falls back to configured BYOK cloud providers when local inference is unavailable and fallback is allowed.

Install

pip install fallbakit

For local development from this repository:

python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[test,openai]"

Configuration

export FALLBAKIT_API_KEY=or_your_application_key
export FALLBAKIT_BASE_URL=https://api.fallbakit.com

Keep FALLBAKIT_API_KEY in your environment or a secret manager rather than in source code, then pass it explicitly when constructing the client. base_url defaults to https://api.fallbakit.com. For local development, pass base_url="http://localhost:8080".
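One way to wire the two environment variables into the constructor is a small loader like the sketch below. load_settings is a hypothetical helper, not part of the SDK. It only assumes that the client accepts the api_key and base_url keyword arguments described above.

```python
import os
from typing import Dict, Mapping, Optional

# Default taken from this README; override with FALLBAKIT_BASE_URL.
DEFAULT_BASE_URL = "https://api.fallbakit.com"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for Fallbakit(...) from environment variables.

    Hypothetical helper for illustration; it is not shipped with the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("FALLBAKIT_API_KEY", "").strip()
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("FALLBAKIT_API_KEY is not set")
    base_url = env.get("FALLBAKIT_BASE_URL", "").strip() or DEFAULT_BASE_URL
    return {"api_key": api_key, "base_url": base_url.rstrip("/")}
```

With the SDK installed, this could be used as client = Fallbakit(**load_settings()).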

For local examples in this repository:

cp .env.example .env.local

.env.example is set up for the local developer stack:

FALLBAKIT_API_KEY=or_your_generated_api_key
FALLBAKIT_BASE_URL=http://localhost:8080
FALLBAKIT_OPENAI_BASE_URL=http://localhost:8080/v1
FALLBAKIT_MODEL=llama3.2
FALLBAKIT_FALLBACK_PROVIDER=openai
FALLBAKIT_FALLBACK_MODEL=gpt-4o-mini

Basic Chat

import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)

print(response["choices"][0]["message"]["content"])

Official OpenAI SDK

Fallbakit also works with the official OpenAI Python SDK because the router exposes POST /v1/chat/completions. Use a Fallbakit application API key and include /v1 in the OpenAI SDK base URL.

python -m pip install -e ".[openai]"
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FALLBAKIT_API_KEY"],
    base_url=os.environ.get("FALLBAKIT_OPENAI_BASE_URL", "http://localhost:8080/v1"),
)

completion = client.chat.completions.create(
    model=os.environ.get("FALLBAKIT_MODEL", "llama3.2"),
    messages=[{"role": "user", "content": "Write a tiny launch checklist."}],
)

print(completion.choices[0].message.content)
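Because the OpenAI SDK needs the /v1 suffix while the Fallbakit tunnel agent expects the bare origin (see Local Verification), it can help to normalize URLs in one place. The two functions below are hypothetical helpers written for this README, not SDK functions.

```python
def agent_origin(url: str) -> str:
    """Return the URL without a trailing slash or a trailing /v1 segment."""
    trimmed = url.rstrip("/")
    if trimmed.endswith("/v1"):
        trimmed = trimmed[: -len("/v1")]
    return trimmed.rstrip("/")


def openai_base_url(url: str) -> str:
    """Return the URL with exactly one trailing /v1, as the OpenAI SDK expects."""
    return agent_origin(url) + "/v1"
```

For example, openai_base_url("http://localhost:8080") gives "http://localhost:8080/v1", and agent_origin("http://localhost:8000/v1") gives "http://localhost:8000".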

Streaming

import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"])

for chunk in client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Stream a short answer."}],
    stream=True,
):
    delta = chunk["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)
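If you want the full reply as one string instead of printing as you go, the chunks can be accumulated. The sketch below assumes chunks are dict-like with the same choices[0]["delta"]["content"] shape used in the loop above. Skipping chunks with no choices and treating a null content as empty are defensive assumptions, not documented SDK behavior. collect_stream_text is a hypothetical helper.

```python
from typing import Any, Iterable, List, Mapping


def collect_stream_text(chunks: Iterable[Mapping[str, Any]]) -> str:
    """Join the delta content of streamed chat chunks into one string."""
    parts: List[str] = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            # Some OpenAI-compatible servers emit chunks without choices.
            continue
        delta = choices[0].get("delta") or {}
        # Content may be missing or None on role-only or final chunks.
        parts.append(delta.get("content") or "")
    return "".join(parts)
```

Pass it the iterator returned by client.chat.completions.create(..., stream=True).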

Cloud Fallback Controls

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Summarize hybrid inference."}],
    fallback_provider="openai",
    fallback_model="gpt-4o-mini",
)

Supported Fallbakit controls:

  • fallback: allow cloud fallback when local cannot serve. Defaults to True.
  • force_local: require local routing.
  • fallback_provider: request a specific configured BYOK fallback provider.
  • fallback_model: request a specific cloud model for fallback.
  • cloud_model_only: skip local routing and call the selected cloud provider.
  • local_model_only: prevent cloud fallback.
  • extra_body: pass additional OpenAI-compatible or Fallbakit-specific request body fields.
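The controls above can be grouped into a few routing modes. The sketch below is a hypothetical helper that returns keyword arguments for client.chat.completions.create. It assumes the flag-style controls take booleans (only fallback is documented with a boolean default) and that cloud_model_only uses fallback_provider and fallback_model to select the cloud target. Check both assumptions against the API before relying on them.

```python
from typing import Any, Dict, Optional


def routing_controls(
    mode: str,
    provider: Optional[str] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Map a routing mode to Fallbakit control keyword arguments.

    Modes: "local_first" (default behavior), "local_only", "cloud_only".
    """
    if mode == "local_only":
        # Never leave the local runner, so provider and model are ignored.
        return {"local_model_only": True}
    if mode not in ("local_first", "cloud_only"):
        raise ValueError(f"unknown routing mode: {mode!r}")
    controls: Dict[str, Any] = (
        {"cloud_model_only": True} if mode == "cloud_only" else {"fallback": True}
    )
    if provider:
        controls["fallback_provider"] = provider
    if model:
        controls["fallback_model"] = model
    return controls
```

A call would then look like client.chat.completions.create(model="llama3.2", messages=messages, **routing_controls("local_first", "openai", "gpt-4o-mini")).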

Timeouts

import os

from fallbakit import Fallbakit

client = Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"], timeout=30)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=10,
)

Per-request timeout overrides the client default.
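The precedence rule can be written down as a one-line function. This is an illustrative sketch of the rule as stated, not the SDK's internal code, and effective_timeout is a hypothetical name. The README does not state the unit, so the values are passed through unchanged.

```python
from typing import Optional


def effective_timeout(
    client_default: Optional[float],
    per_request: Optional[float] = None,
) -> Optional[float]:
    """Return the per-request timeout when given, else the client default."""
    return client_default if per_request is None else per_request
```

With the values from the example above, effective_timeout(30, 10) returns 10 and effective_timeout(30) returns 30.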

Errors

API failures raise FallbakitAPIError with:

  • status_code
  • code
  • response

import os

from fallbakit import Fallbakit, FallbakitAPIError

try:
    Fallbakit(api_key=os.environ["FALLBAKIT_API_KEY"]).chat.completions.create(
        model="llama3.2",
        messages=[{"role": "user", "content": "Hello"}],
    )
except FallbakitAPIError as error:
    print(error.status_code, error.code, error)
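Since the error carries status_code, a caller can retry transient failures and re-raise everything else. The sketch below is a generic retry wrapper, not an SDK feature. The set of retryable status codes and the backoff schedule are assumptions. It catches Exception and inspects status_code so it runs without the SDK installed; in real code, catch FallbakitAPIError instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: rate limits and server-side errors are worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying retryable errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as error:  # narrow to FallbakitAPIError in real code
            status = getattr(error, "status_code", None)
            if status not in RETRYABLE_STATUS or attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Wrap a request as call_with_retry(lambda: client.chat.completions.create(model="llama3.2", messages=messages)).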

Development

pytest
python examples/minimal_chat.py
python examples/streaming_chat.py
python examples/openai_sdk_chat.py
python examples/openai_sdk_streaming.py

Local Verification

Use this flow to confirm the package installs, the unit tests pass, and the examples can talk to a local Fallbakit router.

  1. Start a local model runtime in a separate terminal:
ollama serve
ollama pull llama3.2
  2. Start the local Fallbakit stack from the repository root:
cp configs/local-infra.env.example configs/local-infra.env
cp configs/api.env.example configs/api.env
scripts/dev-up.sh
# In the dashboard, create a runner and export its generated RUNNER_* values.
(
  cd open-source/tunnel-agent
  go run ./cmd/fallbakit-agent \
    -api-key="$FALLBAKIT_RUNNER_API_KEY" \
    -runner-id="$FALLBAKIT_RUNNER_ID" \
    -base-url=http://localhost:8080 \
    -local-provider=ollama \
    -local-base-url=http://localhost:11434
)

For vLLM local verification, start vLLM on localhost:8000, create a vLLM runner in the dashboard, then use -local-provider=vllm -local-base-url=http://localhost:8000. Direct OpenAI clients use http://localhost:8000/v1; the Fallbakit agent should use the origin without /v1.

  3. In a new terminal, install the SDK in editable mode and load the example environment:
cd open-source/python-sdk
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade setuptools wheel
python -m pip install -e ".[test,openai]"
cp .env.example .env.local
set -a
source .env.local
set +a

Replace FALLBAKIT_API_KEY in .env.local with the application API key you created in the dashboard.

  4. Run the package tests:
pytest
  5. Run the local smoke tests against http://localhost:8080:
python examples/minimal_chat.py
python examples/streaming_chat.py
  6. If your local stack has a fallback provider configured, run the fallback example too:
python examples/local_first_with_fallback.py

When database-backed auth is enabled, application allowlist rules are enforced per application_id, so use an application API key created from an enabled dashboard application when running the examples.

If those commands return model output, the local package setup is working correctly. When you are done, stop the dev stack with scripts/dev-down.sh from the repository root.
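A common failure in this flow is running the examples with the placeholder key still in .env.local. A small preflight check like the sketch below can catch that before any request is sent. preflight_problems is a hypothetical helper, and the list of required variables is an assumption based on .env.example rather than on what each example script actually reads.

```python
import os
from typing import List, Mapping, Optional

# Placeholder values that appear in this README and in .env.example.
PLACEHOLDER_KEYS = {"or_your_generated_api_key", "or_your_application_key"}
# Assumption: the examples rely on these variables.
REQUIRED_VARS = ("FALLBAKIT_API_KEY", "FALLBAKIT_BASE_URL", "FALLBAKIT_MODEL")


def preflight_problems(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return human-readable problems with the example environment."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    if env.get("FALLBAKIT_API_KEY") in PLACEHOLDER_KEYS:
        problems.append("FALLBAKIT_API_KEY still holds the .env.example placeholder")
    return problems
```

An empty list means the environment looks ready; otherwise print the entries and fix .env.local before rerunning the examples.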

Publishing

  1. Update version in pyproject.toml.
  2. Run pytest.
  3. Build artifacts:
python -m pip install build twine
python -m build
twine check dist/*
  4. Publish:
twine upload dist/*
