AxonFlow governance integration for LiteLLM

axonflow-litellm

AxonFlow governance integration for LiteLLM. Enforce policies, audit LLM calls, and gate high-risk requests behind human approval — all through a drop-in wrapper around litellm.completion().

Installation

pip install axonflow-litellm

Quick Start

from axonflow_litellm import AxonFlowLogger, AxonFlowLoggerConfig, PolicyDeniedError

logger = AxonFlowLogger(AxonFlowLoggerConfig(
    endpoint="http://localhost:8080",
    client_id="my-app",
    client_secret="...",
))

try:
    response = logger.completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize quarterly earnings"}],
    )
    print(response.choices[0].message.content)
except PolicyDeniedError as e:
    print(f"Blocked: {e.reason}")

How It Works

AxonFlowLogger provides two integration modes:

Governance Mode (recommended)

Use logger.completion() or logger.acompletion() as drop-in replacements for litellm.completion() / litellm.acompletion():

  1. Pre-check: sends the prompt to AxonFlow for policy evaluation
  2. HITL: if the policy returns require_approval, creates a human-in-the-loop review request and polls until approved, rejected, or timed out
  3. LLM call: delegates to LiteLLM (all providers supported)
  4. Audit: records the response to AxonFlow for observability

# Async (recommended for production)
response = await logger.acompletion(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "..."}],
    user_token="jwt-from-your-auth",
)
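
The four steps above can be sketched in plain Python. This is a minimal illustration built on stand-in callables, not the library's implementation: `governed_completion`, `wait_for_approval`, and `Denied` are hypothetical names, and every decision or status string other than `require_approval` is an assumption made for the example.

```python
import time
from typing import Callable


class Denied(Exception):
    """Stand-in for a governance denial (hypothetical, not the library's class)."""


def wait_for_approval(get_status: Callable[[], str],
                      poll_interval: float, max_wait: float) -> str:
    """Poll until the status leaves 'pending' or the deadline passes."""
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        status = get_status()
        if status != "pending":
            return status
        time.sleep(poll_interval)
    return "timed_out"


def governed_completion(prompt, pre_check, get_status, call_llm, record_audit,
                        poll_interval=0.01, max_wait=0.5):
    decision = pre_check(prompt)                  # 1. pre-check
    if decision == "deny":
        raise Denied("policy denied the request")
    if decision == "require_approval":            # 2. HITL polling
        status = wait_for_approval(get_status, poll_interval, max_wait)
        if status != "approved":
            raise Denied(f"approval {status}")
    response = call_llm(prompt)                   # 3. LLM call
    record_audit(prompt, response)                # 4. audit
    return response


# Demo with fakes: the reviewer approves on the third poll.
statuses = iter(["pending", "pending", "approved"])
audit_log = []
result = governed_completion(
    "Summarize quarterly earnings",
    pre_check=lambda p: "require_approval",
    get_status=lambda: next(statuses),
    call_llm=lambda p: f"echo: {p}",
    record_audit=lambda p, r: audit_log.append((p, r)),
)
print(result)
```

The sketch raises a single `Denied` for every failure; the package itself distinguishes denial, rejection, and timeout with the separate exceptions listed under Exceptions.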

Audit-Only Mode

Register as a LiteLLM callback for observability without blocking:

import litellm

litellm.callbacks = [logger]
# acompletion() is a coroutine, so await it from inside an async function
response = await litellm.acompletion(model="gpt-4o", messages=[...])

In this mode, every LLM call is recorded to AxonFlow for audit trail. Policy denials are logged as warnings but cannot block the request (a LiteLLM SDK limitation — callback exceptions are silently swallowed).
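
To see why a callback cannot block a request, consider a toy dispatcher that catches and logs whatever its callbacks raise. This is an illustrative model only, not LiteLLM's actual dispatch code; `run_with_callbacks` and `denying_callback` are hypothetical names.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("toy-dispatcher")


def run_with_callbacks(request, callbacks, do_call):
    """Run each callback, swallow its errors, then make the call anyway."""
    for callback in callbacks:
        try:
            callback(request)
        except Exception as exc:
            # The error is logged, never re-raised, so it cannot stop the call.
            log.warning("callback failed: %s", exc)
    return do_call(request)


def denying_callback(request):
    raise PermissionError("policy denied")


outcome = run_with_callbacks("hello", [denying_callback],
                             lambda r: f"response to {r}")
print(outcome)
```

Under this model the denial surfaces only as a log line, which is why blocking enforcement needs the wrapper methods rather than callback registration.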

Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| endpoint | (required) | AxonFlow agent URL |
| client_id | (required) | AxonFlow client identifier |
| client_secret | "" | AxonFlow client secret |
| default_user_token | "anonymous" | Token for policy evaluation when none provided |
| tenant_id | None | AxonFlow tenant identifier |
| fail_open | True | Allow LLM calls when AxonFlow is unreachable |
| call_timeout_seconds | 5.0 | Per-hook timeout for AxonFlow API calls |
| breaker_failure_threshold | 5 | Consecutive failures before circuit opens |
| breaker_recovery_seconds | 30.0 | Wait before attempting recovery probe |
| enable_hitl_polling | True | Enable HITL approval flow for require_approval |
| approval_poll_interval_seconds | 2.0 | Polling interval for HITL status |
| approval_max_wait_seconds | 300.0 | Maximum wait for HITL decision |
| extra_context | {} | Additional context sent with every pre-check |
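
The two `breaker_*` parameters describe a circuit breaker. The sketch below shows one common way such a breaker behaves, using the default values from the table. It reflects the general pattern rather than the package's own code, and the `CircuitBreaker` class and its method names are hypothetical.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures; allows a probe after a cooldown."""

    def __init__(self, failure_threshold=5, recovery_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        # Circuit is open: permit a probe only once the cooldown has elapsed.
        return self._clock() - self._opened_at >= self.recovery_seconds

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
blocked = not breaker.allow_request()    # open right after the 5th failure
now[0] = 31.0
probe_allowed = breaker.allow_request()  # cooldown of 30 s has passed
breaker.record_success()
closed_again = breaker.allow_request()
print(blocked, probe_allowed, closed_again)
```

While the circuit is open, whether the LLM call proceeds would presumably follow the `fail_open` setting described in the next section.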

Fail-Open vs. Fail-Closed

By default, fail_open=True: if AxonFlow is unreachable or times out, the LLM call proceeds normally. This ensures an AxonFlow outage does not break your application.

For high-stakes workloads where unapproved LLM calls must never proceed:

config = AxonFlowLoggerConfig(
    endpoint="http://localhost:8080",
    client_id="payments-service",
    client_secret="...",
    fail_open=False,
)
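
The difference between the two settings can be sketched as follows. This is a simplified model under stated assumptions, not the package's code: `guarded_call`, `unreachable_pre_check`, and `GovernanceUnavailable` are hypothetical names, and the exception the real package raises in fail-closed mode is not specified above.

```python
class GovernanceUnavailable(Exception):
    """Hypothetical error for an unreachable or timed-out policy service."""


def unreachable_pre_check(prompt):
    # Simulates an AxonFlow outage.
    raise TimeoutError("policy service timed out")


def guarded_call(prompt, pre_check, call_llm, fail_open=True):
    try:
        pre_check(prompt)
    except (TimeoutError, ConnectionError) as exc:
        if not fail_open:
            # Fail-closed: the outage blocks the LLM call.
            raise GovernanceUnavailable(str(exc)) from exc
        # Fail-open: continue without a policy decision.
    return call_llm(prompt)


open_result = guarded_call("hi", unreachable_pre_check, lambda p: "ok",
                           fail_open=True)
try:
    guarded_call("hi", unreachable_pre_check, lambda p: "ok", fail_open=False)
    closed_result = "ok"
except GovernanceUnavailable:
    closed_result = "blocked"
print(open_result, closed_result)
```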

Sync vs. Async

Both litellm.completion() (sync) and litellm.acompletion() (async) are fully supported.

When registered via litellm.callbacks, sync hooks delegate to their async counterparts via asyncio.run(). This adds minor overhead (~1 ms) per hook call in the sync path. For performance-critical sync workloads, call logger.completion() directly (the governance wrapper), which amortizes the cost of event loop creation.

If sync hooks are invoked inside a running event loop (unusual — e.g., sync callbacks from an async framework), a one-time RuntimeWarning is emitted directing you to acompletion().
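
Detecting a running event loop from synchronous code is possible with the standard library alone. The sketch below shows the general technique; `in_running_loop` and `sync_hook` are hypothetical names, and it does not reproduce the package's one-time warning logic.

```python
import asyncio
import warnings


def in_running_loop() -> bool:
    """Return True when called from code executing inside an event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


def sync_hook() -> str:
    if in_running_loop():
        warnings.warn(
            "sync hook called inside a running event loop; prefer the async API",
            RuntimeWarning,
            stacklevel=2,
        )
        return "inside-loop"
    return "no-loop"


async def main() -> str:
    # A sync function called from a coroutine sees the running loop.
    return sync_hook()


outside = sync_hook()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    inside = asyncio.run(main())
print(outside, inside, len(caught))
```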

Sync callback mode caveats

In sync callback mode (litellm.callbacks = [logger] + litellm.completion()), each callback hook creates an ephemeral asyncio event loop via asyncio.run(). Pre-check (governance) and post-LLM audit both fire and write to AxonFlow. However:

  • Audit write failures are logged at WARNING level and do not raise to the caller (fail-open by default). If AxonFlow is temporarily unreachable during the audit phase, the LLM response is still returned but the audit row may be missing.
  • Each hook creates a new event loop, so connection pooling is not shared across hooks within the same LLM call. This is slightly less efficient than the governance wrapper path.

For strict audit guarantees (every LLM call audited, failure = exception), use logger.completion() or logger.acompletion() instead of the callback registration path.
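
The per-hook event loop behaviour follows from how asyncio.run() works: each call creates a fresh loop and closes it on exit. The short demonstration below uses only the standard library; the `hook` coroutine is a hypothetical stand-in for a callback hook.

```python
import asyncio

seen_loops = []


async def hook(name: str) -> str:
    # Record which event loop this hook invocation is running on.
    seen_loops.append(asyncio.get_running_loop())
    return name


pre = asyncio.run(hook("pre-check"))
post = asyncio.run(hook("audit"))

distinct = seen_loops[0] is not seen_loops[1]
both_closed = all(loop.is_closed() for loop in seen_loops)
print(distinct, both_closed)
```

Because the first loop is already closed when the second hook runs, any resource bound to it, such as a pooled connection, cannot be reused by the next hook.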

Exceptions

| Exception | When |
| --- | --- |
| PolicyDeniedError | Policy denied the request |
| ApprovalRejected | HITL approval was rejected |
| ApprovalTimeout | HITL approval timed out |

All exceptions carry .reason (string) and .policies (list of policy IDs).

These exceptions do NOT extend litellm.exceptions.APIError — catch governance denials via PolicyDeniedError, not LiteLLM's exception hierarchy.
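
A handler that covers all three governance outcomes might look like the sketch below. It defines stand-in classes so that it runs without the package installed; in real code the classes would come from axonflow_litellm. Whether ApprovalRejected and ApprovalTimeout share a base class with PolicyDeniedError is not stated above, so the handler names all three explicitly. The `pii-block` policy ID and the `run` and `describe` helpers are made up for illustration.

```python
class _ReasonMixin:
    """Stand-in behaviour: carry .reason and .policies as described above."""

    def __init__(self, reason: str, policies: list[str]):
        super().__init__(reason)
        self.reason = reason
        self.policies = policies


# Stand-ins only; the real classes live in the axonflow_litellm package.
class PolicyDeniedError(_ReasonMixin, Exception): ...
class ApprovalRejected(_ReasonMixin, Exception): ...
class ApprovalTimeout(_ReasonMixin, Exception): ...


def describe(exc) -> str:
    policies = ", ".join(exc.policies)
    return f"{type(exc).__name__}: {exc.reason} (policies: {policies})"


def run(action):
    """Run a callable and turn any governance outcome into a message."""
    try:
        return action()
    except (PolicyDeniedError, ApprovalRejected, ApprovalTimeout) as exc:
        return describe(exc)


def denied():
    raise PolicyDeniedError("PII detected", ["pii-block"])


message = run(denied)
print(message)
```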

MCP Governance

LiteLLM is LLM-completion-focused. For MCP tool governance, use AxonFlow's MCP server directly.

Requirements

  • Python >= 3.10
  • litellm >= 1.40
  • axonflow >= 8.2.0

License

MIT
