
fastapi-ai-router

Turn your existing FastAPI routes into a natural-language-callable surface — in one line.

Drop-in middleware. Zero new metadata. Uses the OpenAPI schema FastAPI already generates.

License: MIT · Python 3.11+ · FastAPI · Pydantic v2 · Code style: ruff · Typed: mypy strict · Tests: 74 passing · Coverage: 87% · Status: alpha


What it does

from fastapi import FastAPI
from fastapi_ai_router import AIRouter, ai_route
from fastapi_ai_router.backends.litellm import LiteLLMBackend

app = FastAPI()

@app.post("/orders/{order_id}/cancel")
@ai_route(description="Cancel a customer's order.")
def cancel_order(order_id: int, reason: str | None = None):
    ...

AIRouter(app, llm=LiteLLMBackend(model="gpt-4o-mini"))   # one line to enable
$ curl -X POST localhost:8000/ai \
    -H 'content-type: application/json' \
    -d '{"query":"cancel order 123 because it was a duplicate"}'
{
  "endpoint": "POST /orders/{order_id}/cancel",
  "args": {"order_id": 123, "reason": "duplicate"},
  "result": {"status": "cancelled"},
  "reasoning": "User wants to cancel order 123 with reason 'duplicate'.",
  "result_status": 200
}

That's it. The LLM picked the right route, filled the args, the middleware dispatched the call, and your existing Depends(auth) + middleware + Pydantic validation all ran normally.


Why this exists

Most LLM "routing" libraries are SaaS gateways or LangChain agents. There was no clean way to add a natural-language layer to an existing FastAPI app — until now. fastapi-ai-router is the conversational layer for any FastAPI codebase, and it leans on the OpenAPI schema FastAPI already generates so there's nothing new to maintain.

| Need | Without this library | With this library |
|---|---|---|
| Add NL to one endpoint | Write a LangChain agent + tool wrappers | Add @ai_route |
| Add NL to a whole app | Hand-code 50 tool wrappers | AIRouter(app, llm=...) |
| Keep auth/middleware/validation | Re-implement in your agent | Free — loopback through FastAPI |
| Swap models or providers | Rewrite the agent | Swap LLMBackend |
| Test without an API key | 🥲 | FakeLLMBackend(returns=ToolCall(...)) |

How it works

AIRouter(app, llm=...) adds a single POST /ai endpoint to your FastAPI app. On the first request, it walks app.routes and projects each one (filtered by mode) into a JSON Schema tool definition — using the OpenAPI machinery FastAPI already generates. The user's natural-language {"query": "..."} is sent to your LLM along with those tool definitions; the LLM picks one tool and fills its arguments. The middleware then dispatches that call internally via httpx + ASGITransport (the same pattern FastAPI's TestClient uses), so your existing Depends(auth), middleware, validation, and exception handlers all run normally — auth and tracing headers are forwarded transparently. The dispatched response is wrapped in an envelope showing what the LLM picked and why, and returned to the client with the dispatched call's HTTP status code.
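
For the quickstart's cancel_order route, the flattened tool definition sent to the LLM looks roughly like this (a hand-written illustration of the OpenAI tool-calling shape; the exact JSON the library emits may differ):

{
  "type": "function",
  "function": {
    "name": "cancel_order",
    "description": "Cancel a customer's order.",
    "parameters": {
      "type": "object",
      "properties": {
        "order_id": {"type": "integer"},
        "reason": {"anyOf": [{"type": "string"}, {"type": "null"}]}
      },
      "required": ["order_id"]
    }
  }
}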

In the quickstart above, the LLM read the cancel_order route's description and signature, decided it was the right match for "cancel order 123 because it was a duplicate", extracted order_id=123 and reason="duplicate" from the natural-language query, and the middleware dispatched the call exactly as if a normal client had hit POST /orders/123/cancel?reason=duplicate directly.
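
The loopback itself is the httpx + ASGITransport pattern named above: a request that never leaves the process. A minimal sketch of the pattern (illustrative names, not the library's internal dispatcher):

import httpx
from fastapi import FastAPI

async def dispatch_loopback(app: FastAPI, method: str, path: str,
                            json_body: dict | None, headers: dict[str, str]) -> httpx.Response:
    # ASGITransport feeds the request straight into the ASGI app, so every
    # middleware, Depends(), and exception handler runs as on a real request.
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://router") as client:
        return await client.request(method, path, json=json_body, headers=headers)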

sequenceDiagram
    autonumber
    participant Client
    participant Router as AIRouter at /ai
    participant LLM
    participant Route as FastAPI route
    participant Deps as Depends(auth)

    Client->>Router: POST /ai with query JSON
    Note over Router: Layer-1 deps fire here
    Router->>Router: Build tool defs from app.routes (cached)
    Router->>LLM: messages + tools (OpenAI tool-calling shape)
    LLM-->>Router: ToolCall name and args
    Router->>Router: Resolve name to RouteSpec, un-flatten, URL-encode
    Router->>Deps: Forward Authorization via httpx ASGI loopback
    Deps->>Route: Layer-2 auth passes
    Route-->>Router: dispatched response
    Router->>Router: wrap_envelope(decision, response)
    Router-->>Client: 200 OK with envelope

Internally the architecture is small and split by responsibility:

flowchart LR
    subgraph public ["Public surface"]
        AIRouter(["AIRouter"])
        ai_route(["ai_route decorator"])
        LLMBackend(["LLMBackend Protocol"])
    end

    subgraph core ["Core pipeline"]
        introspection["introspection<br/>mode-aware route walk"]
        schema["schema<br/>OpenAPI to flat tool defs"]
        dispatcher["dispatcher<br/>un-flatten + ASGI loopback"]
        envelope["envelope<br/>wrap or raw"]
        observability["observability<br/>async hooks"]
    end

    subgraph backends ["Backends"]
        LiteLLM["LiteLLMBackend<br/>via litellm extra"]
        Fake["FakeLLMBackend<br/>for tests"]
        BYO["Your backend<br/>implements Protocol"]
    end

    AIRouter --> introspection
    AIRouter --> schema
    AIRouter --> dispatcher
    AIRouter --> envelope
    AIRouter --> observability
    AIRouter -. uses .-> LLMBackend
    LLMBackend -. implemented by .-> LiteLLM
    LLMBackend -. implemented by .-> Fake
    LLMBackend -. implemented by .-> BYO

Each module has one responsibility, ~100-300 lines, fully typed, fully tested.


Install

pip install "fastapi-ai-router[litellm]"

The [litellm] extra gives you OpenAI / Anthropic / Gemini / Ollama / 100+ providers via LiteLLM — usually all you need. To bring your own LLM, implement the LLMBackend Protocol and skip the extra entirely:

pip install fastapi-ai-router

Exposure modes — explicit and safe by default

flowchart TD
    Start{Pick a mode<br/>at construction time}
    Start -->|"default — safest"| Decorator
    Start --> Tag
    Start --> All

    Decorator["mode='decorator'<br/><br/>Only routes decorated with<br/>@ai_route(expose=True)<br/>are exposed"]
    Tag["mode='tag', tag='ai'<br/><br/>Only routes whose tags=<br/>list contains the tag<br/>are exposed"]
    All["mode='all', exclude=[…]<br/><br/>Every route except<br/>excluded paths and<br/>@ai_route(expose=False)<br/>kill switches"]

    style Decorator fill:#d4edda,stroke:#28a745,color:#000
    style Tag fill:#fff3cd,stroke:#ffc107,color:#000
    style All fill:#f8d7da,stroke:#dc3545,color:#000

There is no silent fallback between modes — you pick one explicitly. expose=False on @ai_route is a kill switch that excludes a route from exposure in every mode, so you can mark sensitive routes as never-AI-callable regardless of how the AIRouter is configured elsewhere.

| Mode | Use when… | Default safety |
|---|---|---|
| "decorator" | You want surgical control over what's AI-callable. | ✅ Safest. Empty surface until you opt in. |
| "tag" | You already use FastAPI tags to organize routes. | 🟡 Safe if your tagging is intentional. |
| "all" | You're in a sandbox or trust the LLM completely. | 🔴 Footgun. Pair with exclude= and expose=False. |

Two-layer auth — your existing auth, unchanged

flowchart LR
    Client((Client)) -- "Authorization: Bearer …" --> L1
    L1{Layer 1<br/>dependencies= on /ai}
    L1 -- pass --> LLM[/LLM picks a tool/]
    L1 -- fail 401/403 --> Reject((Rejected — no LLM call))
    LLM --> Dispatch[/Dispatcher: forward Authorization/]
    Dispatch --> L2{Layer 2<br/>route's own Depends auth}
    L2 -- pass --> Handler[/Route handler runs/]
    L2 -- fail 401/403 --> Envelope[envelope.result_status<br/>= 401 or 403]
    Handler --> Envelope
    Envelope --> Client

  • Layer 1 gates "who can use the AI feature at all" (e.g., a paid-tier check on /ai).
  • Layer 2 gates "who can call this specific endpoint" — enforced by FastAPI's own Depends() chain on each dispatched route. Nothing about your auth changes. The Authorization header (and any other configured headers) is forwarded transparently via the httpx loopback. A wiring sketch follows below.
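
Wiring both layers, as a sketch: Layer 1 is assumed to hang off a dependencies= argument on the AIRouter (per the diagram above), Layer 2 is plain FastAPI, and the token checks are placeholders for your own auth logic:

from fastapi import Depends, FastAPI, Header, HTTPException
from fastapi_ai_router import AIRouter, ai_route

app = FastAPI()

async def require_paid_tier(authorization: str = Header(default="")) -> None:
    # Layer 1: gates POST /ai itself (who may use the AI feature at all).
    if authorization != "Bearer paid-tier-token":  # placeholder check
        raise HTTPException(status_code=403)

async def require_admin(authorization: str = Header(default="")) -> None:
    # Layer 2: ordinary route auth; runs again on the dispatched loopback
    # call because the Authorization header is forwarded.
    if authorization != "Bearer admin-token":  # placeholder check
        raise HTTPException(status_code=401)

@app.post("/orders/{order_id}/cancel", dependencies=[Depends(require_admin)])
@ai_route(description="Cancel a customer's order.")
def cancel_order(order_id: int, reason: str | None = None):
    ...

AIRouter(app, llm=..., dependencies=[Depends(require_paid_tier)])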

Bring your own LLM

from fastapi_ai_router import AIRouter, LLMBackend, Message, ToolCall, ToolDef

class MyBackend:
    async def call(self, messages: list[Message], tools: list[ToolDef]) -> ToolCall | None:
        # call your LLM, parse the response, return ToolCall(...) or None
        ...

AIRouter(app, llm=MyBackend())

No subclassing required — the LLMBackend is a structural Protocol. Test backends are built the same way:

from fastapi_ai_router.backends.fake import FakeLLMBackend

router = AIRouter(app, llm=FakeLLMBackend(returns=ToolCall(name="cancel", args={"order_id": 7}, ...)))

The whole test suite uses FakeLLMBackend: 74 tests pass deterministically without a single API key.


Observability — pluggable, no vendor deps

from fastapi_ai_router import AIRouter, Decision, ErrorEvent

async def to_langfuse(d: Decision) -> None:
    await langfuse_client.log(...)

async def to_sentry(e: ErrorEvent) -> None:
    sentry_sdk.capture_message(...)

AIRouter(app, llm=..., on_decision=to_langfuse, on_error=to_sentry)

Every routing decision (and every error) flows through async hooks you control. Pipe to Langfuse, OpenTelemetry, Sentry, plain logs, or a Postgres table — the library has zero hard dependency on any tracing vendor.
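
For a vendor-free sink, the same hook can simply write to standard logging. A sketch that treats the Decision object opaquely (no assumptions about its fields):

import logging
from fastapi_ai_router import AIRouter, Decision

logger = logging.getLogger("fastapi_ai_router")

async def log_decision(d: Decision) -> None:
    # Plain-logs sink: no tracing vendor required.
    logger.info("ai-router decision: %r", d)

AIRouter(app, llm=..., on_decision=log_decision)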


Error semantics

| Failure | HTTP status | Body shape |
|---|---|---|
| NoRouteMatched (LLM declined all tools) | 422 | {"error":"no_route_matched", "available_tools":[…]} |
| UnknownTool (LLM hallucinated a name) | 422 | {"error":"unknown_tool", "tool_name":"…"} |
| LLMBackendError (timeout, rate limit, etc.) | 502 | {"error":"llm_backend_error", "retryable":true} |
| Dispatched route 4xx/5xx | passthrough | envelope wraps the response, result_status set |
| DispatchError (transport failure) | 500 | {"error":"dispatch_error", "detail":"…"} |

Dispatched-route errors are never swallowed. If the route's Depends(auth) rejects with 403, the /ai response is also 403 — the library does not silently flatten downstream errors to 200.
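
For example, a query that matches no exposed route comes back with the 422 body from the table (illustrative output; available_tools lists whatever your mode exposes):

$ curl -X POST localhost:8000/ai \
    -H 'content-type: application/json' \
    -d '{"query":"what is the weather in Paris?"}'
{"error": "no_route_matched", "available_tools": ["cancel_order"]}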


What's not in v0.1 — by design

| Feature | Why not in v0.1 | When |
|---|---|---|
| Multi-step / agent loops | Stays out of LangChain's territory; sharp positioning | v0.3+ if there's pull |
| Conversation history | Single-shot is the demo | v0.3+ |
| Semantic caching | Out of scope for the first wedge | v0.2 |
| Streaming SSE responses | Adds complexity to the response path | v0.2 |
| Mountable sub-app | Single dedicated endpoint is cleaner | v0.2 |
| Form / multipart bodies | JSON-only keeps the loopback contract simple | v0.2 |
| Semantic prefiltering for 100+ routes | All tools are sent on every call in v0.1 | v0.2 |

Saying "we don't do this yet" up front is itself a positioning choice — see docs/concepts.md for the rationale.


Project status

Alpha. v0.1.0.dev0. The API surface above is what we'll ship as v0.1.0 stable. Breaking changes from here forward are documented in CHANGELOG.md.

  • ✅ Core: introspection + dispatch + envelope + errors + observability
  • ✅ Three exposure modes (decorator / tag / all)
  • ✅ Two backends shipped: LiteLLMBackend, FakeLLMBackend
  • ✅ 74 tests passing, 87% coverage, mypy strict, ruff clean
  • ✅ Examples + concepts/recipes/security docs

Documentation

| Doc | What it covers |
|---|---|
| docs/concepts.md | Mental model, request flow, two-layer auth, mode comparison, caching |
| docs/recipes.md | Custom backend, custom forwarding, tracing integrations, large-app strategies |
| docs/security.md | When mode="all" is dangerous, prompt injection, header forwarding |
| examples/ | Four runnable apps: basic, tag-mode, with-auth, with-observability |
| CONTRIBUTING.md | Dev setup, testing without API keys, adding a backend |

Testing

uv sync --extra dev
uv run pytest                                   # 74 tests, deterministic, no API keys
uv run pytest --cov=fastapi_ai_router           # coverage report
RUN_LLM_TESTS=1 uv run pytest tests/e2e/        # gated real-LLM smoke tests

The test suite is deterministic and network-free by default — every test uses FakeLLMBackend. Real-LLM tests are gated behind an env var and run only on release tags in CI.


Contributing

PRs welcome. See CONTRIBUTING.md for dev setup and the bar for new code (TDD, mypy strict, ruff clean, 80%+ coverage).

Particularly welcome:

  • New LLMBackend adapters (Anthropic-direct, Gemini-direct, vLLM, Ollama-direct, etc.)
  • Bug reports with minimal repro
  • Doc improvements

License

MIT — do whatever you like, attribution appreciated.

Built with care for the FastAPI community.
