# fastapi-ai-router
Turn your existing FastAPI routes into a natural-language-callable surface — in one line.
Drop-in middleware. Zero new metadata. Uses the OpenAPI schema FastAPI already generates.
## What it does

```python
from fastapi import FastAPI
from fastapi_ai_router import AIRouter, ai_route
from fastapi_ai_router.backends.litellm import LiteLLMBackend

app = FastAPI()

@app.post("/orders/{order_id}/cancel")
@ai_route(description="Cancel a customer's order.")
def cancel_order(order_id: int, reason: str | None = None):
    ...

AIRouter(app, llm=LiteLLMBackend(model="gpt-4o-mini"))  # one line to enable
```

```bash
$ curl -X POST localhost:8000/ai \
    -H 'content-type: application/json' \
    -d '{"query":"cancel order 123 because it was a duplicate"}'
```

```json
{
  "endpoint": "POST /orders/{order_id}/cancel",
  "args": {"order_id": 123, "reason": "duplicate"},
  "result": {"status": "cancelled"},
  "reasoning": "User wants to cancel order 123 with reason 'duplicate'.",
  "result_status": 200
}
```
That's it. The LLM picked the right route, filled in the args, the middleware dispatched the call, and your existing `Depends(auth)`, middleware, and Pydantic validation all ran normally.
## Why this exists
Most LLM "routing" libraries are SaaS gateways or LangChain agents. There was no clean way to add a natural-language layer to an existing FastAPI app — until now. fastapi-ai-router is the conversational layer for any FastAPI codebase, and it leans on the OpenAPI schema FastAPI already generates so there's nothing new to maintain.
| Need | Without this library | With this library |
|---|---|---|
| Add NL to one endpoint | Write a LangChain agent + tool wrappers | Add `@ai_route` |
| Add NL to a whole app | Hand-code 50 tool wrappers | `AIRouter(app, llm=...)` |
| Keep auth/middleware/validation | Re-implement in your agent | Free — loopback through FastAPI |
| Swap models or providers | Rewrite agent | Swap `LLMBackend` |
| Test without an API key | 🥲 | `FakeLLMBackend(returns=ToolCall(...))` |
## How it works

`AIRouter(app, llm=...)` adds a single `POST /ai` endpoint to your FastAPI app. On the first request, it walks `app.routes` and projects each one (filtered by mode) into a JSON Schema tool definition — using the OpenAPI machinery FastAPI already generates. The user's natural-language `{"query": "..."}` is sent to your LLM along with those tool definitions; the LLM picks one tool and fills its arguments. The middleware then dispatches that call internally via httpx + `ASGITransport` (the same pattern FastAPI's `TestClient` uses), so your existing `Depends(auth)`, middleware, validation, and exception handlers all run normally — auth and tracing headers are forwarded transparently. The dispatched response is wrapped in an envelope showing what the LLM picked and why, and returned to the client with the dispatched call's HTTP status code.

In the quickstart above, the LLM read the `cancel_order` route's description and signature, decided it was the right match for "cancel order 123, it was a duplicate", extracted `order_id=123` and `reason="duplicate"` from the natural-language query, and the middleware dispatched the call exactly as if a normal client had hit `POST /orders/123/cancel?reason=duplicate` directly.
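The loopback itself is the standard httpx-against-ASGI pattern. A minimal sketch of that pattern (illustrative only; the function name, parameters, and internal base URL here are not this library's API):

```python
import httpx
from fastapi import FastAPI

async def loopback_dispatch(
    app: FastAPI, method: str, path: str, json: dict | None, headers: dict[str, str]
) -> httpx.Response:
    # Send the chosen call back into the same ASGI app in-process, so middleware,
    # Depends() chains, validation, and exception handlers all run as usual.
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://internal") as client:
        return await client.request(method, path, json=json, headers=headers)
```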
```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant Router as AIRouter at /ai
    participant LLM
    participant Route as FastAPI route
    participant Deps as Depends(auth)
    Client->>Router: POST /ai with query JSON
    Note over Router: Layer-1 deps fire here
    Router->>Router: Build tool defs from app.routes (cached)
    Router->>LLM: messages + tools (OpenAI tool-calling shape)
    LLM-->>Router: ToolCall name and args
    Router->>Router: Resolve name to RouteSpec, un-flatten, URL-encode
    Router->>Deps: Forward Authorization via httpx ASGI loopback
    Deps->>Route: Layer-2 auth passes
    Route-->>Router: dispatched response
    Router->>Router: wrap_envelope(decision, response)
    Router-->>Client: 200 OK with envelope
```
Internally the architecture is small and split by responsibility:
```mermaid
flowchart LR
    subgraph public ["Public surface"]
        AIRouter(["AIRouter"])
        ai_route(["ai_route decorator"])
        LLMBackend(["LLMBackend Protocol"])
    end
    subgraph core ["Core pipeline"]
        introspection["introspection<br/>mode-aware route walk"]
        schema["schema<br/>OpenAPI to flat tool defs"]
        dispatcher["dispatcher<br/>un-flatten + ASGI loopback"]
        envelope["envelope<br/>wrap or raw"]
        observability["observability<br/>async hooks"]
    end
    subgraph backends ["Backends"]
        LiteLLM["LiteLLMBackend<br/>via litellm extra"]
        Fake["FakeLLMBackend<br/>for tests"]
        BYO["Your backend<br/>implements Protocol"]
    end
    AIRouter --> introspection
    AIRouter --> schema
    AIRouter --> dispatcher
    AIRouter --> envelope
    AIRouter --> observability
    AIRouter -. uses .-> LLMBackend
    LLMBackend -. implemented by .-> LiteLLM
    LLMBackend -. implemented by .-> Fake
    LLMBackend -. implemented by .-> BYO
```
Each module has one responsibility, ~100-300 lines, fully typed, fully tested.
## Install

```bash
pip install fastapi-ai-router[litellm]
```

The `[litellm]` extra gives you OpenAI / Anthropic / Gemini / Ollama / 100+ providers via LiteLLM — usually all you need. To bring your own LLM, implement the `LLMBackend` Protocol and skip the extra entirely:

```bash
pip install fastapi-ai-router
```
## Exposure modes — explicit and safe by default

```mermaid
flowchart TD
    Start{Pick a mode<br/>at construction time}
    Start -->|"default — safest"| Decorator
    Start --> Tag
    Start --> All
    Decorator["mode='decorator'<br/><br/>Only routes decorated with<br/>@ai_route(expose=True)<br/>are exposed"]
    Tag["mode='tag', tag='ai'<br/><br/>Only routes whose tags=<br/>list contains the tag<br/>are exposed"]
    All["mode='all', exclude=[…]<br/><br/>Every route except<br/>excluded paths and<br/>@ai_route(expose=False)<br/>kill switches"]
    style Decorator fill:#d4edda,stroke:#28a745,color:#000
    style Tag fill:#fff3cd,stroke:#ffc107,color:#000
    style All fill:#f8d7da,stroke:#dc3545,color:#000
```
There is no silent fallback between modes — you pick one explicitly. `expose=False` on `@ai_route` is a kill switch that excludes a route from exposure in every mode, so you can mark sensitive routes as never-AI-callable regardless of how the `AIRouter` is configured elsewhere.
| Mode | Use when… | Default safety |
|---|---|---|
| `"decorator"` | You want surgical control over what's AI-callable. | ✅ Safest. Empty surface until you opt in. |
| `"tag"` | You already use FastAPI tags to organize routes. | 🟡 Safe if your tagging is intentional. |
| `"all"` | You're in a sandbox or trust the LLM completely. | 🔴 Footgun. Pair with `exclude=` and `expose=False`. |
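Putting the modes side by side (a sketch; the excluded admin path and the model name are illustrative, and the `mode`, `tag`, `exclude`, and `expose` parameters follow the diagram above):

```python
from fastapi import FastAPI
from fastapi_ai_router import AIRouter, ai_route
from fastapi_ai_router.backends.litellm import LiteLLMBackend

app = FastAPI()
backend = LiteLLMBackend(model="gpt-4o-mini")

# Kill switch: this route is never AI-callable, whichever mode is configured.
@app.delete("/users/{user_id}")
@ai_route(expose=False)
def delete_user(user_id: int) -> None:
    ...

# Pick exactly one mode at construction time:
AIRouter(app, llm=backend, mode="decorator")                   # only @ai_route(expose=True) routes
# AIRouter(app, llm=backend, mode="tag", tag="ai")             # only routes tagged "ai"
# AIRouter(app, llm=backend, mode="all", exclude=["/admin"])   # everything except exclusions
```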
## Two-layer auth — auth doesn't reinvent itself

```mermaid
flowchart LR
    Client((Client)) -- "Authorization: Bearer …" --> L1
    L1{Layer 1<br/>dependencies= on /ai}
    L1 -- pass --> LLM[/LLM picks a tool/]
    L1 -- fail 401/403 --> Reject((Rejected — no LLM call))
    LLM --> Dispatch[/Dispatcher: forward Authorization/]
    Dispatch --> L2{Layer 2<br/>route's own Depends auth}
    L2 -- pass --> Handler[/Route handler runs/]
    L2 -- fail 401/403 --> Envelope[envelope.result_status<br/>= 401 or 403]
    Handler --> Envelope
    Envelope --> Client
```
- Layer 1 gates "who can use the AI feature at all" (e.g., a paid-tier check on `/ai`).
- Layer 2 gates "who can call this specific endpoint" — and it's enforced by FastAPI's own `Depends()` chain on each dispatched route. Nothing about your auth changes. The `Authorization` header (and other configured headers) is forwarded transparently via the httpx loopback.
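Wired up, the two layers look roughly like this (a sketch assuming `AIRouter` accepts FastAPI-style `dependencies=` for the `/ai` route, per the Layer-1 box in the diagram; the dependency functions themselves are illustrative, not part of the library):

```python
from fastapi import Depends, FastAPI, Header, HTTPException
from fastapi_ai_router import AIRouter, ai_route

app = FastAPI()

async def require_paid_tier(x_plan: str = Header(default="free")) -> None:
    # Layer 1: gates who may use the AI feature at all.
    if x_plan != "paid":
        raise HTTPException(status_code=403, detail="AI routing is a paid feature")

async def current_user(authorization: str = Header(...)) -> str:
    # Layer 2: the route's own auth, unchanged. It runs again on the dispatched
    # call because the Authorization header is forwarded through the loopback.
    if not authorization.startswith("Bearer "):
        raise HTTPException(status_code=401)
    return authorization.removeprefix("Bearer ")

@app.post("/orders/{order_id}/cancel")
@ai_route(description="Cancel a customer's order.")
def cancel_order(order_id: int, user: str = Depends(current_user)):
    ...

AIRouter(app, llm=..., dependencies=[Depends(require_paid_tier)])
```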
## Bring your own LLM

```python
from fastapi_ai_router import AIRouter, LLMBackend, Message, ToolCall, ToolDef

class MyBackend:
    async def call(self, messages: list[Message], tools: list[ToolDef]) -> ToolCall | None:
        # call your LLM, parse the response, return ToolCall(...) or None
        ...

AIRouter(app, llm=MyBackend())
```

No subclassing required — the `LLMBackend` is a structural Protocol. Test backends are built the same way:

```python
from fastapi_ai_router.backends.fake import FakeLLMBackend

router = AIRouter(app, llm=FakeLLMBackend(returns=ToolCall(name="cancel", args={"order_id": 7}, ...)))
```

The whole test suite uses `FakeLLMBackend` — 74 tests pass deterministically without a single API key.
## Observability — pluggable, no vendor deps

```python
from fastapi_ai_router import AIRouter, Decision, ErrorEvent

async def to_langfuse(d: Decision) -> None:
    await langfuse_client.log(...)

async def to_sentry(e: ErrorEvent) -> None:
    sentry_sdk.capture_message(...)

AIRouter(app, llm=..., on_decision=to_langfuse, on_error=to_sentry)
```
Every routing decision (and every error) flows through async hooks you control. Pipe to Langfuse, OpenTelemetry, Sentry, plain logs, or a Postgres table — the library has zero hard dependency on any tracing vendor.
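Because the hooks are plain async callables, a zero-dependency setup is just stdlib logging. A sketch (the exact fields on `Decision` and `ErrorEvent` are whatever the library's models expose, so this simply logs the whole object):

```python
import logging

from fastapi_ai_router import AIRouter, Decision, ErrorEvent

log = logging.getLogger("fastapi_ai_router")

async def log_decision(decision: Decision) -> None:
    # One line per routing decision; no tracing vendor required.
    log.info("ai-router decision: %r", decision)

async def log_error(event: ErrorEvent) -> None:
    log.error("ai-router error: %r", event)

AIRouter(app, llm=..., on_decision=log_decision, on_error=log_error)
```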
## Error semantics

| Failure | HTTP status | Body shape |
|---|---|---|
| `NoRouteMatched` (LLM declined all tools) | 422 | `{"error":"no_route_matched", "available_tools":[…]}` |
| `UnknownTool` (LLM hallucinated a name) | 422 | `{"error":"unknown_tool", "tool_name":"…"}` |
| `LLMBackendError` (timeout, rate limit, etc.) | 502 | `{"error":"llm_backend_error", "retryable":true}` |
| Dispatched route 4xx/5xx | passthrough | envelope wraps the response, `result_status` set |
| `DispatchError` (transport failure) | 500 | `{"error":"dispatch_error", "detail":"…"}` |
Dispatched-route errors are never swallowed. If the route's Depends(auth) rejects with 403, the /ai response is also 403 — the library does not silently flatten downstream errors to 200.
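On the client side, that means you can branch on the status code plus the `error` field. A sketch assuming the body shapes shown in the table above:

```python
import httpx

async def ask(query: str) -> dict:
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        resp = await client.post("/ai", json={"query": query})
    body = resp.json()
    if resp.status_code == 422 and body.get("error") == "no_route_matched":
        # The LLM declined every tool; surface what was available.
        raise RuntimeError(f"no matching route; tools: {body['available_tools']}")
    if resp.status_code == 502 and body.get("retryable"):
        # LLM backend timeout / rate limit: safe to retry with backoff.
        raise RuntimeError("llm backend error, retry later")
    # Anything else is the envelope; the dispatched route's own status is in result_status.
    return body
```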
## What's not in v0.1 — by design
| Feature | Why not in v0.1 | When |
|---|---|---|
| Multi-step / agent loops | Stays out of LangChain's territory; sharp positioning | v0.3+ if there's pull |
| Conversation history | Single-shot is the demo | v0.3+ |
| Semantic caching | Out-of-scope for the first wedge | v0.2 |
| Streaming SSE responses | Adds complexity to the response path | v0.2 |
| Mountable sub-app | Single dedicated endpoint is cleaner | v0.2 |
| Form / multipart bodies | JSON-only keeps the loopback contract simple | v0.2 |
| Semantic prefiltering for 100+ routes | All tools sent every call in v0.1 | v0.2 |
Saying "we don't do this yet" up front is itself a positioning choice — see docs/concepts.md for the rationale.
## Project status

Alpha. v0.1.0.dev0. The API surface above is what we'll ship as v0.1.0 stable. Breaking changes from here forward are documented in CHANGELOG.md.

- ✅ Core: introspection + dispatch + envelope + errors + observability
- ✅ Three exposure modes (`decorator` / `tag` / `all`)
- ✅ Two backends shipped: `LiteLLMBackend`, `FakeLLMBackend`
- ✅ 74 tests passing, 87% coverage, mypy strict, ruff clean
- ✅ Examples + concepts/recipes/security docs
## Documentation
| Doc | What it covers |
|---|---|
| docs/concepts.md | Mental model, request flow, two-layer auth, mode comparison, caching |
| docs/recipes.md | Custom backend, custom forwarding, tracing integrations, large-app strategies |
| docs/security.md | When mode="all" is dangerous, prompt injection, header forwarding |
| examples/ | Four runnable apps: basic, tag-mode, with-auth, with-observability |
| CONTRIBUTING.md | Dev setup, testing without API keys, adding a backend |
## Testing

```bash
uv sync --extra dev
uv run pytest                                  # 74 tests, deterministic, no API keys
uv run pytest --cov=fastapi_ai_router          # coverage report
RUN_LLM_TESTS=1 uv run pytest tests/e2e/       # gated real-LLM smoke tests
```
The test suite is deterministic and network-free by default — every test uses FakeLLMBackend. Real-LLM tests are gated behind an env var and run only on release tags in CI.
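A minimal test in that style might look like this (a sketch; it assumes the tool name the router derives matches the route function's name, and uses only the `ToolCall` fields shown earlier):

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient
from fastapi_ai_router import AIRouter, ToolCall, ai_route
from fastapi_ai_router.backends.fake import FakeLLMBackend

app = FastAPI()

@app.post("/orders/{order_id}/cancel")
@ai_route(description="Cancel a customer's order.")
def cancel_order(order_id: int, reason: str | None = None):
    return {"status": "cancelled"}

# The fake backend always "picks" this tool call, so no API key or network is needed.
AIRouter(app, llm=FakeLLMBackend(returns=ToolCall(name="cancel_order", args={"order_id": 7})))

def test_ai_endpoint_routes_to_cancel_order() -> None:
    resp = TestClient(app).post("/ai", json={"query": "cancel order 7"})
    assert resp.status_code == 200
    assert resp.json()["args"] == {"order_id": 7}
```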
## Contributing
PRs welcome. See CONTRIBUTING.md for dev setup and the bar for new code (TDD, mypy strict, ruff clean, 80%+ coverage).
Particularly welcome:
- New `LLMBackend` adapters (Anthropic-direct, Gemini-direct, vLLM, Ollama-direct, etc.)
- Bug reports with minimal repro
- Doc improvements
## License
MIT — do whatever you like, attribution appreciated.
Built with care for the FastAPI community.