Skip to main content

Rust runtime for Python AI apps. Drop-in for openai/anthropic SDKs with native SSE streaming, an agent loop with concurrent tool dispatch, and Logfire-compatible OTel emission.

Project description

f3dx

The Rust runtime your Python imports. Drop-in for openai and anthropic SDKs with native SSE streaming, an agent loop with concurrent tool dispatch, and Logfire-compatible OTel emission. PyO3 + abi3 wheels for ubuntu/macos/windows. Built for pydantic-ai.

The intellectual frame is Cruz's "AI Runtime Infrastructure" (arXiv:2603.00495, Feb 2026): a distinct execution-time layer above the model and below the application that observes, reasons over, and intervenes in agent behavior at runtime. f3dx is that layer, in Rust, for Python apps.

pip install f3dx
import f3dx

# 5x faster streaming, drop-in for openai SDK
client = f3dx.OpenAI(api_key="...", base_url="https://api.openai.com/v1")
for chunk in client.chat_completions_create_stream({"model": "gpt-4", "messages": [...]}):
    print(chunk["choices"][0]["delta"].get("content", ""), end="")

# Drop-in for anthropic SDK with native Messages event handling
client = f3dx.Anthropic(api_key="...")
for event in client.messages_create_stream({"model": "claude-3-5-sonnet", "max_tokens": 1024, "messages": [...]}):
    if event.get("type") == "content_block_delta":
        print(event["delta"].get("text", ""), end="")

# 5-10x faster agent runtime via concurrent tool dispatch
agent = f3dx.AgentRuntime(system_prompt="...", concurrent_tool_dispatch=True)
result = agent.run(user_prompt, tools={...}, mock_responses=[...])

# Tool-call streaming reassembly: skip the accumulate-fragments boilerplate
for ev in client.chat_completions_create_stream_assembled({...}):
    if ev["type"] == "tool_call":
        result = dispatch(ev["name"], ev["arguments"])  # arguments is parsed dict, ready

# Validated structured output: skip accumulate-then-json.loads at end
for ev in client.chat_completions_create_stream_assembled(req, validate_json=True):
    if ev["type"] == "validated_output":
        process(ev["data"])  # already parsed
    elif ev["type"] == "validation_error":
        log.warning("model emitted invalid JSON: %s", ev["error"])

# Logfire-compatible OTel spans by default - gen_ai.* semconv
f3dx.configure_otel(
    endpoint="https://logfire-api.pydantic.dev/v1/traces",
    headers={"Authorization": f"Bearer {LOGFIRE_TOKEN}"},
)
# Every Agent.run + every chat_completions / messages call now emits
# spans with gen_ai.system, gen_ai.request.model, gen_ai.usage.{input,output}_tokens, etc.

Why

Compound AI systems (Zaharia BAIR 2024, Mei AIOS arXiv:2403.16971) are the dominant production pattern. The orchestration + HTTP layer is now the bottleneck, not the model. Every other AI infra layer is non-Python by 2026 (vLLM C++, TGI Rust, mistral.rs Rust, Outlines-core Rust, XGrammar C++). Orchestration is the last lane; f3dx ships it.

Bench results (reproducible from bench/)

What vs Speedup
f3dx.AgentRuntime concurrent dispatch pure-python sequential agent loop 5-10x at 5-10 tools/turn
f3dx.OpenAI streaming openai Python SDK 5.10x at 1000 chunks
f3dx.Anthropic streaming anthropic Python SDK 2.9-5.2x at 50-1000 events
Tool-call assembled stream raw fragment iteration 17 chunks -> 2 events
validate_json=True accumulate + json.loads + try/except one extra event, zero user code

All benches live under bench/, all use the stdlib mock servers in the same dir, all single-thread.

Architecture

Cargo workspace, five crates, one PyPI package:

f3dx/
  crates/
    f3dx-py/      PyO3 bridge cdylib (the only crate with #[pymodule])
    f3dx-rt/      agent runtime + concurrent tool dispatch
    f3dx-http/    LLM HTTP client (reqwest + native SSE + streaming JSON validation)
    f3dx-trace/   OpenTelemetry span emission (Logfire-compatible, gen_ai.* semconv)
    f3dx-mcp/     Model Context Protocol client (rmcp + stdio transport)

OpenAI-compatible endpoints (vLLM, Mistral, xAI, Groq, Together, Fireworks) all work via f3dx.OpenAI by setting base_url.

Observability

Configure once with f3dx.configure_otel(endpoint, headers, service_name, stdout). Every AgentRuntime.run emits a root span with gen_ai.system="f3dx" + gen_ai.prompt.length_chars + f3dx.{concurrent_tool_dispatch,iterations,tool_calls_executed,duration_ms,output.length_chars}.

Every chat_completions_create* / messages_create* emits a SpanKind::Client span:

gen_ai.system               openai | anthropic
gen_ai.operation.name       chat | messages
gen_ai.request.model        from request
gen_ai.request.{temperature, top_p, max_tokens, stream}
gen_ai.response.{id, model, finish_reasons}
gen_ai.usage.{input_tokens, output_tokens}

Streaming spans hold open until terminal chunk; usage attrs land when the closing chunk carries them (auto-injects stream_options.include_usage=true for OpenAI; reads message_start.message.usage + message_delta.usage for Anthropic).

Status: Ok on success, Status::error("<msg>") on HTTP failure.

JSONL trace sink for downstream replay-eval tools:

f3dx.configure_traces("traces.jsonl", capture_messages=True)
# every AgentRuntime.run appends one row with prompt + system_prompt +
# output + input_tokens + output_tokens (capture_messages off by default;
# opt-in because PII-sensitive). Polars/DuckDB scan via pl.scan_ndjson /
# duckdb.read_json. Replay via tracewright.

# Or convert to columnar parquet for fast analytics:
# pip install f3dx[arrow]
from f3dx.analytics import jsonl_to_parquet, tail_jsonl_to_parquet
jsonl_to_parquet("traces.jsonl", "traces.parquet")             # batch convert
# Or live-tail a long-running production process:
tail_jsonl_to_parquet("traces.jsonl", "traces.parquet",
                      poll_seconds=10, batch_size=200,
                      until=lambda: time.time() > deadline)
# pl.scan_parquet("traces.parquet").filter(pl.col("output_tokens") > 100).collect()

Layout

f3dx/
  bench/                            reproducible benches + verify scripts + stdlib mock servers
  crates/                           cargo workspace member crates
  python/f3dx/__init__.py           core Python wrapper (AgentRuntime, OpenAI, Anthropic, configure_otel)
  python/f3dx/compat/               opt-in subclass shims (f3dx[openai-compat])
  python/f3dx/pydantic_ai/          pydantic-ai integration (f3dx[pydantic-ai])
  python/f3dx/langchain/            langchain-openai integration (f3dx[langchain])
  pyproject.toml                    maturin build, optional extras
  Cargo.toml                        cargo workspace root + workspace lints
  rust-toolchain.toml               pinned to 1.90.0 for reproducible builds
  .github/workflows/ci.yml          ubuntu/macos/windows + clippy gate + built-wheel install
  .github/workflows/release.yml     glibc/musl x86_64+aarch64 wheels + macos x86_64+aarch64 + windows + sdist + OIDC PyPI publish

What this is not

f3dx is a Python-from-Rust runtime - a Rust core that ships as a Python wheel via PyO3. If you're building a pure Rust application and want an agent framework in your binary, look at AutoAgents (Rust agent framework with role-based multi-agent), rig (provider abstraction + RAG primitives in Rust), or mistral.rs (local inference engine). Different audience, different scope.

f3dx is not an inference engine. Use vLLM, TGI, mistral.rs, llama.cpp, or any OpenAI-compatible endpoint underneath; f3dx talks to them.

f3dx is not a multi-agent orchestration framework. It is the runtime layer below frameworks like pydantic-ai, LangChain, LlamaIndex, CrewAI, AutoGen.

Sibling projects

The f3d1 ecosystem on top of f3dx:

  • tracewright - pip install tracewright. Trace-replay adapter for pydantic-evals. Read an f3dx or pydantic-ai logfire JSONL trace, get a pydantic_evals.Dataset you can run any pydantic-evals evaluator against. Closes the loop from "we have observability" to "we have regression tests".
  • f3dx-cache - pip install f3dx-cache. Content-addressable LLM response cache + replay layer. redb + RFC 8785 JCS + BLAKE3. Identical requests fingerprint identically; cached response returns at sub-ms. CI runs the eval suite against captured prod traces at zero token cost. pytest11 plugin: @pytest.mark.f3dx_cache.
  • pydantic-cal - pip install pydantic-cal. Calibration metrics for pydantic-evals: ECE, MCE, ACE, Brier, reliability diagrams, Murphy 1973 decomposition, temperature/Platt/isotonic scaling, Fisher-Rao geometry kernel. The calibration layer the eval world is missing.
  • f3dx-router - pip install f3dx-router. In-process Rust router for LLM providers. Hedged-parallel + 429/5xx hot-swap < 1ms. Composes with hosted gateways like llmkit instead of competing.
  • llmkit - pip install llmkit-sdk or npx @f3d1/llmkit-cli. Hosted API gateway with budget enforcement, session tracking, cost dashboards, MCP server. The hosted complement to f3dx-router's in-process surface.

Composition with ATLAS-RTC (Cruz)

# pip install f3dx[atlas-rtc]
from atlas_rtc.adapters.mock_adapter import MockAdapter, MockScenario
from f3dx.atlas_rtc import controlled_completion

result = controlled_completion(
    prompt="Return JSON with name and age.",
    contract=["name", "age"],          # shorthand for JSONSchemaContract(required_keys=...)
    adapter=MockAdapter(scenario),     # or HFAdapter / VLLMAdapter for real models
)
# result.text='{"name":"alice","age":30}', result.valid=True, result.interventions=N

ATLAS-RTC (Christopher Cruz, MIT) is a runtime control layer that enforces structured outputs at decode time - drift detection + logit masking + rollback during generation. f3dx's runtime sits at a different layer (transport + observability + agent loop). They compose: ATLAS-RTC owns the per-token control loop, f3dx owns the request transport and trace emission. Most useful with local vLLM / HuggingFace where decode-time control is reachable; cloud APIs (OpenAI, Anthropic) don't expose that surface.

# pip install f3dx[vigil]
from f3dx.vigil import f3dx_jsonl_to_vigil_events

f3dx.configure_traces("traces.jsonl", capture_messages=True)
# ... agent runs ...
f3dx_jsonl_to_vigil_events("traces.jsonl", "events.jsonl", actor="robin_a")
# Robin B (cruz209/V.I.G.I.L) reads events.jsonl, builds Roses/Buds/Thorns
# diagnosis, proposes prompt + code adaptations.

V.I.G.I.L / Robin B is the reflective-supervisor sibling: reads a JSONL event log, builds an "emotional bank" appraisal (Roses / Buds / Thorns), diagnoses reliability issues, proposes prompt + code patches. f3dx provides the runtime that produces the trace; this bridge converts the trace into VIGIL's expected event shape.

MCP client

import f3dx, json

# spawn an MCP server over stdio (npm-based, Python-based, any binary)
client = f3dx.MCPClient.stdio("npx", ["-y", "@modelcontextprotocol/server-everything"])

for tool in client.list_tools():
    print(tool["name"], tool["description"])

result = client.call_tool("get-sum", json.dumps({"a": 7, "b": 35}))
# 'The sum of 7 and 35 is 42.'

f3dx-mcp is a sibling cargo crate; the rmcp Rust SDK drives the JSON-RPC handshake. Stdio + streamable-HTTP transports + sampling-callback bridge ship today; SSE-only transport (rare in practice - streamable-HTTP subsumes it for MCP) skipped.

Sampling callback - the MCP server can ask the connected client for a model completion via sampling/createMessage. Pass a Python callback to MCPClient.stdio / streamable_http and it fires on every such request:

def my_sampling(messages_json: str, system_prompt: str) -> str:
    # messages_json is the serialized rmcp message list; reach for whatever
    # field the request exposes. Run any model - f3dx.OpenAI, f3dx.Anthropic,
    # pydantic-ai Agent, ATLAS-RTC controlled_completion - and return text.
    return run_my_model(messages_json, system_prompt)

client = f3dx.MCPClient.stdio(
    "python", ["-m", "my_mcp_server"],
    sampling_callback=my_sampling,
)

Without a callback, sampling requests get the standard "method not supported" error.

Server-side - expose Python callables AS MCP tools that other MCP clients (Claude Desktop, IDE plugins, other f3dx-built clients) can call:

import f3dx, json

def add(args_json: str) -> str:
    args = json.loads(args_json)
    return str(args["a"] + args["b"])

server = f3dx.MCPServer(name="my-server", version="0.0.1")
server.add_tool(
    "add",
    add,
    description="Add two numbers.",
    input_schema={"type": "object", "properties": {"a": {"type": "number"}, "b": {"type": "number"}}, "required": ["a", "b"]},
)
server.serve_stdio()  # blocks until client closes

f3dx now ships the full bidirectional MCP surface: client (stdio + streamable-HTTP), server (stdio), and sampling-callback bridge so server-issued completions route back through user-controlled model code.

Adapter packages

# pip install f3dx[openai-compat]
from f3dx.compat import OpenAI, AsyncOpenAI    # subclass openai.OpenAI / openai.AsyncOpenAI
import openai
client = OpenAI(api_key=...)
isinstance(client, openai.OpenAI)               # True - passes isinstance checks in
                                                # instructor, litellm, smolagents, langchain
out = client.chat.completions.create(...)       # routes through Rust, returns
                                                # openai.types.chat.ChatCompletion

# pip install f3dx[anthropic-compat]
from f3dx.compat import AsyncAnthropic         # subclass anthropic.AsyncAnthropic
client = AsyncAnthropic(api_key=...)           # also intercepts client.beta.messages.create
                                               # for pydantic-ai's BetaMessage validation path

# pip install f3dx[pydantic-ai]
from f3dx.pydantic_ai import openai_model, anthropic_model, F3dxCapability
from pydantic_ai import Agent
cap = F3dxCapability()
agent = Agent(openai_model('gpt-4', api_key=...), capabilities=[cap])
result = await agent.run('hi')                  # f3dx-routed HTTP, capability counts requests
# anthropic_model('claude-haiku-4', api_key=...) likewise

# pip install f3dx[langchain]
from f3dx.langchain import ChatOpenAI
llm = ChatOpenAI(model='gpt-4', api_key=...)    # subclass of langchain_openai.ChatOpenAI
msg = llm.invoke('hi')                          # sync + ainvoke both routed via f3dx

What's not here yet

  • Gemini adapter (Phase C.2)
  • MCP V0.1: SSE + streamable-HTTP transports + sampling callback bridge (V0 ships stdio only; covers Claude Desktop + every npm-based server + python-based servers via python -m)
  • Parent-child trace context propagation between AgentRuntime span and HTTP child spans (needs Python-side context bridge)
  • Phase E V0.2.2: full streaming JSON parser (V0.2.1 ships fail-fast on invalid JSON prefix - catches the dominant LLM-prefaces-with-prose case in 5-10 tokens; bracket-balance + per-token schema FSM are the next steps)
  • Phase E V0.2.3: per-token JSON Schema state machine (V0.2 ships terminal-time output_schema= via jsonschema-rs; full per-token schema FSM lands once V0.2.2 is in)
  • Phase G V0.3: Rust-side parquet sink (V0.2 ships AppendingParquetWriter + tail_jsonl_to_parquet Python-side via pyarrow under f3dx[arrow]; a Rust-native sink would skip the JSONL middlefile but adds ~30MB to the wheel - deferred unless requested)
  • langchain-f3dx standalone PyPI package per LangChain partner-package convention (today integrated via the f3dx[langchain] extra; standalone-package split happens before LangChain partner-registry submission)

License

MIT.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

f3dx-0.0.18.tar.gz (71.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

f3dx-0.0.18-cp310-abi3-win_amd64.whl (5.2 MB view details)

Uploaded CPython 3.10+Windows x86-64

f3dx-0.0.18-cp310-abi3-musllinux_1_2_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.10+musllinux: musl 1.2+ x86-64

f3dx-0.0.18-cp310-abi3-musllinux_1_2_aarch64.whl (5.0 MB view details)

Uploaded CPython 3.10+musllinux: musl 1.2+ ARM64

f3dx-0.0.18-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.0 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.17+ x86-64

f3dx-0.0.18-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (4.8 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.17+ ARM64

f3dx-0.0.18-cp310-abi3-macosx_11_0_arm64.whl (4.6 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

f3dx-0.0.18-cp310-abi3-macosx_10_12_x86_64.whl (4.8 MB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file f3dx-0.0.18.tar.gz.

File metadata

  • Download URL: f3dx-0.0.18.tar.gz
  • Upload date:
  • Size: 71.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for f3dx-0.0.18.tar.gz
Algorithm Hash digest
SHA256 9f19e251efff5e63227f02d15480abb128dbcfdeb9fa3bb7f40fadae65ce4805
MD5 d9eda62d28770b496e0f6d31dfc3d342
BLAKE2b-256 7088368820110259a074b957ff790010b3aa972e1ae20cc4f1cc9bcd05697fce

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18.tar.gz:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: f3dx-0.0.18-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.2 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 552c3e006503a194ab0e6f2361dd6eba80d57225223bd02bc56280b8bcd0d39f
MD5 7c204665f7faea751158f46ee41c118c
BLAKE2b-256 2454c7e4480c36a363a74fdae3d3cbdb4951c5c45bce52db43960fdc2cf3ef72

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-win_amd64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 de06199aa0089a6e374586e9db8268b3e4bc14379c658208c07fcb8b79d4ca6c
MD5 208af64d785e2234f7ab991a473f3628
BLAKE2b-256 9695079f6669849f996d11bb2a7fd7891f6b4711c8bda32cb3952eb9cb759c63

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-musllinux_1_2_x86_64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 9e8f09a6c6d246ecfa3fd7c45b4d170a967a4b44e6d3af8a36fa9c2fba94c7fd
MD5 91021f9f7aa8bbc6704fc0b474978238
BLAKE2b-256 85225f930e7743c31a17a6e4827d961459559daa349c04d78f141bfc49b665a3

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-musllinux_1_2_aarch64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5210480daaeac454395daeb360841c71fa459bdcf0bafa7966c81cd9fd2c59cc
MD5 746ce61eff23422f30142c18c3adb1cc
BLAKE2b-256 d0ffc86cd4a1baa44bf9d70b6dadcf5a5d6dc8fa255e8ec2ec82de3e73d9c812

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6dd9a1cdc9e58cf3b1d9503ebf6d37ce9d998da4d39fab0c0f378f7704188bb1
MD5 6f108e21e19b0c17f85a5f64dc29831c
BLAKE2b-256 e878df78da98feebb2599c3adaadba03d51e1ba2f196de5e3fb2c9559c87414a

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d1df983fc142b47562b94dc789dd7eb8e48238c434df8345bf8dfea842419e0b
MD5 f6a53c635d8a63220d020fc1857d0298
BLAKE2b-256 79e4be6cc7ca5dd7460b298151dae949c981f89eea1dd14cdabf4f61c9b98249

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file f3dx-0.0.18-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for f3dx-0.0.18-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 c3476cd7b74d68725139070a112ffd2af982efe1d809806237dbcf91c287ada7
MD5 b478f96bd5a3e28e213ca6d34a76505e
BLAKE2b-256 6ddcc77c23521b5e699dc8901d41afafc33f5571362e3074d44c1e3449ee192e

See more details on using hashes here.

Provenance

The following attestation bundles were made for f3dx-0.0.18-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on smigolsmigol/f3dx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page