Superfast logprob-native agent runtime
swiftagents
Superfast, logprob-native, async-first agent runtime.
Why swiftagents
- Logprob-native routing and uncertainty
- Tool-agnostic, model-agnostic (strict about logprobs)
- Async-first with bounded speculation (max 2 tools)
- Cost-aware, cacheable, and observable
- Optional judge pipeline
Install
pip install swiftagents
Local development:
pip install -e ".[dev]"
Quickstart
import asyncio

from swiftagents.core import AgentRuntime, AgentConfig, MockModelClient, ToolRegistry, ToolSpec


def web_search(query: str) -> dict:
    # Put any code here: Pinecone, DB, APIs, etc.
    return {"snippet": "Example result"}


async def main():
    client = MockModelClient()
    client.queue_text("TOOL=WEB")
    client.queue_text("Answer using web evidence")

    spec = ToolSpec(
        name="WEB",
        description="Search the web",
        input_schema={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
        example_calls=[],
        cost_hint="medium",
        latency_hint_ms=200,
        side_effects=False,
        cacheable=True,
        cancellable=True,
    )

    tools = ToolRegistry()
    tools.register_function(web_search, spec)

    runtime = AgentRuntime(client=client, tools=tools, config=AgentConfig())
    result = await runtime.run("Find the latest overview")
    print(result.answer)


asyncio.run(main())
Core concepts
Model clients (logprobs required)
swiftagents requires token-level logprobs for routing. If a backend cannot provide them, swiftagents fails with a hard error.
Supported clients:
- OpenAIChatCompletionsClient
- VLLMOpenAICompatibleClient
- MockModelClient (tests and examples)
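To illustrate what "logprobs required" means in practice, here is a minimal sketch that reads per-token alternatives from a response shaped like an OpenAI Chat Completions result requested with `logprobs=True` and `top_logprobs` set. The function and exception names are hypothetical and are not part of swiftagents. For simplicity it reads only the first generated token; a real router would need to locate the token that carries the label.

```python
class MissingLogprobsError(RuntimeError):
    """Hypothetical error raised when a backend returns no token logprobs."""


def first_token_logprobs(response: dict) -> dict:
    """Map each alternative for the first generated token to its logprob."""
    choice = response["choices"][0]
    logprobs = choice.get("logprobs")
    # Backends that do not support logprobs typically return null here.
    if not logprobs or not logprobs.get("content"):
        raise MissingLogprobsError("backend returned no token logprobs")
    first = logprobs["content"][0]
    return {alt["token"]: alt["logprob"] for alt in first.get("top_logprobs", [])}


response = {
    "choices": [
        {
            "logprobs": {
                "content": [
                    {
                        "token": "WEB",
                        "logprob": -0.1,
                        "top_logprobs": [
                            {"token": "WEB", "logprob": -0.1},
                            {"token": "NONE", "logprob": -2.4},
                        ],
                    }
                ]
            }
        }
    ]
}
print(first_token_logprobs(response))
```

Failing loudly at this point, rather than routing on text alone, keeps every downstream confidence computation well defined.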
Tools
Tools are async callables that carry a ToolSpec and return a ToolResult.
from swiftagents.core import ToolSpec, ToolResult


class MyTool:
    def __init__(self):
        self.spec = ToolSpec(
            name="RAG",
            description="Retrieve from docs",
            input_schema={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
            example_calls=[],
            cost_hint="medium",
            latency_hint_ms=200,
            side_effects=False,
            cacheable=True,
            cancellable=True,
        )

    async def __call__(self, **kwargs):
        return ToolResult(ok=True, data={"docs": []}, error=None, metadata={})
Register functions directly (no tool classes)
Use any code inside a function (sync or async) and register it with a ToolSpec.
from swiftagents.core import ToolRegistry, ToolSpec, ToolResult


def pinecone_search(query: str) -> dict:
    # Put any code here (Pinecone, DB, API, etc.)
    # return raw data or ToolResult
    return {"matches": [{"id": "doc1", "score": 0.92}]}


spec = ToolSpec(
    name="PINECONE",
    description="Vector search over Pinecone",
    input_schema={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
    example_calls=[],
    cost_hint="medium",
    latency_hint_ms=200,
    side_effects=False,
    cacheable=True,
    cancellable=True,
)

registry = ToolRegistry()
registry.register_function(pinecone_search, spec)
If you prefer decorators:
from swiftagents.core import ToolRegistry, ToolSpec, tool

spec = ToolSpec(
    name="PINECONE",
    description="Vector search over Pinecone",
    input_schema={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
    example_calls=[],
    cost_hint="medium",
    latency_hint_ms=200,
    side_effects=False,
    cacheable=True,
    cancellable=True,
)


@tool(spec)
async def pinecone_tool(query: str):
    return {"matches": [{"id": "doc1", "score": 0.92}]}


registry = ToolRegistry()
registry.register(pinecone_tool)
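Because registered functions may be sync or async and may return raw data, a registry has to normalize them somehow. The sketch below shows one plausible way to do that with the standard library; it is not the swiftagents implementation. `SketchResult` and `call_tool` are hypothetical names, and the result fields simply mirror the ToolResult fields shown earlier (ok, data, error, metadata).

```python
import asyncio
import inspect
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchResult:
    """Hypothetical stand-in mirroring the ToolResult fields."""
    ok: bool
    data: Any = None
    error: Optional[str] = None
    metadata: dict = field(default_factory=dict)


async def call_tool(fn, **kwargs) -> SketchResult:
    """Await async functions directly; run sync functions in a worker thread."""
    try:
        if inspect.iscoroutinefunction(fn):
            raw = await fn(**kwargs)
        else:
            # Keeps a blocking call from stalling the event loop (Python 3.9+).
            raw = await asyncio.to_thread(fn, **kwargs)
    except Exception as exc:
        return SketchResult(ok=False, error=str(exc))
    # Pass structured results through; wrap raw data.
    if isinstance(raw, SketchResult):
        return raw
    return SketchResult(ok=True, data=raw)


def sync_search(query: str) -> dict:
    return {"matches": [query]}


async def async_search(query: str) -> dict:
    return {"matches": [query.upper()]}


def broken_search(query: str) -> dict:
    raise ValueError("index unavailable")


async def demo():
    return [
        await call_tool(sync_search, query="docs"),
        await call_tool(async_search, query="docs"),
        await call_tool(broken_search, query="docs"),
    ]


results = asyncio.run(demo())
for r in results:
    print(r.ok, r.data, r.error)
```

Converting exceptions into a failed result, as assumed here, lets the caller treat tool failures as ordinary evidence instead of control flow.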
Routing (logprob-gated)
Routing prompts the same LLM to output TOOL=<LABEL>, where LABEL is either NONE or the name of a shortlisted tool.
Confidence is computed using logprobs, entropy, and margin. Low confidence triggers bounded speculation.
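As a rough illustration of how logprobs, entropy, and margin can combine into a confidence decision, consider the sketch below. The function names, the renormalization over candidate labels, and the thresholds are all assumptions made for this example; the runtime's actual formula may differ.

```python
import math


def routing_confidence(label_logprobs: dict) -> dict:
    """Summarize a router's distribution over candidate labels."""
    # Renormalize over the candidate labels only (softmax over logprobs).
    max_lp = max(label_logprobs.values())
    weights = {k: math.exp(v - max_lp) for k, v in label_logprobs.items()}
    total = sum(weights.values())
    probs = {k: w / total for k, w in weights.items()}

    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    top_label, top_p = ranked[0]
    second_p = ranked[1][1] if len(ranked) > 1 else 0.0

    # Shannon entropy in nats, scaled to [0, 1] by its maximum, log(n).
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    max_entropy = math.log(len(probs)) if len(probs) > 1 else 1.0

    return {
        "label": top_label,
        "top_prob": top_p,
        "margin": top_p - second_p,
        "normalized_entropy": entropy / max_entropy,
    }


def is_confident(summary: dict, min_margin: float = 0.3, max_entropy: float = 0.6) -> bool:
    """Hypothetical gate: a wide margin and low entropy mean no speculation."""
    return summary["margin"] >= min_margin and summary["normalized_entropy"] <= max_entropy


sure = routing_confidence({"WEB": -0.05, "RAG": -3.5, "NONE": -4.0})
unsure = routing_confidence({"WEB": -0.8, "RAG": -0.9, "NONE": -2.5})
print(sure["label"], is_confident(sure))
print(unsure["label"], is_confident(unsure))
```

In the second case WEB and RAG are nearly tied, so the margin is small and the gate fails; that is the situation in which a runtime would fall back to bounded speculation.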
Multi-tool routing modes
AgentConfig.multi_tool_mode controls how the runtime selects multiple tools:
- single: default single-label routing (bounded speculation when uncertain).
- multi_label: pick multiple tools from one router call using logprob thresholds.
- multi_intent: lightweight heuristic splitting, then route each segment.
- decompose: logprob-gated split decision + LLM decomposition into sub-questions, then route each.
All multi-tool modes merge tool evidence and produce one final answer.
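To make the multi_label idea concrete, the sketch below picks every tool whose probability clears a threshold from a single set of label logprobs. `select_labels`, the 0.2 threshold, and the cap of two tools are illustrative assumptions, not the library's defaults.

```python
import math


def select_labels(label_logprobs: dict, min_prob: float = 0.2, max_tools: int = 2) -> list:
    """Return tool labels whose probability clears min_prob, best first."""
    picked = [
        (label, lp)
        for label, lp in label_logprobs.items()
        # NONE means "no tool", so it is never selected as a tool.
        if label != "NONE" and math.exp(lp) >= min_prob
    ]
    picked.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in picked[:max_tools]]


selected = select_labels({"WEB": -0.7, "RAG": -1.0, "SQL": -3.0, "NONE": -4.0})
print(selected)
```

Here WEB and RAG both clear the threshold while SQL does not, so one router call yields two tools whose evidence would then be merged into a single answer.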
Judge
The judge behaves like a tool. It can be disabled, run with a cheap LLM, and optionally escalate to a stronger LLM.
It can also run deterministic stage0 checks before any model is called.
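The staged behaviour described above can be sketched as a small pipeline: a deterministic check first, then a cheap scorer, then escalation only when the cheap score is ambiguous. Everything here is hypothetical, including the function names, the scoring stubs that stand in for model calls, and the thresholds.

```python
import asyncio


def stage0_check(answer: str) -> bool:
    """Deterministic check: reject empty or whitespace-only answers."""
    return bool(answer.strip())


async def cheap_judge(answer: str) -> float:
    # Stand-in for a small model call; returns a score in [0, 1].
    return 0.9 if "evidence" in answer.lower() else 0.5


async def strong_judge(answer: str) -> float:
    # Stand-in for a larger, slower model call.
    return 0.8


async def judge(answer: str, enabled: bool = True, accept_at: float = 0.7, reject_at: float = 0.3) -> dict:
    if not enabled:
        return {"verdict": "accept", "stage": "disabled"}
    if not stage0_check(answer):
        return {"verdict": "reject", "stage": "stage0"}

    score = await cheap_judge(answer)
    if score >= accept_at:
        return {"verdict": "accept", "stage": "cheap"}
    if score <= reject_at:
        return {"verdict": "reject", "stage": "cheap"}

    # The cheap score is ambiguous, so escalate to the stronger judge.
    score = await strong_judge(answer)
    verdict = "accept" if score >= accept_at else "reject"
    return {"verdict": verdict, "stage": "strong"}


empty = asyncio.run(judge("   "))
clear = asyncio.run(judge("Answer grounded in web evidence"))
vague = asyncio.run(judge("Probably fine"))
print(empty, clear, vague)
```

Ordering the stages from cheapest to most expensive means the stronger model is only paid for when the earlier stages cannot decide.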
Caching and observability
- Tool and model decision caches with TTL
- Structured trace events
- Token usage metrics and wasted work ratio
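For readers unfamiliar with these terms, here is a minimal sketch of a TTL cache and one possible definition of a wasted work ratio. `TTLCache`, its injectable clock, and the formula (discarded tokens divided by total tokens) are assumptions for illustration; the runtime's own cache and metric definitions are not documented here.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop stale entries lazily on read.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)


def wasted_work_ratio(tokens_total: int, tokens_discarded: int) -> float:
    """Assumed definition: share of tokens spent on work that was thrown away."""
    if tokens_total == 0:
        return 0.0
    return tokens_discarded / tokens_total


now = [0.0]  # fake clock for the demo
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
key = ("WEB", "latest overview")
cache.set(key, {"snippet": "Example result"})
hit = cache.get(key)
now[0] = 31.0
miss = cache.get(key)
print(hit, miss, wasted_work_ratio(1000, 250))
```

Keying the cache on the tool name plus its arguments, as assumed above, only makes sense for tools marked cacheable and free of side effects.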
Examples
python -m swiftagents.examples.tool_selection
python -m swiftagents.examples.speculative_execution_demo
python -m swiftagents.examples.function_tool_demo
Benchmarks
python -m swiftagents.benchmarks.run_benchmark
Tests
pytest
Design notes
- The router is logprob-native; labels should be compact and stable (prefer short uppercase names).
- Speculation is bounded to two tools, and side-effecting tools are never run speculatively unless explicitly allowed.
- All runtime stages are async-first.
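The bounded speculation rule in these notes can be sketched with plain asyncio: start at most two side-effect-free tools, keep the first result, and cancel the other. This is an illustrative sketch rather than the runtime's scheduler; `speculate` and the tuple layout of `candidates` are hypothetical, and "first to finish wins" is an assumed policy.

```python
import asyncio


async def speculate(candidates, max_tools: int = 2):
    """Race up to max_tools side-effect-free tools and keep the first result.

    candidates: (name, coroutine_function, side_effects) tuples in router
    preference order. Returns (name, result), or None if nothing is eligible.
    """
    # Side-effecting tools are never started speculatively in this sketch.
    safe = [(name, fn) for name, fn, side_effects in candidates if not side_effects]
    chosen = safe[:max_tools]
    if not chosen:
        return None

    tasks = {asyncio.create_task(fn()): name for name, fn in chosen}
    done, pending = await asyncio.wait(list(tasks), return_when=asyncio.FIRST_COMPLETED)

    # Cancel the losers and wait for the cancellations to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    # If both finished together, prefer the router's ordering.
    winner = next(task for task in tasks if task in done)
    return tasks[winner], winner.result()


async def fast():
    await asyncio.sleep(0.01)
    return "fast result"


async def slow():
    await asyncio.sleep(0.2)
    return "slow result"


async def write_db():
    return "wrote"


result = asyncio.run(
    speculate([("DB_WRITE", write_db, True), ("SLOW", slow, False), ("FAST", fast, False)])
)
print(result)
```

DB_WRITE is skipped because it has side effects, so only SLOW and FAST race. This is also why the cancellable flag matters: losing tools must be safe to abandon partway through.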