
pytest for AI agents — trace, debug and catch regressions in LLM swarms


SwarmTrace

Observability for AI agents — trace, debug, and monitor with 2 lines of code



Install

pip install swarmtrace

Quick Start

from tracely import observe

@observe
def my_agent(question):
    return llm.chat(question)

my_agent("What is machine learning?")

swarmtrace    # view traces in terminal

Every call is recorded — latency, tokens, cost, errors. Nothing else to configure.
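
For intuition, a decorator that records latency and errors can be sketched in a few lines. This is an illustrative stand-in, not SwarmTrace's implementation: `record_call` and `TRACES` are hypothetical names, and token and cost capture are left out.

```python
import functools
import time

TRACES = []  # hypothetical in-memory store, for illustration only


def record_call(fn):
    """Record the name, latency, and any error for each call to fn."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        entry = {"name": fn.__name__, "error": None}
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            entry["error"] = repr(exc)
            raise  # the caller still sees the original exception
        finally:
            entry["latency_s"] = time.perf_counter() - start
            TRACES.append(entry)
    return wrapper


@record_call
def my_agent(question):
    return f"echo: {question}"


my_agent("What is machine learning?")
print(TRACES[-1]["name"], TRACES[-1]["error"])
```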


Single Agent

Wrap your agent with @observe. Any LLM or tool calls inside it get tagged with kind="llm" or kind="tool" so they roll up into the agent's stats — they never appear as phantom agents on the dashboard.

from tracely import observe, init

init(api_key="your-key", endpoint="https://swarmtrace.vercel.app")

@observe
def my_agent(query):
    plan = call_llm(query)
    return search_web(plan)

@observe(kind="llm")
def call_llm(prompt):
    return client.chat(model="gpt-4o-mini", messages=[...])

@observe(kind="tool")
def search_web(q):
    ...

One agent card on the dashboard. call_llm and search_web fold their tokens, cost, and errors into my_agent — they never get their own card.


Multi-Agent Swarms

Every bare @observe is its own agent card. Nesting is handled automatically via contextvars — no IDs, no config.

from tracely import observe

@observe
def researcher(q):
    return call_llm(f"Research: {q}")

@observe
def summarizer(text):
    return call_llm(f"Summarize: {text}")

@observe
def orchestrator(q):
    research = researcher(q)
    return summarizer(research)

orchestrator("What is AGI?")

▶ orchestrator    4.2s  |  7 in / 78 out   |  $0.0003
  ▶ researcher    3.4s  |  7 in / 330 out  |  $0.0013
  ▶ summarizer    0.8s  |  338 in / 78 out |  $0.0005

Three agent cards on the dashboard — one per named agent. Sub-calls (call_llm) fold into whichever agent invoked them.


Span Kinds

Kind        Decorator                    Dashboard
agent       @observe (default)           Own card: tasks, tokens, cost, status
llm         @observe(kind="llm")         Rolls up into calling agent
tool        @observe(kind="tool")        Rolls up into calling agent
function    @observe(kind="function")    Rolls up into calling agent

The rule: only functions you want as separate dashboard cards get bare @observe. Everything else gets a kind=.
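
The roll-up rule can be illustrated with a small aggregation over span records. The record layout and the `roll_up` helper are assumptions made for this sketch, and the token counts are illustrative; the dashboard's actual aggregation may differ.

```python
from collections import defaultdict

# Hypothetical span records: id, parent id, kind, name, and token count.
spans = [
    {"id": 1, "parent": None, "kind": "agent", "name": "orchestrator", "tokens": 0},
    {"id": 2, "parent": 1, "kind": "agent", "name": "researcher", "tokens": 0},
    {"id": 3, "parent": 2, "kind": "llm", "name": "call_llm", "tokens": 337},
    {"id": 4, "parent": 1, "kind": "agent", "name": "summarizer", "tokens": 0},
    {"id": 5, "parent": 4, "kind": "llm", "name": "call_llm", "tokens": 416},
]


def roll_up(spans):
    """Sum tokens per agent card; non-agent spans count toward their nearest agent ancestor."""
    by_id = {s["id"]: s for s in spans}
    cards = defaultdict(int)
    for span in spans:
        owner = span
        # Walk up the parent chain until an agent span is found.
        while owner is not None and owner["kind"] != "agent":
            owner = by_id.get(owner["parent"])
        if owner is not None:
            cards[owner["name"]] += span["tokens"]
    return dict(cards)


cards = roll_up(spans)
print(cards)
```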


Async Support

import asyncio
from tracely import observe

@observe
async def async_agent(q):
    return await llm.achat(q)

@observe
async def orchestrator(q):
    results = await asyncio.gather(
        async_agent(q),
        async_agent(q + " — deep dive"),
    )
    return " | ".join(results)

asyncio.run(orchestrator("Explain transformers"))
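
Concurrent calls can keep the right parent because asyncio gives each task a copy of the context that was current when the task was created. The sketch below shows that behaviour with a hypothetical `traced_async` decorator; it is an assumption about the mechanism, not the library's code.

```python
import asyncio
import contextvars
import functools

_current_span = contextvars.ContextVar("current_span", default=None)
SPANS = []  # hypothetical store of (name, parent) pairs


def traced_async(fn):
    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        parent = _current_span.get()
        SPANS.append((fn.__name__, parent))
        token = _current_span.set(fn.__name__)
        try:
            return await fn(*args, **kwargs)
        finally:
            _current_span.reset(token)
    return wrapper


@traced_async
async def async_agent(q):
    await asyncio.sleep(0)  # yield to the event loop, as a real LLM call would
    return q.upper()


@traced_async
async def orchestrator(q):
    # gather() wraps each coroutine in a task that inherits the current context.
    results = await asyncio.gather(async_agent(q), async_agent(q + " deep dive"))
    return " | ".join(results)


result = asyncio.run(orchestrator("transformers"))
print(SPANS)
```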

Live Cost Tracking

Cost is calculated automatically for any model covered by LiteLLM's live pricing registry, whichever provider serves it.

from tracely import observe

@observe
def agent(q):
    # OpenAI, Anthropic, Google, Mistral, DeepSeek,
    # Groq, Cohere, xAI — cost tracked automatically
    return client.chat(model="gpt-4o-mini", messages=[...])

Custom or fine-tuned models:

from tracely import set_model_pricing

set_model_pricing("my-finetune", input_per_million=5.00, output_per_million=15.00)
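
Per-million pricing turns into a per-call cost with simple arithmetic. The helper below is a hypothetical illustration using the prices from the example above and made-up token counts; it is not part of the package.

```python
def estimate_cost(input_tokens, output_tokens, input_per_million, output_per_million):
    """Return the dollar cost of one call given per-million-token prices."""
    total = input_tokens * input_per_million + output_tokens * output_per_million
    return total / 1_000_000


# 338 input tokens at $5.00/M plus 78 output tokens at $15.00/M.
cost = estimate_cost(338, 78, input_per_million=5.00, output_per_million=15.00)
print(f"${cost:.5f}")
```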

Token Budget

Stop runaway agents before they burn your budget.

from tracely import observe, budget

@observe
@budget(max_tokens=10_000, on_exceed="warn")   # or "stop"
def agent(q):
    return llm.chat(q)
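
The difference between "warn" and "stop" can be sketched with a small stand-in decorator. Everything here is an assumption for illustration: `token_budget` and `BudgetExceeded` are hypothetical names, a whitespace word count replaces a real tokenizer, and usage is checked after each call.

```python
import functools
import warnings


class BudgetExceeded(RuntimeError):
    """Raised in "stop" mode once cumulative usage passes the limit."""


def token_budget(max_tokens, on_exceed="warn", count_tokens=lambda text: len(text.split())):
    def decorator(fn):
        used = 0  # cumulative tokens across all calls to fn

        @functools.wraps(fn)
        def wrapper(prompt):
            nonlocal used
            result = fn(prompt)
            used += count_tokens(prompt) + count_tokens(result)
            if used > max_tokens:
                message = f"{fn.__name__} used {used} tokens (limit {max_tokens})"
                if on_exceed == "stop":
                    raise BudgetExceeded(message)
                warnings.warn(message)
            return result
        return wrapper
    return decorator


@token_budget(max_tokens=5, on_exceed="stop")
def agent(q):
    return "a b c"


agent("one two")        # 2 + 3 = 5 tokens, within budget
try:
    agent("three")      # 1 + 3 more tokens pushes the total to 9
    stopped = False
except BudgetExceeded:
    stopped = True
print(stopped)
```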

Regression Detection

Catch when a prompt change breaks your agent's behavior.

pip install swarmtrace[regression]

from tracely.regression import compare

compare(
    my_agent,
    inputs=["What is ML?", "How does Python work?", "What is an API?"],
    version_a_prompt="You are a helpful assistant.",
    version_b_prompt="Reply only in emojis.",
    threshold=0.6,
)

INPUT                    SIMILARITY   REGRESSION?
What is ML?              0.10         🔴 YES
How does Python work?    0.15         🔴 YES
What is an API?          0.12         🔴 YES

Result: 3/3 regressions detected
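
The threshold logic can be sketched independently of how similarity is measured. This example swaps in `difflib.SequenceMatcher` (character-level similarity) as a stand-in, since the scoring method the package uses is not described here; `find_regressions` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a semantic score."""
    return SequenceMatcher(None, a, b).ratio()


def find_regressions(outputs_a, outputs_b, threshold=0.6):
    """Return (index, score) for each output pair scoring below the threshold."""
    flagged = []
    for i, (a, b) in enumerate(zip(outputs_a, outputs_b)):
        score = similarity(a, b)
        if score < threshold:
            flagged.append((i, score))
    return flagged


outputs_a = ["Machine learning lets computers learn from data.", "An API is an interface."]
outputs_b = ["Machine learning lets computers learn from data!", "🤖📡"]
flagged = find_regressions(outputs_a, outputs_b)
print(flagged)
```

Only the second pair is flagged: the first differs by one character, while the emoji reply shares nothing with the original answer.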

Tool Attention

Reduce token overhead by up to 95% — only pass relevant tools to each agent call, scored via ISO Scoring (arXiv:2604.21816).

pip install swarmtrace[tools]

from tracely import ToolAttention, observe

ta = ToolAttention(tools=all_my_tools)

@observe
def agent(query):
    relevant_tools = ta.select(query, top_k=3)
    return llm.chat(query, tools=relevant_tools)
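
The idea of top-k tool selection can be shown with a toy scorer. This sketch ranks tools by word overlap between the query and each tool description; it is not ISO Scoring, and `select_tools` and the tool dictionaries are hypothetical.

```python
def select_tools(query, tools, top_k=3):
    """Rank tools by how many query words appear in their description (a toy score)."""
    query_words = set(query.lower().split())

    def score(tool):
        return len(query_words & set(tool["description"].lower().split()))

    ranked = sorted(tools, key=score, reverse=True)
    # Keep at most top_k tools and drop any with no overlap at all.
    return [t for t in ranked[:top_k] if score(t) > 0]


all_my_tools = [
    {"name": "search_web", "description": "search the web for pages"},
    {"name": "get_weather", "description": "get the weather forecast for a city"},
    {"name": "send_email", "description": "send an email message"},
]

relevant = select_tools("what is the weather in Paris", all_my_tools, top_k=1)
print([t["name"] for t in relevant])
```

Passing only the selected tools keeps the unused tool schemas out of the prompt, which is where the token savings come from.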

Remote Dashboard

Send traces to the SwarmTrace dashboard for live monitoring.

from tracely import init, observe

init(
    api_key="your-swarmtrace-api-key",
    endpoint="https://swarmtrace.vercel.app",
)

@observe
def my_agent(q):
    ...

Or via environment variables:

export SWARMTRACE_API_KEY=your-key
export SWARMTRACE_ENDPOINT=https://swarmtrace.vercel.app
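
One plausible way to resolve these settings is to prefer explicit arguments and fall back to the environment. The precedence order and the `load_config` helper are assumptions for illustration, not documented behaviour of `init`.

```python
import os


def load_config(api_key=None, endpoint=None):
    """Prefer explicit arguments, then fall back to the environment variables."""
    return {
        "api_key": api_key or os.environ.get("SWARMTRACE_API_KEY"),
        "endpoint": endpoint or os.environ.get("SWARMTRACE_ENDPOINT"),
    }


# Simulate the exports above for the purpose of this example.
os.environ["SWARMTRACE_API_KEY"] = "your-key"
os.environ["SWARMTRACE_ENDPOINT"] = "https://swarmtrace.vercel.app"
config = load_config()
print(config)
```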

CLI

swarmtrace                       # last 100 traces
swarmtrace --limit 50            # last 50
swarmtrace-replay <id>           # replay any trace
swarmtrace-export --format json
swarmtrace-export --format csv

vs LangSmith

Feature                     SwarmTrace       LangSmith
Open source
Works offline
Any LLM / any framework                      ❌ LangChain only
Live cost tracking          ✅ all models
Regression detection
Token budget enforcement
Tool attention (ISO)
Setup                       2 lines          SDK + account
Price                       Free             $20/month

Optional Extras

pip install swarmtrace[regression]   # AI regression detection
pip install swarmtrace[tools]        # Tool attention + FAISS
pip install swarmtrace[budget]       # Token budget with tiktoken
pip install swarmtrace[scraper]      # Web scraping traces
pip install swarmtrace[all]          # Everything

AMD MI300X Benchmarks

Tested on AMD Instinct MI300X 192GB via AMD Developer Cloud.

Metric                      Value
Swarms tested               5
Total agent calls           20
Avg orchestrator latency    6.1s
Avg researcher latency      1.8s
Trace overhead              < 1ms

Built with ❤️ at AMD Hackathon 2026 by Ravi Kumar
