pytest for AI agents — trace, debug and catch regressions in LLM swarms

These details have not been verified by PyPI

Project description

SwarmTrace

Observability for AI agents — trace, debug, and monitor with 2 lines of code

Install

pip install swarmtrace

Quick Start

from tracely import observe

@observe
def my_agent(question):
    return llm.chat(question)

my_agent("What is machine learning?")

Then view the traces in your terminal:

swarmtrace

Every call is recorded — latency, tokens, cost, errors. Nothing else to configure.
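
To make "every call is recorded" concrete, here is a minimal sketch of how a recording decorator can capture latency and errors. It is not SwarmTrace's implementation: `record_call` and `RECORDS` are hypothetical names, and token and cost capture are left out because they depend on the LLM client.

```python
import functools
import time

# Hypothetical in-memory store; SwarmTrace's real storage is not described here.
RECORDS = []

def record_call(func):
    """Record the name, latency, and any error for each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise  # the caller still sees the original exception
        finally:
            RECORDS.append({
                "name": func.__name__,
                "latency_s": time.perf_counter() - start,
                "error": error,
            })
    return wrapper

@record_call
def echo_agent(question):
    return f"echo: {question}"

@record_call
def failing_agent(question):
    raise ValueError("boom")

echo_agent("What is machine learning?")
try:
    failing_agent("x")
except ValueError:
    pass

for rec in RECORDS:
    print(rec)
```

The `finally` clause is what lets a failed call leave a record behind while the exception still propagates.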

Single Agent

Wrap your agent with @observe. Any LLM or tool calls inside it get tagged with kind="llm" or kind="tool" so they roll up into the agent's stats — they never appear as phantom agents on the dashboard.

from tracely import observe, init

init(api_key="your-key", endpoint="https://swarmtrace.vercel.app")

@observe
def my_agent(query):
    plan = call_llm(query)
    return search_web(plan)

@observe(kind="llm")
def call_llm(prompt):
    return client.chat(model="gpt-4o-mini", messages=[...])

@observe(kind="tool")
def search_web(q):
    ...

One agent card on the dashboard. call_llm and search_web fold their tokens, cost, and errors into my_agent — they never get their own card.

Multi-Agent Swarms

Every bare @observe is its own agent card. Nesting is handled automatically via contextvars — no IDs, no config.

from tracely import observe

@observe
def researcher(q):
    return call_llm(f"Research: {q}")

@observe
def summarizer(text):
    return call_llm(f"Summarize: {text}")

@observe
def orchestrator(q):
    research = researcher(q)
    return summarizer(research)

orchestrator("What is AGI?")

▶ orchestrator    4.2s  |  7 in / 78 out   |  $0.0003
  ▶ researcher    3.4s  |  7 in / 330 out  |  $0.0013
  ▶ summarizer    0.8s  |  338 in / 78 out |  $0.0005

Three agent cards on the dashboard — one per named agent. Sub-calls (call_llm) fold into whichever agent invoked them.

Span Kinds

Kind	Decorator	Dashboard
`agent`	`@observe` (default)	Own card — tasks, tokens, cost, status
`llm`	`@observe(kind="llm")`	Rolls up into calling agent
`tool`	`@observe(kind="tool")`	Rolls up into calling agent
`function`	`@observe(kind="function")`	Rolls up into calling agent

The rule: only functions you want as separate dashboard cards get bare @observe. Everything else gets a kind=.
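
One way to picture the roll-up rule is as a walk over a span tree in which every non-agent span adds its tokens to the nearest enclosing agent. The sketch below assumes that model; `Span` and `agent_cards` are hypothetical names, the token counts are illustrative, and whether SwarmTrace also propagates totals to ancestor agents is not stated here.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str
    kind: str            # "agent", "llm", "tool", or "function"
    tokens: int = 0      # tokens used directly by this span
    children: list = field(default_factory=list)

def agent_cards(root):
    """Return {agent_name: tokens}, folding non-agent spans into the nearest agent."""
    cards = {}

    def visit(span, owner):
        if span.kind == "agent":
            owner = span.name            # agents start their own card
            cards.setdefault(owner, 0)
        if owner is not None:
            cards[owner] += span.tokens  # spans outside any agent are ignored
        for child in span.children:
            visit(child, owner)

    visit(root, None)
    return cards

tree = Span("orchestrator", "agent", children=[
    Span("researcher", "agent", children=[Span("call_llm", "llm", tokens=100)]),
    Span("summarizer", "agent", children=[Span("call_llm", "llm", tokens=50)]),
])
print(agent_cards(tree))
```

Only the three agents appear as keys; the two `call_llm` spans contribute tokens but never get an entry of their own.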

Async Support

import asyncio
from tracely import observe

@observe
async def async_agent(q):
    return await llm.achat(q)

@observe
async def orchestrator(q):
    results = await asyncio.gather(
        async_agent(q),
        async_agent(q + " — deep dive"),
    )
    return " | ".join(results)

asyncio.run(orchestrator("Explain transformers"))
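
Concurrent agents can share a context variable safely because each task that `asyncio.gather` creates runs in its own copy of the caller's context. The sketch below demonstrates that standard-library behavior; `_current_span`, `child`, and `parent_agent` are hypothetical names, and SwarmTrace's own async handling may differ.

```python
import asyncio
import contextvars

_current_span = contextvars.ContextVar("current_span", default=None)

async def child(name, seen):
    parent = _current_span.get()   # inherited from the task that called gather
    _current_span.set(name)        # visible only inside this task's context copy
    await asyncio.sleep(0)         # yield so the sibling task interleaves
    seen[name] = (parent, _current_span.get())

async def parent_agent():
    _current_span.set("orchestrator")
    seen = {}
    await asyncio.gather(child("a", seen), child("b", seen))
    return seen, _current_span.get()

seen, after = asyncio.run(parent_agent())
print(seen, after)
```

Each child sees `orchestrator` as its parent and keeps its own value across the `await`, while the parent's value is unchanged after `gather` returns.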

Live Cost Tracking

Automatic cost calculation for any model from any provider — powered by LiteLLM's live pricing registry.

@observe
def agent(q):
    # OpenAI, Anthropic, Google, Mistral, DeepSeek,
    # Groq, Cohere, xAI — cost tracked automatically
    return client.chat(model="gpt-4o-mini", messages=[...])

Custom or fine-tuned models:

from tracely import set_model_pricing

set_model_pricing("my-finetune", input_per_million=5.00, output_per_million=15.00)
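
The per-million prices translate into a per-call cost with simple arithmetic. The helper below is a hypothetical illustration of that calculation, not a SwarmTrace function; only the parameter names `input_per_million` and `output_per_million` come from the call above.

```python
def call_cost(input_tokens, output_tokens, input_per_million, output_per_million):
    """Cost in dollars for one call, given prices per million tokens."""
    total = input_tokens * input_per_million + output_tokens * output_per_million
    return total / 1_000_000

# 2,000 input tokens and 500 output tokens at the "my-finetune" prices above.
cost = call_cost(2_000, 500, input_per_million=5.00, output_per_million=15.00)
print(f"${cost:.4f}")  # $0.0175
```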

Token Budget

Stop runaway agents before they burn your budget.

from tracely import observe, budget

@observe
@budget(max_tokens=10_000, on_exceed="warn")   # or "stop"
def agent(q):
    return llm.chat(q)
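
As a rough picture of what the two `on_exceed` modes could mean, the sketch below keeps a running token count per decorated function and either warns or raises once the limit is reached. Everything in it is an assumption: `token_budget` and `BudgetExceeded` are hypothetical names, and whitespace splitting stands in for a real tokenizer.

```python
import functools
import warnings

class BudgetExceeded(RuntimeError):
    """Raised when a budget in "stop" mode has been used up."""

def token_budget(max_tokens, on_exceed="warn"):
    if on_exceed not in ("warn", "stop"):
        raise ValueError("on_exceed must be 'warn' or 'stop'")

    def decorator(func):
        used = 0

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal used
            if used >= max_tokens:
                message = f"{func.__name__} used {used} of {max_tokens} tokens"
                if on_exceed == "stop":
                    raise BudgetExceeded(message)
                warnings.warn(message, RuntimeWarning)
            result = func(*args, **kwargs)
            used += len(str(result).split())  # crude stand-in for a tokenizer
            return result
        return wrapper
    return decorator

@token_budget(max_tokens=5, on_exceed="stop")
def agent(q):
    return "one two three"

agent("a")   # 3 tokens used so far
agent("b")   # 6 tokens used so far, over the limit of 5
try:
    agent("c")
    stopped = False
except BudgetExceeded:
    stopped = True
print(stopped)
```

In this sketch the check runs before each call, so the call that crosses the limit still completes and the next one is blocked.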

Regression Detection

Catch when a prompt change breaks your agent's behavior.

pip install "swarmtrace[regression]"

from tracely.regression import compare

compare(
    my_agent,
    inputs=["What is ML?", "How does Python work?", "What is an API?"],
    version_a_prompt="You are a helpful assistant.",
    version_b_prompt="Reply only in emojis.",
    threshold=0.6,
)

INPUT                    SIMILARITY   REGRESSION?
What is ML?              0.10         🔴 YES
How does Python work?    0.15         🔴 YES
What is an API?          0.12         🔴 YES

Result: 3/3 regressions detected
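
The table above boils down to a similarity score per input compared against a threshold. The sketch below shows that comparison using `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; the metric SwarmTrace actually uses is not stated here, and `find_regressions` is a hypothetical name.

```python
from difflib import SequenceMatcher

def find_regressions(outputs_a, outputs_b, threshold=0.6):
    """Return (input, similarity, is_regression) for each input in outputs_a."""
    rows = []
    for inp, text_a in outputs_a.items():
        similarity = SequenceMatcher(None, text_a, outputs_b[inp]).ratio()
        rows.append((inp, round(similarity, 2), similarity < threshold))
    return rows

outputs_a = {
    "What is ML?": "ML learns patterns from data.",
    "What is an API?": "An interface between programs.",
}
outputs_b = {
    "What is ML?": "ML learns patterns from data.",
    "What is an API?": "\U0001F916\U0001F50C",  # robot and plug emoji
}

rows = find_regressions(outputs_a, outputs_b, threshold=0.6)
for row in rows:
    print(row)
```

Character-level matching is a blunt measure; an embedding-based similarity would tolerate rephrasing better, at the cost of an extra dependency.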

Tool Attention

Reduce token overhead by up to 95% — only pass relevant tools to each agent call, scored via ISO Scoring (arXiv:2604.21816).

pip install "swarmtrace[tools]"

from tracely import ToolAttention

ta = ToolAttention(tools=all_my_tools)

@observe
def agent(query):
    relevant_tools = ta.select(query, top_k=3)
    return llm.chat(query, tools=relevant_tools)
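
The idea behind `ta.select(query, top_k=3)` is to score every tool against the query and keep only the best few. The sketch below illustrates that shape with Jaccard word overlap, which is a deliberately simple stand-in and not ISO Scoring; `select_tools` and the tool dictionaries are hypothetical.

```python
def select_tools(query, tools, top_k=3):
    """Return the top_k tools whose descriptions overlap most with the query."""
    query_words = set(query.lower().split())

    def score(tool):
        desc_words = set(tool["description"].lower().split())
        union = query_words | desc_words
        return len(query_words & desc_words) / len(union) if union else 0.0

    return sorted(tools, key=score, reverse=True)[:top_k]

tools = [
    {"name": "web_search", "description": "search the web for pages"},
    {"name": "calculator", "description": "evaluate arithmetic expressions"},
    {"name": "weather", "description": "get the weather forecast for a city"},
]

picked = select_tools("search the web for python tutorials", tools, top_k=1)
print([tool["name"] for tool in picked])
```

Passing one tool schema instead of three is where the token savings come from; the scoring function only decides which ones survive.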

Remote Dashboard

Send traces to the SwarmTrace dashboard for live monitoring.

from tracely import init, observe

init(
    api_key="your-swarmtrace-api-key",
    endpoint="https://swarmtrace.vercel.app",
)

@observe
def my_agent(q):
    ...

Or via environment variables:

export SWARMTRACE_API_KEY=your-key
export SWARMTRACE_ENDPOINT=https://swarmtrace.vercel.app
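
A common way to combine explicit arguments with environment variables is to let the explicit value win and fall back to the environment otherwise. The sketch below assumes that precedence, which is not confirmed for SwarmTrace; `resolve_config` is a hypothetical helper, and only the two variable names come from the lines above.

```python
import os

def resolve_config(api_key=None, endpoint=None, environ=os.environ):
    """Prefer explicit arguments, then fall back to SWARMTRACE_* variables."""
    return {
        "api_key": api_key or environ.get("SWARMTRACE_API_KEY"),
        "endpoint": endpoint or environ.get("SWARMTRACE_ENDPOINT"),
    }

# A plain dict stands in for the real environment so the example is repeatable.
fake_env = {
    "SWARMTRACE_API_KEY": "env-key",
    "SWARMTRACE_ENDPOINT": "https://swarmtrace.vercel.app",
}
config = resolve_config(api_key="explicit-key", environ=fake_env)
print(config)
```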

CLI

swarmtrace                       # last 100 traces
swarmtrace --limit 50            # last 50
swarmtrace-replay <id>           # replay any trace
swarmtrace-export --format json
swarmtrace-export --format csv

vs LangSmith

Feature	SwarmTrace	LangSmith
Open source	✅	❌
Works offline	✅	❌
Any LLM / any framework	✅	✅ (LangChain not required)
Live cost tracking	✅ all models	✅
Regression detection	✅	❌
Token budget enforcement	✅	❌
Tool attention (ISO)	✅	❌
Setup	2 lines	SDK + account
Price	Free	$20/month

Optional Extras

pip install "swarmtrace[regression]"   # AI regression detection
pip install "swarmtrace[tools]"        # Tool attention + FAISS
pip install "swarmtrace[budget]"       # Token budget with tiktoken
pip install "swarmtrace[scraper]"      # Web scraping traces
pip install "swarmtrace[all]"          # Everything

AMD MI300X Benchmarks

Tested on AMD Instinct MI300X 192GB via AMD Developer Cloud.

Metric	Value
Swarms tested	5
Total agent calls	20
Avg orchestrator latency	6.1s
Avg researcher latency	1.8s
Trace overhead	< 1ms

Built with ❤️ at AMD Hackathon 2026 by Ravi Kumar

Project details

Release history

Version	Release date
0.4.1	Jun 17, 2026
0.4.0	Jun 17, 2026
0.3.1 (this version)	Jun 15, 2026
0.3.0	Jun 15, 2026
0.2.1	Jun 14, 2026
0.2.0	Jun 12, 2026
0.1.9	Jun 12, 2026
0.1.8	Jun 9, 2026
0.1.7	May 7, 2026
0.1.6	May 7, 2026
0.1.5	May 7, 2026
0.1.4	May 6, 2026
0.1.3	May 4, 2026
0.1.2	May 1, 2026
0.1.0	Apr 30, 2026

Download files

Source Distribution

swarmtrace-0.3.1.tar.gz (20.9 kB)

Uploaded Jun 15, 2026 Source

Built Distribution

swarmtrace-0.3.1-py3-none-any.whl (22.7 kB)

Uploaded Jun 15, 2026 Python 3

File details

Details for the file swarmtrace-0.3.1.tar.gz.

File metadata

Download URL: swarmtrace-0.3.1.tar.gz
Upload date: Jun 15, 2026
Size: 20.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for swarmtrace-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`31a425b6b36372974e3d135c3591f5cff96ede4acd24546406da810ad4f1eb6c`
MD5	`ec5ac64a51d82414f5f12af247ec683e`
BLAKE2b-256	`7aafc574f2bd3cc78f0291762a9d8cb8004a649025e6ea3350608145744d0fb2`

File details

Details for the file swarmtrace-0.3.1-py3-none-any.whl.

File metadata

Download URL: swarmtrace-0.3.1-py3-none-any.whl
Upload date: Jun 15, 2026
Size: 22.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for swarmtrace-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4a9128df5104c451201d22edab53fc3ab0e7b7d2b68aa67bba57689ed2e684b7`
MD5	`4d5859c95357824505bc35ef2032478e`
BLAKE2b-256	`5efad495fe9d24e5556e4abd245f242aa39a6143e932cfbfdeb709c342dd0f58`


swarmtrace 0.3.1
