Lightning-paywalled FastAPI service — niche: OTel-Compatible Per-Task Cost Splitter for Multi-Agent Systems
milo-agent-trace-cost-attributor
OTel-compatible per-task cost splitter for multi-agent systems. Free OSS processor + Lightning-paywalled batch API. No KYC, no signup, no $300/mo SaaS floor.
What it does
You're running a multi-agent system. Maybe LangGraph, maybe CrewAI, maybe
hand-rolled. Your traces look like a forest of gen_ai.* spans. Your
finance team wants to know:
- How much did each agent cost this month?
- How much did each user / tenant / workflow cost?
- Which agent is the budget hog?
Existing tools (Langfuse, LangSmith, Braintrust, Helicone, Arize Phoenix, Datadog LLM Obs) can answer this, but their paid plans run from tens to hundreds of dollars a month (see the comparison table under Pricing), and they split cost by model, not by task. If you want per-tenant or per-agent rollups, you end up rebuilding them in BigQuery.
This service is the missing primitive. Drop in an OTel trace JSON, get back a per-attribution-key cost breakdown:
{
"trace_id": "trace-multi-agent-001",
"total_usd": 0.01910,
"total_tokens": {"input": 2500, "output": 1000},
"span_count": 4,
"attributions": [
{"key": "writer_agent", "cost_usd": 0.0114, "input_tokens": 800, "output_tokens": 600},
{"key": "research_agent", "cost_usd": 0.00725, "input_tokens": 1500, "output_tokens": 350},
{"key": "critic_agent", "cost_usd": 0.00045, "input_tokens": 200, "output_tokens": 50}
]
}
That's it. JSON in, JSON out, no dashboard, no account.
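As a rough illustration of consuming that report, the sketch below answers the "budget hog" question from the sample response. The helper names `budget_hog` and `cost_shares` are made up for this example and are not part of the package.

```python
# Sample report copied from the response shown above.
report = {
    "trace_id": "trace-multi-agent-001",
    "total_usd": 0.01910,
    "total_tokens": {"input": 2500, "output": 1000},
    "span_count": 4,
    "attributions": [
        {"key": "writer_agent", "cost_usd": 0.0114, "input_tokens": 800, "output_tokens": 600},
        {"key": "research_agent", "cost_usd": 0.00725, "input_tokens": 1500, "output_tokens": 350},
        {"key": "critic_agent", "cost_usd": 0.00045, "input_tokens": 200, "output_tokens": 50},
    ],
}


def budget_hog(report: dict) -> dict:
    """Return the attribution entry with the highest cost (hypothetical helper)."""
    return max(report["attributions"], key=lambda a: a["cost_usd"])


def cost_shares(report: dict) -> dict:
    """Map each attribution key to its fraction of the summed cost."""
    total = sum(a["cost_usd"] for a in report["attributions"])
    if total == 0:
        return {}
    return {a["key"]: a["cost_usd"] / total for a in report["attributions"]}


print(budget_hog(report)["key"])
print(cost_shares(report))
```

The per-key costs in the sample add up to `total_usd`, so the shares sum to 1.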
Endpoint catalog
| Endpoint | Tier | Pricing / limit | What it does |
|---|---|---|---|
| GET / | free | n/a | Service info + endpoint catalog + free sample |
| POST /attribute | free | 10 req / 60s / IP | Single-trace attribution. Body: OTel trace JSON. Returns full report + summary. |
| POST /attribute/batch | paid | 100 sats per 10k spans | Multi-trace batch. First call returns 402 + BOLT-11; re-call with ?payment_hash=<h> after settlement. |
| GET /pro | paid | $9 (one-time) | Legacy worked-example endpoint (kept for compat). |
Quick start
pip install -e .
python -m milo_agent_trace_cost_attributor.server --port 8101
# Free single-trace attribution:
curl -X POST http://127.0.0.1:8101/attribute \
-H "content-type: application/json" \
-d @tests/fixtures/multi_agent_flat.json
# Paid batch:
curl -X POST http://127.0.0.1:8101/attribute/batch \
-H "content-type: application/json" \
-d '{"traces": [<otel_trace_1>, <otel_trace_2>, ...]}'
# → 402 + BOLT-11 + batch metadata. Pay invoice in any LN wallet, then:
curl -X POST "http://127.0.0.1:8101/attribute/batch?payment_hash=<h>" \
-H "content-type: application/json" \
-d '{"traces": [...]}'
Trace shapes accepted
Both common OpenTelemetry serializations work:
A. Flat agent-framework shape (LangGraph, CrewAI, custom):
{
"traceId": "...",
"spans": [
{"spanId": "...", "name": "llm.chat",
"startTimeUnixNano": ..., "endTimeUnixNano": ...,
"attributes": {
"agent.name": "research_agent",
"gen_ai.request.model": "gpt-4o",
"gen_ai.usage.input_tokens": 1000,
"gen_ai.usage.output_tokens": 250
}}
]
}
B. OTLP-wire shape (opentelemetry-instrumentation-openai-v2 etc.):
{
"resourceSpans": [{
"resource": {"attributes": [{"key": "service.name", "value": {"stringValue": "..."}}]},
"scopeSpans": [{"spans": [{ ... }]}]
}]
}
Attribution-key precedence
The processor picks the bucket key for each span using the first match of:
1. attributes["agent.name"] (OpenInference / agentic convention)
2. attributes["gen_ai.agent.name"] (OTel GenAI semconv, WIP)
3. attributes["service.name"] (OTel resource attribute)
4. span.name (fallback)
Override with your own field by passing custom logic in code (see
domain.CostAttributor).
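The first-match rule can be sketched as a small lookup over a flat-shape span. This is an illustration, not the code in domain.CostAttributor. The function name `pick_attribution_key`, the treatment of empty strings as missing, and the final `"unknown"` fallback are assumptions.

```python
# Attribute names checked in order; span.name is the last resort.
KEY_PRECEDENCE = ("agent.name", "gen_ai.agent.name", "service.name")


def pick_attribution_key(span: dict) -> str:
    """Return the bucket key for one span using first-match precedence."""
    attrs = span.get("attributes") or {}
    for name in KEY_PRECEDENCE:
        value = attrs.get(name)
        if value:  # assumption: missing and empty values are both skipped
            return str(value)
    # Assumption: a span with no name at all lands in an "unknown" bucket.
    return span.get("name") or "unknown"


span = {"name": "llm.chat", "attributes": {"service.name": "api", "agent.name": "research_agent"}}
print(pick_attribution_key(span))  # research_agent
```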
Cost-basis modes
Each per-span cost record carries a cost_basis:
- token_usage: span had gen_ai.usage.* tokens AND model is in the embedded price table. Highest confidence.
- token_no_price: tokens present but model unknown. Cost = 0; tokens surface so the consumer can price them themselves.
- duration_share: caller passed trace_total_usd and span lacks tokens; cost prorated by duration share of total trace duration.
- unknown: no signal at all. Cost = 0.
Pricing
Free OSS + free /attribute endpoint (rate-limited to 10/min/IP).
Hosted batch: 100 sats per 10,000 spans processed (~$0.04/10k spans at $40k/BTC). Pay-per-call via Lightning Network — no subscription, no KYC, no committee. Compare:
| Tool | Floor | Per-call |
|---|---|---|
| Langfuse Cloud | $0 → $89/mo for >100k events | included in plan |
| LangSmith | $39/mo Plus, $499/mo Pro | trace-bundled |
| Braintrust | $249/mo Pro | event-bundled |
| Helicone Pro | $80/mo | included |
| this service | $0 | 100 sats / 10k spans (~$0.04) |
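The sats-to-dollars figure is simple arithmetic: 100 sats is 100 / 100,000,000 BTC, which at $40,000 per BTC comes to $0.04. The sketch below reproduces it. How the service bills a partial block of spans is not stated above, so rounding up to a whole 10k block is an assumption, and both function names are hypothetical.

```python
import math

SATS_PER_BTC = 100_000_000
SATS_PER_BLOCK = 100      # from the pricing above
SPANS_PER_BLOCK = 10_000  # from the pricing above


def batch_price_sats(span_count: int) -> int:
    """Price a batch in sats. Assumption: partial blocks round up."""
    return math.ceil(span_count / SPANS_PER_BLOCK) * SATS_PER_BLOCK


def sats_to_usd(sats: int, usd_per_btc: float) -> float:
    """Convert sats to USD at a given BTC price."""
    return sats / SATS_PER_BTC * usd_per_btc


print(batch_price_sats(25_000))                      # 300 under the round-up assumption
print(round(sats_to_usd(100, usd_per_btc=40_000), 4))  # 0.04
```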
Badge embed
Drop this in your repo README to attribute back:
[](https://github.com/miloantaeus/milo-agent-trace-cost-attributor)
30-day kill criterion
Per Milo's market-truth doctrine, every SKU has a deprecation criterion.
This one's is: if no third party has made a POST /attribute call
within 30 days of public launch, kill the hosted service and keep only
the OSS package. Planned distribution paths: PyPI, an OpenTelemetry
contrib registry PR, and the awesome-llm-observability list.
Tests
pip install -e ".[test]"
pytest -q
# 38 passed
Architecture
Wired to milo-paywall-kit from day zero. Lightning provider, ledger, and HTTP-402 mediation live in the kit — this SKU only owns:
- domain.py: CostAttributor.attribute(), summarize(), price table
- server.py: FastAPI routes (/, /attribute, /attribute/batch, /pro), in-process IP rate-limiter
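For a sense of what an in-process per-IP limiter at 10 requests per 60 seconds can look like, here is a sliding-window sketch. Whether server.py uses a sliding window, a fixed window, or a token bucket is not stated here, so treat the algorithm and the `SlidingWindowLimiter` name as assumptions.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds for each client IP."""

    def __init__(self, max_requests=10, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests need not sleep
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

State lives in process memory, so limits reset on restart and are not shared across workers.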
Lightning Network section
This service uses milo-paywall-kit for L402-style
payment. To pay /attribute/batch: hit the endpoint, get a BOLT-11
invoice in the 402 body, pay it in any LN wallet (Alby, Phoenix, Wallet
of Satoshi, Zeus), then re-call with ?payment_hash=<hash>. The
settlement watcher (milo-agent-trace-cost-attributor-lightning-watcher)
flips the ledger to paid within ~60s of settlement.
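The pay-then-retry loop can be sketched in client code as follows. The HTTP transport and the wallet are injected as callables, so nothing here depends on a particular HTTP or Lightning library. The 402 body field names `invoice` and `payment_hash` are guesses for illustration. Check the actual 402 response before relying on them. Polling covers the ~60s the watcher may take to mark the invoice paid.

```python
import time


def attribute_batch(post, pay, traces, attempts=7, delay_s=10.0, sleep=time.sleep):
    """Call /attribute/batch, pay the invoice on 402, then poll until paid.

    post(path, body) -> (status_code, json_dict)   # caller-supplied transport
    pay(bolt11_invoice) -> None                    # caller-supplied wallet hook
    """
    path = "/attribute/batch"
    body = {"traces": traces}

    status, data = post(path, body)
    if status == 200:
        return data
    if status != 402:
        raise RuntimeError(f"unexpected status {status}")

    pay(data["invoice"])  # hypothetical field name for the BOLT-11 string
    paid_path = f"{path}?payment_hash={data['payment_hash']}"  # hypothetical field name

    for attempt in range(attempts):
        status, result = post(paid_path, body)
        if status == 200:
            return result
        if status != 402:
            raise RuntimeError(f"unexpected status {status}")
        if attempt < attempts - 1:
            sleep(delay_s)  # ledger not flipped yet; wait and retry
    raise TimeoutError("invoice was not marked paid in time")
```

The same request body is sent on every call, matching the quick-start curl commands.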
Stacker.News launch template
**milo-agent-trace-cost-attributor** — per-task cost splitter for multi-agent LLM systems, paid in sats
If you run LangGraph / CrewAI / any multi-agent system, you've hit this:
your OTel traces have hundreds of gen_ai.* spans and your finance team
wants per-agent cost rollups. Existing tools (Langfuse, LangSmith,
Braintrust) start at $300+/mo and split by model, not by task.
Built a tiny FastAPI processor. POST /attribute with an OTel trace JSON,
get back per-agent / per-user / per-tenant cost breakdown. Free for
single traces (rate-limited). Batch at 100 sats per 10k spans, paid via
Lightning. No KYC, no signup.
Free: <your-deploy-url>/attribute
Paid batch: <your-deploy-url>/attribute/batch
Code: https://github.com/miloantaeus/milo-agent-trace-cost-attributor
Zap me at <your-LN-address> if it saves your finance team an afternoon.
License
MIT.
File details
Details for the file milo_agent_trace_cost_attributor-0.1.0-py3-none-any.whl.
File metadata
- Download URL: milo_agent_trace_cost_attributor-0.1.0-py3-none-any.whl
- Upload date:
- Size: 17.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba0301909c0a93b0d2baf0616ba28d5a553e092a2af432452c3a5c24b47b4891 |
| MD5 | ee0a1b7ed73612979badc433a20e0a9c |
| BLAKE2b-256 | cf01a237efabeb757ea0e39146d25b3579fcb71da7257146bc68caa4be751837 |