Lightning-paywalled FastAPI service — niche: OTel-Compatible Per-Task Cost Splitter for Multi-Agent Systems

Project description

milo-agent-trace-cost-attributor

OTel-compatible per-task cost splitter for multi-agent systems. Free OSS processor + Lightning-paywalled batch API. No KYC, no signup, no $300/mo SaaS floor.

What it does

You're running a multi-agent system. Maybe LangGraph, maybe CrewAI, maybe hand-rolled. Your traces look like a forest of gen_ai.* spans. Your finance team wants to know:

  • How much did each agent cost this month?
  • How much did each user / tenant / workflow cost?
  • Which agent is the budget hog?

Existing tools (Langfuse, LangSmith, Braintrust, Helicone, Arize Phoenix, Datadog LLM Obs) answer this, but they want $300–$1000/month minimum and they split by model, not by task. If you want per-tenant or per-agent rollups, you're rebuilding them in BigQuery.

This service is the missing primitive. Drop in an OTel trace JSON, get back a per-attribution-key cost breakdown:

{
  "trace_id": "trace-multi-agent-001",
  "total_usd": 0.01910,
  "total_tokens": {"input": 2500, "output": 1000},
  "span_count": 4,
  "attributions": [
    {"key": "writer_agent",   "cost_usd": 0.0114, "input_tokens": 800,  "output_tokens": 600},
    {"key": "research_agent", "cost_usd": 0.00725, "input_tokens": 1500, "output_tokens": 350},
    {"key": "critic_agent",   "cost_usd": 0.00045, "input_tokens": 200,  "output_tokens": 50}
  ]
}

That's it. JSON in, JSON out, no dashboard, no account.
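
A consumer may want to confirm that the per-key rows roll up to the reported totals before handing numbers to finance. The following is a minimal sketch using the sample report above. The helper name check_report is hypothetical and not part of the package, and it assumes every span lands in some attribution bucket, as in the sample.

```python
import json
import math

report = json.loads("""
{
  "trace_id": "trace-multi-agent-001",
  "total_usd": 0.01910,
  "total_tokens": {"input": 2500, "output": 1000},
  "span_count": 4,
  "attributions": [
    {"key": "writer_agent",   "cost_usd": 0.0114,  "input_tokens": 800,  "output_tokens": 600},
    {"key": "research_agent", "cost_usd": 0.00725, "input_tokens": 1500, "output_tokens": 350},
    {"key": "critic_agent",   "cost_usd": 0.00045, "input_tokens": 200,  "output_tokens": 50}
  ]
}
""")


def check_report(report):
    """Verify the rollups add up, then return each key's share of total cost.

    Hypothetical helper: assumes attributions cover every span in the trace.
    """
    rows = report["attributions"]
    if not math.isclose(sum(r["cost_usd"] for r in rows), report["total_usd"], abs_tol=1e-9):
        raise ValueError("attributed cost does not sum to total_usd")
    if sum(r["input_tokens"] for r in rows) != report["total_tokens"]["input"]:
        raise ValueError("input tokens do not sum to total")
    if sum(r["output_tokens"] for r in rows) != report["total_tokens"]["output"]:
        raise ValueError("output tokens do not sum to total")
    return {r["key"]: r["cost_usd"] / report["total_usd"] for r in rows}


shares = check_report(report)
top = max(shares, key=shares.get)
print(top, f"{shares[top]:.1%}")  # the budget hog and its share of spend
```

For the sample, the largest share belongs to writer_agent, at roughly 60% of the trace cost.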

Endpoint catalog

| Endpoint | Tier | Pricing | What it does |
| --- | --- | --- | --- |
| GET / | free | | Service info + endpoint catalog + free sample |
| POST /attribute | free | 10 req / 60s / IP | Single-trace attribution. Body: OTel trace JSON. Returns full report + summary. |
| POST /attribute/batch | paid | 100 sats per 10k spans | Multi-trace batch. First call returns 402 + BOLT-11; re-call with ?payment_hash=<h> after settlement. |
| GET /pro | paid | $9 (one-time) | Legacy worked-example endpoint (kept for compat). |

Quick start

pip install -e .
python -m milo_agent_trace_cost_attributor.server --port 8101

# Free single-trace attribution:
curl -X POST http://127.0.0.1:8101/attribute \
  -H "content-type: application/json" \
  -d @tests/fixtures/multi_agent_flat.json

# Paid batch:
curl -X POST http://127.0.0.1:8101/attribute/batch \
  -H "content-type: application/json" \
  -d '{"traces": [<otel_trace_1>, <otel_trace_2>, ...]}'
# → 402 + BOLT-11 + batch metadata. Pay invoice in any LN wallet, then:
curl -X POST "http://127.0.0.1:8101/attribute/batch?payment_hash=<h>" \
  -H "content-type: application/json" \
  -d '{"traces": [...]}'

Trace shapes accepted

Both common OpenTelemetry serializations work:

A. Flat agent-framework shape (LangGraph, CrewAI, custom):

{
  "traceId": "...",
  "spans": [
    {"spanId": "...", "name": "llm.chat",
     "startTimeUnixNano": ..., "endTimeUnixNano": ...,
     "attributes": {
       "agent.name": "research_agent",
       "gen_ai.request.model": "gpt-4o",
       "gen_ai.usage.input_tokens": 1000,
       "gen_ai.usage.output_tokens": 250
     }}
  ]
}

B. OTLP-wire shape (opentelemetry-instrumentation-openai-v2 etc.):

{
  "resourceSpans": [{
    "resource": {"attributes": [{"key": "service.name", "value": {"stringValue": "..."}}]},
    "scopeSpans": [{"spans": [{ ... }]}]
  }]
}
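
To make the difference between the two shapes concrete, here is a rough sketch of how shape B could be normalized into shape A before attribution. It is not the package's actual parser. The function names are hypothetical, and copying resource attributes onto each span (so that service.name is visible per span) is an assumption about how the two shapes are reconciled.

```python
def _unwrap(value):
    """Unwrap an OTLP AnyValue such as {"stringValue": "x"} or {"intValue": "42"}."""
    if "stringValue" in value:
        return value["stringValue"]
    if "intValue" in value:
        return int(value["intValue"])  # OTLP/JSON usually encodes int64 as a string
    if "doubleValue" in value:
        return float(value["doubleValue"])
    if "boolValue" in value:
        return bool(value["boolValue"])
    return None  # arrays, kvlists and bytes are ignored in this sketch


def _attrs_to_dict(attrs):
    """Turn [{"key": k, "value": {...}}, ...] into a plain {k: v} dict."""
    return {a["key"]: _unwrap(a.get("value", {})) for a in attrs or []}


def flatten_spans(trace):
    """Return spans with dict-style attributes for either accepted shape."""
    if "resourceSpans" not in trace:  # shape A is already flat
        return list(trace.get("spans", []))
    flat = []
    for rs in trace["resourceSpans"]:
        resource_attrs = _attrs_to_dict(rs.get("resource", {}).get("attributes"))
        for ss in rs.get("scopeSpans", []):
            for span in ss.get("spans", []):
                attrs = dict(resource_attrs)  # resource attributes act as defaults
                attrs.update(_attrs_to_dict(span.get("attributes")))
                flat.append({**span, "attributes": attrs})
    return flat
```

Span-level attributes override resource-level ones here, so a span that sets its own agent.name is never masked by the resource.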

Attribution-key precedence

The processor picks the bucket key for each span using the first match of:

  1. attributes["agent.name"] (OpenInference / agentic convention)
  2. attributes["gen_ai.agent.name"] (OTel GenAI semconv, WIP)
  3. attributes["service.name"] (OTel resource attribute)
  4. span.name (fallback)

Override with your own field by passing custom logic in code (see domain.CostAttributor).
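
The precedence list above can be sketched as a first-match lookup. This is an illustration only, not the code in domain.CostAttributor. The function name, the precedence parameter, and the tenant.id attribute in the usage note are all hypothetical, and spans are assumed to carry dict-style attributes with any resource attributes already merged in.

```python
KEY_PRECEDENCE = ("agent.name", "gen_ai.agent.name", "service.name")


def attribution_key(span, precedence=KEY_PRECEDENCE):
    """Return the bucket key for a span: first non-empty attribute wins."""
    attrs = span.get("attributes", {})
    for name in precedence:
        value = attrs.get(name)
        if value:
            return value
    return span.get("name", "unknown")  # step 4: fall back to the span name


span = {
    "name": "llm.chat",
    "attributes": {"gen_ai.agent.name": "critic_agent", "service.name": "api"},
}
print(attribution_key(span))  # critic_agent
```

Under the same assumptions, a per-tenant rollup would only need a custom field placed ahead of the defaults, for example attribution_key(span, precedence=("tenant.id",) + KEY_PRECEDENCE).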

Cost-basis modes

Each per-span cost record carries a cost_basis:

  • token_usage — span had gen_ai.usage.* tokens AND model is in the embedded price table. Highest confidence.
  • token_no_price — tokens present but model unknown. Cost = 0; tokens surface so the consumer can price them themselves.
  • duration_share — caller passed trace_total_usd and span lacks tokens; cost prorated by duration share of total trace duration.
  • unknown — no signal at all. Cost = 0.
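
The four modes can be read as a decision ladder. The sketch below shows one way to express it and is not the package's implementation. The price table entry is a made-up model with made-up prices, the function signature is hypothetical, and passing the total trace duration in as a parameter is an assumption about how duration_share is computed.

```python
# Hypothetical price table: USD per 1M (input, output) tokens. Not real prices.
PRICES_PER_1M = {"example-model": (2.50, 10.00)}


def span_cost(span, trace_total_usd=None, trace_duration_ns=None):
    """Return (cost_usd, cost_basis) for one span with dict-style attributes."""
    attrs = span.get("attributes", {})
    tokens_in = attrs.get("gen_ai.usage.input_tokens")
    tokens_out = attrs.get("gen_ai.usage.output_tokens")

    if tokens_in is not None or tokens_out is not None:
        price = PRICES_PER_1M.get(attrs.get("gen_ai.request.model"))
        if price is None:
            return 0.0, "token_no_price"  # tokens known, model not in the table
        cost = ((tokens_in or 0) * price[0] + (tokens_out or 0) * price[1]) / 1_000_000
        return cost, "token_usage"

    if trace_total_usd is not None and trace_duration_ns:
        # OTLP/JSON may carry timestamps as strings, hence the int() calls.
        duration = int(span["endTimeUnixNano"]) - int(span["startTimeUnixNano"])
        return trace_total_usd * duration / trace_duration_ns, "duration_share"

    return 0.0, "unknown"
```

With the hypothetical prices above, a span with 1500 input and 350 output tokens costs (1500 × 2.50 + 350 × 10.00) / 1,000,000 = $0.00725. A token-less span lasting a quarter of a trace whose trace_total_usd is 0.10 would be assigned $0.025.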

Pricing

Free OSS + free /attribute endpoint (rate-limited to 10/min/IP).

Hosted batch: 100 sats per 10,000 spans processed (~$0.04/10k spans at $40k/BTC). Pay-per-call via Lightning Network — no subscription, no KYC, no committee. Compare:

| Tool | Floor | Per-call |
| --- | --- | --- |
| Langfuse Cloud | $0 → $89/mo for >100k events | included in plan |
| LangSmith | $39/mo Plus, $499/mo Pro | trace-bundled |
| Braintrust | $249/mo Pro | event-bundled |
| Helicone Pro | $80/mo | included |
| this service | $0 | 100 sats / 10k spans (~$0.04) |
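
The batch price arithmetic can be checked with a few lines. This sketch assumes partial blocks of 10,000 spans are rounded up to a whole block and that a non-empty batch pays for at least one block. Neither rule is stated above, and the function names are hypothetical.

```python
import math

SATS_PER_BLOCK = 100       # price per block, from the pricing above
SPANS_PER_BLOCK = 10_000   # block size, from the pricing above
SATS_PER_BTC = 100_000_000


def batch_price_sats(span_count):
    """Price a batch in sats. Assumption: partial blocks round up, minimum one block."""
    blocks = max(1, math.ceil(span_count / SPANS_PER_BLOCK))
    return blocks * SATS_PER_BLOCK


def sats_to_usd(sats, usd_per_btc):
    """Convert sats to USD at a given BTC price."""
    return sats / SATS_PER_BTC * usd_per_btc


print(batch_price_sats(25_000))  # 300 under the round-up assumption
print(round(sats_to_usd(batch_price_sats(10_000), 40_000), 4))  # 0.04
```

The second line reproduces the ~$0.04 per 10k spans quoted above: 100 sats is 0.000001 BTC, which is $0.04 at $40k/BTC.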

Badge embed

Drop this in your repo README to attribute back:

[![cost-attributed by milo-agent-trace-cost-attributor](https://img.shields.io/badge/cost%20attribution-OTel-blue)](https://github.com/miloantaeus/milo-agent-trace-cost-attributor)

30-day kill criterion

Per Milo's market-truth doctrine, every SKU has a deprecation criterion. This one's: if no third party makes a POST /attribute call within 30 days of public launch, kill the hosted service and keep only the OSS package. Distribution paths to publish: PyPI, OpenTelemetry contrib registry PR, awesome-llm-observability list.

Tests

pip install -e ".[test]"
pytest -q
# 38 passed

Architecture

Wired to milo-paywall-kit from day zero. Lightning provider, ledger, and HTTP-402 mediation live in the kit — this SKU only owns:

  • domain.py: CostAttributor.attribute(), summarize(), price table
  • server.py: FastAPI routes (/, /attribute, /attribute/batch, /pro), in-process IP rate-limiter
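
For readers wondering what an in-process IP rate-limiter for the 10 req / 60s / IP free tier might look like, here is a minimal sliding-window sketch. It is not the code in server.py. The class name is hypothetical, and whether the real limiter uses a sliding or a fixed window is not stated here.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_s` seconds for each client IP."""

    def __init__(self, limit=10, window_s=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip):
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

State lives in process memory, so counts reset on restart and are not shared across workers. That is a reasonable trade-off for a single-process free tier, but not for a multi-worker deployment.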

Lightning Network

This service uses milo-paywall-kit for L402-style payment. To pay /attribute/batch: hit the endpoint, get a BOLT-11 invoice in the 402 body, pay it in any LN wallet (Alby, Phoenix, Wallet of Satoshi, Zeus), then re-call with ?payment_hash=<hash>. The settlement watcher (milo-agent-trace-cost-attributor-lightning-watcher) flips the ledger to paid within ~60s of settlement.
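
The pay-then-retry flow described above could be scripted roughly as follows, using only the standard library. This is a sketch under stated assumptions: the JSON field names invoice and payment_hash in the 402 body are guesses, the pay callback stands in for whatever wallet integration you use, and the function names are hypothetical. The retry budget of 6 × 10s mirrors the ~60s settlement window.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request


def post_json(url, payload):
    """POST JSON and return (status, parsed_body) without raising on 4xx/5xx."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"content-type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, json.load(err)


def attribute_batch(base_url, traces, pay, post=post_json,
                    retries=6, wait_s=10.0, sleep=time.sleep):
    """Request a batch, pay the invoice if asked, then poll until it is marked paid."""
    url = f"{base_url}/attribute/batch"
    payload = {"traces": traces}
    status, body = post(url, payload)
    if status != 402:
        return body
    pay(body["invoice"])  # assumed field name for the BOLT-11 string
    query = urllib.parse.urlencode({"payment_hash": body["payment_hash"]})  # assumed field name
    for _ in range(retries):
        status, body = post(f"{url}?{query}", payload)
        if status == 200:
            return body
        sleep(wait_s)  # the ledger may lag settlement by up to about a minute
    raise TimeoutError("invoice was not marked paid in time")
```

Passing pay=print is enough for a manual run: the invoice is printed, you pay it in a wallet, and the loop keeps retrying until the ledger flips to paid.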

Stacker.News launch template

**milo-agent-trace-cost-attributor** — per-task cost splitter for multi-agent LLM systems, paid in sats

If you run LangGraph / CrewAI / any multi-agent system, you've hit this:
your OTel traces have hundreds of gen_ai.* spans and your finance team
wants per-agent cost rollups. Existing tools (Langfuse, LangSmith,
Braintrust) start at $300+/mo and split by model, not by task.

Built a tiny FastAPI processor. POST /attribute with an OTel trace JSON,
get back per-agent / per-user / per-tenant cost breakdown. Free for
single traces (rate-limited). Batch at 100 sats per 10k spans, paid via
Lightning. No KYC, no signup.

Free: <your-deploy-url>/attribute
Paid batch: <your-deploy-url>/attribute/batch
Code: https://github.com/miloantaeus/milo-agent-trace-cost-attributor

Zap me at <your-LN-address> if it saves your finance team an afternoon.

License

MIT.


