TokenSentinel Migrate (tokensentinel-migrate)

Migrate from LLM observability tools — Helicone, Langfuse, and LangSmith — into TokenSentinel without losing the months of trace data you've already accumulated.

Why this exists

In March 2026, Mintlify acquired Helicone. Mintlify's interest is the AI documentation play; Helicone is now in maintenance mode — security patches and bug fixes only, no new integrations, no new analytics, no roadmap. The 16,000 organisations that built on Helicone are all looking for somewhere else to go.

If you were one of them, this tool is your bridge. In one command it pulls your Helicone trace history, replays it through TokenSentinel's eight waste detection rules, and shows you the dollars you would have saved if you'd had intervention turned on. Then, if you want, it backfills the resulting events to your TokenSentinel cloud project so the dashboard reflects history-plus-now from day one.

It is MIT-licensed, stdlib-only apart from the TokenSentinel SDK itself, and runs entirely on your machine — your Helicone API key never leaves your laptop.

5-minute migration

pip install tokensentinel-migrate

python -m tokensentinel_migrate helicone \
    --helicone-api-key sk-helicone-... \
    --tokensentinel-endpoint https://api.tokensentinel.dev \
    --tokensentinel-api-key tsk_... \
    --project my-agent \
    --since 2026-04-09 \
    --dry-run

Sample output:

[migrate] Fetching Helicone traces since 2026-04-09...
[migrate]   page 1: 100 requests
[migrate]   page 2: 100 requests
[migrate]   page 3: 47 requests
[migrate] Fetched 247 traces (12 sessions inferred from Helicone property Helicone-Session-Id)
[migrate] Running TokenSentinel rules retroactively...
[migrate]   tool_loop:           3 firings
[migrate]   retry_storm:         1 firing
[migrate]   model_misroute:      8 firings
[migrate]   embedding_waste:     0 firings
[migrate]   (others):            0 firings
[migrate] 12 leak events would be backfilled (dry-run, not posted)
[migrate]
[migrate] Estimated cost saved if these had been intervened:
[migrate]   tool_loop savings:        $0.83
[migrate]   retry_storm savings:      $0.21
[migrate]   model_misroute savings:   $4.42
[migrate]   total:                    $5.46
[migrate]
[migrate] Re-run without --dry-run to backfill events to TokenSentinel cloud.

Re-running without --dry-run POSTs each event to the cloud's backfill endpoint so the dashboard's "tokens saved this week" counter reflects what TokenSentinel would have caught had it been wired in across the import window.

What gets migrated

For each Helicone request the importer pulls:

Helicone field | TokenSentinel CallRecord field
provider, model | provider, model
prompt_tokens, completion_tokens (or nested usage.*_tokens) | matching fields
latency_ms | latency_ms
created_at | timestamp (UTC-normalised)
request_id | request_hash (used by retry_storm for dedup)
properties["Helicone-Session-Id"], then properties["session_id"], then request_id | session_id
body.messages or prompt | raw_request.messages
body.input | raw_request.input (drives embedding_waste)

Embedding-shaped models (anything with embedding in the name) are routed to embeddings.create so the embedding_waste rule fires correctly.
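The mapping above can be sketched as a single normalisation function. This is a minimal illustration, not the importer's actual code: the function name, the use of a plain dict in place of CallRecord, the "method" key, and reading prompt from inside body are all assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def normalise_helicone_request(req: Dict[str, Any]) -> Dict[str, Any]:
    """Map one Helicone request dict onto CallRecord-style keys (hypothetical)."""
    usage = req.get("usage") or {}
    props = req.get("properties") or {}
    body = req.get("body") or {}
    model = req.get("model") or ""

    def tokens(name: str) -> int:
        # Top-level field first, then the nested usage.* variant.
        value = req.get(name)
        if value is None:
            value = usage.get(name)
        return int(value or 0)

    # Session lookup order: official casing, community casing, then the
    # request id itself (a one-call session).
    session_id = (
        props.get("Helicone-Session-Id")
        or props.get("session_id")
        or req.get("request_id")
    )

    # Z-suffixed ISO-8601 -> timezone-aware UTC datetime.
    created = datetime.fromisoformat(req["created_at"].replace("Z", "+00:00"))

    return {
        "provider": req.get("provider"),
        "model": model,
        "prompt_tokens": tokens("prompt_tokens"),
        "completion_tokens": tokens("completion_tokens"),
        "latency_ms": req.get("latency_ms"),
        "timestamp": created.astimezone(timezone.utc),
        "request_hash": req.get("request_id"),
        "session_id": session_id,
        # Assumed key: anything with "embedding" in the model name is routed
        # to embeddings.create so embedding_waste can see it.
        "method": "embeddings.create" if "embedding" in model.lower() else "messages.create",
        "raw_request": {
            "messages": body.get("messages") or body.get("prompt"),
            "input": body.get("input"),
        },
    }
```

Keeping the session fallback in one expression makes the precedence easy to audit against the table.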

What you get back

Each Helicone request that triggered a TokenSentinel rule is converted into a LeakEvent and POSTed to your cloud project at <endpoint>/v1/events:backfill?project=<project> with the original timestamp preserved. That last detail matters: without it, the dashboard would attribute every backfilled event to "today" and the savings counter would double-count migrated history as live activity. With it, the dashboard timeline reflects the truth — these leaks happened on the days Helicone says they happened.
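As a rough sketch of what one backfill POST might look like, the helper below builds (but does not send) a request against the documented URL shape. The payload keys other than estimated_burn, the Bearer authorization scheme, and the helper name are assumptions for illustration; the real wire format may differ.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict


def build_backfill_request(
    endpoint: str, project: str, api_key: str, event: Dict[str, Any]
) -> urllib.request.Request:
    """Build a POST to <endpoint>/v1/events:backfill?project=<project>."""
    query = urllib.parse.urlencode({"project": project})
    url = f"{endpoint.rstrip('/')}/v1/events:backfill?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check the TokenSentinel docs.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical event: the timestamp is the original Helicone time, not "now".
event = {
    "leak_type": "tool_loop",
    "estimated_burn": 0.83,
    "timestamp": "2026-04-10T12:00:00+00:00",
}
request = build_backfill_request(
    "https://api.tokensentinel.dev", "my-agent", "tsk_example", event
)
```

Carrying the source timestamp in the payload is what lets the dashboard place the event on the day it actually happened.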

The CLI also surfaces a per-leak-type dollar savings estimate for the import window, summed from each event's estimated_burn field. That's the number to forward to your CFO.
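The per-leak-type total is a straightforward group-and-sum over estimated_burn. A minimal sketch, assuming each event is a dict with a leak_type key (that key name and the function name are hypothetical):

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, Iterable


def savings_by_leak_type(events: Iterable[Dict[str, Any]]) -> Dict[str, Decimal]:
    """Sum estimated_burn per leak type, using Decimal to avoid float drift."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for event in events:
        totals[event["leak_type"]] += Decimal(str(event["estimated_burn"]))
    return dict(totals)


events = [
    {"leak_type": "tool_loop", "estimated_burn": 0.50},
    {"leak_type": "tool_loop", "estimated_burn": 0.33},
    {"leak_type": "retry_storm", "estimated_burn": 0.21},
    {"leak_type": "model_misroute", "estimated_burn": 4.42},
]
totals = savings_by_leak_type(events)
grand_total = sum(totals.values(), Decimal("0"))
```

Decimal is used here only so that cent-level sums print cleanly; the importer itself may well use floats.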

Helicone API quirks worth knowing

A few footnotes from the Helicone integration:

  • POST /v1/request/query, not GET. Pagination and filter parameters both go in the JSON body. offset and limit are top-level keys; the importer uses limit=100 per page (the maximum at the time of the Mintlify acquisition).
  • Timestamps come as Z-suffixed ISO-8601. datetime.fromisoformat on Python 3.10 and earlier rejects the Z suffix, so we swap it for +00:00 before parsing.
  • properties casing. Helicone's official SDK ships Helicone-Session-Id (mixed case); some community SDKs ship session_id. We check both, in that order, then fall back to request_id for one-call sessions.
  • Retry-After header. Sometimes seconds-as-integer, sometimes HTTP-date. We honour the integer form; an HTTP-date value falls back to a 5-second default backoff. Either way the wait is capped at 60 seconds so a misbehaving deploy can't strand the CLI.
  • Non-2xx behaviour. 401/403 abort immediately (check your key); 429 sleeps and retries up to six times in a row before giving up; everything else is non-retryable and surfaces in stderr.
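The Retry-After and status-code rules in the list above can be sketched as two small pure functions. The names, constants, and return labels are illustrative rather than the importer's real internals, and treating a missing header like an HTTP-date (5-second default) is an assumption.

```python
from typing import Optional

DEFAULT_BACKOFF_S = 5   # used when Retry-After is an HTTP-date (or absent: assumption)
MAX_BACKOFF_S = 60      # hard cap on any single wait
MAX_429_RETRIES = 6     # consecutive 429 retries before giving up


def backoff_seconds(retry_after: Optional[str]) -> int:
    """Return how long to sleep for a given Retry-After header value."""
    try:
        seconds = int((retry_after or "").strip())
    except ValueError:
        # HTTP-date form (or missing header): fall back to the default.
        seconds = DEFAULT_BACKOFF_S
    return max(0, min(seconds, MAX_BACKOFF_S))


def classify_status(status: int, retries_so_far: int) -> str:
    """Decide what to do with an HTTP status: ok, abort, retry, or fail."""
    if 200 <= status < 300:
        return "ok"
    if status in (401, 403):
        return "abort"  # bad or missing key; retrying will not help
    if status == 429 and retries_so_far < MAX_429_RETRIES:
        return "retry"
    return "fail"       # non-retryable; report on stderr
```

Keeping both decisions free of I/O makes them easy to unit-test without sleeping or touching the network.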

Langfuse

Langfuse is the largest OSS LLM observability project and the second migration target after Helicone. The Langfuse importer pulls every GENERATION observation from your traces and replays them through the same eight waste rules.

python -m tokensentinel_migrate langfuse \
    --langfuse-public-key pk-lf-... \
    --langfuse-secret-key sk-lf-... \
    --langfuse-base-url https://cloud.langfuse.com \
    --tokensentinel-endpoint https://api.tokensentinel.dev \
    --tokensentinel-api-key tsk_... \
    --project my-agent \
    --since 2026-04-09 \
    --dry-run

Self-hosted Langfuse users point --langfuse-base-url at their own deployment — the default is https://cloud.langfuse.com.

Langfuse gotchas:

  • Two-key auth. Langfuse uses HTTP Basic with the public key as the username and the secret key as the password. Both are required; the importer aborts with a clean error if either is missing.
  • Only type=="GENERATION" observations are imported. SPAN / EVENT rows don't represent real LLM calls and the rule engine has no meaningful interpretation for them — they're dropped during normalisation.
  • usage.unit == "CHARACTERS" zeros the token count. Langfuse customers who never wired token counting see a degraded cost estimate (the CallRecord still propagates so non-token rules like tool_loop and retry_storm fire correctly).
  • Embedding detection is lossy. Langfuse doesn't preserve the SDK method, so every CallRecord lands as messages.create. The embedding_waste rule under-fires on Langfuse imports relative to Helicone — a known tradeoff documented in the founder spec.
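A minimal sketch of the first three gotchas: building the Basic auth header and filtering and normalising observations. The function names are hypothetical, and the usage.input / usage.output key names are an assumption about the Langfuse payload rather than something this README specifies.

```python
import base64
from typing import Any, Dict, Iterable, List


def basic_auth_header(public_key: str, secret_key: str) -> Dict[str, str]:
    """HTTP Basic: public key as username, secret key as password."""
    if not public_key or not secret_key:
        raise ValueError("both the Langfuse public key and secret key are required")
    raw = f"{public_key}:{secret_key}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def normalise_observations(observations: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep GENERATION rows only; zero token counts when usage is in characters."""
    records = []
    for obs in observations:
        if obs.get("type") != "GENERATION":
            continue  # SPAN / EVENT rows are not LLM calls
        usage = obs.get("usage") or {}
        if usage.get("unit") == "CHARACTERS":
            prompt_tokens, completion_tokens = 0, 0
        else:
            # Assumed key names for token usage.
            prompt_tokens = usage.get("input") or 0
            completion_tokens = usage.get("output") or 0
        records.append({
            "model": obs.get("model"),
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            # Langfuse does not preserve the SDK method, so this is fixed.
            "method": "messages.create",
        })
    return records
```

Records with zeroed token counts are still emitted, which is what keeps non-token rules such as tool_loop and retry_storm working.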

LangSmith

LangSmith is LangChain's hosted observability product and the default trace destination for any LangChain / LangGraph agent. The importer queries the /runs/query cursor-paginated endpoint.

python -m tokensentinel_migrate langsmith \
    --langsmith-api-key ls__... \
    --langsmith-base-url https://api.smith.langchain.com \
    --tokensentinel-endpoint https://api.tokensentinel.dev \
    --tokensentinel-api-key tsk_... \
    --project my-agent \
    --since 2026-04-09 \
    --dry-run

Enterprise LangSmith tenants point --langsmith-base-url at their per-tenant URL; the default is https://api.smith.langchain.com.

LangSmith gotchas:

  • Only run_type=="llm" runs are imported. chain / tool / retriever runs are dropped — the rule engine reads CallRecord.tool_calls from the LLM run's structured output instead, which captures the same signal more reliably.
  • Token counts live in two places. Newer LangSmith ships prompt_tokens / completion_tokens at the top level; older versions stash them under extra.invocation_params.usage. The importer checks both, in that order, before falling back to (0, 0).
  • Cursor pagination, not page numbers. The importer keeps re-POSTing the cursor from the previous response until the server hands back cursors.next == null. There's no way to know up front how many pages a date range will produce.
  • Provider inference is heuristic. LangSmith's _type field (anthropic-chat, chat-openai, etc.) drives a regex match; the model-name fallback (claude → anthropic, gpt → openai, …) catches non-standard _type values.
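The four points above might be sketched as follows. The transport is injected as a callable so the loop can run without a network; the "cursor" request key, the "runs" response key, and all function names are assumptions, while cursors.next, run_type, and the two token-count locations come from the list above.

```python
import re
from typing import Any, Callable, Dict, Iterator, Optional, Tuple


def iter_llm_runs(
    post_query: Callable[[Dict[str, Any]], Dict[str, Any]], body: Dict[str, Any]
) -> Iterator[Dict[str, Any]]:
    """Follow cursors.next until it is null, yielding only run_type == 'llm'."""
    cursor = None
    while True:
        page_body = dict(body)
        if cursor is not None:
            page_body["cursor"] = cursor  # assumed request key
        page = post_query(page_body)
        for run in page.get("runs", []):  # assumed response key
            if run.get("run_type") == "llm":
                yield run
        cursor = (page.get("cursors") or {}).get("next")
        if cursor is None:
            return


def token_counts(run: Dict[str, Any]) -> Tuple[int, int]:
    """Top-level counts first, then extra.invocation_params.usage, else (0, 0)."""
    prompt, completion = run.get("prompt_tokens"), run.get("completion_tokens")
    if prompt is not None or completion is not None:
        return prompt or 0, completion or 0
    usage = ((run.get("extra") or {}).get("invocation_params") or {}).get("usage") or {}
    return usage.get("prompt_tokens") or 0, usage.get("completion_tokens") or 0


_TYPE_PATTERNS = [(re.compile(r"anthropic"), "anthropic"), (re.compile(r"openai"), "openai")]
_MODEL_PREFIXES = [("claude", "anthropic"), ("gpt", "openai")]


def infer_provider(type_field: Optional[str], model: Optional[str]) -> Optional[str]:
    """Regex on _type first, then fall back to the model-name prefix."""
    for pattern, provider in _TYPE_PATTERNS:
        if pattern.search((type_field or "").lower()):
            return provider
    for prefix, provider in _MODEL_PREFIXES:
        if (model or "").lower().startswith(prefix):
            return provider
    return None
```

Because the page count is unknown up front, a generator fits naturally: callers consume runs as they arrive without buffering the whole date range.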

Roadmap

Importer | Status | When
Helicone | shipping in v0.1.0 | now
Langfuse | shipping in v0.2.0 | now
LangSmith | shipping in v0.2.0 | now

Each importer is a separate subcommand under python -m tokensentinel_migrate and a separate module under tokensentinel_migrate/. The shared infrastructure (_backfill.py and _retroactive.py) is provider-agnostic; adding a new importer takes a couple of hundred lines of fetch, normalise, and pagination glue.

Development

git clone https://github.com/tokensentinel/tokensentinel-migrate-python
cd tokensentinel-migrate-python
pip install -e ".[dev]"
python -m pytest
python -m ruff check tokensentinel_migrate tests

The test suite uses unittest.mock.patch('urllib.request.urlopen', ...) to inject canned Helicone responses and to verify the cloud-side backfill payload — no live network calls in CI.
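For contributors, a test in that style might look roughly like the sketch below. The fetch_page helper, the example URL, and the {"data": [...]} response shape are stand-ins invented for illustration; only the patch target mirrors what the suite is described as doing.

```python
import json
import urllib.request
from unittest import mock


def fetch_page(url: str, body: dict) -> dict:
    """Hypothetical helper: POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def test_fetch_page_uses_canned_response() -> None:
    # urlopen is used as a context manager, so the canned bytes are wired
    # to the object returned by __enter__.
    canned = mock.MagicMock()
    canned.__enter__.return_value.read.return_value = json.dumps(
        {"data": [{"request_id": "r1"}]}
    ).encode("utf-8")

    with mock.patch("urllib.request.urlopen", return_value=canned) as fake_urlopen:
        page = fetch_page("https://example.invalid/v1/request/query", {"offset": 0, "limit": 100})

    assert page == {"data": [{"request_id": "r1"}]}
    # Inspect the outgoing request to verify the pagination body.
    sent = fake_urlopen.call_args.args[0]
    assert sent.get_method() == "POST"
    assert json.loads(sent.data) == {"offset": 0, "limit": 100}
```

The same pattern covers the backfill side: patch urlopen, run the code under test, then assert on the Request object it was called with.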

License

MIT. See LICENSE.
