One line to track every AI API cost. Sentry for AI costs.

Maintainers

deependra


tokenly

One line to track every AI API cost. Sentry for AI costs. No proxy, no account, free forever.

import tokenly
tokenly.init()

That's it. Now every OpenAI / Anthropic / Google call you make is logged — tokens, cost, latency, cache hits — to a local SQLite file.

$ tokenly stats

  tokenly · Today
  ────────────────────────────────────────────────────
  Spend                    $4.21
  Calls                       89
  Input               1,240,500 tokens
  Output                210,400 tokens
  Cache read             87,200 tokens
  Avg latency            842 ms

Why

Your monthly AI bill came back at $847 and you have no idea which feature caused it.
Your bill swings 2-3× every quarter for no reason you can explain.
Every existing tool wants you to change your base URL, run a proxy, or create an account.
tokenly is a tracker, not a gateway. One line, zero config, local first.

Install

pip install tokenly

Python 3.10+. Zero runtime dependencies.

Use it

import tokenly
tokenly.init()

import openai
client = openai.OpenAI()
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)

Then any time:

tokenly stats              # today
tokenly stats --week       # last 7 days
tokenly stats --month      # this month
tokenly stats --by=model   # group by model
tokenly tail               # live stream
tokenly export > calls.csv
tokenly doctor             # diagnose setup

Tag calls by user / feature

tokenly.configure(tags={"user": "alice", "feature": "chat"})

Then:

tokenly stats --by=tag.user
tokenly stats --by=tag.feature

Budget alerts

export TOKENLY_DAILY_BUDGET=10   # raise BudgetExceeded when spend hits $10/day
export TOKENLY_DAILY_WARN=5      # warn at $5/day, keep going

Or in code:

tokenly.init(budget_usd_day=10, warn_usd_day=5)
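
A minimal sketch of how a daily budget check can avoid a per-call SQL query, assuming an in-memory counter that is seeded once and reset on UTC day rollover. `DailyBudget`, its `add` method, and the local `BudgetExceeded` class are hypothetical names for illustration, not tokenly's actual internals.

```python
import warnings
from datetime import datetime, timezone


class BudgetExceeded(RuntimeError):
    """Local stand-in for the exception raised at the hard limit."""


class DailyBudget:
    """Hypothetical in-memory spend counter with UTC day rollover."""

    def __init__(self, budget_usd=None, warn_usd=None, seed_usd=0.0, now=None):
        self.budget_usd = budget_usd
        self.warn_usd = warn_usd
        # Injectable clock so the rollover can be exercised in tests.
        self._now = now or (lambda: datetime.now(timezone.utc))
        self._day = self._now().date()
        self.spent = seed_usd  # e.g. today's total read once from the DB
        self._warned = False

    def add(self, cost_usd):
        today = self._now().date()
        if today != self._day:
            # New UTC day: start counting from zero again.
            self._day, self.spent, self._warned = today, 0.0, False
        self.spent += cost_usd
        if self.budget_usd is not None and self.spent >= self.budget_usd:
            raise BudgetExceeded(
                f"daily spend ${self.spent:.2f} reached ${self.budget_usd:.2f}"
            )
        if (
            self.warn_usd is not None
            and self.spent >= self.warn_usd
            and not self._warned
        ):
            self._warned = True  # warn once per day, then keep going
            warnings.warn(f"daily spend ${self.spent:.2f} passed warn threshold")
        return self.spent
```

Each call costs one addition and two comparisons, regardless of how many rows are already logged.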

Works with

Provider	Tracks
OpenAI	prompt / completion tokens, cached tokens, cost
Anthropic	input / output tokens, cache read, cache write, cost
Google Gemini	prompt / output tokens, cached content tokens, cost
DeepSeek, xAI, Mistral, Cohere	via pricing DB; patches coming

Because tokenly patches the underlying SDKs, LangChain, LlamaIndex, and any other framework built on these SDKs work automatically — no integration needed. See examples/langchain_example.py and examples/llamaindex_example.py.
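
To illustrate why patching at the SDK level covers every framework built on top, here is a minimal sketch of the general monkey-patching technique. `FakeClient`, `track`, and `CALLS` are hypothetical stand-ins; the real provider SDKs and tokenly's own wrappers will differ.

```python
import functools
import time

CALLS = []  # stand-in for the tracker's log


class FakeClient:
    """Hypothetical stand-in for a provider SDK client."""

    def create(self, model, prompt):
        return {
            "model": model,
            "usage": {"input_tokens": len(prompt.split()), "output_tokens": 1},
        }


def track(cls, method_name):
    """Replace cls.method_name with a wrapper that records usage and latency."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        response = original(self, *args, **kwargs)
        CALLS.append(
            {
                "model": response["model"],
                "input_tokens": response["usage"]["input_tokens"],
                "output_tokens": response["usage"]["output_tokens"],
                "latency_ms": (time.perf_counter() - start) * 1000,
            }
        )
        return response  # the caller sees the original response unchanged

    setattr(cls, method_name, wrapper)


track(FakeClient, "create")

# Any code that calls FakeClient.create, directly or through a framework,
# now goes through the wrapper.
result = FakeClient().create(model="demo-model", prompt="hello there")
```

Because the class attribute itself is replaced, callers do not need to import or know about the tracker.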

OpenTelemetry GenAI export (optional)

Emit an OpenTelemetry span per tracked call, following the GenAI semantic conventions (gen_ai.provider.name, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens). That means tokenly plugs straight into Grafana, Datadog, Honeycomb, Jaeger, or any OTel-compatible backend — no extra integration.

pip install tokenly[otel]

import tokenly
tokenly.init(otel=True)   # or: export TOKENLY_OTEL=1

Span start_time is reconstructed from the measured latency so backends see a span that actually covers the model call, not a zero-width marker. The GenAI semconv is still experimental upstream — we track the latest and will bump as it stabilizes.
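
The start-time reconstruction is simple arithmetic; a sketch, assuming timestamps in nanoseconds since the epoch (the unit the OpenTelemetry Python API uses for explicit span times). `span_window` is a hypothetical helper, and the token counts below are made-up illustration values.

```python
import time


def span_window(end_time_ns: int, latency_ms: float) -> tuple[int, int]:
    """Derive (start, end) in ns from the end time and the measured latency."""
    start_time_ns = end_time_ns - int(latency_ms * 1_000_000)
    return start_time_ns, end_time_ns


# Example: a call that finished just now and took 842 ms.
start_ns, end_ns = span_window(time.time_ns(), latency_ms=842)

# Attribute keys follow the GenAI semantic conventions named above.
attributes = {
    "gen_ai.provider.name": "openai",
    "gen_ai.request.model": "gpt-4o-mini",
    "gen_ai.usage.input_tokens": 12,  # illustrative value
    "gen_ai.usage.output_tokens": 34,  # illustrative value
}
```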

Where is the data?

By default: ~/.tokenly/log.db — a single SQLite file. One table, ten columns. Move it, query it, back it up, delete it. It's yours.
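
Since the table and column names are not listed here, a safe way to start querying is to let SQLite describe the file. The sketch below uses only the standard `sqlite3` module; `describe_db` is a hypothetical helper, and it makes no assumption about the schema.

```python
import sqlite3
from pathlib import Path


def describe_db(path):
    """Return {table_name: [column names]} for a SQLite file."""
    conn = sqlite3.connect(str(path))
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()


if __name__ == "__main__":
    db = Path.home() / ".tokenly" / "log.db"
    # sqlite3.connect would create a missing file, so check first.
    if db.exists():
        for table, columns in describe_db(db).items():
            print(table, columns)
```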

Pick any backend

SQLite is the default and needs nothing. For a team setup, point tokenly at your own MySQL or Postgres:

# One of these:
export TOKENLY_DB_URL="sqlite:///~/.tokenly/log.db"                 # default
export TOKENLY_DB_URL="mysql://user:pass@host:3306/tokenly"         # pip install tokenly[mysql]
export TOKENLY_DB_URL="postgresql://user:pass@host:5432/tokenly"    # pip install tokenly[postgres]

Or in code:

tokenly.init(db_url="postgresql://user:pass@db.internal/tokenly")

The schema is created automatically on first connect. The legacy TOKENLY_DB=/path/to.db env var still works (treated as a SQLite path).

Local dashboard

tokenly dashboard

Boots a local, read-only web dashboard on http://127.0.0.1:8787 (auto-picks a free port if that's taken) and opens your browser. Spend cards, cost-by-model bars, cost-over-time line chart, and a live table of recent calls. Tabs for Today / Week / Month / All. Refreshes every 5 seconds.

Stdlib HTTP server, no JS framework, Chart.js via CDN. Stays zero-dep. Pass --no-open for headless, --host 0.0.0.0 to expose on your LAN (no auth — only do this on trusted networks).

Dashboard security

Binds 127.0.0.1 by default — reachable only from the same machine.
--host 0.0.0.0 (or ::) prints a yellow warning at startup: no authentication, read-only, and reachable by anyone on your network. Run it behind a reverse proxy with auth before exposing it publicly.
Query params are validated: /api/timeseries?bucket= must be a positive int between 60 and 86 400 seconds; /api/recent?limit= must be in [1, 1000]. Bad input → HTTP 400 with a JSON error, no crashes, no resource exhaustion.
Tag keys in /api/by-tag?key=... are sanitized against an identifier allowlist so SQL injection is off the table.
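
The validation rules above can be sketched as follows. The numeric ranges come from the list; the exact identifier pattern for tag keys is an assumption, and all function names are hypothetical. A handler would catch `ValueError` and answer with HTTP 400.

```python
import re

DIGITS_RE = re.compile(r"[0-9]+")
# Assumed allowlist: ASCII letters, digits, underscore, not starting with a digit.
TAG_KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_int_param(raw, lo, hi):
    """Parse a query-string value as an int within [lo, hi], else ValueError."""
    if not isinstance(raw, str) or not DIGITS_RE.fullmatch(raw):
        raise ValueError(f"expected an integer, got {raw!r}")
    value = int(raw)
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    return value


def validate_bucket(raw):
    return parse_int_param(raw, 60, 86_400)  # seconds


def validate_limit(raw):
    return parse_int_param(raw, 1, 1000)


def validate_tag_key(raw):
    """Accept only identifier-like keys so they are safe to place in SQL."""
    if not isinstance(raw, str) or not TAG_KEY_RE.fullmatch(raw):
        raise ValueError(f"invalid tag key: {raw!r}")
    return raw
```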

Concurrency & shutdown

tokenly.init() is thread-safe and idempotent. Call it from any thread, any number of times — a module-level lock prevents duplicate writer threads or re-patched SDKs. If the configured DB URL changes across calls the writer restarts against the new URL.
Writes never block your API call. Every tracked row goes into a bounded in-memory queue (maxsize=10 000) drained by a background thread. Past the queue limit, new rows are dropped with a rate-limited warning — the caller still returns normally.
Shutdown is handled for you. An atexit hook flushes the queue and joins the writer with a 5 s timeout. Rows in flight at interpreter exit are persisted cleanly.
For workers about to be SIGKILLed (container teardown, cron timeout), call tokenly.flush(timeout=5.0) explicitly to force a drain + commit before you exit.

import tokenly
tokenly.init()
...
tokenly.flush()   # block until everything is on disk
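
A minimal sketch of the pattern described above: a bounded queue, a background writer thread, drop-on-full, and a flush with a timeout. `Writer` and its methods are hypothetical; an in-memory list stands in for the database, and tokenly's real writer will differ.

```python
import queue
import threading


class Writer:
    """Hypothetical non-blocking writer: callers never wait on the sink."""

    def __init__(self, maxsize=10_000, autostart=True):
        self._q = queue.Queue(maxsize=maxsize)
        self.written = []  # stand-in for rows persisted to the DB
        self.dropped = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        if autostart:
            self.start()

    def start(self):
        self._thread.start()

    def submit(self, row):
        """Enqueue without blocking; drop the row if the queue is full."""
        try:
            self._q.put_nowait(row)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _run(self):
        while True:
            item = self._q.get()
            if isinstance(item, threading.Event):
                item.set()  # flush marker: everything queued before it is written
                continue
            self.written.append(item)

    def flush(self, timeout=5.0):
        """Block until rows queued so far are written, or the timeout expires."""
        marker = threading.Event()
        try:
            self._q.put(marker, timeout=timeout)
        except queue.Full:
            return False
        return marker.wait(timeout)
```

The flush marker relies on the queue being first-in, first-out with a single consumer: once the writer reaches the marker, every earlier row has been handled.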

Production notes

Batched writes. The writer coalesces up to 100 rows (or 500 ms) into one transaction — 1 000 calls turn into ~10 commits, not 1 000. Trades ≤500 ms of observability latency for dramatically lower disk I/O.
Budget check is O(1). Daily budget / warn thresholds are tracked in an in-memory counter seeded from the DB at startup and reset on UTC day rollover. No per-call SQL query.
Pricing auto-reloads. Tokenly compares pricing.json mtime on every lookup; the weekly sync_pricing.py cron doesn't need a process restart.
SQLite WAL. Default backend runs in WAL mode with wal_autocheckpoint=1000 and synchronous=NORMAL. Back up with sqlite3 ~/.tokenly/log.db ".backup /path/to/backup.db" — safe while the process is writing.
Tested Python versions. 3.10, 3.11, 3.12, 3.13. CI matrix runs the full test suite on all four.
Zero deps on the default path. The SQLite + OpenAI/Anthropic/Google install is stdlib only. MySQL / Postgres / OTel are opt-in extras.
Overhead per call. Measured at ~35 µs on the hot path (token clamp + cost lookup + queue put), under 0.2 % of even a 20 ms model call.
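
The batching arithmetic can be sketched with the standard library. `drain_batch` and `write_batch` are hypothetical helpers, and the two-column `calls` table is a placeholder schema, not tokenly's real one.

```python
import queue
import sqlite3
import time


def drain_batch(q, max_rows=100, max_wait=0.5):
    """Block for the first row, then collect until max_rows or max_wait seconds."""
    batch = [q.get()]
    deadline = time.monotonic() + max_wait
    while len(batch) < max_rows:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


def write_batch(conn, batch):
    """Insert the whole batch in one transaction (one commit)."""
    with conn:
        conn.executemany("INSERT INTO calls (model, cost_usd) VALUES (?, ?)", batch)


# Demo: 1 000 queued rows become 10 commits of 100 rows each.
q = queue.Queue()
for _ in range(1000):
    q.put(("demo-model", 0.001))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (model TEXT, cost_usd REAL)")

commits = 0
while not q.empty():
    write_batch(conn, drain_batch(q, max_wait=0.05))
    commits += 1
```

Under light load the time limit closes a batch early, which is where the 500 ms bound on observability latency comes from.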

Troubleshooting

tokenly doctor — one-shot diagnostic: tokenly version, resolved DB URL (password-masked), backend connect status, which provider SDKs are installed, which optional DB drivers are available, and the values of the TOKENLY_* env vars. Start here for any setup issue.
"no pricing for foo/bar" warning — that model isn't in pricing.json yet. The call is still logged at $0; open a PR with the rate.
Dashboard port already in use — tokenly auto-falls back to the next free port starting from 8787 and prints the chosen URL. Pass --port N to pin one explicitly.
Switching DB URLs — just call tokenly.init(db_url="...") again. The writer restarts cleanly against the new backend. Nothing in the old DB moves; both files remain on disk.
Logs leaking passwords? They shouldn't. doctor, configure(), and internal warnings all mask DB passwords via urllib.parse. If you see an unmasked URL in our output, please file an issue.
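
For reference, password masking with `urllib.parse` can look like the sketch below. `mask_db_url` is a hypothetical helper and the `***` placeholder is an assumption; tokenly's own output format may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_db_url(url: str) -> str:
    """Return the URL with any password replaced by '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide, e.g. a SQLite path
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around IPv6 literals
    netloc = f"{parts.username or ''}:***@{host}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(mask_db_url("postgresql://user:pass@host:5432/tokenly"))
# postgresql://user:***@host:5432/tokenly
```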

vs other tools

	tokenly	LiteLLM	Helicone	Langfuse
One-line setup	✓	✗	✗	✗
Requires URL change	✗	✓	✓	✗
Needs account	✗	✗	✓	✓
Local-first	✓	~	✗	~
Gateway / routing	✗	✓	✓	✗
Pure cost tracking	✓	~	~	~
Zero runtime deps	✓	✗	✗	✗

tokenly is tracking-only by design. If you want routing, fallbacks, or an auth proxy, use LiteLLM or Portkey. If you just want to know what you're spending, use tokenly.

Roadmap

OpenAI, Anthropic, Google auto-patch
CLI: stats, tail, export, reset, doctor
Tags and budget alerts
Streaming-response support (OpenAI, Anthropic)
Multi-DB backend: SQLite (default), MySQL, Postgres
Local web dashboard (tokenly dashboard)
OpenTelemetry GenAI export (pip install tokenly[otel])
Weekly auto-updated pricing DB
Node / TypeScript SDK (same storage)

License

Pricing numbers are best-effort; verify with the provider before basing decisions on them. Unknown models log with $0 cost; please PR them in src/tokenly/pricing.json.


Release history

0.2.2 (this version): Apr 22, 2026
0.2.0: Apr 21, 2026
0.1.0: Apr 21, 2026

Download files


Source Distribution

tokenly-0.2.2.tar.gz (48.8 kB)

Uploaded Apr 22, 2026 Source

Built Distribution


tokenly-0.2.2-py3-none-any.whl (38.8 kB)

Uploaded Apr 22, 2026 Python 3

File details

Details for the file tokenly-0.2.2.tar.gz.

File metadata

Download URL: tokenly-0.2.2.tar.gz
Upload date: Apr 22, 2026
Size: 48.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tokenly-0.2.2.tar.gz
Algorithm	Hash digest
SHA256	`18c1073fbb2e91949c5037fa212d511d67255c72d53789836a10d527f4351194`
MD5	`675646fb6ecbbe180a6d4c96b67a339f`
BLAKE2b-256	`434b8a608e6ee57e2b72de9d5a53ae5193a4a263abff3d9a620030ff971362fc`


Provenance

The following attestation bundles were made for tokenly-0.2.2.tar.gz:

Publisher: release.yml on deependra04/tokenly

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenly-0.2.2.tar.gz
- Subject digest: 18c1073fbb2e91949c5037fa212d511d67255c72d53789836a10d527f4351194
- Sigstore transparency entry: 1357140212
- Sigstore integration time: Apr 22, 2026
Source repository:
- Permalink: deependra04/tokenly@d05f169b087d3da4ed254dde9d6f18170e1664da
- Branch / Tag: refs/tags/v0.2.2
- Owner: https://github.com/deependra04
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@d05f169b087d3da4ed254dde9d6f18170e1664da
- Trigger Event: push

File details

Details for the file tokenly-0.2.2-py3-none-any.whl.

File metadata

Download URL: tokenly-0.2.2-py3-none-any.whl
Upload date: Apr 22, 2026
Size: 38.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tokenly-0.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58d48710e8f40d6cc9beeffc0e18edced2dbd04a07b212628b51e7e7a85d2166`
MD5	`29f5c9e41fdd82d01539304620f2b033`
BLAKE2b-256	`e88838b7f2e4686c298370caa5419aa0afb4bb379588954be20dc33566e36001`


Provenance

The following attestation bundles were made for tokenly-0.2.2-py3-none-any.whl:

Publisher: release.yml on deependra04/tokenly

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenly-0.2.2-py3-none-any.whl
- Subject digest: 58d48710e8f40d6cc9beeffc0e18edced2dbd04a07b212628b51e7e7a85d2166
- Sigstore transparency entry: 1357140259
- Sigstore integration time: Apr 22, 2026
Source repository:
- Permalink: deependra04/tokenly@d05f169b087d3da4ed254dde9d6f18170e1664da
- Branch / Tag: refs/tags/v0.2.2
- Owner: https://github.com/deependra04
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@d05f169b087d3da4ed254dde9d6f18170e1664da
- Trigger Event: push
