Runtime token proxy + optimization toolkit for LLM developers and enterprises. Intercepts API calls, strips waste in real-time, tracks costs, and serves a web dashboard.

skim

Runtime token intelligence for Claude Code, Cursor, and any LLM tool.

LLM tools waste tokens invisibly. Claude Code reads package-lock.json (122k tokens, $0.37) before answering a question about a 200-line file. Because the conversation history is resent on every call, that waste compounds. Your context window fills silently, quality degrades, and you're paying for noise.

skim sits in the API call path and fixes this in real time: one env var, no code changes.

Claude Code / Cursor / your app
         │
         ▼
    skim proxy                       ← set ANTHROPIC_BASE_URL=http://localhost:7474
    ├─ strips lock files & build artifacts from tool outputs (real-time)
    ├─ auto-injects prompt caching   (50–90% cost reduction on repeated context)
    ├─ enforces token/cost budgets   (hard block with HTTP 429, enterprise-grade)
    ├─ serves local dashboard        (opens in browser automatically)
    └─ streams live events to team dashboard (optional)
         │
         ▼
  Anthropic / OpenAI / Gemini API

Quickstart

pip install skim-llm

# Start — browser opens automatically to your dashboard
skim proxy

# Point Claude Code (or any LLM tool) at it
export ANTHROPIC_BASE_URL=http://localhost:7474

That's it. Every API call now goes through skim. Open http://localhost:7474/dashboard to see live token usage, cost, savings, and cache hit rate.

Works with all plans — no API key required for Claude Pro/Max users. skim detects your auth type automatically (x-api-key for API plans, Authorization: Bearer for Pro/OAuth plans) and routes accordingly.

How it works

1 · Waste filtering

Detects lock files, build artifacts, and generated code inside tool_result blocks and strips them before they enter context. A package-lock.json read becomes a 12-token note instead of 122k tokens.

Detected automatically: package-lock.json, yarn.lock, pnpm-lock.yaml, Cargo.lock, poetry.lock, composer.lock — and anything in your .llmignore.
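A minimal sketch of the filtering step described above, not skim's actual implementation. It assumes the Anthropic Messages format, where a tool_result block points back to a tool_use block by id, and it assumes the tool's input carries the path under a `file_path` key. `strip_lock_files` is a hypothetical name, and .llmignore handling is left out.

```python
from pathlib import PurePosixPath

# Lock-file names taken from the list above.
LOCK_FILES = {
    "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "Cargo.lock", "poetry.lock", "composer.lock",
}

def strip_lock_files(messages):
    """Replace tool_result content for lock-file reads with a short note."""
    # Map each tool_use id to the path the tool was asked to read.
    # The "file_path" input key is an assumption about the tool's schema.
    paths = {}
    for msg in messages:
        if isinstance(msg.get("content"), list):
            for block in msg["content"]:
                if block.get("type") == "tool_use":
                    paths[block["id"]] = block.get("input", {}).get("file_path", "")

    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            cleaned.append(msg)  # plain-string content has no tool blocks
            continue
        new_blocks = []
        for block in content:
            name = PurePosixPath(paths.get(block.get("tool_use_id"), "")).name
            if block.get("type") == "tool_result" and name in LOCK_FILES:
                # Copy the block so the caller's messages are not mutated.
                block = {**block, "content": f"[skipped {name}: lock file]"}
            new_blocks.append(block)
        cleaned.append({**msg, "content": new_blocks})
    return cleaned
```

The tool_use_id stays on the replaced block, so the conversation remains a valid request; only the payload shrinks.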

2 · Prompt caching injection (Anthropic only)

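Wraps your system prompt and large context blocks with cache_control: {"type": "ephemeral"} automatically. First call: Anthropic writes the cache, billing the cached tokens at a 25% premium over the normal input price. Subsequent calls that hit the cache before it expires read those tokens at a small fraction of the input price (about 10% under Anthropic's published pricing), so CLAUDE.md and project context cost far less on calls 2+.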

Skipped for Pro/OAuth plan users — Pro plan manages its own caching layer.
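As a rough illustration of the injection, the sketch below marks the system prompt of a Messages API request body as cacheable. The `cache_control` field and the list-of-text-blocks form of `system` are part of Anthropic's API; the function name `inject_cache_control` and the choice to mark only the last system block are assumptions, not a description of skim's internals.

```python
import copy

EPHEMERAL = {"type": "ephemeral"}

def inject_cache_control(body):
    """Return a copy of a request body with its system prompt marked cacheable."""
    body = copy.deepcopy(body)
    system = body.get("system")
    if not system:
        return body  # nothing to cache
    # A plain-string system prompt has to become a list of text blocks,
    # because cache_control is a per-block field.
    if isinstance(system, str):
        system = [{"type": "text", "text": system}]
    # Marking the last block caches the prefix up to and including it.
    system[-1].setdefault("cache_control", dict(EPHEMERAL))
    body["system"] = system
    return body
```

Using `setdefault` leaves any cache_control the caller already set untouched.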

3 · Live dashboard

skim proxy opens a browser tab automatically. The local dashboard requires no login and no server setup, and it persists all events to ~/.skim/events.db. Five pages:

Page	Shows
Overview	Token usage over time, cost, savings, cache hits, recent calls
Sessions	Full call log with model, latency, plan type, cost per call
Usage	Hourly activity heatmap, daily breakdown table
Models	Side-by-side comparison — cost/1k tokens, cache hit %, waste %
Savings	Cumulative savings, save rate, ROI of using skim

4 · Plan detection

_auth_type() → ("apikey", key)    API plan users   → full features
             → ("oauth",  token)  Pro/Max users    → filtering + tracking
             → ("", "")           No auth          → 401

One method owns this logic. Extending for new plan types (enterprise SSO, team tokens) is one elif.
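The mapping above could look roughly like this in Python. This is a sketch under assumptions: the standalone function name `auth_type`, the case-insensitive header lookup, and the rule that x-api-key wins when both headers are present are all illustrative choices.

```python
def auth_type(headers):
    """Classify a request: ("apikey", key), ("oauth", token) or ("", "")."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    api_key = lowered.get("x-api-key", "").strip()
    if api_key:
        return ("apikey", api_key)
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return ("oauth", token.strip())
    return ("", "")  # caller responds with 401
```

A new plan type would slot in as one more branch before the final return.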

5 · Budget enforcement (enterprise)

When SKIM_SERVER_URL is set, the proxy calls /api/v1/budget/check before every request. If the user or their team has exceeded their token/cost budget, the proxy returns 429 immediately — no call is forwarded. Fails open (200ms timeout) so server downtime never blocks work.
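The fail-open behaviour can be sketched as follows. The endpoint path and the 200 ms timeout come from the text above; the GET method, the Bearer header, the `{"allowed": bool}` response shape, and both function names are assumptions made for illustration.

```python
import json
import urllib.request

def http_budget_check(server_url, token, timeout=0.2):
    """Call the budget endpoint; the {"allowed": bool} response shape is assumed."""
    req = urllib.request.Request(
        server_url.rstrip("/") + "/api/v1/budget/check",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

def budget_allows(server_url, token, check=http_budget_check):
    """Return False only on an explicit denial; otherwise fail open."""
    try:
        result = check(server_url, token)
    except (OSError, ValueError):
        # Timeouts, refused connections and HTTP errors are OSError subclasses;
        # malformed JSON raises ValueError. None of them may block work.
        return True
    if not isinstance(result, dict):
        return True  # unexpected payload: also fail open
    return bool(result.get("allowed", True))
```

Passing the check function as a parameter keeps the decision logic testable without a running server.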

Dashboard

Local (solo — no setup)

skim proxy          # browser opens to http://localhost:7474/dashboard

No login. No server. Data lives in ~/.skim/events.db. Works for any plan.

Team (enterprise)

pip install 'skim-llm[web]'

SKIM_ADMIN_EMAIL=you@corp.com skim server --host 0.0.0.0 --port 7475
# → open http://your-server:7475/dashboard

Connect each developer's proxy:

export SKIM_SERVER_URL=https://skim.corp.internal
export SKIM_SERVER_TOKEN=sk-skim-...   # generate in Settings

The team dashboard adds: multi-user auth, team leaderboard, org-level insights, budget management, webhook alerts, user invites, and a full audit log.

Auth options: Local password · LDAP/AD (SKIM_LDAP_*) · Google/GitHub/Azure/Okta (SKIM_OIDC_*)

Enterprise

skim v0.5.0 ships a full enterprise control plane. All features are in the open-source repo.

Budget enforcement

Set hard spending limits per user, per team, or globally. The proxy blocks requests once a limit has been exceeded.

# Set a 1M token monthly budget for a user
skim admin budget set --owner-type user --owner-id <user_id> --tokens 1000000 --period monthly

# Set a $500/month cost budget for a team
skim admin budget set --owner-type team --owner-id engineering --usd 500 --period monthly

When the budget is hit, the proxy returns:

{"error": {"type": "budget_exceeded", "message": "user token budget exceeded (103% used)"}}
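A client may want to tell this budget block apart from an ordinary upstream rate limit, since retrying a budget block will not help. A small sketch, assuming the error body shown above; `is_budget_block` is a hypothetical helper, not part of skim.

```python
import json

def is_budget_block(status, body_text):
    """True if a 429 response is a skim budget block rather than a rate limit."""
    if status != 429:
        return False
    try:
        error = json.loads(body_text).get("error", {})
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False
    return isinstance(error, dict) and error.get("type") == "budget_exceeded"
```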

Webhook alerts

Get notified on Slack (or any HTTP endpoint) when teams approach or exceed budgets.

# Slack (works with Teams connectors too)
skim admin webhooks add \
  --url https://hooks.slack.com/services/... \
  --channel slack \
  --events budget.warning,budget.exceeded

# Generic HTTP with HMAC signature
skim admin webhooks add --url https://your-system.example.com/hook

Payload on budget.warning:

{
  "event": "budget.warning",
  "data": {"user": "dev@corp.com", "team": "engineering", "pct_used": 83.4, "budget_type": "team"},
  "ts": "2026-05-31T14:23:01Z",
  "sig": "sha256=..."
}
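The receiving side would typically verify the sig field before trusting the payload. The exact signing scheme is not documented here, so the sketch below assumes one: HMAC-SHA256 over the payload with "sig" removed, serialized as compact JSON with sorted keys, hex-encoded and prefixed with "sha256=". Check docs/enterprise.md for the real scheme before relying on this.

```python
import hashlib
import hmac
import json

def sign(payload, secret):
    """Compute the assumed signature over the payload minus its "sig" field."""
    unsigned = {k: v for k, v in payload.items() if k != "sig"}
    message = json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode()
    digest = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return "sha256=" + digest

def verify(payload, secret):
    """Compare the received signature with the expected one in constant time."""
    received = str(payload.get("sig", "")).encode()
    return hmac.compare_digest(received, sign(payload, secret).encode())
```

`hmac.compare_digest` avoids leaking how many leading characters matched, which a plain `==` comparison could.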

User invites

No manual account creation. Admins generate invite links; users self-register.

skim admin users invite --email new@corp.com --role user --team engineering
# → https://skim.corp:7475/invite/abc123...  (7-day token, single-use)

API key scopes

Keys are scoped and can expire.

Scope	Can do
`ingest`	Push events from proxy (default)
`read`	Read stats and dashboard API
`admin`	Full access (only org admins can create)

# Create a 90-day read-only key
curl -X POST .../api/v1/auth/keys \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"label": "ci-reader", "scope": "read", "expires_days": 90}'
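Server-side, a scope-plus-expiry check might be sketched like this. The three scope names come from the table above; the action names, the key record layout with `expires_at`, and the assumption that ingest and read do not overlap are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# What each scope may do, following the table above; action names are hypothetical.
SCOPE_ACTIONS = {
    "ingest": {"push_events"},
    "read": {"read_stats"},
    "admin": {"push_events", "read_stats", "manage"},
}

def key_allows(key, action, now=None):
    """True if the key is unexpired and its scope covers the action."""
    now = now or datetime.now(timezone.utc)
    expires_at = key.get("expires_at")  # None means the key never expires
    if expires_at is not None and now >= expires_at:
        return False
    return action in SCOPE_ACTIONS.get(key["scope"], set())
```

For example, a key created with `"expires_days": 90` would carry `expires_at = created + timedelta(days=90)` and stop working after that moment.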

RBAC

Three roles: admin (org-wide), team_admin (own team only), user (own data only).
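The three roles reduce to a short visibility rule. A sketch, assuming each user record has `email`, `role` and `team` fields (the record shape and the `can_view` name are illustrative):

```python
def can_view(actor, target):
    """Decide whether `actor` may see `target`'s usage data."""
    if actor["role"] == "admin":
        return True  # org-wide
    if actor["role"] == "team_admin":
        return actor["team"] == target["team"]  # own team only
    return actor["email"] == target["email"]  # own data only
```

Unknown roles fall through to the most restrictive branch, so a typo in a role name cannot widen access.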

Audit log

Every action is logged. Queryable via API or CLI.

skim admin audit --days 30
#  Timestamp              User                         Action                 Detail
#  2026-05-31 14:23:01    admin@corp.com               auth.login
#  2026-05-31 14:24:10    admin@corp.com               budget.created         user:abc123
#  2026-05-31 14:31:55    dev@corp.com                 auth.key_created       scope=ingest

Data export

# CSV for accounting
skim admin export --days 30 --out june-usage.csv

# JSON for BI tools
curl '.../api/v1/export/summary.json?days=30'

`skim admin` CLI

Full management from the command line — no browser needed.

skim admin users list
skim admin users invite --email X --role team_admin --team platform
skim admin budget list
skim admin budget set --owner-type global --tokens 10000000 --period monthly
skim admin keys list
skim admin keys revoke sk-skim-abc1
skim admin webhooks list
skim admin audit --days 7 --action auth.login
skim admin export --days 30 --out report.csv

Reads SKIM_SERVER_URL + SKIM_SERVER_TOKEN from env.

CLI Reference

Static analysis (no API key needed):
  skim scan       Audit token costs per file across your codebase
  skim analyze    Detect waste patterns (lock files, build artifacts, etc.)
  skim fix        Auto-write .llmignore rules — shows before/after savings
  skim check      CI budget gate — exits 1 if over context threshold
  skim generate   Generate .llmignore, .skimrc, and CLAUDE.md
  skim secrets    Scan for leaked credentials before they reach an LLM

Runtime:
  skim proxy      Runtime interceptor — set ANTHROPIC_BASE_URL=http://localhost:7474
  skim server     Web dashboard + REST API (login, charts, team usage)
  skim admin      Manage users, budgets, keys, webhooks via server API

Operations:
  skim audit      View the local operation log (~/.skim/audit.log)
  skim config     Manage .skimrc configuration
  skim hooks      Install/remove git pre-commit budget gate
  skim baseline   Save & compare token count snapshots (regression detection)
  skim version    Print version

Key flags

skim proxy --port 7474 --model claude --no-filter --no-cache --no-browser
skim server --port 7475 --host 0.0.0.0
skim check --max-pct 30 --fail-on-waste --json
skim fix --min-severity medium --dry-run
skim scan --model gpt-4o --top 30 --json
skim secrets --path . --fail          # use in CI to block leaked keys
skim hooks install --max-pct 30 --fail-on-waste
skim baseline save --name pre-refactor
skim baseline compare --name pre-refactor

Configuration

.skimrc in your project root (commit for team-wide policy):

model         = claude       # claude | openai | gemini | ollama
max_pct       = 30           # fail CI if context exceeds this %
fail_on_waste = false        # also fail on HIGH severity waste patterns
min_severity  = high         # auto-fix threshold: high | medium | low
proxy_port    = 7474
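If you need to read .skimrc from your own tooling, the `key = value  # comment` layout shown above is simple to parse. A minimal sketch, assuming values are only booleans, non-negative integers or bare strings; `parse_skimrc` is a hypothetical helper and skim's own parser may accept more.

```python
def parse_skimrc(text):
    """Parse `key = value  # comment` lines into a dict with typed values."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        key, raw = key.strip(), raw.strip()
        if raw.lower() in ("true", "false"):
            settings[key] = raw.lower() == "true"
        elif raw.isdigit():
            settings[key] = int(raw)
        else:
            settings[key] = raw
    return settings
```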

Environment variables:

Variable	Purpose
`ANTHROPIC_BASE_URL`	Point Claude Code at the proxy
`OPENAI_BASE_URL`	Point OpenAI-compatible tools at the proxy
`SKIM_NO_FILTER`	Disable waste filtering (passthrough only)
`SKIM_NO_CACHE`	Disable prompt caching injection
`SKIM_SERVER_URL`	Central dashboard URL (enables enterprise mode)
`SKIM_SERVER_TOKEN`	API key for proxy → server reporting
`SKIM_JWT_SECRET`	JWT signing secret (auto-generated if unset)
`SKIM_ADMIN_EMAIL`	Auto-create admin user on first server run
`SKIM_ADMIN_PASSWORD`	Password for auto-created admin
`SKIM_DB_PATH`	SQLite DB path (default: `~/.skim/skim.db`)
`SKIM_LDAP_URL`	Enable LDAP auth
`SKIM_OIDC_GOOGLE_CLIENT_ID`	Enable Google SSO
`SKIM_OIDC_GITHUB_CLIENT_ID`	Enable GitHub SSO
`SKIM_OIDC_AZURE_CLIENT_ID`	Enable Azure AD SSO

Python API

from adapters import ClaudeAdapter, OpenAIAdapter, GeminiAdapter, OllamaAdapter

# Claude with prompt caching
claude = ClaudeAdapter(
    model="claude-sonnet-4-6",
    system_prompt="You are a terse coding assistant.",
    enable_caching=True,
)
response = claude.chat("Refactor the auth module")
claude.print_stats()
# Session: 12,400 tokens | Cache hit rate: 87% | Cost: $0.0037

# Subagent pattern — keeps your main context clean
summary = claude.run_subagent(
    "Investigate how authentication handles token refresh",
    context_files=["src/auth/"]
)

MCP Server

{
  "mcpServers": {
    "skim": { "command": "skim-mcp" }
  }
}

Tools: scan_tokens, analyze_context, check_budget, fix_context, generate_llmignore

Install

pip install skim-llm                      # core — zero hard deps
pip install 'skim-llm[tiktoken]'          # accurate token counting
pip install 'skim-llm[web]'               # dashboard (Flask)
pip install 'skim-llm[web,sso,ldap]'      # enterprise auth
pip install 'skim-llm[all]'               # everything

Docs

Document	What it covers
docs/quickstart.md	Zero to running in 2 minutes
docs/proxy.md	Proxy deep-dive — all features, all flags
docs/dashboard.md	Local and team dashboard guide
docs/enterprise.md	Budgets, webhooks, invites, RBAC, audit
docs/admin-cli.md	`skim admin` complete reference
docs/api.md	REST API reference
docs/configuration.md	All env vars and .skimrc options
docs/deployment.md	Production deployment guide
docs/mcp-setup.md	Claude Desktop MCP integration

MIT License · GitHub · PyPI · Issues · Changelog

Release history

0.5.1 · May 31, 2026
0.5.0 · May 31, 2026 (this version)
0.3.0 · May 31, 2026
0.2.0 · May 31, 2026
