Non-invasive prompt cache instrumentation for LLM API apps

cache-lens

Non-invasive prompt cache instrumentation for LLM API apps. Wrap your client in one line. Get terminal reports, JSON exports, and OTEL metrics.

Prompt caching gives steep discounts on cached tokens, but nothing tells you whether your app is actually getting cache hits, or why it isn't. cache-lens wraps your Anthropic, Gemini, or OpenAI client and reports cache hit rate, cost, savings, and the money you're leaving on the table, broken down by prompt layer.
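To make those reported quantities concrete, here is a minimal sketch of the arithmetic behind a hit rate and a savings figure for a single request. It is not cache-lens's implementation: the function name `cache_summary`, the result keys, and the prices are hypothetical. It also assumes the token counts are already normalized into "uncached" and "cache-read" buckets, and it ignores any cache-write surcharge a provider may apply.

```python
def cache_summary(input_tokens, cache_read_tokens, input_price, cache_read_price):
    """Summarize caching for one request (illustrative only).

    input_tokens: prompt tokens billed at the full input rate (not cached).
    cache_read_tokens: prompt tokens served from the provider's cache.
    Prices are USD per 1M tokens; the values used below are made up.
    """
    total_prompt = input_tokens + cache_read_tokens
    hit_rate = cache_read_tokens / total_prompt if total_prompt else 0.0

    # What was actually paid for the prompt side of the request.
    cost = (input_tokens * input_price + cache_read_tokens * cache_read_price) / 1_000_000
    # What the same prompt would have cost with no cache hits at all.
    uncached_cost = total_prompt * input_price / 1_000_000

    return {"hit_rate": hit_rate, "cost": cost, "savings": uncached_cost - cost}


summary = cache_summary(2_000, 8_000, input_price=1.25, cache_read_price=0.125)
print(summary)
```

With these made-up numbers, 8,000 of 10,000 prompt tokens come from the cache, so the hit rate is 0.8 and the savings are the gap between the full-price cost and the discounted one.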

See SPEC.md for the full design.

Install

pip install cache-lens                  # core + rich
pip install "cache-lens[anthropic]"     # + Anthropic SDK
pip install "cache-lens[gemini]"        # + Gemini SDK
pip install "cache-lens[openai]"        # + OpenAI SDK
pip install "cache-lens[otel]"          # + OpenTelemetry
pip install "cache-lens[all]"           # everything (quotes keep shells like zsh from globbing the brackets)

Quickstart

import anthropic
from cache_lens import wrap

client = wrap(anthropic.Anthropic())
# ... use client exactly as before; report prints on exit
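One common way to get this kind of drop-in behavior is a forwarding proxy that passes every call through to the real client and records the `usage` data on each response. The sketch below illustrates that general pattern only. It is not taken from cache-lens's source, and `RecordingProxy`, `FakeClient`, and the `cache_read_tokens` field are hypothetical names chosen so the example runs without any SDK installed.

```python
from types import SimpleNamespace


class RecordingProxy:
    """Forward attribute access to a wrapped object and record response usage."""

    def __init__(self, target, records):
        self._target = target
        self._records = records

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        attr = getattr(self._target, name)
        if callable(attr):
            def call(*args, **kwargs):
                result = attr(*args, **kwargs)
                usage = getattr(result, "usage", None)
                if usage is not None:
                    self._records.append(usage)
                return result
            return call
        # Wrap nested namespaces (e.g. client.messages) so their methods are intercepted too.
        if hasattr(attr, "__dict__"):
            return RecordingProxy(attr, self._records)
        return attr


class FakeMessages:
    """Stand-in for an SDK resource; returns a response carrying usage data."""

    def create(self, **kwargs):
        usage = SimpleNamespace(input_tokens=120, cache_read_tokens=880)
        return SimpleNamespace(text="ok", usage=usage)


class FakeClient:
    def __init__(self):
        self.messages = FakeMessages()


records = []
client = RecordingProxy(FakeClient(), records)
response = client.messages.create(model="demo", prompt="hello")
print(response.text, len(records))
```

The calling code sees the same response object it would get from the unwrapped client, which is what makes this style of instrumentation non-invasive.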

Explicit session boundary with exports:

from cache_lens import CacheLens

with CacheLens(client, json_export="report.json", otel=True) as session:
    agent.run(...)        # your code, unchanged
report = session.report

Suppress the terminal report in CI with CACHE_LENS_TERMINAL=0.

Custom pricing

cache-lens ships a default price table, but you can override or extend it without forking, which is handy when a new model lands. User entries merge over the defaults:

# in-memory dict (native format, USD per 1M tokens)
wrap(client, pricing={"openai": {"gpt-5": {"input": 1.25, "output": 10.0, "cache_read": 0.125}}})

# or a JSON file (native or LiteLLM model_prices_and_context_window.json format)
wrap(client, pricing="pricing.json")

Or point at a file process-wide with CACHE_LENS_PRICING=/path/to/pricing.json. A bad pricing file falls back to defaults rather than breaking the run. See SPEC.md §12.
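The "merge over the defaults" and "fall back rather than break" behavior described above can be pictured with a small sketch. This is an assumption-laden illustration, not the library's code: `DEFAULT_PRICING`, `merge_pricing`, `load_pricing`, and the `demo` provider are hypothetical, and the sketch merges at the level of individual rates, whereas the library may replace whole model entries instead.

```python
import json

# Hypothetical default table in the native shape: provider -> model -> rates (USD per 1M tokens).
DEFAULT_PRICING = {
    "demo": {"model-a": {"input": 1.0, "output": 4.0, "cache_read": 0.1}},
}


def merge_pricing(defaults, overrides):
    """Return a new table with override entries layered on top of the defaults."""
    merged = {
        provider: {model: dict(rates) for model, rates in models.items()}
        for provider, models in defaults.items()
    }
    for provider, models in overrides.items():
        for model, rates in models.items():
            merged.setdefault(provider, {}).setdefault(model, {}).update(rates)
    return merged


def load_pricing(path, defaults=DEFAULT_PRICING):
    """Merge overrides from a JSON file; return the defaults if the file is unusable."""
    try:
        with open(path, encoding="utf-8") as fh:
            overrides = json.load(fh)
        return merge_pricing(defaults, overrides)
    except (OSError, ValueError, AttributeError, TypeError):
        # Missing file, invalid JSON, or an unexpected shape: keep the run alive.
        return defaults


table = merge_pricing(
    DEFAULT_PRICING,
    {"demo": {"model-a": {"cache_read": 0.05},
              "model-b": {"input": 2.0, "output": 8.0, "cache_read": 0.2}}},
)
print(table["demo"]["model-a"])
```

Building a fresh table instead of mutating the defaults means a bad override can never corrupt the built-in prices for the rest of the process.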

Status

v1.0.

Implemented: wrapper interception with request capture; provider extraction and capture (Anthropic, Gemini, OpenAI); content-based layer classification (longest common prefix → named system_prompt / context / conversation layers, cross-referenced against actual cache reads); terminal, JSON, and OTEL outputs; overridable pricing; tests.

Pending: cache-lens run CLI injection, streaming support, and cross-run static/semi-static separation (see docs/architecture.md).
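The longest-common-prefix idea behind layer classification can be shown in a few lines. This is only a rough sketch under stated assumptions: it works on raw characters of flattened prompt strings and separates just a stable prefix from a varying suffix, while the real classifier presumably operates on structured requests and assigns the named layers. `split_layers` and the sample prompts are hypothetical.

```python
from os.path import commonprefix


def split_layers(prompts):
    """Split prompts into the prefix they all share and each prompt's varying tail."""
    prefix = commonprefix(list(prompts))  # character-wise common prefix
    return prefix, [p[len(prefix):] for p in prompts]


prompts = [
    "You are a helpful assistant.\nDocs: A, B, C\nUser: What is A?",
    "You are a helpful assistant.\nDocs: A, B, C\nUser: Summarize B.",
]
stable, variable = split_layers(prompts)
print(repr(stable))
print(variable)
```

The shared prefix is the part a provider-side cache could in principle serve, so comparing its size with the cache reads actually reported is one way to spot caching that should be happening but is not.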

Develop

pip install -e ".[dev]"
pytest
