Skip to main content

Semantic LLM response caching — stop paying for the same call twice.

Project description

inferencache

Multi-tier semantic caching for LLM APIs. Stop paying for the same prompt twice.

pip install "inferencache[embed,serve]"
export ANTHROPIC_API_KEY=sk-ant-...
inferencache serve
# landing:   http://localhost:8080/
# dashboard: http://localhost:8080/dashboard/
# proxy:     http://localhost:8080/v1/messages

Point Cursor or Claude Code at http://localhost:8080 — no code changes required.

See CONTRIBUTING.md for development setup.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

inferencache-0.1.0.tar.gz (547.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

inferencache-0.1.0-py3-none-any.whl (561.6 kB view details)

Uploaded Python 3

File details

Details for the file inferencache-0.1.0.tar.gz.

File metadata

  • Download URL: inferencache-0.1.0.tar.gz
  • Upload date:
  • Size: 547.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for inferencache-0.1.0.tar.gz
Algorithm Hash digest
SHA256 fcca912faa74e5da0a28baa2256e04d193300588944b745889fc28ef398ac4a0
MD5 c844dd293ec4fc4f6acf8a1635c20d15
BLAKE2b-256 a40db9b1654e07026f6a9e06671b7a8415785924529e71704ae1fa3099157fe4

See more details on using hashes here.

File details

Details for the file inferencache-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: inferencache-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 561.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for inferencache-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e25b6310b3298f684f7a644cabd32b32a06de7a3eed23812f9b6aaddd8c55b48
MD5 4c97697f798cdc9cd23f1e94ef87f36d
BLAKE2b-256 578dd2ca422176a2d2c3b76d7782441d6b341bdbaf5777d93a8206711542a6bf

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page