Python client for CacheCore — semantic cache gateway for LLM agent workloads
Project description
cachecore
Python client for CacheCore — the LLM API caching proxy that reduces cost and latency for AI agent workloads.
CacheCore sits transparently between your application and LLM providers (OpenAI, Anthropic via OpenAI-compat, etc.) and caches responses at two levels: L1 exact-match and L2 semantic similarity. This client handles the CacheCore-specific plumbing — header injection, dependency encoding, invalidation — without replacing your LLM SDK.
Install
pip install cachecore-python
import cachecore # the import name is 'cachecore'
Quick start
Rung 1 — zero code changes: swap base_url
Point your existing SDK at CacheCore and get L1 exact-match caching immediately.
No import cachecore required.
from openai import AsyncOpenAI
oai = AsyncOpenAI(
api_key="your-openai-key",
base_url="https://gateway.cachecore.it/v1", # ← only change
)
# Identical requests are now served from cache.
resp = await oai.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "What is 2+2?"}],
)
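To confirm what is happening at this rung, you can read the gateway's response headers through the OpenAI SDK's raw-response interface and parse them with CacheStatus (documented under API reference below). A minimal sketch:

from cachecore import CacheStatus

raw = await oai.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
status = CacheStatus.from_headers(raw.headers)  # parse CacheCore's response headers
print(status.status)  # "MISS" on the first call, "HIT_L1" on an identical repeat
resp = raw.parse()    # the usual ChatCompletion object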
Rung 2 — tenant isolation (3 lines)
Add CacheCoreClient to unlock tenant-scoped namespaces, L2 semantic caching, and per-tenant
metrics. Three extra lines wired into the SDK's http_client.
from cachecore import CacheCoreClient
import httpx
from openai import AsyncOpenAI
cc = CacheCoreClient(
gateway_url="https://gateway.cachecore.it",
tenant_jwt="ey...", # your tenant JWT from the CacheCore dashboard
)
oai = AsyncOpenAI(
api_key="ignored", # gateway injects its own upstream key
base_url="https://gateway.cachecore.it/v1",
http_client=httpx.AsyncClient(transport=cc.transport),
)
# Requests now carry your tenant identity.
# Semantically similar prompts hit L2 cache.
resp = await oai.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Explain photosynthesis"}],
)
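While integrating, the debug flag on the constructor (see API reference below) logs the cache status of each proxied request, which makes L2 behaviour easy to observe; a sketch:

cc = CacheCoreClient(
    gateway_url="https://gateway.cachecore.it",
    tenant_jwt="ey...",
    debug=True,  # log cache status per request (HIT_L1 / HIT_L2 / MISS / ...)
)

Re-sending a paraphrase of an earlier prompt should then surface as an L2 hit in the log output.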
Rung 3 — dep invalidation
Declare which data a cached response depends on. When that data changes, invalidate the dep and every cached entry tagged with it is evicted automatically.
from cachecore import CacheCoreClient, Dep
import httpx
from openai import AsyncOpenAI
cc = CacheCoreClient(
gateway_url="https://gateway.cachecore.it",
tenant_jwt="ey...",
)
oai = AsyncOpenAI(
api_key="ignored",
base_url="https://gateway.cachecore.it/v1",
http_client=httpx.AsyncClient(transport=cc.transport),
)
# Read path — declare what data this response depends on
with cc.request_context(deps=[Dep("table:products"), Dep("table:orders")]):
resp = await oai.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "List all products under $50"}],
)
# Write path — bypass cache for the LLM call, then invalidate
with cc.request_context(bypass=True):
resp = await oai.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Confirm order created."}],
)
await cc.invalidate("table:products")
# Invalidate multiple deps at once
await cc.invalidate_many(["table:orders", "table:products"])
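Put together, a typical agent write tool does its own data-layer write and then rolls the dep; create_product here is a hypothetical stand-in for your own persistence code:

async def add_product(name: str, price: float) -> None:
    # Hypothetical data-layer call; any write to the products table applies.
    await create_product(name=name, price=price)
    # Evict every cached response that declared Dep("table:products").
    await cc.invalidate("table:products")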
Works with LangChain / LangGraph
The transport works with any SDK that accepts an httpx.AsyncClient:
from langchain_openai import ChatOpenAI
import httpx
from cachecore import CacheCoreClient, Dep
cc = CacheCoreClient(gateway_url="https://gateway.cachecore.it", tenant_jwt="ey...")
llm = ChatOpenAI(
model="gpt-4o",
api_key="ignored",
base_url="https://gateway.cachecore.it/v1",
http_async_client=httpx.AsyncClient(transport=cc.transport),
)
# Use request_context() around any ainvoke / astream call
with cc.request_context(deps=[Dep("doc:policy-42")]):
result = await llm.ainvoke("Summarise the compliance policy")
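Streaming goes through the same context manager; a sketch using astream:

with cc.request_context(deps=[Dep("doc:policy-42")]):
    async for chunk in llm.astream("Summarise the compliance policy"):
        print(chunk.content, end="", flush=True)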
API reference
CacheCoreClient
CacheCoreClient(
gateway_url: str, # "https://gateway.cachecore.it"
tenant_jwt: str, # tenant HS256/RS256 JWT
timeout: float = 30.0, # for invalidation calls
debug: bool = False, # log cache status per request
)
| Property / Method | Description |
|---|---|
| `.transport` | `httpx.AsyncBaseTransport` — pass to `httpx.AsyncClient(transport=...)` |
| `.request_context(deps, bypass)` | Context manager — sets per-request deps / bypass |
| `await .invalidate(dep_id)` | Evict all entries tagged with this dep |
| `await .invalidate_many(dep_ids)` | Invalidate multiple deps concurrently |
| `await .aclose()` | Close HTTP clients. Also works as `async with CacheCoreClient(...):` |
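The async context manager form from the table above keeps cleanup in one place; a sketch:

async with CacheCoreClient(
    gateway_url="https://gateway.cachecore.it",
    tenant_jwt="ey...",
) as cc:
    ...  # build SDK clients around cc.transport and make calls
# On exit the client's HTTP resources are closed, equivalent to await cc.aclose()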
Dep / DepDeclaration
Dep("table:products") # simple — hash defaults to "v1"
Dep("table:products", hash="abc123") # explicit hash for versioned deps
CacheStatus
Parsed from response headers after a proxied request:
from cachecore import CacheStatus
status = CacheStatus.from_headers(response.headers)
# status.status → "HIT_L1" | "HIT_L1_STALE" | "HIT_L2" | "MISS" | "BYPASS" | "UNKNOWN"
# status.similarity → float 0.0–1.0 (non-zero on L2 hits)
# status.age_seconds → int
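A small sketch of branching on a parsed status, using only the fields above:

if status.status.startswith("HIT"):
    print(f"cache hit ({status.status}), age {status.age_seconds}s")
    if status.status == "HIT_L2":
        print(f"semantic match, similarity {status.similarity:.2f}")
elif status.status == "MISS":
    print("forwarded upstream and cached for next time")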
Exceptions
| Exception | When |
|---|---|
| `CacheCoreError` | Base class for all CacheCore errors |
| `CacheCoreAuthError` | 401 / 403 from the gateway |
| `CacheCoreRateLimitError` | 429 — check the `.retry_after` attribute (seconds, or `None`) |
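A sketch of handling rate limits on the invalidation path with the documented retry_after attribute:

import asyncio
from cachecore import CacheCoreAuthError, CacheCoreRateLimitError

try:
    await cc.invalidate("table:products")
except CacheCoreRateLimitError as e:
    await asyncio.sleep(e.retry_after or 1.0)  # fall back to 1s if the header is absent
    await cc.invalidate("table:products")      # single retry
except CacheCoreAuthError:
    raise  # expired or wrong tenant JWT; retrying will not help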
How it works
The client injects headers at the httpx transport layer — below the LLM SDK, above the network. Your SDK continues to work exactly as before:
Your code → openai SDK → httpx → [CacheCoreTransport] → CacheCore proxy → OpenAI API
                                          ↑
                               injects X-CacheCore-Token
                               injects X-CacheCore-Deps
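Because the injection lives in a plain httpx transport, an LLM SDK is optional; a sketch of driving the gateway's OpenAI-compatible endpoint with bare httpx, assuming the same cc as above:

import httpx
from cachecore import CacheStatus

raw = httpx.AsyncClient(
    transport=cc.transport,  # injects the X-CacheCore-* request headers
    base_url="https://gateway.cachecore.it",
)
resp = await raw.post(
    "/v1/chat/completions",
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "ping"}],
    },
)
print(CacheStatus.from_headers(resp.headers).status)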
Requirements
- Python 3.10+
- httpx >= 0.25.0
Links
- Website: cachecore.it
- Source: github.com/cachecore/cachecore-python
License
MIT — see LICENSE
File details
Details for the file cachecore_python-0.1.0.tar.gz.
File metadata
- Download URL: cachecore_python-0.1.0.tar.gz
- Upload date:
- Size: 16.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `f4cc9be443ad2a74c2f6b7b73dfdd6fce3eb8c4d6133688336f712b2473c9e6a` |
| MD5 | `fe3e9861253f14d64e240e17595b4a16` |
| BLAKE2b-256 | `3e4f3cec0355ac9f137d54d68e381a256db92d94ad6c57052289dd39abfe1d9e` |
File details
Details for the file cachecore_python-0.1.0-py3-none-any.whl.
File metadata
- Download URL: cachecore_python-0.1.0-py3-none-any.whl
- Upload date:
- Size: 11.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `98b5e79f04a8b5a07e19786015673cd2ea3aa3fb61cbdda997c20bee8cb0babd` |
| MD5 | `88267c807a04e3b571fbddaa2e42eea1` |
| BLAKE2b-256 | `690413e6c50f309c1a90386668bd74486b90403faeb6641c1505c6bc374e0573` |