

cachly Python SDK

Official Python SDK for cachly.dev – Managed Valkey/Redis cache. GDPR-compliant (DSGVO) · German servers · 30-second provisioning.

Installation

pip install cachly
# or
uv add cachly

Requires Python 3.10+. Uses redis-py and numpy (for semantic cache).

Quick Start

import os
from cachly import CachlyClient

cache = CachlyClient(url=os.environ["CACHLY_URL"])

# Set / Get
cache.set("user:42", {"name": "Alice"}, ttl=300)
user = cache.get("user:42")          # returns dict or None

# Get-or-set pattern
report = cache.get_or_set("report:monthly", lambda: db.run_expensive_report(), ttl=3600)

# Atomic counter
views = cache.incr("page:views")

cache.close()
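
The get_or_set call combines a read, a miss-path compute, and a write. A minimal dict-backed sketch of the pattern (illustrative only – the SDK's version additionally applies the TTL and JSON serialisation):

```python
def get_or_set(store: dict, key, fn):
    """Return the cached value, computing and storing it on a miss."""
    if key in store:
        return store[key]
    value = fn()          # the expensive function runs only on a miss
    store[key] = value
    return value

store = {}
first = get_or_set(store, "report", lambda: "expensive-result")   # computes
second = get_or_set(store, "report", lambda: "never-called")      # served from store
```

Because the callable is only invoked on a miss, wrapping the expensive call in a lambda (as in the Quick Start) defers the work until it is actually needed.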

Async Usage

import os

from cachly.asyncio import AsyncCachlyClient

async def main():
    cache = AsyncCachlyClient(url=os.environ["CACHLY_URL"])

    await cache.set("session:abc", session_data, ttl=1800)
    data = await cache.get("session:abc")

    await cache.close()

Semantic AI Cache (Speed / Business tiers)

from cachly import SemanticOptions

result = cache.semantic.get_or_set(
    prompt=user_question,
    fn=lambda: openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_question}]
    ),
    embed_fn=lambda text: openai_client.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding,
    options=SemanticOptions(similarity_threshold=0.92, ttl_seconds=3600),
)

print("⚡ hit" if result.hit else "🔄 miss", result.value)
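
The similarity_threshold compares the incoming prompt's embedding against cached ones – typically via cosine similarity – so 0.92 means only near-paraphrases count as hits. A pure-Python sketch of that comparison (illustrative, not the SDK's internals; the vectors are made-up numbers):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

cached = [0.2, 0.8, 0.1]     # embedding of a previously cached prompt
query  = [0.21, 0.79, 0.12]  # embedding of a paraphrased question
hit = cosine_similarity(cached, query) >= 0.92   # near-identical direction → hit
```

Lowering the threshold trades precision for hit rate: looser matches are served from cache, at the risk of returning an answer to a subtly different question.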

Django / FastAPI Integration

# FastAPI
import os

from fastapi import FastAPI
from cachly import CachlyClient

app = FastAPI()
cache = CachlyClient(url=os.environ["CACHLY_URL"])

@app.on_event("shutdown")  # on newer FastAPI versions, prefer a lifespan handler
async def shutdown():
    cache.close()

@app.get("/data/{key}")
async def get_data(key: str):
    return cache.get_or_set(key, lambda: fetch_from_db(key), ttl=60)

Batch API – multiple ops in one round-trip

Bundles GET/SET/DEL/EXISTS/TTL operations into a single HTTP request (or a Redis pipeline).

import os
import time

from cachly import CachlyClient, BatchOp

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    batch_url=os.environ.get("CACHLY_BATCH_URL"),  # optional
)

result = cache.batch([
    BatchOp("get",    "user:1"),
    BatchOp("get",    "config:app"),
    BatchOp("set",    "visits", str(time.time()), ttl=86400),
    BatchOp("exists", "session:xyz"),
    BatchOp("ttl",    "token:abc"),
])
user    = result[0]   # str | None
config  = result[1]   # str | None
ok      = result[2]   # bool
present = result[3]   # bool
secs    = result[4]   # int (-1 = no TTL, -2 = key missing)

Without batch_url, the method automatically falls back to a Redis pipeline (a single TCP round-trip).

API Reference

Method – Description
CachlyClient(url, batch_url=None, pool=None) – Create a client from a Redis URL
get(key) – Get a value (None if missing); auto-deserialises JSON
set(key, value, ttl=None) – Set a value, with optional TTL in seconds
delete(*keys) – Delete one or more keys
exists(key) → bool – Check existence
expire(key, seconds) – Update the TTL
incr(key) → int – Atomic increment
get_or_set(key, fn, ttl=None) – Get-or-set pattern
batch(ops) → BatchResult – Bulk ops in one round-trip
semantic – SemanticCache for AI workloads
raw – Direct redis.Redis access
close() – Close the connection pool and stop keep-alive

Connection Pooling & Keep-Alive

Fine-tune connection behaviour for high-throughput apps:

from cachly import CachlyClient, CachlyConfig, PoolConfig

cache = CachlyClient(config=CachlyConfig(
    url=os.environ["CACHLY_URL"],
    pool=PoolConfig(
        keep_alive_s=30,         # PING every 30s (prevents firewall idle-disconnect)
        max_retries=10,          # reconnect retries with exponential backoff
        base_retry_delay_s=0.1,  # first retry delay
        max_retry_delay_s=10,    # retry delay cap
        idle_timeout_s=300,      # auto-disconnect after 5min idle (0 = disabled)
        on_error=lambda e: print(f"cachly error: {e}"),
        on_reconnect=lambda: print("cachly reconnected"),
    ),
))

LLM Response Caching Proxy

Use cachly as a drop-in caching proxy for OpenAI or Anthropic — no SDK changes needed. Just swap the base URL:

# Instead of https://api.openai.com → use your cachly proxy URL:
OPENAI_BASE_URL=https://api.cachly.dev/v1/llm-proxy/YOUR_TOKEN/openai

# Anthropic:
ANTHROPIC_BASE_URL=https://api.cachly.dev/v1/llm-proxy/YOUR_TOKEN/anthropic

Identical requests are served from cache with X-Cachly-Cache: HIT header. Check savings via GET /v1/llm-proxy/YOUR_TOKEN/stats.

Agent Workflow Persistence

Checkpoint agent workflow state so agents can resume from the last step on crash:

import json

import httpx

token = "YOUR_TOKEN"  # your cachly workflow token
base = f"https://api.cachly.dev/v1/workflow/{token}"

# Save a checkpoint after each workflow step
httpx.post(f"{base}/checkpoints", json={
    "run_id": "my-run-123",
    "step_index": 0,
    "step_name": "research",
    "agent_name": "researcher",
    "status": "completed",
    "state": json.dumps({"topic": "AI caching", "results": [...]}),
})

# Resume: get the latest checkpoint for a run
checkpoint = httpx.get(f"{base}/runs/my-run-123/latest").json()
# → {"step_index": 2, "step_name": "write", "state": "...", "status": "completed"}
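
Resuming boils down to reading the latest checkpoint and continuing from the next step. A minimal in-memory sketch of that resume logic (the step list and resume_index helper are hypothetical names standing in for your workflow code; the HTTP call above supplies the checkpoint dict):

```python
steps = ["research", "outline", "write", "review"]

def resume_index(latest_checkpoint):
    """Index of the next step to run, given the last saved checkpoint."""
    if latest_checkpoint is None:
        return 0                                    # fresh run: start at step 0
    if latest_checkpoint["status"] == "completed":
        return latest_checkpoint["step_index"] + 1  # continue after the saved step
    return latest_checkpoint["step_index"]          # retry the interrupted step

# e.g. the process crashed after "outline" (step 1) completed:
next_i = resume_index({"step_index": 1, "step_name": "outline", "status": "completed"})
remaining = steps[next_i:]
```

Saving a checkpoint after every completed step keeps the retry window to at most one step's worth of work.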

Environment Variables

CACHLY_URL=redis://:your-password@my-app.cachly.dev:30101
CACHLY_BATCH_URL=https://api.cachly.dev/v1/cache/YOUR_TOKEN   # optional

Retry with Exponential Backoff

Every cache command is automatically retried on transient errors (ConnectionError, TimeoutError, BusyLoadingError, etc.) using AWS-style full-jitter exponential backoff:

from cachly import CachlyClient, RetryConfig

cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    retry=RetryConfig(
        max_retries=3,        # retry up to 3× (default)
        base_delay_s=0.05,    # first retry after ~50ms
        max_delay_s=2.0,      # cap at 2s
    ),
)

Disable retries with RetryConfig(max_retries=0).
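
Full jitter means each delay is drawn uniformly at random between zero and an exponentially growing cap. A sketch of how the delays configured above would be computed (assumed to mirror, not reproduce, the SDK's internals):

```python
import random

def full_jitter_delay(attempt, base_delay_s=0.05, max_delay_s=2.0):
    """AWS-style full jitter: uniform in [0, min(max_delay, base * 2**attempt)]."""
    cap = min(max_delay_s, base_delay_s * (2 ** attempt))
    return random.uniform(0, cap)

delays = [full_jitter_delay(a) for a in range(4)]
# attempt 0 caps at 0.05s, attempt 1 at 0.1s, attempt 2 at 0.2s, ...
# from attempt 6 onward the cap is pinned at max_delay_s = 2.0s
```

The randomisation spreads simultaneous retries out in time, so many clients recovering from the same outage do not hammer the server in lockstep.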

OpenTelemetry Tracing

Pass an OpenTelemetry Tracer to auto-instrument every cache operation with spans:

from opentelemetry import trace

tracer = trace.get_tracer("my-app")
cache = CachlyClient(
    url=os.environ["CACHLY_URL"],
    otel_tracer=tracer,
)

# Every get/set/delete/incr now produces OTEL spans:
#   span: "cache.get"  attributes: { cache.key: "user:42" }
#   span: "cache.set"  attributes: { cache.key: "user:42", cache.ttl: 300 }

Spans include error.type and error.message attributes on failure.

License

MIT © cachly.dev

Download files

Source Distribution: cachly-0.1.0b1.tar.gz (36.8 kB)

Built Distribution: cachly-0.1.0b1-py3-none-any.whl (30.2 kB)

File details: cachly-0.1.0b1.tar.gz

  • Size: 36.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

Hashes for cachly-0.1.0b1.tar.gz:
  • SHA256: be794c460a35f680037f0447af91c2d310e57ffe2d164bc982063912719b1c74
  • MD5: 4fbe7aa33525a5634b1a15d00345ca7c
  • BLAKE2b-256: ef3e1c2a1933201b5aca2bbdb6ce3eb44a5c853298eb0366ba8646bc76ab814a

File details: cachly-0.1.0b1-py3-none-any.whl

  • Size: 30.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

Hashes for cachly-0.1.0b1-py3-none-any.whl:
  • SHA256: 3c6b99dbf6a4a8cbe0ebdebc999bbf036933f1354fa50d497bf66d9fe91a0cab
  • MD5: 6a266bbb9ea2c1ebfa086a0a9038d9aa
  • BLAKE2b-256: f8b8954013ee52846dd1433f5017800edb9a04e9262cd71bbea2391e7af614aa
