
SemCache (Python SDK)

A tiered semantic caching SDK for LLM applications. Embeddings are computed locally, so a cache hit incurs no API cost and returns with low latency.
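
To illustrate what "tiered" lookup could mean, here is a minimal, standard-library sketch. It is not SemCache's implementation: the tier names (exact, fuzzy, semantic), the helper functions, and the bag-of-words `toy_embedding` that stands in for a real local embedding model are all assumptions made for illustration. Only the two threshold names and their default values are taken from the Quick Start.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def fuzzy_score(a, b):
    """Character-level similarity in [0, 1]; tolerant of typos."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def toy_embedding(text):
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def semantic_score(a, b):
    """Cosine similarity between the two toy embeddings."""
    va, vb = toy_embedding(a), toy_embedding(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def tiered_lookup(query, entries, fuzzy_threshold=0.95, semantic_threshold=0.85):
    """Return (response, tier, similarity) from the first tier that matches.

    `entries` maps each stored query to its cached response.
    Returns None when no tier reaches its threshold.
    """
    # Tier 1 (assumed): exact match on normalized text, the cheapest check.
    for stored, response in entries.items():
        if normalize(stored) == normalize(query):
            return response, "exact", 1.0

    # Tiers 2 and 3 (assumed): best-scoring entry, cheapest scorer first.
    for tier, score, threshold in (
        ("fuzzy", fuzzy_score, fuzzy_threshold),
        ("semantic", semantic_score, semantic_threshold),
    ):
        best = max(entries, key=lambda stored: score(query, stored), default=None)
        if best is not None:
            similarity = score(query, best)
            if similarity >= threshold:
                return entries[best], tier, similarity
    return None
```

Because the toy embedding only counts shared words, a paraphrase such as the one in the Quick Start scores well below 0.85 here. Recognizing paraphrases is what a real embedding model adds to the last tier.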

Installation

pip install vanshulgoyal101-semcache

Quick Start

from semcache import SemCache, FileStore

# Initialize cache with local file storage
cache = SemCache(
    store=FileStore(".semcache/db.json"),
    fuzzy_threshold=0.95,
    semantic_threshold=0.85
)

# Store an entry (latency_ms simulates an LLM response time of 1500 ms)
cache.set(
    query="What is the capital of France?",
    response="The capital of France is Paris.",
    latency_ms=1500,
    token_usage={"prompt_tokens": 7, "completion_tokens": 7}
)

# Look up a paraphrased query (a hit returns in ~20 ms at $0 API cost)
result = cache.get("tell me the capital city of France")
if result:
    entry, tier, similarity = result
    print(f"Hit Tier: {tier} (Similarity: {similarity:.4f})")
    print(f"Response: {entry.response}")
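
In an application, the `get` and `set` calls above are typically combined into a read-through pattern: check the cache first, call the model only on a miss, then store the result. The sketch below assumes only the `get`/`set` signatures shown in the Quick Start. `cached_completion` and `call_llm` are hypothetical names introduced for illustration and are not part of the SDK.

```python
import time


def cached_completion(cache, query, call_llm):
    """Return (response, source), where source is the hit tier or "llm".

    `cache` is any object with get(query) and set(...) as in the Quick Start.
    `call_llm` is a hypothetical callable: query -> (response_text, token_usage).
    """
    hit = cache.get(query)
    if hit:
        entry, tier, _similarity = hit
        return entry.response, tier

    # Cache miss: call the model and time it so the entry records real latency.
    start = time.perf_counter()
    response, token_usage = call_llm(query)
    latency_ms = round((time.perf_counter() - start) * 1000)

    cache.set(
        query=query,
        response=response,
        latency_ms=latency_ms,
        token_usage=token_usage,
    )
    return response, "llm"
```

Returning the source alongside the response makes it easy to log how often each tier is hit.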
