Replaying cache for LangChain

This is a replay cache implementation for LangChain. It is useful when you issue the same LLM query several times and do not want the cache to hand back the same single answer every time.

replay_cache wraps any LangChain cache and keys each cached value not only by prompt & LLM configuration but also by a sequential call number. On the first run it therefore only fills the cache: every call queries the model.

When replay_cache is used again over the same cache, it returns results from the cache in the order in which they were obtained in the previous run (for the same prompts and the same models). Each prompt & LLM configuration pair remembers its own order of responses. If a query is made more times than in the previous run, the model is queried again and the new result is stored in the cache.
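
For intuition, here is a minimal sketch of how such a wrapper could key an underlying cache by call order, written against LangChain's BaseCache interface (lookup/update/clear). The class and attribute names are illustrative only and are not the actual internals of replay_cache.

from collections import defaultdict
from typing import Any, Optional

from langchain_core.caches import BaseCache


class ReplayingCache(BaseCache):
    """Sketch: wrap another cache and add a per-prompt sequence number."""

    def __init__(self, underlying: BaseCache) -> None:
        self.underlying = underlying
        # How many calls have been served so far for each (prompt, llm_string).
        self.counters = defaultdict(int)

    def _key(self, prompt: str, llm_string: str) -> str:
        # Fold the current sequence number into the llm_string part of the key.
        return f"{llm_string}#replay={self.counters[(prompt, llm_string)]}"

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        result = self.underlying.lookup(prompt, self._key(prompt, llm_string))
        if result is not None:
            # Cache hit for this position in the sequence: advance the counter
            # so the next identical call gets the next recorded answer.
            self.counters[(prompt, llm_string)] += 1
        return result

    def update(self, prompt: str, llm_string: str, return_val: Any) -> None:
        # Cache miss: the model was queried; store the fresh answer under the
        # current sequence number, then advance the counter.
        self.underlying.update(prompt, self._key(prompt, llm_string), return_val)
        self.counters[(prompt, llm_string)] += 1

    def clear(self, **kwargs: Any) -> None:
        self.underlying.clear(**kwargs)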

Installation

$ pip install replay_cache

Examples

Without replay cache

from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache

cache = InMemoryCache()
set_llm_cache(cache)  # llm below is any LangChain LLM / chat model

llm.invoke("How are you?")  # Answer A (model queried)
llm.invoke("How are you?")  # Answer A (taken from cache)
llm.invoke("How are you?")  # Answer A (taken from cache)

With replay cache

from langchain_core.caches import InMemoryCache

from replay_cache import replay_cache

cache = InMemoryCache()

with replay_cache(cache):
    llm.invoke("How are you?")  # Answer A (model queried)
    llm.invoke("How are you?")  # Answer B (model queried)
    llm.invoke("How are you?")  # Answer C (model queried)

with replay_cache(cache):
    llm.invoke("How are you?")  # Answer A (taken from cache)
    llm.invoke("How are you?")  # Answer B (taken from cache)
    llm.invoke("How are you?")  # Answer C (taken from cache)
    llm.invoke("How are you?")  # Answer D (model queried)

with replay_cache(cache):
    llm.invoke("How are you?")  # Answer A (taken from cache)
    llm.invoke("How are you?")  # Answer B (taken from cache)

Each prompt & LLM configuration keeps its own order

from replay_cache import replay_cache

cache = InMemoryCache()

with replay_cache(cache):
    llm.invoke("How are you?")        # Answer A (model queried)    
    llm.invoke("How are you?")        # Answer B (model queried)
    llm.invoke("What is your name?")  # Answer C (model queried)
    llm.invoke("What is your name?")  # Answer D (model queried)

with replay_cache(cache):
    llm.invoke("How are you?")        # Answer A (taken from cache)    
    llm.invoke("What is your name?")  # Answer C (taken from cache)
    llm.invoke("How are you?")        # Answer B (taken from cache)
    llm.invoke("What is your name?")  # Answer D (taken from cache)
