
LLMCache is an open-source caching solution designed to operate seamlessly within your cloud infrastructure.


Completely Open-Source! Entirely Free! Fully Private!


LLMCache is an open-source caching solution designed to operate seamlessly within your cloud infrastructure. It offers custom database integrations and goes beyond simple caching, with robust features such as automatic retries for enhanced reliability, helping developers manage data-intensive operations efficiently.

⭐ Features

  • Customizable Database Integrations: Out-of-the-box support for Firestore and local cache, with the flexibility for custom database integrations to meet your specific needs.
  • Efficient Caching: Optimizes your application by caching function responses, significantly reducing call times and improving responsiveness.
  • Retry Logic with Backoff Intervals: Enhances reliability through robust retry mechanisms, including configurable backoff intervals to handle failures gracefully.
  • Reduced Latency: Minimizes delays in data retrieval, ensuring your application runs smoothly and efficiently.
  • Scalable and Cloud-Ready: Designed to seamlessly integrate with your cloud infrastructure, making it ideal for scaling applications.
  • Open Source: Provides the freedom to modify, extend, and tailor the solution to your project's requirements, backed by a community-driven support system.
  • Simplified Data Handling: Streamlines the process of storing and retrieving data, allowing for more focused development on core functionalities.

For feature requests, please reach out to us via this form.

🗄️ Supported Database Integrations

Cache Integration Type | Status
-----------------------|----------------
Local Cache            | Completed ✅
Firestore              | Completed ✅
MongoDB                | In Progress 🚧
Redis                  | In Progress 🚧

⚙️ Installation

Install via pip with just one command:

pip install llm_cache

🎉 Usage

LLMCache simplifies integrating efficient caching mechanisms into your applications, supporting both local and cloud-based environments. Below are examples of using LLMCache with a local cache and with Firestore, including support for streaming responses.

Local Cache

For applications requiring rapid access without external dependencies, LLMCache offers a local caching solution:

from db_integrations.local_cache import LocalCache
from llm_cache import LLMCache
import openai

def call_openai(model,
                openai_messages,
                temperature,
                timeout=50):
  client = openai.Client()
  completion = client.chat.completions.create(model=model,
                                              messages=openai_messages,
                                              temperature=temperature,
                                              timeout=timeout)
  return completion.choices[0].message.content

# Initialize the local cache
cache_file_path = "test_cache.json"
local_cache = LocalCache(file_path=cache_file_path)
llm_cache = LLMCache(local_cache)

# Example function call using the local cache
res = llm_cache.call(func=call_openai,
                     model="gpt-4",
                     openai_messages=[{"content": "Hello, how are you?", "role": "user"}],
                     temperature=0.8,
                     exclude_cache_params=["timeout"])
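
Because "timeout" is listed in exclude_cache_params, it is not used when building the cache key. A repeat call with the same cache-relevant arguments should therefore be served from the local cache instead of calling the API again. The snippet below is a sketch of that expected behaviour, reusing only names already defined above:

# Same model, messages, and temperature as above, so this call should hit the
# cache; only the excluded "timeout" parameter differs.
res_cached = llm_cache.call(func=call_openai,
                            model="gpt-4",
                            openai_messages=[{"content": "Hello, how are you?", "role": "user"}],
                            temperature=0.8,
                            timeout=10,
                            exclude_cache_params=["timeout"])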

Firestore Cache

For a shared, cloud-based cache that persists beyond a single machine, LLMCache integrates with Firestore:

from db_integrations.firestore_cache import FirestoreCache
from llm_cache import LLMCache
import openai

def call_openai(model,
                openai_messages,
                temperature,
                timeout=50):
  client = openai.Client()
  completion = client.chat.completions.create(model=model,
                                              messages=openai_messages,
                                              temperature=temperature,
                                              timeout=timeout)
  return completion.choices[0].message.content

# Configure Firestore cache
collection_name = "test_cache"
firestore_service_account_file = "firestore_key.json"
firestore_cache = FirestoreCache(collection_name=collection_name,
                                 firestore_service_account_file=firestore_service_account_file)
llm_cache = LLMCache(firestore_cache)

# Utilizing Firestore for caching
res = llm_cache.call(func=call_openai,
                     model="gpt-4",
                     openai_messages=[{"content": "Hello, how are you?", "role": "user"}],
                     temperature=0.8,
                     timeout=40,
                     exclude_cache_params=["timeout"],  # timeout is not part of the cache key
                     num_retries_call=2,                # retry a failed call up to 2 times
                     backoff_intervals_call=[5, 10])    # wait 5s, then 10s, between retries

Note on Firestore Cache: When using Firestore as your caching solution, it's recommended to set a Time-To-Live (TTL) policy for your cache entries. This ensures that your database does not grow indefinitely with stale data, which can lead to increased costs and degraded performance. A TTL policy lets Firestore automatically delete entries after a specified duration, keeping your database optimized and your costs in check.
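
For example, assuming your cache documents contain a timestamp field (the field name expire_at below is only an illustration; use whichever timestamp field your cache entries actually carry), a TTL policy on the test_cache collection can be enabled with the gcloud CLI:

gcloud firestore fields ttls update expire_at --collection-group=test_cache --enable-ttl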

Streaming Support

LLMCache also supports streaming responses for real-time data handling, enhancing applications that require continuous data flow:

def openai_stream_call(model,
                       openai_messages,
                       temperature,
                       timeout=120):
  client = openai.Client()

  streaming_response = client.chat.completions.create(model=model,
                                                      messages=openai_messages,
                                                      temperature=temperature,
                                                      stream=True,
                                                      timeout=timeout)
  # Return the stream so it can be consumed by llm_cache.stream_call
  return streaming_response

# Streaming call demonstration with Firestore cache
streaming_messages = [
    {"role": "system", "content": "The following is a conversation with an AI assistant."},
    {"role": "user", "content": "Hello, how are you today?"}
]

streaming_generator = llm_cache.stream_call(
    func=openai_stream_call,
    model="gpt-4",
    openai_messages=streaming_messages,
    temperature=0.8)

# Iterate over and print streaming responses
for chunk in streaming_generator:
    print(chunk.decode("utf-8"))

❓ Frequently Asked Questions

What types of applications can benefit from LLMCache?

LLMCache is particularly beneficial for applications that frequently interact with large language models (LLMs) or perform data-intensive operations. It's ideal for improving response times and reducing API costs in AI-powered apps, web services, and data analysis tools.

Can LLMCache be integrated with any database?

While LLMCache currently has built-in support for Firestore and local caching, it is designed to allow custom database integrations. Developers can extend LLMCache to work with databases such as MongoDB, Redis, or others, depending on their project requirements.
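
As a rough sketch of what such an integration could look like (the method names below are placeholders, not LLMCache's actual required interface; match them to whatever its db_integrations base class expects), a custom backend is essentially an object LLMCache can read cached responses from and write new ones to. Here is a hypothetical Redis-backed adapter:

import json
import redis  # hypothetical Redis-backed integration

class RedisCache:
  """Illustrative cache adapter; adapt the method names to LLMCache's interface."""

  def __init__(self, host="localhost", port=6379):
    self.client = redis.Redis(host=host, port=port)

  def get(self, key):
    # Return the cached value for key, or None on a cache miss
    value = self.client.get(key)
    return json.loads(value) if value is not None else None

  def set(self, key, value):
    # Store a JSON-serializable value under key
    self.client.set(key, json.dumps(value))

# llm_cache = LLMCache(RedisCache())  # assuming LLMCache accepts any object
#                                     # implementing its expected cache interface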

Is LLMCache open-source?

Yes, LLMCache is an open-source project. This means you can freely use, modify, and distribute it under its licensing terms. It also allows the community to contribute to its development.

Can LLMCache improve the performance of my application?

Absolutely. By caching responses, LLMCache can significantly reduce the time it takes for your application to respond to user requests. This results in faster performance, especially for operations that involve heavy computational work or external API calls.

👥 Contributions

We welcome contributions from everyone who is looking to improve or add value to this project! Whether you're interested in fixing bugs, adding new features, or improving documentation, your help is appreciated.

If you would like to contribute, please fill out this form with your details and how you'd like to help. We'll review your submission and get back to you as soon as possible with the next steps.

Thank you for considering contributing, and we look forward to collaborating with you!

🔔 Subscribe to updates

Stay updated with YOL by subscribing through this form.

⚖️ License

For detailed licensing information, please refer to the LICENSE file.

