LLMCache is an open-source caching solution designed to operate seamlessly within your cloud infrastructure.

Completely Open-Source! Entirely Free! Fully Private!

LLMCache is an open-source caching solution that runs inside your own cloud infrastructure. Beyond caching, it supports custom database integrations and automatic retries with configurable backoff, helping developers manage data-intensive operations reliably.

⭐ Features

  • Customizable Database Integrations: Out-of-the-box support for Firestore and local cache, with the flexibility for custom database integrations to meet your specific needs.
  • Efficient Caching: Caches function responses so that repeated calls return quickly, reducing call times and improving responsiveness.
  • Retry Logic with Backoff Intervals: Enhances reliability through robust retry mechanisms, including configurable backoff intervals to handle failures gracefully.
  • Reduced Latency: Minimizes delays in data retrieval, ensuring your application runs smoothly and efficiently.
  • Scalable and Cloud-Ready: Designed to seamlessly integrate with your cloud infrastructure, making it ideal for scaling applications.
  • Open Source: Provides the freedom to modify, extend, and tailor the solution to your project's requirements, backed by a community-driven support system.
  • Simplified Data Handling: Streamlines the process of storing and retrieving data, allowing for more focused development on core functionalities.

For feature requests, please reach out to us via this form.

🗄️ Supported Database Integrations

Cache Integration Type    Status
Local Cache               Completed ✅
Firestore                 Completed ✅
MongoDB                   In Progress 🚧
Redis                     In Progress 🚧

⚙️ Installation

Install via pip with just one command:

pip install llm_cache

🎉 Usage

LLMCache simplifies integrating efficient caching mechanisms into your applications, supporting both local and cloud-based environments. Below are demonstrations of how LLMCache can be utilized with local cache and Firestore, including support for streaming data.

Local Cache

For applications requiring rapid access without external dependencies, LLMCache offers a local caching solution:

from db_integrations.local_cache import LocalCache
from llm_cache import LLMCache
import openai

def call_openai(model,
                openai_messages,
                temperature,
                timeout=50):
  client = openai.Client()
  completion = client.chat.completions.create(model=model,
                                              messages=openai_messages,
                                              temperature=temperature,
                                              timeout=timeout)
  return completion.choices[0].message.content

# Initialize the local cache
cache_file_path = "test_cache.json"
local_cache = LocalCache(file_path=cache_file_path)
llm_cache = LLMCache(local_cache)

# Example function call using the local cache
res = llm_cache.call(func=call_openai,
                     model="gpt-4",
                     openai_messages=[{"content": "Hello, how are you?", "role": "user"}],
                     temperature=0.8,
                     exclude_cache_params=["timeout"])
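
The `exclude_cache_params` argument suggests that some parameters (here `timeout`) are left out when deciding whether two calls are the same. How LLMCache builds its keys internally is not documented here, so the following is only a minimal sketch of the general idea; `make_cache_key` is a hypothetical helper, not part of the library.

```python
import hashlib
import json


def make_cache_key(func_name, kwargs, exclude_cache_params=()):
    """Build a deterministic key from call arguments, skipping excluded ones.

    Hypothetical helper for illustration; not LLMCache's actual key scheme.
    """
    # Drop parameters that should not influence cache lookups (e.g. timeout).
    relevant = {k: v for k, v in kwargs.items() if k not in exclude_cache_params}
    # sort_keys makes the serialization independent of argument order.
    payload = json.dumps({"func": func_name, "args": relevant}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = make_cache_key("call_openai",
                       {"model": "gpt-4", "temperature": 0.8, "timeout": 40},
                       exclude_cache_params=["timeout"])
key_b = make_cache_key("call_openai",
                       {"model": "gpt-4", "temperature": 0.8, "timeout": 50},
                       exclude_cache_params=["timeout"])
print(key_a == key_b)  # True: the two calls differ only in the excluded timeout
```

Under this scheme, changing the timeout between calls would still hit the same cache entry, while changing the model, messages, or temperature would not.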

Firestore Cache

from db_integrations.firestore_cache import FirestoreCache
from llm_cache import LLMCache
import openai

def call_openai(model,
                openai_messages,
                temperature,
                timeout=50):
  client = openai.Client()
  completion = client.chat.completions.create(model=model,
                                              messages=openai_messages,
                                              temperature=temperature,
                                              timeout=timeout)
  return completion.choices[0].message.content

# Configure Firestore cache
collection_name = "test_cache"
firestore_service_account_file = "firestore_key.json"
firestore_cache = FirestoreCache(collection_name=collection_name,
                                 firestore_service_account_file=firestore_service_account_file)
llm_cache = LLMCache(firestore_cache)

# Utilizing Firestore for caching
res = llm_cache.call(func=call_openai,
                     model="gpt-4",
                     openai_messages=[{"content": "Hello, how are you?", "role": "user"}],
                     temperature=0.8,
                     timeout=40,
                     exclude_cache_params=["timeout"],
                     num_retries_call=2,
                     backoff_intervals_call=[5, 10])
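
The `num_retries_call` and `backoff_intervals_call` arguments above configure retries for the wrapped function. The library's own retry loop is not shown in this README, so the sketch below only illustrates what retrying with a list of backoff intervals generally looks like; `call_with_retries` and its parameters are hypothetical names, and the assumption is that the intervals are waits in seconds between attempts.

```python
import time


def call_with_retries(func, num_retries=2, backoff_intervals=(5, 10),
                      sleep=time.sleep, **kwargs):
    """Call func(**kwargs); on failure, wait and retry up to num_retries times.

    Hypothetical helper for illustration; not LLMCache's actual implementation.
    """
    attempt = 0
    while True:
        try:
            return func(**kwargs)
        except Exception:
            if attempt >= num_retries:
                raise  # Out of retries: surface the last error to the caller.
            if backoff_intervals:
                # Reuse the final interval if fewer intervals than retries exist.
                index = min(attempt, len(backoff_intervals) - 1)
                sleep(backoff_intervals[index])
            attempt += 1


# Demo: a function that fails twice before succeeding.
calls = {"count": 0}


def flaky(value):
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return value


waits = []  # Record the waits instead of actually sleeping.
result = call_with_retries(flaky, num_retries=2, backoff_intervals=[5, 10],
                           sleep=waits.append, value="ok")
print(result, waits)  # ok [5, 10]
```

Passing `sleep` as a parameter keeps the demo instantaneous; in real use the default `time.sleep` would apply.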

Note on Firestore Cache: When using Firestore as your cache, it's recommended to set a Time-To-Live (TTL) policy on your cache entries. Without one, the database grows indefinitely with stale data, which can increase costs and degrade performance. With a TTL policy, Firestore automatically deletes entries after a specified duration, keeping your database lean and your costs in check.
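
Firestore TTL policies work off a timestamp field on each document. Whether `FirestoreCache` writes such a field is not stated here, so the sketch below only shows how an expiry timestamp could be computed for a cache entry; the helper `with_expiry` and the field name `expire_at` are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def with_expiry(entry, ttl_seconds, now=None):
    """Return a copy of a cache entry with an `expire_at` timestamp added.

    Hypothetical helper; `expire_at` is an assumed field name that a
    Firestore TTL policy would need to be configured to watch.
    """
    now = now or datetime.now(timezone.utc)
    stamped = dict(entry)  # Copy so the caller's entry is left untouched.
    stamped["expire_at"] = now + timedelta(seconds=ttl_seconds)
    return stamped


# Mark an entry to expire one week after it is written.
doc = with_expiry({"response": "Hello!"}, ttl_seconds=7 * 24 * 3600)
print(doc["expire_at"].isoformat())
```

A timezone-aware `datetime` is used so that the stored value is unambiguous regardless of where the code runs.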

Streaming Support

LLMCache also supports streaming responses for real-time data handling, enhancing applications that require continuous data flow:

def openai_stream_call(model,
                       openai_messages,
                       temperature,
                       timeout=120):
  client = openai.Client()

  streaming_response = client.chat.completions.create(model=model,
                                                      messages=openai_messages,
                                                      temperature=temperature,
                                                      stream=True,
                                                      timeout=timeout)
  # Hand the stream back to the caller so its chunks can be consumed
  return streaming_response

# Streaming call demonstration with Firestore cache
streaming_messages = [
    {"role": "system", "content": "The following is a conversation with an AI assistant."},
    {"role": "user", "content": "Hello, how are you today?"}
]

streaming_generator = llm_cache.stream_call(
    func=openai_stream_call,
    model="gpt-4",
    openai_messages=streaming_messages,
    temperature=0.8)

# Iterate over and print streaming responses
for chunk in streaming_generator:
    print(chunk.decode("utf-8"))
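
Caching a streamed response is slightly different from caching a single return value, because the chunks arrive over time. How `stream_call` handles this internally is not described here; the following is a minimal sketch of one common approach, with `cached_stream` and the dict-based `store` being hypothetical stand-ins: forward chunks as they arrive, save them once the stream completes, and replay them on later calls.

```python
def cached_stream(key, produce_chunks, store):
    """Yield chunks for `key`, replaying from `store` when already cached.

    Hypothetical helper for illustration; not LLMCache's actual implementation.
    """
    if key in store:
        # Cache hit: replay the saved chunks without calling the producer.
        yield from store[key]
        return
    collected = []
    for chunk in produce_chunks():
        collected.append(chunk)
        yield chunk
    # Save only after the stream finished, so partial streams are never cached.
    store[key] = collected


# Demo with a fake producer that records how often it is invoked.
store = {}
calls = []


def fake_stream():
    calls.append(1)
    yield from (b"Hel", b"lo")


first = b"".join(cached_stream("greeting", fake_stream, store))
second = b"".join(cached_stream("greeting", fake_stream, store))
print(first, second, len(calls))  # b'Hello' b'Hello' 1
```

Writing to the store only after the loop ends means a consumer that stops reading early does not leave a truncated response in the cache.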

❓ Frequently Asked Questions

What types of applications can benefit from LLMCache?

LLMCache is particularly beneficial for applications that frequently interact with large language models (LLMs) or perform data-intensive operations. It's ideal for improving response times and reducing API costs in AI-powered apps, web services, and data analysis tools.

Can LLMCache be integrated with any database?

While LLMCache currently has built-in support for Firestore and local caching, it's designed to allow custom database integrations. Developers can extend LLMCache to work with databases like MongoDB and Redis, as well as others according to their project requirements.
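
The interface that a custom integration must implement is not documented in this README. As a rough illustration only, a storage backend usually comes down to a lookup and a write; the class `InMemoryCache`, its `get`/`set` method names, and the `cached_call` helper below are all assumptions, not LLMCache's actual API.

```python
class InMemoryCache:
    """Hypothetical backend that stores entries in a dict.

    The get/set method names are assumptions for illustration.
    """

    def __init__(self):
        self._entries = {}

    def get(self, key):
        # Return the cached value, or None on a miss.
        return self._entries.get(key)

    def set(self, key, value):
        self._entries[key] = value


def cached_call(cache, key, func, **kwargs):
    """Return the cached result for key, computing and storing it on a miss."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = func(**kwargs)
    cache.set(key, result)
    return result


cache = InMemoryCache()
invocations = []


def slow_square(x):
    invocations.append(x)
    return x * x


print(cached_call(cache, "square:4", slow_square, x=4))  # 16 (computed)
print(cached_call(cache, "square:4", slow_square, x=4))  # 16 (from cache)
print(len(invocations))  # 1
```

A MongoDB or Redis backend would follow the same shape, replacing the dict with calls to the respective client.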

Is LLMCache open-source?

Yes, LLMCache is an open-source project. This means you can freely use, modify, and distribute it under its licensing terms. It also allows the community to contribute to its development.

Can LLMCache improve the performance of my application?

Absolutely. By caching responses, LLMCache can significantly reduce the time it takes for your application to respond to user requests. This results in faster performance, especially for operations that involve heavy computational work or external API calls.

👥 Contributions

We welcome contributions from everyone who is looking to improve or add value to this project! Whether you're interested in fixing bugs, adding new features, or improving documentation, your help is appreciated.

If you would like to contribute, please fill out this form with your details and how you'd like to help. We'll review your submission and get back to you as soon as possible with the next steps.

Thank you for considering contributing. We look forward to collaborating with you!

🔔 Subscribe to updates

Stay updated with YOL by subscribing through this form

⚖️ License

For detailed licensing information, please refer to the LICENSE file.
