collections-cache is a Python package for managing data collections across multiple SQLite databases. It allows efficient storage, retrieval, and updating of key-value pairs, supporting various data types serialized with pickle. The package uses parallel processing for fast access and manipulation of large collections.
collections-cache 🚀
collections-cache is a fast and scalable key–value caching solution built on top of SQLite. It allows you to store, update, and retrieve data using unique keys, and it supports complex Python data types (thanks to pickle). Designed to harness the power of multiple CPU cores, the library shards data across multiple SQLite databases, enabling impressive performance scaling.
Features ✨
- Multiple SQLite Databases: Distributes your data across several databases to optimize I/O and take advantage of multi-core systems.
- Key–Value Store: Simple and intuitive interface for storing and retrieving data.
- Supports Complex Data Types: Serialize and store lists, dictionaries, objects, and more using pickle.
- Parallel Processing: Uses Python’s multiprocessing and concurrent.futures modules to perform operations in parallel.
- Efficient Data Retrieval: Caches all keys in memory for super-fast lookups.
- Cross-Platform: Runs on Linux, macOS, and Windows.
- Performance Scaling: Benchmarks show near-linear scaling with the number of real CPU cores.
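To make the sharding idea concrete, here is a minimal sketch of how a key could be routed to one of several SQLite files. It is an illustration under stated assumptions, not the library's actual implementation: the names `shard_for`, `open_shards`, `put`, `get`, the `data` table layout, and the shard count are all hypothetical.

```python
import hashlib
import os
import pickle
import sqlite3
import tempfile

NUM_SHARDS = 4  # hypothetical; for example, one database per CPU core


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def open_shards(directory: str, num_shards: int = NUM_SHARDS) -> list:
    """Open one SQLite file per shard, each holding a simple key/value table."""
    conns = []
    for i in range(num_shards):
        conn = sqlite3.connect(os.path.join(directory, f"shard_{i}.db"))
        conn.execute(
            "CREATE TABLE IF NOT EXISTS data (key TEXT PRIMARY KEY, value BLOB)"
        )
        conns.append(conn)
    return conns


def put(conns, key, value):
    """Pickle the value and write it to the shard that owns the key."""
    conn = conns[shard_for(key, len(conns))]
    with conn:  # commits the transaction on success
        conn.execute(
            "INSERT OR REPLACE INTO data (key, value) VALUES (?, ?)",
            (key, pickle.dumps(value)),
        )


def get(conns, key):
    """Read from the owning shard; returns None when the key is absent."""
    conn = conns[shard_for(key, len(conns))]
    row = conn.execute("SELECT value FROM data WHERE key = ?", (key,)).fetchone()
    return pickle.loads(row[0]) if row else None


workdir = tempfile.mkdtemp()
shards = open_shards(workdir)
put(shards, "products", ["apple", "orange", "onion"])
print(get(shards, "products"))  # ['apple', 'orange', 'onion']
```

Because the same key always hashes to the same shard, a read only touches one database file, and writes to different shards do not contend for the same SQLite lock.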
Installation 📦
Use Poetry to install and manage dependencies:
- Clone the repository:
  git clone https://github.com/Luiz-Trindade/collections_cache.git
  cd collections_cache
- Install the package with Poetry:
  poetry install
Usage ⚙️
Simply import and start using the main class, Collection_Cache, to interact with your collection:
Basic Example
from collections_cache import Collection_Cache
# Create a new collection named "STORE"
cache = Collection_Cache("STORE")
# Set a key-value pair
cache.set_key("products", ["apple", "orange", "onion"])
# Retrieve the value by key
products = cache.get_key("products")
print(products) # Output: ['apple', 'orange', 'onion']
Bulk Insertion Example
For faster insertions, accumulate your data and use set_multi_keys:
from collections_cache import Collection_Cache
from random import uniform, randint
from time import time
cache = Collection_Cache("web_cache")
insertions = 100_000
data = {}
# Generate data
for i in range(insertions):
    key = str(uniform(0.0, 100.0))
    value = "some text :)" * randint(1, 100)
    data[key] = value
# Bulk insert keys using multi-threaded execution
cache.set_multi_keys(data)
print(f"Inserted {len(cache.keys())} keys successfully!")
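One plausible way a bulk insert can run in parallel is to group the keys by shard and write each group in its own worker. The sketch below assumes a thread pool from concurrent.futures and one SQLite file per shard; `shard_index`, `write_shard`, and `bulk_insert` are hypothetical names, and the library's own set_multi_keys may work differently.

```python
import os
import pickle
import sqlite3
import tempfile
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical shard count


def shard_index(key: str) -> int:
    # crc32 is stable across runs, unlike the built-in hash() for strings
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def write_shard(path: str, items: list) -> int:
    """Write one shard's items in a single transaction; returns rows written."""
    conn = sqlite3.connect(path)  # opened inside the worker thread that uses it
    try:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS data (key TEXT PRIMARY KEY, value BLOB)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO data (key, value) VALUES (?, ?)", items
            )
        return len(items)
    finally:
        conn.close()


def bulk_insert(directory: str, data: dict) -> int:
    """Group items by shard, then write every shard concurrently."""
    groups = defaultdict(list)
    for key, value in data.items():
        groups[shard_index(key)].append((key, pickle.dumps(value)))
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        futures = [
            pool.submit(write_shard, os.path.join(directory, f"shard_{i}.db"), items)
            for i, items in groups.items()
        ]
        return sum(f.result() for f in futures)


workdir = tempfile.mkdtemp()
data = {f"key-{i}": "some text :)" * (i % 5 + 1) for i in range(1_000)}
written = bulk_insert(workdir, data)
print(written)  # 1000
```

Each worker owns its connection and its file, so no SQLite connection is shared between threads, and batching a shard's rows into one transaction avoids a commit per key.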
API Overview 📚
- set_key(key, value): Stores a key–value pair. Updates the value if the key already exists.
- set_multi_keys(key_and_value): (Experimental) Inserts multiple key–value pairs in parallel.
- get_key(key): Retrieves the value associated with a given key.
- delete_key(key): Removes a key and its corresponding value.
- keys(): Returns a list of all stored keys.
- export_to_json(): (Future feature) Exports your collection to a JSON file.
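The semantics listed above (update on an existing key, delete, and an in-memory key list) can be illustrated with a small single-database stand-in. `TinyStore` is a hypothetical class written for this example, not the library's Collection_Cache; in particular, returning None for a missing key is an assumption.

```python
import pickle
import sqlite3


class TinyStore:
    """Hypothetical single-database stand-in; not the library's Collection_Cache."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS data (key TEXT PRIMARY KEY, value BLOB)"
        )
        # Keep the set of keys in memory so membership checks avoid a query
        self._keys = {row[0] for row in self.conn.execute("SELECT key FROM data")}

    def set_key(self, key, value):
        # INSERT OR REPLACE overwrites the row when the key already exists
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO data (key, value) VALUES (?, ?)",
                (key, pickle.dumps(value)),
            )
        self._keys.add(key)

    def get_key(self, key):
        if key not in self._keys:
            return None  # assumption: a missing key yields None
        row = self.conn.execute(
            "SELECT value FROM data WHERE key = ?", (key,)
        ).fetchone()
        return pickle.loads(row[0])

    def delete_key(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM data WHERE key = ?", (key,))
        self._keys.discard(key)

    def keys(self):
        return sorted(self._keys)


store = TinyStore()
store.set_key("products", ["apple"])
store.set_key("products", ["apple", "orange"])  # same key: value is replaced
print(store.get_key("products"))  # ['apple', 'orange']
store.delete_key("products")
print(store.keys())  # []
```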
Performance Benchmark 📊
On a machine with 4 real CPU cores, benchmarks indicate around 781 insertions per second. The library is designed to scale nearly linearly with the number of real cores. Extrapolating linearly from that 4-core baseline gives the following projections:
- 6 cores: ~1,171 insertions per second.
- 16 cores: ~3,125 insertions per second.
- 128 cores: ~25,000 insertions per second (theoretically).
Note: Actual performance will depend on disk I/O, SQLite contention, and system architecture.
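The figures above follow from simple proportional scaling of the 4-core baseline. A short sketch of that arithmetic, assuming perfectly linear scaling (the function name `projected_rate` is hypothetical):

```python
BASELINE_CORES = 4
BASELINE_RATE = 781  # insertions per second on 4 cores, as reported above


def projected_rate(cores: int) -> float:
    """Project throughput assuming perfectly linear scaling with core count."""
    return BASELINE_RATE / BASELINE_CORES * cores


for cores in (6, 16, 128):
    print(f"{cores} cores: ~{projected_rate(cores):,.1f} insertions per second")
```

The results land within a fraction of a percent of the listed values, which appear to be rounded from a slightly higher baseline. They are upper bounds rather than measurements, since disk I/O and SQLite contention do not scale with core count.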
Development & Contributing 👩‍💻👨‍💻
To contribute or run tests:
- Install development dependencies:
  poetry install --dev
- Run tests using:
  poetry run pytest
Feel free to submit issues, pull requests, or feature suggestions. Your contributions help make collections-cache even better!
License 📄
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgements 🙌
- Inspired by the need for efficient, multi-core caching with SQLite.
- Created by Luiz Trindade.
- Thanks to all contributors and users who provide feedback to keep improving the library!
Give collections-cache a try and let it power your high-performance caching needs! 🚀