Framework-agnostic multi-backend storage abstraction for ML and scientific computing
polystore
Features
- Pluggable Backends: Disk, memory, Zarr, and streaming backends with auto-registration
- Multi-Framework I/O: Seamless support for NumPy, PyTorch, JAX, TensorFlow, CuPy
- Atomic Operations: Cross-platform atomic file writes with automatic locking
- Batch Operations: Efficient batch loading and saving
- Format Detection: Automatic format detection and routing
- Type-Safe: Full type hints and mypy support
- Minimal Dependencies: Core requires only NumPy (framework support is optional)
Quick Start
from polystore import FileManager, BackendRegistry
# Create registry and file manager
registry = BackendRegistry()
fm = FileManager(registry)
# Save data to disk
import numpy as np
data = np.array([[1, 2], [3, 4]])
fm.save(data, "output.npy", backend="disk")
# Load data back
loaded = fm.load("output.npy", backend="disk")
# Use memory backend for testing
fm.save(data, "test.npy", backend="memory")
cached = fm.load("test.npy", backend="memory")
Installation
# Base installation (NumPy only)
pip install polystore
# With specific frameworks
pip install polystore[zarr]
pip install polystore[torch]
pip install polystore[jax]
pip install polystore[tensorflow]
pip install polystore[cupy]
# With streaming support
pip install polystore[streaming]
# With all optional dependencies
pip install polystore[all]
Supported Backends
| Backend | Description | Storage | Dependencies |
|---|---|---|---|
| disk | Local filesystem | Persistent | None |
| memory | In-memory cache | Volatile | None |
| zarr | Zarr/OME-Zarr arrays | Persistent | zarr, ome-zarr |
| streaming | ZeroMQ streaming | None | pyzmq |
Supported Formats
| Format | Extensions | Frameworks |
|---|---|---|
| NumPy | .npy, .npz | NumPy, PyTorch, JAX, TensorFlow, CuPy |
| TIFF | .tif, .tiff | NumPy, PyTorch, JAX, TensorFlow, CuPy |
| Zarr | .zarr | NumPy, PyTorch, JAX, TensorFlow, CuPy |
| PyTorch | .pt, .pth | PyTorch |
| CSV | .csv | NumPy, pandas |
| JSON | .json | Python dicts |
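The extension column above suggests how routing by format could work. A minimal sketch, assuming a plain dictionary lookup on the file suffix; `EXTENSION_TO_FORMAT` and `detect_format` are hypothetical names used for illustration and are not part of polystore's API:

```python
from pathlib import Path

# Hypothetical lookup table built from the extensions listed above.
EXTENSION_TO_FORMAT = {
    ".npy": "numpy",
    ".npz": "numpy",
    ".tif": "tiff",
    ".tiff": "tiff",
    ".zarr": "zarr",
    ".pt": "pytorch",
    ".pth": "pytorch",
    ".csv": "csv",
    ".json": "json",
}


def detect_format(file_path):
    """Return a format name for a path, based on its case-insensitive suffix."""
    suffix = Path(file_path).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
```

A lookup table keeps the mapping in one place, so supporting a new extension would be a one-line change in this sketch.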
Architecture
polystore/
├── base.py # Abstract interfaces (DataSink, DataSource, StorageBackend)
├── backend_registry.py # Auto-registration system
├── disk.py # Disk storage backend
├── memory.py # In-memory backend
├── zarr.py # Zarr backend
├── streaming.py # ZeroMQ streaming backend
├── filemanager.py # High-level API
├── atomic.py # Atomic file operations
└── exceptions.py # Custom exceptions
Advanced Usage
Custom Backends
from polystore import StorageBackend
class MyBackend(StorageBackend):
    _backend_type = 'my_backend'  # Auto-registers
    def save(self, data, file_path, **kwargs):
        # Your save logic
        pass
    def load(self, file_path, **kwargs):
        # Your load logic
        pass
Batch Operations
# Save multiple files
data_list = [np.random.rand(100, 100) for _ in range(10)]
paths = [f"image_{i}.npy" for i in range(10)]
fm.save_batch(data_list, paths, backend="disk")
# Load multiple files
loaded_list = fm.load_batch(paths, backend="disk")
Atomic Writes
from polystore import atomic_write, atomic_write_json
# Atomic file write with automatic locking
with atomic_write("output.txt") as f:
    f.write("data")
# Atomic JSON write
atomic_write_json({"key": "value"}, "config.json")
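Atomic writes are commonly implemented by writing to a temporary file in the target directory and then renaming it over the destination. The sketch below shows that pattern using only the standard library. `sketch_atomic_write` is a hypothetical helper; it omits locking and is not polystore's implementation:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def sketch_atomic_write(path, mode="w"):
    """Write to a temporary file, then replace `path` only on success."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, mode) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())
        # Atomic rename on POSIX; also overwrites an existing file on Windows.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave any existing file untouched and clean up the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the body of the `with` block raises, the destination keeps its previous contents, so readers never observe a half-written file.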
Why polystore?
Before (Manual backend management):
if backend == 'disk':
    np.save(path, data)
elif backend == 'memory':
    cache[path] = data
elif backend == 'zarr':
    zarr.save(path, data)
# ... 50 more lines of if/elif ...
After (polystore):
fm.save(data, path, backend=backend)
Documentation
Full documentation is available at polystore.readthedocs.io.
Addons
Extend polystore with additional backends:
- polystore-napari: Napari viewer streaming backend
- polystore-fiji: Fiji/ImageJ streaming backend
- polystore-omero: OMERO server backend
Performance
- Zero-copy conversions between frameworks via DLPack (when possible)
- Lazy loading for optional dependencies
- Batch operations for efficient I/O
- Atomic writes with minimal overhead
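Lazy loading of optional dependencies usually means deferring an import until the feature that needs it is first used, so the base install works without the extras. A minimal sketch of that pattern, assuming hypothetical helper names (`is_available`, `optional_import`) that are not part of polystore:

```python
import importlib
import importlib.util


def is_available(module_name):
    """Check whether a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def optional_import(module_name, extra):
    """Import a module on demand, or raise ImportError with an install hint."""
    if not is_available(module_name):
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install polystore[{extra}]"
        )
    return importlib.import_module(module_name)


# The standard-library json module stands in for an optional framework here.
json_module = optional_import("json", extra="all")
```

Calling `optional_import` inside the function that needs the framework, rather than at module import time, keeps startup cost low and turns a missing extra into a clear error message.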
License
MIT License - see LICENSE file for details
Contributing
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
Credits
Developed by Tristan Simas.