
polystore

Framework-agnostic multi-backend storage abstraction for ML and scientific computing

Python 3.11+ | MIT License

Features

  • Pluggable Backends: Disk, memory, Zarr, and streaming backends with auto-registration
  • Multi-Framework I/O: Seamless support for NumPy, PyTorch, JAX, TensorFlow, CuPy
  • Atomic Operations: Cross-platform atomic file writes with automatic locking
  • Batch Operations: Efficient batch loading and saving
  • Format Detection: Automatic format detection and routing
  • Type-Safe: Full type hints and mypy support
  • Minimal Dependencies: Core requires only NumPy (framework support is optional)

Quick Start

import numpy as np

from polystore import FileManager, BackendRegistry

# Create registry and file manager
registry = BackendRegistry()
fm = FileManager(registry)

# Save data to disk
data = np.array([[1, 2], [3, 4]])
fm.save(data, "output.npy", backend="disk")

# Load data back
loaded = fm.load("output.npy", backend="disk")

# Use the memory backend for testing
fm.save(data, "test.npy", backend="memory")
cached = fm.load("test.npy", backend="memory")

Installation

# Base installation (NumPy only)
pip install polystore

# With specific frameworks
pip install polystore[zarr]
pip install polystore[torch]
pip install polystore[jax]
pip install polystore[tensorflow]
pip install polystore[cupy]

# With streaming support
pip install polystore[streaming]

# With all optional dependencies
pip install polystore[all]

Supported Backends

| Backend   | Description          | Storage    | Dependencies     |
|-----------|----------------------|------------|------------------|
| disk      | Local filesystem     | Persistent | None             |
| memory    | In-memory cache      | Volatile   | None             |
| zarr      | Zarr/OME-Zarr arrays | Persistent | zarr, ome-zarr   |
| streaming | ZeroMQ streaming     | None       | pyzmq            |

Supported Formats

| Format  | Extensions   | Frameworks                             |
|---------|--------------|----------------------------------------|
| NumPy   | .npy, .npz   | NumPy, PyTorch, JAX, TensorFlow, CuPy  |
| TIFF    | .tif, .tiff  | NumPy, PyTorch, JAX, TensorFlow, CuPy  |
| Zarr    | .zarr        | NumPy, PyTorch, JAX, TensorFlow, CuPy  |
| PyTorch | .pt, .pth    | PyTorch                                |
| CSV     | .csv         | NumPy, pandas                          |
| JSON    | .json        | Python dicts                           |
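Format detection of this kind is typically driven by the file extension. The routing table below is a minimal, generic sketch of the idea; the handler names are illustrative and not polystore's actual internals:

```python
from pathlib import Path

# Hypothetical extension-to-format routing table (illustrative only)
FORMAT_HANDLERS = {
    ".npy": "numpy",
    ".npz": "numpy",
    ".tif": "tiff",
    ".tiff": "tiff",
    ".zarr": "zarr",
    ".pt": "torch",
    ".pth": "torch",
    ".csv": "csv",
    ".json": "json",
}

def detect_format(file_path):
    """Route a path to a format name by its (case-insensitive) extension."""
    suffix = Path(file_path).suffix.lower()
    try:
        return FORMAT_HANDLERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported extension: {suffix!r}")
```

With a table like this, adding a new format is a one-line change rather than another branch in an if/elif chain.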

Architecture

polystore/
├── base.py              # Abstract interfaces (DataSink, DataSource, StorageBackend)
├── backend_registry.py  # Auto-registration system
├── disk.py              # Disk storage backend
├── memory.py            # In-memory backend
├── zarr.py              # Zarr backend
├── streaming.py         # ZeroMQ streaming backend
├── filemanager.py       # High-level API
├── atomic.py            # Atomic file operations
└── exceptions.py        # Custom exceptions

Advanced Usage

Custom Backends

from polystore import StorageBackend

class MyBackend(StorageBackend):
    _backend_type = 'my_backend'  # Auto-registers
    
    def save(self, data, file_path, **kwargs):
        # Your save logic
        pass
    
    def load(self, file_path, **kwargs):
        # Your load logic
        pass
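The auto-registration behind `_backend_type` can be pictured with Python's `__init_subclass__` hook: merely defining a subclass adds it to a class-level registry. This is a generic sketch of the pattern under that assumption, not polystore's actual implementation:

```python
class StorageBackend:
    """Base class that auto-registers subclasses by their _backend_type."""
    _registry = {}
    _backend_type = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls._backend_type is not None:
            StorageBackend._registry[cls._backend_type] = cls

class MemoryBackend(StorageBackend):
    _backend_type = "memory"  # defining the class registers it

    def __init__(self):
        self._store = {}

    def save(self, data, file_path, **kwargs):
        self._store[str(file_path)] = data

    def load(self, file_path, **kwargs):
        return self._store[str(file_path)]

# Look up and use the backend by name, with no explicit registration call:
backend = StorageBackend._registry["memory"]()
backend.save([1, 2, 3], "x.npy")
```

The payoff of this design is that a registry and dispatch layer never needs editing when a new backend is added.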

Batch Operations

# Save multiple files
data_list = [np.random.rand(100, 100) for _ in range(10)]
paths = [f"image_{i}.npy" for i in range(10)]
fm.save_batch(data_list, paths, backend="disk")

# Load multiple files
loaded_list = fm.load_batch(paths, backend="disk")

Atomic Writes

from polystore import atomic_write, atomic_write_json

# Atomic file write with automatic locking
with atomic_write("output.txt") as f:
    f.write("data")

# Atomic JSON write
atomic_write_json({"key": "value"}, "config.json")
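The standard technique behind atomic writes is to write to a temporary file in the same directory, then swap it into place with `os.replace`, which is atomic on both POSIX and Windows. A minimal sketch of that technique, independent of polystore's own `atomic_write` (and omitting the locking it adds):

```python
import os
import tempfile

def atomic_write_text(path, text):
    """Write text so readers never observe a partially written file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # make sure data reaches disk before the rename
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        os.unlink(tmp_path)
        raise
```

If the process crashes mid-write, the destination file either keeps its old contents or does not exist yet; it is never half-written.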

Why polystore?

Before (Manual backend management):

if backend == 'disk':
    np.save(path, data)
elif backend == 'memory':
    cache[path] = data
elif backend == 'zarr':
    zarr.save(path, data)
# ... 50 more lines of if/elif ...

After (polystore):

fm.save(data, path, backend=backend)

Documentation

Full documentation is available at polystore.readthedocs.io

Addons

Extend polystore with additional backends:

  • polystore-napari: Napari viewer streaming backend
  • polystore-fiji: Fiji/ImageJ streaming backend
  • polystore-omero: OMERO server backend

Performance

  • Zero-copy conversions between frameworks via DLPack (when possible)
  • Lazy loading for optional dependencies
  • Batch operations for efficient I/O
  • Atomic writes with minimal overhead
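Zero-copy exchange via DLPack can be illustrated with NumPy alone (recent NumPy implements both sides of the protocol); PyTorch, JAX, TensorFlow, and CuPy expose the same `from_dlpack` entry point:

```python
import numpy as np

a = np.arange(6, dtype=np.float32)

# from_dlpack consumes a's __dlpack__ capsule; no data is copied.
b = np.from_dlpack(a)

# Both arrays view the same underlying buffer, so writes through
# one are visible through the other.
a[0] = 42.0
print(b[0])
```

Because the buffer is shared rather than copied, handing a large array from one framework to another costs O(1) regardless of its size.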

License

MIT License - see LICENSE file for details

Contributing

Contributions welcome! Please see CONTRIBUTING.md for guidelines.

Credits

Developed by Tristan Simas.
