SIMD-optimized append-only schema-less storage engine. Key-based binary storage in a single-file storage container.

SIMD R Drive (Python Bindings)

Experimental Python bindings for SIMD R Drive — a high-performance, schema-less storage engine using a single-file storage container optimized for zero-copy binary access, written in Rust.

This library provides access to core functionality of simd-r-drive from Python, including high-performance key/value storage, zero-copy reads via memoryview, and support for streaming writes and reads.

Threaded streaming writes from Python are not supported. See Thread Safety for important limitations.

Features

  • 🔑 Append-only key/value storage
  • ⚡ Zero-copy reads via memoryview and mmap
  • 🧵 Reads and writes from Python, subject to the restrictions described under Thread Safety
  • 📆 Single-file binary container (no schema or serialization required)
  • ↺ Streaming interface for writing and reading large entries
  • 🐍 Native Rust extension module for Python (via PyO3)

Supported Environments

The simd_r_drive_py Python bindings are built as native extension modules and require environments that support both Python and Rust toolchains.

✅ Platforms

  • Linux (x86_64, aarch64)
  • macOS (x86_64, arm64/M1/M2)

Wheels are built using cibuildwheel and tested on GitHub Actions.

✅ Supported Python Versions

  • Python 3.10 – 3.13

Older versions (≤3.9) are explicitly skipped during wheel builds.

❌ Not Supported

  • Windows (x86_64/AMD64, ARM64): the Python bindings are not officially supported on Windows due to platform-specific filesystem and memory-mapping inconsistencies in the Python runtime.

    The underlying Rust library works on Windows and is tested continuously, but the Python bindings fail some unit tests in CI. Manual builds (including AMD64 and ARM64) have succeeded locally but are not considered production-stable.

  • Python < 3.10
  • 32-bit Python
  • musl-based Linux environments (e.g., Alpine Linux)
  • PyPy or other alternative Python interpreters

If you need support for other environments or interpreters, consider compiling from source with maturin develop inside a compatible environment.
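As a rough illustration of the constraints listed above, the following sketch checks an environment against them using only the standard library. The function names are hypothetical and not part of the bindings, and the check does not detect musl-based Linux, so treat it as a convenience rather than an authoritative compatibility test.

```python
import platform
import struct
import sys


def unsupported_reasons(system, version, implementation, pointer_bits):
    """Return a list of reasons why an environment falls outside the supported set.

    This mirrors the lists in this README; it is not an official check.
    """
    reasons = []
    if system not in ("Linux", "Darwin"):
        reasons.append(f"platform {system} is not officially supported")
    if version < (3, 10):
        reasons.append("Python older than 3.10")
    if implementation != "CPython":
        reasons.append(f"interpreter {implementation} is not CPython")
    if pointer_bits != 64:
        reasons.append("32-bit Python")
    return reasons


def current_environment_reasons():
    """Apply the check to the running interpreter."""
    return unsupported_reasons(
        platform.system(),
        sys.version_info[:2],
        platform.python_implementation(),
        struct.calcsize("P") * 8,  # pointer size in bits
    )
```

An empty list from `current_environment_reasons()` only means none of the listed exclusions were detected.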

Installation

pip install simd-r-drive-py

Usage

Regular Writes and Reads

from simd_r_drive import DataStore

# Create or open a datastore
store = DataStore("mydata.bin")

# Write a key/value pair
store.write(b"username", b"jdoe")

# Read the value
value = store.read(b"username")
print(value)  # b'jdoe'

# Check existence
assert store.exists(b"username")

# Delete the key
store.delete(b"username")
assert store.read(b"username") is None

Batch Writes

from simd_r_drive import DataStore

store = DataStore("batch.bin")

# Prepare entries as a list of (key, value) byte tuples
entries = [
    (b"user:1", b"alice"),
    (b"user:2", b"bob"),
    (b"user:3", b"charlie"),
]

# Write all entries in a single batch
store.batch_write(entries)

# Verify that all entries were written correctly
for key, value in entries:
    assert store.read(key) == value
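
When the entries come from a large or lazy source, it may be convenient to feed batch_write in bounded groups rather than building one huge list. The helper below is a hypothetical sketch (chunked is not part of the bindings), and the batch size is an arbitrary choice.

```python
from itertools import islice
from typing import Iterable, Iterator

Entry = tuple[bytes, bytes]


def chunked(entries: Iterable[Entry], size: int) -> Iterator[list[Entry]]:
    """Yield lists of at most `size` (key, value) tuples from `entries`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(entries)
    # islice pulls up to `size` items; an empty list ends the loop.
    while batch := list(islice(iterator, size)):
        yield batch


# Intended use with a store (assumes batch_write accepts a list of tuples,
# as in the example above):
#
#     for batch in chunked(all_entries, 1000):
#         store.batch_write(batch)
```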

Streamed Writes and Reads (Large Payloads)

from simd_r_drive import DataStore
import io

store = DataStore("streamed.bin")

# Simulated payload — in practice, this could be any file-like stream,
# including one that does not fit entirely into memory.
payload = b"x" * (10 * 1024 * 1024)  # Example: 10 MB of dummy data
stream = io.BytesIO(payload)

store.write_stream(b"large-file", stream)

# Read the payload back in chunks
read_stream = store.read_stream(b"large-file")
result = bytearray()

while chunk := read_stream.read(4096):
    result.extend(chunk)

assert result == payload

API

DataStore(path: str)

Opens (or creates) a file-backed storage container at the given path.

.write(key: bytes, value: bytes) -> None

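Atomically appends a new key-value entry. If the key already exists, the new entry supersedes the previous value; because the file is append-only, the older entry is not rewritten in place.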

.write_stream(key: bytes, reader: IO[bytes]) -> None

Streams from a Python file-like object (.read(n) interface). Not thread-safe.
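
Because write_stream is described as consuming a .read(n) interface, a thin wrapper around the source can observe the bytes as they pass through. The sketch below computes a SHA-256 digest and byte count while streaming. HashingReader is a hypothetical helper, and it assumes write_stream needs nothing from the reader beyond .read(n), which this README does not confirm.

```python
import hashlib
import io
from typing import BinaryIO


class HashingReader:
    """Wrap a binary file-like object and hash every byte read through it."""

    def __init__(self, source: BinaryIO) -> None:
        self._source = source
        self._digest = hashlib.sha256()
        self.bytes_read = 0

    def read(self, size: int = -1) -> bytes:
        chunk = self._source.read(size)
        self._digest.update(chunk)
        self.bytes_read += len(chunk)
        return chunk

    def hexdigest(self) -> str:
        return self._digest.hexdigest()


# Stand-alone demonstration with an in-memory stream. With a real store the
# wrapper would be passed in place of the raw stream, e.g.
#     store.write_stream(b"large-file", reader)
reader = HashingReader(io.BytesIO(b"x" * 10_000))
while reader.read(4096):
    pass  # drain the stream the way a consumer would
```

After the stream is consumed, reader.hexdigest() can be stored under a separate key or compared against a known checksum.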

.read(key: bytes) -> Optional[bytes]

Returns the full value for a key, or None if the key does not exist.

.read_entry(key: bytes) -> Optional[EntryHandle]

Returns a memory-mapped handle, exposing .as_memoryview() for zero-copy access.
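
Once a memoryview is available, fixed-width fields can be decoded directly from it without first copying the value into a bytes object. The sketch below uses struct.unpack_from on a memoryview. The record layout (a 32-bit id followed by a 64-bit float) is purely hypothetical, and the view here wraps ordinary bytes rather than an EntryHandle.

```python
import struct

# Hypothetical record layout: little-endian u32 id followed by an f64 score.
RECORD = struct.Struct("<Id")


def parse_record(view: memoryview) -> tuple:
    """Decode the fixed-size header fields in place, without copying the buffer."""
    return RECORD.unpack_from(view, 0)


def payload_after_header(view: memoryview) -> memoryview:
    """Return a view of the bytes after the header; slicing a memoryview does not copy."""
    return view[RECORD.size:]


# With the bindings, the view would come from something like
#     store.read_entry(key).as_memoryview()
# Here a plain bytes object stands in for the stored value.
data = RECORD.pack(7, 1.5) + b"tail"
view = memoryview(data)
record_id, score = parse_record(view)
```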

.read_stream(key: bytes) -> Optional[EntryStream]

Returns a streaming reader exposing .read(n).
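
A common use of a streaming reader is to copy a large entry to another destination without holding it all in memory. The following sketch assumes, as the usage example's loop implies, that .read(n) returns an empty result at end of stream. copy_stream is a hypothetical helper, and callers would still need to handle read_stream returning None for a missing key.

```python
import io
from typing import BinaryIO


def copy_stream(reader, dest: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy everything from `reader` to `dest` in chunks; return bytes copied."""
    total = 0
    while chunk := reader.read(chunk_size):
        dest.write(chunk)
        total += len(chunk)
    return total


# Stand-alone demonstration; io.BytesIO stands in for the EntryStream.
source = io.BytesIO(b"abc" * 1000)
destination = io.BytesIO()
copied = copy_stream(source, destination, chunk_size=256)
```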

.delete(key: bytes) -> None

Marks an entry as deleted. The file remains append-only; use Rust-side compaction if needed.
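
To make the append-only behavior concrete, here is a deliberately simplified in-memory model: every write and delete appends a record, and an index tracks which record is current for each key. This is a conceptual sketch only. It does not reflect the engine's actual file layout, index structure, or tombstone encoding.

```python
class AppendOnlyModel:
    """Toy model of an append-only key/value log with an in-memory index."""

    def __init__(self) -> None:
        self.log = []    # list of (key, value); value None marks a deletion
        self.index = {}  # key -> position of the current record in the log

    def write(self, key: bytes, value: bytes) -> None:
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1  # newest record wins

    def delete(self, key: bytes) -> None:
        self.log.append((key, None))  # nothing is erased, only marked
        self.index.pop(key, None)

    def read(self, key: bytes):
        position = self.index.get(key)
        return None if position is None else self.log[position][1]

    def exists(self, key: bytes) -> bool:
        return key in self.index


model = AppendOnlyModel()
model.write(b"k", b"v1")
model.write(b"k", b"v2")        # supersedes v1, which stays in the log
latest = model.read(b"k")
model.delete(b"k")              # appends a third record
```

In this model the log only grows, which is why reclaiming space requires a separate compaction step.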

.exists(key: bytes) -> bool

Returns whether a key is currently valid in the index.

Thread Safety

This Python binding is not thread-safe.

Due to Python’s Global Interpreter Lock (GIL) and the limitations of PyO3, concurrent streaming writes or reads from multiple threads are not supported, and doing so may cause hangs or inconsistent behavior.

  • Use only from a single thread.
  • ❌ Do not call methods like write_stream or read_stream from multiple threads.
  • ❌ Do not share a DataStore instance across threads.
  • ✅ For concurrent, high-performance use — especially with streaming — use the native Rust version directly.

This design avoids working around the GIL or spawning internal locks for artificial concurrency. If you need reliable multithreading, call into the Rust API instead.
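
For applications that are multithreaded for other reasons, one possible way to honor the single-thread rule is to confine the store to a dedicated worker thread and have other threads submit requests to it. The sketch below does this with a one-worker ThreadPoolExecutor. SingleThreadOwner and the _Recorder stand-in are hypothetical, and this pattern has not been verified against the bindings, so treat it as a starting point rather than a guarantee.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


class SingleThreadOwner:
    """Create an object on one worker thread and run every call on that thread."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._executor = ThreadPoolExecutor(max_workers=1)
        # The factory runs on the worker, so the object is created there too.
        self._obj = self._executor.submit(factory).result()

    def call(self, method: str, *args: Any) -> Any:
        def task() -> Any:
            return getattr(self._obj, method)(*args)
        return self._executor.submit(task).result()  # blocks the caller

    def close(self) -> None:
        self._executor.shutdown(wait=True)


class _Recorder:
    """Stand-in for a store; records which thread each call runs on."""

    def __init__(self) -> None:
        self.data = {}
        self.threads = set()

    def write(self, key: bytes, value: bytes) -> None:
        self.threads.add(threading.get_ident())
        self.data[key] = value

    def read(self, key: bytes):
        self.threads.add(threading.get_ident())
        return self.data.get(key)

    def seen_threads(self) -> set:
        self.threads.add(threading.get_ident())
        return set(self.threads)


# With the bindings, the factory might be: lambda: DataStore("mydata.bin")
owner = SingleThreadOwner(_Recorder)
owner.call("write", b"k", b"v")
value = owner.call("read", b"k")
worker_threads = owner.call("seen_threads")
owner.close()
```

Calls are serialized through the worker, so this trades throughput for isolation. For real concurrency the Rust API remains the recommended route.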

Limitations

  • Python bindings currently lack async support.
  • write_stream is blocking and not safe for concurrent use.
  • Compaction is not yet exposed via Python.
  • This is not a drop-in database — you're expected to manage your own data formats.
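
Since values are opaque bytes, each application decides how to encode them. As one possible approach, the sketch below serializes a dictionary to JSON with a one-byte format version prefix so the encoding can evolve later. The function names and the version scheme are assumptions for illustration, not something the library prescribes.

```python
import json

FORMAT_VERSION = 1  # hypothetical application-level version tag


def encode_user(user: dict) -> bytes:
    """Serialize a dict to compact, key-sorted JSON behind a version byte."""
    body = json.dumps(user, sort_keys=True, separators=(",", ":"))
    return bytes([FORMAT_VERSION]) + body.encode("utf-8")


def decode_user(raw: bytes) -> dict:
    """Inverse of encode_user; rejects unknown format versions."""
    if not raw or raw[0] != FORMAT_VERSION:
        raise ValueError("unknown or missing format version")
    return json.loads(raw[1:].decode("utf-8"))


# Intended use with a store:
#     store.write(b"user:1", encode_user({"name": "alice"}))
#     user = decode_user(store.read(b"user:1"))
encoded = encode_user({"b": 1, "a": 2})
```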

Development

To develop and test the Python bindings:

Requirements

  • Python 3.10 or above
  • Rust toolchain (with cargo)

Install the Python dependencies:

pip install -r requirements.txt -r requirements-dev.txt

Test Changes

maturin develop  # Builds the Rust extension and installs it into the active virtual environment
pytest # Tests the Python integration

Build a Release Wheel

maturin build --release --out dist  # Without --out, maturin writes wheels to target/wheels
pip install dist/simd_r_drive_py-*.whl

License

Licensed under the Apache-2.0 License.
