
SIMD-optimized append-only schema-less storage engine. Key-based binary storage in a single-file storage container.


SIMD R Drive (Python Bindings)

Experimental Python bindings for SIMD R Drive — a high-performance, schema-less storage engine using a single-file storage container optimized for zero-copy binary access, written in Rust.

This library provides access to core functionality of simd-r-drive from Python, including high-performance key/value storage, zero-copy reads via memoryview, and support for streaming writes and reads.

Threaded streaming writes from Python are not supported. See Thread Safety for important limitations.

Features

  • 🔑 Append-only key/value storage
  • ⚡ Zero-copy reads via memoryview and mmap
  • 📆 Single-file binary container (no schema or serialization required)
  • ↺ Streaming interface for writing and reading large entries
  • 🐍 Native Rust extension module for Python (via PyO3)

Supported Environments

The simd_r_drive_py Python bindings are built as native extension modules and require environments that support both Python and Rust toolchains.

✅ Platforms

  • Linux (x86_64, aarch64)
  • macOS (x86_64, arm64/M1/M2)

Wheels are built using cibuildwheel and tested on GitHub Actions.

✅ Supported Python Versions

  • Python 3.10 – 3.13

Older versions (≤3.9) are explicitly skipped during wheel builds.

❌ Not Supported

  • Windows (x86_64, AMD64, ARM64): Python bindings are not officially supported on Windows due to platform-specific filesystem and memory-mapping inconsistencies in the Python runtime.

    The underlying Rust library works on Windows and is tested continuously, but the Python bindings fail some unit tests in CI. Manual builds (including AMD64 and ARM64) have succeeded locally but are not considered production-stable.

  • Python < 3.10
  • 32-bit Python
  • musl-based Linux environments (e.g., Alpine Linux)
  • PyPy or other alternative Python interpreters

If you need support for other environments or interpreters, consider compiling from source with maturin develop inside a compatible environment.
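As a rough aid, the support matrix above can be checked from Python before installing. The sketch below uses only the standard library; `unsupported_reasons` is a hypothetical helper name, not part of the bindings. It does not detect musl-based Linux, and it does not check for Python versions newer than those listed.

```python
import platform
import sys


def unsupported_reasons(system, py_version, is_64bit, implementation):
    """Return reasons the environment falls outside the documented matrix."""
    reasons = []
    if system not in ("Linux", "Darwin"):
        reasons.append(f"unsupported OS: {system}")
    if py_version < (3, 10):
        reasons.append("Python older than 3.10")
    if not is_64bit:
        reasons.append("32-bit Python")
    if implementation != "CPython":
        reasons.append(f"alternative interpreter: {implementation}")
    return reasons


# Inspect the running interpreter.
current = unsupported_reasons(
    system=platform.system(),
    py_version=sys.version_info[:2],
    is_64bit=sys.maxsize > 2**32,
    implementation=platform.python_implementation(),
)
print(current or "environment matches the documented support matrix")
```

An empty list means none of the documented exclusions were detected; it is not a guarantee that a prebuilt wheel exists for the platform.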

Installation

pip install simd-r-drive-py

Usage

Regular Writes and Reads

from simd_r_drive import DataStore

# Create or open a datastore
store = DataStore("mydata.bin")

# Write a key/value pair
store.write(b"username", b"jdoe")

# Read the value
value = store.read(b"username")
print(value)  # b'jdoe'

# Check existence
assert store.exists(b"username")

# Delete the key
store.delete(b"username")
assert store.read(b"username") is None

Batch Writes

from simd_r_drive import DataStore

store = DataStore("batch.bin")

# Prepare entries as a list of (key, value) byte tuples
entries = [
    (b"user:1", b"alice"),
    (b"user:2", b"bob"),
    (b"user:3", b"charlie"),
]

# Write all entries in a single batch
store.batch_write(entries)

# Verify that all entries were written correctly
for key, value in entries:
    assert store.read(key) == value

Streamed Writes and Reads (Large Payloads)

from simd_r_drive import DataStore
import io

store = DataStore("streamed.bin")

# Simulated payload — in practice, this could be any file-like stream,
# including one that does not fit entirely into memory.
payload = b"x" * (10 * 1024 * 1024)  # Example: 10 MB of dummy data
stream = io.BytesIO(payload)

store.write_stream(b"large-file", stream)

# Read the payload back in chunks
read_stream = store.read_stream(b"large-file")
result = bytearray()

while chunk := read_stream.read(4096):
    result.extend(chunk)

assert result == payload

API

DataStore(path: str)

Opens (or creates) a file-backed storage container at the given path.

.write(key: bytes, value: bytes) -> None

Atomically appends a new key/value entry. Because the file is append-only, the new entry supersedes any previous version of the key rather than rewriting it in place.

.write_stream(key: bytes, reader: IO[bytes]) -> None

Streams from a Python file-like object (.read(n) interface). Not thread-safe.
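The source does not have to be a real file. A minimal sketch of an adapter that exposes the .read(n) interface over lazily produced chunks might look like the following. `IterReader` and `generate_chunks` are hypothetical names introduced here, and whether write_stream accepts an arbitrary duck-typed object (rather than an io class) is an assumption that has not been verified against the bindings.

```python
class IterReader:
    """Hypothetical adapter: exposes .read(n) over an iterable of bytes chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = bytearray()

    def read(self, n=-1):
        # Pull chunks until n bytes are buffered or the source is exhausted.
        while n < 0 or len(self._buffer) < n:
            try:
                self._buffer.extend(next(self._chunks))
            except StopIteration:
                break
        if n < 0:
            n = len(self._buffer)
        out = bytes(self._buffer[:n])
        del self._buffer[:n]
        return out  # b"" signals end of stream


def generate_chunks(count, size):
    # Stand-in for data produced lazily, e.g. received over a network.
    for i in range(count):
        yield bytes([i % 256]) * size


reader = IterReader(generate_chunks(count=4, size=1000))

# With the extension installed, the reader would be handed to the store:
#     store.write_stream(b"large-file", reader)
# Here it is drained manually to show the .read(n) contract.
total = 0
while chunk := reader.read(1024):
    total += len(chunk)
print(total)  # 4000
```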

.read(key: bytes) -> Optional[bytes]

Returns the full value for a key, or None if the key does not exist.

.read_entry(key: bytes) -> Optional[EntryHandle]

Returns a memory-mapped handle, exposing .as_memoryview() for zero-copy access.
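To illustrate what zero-copy access through a memoryview means in general, the sketch below maps a temporary file with the standard-library mmap module and slices it without copying. It shows the Python-level mechanism only; it is not how the bindings are implemented, and the file contents are made up for the example.

```python
import mmap
import os
import tempfile

# Write a small file to map; it stands in for a storage container.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"header--payload-bytes")

with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mapped)

    # Slicing a memoryview yields another view over the same buffer;
    # no bytes are copied until a copy is requested explicitly.
    payload = view[8:]
    shares_buffer = payload.obj is mapped
    copied = bytes(payload)  # the only copy made in this example

    # Views must be released before the mapping can be closed.
    payload.release()
    view.release()
    mapped.close()

os.remove(path)
print(shares_buffer)  # True
print(copied)         # b'payload-bytes'
```

By analogy, a memoryview obtained from .as_memoryview() can presumably be sliced or passed to buffer-aware APIs without materializing a bytes object; treat that as an expectation to verify, not a documented guarantee.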

.read_stream(key: bytes) -> Optional[EntryStream]

Returns a streaming reader exposing .read(n).
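Since the return type is Optional, a missing key yields None, and calling .read on it would raise AttributeError. A small sketch of a helper that handles both cases follows; `drain` is a hypothetical name, and an io.BytesIO object stands in for the EntryStream so the snippet runs without the extension.

```python
import io


def drain(reader, sink, chunk_size=64 * 1024):
    """Copy everything from reader into sink.

    Returns the number of bytes copied, or None if reader is None
    (for example, when the key does not exist).
    """
    if reader is None:
        return None
    copied = 0
    while chunk := reader.read(chunk_size):
        sink.write(chunk)
        copied += len(chunk)
    return copied


# Stand-ins for store.read_stream(key) and an output file.
source = io.BytesIO(b"x" * 100_000)
sink = io.BytesIO()

print(drain(source, sink))  # 100000
print(drain(None, sink))    # None
```

With the bindings installed, the call would presumably look like `drain(store.read_stream(b"large-file"), out_file)`, where `out_file` is any object with a .write method.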

.delete(key: bytes) -> None

Marks an entry as deleted. The file remains append-only; use Rust-side compaction if needed.

.exists(key: bytes) -> bool

Returns whether a key is currently valid in the index.
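The write, delete, and exists semantics described above can be pictured with a toy model: a log that only grows, plus an index pointing at the latest live record for each key. This is a conceptual sketch in pure Python, not the engine's on-disk format; representing a delete as a tombstone record is an assumption made for illustration.

```python
class ToyAppendOnlyStore:
    """Conceptual model only: an append-only log plus an in-memory index."""

    def __init__(self):
        self.log = []    # every record ever appended, in order
        self.index = {}  # key -> position of the latest live record

    def write(self, key, value):
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1  # newest version wins

    def delete(self, key):
        self.log.append((key, None))  # tombstone; older records stay in the log
        self.index.pop(key, None)

    def read(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def exists(self, key):
        return key in self.index


toy = ToyAppendOnlyStore()
toy.write(b"k", b"v1")
toy.write(b"k", b"v2")
print(toy.read(b"k"))    # b'v2'
toy.delete(b"k")
print(toy.exists(b"k"))  # False
print(len(toy.log))      # 3
```

The log keeps all three records even though the key no longer exists, which is why space is reclaimed only by compaction.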

Thread Safety

This Python binding is not thread-safe.

Due to Python’s Global Interpreter Lock (GIL) and the limitations of PyO3, concurrent streaming writes or reads from multiple threads are not supported, and doing so may cause hangs or inconsistent behavior.

  • Use only from a single thread.
  • ❌ Do not call methods like write_stream or read_stream from multiple threads.
  • ❌ Do not share a DataStore instance across threads.
  • ✅ For concurrent, high-performance use — especially with streaming — use the native Rust version directly.

This design avoids working around the GIL or spawning internal locks for artificial concurrency. If you need reliable multithreading, call into the Rust API instead.
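If a multi-threaded Python program still needs to use the store, one common pattern is to confine it to a single owner thread and have other threads submit requests through a queue. The sketch below shows that pattern with the standard library. `StoreWorker` and `DictStore` are hypothetical names, the in-memory `DictStore` stands in for DataStore so the snippet runs without the extension, and the pattern has not been tested against the bindings themselves.

```python
import queue
import threading
from concurrent.futures import Future


class StoreWorker:
    """Hypothetical wrapper: one thread owns the store; others submit calls."""

    def __init__(self, store_factory):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(
            target=self._run, args=(store_factory,), daemon=True
        )
        self._thread.start()

    def _run(self, store_factory):
        store = store_factory()  # created and used only on this thread
        while True:
            task = self._tasks.get()
            if task is None:  # shutdown sentinel
                break
            method, args, future = task
            try:
                future.set_result(getattr(store, method)(*args))
            except Exception as exc:
                future.set_exception(exc)

    def call(self, method, *args):
        future = Future()
        self._tasks.put((method, args, future))
        return future.result()  # block until the owner thread replies

    def close(self):
        self._tasks.put(None)
        self._thread.join()


class DictStore:
    """In-memory stand-in so the sketch runs without the extension."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


# With the bindings, the factory might be: lambda: DataStore("mydata.bin")
worker = StoreWorker(DictStore)


def producer(i):
    worker.call("write", f"key:{i}".encode(), f"value:{i}".encode())


threads = [threading.Thread(target=producer, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

result = worker.call("read", b"key:3")
print(result)  # b'value:3'
worker.close()
```

Only plain values such as bytes should cross the thread boundary in this arrangement; handles or streams returned by the store would still belong to the owner thread.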

Limitations

  • Python bindings currently lack async support.
  • write_stream is blocking and not safe for concurrent use.
  • Compaction is not yet exposed via Python.
  • This is not a drop-in database — you're expected to manage your own data formats.
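Managing your own data format means turning each value into bytes before writing and decoding it after reading. Two common options are sketched below with the standard library: JSON for flexible records and struct for fixed-width binary layouts. The helper names and the sample layout are illustrative choices, not something the bindings prescribe.

```python
import json
import struct


def encode_json(record):
    # Compact, key-sorted JSON so equal records encode to equal bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def decode_json(raw):
    return json.loads(bytes(raw).decode("utf-8"))


# Fixed-width layout: little-endian uint64 timestamp + float64 reading.
SAMPLE = struct.Struct("<Qd")


def encode_sample(timestamp_ms, reading):
    return SAMPLE.pack(timestamp_ms, reading)


def decode_sample(raw):
    return SAMPLE.unpack(raw)


value = encode_json({"name": "alice", "roles": ["admin"]})
print(value)  # b'{"name":"alice","roles":["admin"]}'
print(decode_json(value))

sample = encode_sample(1700000000000, 21.5)
print(len(sample))            # 16
print(decode_sample(sample))  # (1700000000000, 21.5)
```

The encoded bytes can then be passed as the value to store.write, and the result of store.read can be handed to the matching decoder.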

Development

To develop and test the Python bindings:

Requirements

  • Python 3.10 or above
  • Rust toolchain (with cargo)

pip install -r requirements.txt -r requirements-dev.txt

Test Changes

maturin develop # Builds the Rust library
pytest # Tests the Python integration

Build a Release Wheel

maturin build --release --out dist # Without --out, wheels are written to target/wheels
pip install dist/simd_r_drive_py-*.whl

License

Licensed under the Apache-2.0 License.


