A pure-Rust multimodal dataset format + dataloader for PyTorch.

ferroload

Ferroload stores images, video, audio, tensors, and rich metadata in a self-contained, shardable on-disk format. It serves them to training loops with parallel decode in Rust (with the GIL released), SQL-style subsetting, and a one-call loader that drops straight into PyTorch.

Install

pip install ferroload

Optional extras:

pip install "ferroload[hf]"      # HuggingFace import tooling (datasets, pillow, hub)
pip install "ferroload[torch]"   # torch + numpy, for the DataLoader glue
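
Before relying on an extra, it can help to confirm that its dependencies are importable. The sketch below uses only the standard library. The import names listed per extra (`datasets`, `PIL`, `huggingface_hub`, `torch`, `numpy`) are assumptions inferred from the comments above, not read from ferroload's packaging metadata, and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed import names for each extra (not read from ferroload's metadata).
EXTRA_MODULES = {
    "hf": ["datasets", "PIL", "huggingface_hub"],
    "torch": ["torch", "numpy"],
}


def missing_modules(extra, modules=EXTRA_MODULES):
    """Return the modules of `extra` that cannot be imported here."""
    return [name for name in modules[extra] if find_spec(name) is None]


if __name__ == "__main__":
    for extra in EXTRA_MODULES:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"ferroload[{extra}]: {status}")
```

`find_spec` only locates a module without importing it, so the check stays cheap even for heavy packages such as torch.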

In-Rust video decode is feature-gated because it needs a system ffmpeg, so it is not included in the published wheels. To enable it, build from source:

maturin develop --release --features video

Quickstart

from ferroload import make_loader

epochs = 10                                              # example value; set for your run

dl = make_loader("/data/ds", batch_size=64,
                 columns=["image", "video", "label"],   # kinds resolved from the manifest
                 resize=(224, 224), out="torch")
for epoch in range(epochs):
    dl.set_epoch(epoch)                                  # reshuffle (DDP-aware)
    for batch in dl:
        train_step(batch)                                # your own step function; uses batch["image"], batch["video"], batch["label"]
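
The comment on `set_epoch` says the reshuffle is DDP-aware. A common way to get that behavior in distributed samplers is to seed the shuffle with the epoch, so every rank computes the same permutation, and then give each rank a disjoint slice of it. The sketch below illustrates that general pattern in plain Python. It is not ferroload's implementation, and `epoch_indices`, `rank`, `world_size`, and `seed` are illustrative names.

```python
import random


def epoch_indices(num_samples, epoch, rank=0, world_size=1, seed=0):
    """Indices one rank would visit in one epoch under seeded shuffling."""
    order = list(range(num_samples))
    # Same seed + epoch on every rank gives the same permutation everywhere.
    random.Random(seed + epoch).shuffle(order)
    # Strided slicing hands each rank a disjoint share of that permutation.
    return order[rank::world_size]


if __name__ == "__main__":
    for epoch in range(2):
        shards = [epoch_indices(10, epoch, rank=r, world_size=2) for r in range(2)]
        print(epoch, shards)
```

Because the permutation depends only on `seed` and `epoch`, the ranks never need to communicate to agree on who reads which sample. This sketch does not pad uneven shards, which real samplers often do so that every rank sees the same number of batches.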

Enrich a dataset with an additive, resumable layer. Functions bind to their inputs positionally and run once per sample by default, so they stay generic:

import ferroload

def mean_color(img):                                     # img <- inputs=["image"]
    return img.mean(axis=(0, 1)).astype("float32")

ds = ferroload.Dataset.open("/data/ds")
ds = ds.map(mean_color, inputs=["image"],
            outputs={"emb": ferroload.Modality("npy")}, name="emb")
ds.read_array(0, "emb")
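
To make the binding rule concrete, the sketch below mimics it over a list of plain dictionaries. The values named in `inputs` are passed to the function in order, the result is stored under a new key, and samples that already have that key are skipped, which is one simple way a layer can be additive and resumable. This is an illustration only: `apply_layer` and the dictionary layout are hypothetical and say nothing about how ferroload stores data or schedules work.

```python
def apply_layer(samples, fn, inputs, output):
    """Add `output` to each sample by calling fn(*values of `inputs`)."""
    computed = 0
    for sample in samples:
        if output in sample:
            # Already computed in an earlier run: skipping makes reruns resume.
            continue
        args = [sample[name] for name in inputs]  # positional binding
        sample[output] = fn(*args)                # additive: other keys untouched
        computed += 1
    return computed


def mean_color(img):
    """Per-channel mean of an image given as rows of (r, g, b) pixels."""
    pixels = [px for row in img for px in row]
    return [sum(px[c] for px in pixels) / len(pixels) for c in range(3)]


if __name__ == "__main__":
    samples = [
        {"image": [[(0, 0, 0), (255, 255, 255)]]},
        {"image": [[(10, 20, 30)]], "emb": [10.0, 20.0, 30.0]},  # done earlier
    ]
    print(apply_layer(samples, mean_color, inputs=["image"], output="emb"))
    print(samples[0]["emb"])
```

Since `mean_color` only sees the values it is handed, the same function can be reused with any input column that holds an image.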

Documentation

Full docs (Python API, quickstart, Rust core, benchmarks) are built with MkDocs and published via GitHub Pages. See the project repository for links.

License

Apache-2.0.

Download files

Download the file for your platform.

Source Distribution

  • ferroload-0.1.1.tar.gz (107.2 kB): Source

Built Distributions

  • ferroload-0.1.1-cp39-abi3-win_amd64.whl (4.0 MB): CPython 3.9+, Windows x86-64
  • ferroload-0.1.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB): CPython 3.9+, manylinux (glibc 2.17+), x86-64
  • ferroload-0.1.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (4.2 MB): CPython 3.9+, manylinux (glibc 2.17+), ARM64
  • ferroload-0.1.1-cp39-abi3-macosx_11_0_arm64.whl (3.8 MB): CPython 3.9+, macOS 11.0+, ARM64
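
The wheel names above follow the standard layout `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, with an optional build tag after the version. The sketch below splits a name into those fields using only string operations. `parse_wheel_name` is a hypothetical helper, and it assumes no build tag is present, which holds for the files listed here.

```python
def parse_wheel_name(filename):
    """Split a wheel file name (without build tag) into its five fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 dash-separated fields, got %d" % len(parts))
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        # A dot joins several platform tags the same wheel is compatible with.
        "platforms": platform_tag.split("."),
    }


if __name__ == "__main__":
    info = parse_wheel_name(
        "ferroload-0.1.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
    )
    print(info["python"], info["abi"], info["platforms"])
```

The `cp39-abi3` pair marks a wheel built against CPython's stable ABI, which is why one file per platform covers CPython 3.9 and later.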

File details

Details for the file ferroload-0.1.1.tar.gz.

File metadata

  • Size: 107.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ferroload-0.1.1.tar.gz

  • SHA256: 5dbdd2e15b28fe9106a0835cbea9551141b4f04d7c6182c1c10621577f505218
  • MD5: 8590058f7473f57bd25f9044c96bf15a
  • BLAKE2b-256: 176d6b7043749e9461432d7596554813781a3d0f5419743e741174fe29969672
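
A downloaded file can be checked against the digests listed above with the standard library. The sketch below streams a file once and computes all three digests. `file_digests` is a hypothetical helper, and BLAKE2b-256 is taken to mean BLAKE2b with a 32-byte digest.

```python
import hashlib
import sys


def file_digests(path, chunk_size=1 << 20):
    """Return SHA256, MD5, and BLAKE2b-256 hex digests of the file at `path`."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as fh:
        # Read in chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Usage: python check_hashes.py <file> <expected sha256>
if __name__ == "__main__" and len(sys.argv) == 3:
    path, expected = sys.argv[1], sys.argv[2]
    actual = file_digests(path)["SHA256"]
    print("OK" if actual == expected.lower() else "MISMATCH: " + actual)
```

SHA256 is the digest to compare when verifying integrity. MD5 is computed here only so the output lines up with the listing.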

Provenance

The following attestation bundles were made for ferroload-0.1.1.tar.gz:

Publisher: release.yml on midhunharikumar/ferroload

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ferroload-0.1.1-cp39-abi3-win_amd64.whl.

File metadata

  • Size: 4.0 MB
  • Tags: CPython 3.9+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ferroload-0.1.1-cp39-abi3-win_amd64.whl

  • SHA256: 82a2362f6a12e45f7319151ea077935fcdee7615cdd332db4192bc309df8e075
  • MD5: 295ed8d71bb6c4c4c40611f367afdb55
  • BLAKE2b-256: d515515e0a7495b863793622f3df0c5684866122f1b6010346e8a9a116249bc6

Provenance

The following attestation bundles were made for ferroload-0.1.1-cp39-abi3-win_amd64.whl:

Publisher: release.yml on midhunharikumar/ferroload

File details

Details for the file ferroload-0.1.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for ferroload-0.1.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: a42d98e38297ae9d47453ff6b655c71367fa2f7137374ab23fc2d49abeaaad04
  • MD5: 229f0aac7f40f1e2f78446169cbe001b
  • BLAKE2b-256: 0e6c09ce23285fb4882bf95f6a5b0a3df6dc4f77a0ce1d0df398bdf18dd8179b

Provenance

The following attestation bundles were made for ferroload-0.1.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on midhunharikumar/ferroload

File details

Details for the file ferroload-0.1.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for ferroload-0.1.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

  • SHA256: d45259e343ee917160e76d95d618550d38abbcfca83beb84bda64dae97cabb1c
  • MD5: b56af5cc2f78cacf0bc3423c86b8cfa3
  • BLAKE2b-256: a252b812bcf27cdcd8c2556978138177225a5130ff1554a603f2d41582ba992a

Provenance

The following attestation bundles were made for ferroload-0.1.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on midhunharikumar/ferroload

File details

Details for the file ferroload-0.1.1-cp39-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for ferroload-0.1.1-cp39-abi3-macosx_11_0_arm64.whl

  • SHA256: b0d2559f8dcb39f2262a26b4f3103414b555a8d32d9990cc62a2a8fcaa4c9803
  • MD5: 4d92d5d471733a692584cb8899375810
  • BLAKE2b-256: 70e343b686de0a430a9f6f89fbdf11a20967e1122e375611dbfc0943ccdff10f

Provenance

The following attestation bundles were made for ferroload-0.1.1-cp39-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on midhunharikumar/ferroload
