
druta (द्रुत) - A fast video dataset format for PyTorch (for when storage isn't a problem)


druta (द्रुत)

A fast video dataset format for PyTorch (for when storage is cheap, but time is not)

pip install druta

import druta

druta.prep_dataset(
    video="video.mp4",
    save_as="video.druta",
    num_threads=4,
)

dataset = druta.Dataset(
    filename="video.druta",
)

for i in range(len(dataset)):
    frame = dataset[i]
    # each frame has shape (height, width, 3)
    print(f"Frame {i} shape: {frame.shape}")

Why druta?

When training a model on video data with a decoder such as decord, we end up repeating the same video decoding work thousands of times. Druta avoids this redundancy by decoding the video once and storing the result as a memory-mapped file of raw uint8 tensor data.
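The decode-once, memory-map-afterwards idea can be sketched with the standard library alone. This is not druta's implementation or file layout: the helper `write_raw_frames`, the class `RawFrameFile`, the `frames.raw` filename and the tiny frame size are all hypothetical, and synthetic byte strings stand in for decoded frames. In practice something like `numpy.memmap` would usually replace the raw byte slicing.

```python
import mmap
import os
import tempfile

# Assumed frame geometry, kept tiny for illustration only.
HEIGHT, WIDTH, CHANNELS = 4, 6, 3
FRAME_BYTES = HEIGHT * WIDTH * CHANNELS  # one uint8 per channel value


def write_raw_frames(path, frames):
    """Stand-in for the one-time decode step: store every frame's raw bytes back to back."""
    with open(path, "wb") as f:
        for frame in frames:
            if len(frame) != FRAME_BYTES:
                raise ValueError("every frame must have the same size")
            f.write(frame)


class RawFrameFile:
    """Read-only, memory-mapped view over a file of fixed-size raw frames."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return len(self._map) // FRAME_BYTES

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Frame i sits at a fixed byte offset, so no decoding is needed to reach it.
        start = i * FRAME_BYTES
        return self._map[start:start + FRAME_BYTES]

    def close(self):
        self._map.close()
        self._file.close()


# Synthetic "decoded" frames: frame i is filled with the byte value i.
frames = [bytes([i]) * FRAME_BYTES for i in range(5)]
path = os.path.join(tempfile.mkdtemp(), "frames.raw")
write_raw_frames(path, frames)
reader = RawFrameFile(path)
```

Because every frame has the same size, random access reduces to an offset calculation plus a read that the operating system can serve from its page cache, which is where the speedup over repeated decoding would come from.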

But there's no free lunch. The speedup comes at the cost of a much larger footprint on disk, a trade-off that is well worth it for some folks. (The speed tests were run on an M3 Max MacBook Pro with 2048 frames.)
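To get a feel for the disk cost, here is a rough estimate of the raw uint8 size of a clip. The 2048-frame count comes from the note above, while the 1920x1080 resolution is purely an assumption for illustration, and `raw_size_bytes` is a hypothetical helper rather than part of druta. Any header or metadata the actual file format may store is ignored.

```python
def raw_size_bytes(num_frames, height, width, channels=3):
    """Uncompressed size: one byte per channel value, per pixel, per frame."""
    return num_frames * height * width * channels


# 2048 frames at an assumed resolution of 1920x1080, RGB.
size = raw_size_bytes(2048, 1080, 1920)
print(f"{size / 1e9:.2f} GB")  # prints "12.74 GB"
```

The raw size grows linearly with frame count and resolution and does not depend on how well the source video compresses.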

Running tests

pytest -vvx --capture=no tests/

