druta (द्रुत) - A fast video dataset format for PyTorch (for when storage isn't a problem)

druta (द्रुत)

A fast video dataset format for PyTorch (for when storage is cheap, but time is not)

pip install druta

import druta

# Decode the video once and write the frames to a .druta file
druta.prep_dataset(
    video="video.mp4",
    save_as="video.druta",
    num_threads=4,
)

# Open the prepared file for reading
dataset = druta.Dataset(
    filename="video.druta",
)

for i in range(len(dataset)):
    frame = dataset[i]
    # frame.shape is (height, width, 3)
    print(f"Frame {i} shape: {frame.shape}")

Why druta?

When training a model on video data using something like decord, we end up redoing the same video decoding gymnastics thousands of times. Druta skips this redundancy by decoding the video once and storing it as a memory-mapped file of raw uint8 tensor data.
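
To make the idea concrete, here is a minimal sketch of "decode once, then read raw frames through a memory map" using only the Python standard library. It is not druta's actual file format or implementation: the names `write_raw_frames` and `RawFrameFile`, the headerless back-to-back layout, and the toy frame size are all assumptions made for illustration.

```python
import mmap
import os
import tempfile


def write_raw_frames(path, frames):
    """Write already-decoded frames (equal-length bytes objects) back to back."""
    with open(path, "wb") as f:
        for frame in frames:
            f.write(frame)


class RawFrameFile:
    """Random access to fixed-size uint8 frames through a read-only memory map.

    Hypothetical layout: no header, frame i starts at byte i * frame_bytes.
    """

    def __init__(self, path, height, width, channels=3):
        self.frame_bytes = height * width * channels
        self._file = open(path, "rb")
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self.num_frames = len(self._map) // self.frame_bytes

    def __len__(self):
        return self.num_frames

    def __getitem__(self, i):
        if not 0 <= i < self.num_frames:
            raise IndexError(i)
        start = i * self.frame_bytes
        # No decoding step: a read is just a slice of the mapped bytes.
        return self._map[start:start + self.frame_bytes]

    def close(self):
        self._map.close()
        self._file.close()


# Toy data standing in for decoded video: 5 frames of 2 x 4 pixels, 3 channels.
height, width = 2, 4
frames = [bytes([i]) * (height * width * 3) for i in range(5)]

path = os.path.join(tempfile.mkdtemp(), "toy.raw")
write_raw_frames(path, frames)

reader = RawFrameFile(path, height, width)
third = reader[2]
reader.close()
```

Because every frame has the same size, finding frame `i` is a single multiplication, which is why random access stays cheap no matter how the training loop shuffles indices.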

But there's no free lunch. The speedup comes at the cost of a much larger file on disk, since the frames are stored as raw uint8 data, but this trade-off is well worth it for some folks.
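
For a rough sense of the disk cost, the arithmetic below estimates the size of raw uint8 frames of shape (height, width, 3). It assumes one byte per channel value and no header or padding, which may not match druta's actual file layout exactly, and the 1920x1080 resolution is a made-up example.

```python
def raw_size_bytes(num_frames, height, width, channels=3):
    """Bytes needed for uncompressed uint8 frames (one byte per channel value)."""
    return num_frames * height * width * channels


# Hypothetical example: 2048 frames at 1920x1080
size = raw_size_bytes(num_frames=2048, height=1080, width=1920)
print(f"{size / 1e9:.1f} GB")  # prints "12.7 GB"
```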

How much faster?

It's kinda ridiculous tbh (tests were run on an M3 Max MacBook Pro with 2048 frames)
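
If you want to check the difference on your own machine, a small timing helper along these lines could work. This is a sketch, not the benchmark used for the numbers above: `time_random_access` is a hypothetical helper that only assumes the dataset supports `len()` and integer indexing, and the list of byte strings below is a stand-in for a real dataset.

```python
import random
import time


def time_random_access(dataset, num_reads=256, seed=0):
    """Return the average seconds per read for random indices into `dataset`."""
    rng = random.Random(seed)
    indices = [rng.randrange(len(dataset)) for _ in range(num_reads)]
    start = time.perf_counter()
    for i in indices:
        _ = dataset[i]
    return (time.perf_counter() - start) / num_reads


# Stand-in dataset; swap in a real dataset object to compare loaders.
fake_dataset = [bytes(16) for _ in range(2048)]
seconds_per_read = time_random_access(fake_dataset)
print(f"{seconds_per_read * 1e6:.2f} microseconds per read")
```

Passing the same seed to each loader being compared keeps the index sequence identical across runs.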

Running tests

pytest -vvx --capture=no tests/

Download files

Source Distribution

druta-0.0.1.tar.gz (6.0 kB view details)

Uploaded Source

Built Distribution

druta-0.0.1-py3-none-any.whl (5.0 kB view details)

Uploaded Python 3

File details

Details for the file druta-0.0.1.tar.gz.

File metadata

  • Download URL: druta-0.0.1.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for druta-0.0.1.tar.gz
Algorithm Hash digest
SHA256 494bc45a1b32cdc71b34aa74d5528fe84a4f9499466eb724b2fa19fe666fcfcb
MD5 f4f4404382bd4f88a86a6df34417bc27
BLAKE2b-256 6c9db1d66b6ebbdb40e83e03b5436989065876bf3f3debd9e641d91e3773d05c

File details

Details for the file druta-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: druta-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 5.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for druta-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a1eddfd1025b2106c32d280cf893760310edc83fd745cbc10b0e1eae11965cf6
MD5 78e84f6efaf68b9f9adcf4fc0ed5ce05
BLAKE2b-256 7a3ba71ea835b6f7f994478014a54b366e33b894f80fba59784719ff359bda99
