A Dataloader using rpc-based workers

RPC Dataloader

This library implements a variant of the PyTorch Dataloader that uses remote workers. It lets you distribute workers across remote servers instead of running them all on the machine that runs the main script.

To use it, start one or several worker daemons on remote computers. The machines running the data loaders will dispatch requests for items to the workers and await the returned values.
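
The dispatch-and-await pattern described above can be illustrated with a toy sketch built only on the Python standard library. This is not rpcdataloader's implementation or API: the helpers start_toy_worker, fetch and load_batch are hypothetical names invented for this illustration, and the "remote" workers are simply threads listening on localhost.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.connection import Client, Listener


def start_toy_worker(dataset):
    """Hypothetical helper: serve dataset[index] requests on a free localhost port."""
    listener = Listener(("127.0.0.1", 0))  # port 0 lets the OS pick a free port

    def serve():
        while True:
            # One request per connection: receive an index, reply with the item.
            with listener.accept() as conn:
                index = conn.recv()
                conn.send(dataset[index])

    threading.Thread(target=serve, daemon=True).start()
    return listener.address  # (host, port) actually bound


def fetch(address, index):
    """Send one item request to a worker and wait for the returned value."""
    with Client(address) as conn:
        conn.send(index)
        return conn.recv()


def load_batch(addresses, indices):
    """Dispatch item requests round-robin over the workers, then await them in order."""
    with ThreadPoolExecutor(max_workers=len(addresses)) as pool:
        futures = [
            pool.submit(fetch, addresses[i % len(addresses)], index)
            for i, index in enumerate(indices)
        ]
        return [future.result() for future in futures]


if __name__ == "__main__":
    squares = [x * x for x in range(10)]
    workers = [start_toy_worker(squares), start_toy_worker(squares)]
    print(load_batch(workers, [3, 1, 4]))
```

The sketch uses pickle-based connections without authentication, which is only acceptable on a trusted network; a real deployment would also need error handling and worker shutdown.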

Though similar to torch.rpc, this library uses its own RPC (Remote Procedure Call) implementation, which is simpler (it requires no initialization) and does not conflict with the one from PyTorch.

Installation

pip install rpcdataloader

Usage

To use the RPC dataloader, start a few workers either from the command line:

python -m rpcdataloader.launch --host=0.0.0.0 --port=6543

or by calling rpcdataloader.run_worker directly from a Python script.

Then instantiate a remote dataset and dataloader:

import rpcdataloader
import torchvision

# args.data_path and train_transform are expected to be defined by your script.
dataset = rpcdataloader.RPCDataset(
    workers=['node01:6543', 'node02:5432'],
    dataset=torchvision.datasets.ImageFolder,
    root=args.data_path + "/train",
    transform=train_transform,
)

dataloader = rpcdataloader.RPCDataloader(
    dataset,
    batch_size=2,
    shuffle=True,
    pin_memory=True,
)

for minibatch in dataloader:
    ...
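
The workers argument above is a list of 'host:port' strings. When several workers run on each node, it can be convenient to generate that list rather than type it out. The helper below is a minimal sketch: worker_addresses is a hypothetical name, not part of rpcdataloader, and it assumes that each node runs its workers on consecutive ports starting at base_port.

```python
def worker_addresses(hosts, base_port=6543, workers_per_host=1):
    """Hypothetical helper: build 'host:port' strings, one per worker.

    Assumes each host runs its workers on consecutive ports
    starting at base_port.
    """
    if workers_per_host < 1:
        raise ValueError("workers_per_host must be at least 1")
    return [
        f"{host}:{base_port + offset}"
        for host in hosts
        for offset in range(workers_per_host)
    ]


if __name__ == "__main__":
    print(worker_addresses(["node01", "node02"], workers_per_host=2))
```

The resulting list could then be passed as the workers argument, provided a worker was actually started on each listed host and port.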
