
Composable data loading modules for PyTorch

Project description

TorchData (see note below on current status)

What is TorchData? | Stateful DataLoader | Install guide | Contributing | License

⚠️ June 2024 Status Update: Removing DataPipes and DataLoader V2

We are re-focusing the torchdata repo to be an iterative enhancement of torch.utils.data.DataLoader. We do not plan on continuing development or maintaining the [DataPipes] and [DataLoaderV2] solutions, and they will be removed from the torchdata repo. We'll also be revisiting the DataPipes references in pytorch/pytorch. In release torchdata==0.8.0 (July 2024) they will be marked as deprecated, and in 0.9.0 (Oct 2024) they will be deleted. Existing users are advised to pin to torchdata==0.8.0 or an older version until they are able to migrate away. Subsequent releases will not include DataPipes or DataLoaderV2. The old version of this README is available here. Please reach out if you have suggestions or comments (please use #1196 for feedback).
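As a rough illustration of the pinning advice above, the check below decides from a version string whether a torchdata release still ships DataPipes and DataLoaderV2. The helper name `ships_datapipes` is hypothetical and not part of torchdata. It compares only the leading major.minor numbers, so pre-release or nightly builds close to the 0.9.0 cutoff may not be classified exactly.

```python
import re

# First release line that no longer includes DataPipes / DataLoaderV2,
# according to the status update above.
REMOVED_IN = (0, 9)


def ships_datapipes(version: str) -> bool:
    """Return True if this torchdata version predates the 0.9.0 removal.

    Hypothetical helper: only the leading "major.minor" numbers are compared.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor < REMOVED_IN


if __name__ == "__main__":
    for candidate in ("0.7.1", "0.8.0", "0.9.0"):
        print(candidate, ships_datapipes(candidate))
```

To check the installed package, the string returned by the standard library call `importlib.metadata.version("torchdata")` can be passed to the helper.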

What is TorchData?

The TorchData project is an iterative enhancement to the PyTorch torch.utils.data.DataLoader and torch.utils.data.Dataset/IterableDataset to make them scalable, performant dataloading solutions. We will be iterating on the enhancements under the torchdata repo.

Our first change adds checkpointing to torch.utils.data.DataLoader. It can be found in stateful_dataloader, a drop-in replacement for torch.utils.data.DataLoader that defines load_state_dict and state_dict methods to enable mid-epoch checkpointing. It also provides an API for users to track custom iteration progress and other custom state from the dataloader workers, such as token buffers and/or RNG states.
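To make the idea of mid-epoch checkpointing concrete, here is a toy, pure-Python batch iterator that follows the same state_dict / load_state_dict pattern. It is a sketch for illustration only and is not torchdata's implementation: the class name `ResumableBatches` and the fields it stores (shuffled order, position, RNG state) are assumptions chosen to mirror the kinds of state mentioned above.

```python
import random


class ResumableBatches:
    """Toy batch iterator exposing state_dict / load_state_dict.

    Illustrative only: this is NOT how StatefulDataLoader is implemented.
    """

    def __init__(self, data, batch_size, seed=0):
        self.data = list(data)
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self._start_epoch()

    def _start_epoch(self):
        # Draw a fresh shuffled order and rewind to the first sample.
        self.order = list(range(len(self.data)))
        self.rng.shuffle(self.order)
        self.position = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.order):
            # Epoch finished: prepare the next one, then signal the end.
            self._start_epoch()
            raise StopIteration
        indices = self.order[self.position:self.position + self.batch_size]
        self.position += len(indices)
        return [self.data[i] for i in indices]

    def state_dict(self):
        # Everything needed to continue from the current batch boundary.
        return {
            "order": list(self.order),
            "position": self.position,
            "rng_state": self.rng.getstate(),
        }

    def load_state_dict(self, state):
        self.order = list(state["order"])
        self.position = state["position"]
        self.rng.setstate(state["rng_state"])


if __name__ == "__main__":
    loader = ResumableBatches(range(10), batch_size=3, seed=42)
    next(loader)                      # consume one batch
    checkpoint = loader.state_dict()  # save mid-epoch
    remaining = list(loader)          # what an uninterrupted run would yield

    resumed = ResumableBatches(range(10), batch_size=3, seed=7)
    resumed.load_state_dict(checkpoint)
    print(list(resumed) == remaining)
```

The point of the design is that a freshly constructed object, given only the saved dictionary, continues with exactly the batches the interrupted run had not yet produced.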

Stateful DataLoader

torchdata.stateful_dataloader.StatefulDataLoader is a drop-in replacement for torch.utils.data.DataLoader which provides state_dict and load_state_dict functionality. See the Stateful DataLoader main page for more information and examples. Also check out the examples in this Colab notebook.

Installation

Version Compatibility

The following table lists the torchdata version that corresponds to each torch release, along with the supported Python versions.

torch              torchdata        python
master / nightly   main / nightly   >=3.9, <=3.12
2.4.0              0.8.0            >=3.8, <=3.12
2.0.0              0.6.0            >=3.8, <=3.11
1.13.1             0.5.1            >=3.7, <=3.10
1.12.1             0.4.1            >=3.7, <=3.10
1.12.0             0.4.0            >=3.7, <=3.10
1.11.0             0.3.0            >=3.7, <=3.10
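The table above can also be encoded as a small lookup, for example to choose a pin in a setup script. The following is a minimal sketch: the names `COMPAT` and `torchdata_requirement` are hypothetical, the rows are copied from the table, and the master / nightly row is left out because it does not correspond to a fixed version.

```python
import sys

# torch release -> (torchdata release, minimum Python, maximum Python),
# copied from the compatibility table. The nightly row is omitted.
COMPAT = {
    "2.4.0": ("0.8.0", (3, 8), (3, 12)),
    "2.0.0": ("0.6.0", (3, 8), (3, 11)),
    "1.13.1": ("0.5.1", (3, 7), (3, 10)),
    "1.12.1": ("0.4.1", (3, 7), (3, 10)),
    "1.12.0": ("0.4.0", (3, 7), (3, 10)),
    "1.11.0": ("0.3.0", (3, 7), (3, 10)),
}


def torchdata_requirement(torch_version, python=None):
    """Return a pip requirement string for the matching torchdata release.

    `python` is a (major, minor) tuple; it defaults to the running interpreter.
    """
    python = tuple(python or sys.version_info[:2])
    if torch_version not in COMPAT:
        raise KeyError(f"torch {torch_version} is not listed in the table")
    torchdata_version, py_min, py_max = COMPAT[torch_version]
    if not (py_min <= python <= py_max):
        raise RuntimeError(
            f"Python {python[0]}.{python[1]} is outside the supported range "
            f"for torchdata {torchdata_version}"
        )
    return f"torchdata=={torchdata_version}"


if __name__ == "__main__":
    print(torchdata_requirement("2.4.0", python=(3, 10)))
```

The returned string can be passed to pip as is, for example `pip install torchdata==0.8.0`.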

Local pip or conda

First, set up an environment. We will be installing a PyTorch binary as well as torchdata. If you're using conda, create a conda environment:

conda create --name torchdata
conda activate torchdata

If you wish to use venv instead:

python -m venv torchdata-env
source torchdata-env/bin/activate

Install torchdata:

Using pip:

pip install torchdata

Using conda:

conda install -c pytorch torchdata

From source

pip install .

In case building TorchData from source fails, install the nightly version of PyTorch following the linked guide on the contributing page.

From nightly

A nightly version of TorchData is also provided and is updated daily from the main branch.

Using pip:

pip install --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Using conda:

conda install torchdata -c pytorch-nightly

Contributing

We welcome PRs! See the CONTRIBUTING file.

Beta Usage and Feedback

We'd love to hear from and work with early adopters to shape our designs. Please reach out by raising an issue if you're interested in using this tooling for your project.

License

TorchData is BSD licensed, as found in the LICENSE file.



