Composable data loading modules for PyTorch

TorchData (see note below on current status)

What is TorchData? | Stateful DataLoader | Install guide | Contributing | License

⚠️ June 2024 Status Update: Removing DataPipes and DataLoader V2

We are re-focusing the torchdata repo to be an iterative enhancement of torch.utils.data.DataLoader. We do not plan on continuing development or maintaining the DataPipes and DataLoaderV2 solutions, and they will be removed from the torchdata repo. We'll also be revisiting the DataPipes references in pytorch/pytorch. In release torchdata==0.8.0 (July 2024) they will be marked as deprecated, and sometime after 0.9.0 (Oct 2024) they will be deleted. Existing users are advised to pin to torchdata==0.9.0 or an older version until they are able to migrate away. Subsequent releases will not include DataPipes or DataLoaderV2. The old version of this README is available here. Please reach out if you have suggestions or comments (please use #1196 for feedback).
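As a rough illustration of the pinning advice, the sketch below checks whether a given release string falls at or below 0.9.0, the last release the note above recommends for DataPipes users. The helper names (`parse_release`, `includes_datapipes`, `installed_torchdata_version`) are hypothetical and not part of torchdata, and the parser only handles plain numeric releases such as `0.9.0`, not pre-release or nightly strings.

```python
from importlib import metadata
from typing import Optional, Tuple

# Assumption taken from the status update: 0.9.0 is the last release
# that existing DataPipes / DataLoaderV2 users are advised to pin to.
LAST_RELEASE_WITH_DATAPIPES = (0, 9, 0)


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain release string such as '0.9.0' into (0, 9, 0)."""
    return tuple(int(part) for part in version.split("."))


def includes_datapipes(version: str) -> bool:
    """Compare numerically, so that 0.10.0 sorts after 0.9.0."""
    return parse_release(version) <= LAST_RELEASE_WITH_DATAPIPES


def installed_torchdata_version() -> Optional[str]:
    """Return the installed torchdata version, or None if it is not installed."""
    try:
        return metadata.version("torchdata")
    except metadata.PackageNotFoundError:
        return None
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.10.0" would sort before "0.9.0".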

What is TorchData?

The TorchData project is an iterative enhancement to the PyTorch torch.utils.data.DataLoader and torch.utils.data.Dataset/IterableDataset to make them scalable, performant dataloading solutions. We will be iterating on the enhancements under the torchdata repo.

Our first change adds checkpointing to torch.utils.data.DataLoader. It lives in stateful_dataloader, a drop-in replacement for torch.utils.data.DataLoader that defines load_state_dict and state_dict methods to enable mid-epoch checkpointing, along with an API for users to track custom iteration progress and other custom state from the dataloader workers, such as token buffers and/or RNG states.

Stateful DataLoader

torchdata.stateful_dataloader.StatefulDataLoader is a drop-in replacement for torch.utils.data.DataLoader which provides state_dict and load_state_dict functionality. See the Stateful DataLoader main page for more information and examples. Also check out the examples in this Colab notebook.
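To make the idea of mid-epoch checkpointing concrete, here is a minimal pure-Python sketch of the state_dict / load_state_dict pattern. `ToyResumableLoader` is a hypothetical stand-in written for illustration only. It is not StatefulDataLoader and does not reflect how torchdata implements it (no workers, samplers, or RNG state); it only shows the shape of the save-and-resume protocol.

```python
from typing import Any, Dict, Iterator, List, Sequence


class ToyResumableLoader:
    """Hypothetical toy loader: yields fixed-size batches and can resume mid-epoch."""

    def __init__(self, data: Sequence[Any], batch_size: int) -> None:
        self.data = data
        self.batch_size = batch_size
        self._next_index = 0  # index of the first item of the next batch

    def __iter__(self) -> Iterator[List[Any]]:
        while self._next_index < len(self.data):
            start = self._next_index
            # Advance the position before yielding, so a checkpoint taken
            # after receiving a batch points at the batch that follows it.
            self._next_index = min(start + self.batch_size, len(self.data))
            yield list(self.data[start:self._next_index])
        self._next_index = 0  # epoch finished: the next epoch starts over

    def state_dict(self) -> Dict[str, int]:
        return {"next_index": self._next_index}

    def load_state_dict(self, state: Dict[str, int]) -> None:
        self._next_index = state["next_index"]


# Consume two batches, checkpoint, then resume in a fresh loader.
loader = ToyResumableLoader(range(10), batch_size=3)
it = iter(loader)
first_two = [next(it), next(it)]
checkpoint = loader.state_dict()

resumed = ToyResumableLoader(range(10), batch_size=3)
resumed.load_state_dict(checkpoint)
remaining = list(resumed)  # continues where the first loader stopped
```

The resumed loader yields only the batches the first one had not produced yet, which is the behavior the state_dict and load_state_dict methods are meant to enable.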

torchdata.nodes

torchdata.nodes is a library of composable iterators (not iterables!) that let you chain together common dataloading and pre-processing operations. It follows a streaming programming model, although "sampler + Map-style" can still be configured if you desire. See the torchdata.nodes main page for more details. Stay tuned for a tutorial on torchdata.nodes, coming soon!
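The streaming, composable-iterator model can be sketched in plain Python. The classes below (`Source`, `Mapped`, `Batched`) are hypothetical names invented for this illustration and are not the torchdata.nodes API. They only show how iterators that wrap other iterators chain into a pipeline.

```python
from typing import Any, Callable, Iterable, Iterator, List


class Source:
    """Hypothetical leaf node: wraps any iterable as a one-shot iterator."""

    def __init__(self, items: Iterable[Any]) -> None:
        self._it = iter(items)

    def __iter__(self) -> "Source":
        return self

    def __next__(self) -> Any:
        return next(self._it)


class Mapped:
    """Hypothetical node: applies fn to each item pulled from its source."""

    def __init__(self, source: Iterator[Any], fn: Callable[[Any], Any]) -> None:
        self._source = source
        self._fn = fn

    def __iter__(self) -> "Mapped":
        return self

    def __next__(self) -> Any:
        return self._fn(next(self._source))


class Batched:
    """Hypothetical node: groups items from its source into lists."""

    def __init__(self, source: Iterator[Any], batch_size: int) -> None:
        self._source = source
        self._batch_size = batch_size

    def __iter__(self) -> "Batched":
        return self

    def __next__(self) -> List[Any]:
        batch: List[Any] = []
        for _ in range(self._batch_size):
            try:
                batch.append(next(self._source))
            except StopIteration:
                break
        if not batch:
            raise StopIteration  # source exhausted and nothing left to emit
        return batch


# Chain the nodes: source -> square each item -> batches of three.
pipeline = Batched(Mapped(Source(range(7)), lambda x: x * x), batch_size=3)
batches = list(pipeline)
second_pass = list(pipeline)  # an iterator is consumed once, unlike an iterable
```

Because each node is an iterator, not an iterable, the pipeline carries its own position: a second pass over the same object yields nothing.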

Installation

Version Compatibility

The following table lists each torch version with its corresponding torchdata version and supported Python versions.

`torch`	`torchdata`	`python`
`master` / `nightly`	`main` / `nightly`	`>=3.9`, `<=3.13`
`2.6.0`	`0.11.0`	`>=3.9`, `<=3.13`
`2.5.0`	`0.10.0`	`>=3.9`, `<=3.12`
`2.5.0`	`0.9.0`	`>=3.9`, `<=3.12`
`2.4.0`	`0.8.0`	`>=3.8`, `<=3.12`
`2.0.0`	`0.6.0`	`>=3.8`, `<=3.11`
`1.13.1`	`0.5.1`	`>=3.7`, `<=3.10`
`1.12.1`	`0.4.1`	`>=3.7`, `<=3.10`
`1.12.0`	`0.4.0`	`>=3.7`, `<=3.10`
`1.11.0`	`0.3.0`	`>=3.7`, `<=3.10`
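The table can also be checked programmatically. The sketch below copies the released rows from the table above (the nightly row is omitted) into a small lookup; `Row`, `find_row`, and `python_supported` are hypothetical helper names, not part of torchdata.

```python
import sys
from typing import List, NamedTuple, Optional, Tuple


class Row(NamedTuple):
    torch: str
    torchdata: str
    min_python: Tuple[int, int]
    max_python: Tuple[int, int]


# Released rows copied from the compatibility table (nightly row omitted).
TABLE: List[Row] = [
    Row("2.6.0", "0.11.0", (3, 9), (3, 13)),
    Row("2.5.0", "0.10.0", (3, 9), (3, 12)),
    Row("2.5.0", "0.9.0", (3, 9), (3, 12)),
    Row("2.4.0", "0.8.0", (3, 8), (3, 12)),
    Row("2.0.0", "0.6.0", (3, 8), (3, 11)),
    Row("1.13.1", "0.5.1", (3, 7), (3, 10)),
    Row("1.12.1", "0.4.1", (3, 7), (3, 10)),
    Row("1.12.0", "0.4.0", (3, 7), (3, 10)),
    Row("1.11.0", "0.3.0", (3, 7), (3, 10)),
]


def find_row(torchdata_version: str) -> Optional[Row]:
    """Return the table row for a torchdata release, or None if it is not listed."""
    for row in TABLE:
        if row.torchdata == torchdata_version:
            return row
    return None


def python_supported(
    torchdata_version: str, python: Optional[Tuple[int, int]] = None
) -> bool:
    """Check a (major, minor) Python version against the listed range.

    Defaults to the running interpreter. Releases missing from the table
    return False, since the table says nothing about them.
    """
    row = find_row(torchdata_version)
    if row is None:
        return False
    current = tuple(python) if python is not None else tuple(sys.version_info[:2])
    return row.min_python <= current <= row.max_python
```

Python versions are compared as integer tuples so that 3.10 sorts after 3.9.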

Local pip or conda

First, set up an environment. We will be installing a PyTorch binary as well as torchdata. If you're using conda, create a conda environment:

conda create --name torchdata
conda activate torchdata

If you wish to use venv instead:

python -m venv torchdata-env
source torchdata-env/bin/activate

Install torchdata:

Using pip:

pip install torchdata

Using conda:

conda install -c pytorch torchdata

From source

pip install .

In case building TorchData from source fails, install the nightly version of PyTorch following the linked guide on the contributing page.

From nightly

The nightly version of TorchData is also provided and is updated daily from the main branch.

Using pip:

pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu

Using conda:

conda install torchdata -c pytorch-nightly

Contributing

We welcome PRs! See the CONTRIBUTING file.

Beta Usage and Feedback

We'd love to hear from and work with early adopters to shape our designs. Please reach out by raising an issue if you're interested in using this tooling for your project.

License

TorchData is BSD licensed, as found in the LICENSE file.

Release history

Version	Release date
`0.11.0` (this version)	Feb 20, 2025
`0.10.1`	Dec 13, 2024
`0.10.0`	Dec 10, 2024
`0.9.0`	Oct 21, 2024
`0.8.0`	Jul 31, 2024
`0.7.1`	Nov 15, 2023
`0.7.0`	Oct 4, 2023
`0.6.1`	May 8, 2023
`0.6.0`	Mar 15, 2023
`0.5.1`	Dec 16, 2022
`0.5.0`	Oct 27, 2022
`0.4.1`	Aug 5, 2022
`0.4.0`	Jun 28, 2022
`0.3.0`	Mar 10, 2022
`0.3.0a1` (pre-release)	Feb 11, 2022
`0.3.0a0` (pre-release, yanked)	Dec 2, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distribution

torchdata-0.11.0-py3-none-any.whl (62.0 kB)

Uploaded Feb 20, 2025 Python 3

File details

Details for the file torchdata-0.11.0-py3-none-any.whl.

File metadata

Download URL: torchdata-0.11.0-py3-none-any.whl
Upload date: Feb 20, 2025
Size: 62.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.11.7

File hashes

Hashes for torchdata-0.11.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`52b940fbbe0e00fb21cabddf528449d1bec5bfb0d0823b7487b15f951658ee33`
MD5	`0dd7b97cb36fd06595d4962bf263c01e`
BLAKE2b-256	`95d4af694ef718aedbe95a72760ab9ff7a6a7a44ace2d7f70c27bfeb67c5c503`

See more details on using hashes here.
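One way to use the digests above is to hash a downloaded wheel locally and compare the result with the published SHA256 value. The following is a minimal sketch using only the standard library; the function names and the local file path in the usage note are assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Union

# SHA256 digest listed above for torchdata-0.11.0-py3-none-any.whl.
EXPECTED_SHA256 = "52b940fbbe0e00fb21cabddf528449d1bec5bfb0d0823b7487b15f951658ee33"


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Union[str, Path], expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("torchdata-0.11.0-py3-none-any.whl")` would return True only if a file at that (assumed) local path hashes to the digest in the table.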
