PyTorch Pipeline: Simple ETL Pipeline for PyTorch
PyTorch Pipeline is a simple ETL framework for PyTorch. It is an alternative to tf.data in TensorFlow.
Requirements
- Python 3.6+
- PyTorch 1.2+
Installation
To install PyTorch Pipeline:
pip install pytorch-pipeline
Basic Usage
import pytorch_pipeline as pp

d = pp.TextDataset('/path/to/your/text')
# Shuffle with a 100-element buffer, group into batches of 10, and take the first batch
d.shuffle(buffer_size=100).batch(batch_size=10).first()
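For intuition about what `shuffle(buffer_size=...)` and `batch(batch_size=...)` do to a stream, here is a minimal pure-Python sketch. It assumes buffered shuffling in the style of tf.data (fill a fixed-size buffer, emit a random element, refill the slot). `buffered_shuffle` and `batched` are hypothetical helper names, not part of PyTorch Pipeline, and the library's real implementation may differ.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def buffered_shuffle(items: Iterable[T], buffer_size: int,
                     seed: Optional[int] = None) -> Iterator[T]:
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and reuse its slot for the new item.
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: drain the remaining elements in random order.
    rng.shuffle(buffer)
    yield from buffer


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group consecutive items into lists of batch_size (the last may be shorter)."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# Hypothetical stand-in for the pipeline above: shuffle, batch, take the first batch.
stream = buffered_shuffle(range(1000), buffer_size=100, seed=0)
first_batch = next(batched(stream, batch_size=10))
```

Under this assumption, a buffer smaller than the dataset only shuffles locally: with `buffer_size=100`, the first batch can only contain elements from near the start of the stream. A buffer at least as large as the dataset gives a full shuffle.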
Usage with PyTorch
from torch.utils.data import DataLoader
import pytorch_pipeline as pp

d = pp.Dataset(range(1_000)).parallel().shuffle(100).batch(10)
# The identity collate_fn passes items through as-is instead of stacking them into tensors
loader = DataLoader(d, num_workers=4, collate_fn=lambda x: x)
for x in loader:
    ...
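With iterable-style datasets, each DataLoader worker process works on its own copy of the dataset, so without some form of sharding every worker would replay the whole stream. The sketch below shows one common way to split a stream across workers, using round-robin slicing. It is an assumption about what a step like `.parallel()` needs to achieve, not PyTorch Pipeline's actual code, and `shard` is a hypothetical helper. In a real worker, the two numbers would come from `torch.utils.data.get_worker_info()`.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard(items: Iterable[T], worker_id: int, num_workers: int) -> Iterator[T]:
    """Return every num_workers-th item, starting at offset worker_id."""
    return islice(items, worker_id, None, num_workers)


# Simulate four workers, each reading its own copy of the same stream.
num_workers = 4
shards = [list(shard(range(10), worker_id, num_workers))
          for worker_id in range(num_workers)]
```

Taken together, the shards cover every element exactly once, which is the property a multi-worker loader needs to avoid duplicated samples.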
Usage with LineFlow
You can use PyTorch Pipeline with pre-defined datasets in LineFlow:
from torch.utils.data import DataLoader
from lineflow.datasets.wikitext import cached_get_wikitext
import pytorch_pipeline as pp

dataset = cached_get_wikitext('wikitext-2')

# Preprocess the dataset: split each line into tokens, append '<eos>',
# cut the token stream into windows of 35, then shuffle and batch
train_data = pp.Dataset(dataset['train']) \
    .flat_map(lambda x: x.split() + ['<eos>']) \
    .window(35) \
    .parallel() \
    .shuffle(64 * 100) \
    .batch(64)

# Iterate over the dataset
loader = DataLoader(train_data, num_workers=4, collate_fn=lambda x: x)
for x in loader:
    ...
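The `flat_map` and `window` steps above turn a stream of text lines into fixed-length token sequences. A minimal pure-Python sketch of that idea follows, assuming `window` produces non-overlapping windows and keeps a shorter trailing window. Both are assumptions, since the window semantics are not spelled out here. The helper functions below are hypothetical stand-ins, not the library's methods.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def flat_map(items: Iterable[T], func: Callable[[T], Iterable[U]]) -> Iterator[U]:
    """Apply func to each item and flatten the results into one stream."""
    for item in items:
        yield from func(item)


def window(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Cut a stream into consecutive, non-overlapping windows of `size` items."""
    current: List[T] = []
    for item in items:
        current.append(item)
        if len(current) == size:
            yield current
            current = []
    if current:
        # Assumption: keep the shorter trailing window rather than dropping it.
        yield current


lines = ["the cat sat", "on the mat"]
tokens = flat_map(lines, lambda x: x.split() + ["<eos>"])
windows = list(window(tokens, 3))
```

In this toy run the two lines become eight tokens, which are cut into two full windows of three tokens and one trailing window of two.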