Vineyard integration with machine learning frameworks
vineyard-ml: Accelerating Data Science Pipelines
Vineyard has been tightly integrated with the data preprocessing pipelines in
widely-adopted machine learning frameworks like PyTorch, TensorFlow, and MXNet.
Shared objects in vineyard, e.g., vineyard::Tensor, vineyard::DataFrame,
vineyard::Table, etc., can be directly used as the inputs of the training
and inference tasks in these frameworks.
Examples
The following example shows how a DataFrame in vineyard can be used as the
input of a Dataset for PyTorch:
import os

import numpy as np
import pandas as pd
import torch
import vineyard

# connect to vineyard, see also: https://v6d.io/notes/getting-started.html
client = vineyard.connect(os.environ['VINEYARD_IPC_SOCKET'])

# generate a dummy dataframe in vineyard
df = pd.DataFrame({
    # multi-dimensional array as a column
    'data': vineyard.data.dataframe.NDArrayArray(np.random.rand(1000, 10)),
    'label': np.random.rand(1000)
})
object_id = client.put(df)

# take it as a torch dataset
from vineyard.contrib.ml.torch import torch_context
with torch_context():
    # ds is a `torch.utils.data.TensorDataset`
    ds = client.get(object_id)

# or, you can use datapipes from torchdata
from vineyard.contrib.ml.torch import datapipe
pipe = datapipe(ds)

# use the datapipes in your training loop
for data, label in pipe:
    # do something
    pass
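The comment in the example above states that `ds` is a `torch.utils.data.TensorDataset`. Assuming that holds, it can also be batched with a standard PyTorch `DataLoader`. The sketch below is a stand-in that needs no running vineyard server: it builds a `TensorDataset` directly from NumPy arrays shaped like the dummy dataframe (1000 rows, a 10-wide `data` column, a scalar `label`). The helpers `make_dummy_dataset` and `batch_shapes` are hypothetical names used for illustration only and are not part of vineyard.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_dummy_dataset(rows=1000, width=10, seed=0):
    """Build a TensorDataset shaped like the dummy dataframe (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    data = torch.from_numpy(rng.random((rows, width)))   # shape: (rows, width)
    label = torch.from_numpy(rng.random(rows))           # shape: (rows,)
    return TensorDataset(data, label)


def batch_shapes(dataset, batch_size=32):
    """Iterate the dataset in batches and collect the (data, label) batch shapes."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    return [(tuple(data.shape), tuple(label.shape)) for data, label in loader]


# Assumption: this stands in for `client.get(object_id)` under `torch_context()`.
ds = make_dummy_dataset()
shapes = batch_shapes(ds)
```

In a real pipeline, the `ds` returned by `client.get(object_id)` would take the place of the stand-in dataset, and the loop body would run a training step instead of recording shapes.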
Reference and Implementation
- torch: including PyTorch datasets, torcharrow and torchdata.
- tensorflow
- mxnet
For more details about vineyard itself, please refer to the Vineyard project.
Built Distribution
Hashes for vineyard_ml-0.19.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 052c7230b8b72a5e86032caa4b891a1f31e8e3ef33d4cf913df2ff1f7d2877e7
MD5 | 0159f91aed761cde2c84558d3e7a82b6
BLAKE2b-256 | 196db80991978515c3db53018773ad01447657d5d18c9e63bcf7376289e91034