A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters
HybridBackend
HybridBackend is a high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters.
Features
- Memory-efficient loading of categorical data
- GPU-efficient orchestration of embedding layers
- Communication-efficient training and evaluation at scale
- Easy to use with existing AI workflows
Usage
A minimal example:
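import tensorflow as tf
import hybridbackend.tensorflow as hb

# Read batches directly from Parquet files.
ds = hb.data.ParquetDataset(filenames, batch_size=batch_size)
ds = ds.apply(hb.data.parse())
# ...
with tf.device('/gpu:0'):
    # TensorFlow 1.15 requires the sp_weights argument; None means unweighted IDs.
    embs = tf.nn.embedding_lookup_sparse(weights, input_ids, sp_weights=None)
# ...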
Please see the documentation for more information.
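To make the embedding step in the example above more concrete, here is a minimal pure-Python sketch of what a sparse embedding lookup computes: for each example, the embedding rows selected by its categorical IDs are combined into one vector. The helper `lookup_sparse` is hypothetical and is not part of TensorFlow or HybridBackend. It only illustrates the `sum` and `mean` combiners, omits per-ID weights, and returns zeros for an example with no IDs as a choice of this sketch.

```python
from typing import List, Sequence


def lookup_sparse(
    table: Sequence[Sequence[float]],
    ids_per_example: Sequence[Sequence[int]],
    combiner: str = "mean",
) -> List[List[float]]:
    """Combine the embedding rows selected by each example's IDs."""
    if combiner not in ("sum", "mean"):
        raise ValueError(f"unsupported combiner: {combiner}")
    dim = len(table[0])
    out = []
    for ids in ids_per_example:
        # Accumulate the selected rows element by element.
        combined = [0.0] * dim
        for i in ids:
            for d in range(dim):
                combined[d] += table[i][d]
        # Average only when the example actually has IDs.
        if combiner == "mean" and ids:
            combined = [v / len(ids) for v in combined]
        out.append(combined)
    return out


# Toy embedding table with 3 rows of dimension 2 (illustrative values).
table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Three examples with a variable number of categorical IDs each.
batch = [[0, 2], [1], []]

print(lookup_sparse(table, batch, combiner="sum"))
print(lookup_sparse(table, batch, combiner="mean"))
```

The variable number of IDs per example is what makes categorical features sparse, and it is the reason the lookup takes sparse inputs rather than a dense ID matrix.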
Install
Method 1: Pull container images from PAI DLC
docker pull registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:{TAG}
{TAG} | TensorFlow | Python | CUDA | OS | Columnar Data Loading | Embedding Orchestration | Hybrid Parallelism |
---|---|---|---|---|---|---|---|
0.7-tf1.15-py3.8-cu116-ubuntu20.04 | 1.15 | 3.8 | 11.6 | Ubuntu 20.04 | ✓ | ✓ | ✓ |
Method 2: Install from PyPI
pip install {PACKAGE}
{PACKAGE} | TensorFlow | Python | CUDA | GLIBC | Columnar Data Loading | Embedding Orchestration | Hybrid Parallelism |
---|---|---|---|---|---|---|---|
hybridbackend-tf115-cu114 * | 1.15 | 3.8 | 11.4 | >=2.31 | ✓ | ✓ | ✓ |
hybridbackend-tf115-cu100 | 1.15 | 3.6 | 10.0 | >=2.27 | ✓ | ✓ | ✗ |
hybridbackend-tf115-cpu | 1.15 | 3.6 | - | >=2.24 | ✓ | ✗ | ✗ |

* nvidia-pyindex must be installed first.
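As a rough aid for choosing a package, the following sketch filters the table above by the local Python and GLIBC versions. The helper names `as_tuple` and `candidates` are hypothetical, the package data is copied from the table, and the sketch assumes that the Python column means an exact minor-version match. It does not check the CUDA version, which still has to be compared by hand.

```python
import platform
from typing import List, Tuple

# (package, Python, CUDA, minimum GLIBC), copied from the table above.
PACKAGES = [
    ("hybridbackend-tf115-cu114", "3.8", "11.4", "2.31"),
    ("hybridbackend-tf115-cu100", "3.6", "10.0", "2.27"),
    ("hybridbackend-tf115-cpu", "3.6", None, "2.24"),
]


def as_tuple(version: str) -> Tuple[int, ...]:
    """Turn '2.31' into (2, 31) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def candidates(python: str, glibc: str) -> List[str]:
    """List packages whose Python and minimum GLIBC fit the given versions."""
    return [
        name
        for name, py, _cuda, min_glibc in PACKAGES
        if py == python and as_tuple(glibc) >= as_tuple(min_glibc)
    ]


if __name__ == "__main__":
    python = ".".join(platform.python_version_tuple()[:2])
    libc, version = platform.libc_ver()
    if libc == "glibc" and version:
        print(candidates(python, version))
    else:
        print("Could not detect glibc; check the table manually.")
```

Comparing versions as integer tuples rather than strings matters here, because a string comparison would rank "2.4" above "2.31".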
Method 3: Build from source
License
HybridBackend is licensed under the Apache 2.0 License.
Community
- Please see the Contributing Guide before your first contribution.
- Please register as an adopter if your organization is interested in adoption. We will discuss the roadmap with registered adopters in advance.
- Please cite HybridBackend in your publications if it helps:
@inproceedings{zhang2022picasso,
  title={PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems},
  author={Zhang, Yuanxing and Chen, Langshi and Yang, Siran and Yuan, Man and Yi, Huimin and Zhang, Jie and Wang, Jiamang and Dong, Jianbo and Xu, Yunlong and Song, Yue and others},
  booktitle={2022 IEEE 38th International Conference on Data Engineering (ICDE)},
  year={2022},
  organization={IEEE}
}
Contact Us
If you would like to share your experiences with others, you are welcome to contact us on DingTalk: