HybridBackend
HybridBackend is a high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters.
Features
- Memory-efficient loading of categorical data
- GPU-efficient orchestration of embedding layers
- Communication-efficient training and evaluation at scale
- Easy to use with existing AI workflows
Usage
A minimal example:
import tensorflow as tf
import hybridbackend.tensorflow as hb
ds = hb.data.ParquetDataset(filenames, batch_size=batch_size)
ds = ds.apply(hb.data.parse())
# ...
with tf.device('/gpu:0'):
    embs = tf.nn.embedding_lookup_sparse(weights, input_ids)
# ...
Please see the documentation for more information.
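To make the embedding step in the minimal example more concrete, here is a small pure-Python sketch of what a sparse embedding lookup with a mean combiner computes: each example carries a variable number of categorical ids, and the rows for those ids are averaged into one vector. The function name `lookup_sparse_mean` and the toy data are hypothetical. This is not HybridBackend's or TensorFlow's implementation, and returning a zero vector for an example with no ids is a simplifying assumption.

```python
# Illustrative model only: not the TensorFlow or HybridBackend implementation.

def lookup_sparse_mean(weights, ids_per_example):
    """Average the embedding rows selected by each example's ids.

    weights: list of embedding rows, each a list of floats.
    ids_per_example: list of lists; each inner list holds the row ids
        (categorical feature values) present in one example.
    """
    dim = len(weights[0])
    combined = []
    for ids in ids_per_example:
        if not ids:
            # Assumption: an example without ids maps to a zero vector.
            combined.append([0.0] * dim)
            continue
        summed = [0.0] * dim
        for row_id in ids:
            for d in range(dim):
                summed[d] += weights[row_id][d]
        combined.append([value / len(ids) for value in summed])
    return combined


# Toy embedding table with three rows of dimension two (hypothetical data).
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Three examples: two ids, one id, and no ids.
input_ids = [[0, 2], [1], []]
embs = lookup_sparse_mean(weights, input_ids)
print(embs)
```

The variable number of ids per example is what makes categorical inputs sparse, and it is why the lookup combines rows rather than returning a fixed-size gather.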
Install
Method 1: Pull container images from PAI DLC
docker pull registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:{TAG}
{TAG} | TensorFlow | Python | CUDA | OS | Columnar Data Loading | Embedding Orchestration | Hybrid Parallelism |
---|---|---|---|---|---|---|---|
0.7-tf1.15-py3.8-cu114-ubuntu20.04 | 1.15 | 3.8 | 11.4 | Ubuntu 20.04 | ✓ | ✓ | ✓ |
Method 2: Install from PyPI
pip install {PACKAGE}
{PACKAGE} | TensorFlow | Python | CUDA | GLIBC | Columnar Data Loading | Embedding Orchestration | Hybrid Parallelism |
---|---|---|---|---|---|---|---|
hybridbackend-tf115-cu114 * | 1.15 | 3.8 | 11.4 | >=2.31 | ✓ | ✓ | ✓ |
hybridbackend-tf115-cu100 | 1.15 | 3.6 | 10.0 | >=2.27 | ✓ | ✓ | ✗ |
hybridbackend-tf115-cpu | 1.15 | 3.6 | - | >=2.24 | ✓ | ✗ | ✗ |

* nvidia-pyindex must be installed first.
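As a rough aid for reading the PyPI table, the sketch below encodes its three rows and returns the package whose Python and CUDA versions match exactly and whose minimum GLIBC version is satisfied. The helper `choose_package` and the `PACKAGES` list are hypothetical and not part of HybridBackend. Exact matching on Python and CUDA versions is an assumption drawn from the table, not a documented compatibility rule.

```python
# Rows transcribed from the PyPI table:
# (package name, Python version, CUDA version or None for CPU, minimum GLIBC).
PACKAGES = [
    ("hybridbackend-tf115-cu114", "3.8", "11.4", (2, 31)),
    ("hybridbackend-tf115-cu100", "3.6", "10.0", (2, 27)),
    ("hybridbackend-tf115-cpu", "3.6", None, (2, 24)),
]


def choose_package(python, cuda, glibc):
    """Return the matching package name, or None if no table row fits.

    python: "major.minor" string, e.g. "3.8".
    cuda: "major.minor" string, or None for a CPU-only environment.
    glibc: (major, minor) tuple, e.g. (2, 31).
    """
    for name, required_python, required_cuda, min_glibc in PACKAGES:
        if (python == required_python
                and cuda == required_cuda
                and glibc >= min_glibc):
            return name
    return None


print(choose_package("3.8", "11.4", (2, 31)))
print(choose_package("3.6", None, (2, 28)))
```

The name it returns is what would be substituted for `{PACKAGE}` in the `pip install` command. On a Linux host, the inputs could be derived from `sys.version_info` and `platform.libc_ver()`.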
Method 3: Build from source
License
HybridBackend is licensed under the Apache 2.0 License.
Community
- Please see the Contributing Guide before your first contribution.
- Please register as an adopter if your organization is interested in adoption. We will discuss the RoadMap with registered adopters in advance.
- Please cite HybridBackend in your publications if it helps:

  @inproceedings{zhang2022picasso,
    title={PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems},
    author={Zhang, Yuanxing and Chen, Langshi and Yang, Siran and Yuan, Man and Yi, Huimin and Zhang, Jie and Wang, Jiamang and Dong, Jianbo and Xu, Yunlong and Song, Yue and others},
    booktitle={2022 IEEE 38th International Conference on Data Engineering (ICDE)},
    year={2022},
    organization={IEEE}
  }
Contact Us
If you would like to share your experiences with others, you are welcome to contact us on DingTalk.