A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters

HybridBackend

HybridBackend is a high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters.

Features

  • Memory-efficient loading of categorical data

  • GPU-efficient orchestration of embedding layers

  • Communication-efficient training and evaluation at scale

  • Easy to use with existing AI workflows

Usage

A minimal example:

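import tensorflow as tf
import hybridbackend.tensorflow as hb

# filenames and batch_size are supplied by the caller.
# Read batches directly from Parquet files.
ds = hb.data.ParquetDataset(filenames, batch_size=batch_size)
# Convert the loaded batches to sparse tensors.
ds = ds.apply(hb.data.to_sparse())
# ...

# Place the embedding lookup on the first GPU.
with tf.device('/gpu:0'):
  # weights holds the embedding table; input_ids is a SparseTensor of IDs.
  # sp_weights is a required argument in TensorFlow 1.15; None weights all IDs equally.
  embs = tf.nn.embedding_lookup_sparse(weights, input_ids, sp_weights=None)
  # ...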

Please see documentation for more information.
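
To make the lookup step in the example more concrete, here is a minimal pure-Python sketch of what a combined sparse embedding lookup computes: each example carries a variable-length list of categorical IDs, and the selected embedding rows are pooled into one vector per example. The helper lookup_sparse and its arguments are hypothetical names used only for illustration; this is not how TensorFlow or HybridBackend implements the operation.

```python
# Illustration only: a pure-Python model of a pooled sparse embedding lookup.
# lookup_sparse is a hypothetical helper, not a TensorFlow or HybridBackend API.

def lookup_sparse(table, ids_per_row, combiner="mean"):
    """Pool the embedding rows selected by each example's ID list.

    table:       list of embedding vectors, indexed by categorical ID
    ids_per_row: one variable-length list of IDs per example
    combiner:    "sum" or "mean"
    """
    if combiner not in ("sum", "mean"):
        raise ValueError("combiner must be 'sum' or 'mean'")
    dim = len(table[0])
    out = []
    for ids in ids_per_row:
        acc = [0.0] * dim
        for i in ids:
            for d in range(dim):
                acc[d] += table[i][d]
        # An example with no IDs keeps the all-zero vector.
        if combiner == "mean" and ids:
            acc = [v / len(ids) for v in acc]
        out.append(acc)
    return out


table = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]  # 3 IDs, embedding dimension 2
batch = [[0, 2], [1], []]                     # examples with 2, 1 and 0 IDs
pooled_sum = lookup_sparse(table, batch, combiner="sum")
pooled_mean = lookup_sparse(table, batch, combiner="mean")
```

The ragged batch is the reason the example converts its input to sparse tensors before the lookup: examples do not share a fixed number of IDs.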

Install

Method 1: Pull container images from PAI DLC

docker pull registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:{TAG}

{TAG}                               TensorFlow  Python  CUDA  OS            Columnar Data Loading  Embedding Orchestration  Hybrid Parallelism
0.6-tf1.15-py3.8-cu114-ubuntu20.04  1.15        3.8     11.4  Ubuntu 20.04
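
As a small sketch of how the {TAG} placeholder expands, the helper below assembles the full image reference from the column values. The tag layout (version, tf, py, cu, OS joined by hyphens) is inferred from the single tag listed here and is an assumption; image_reference is a hypothetical helper, and other combinations may not exist in the registry.

```python
# Registry path taken from the docker pull command above.
REGISTRY = "registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend"


def image_reference(version, tensorflow, python, cuda, os_name):
    """Build an image reference; the tag layout is inferred from one listed tag."""
    cuda_part = "cu" + cuda.replace(".", "")         # "11.4" -> "cu114"
    os_part = os_name.lower().replace(" ", "")       # "Ubuntu 20.04" -> "ubuntu20.04"
    tag = "-".join([version, "tf" + tensorflow, "py" + python, cuda_part, os_part])
    return REGISTRY + ":" + tag


ref = image_reference("0.6", "1.15", "3.8", "11.4", "Ubuntu 20.04")
```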

Method 2: Install from PyPI

pip install {PACKAGE}

{PACKAGE}                    TensorFlow  Python  CUDA  GLIBC   Columnar Data Loading  Embedding Orchestration  Hybrid Parallelism
hybridbackend-tf115-cu114 *  1.15        3.8     11.4  >=2.31
hybridbackend-tf115-cu100    1.15        3.6     10.0  >=2.27
hybridbackend-tf115-cpu      1.15        3.6     -     >=2.24

* nvidia-pyindex must be installed first
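
The package table can be read as a simple compatibility lookup. The sketch below encodes its rows and returns the first package whose Python version, CUDA version and minimum GLIBC match a given environment. pick_package and parse_glibc are hypothetical helpers written for this illustration; they are not part of HybridBackend or pip.

```python
# Rows copied from the table above: (package, Python, CUDA, minimum GLIBC).
PACKAGES = [
    ("hybridbackend-tf115-cu114", "3.8", "11.4", (2, 31)),
    ("hybridbackend-tf115-cu100", "3.6", "10.0", (2, 27)),
    ("hybridbackend-tf115-cpu", "3.6", None, (2, 24)),  # None means no CUDA
]


def parse_glibc(text):
    """Turn a version string such as "2.31" into a comparable tuple (2, 31)."""
    return tuple(int(part) for part in text.split(".")[:2])


def pick_package(python, cuda, glibc):
    """Return the first listed package matching the environment, or None."""
    for name, py, cu, min_glibc in PACKAGES:
        if py == python and cu == cuda and glibc >= min_glibc:
            return name
    return None


choice = pick_package("3.8", "11.4", parse_glibc("2.31"))
```

On Linux, the standard library call platform.libc_ver() is one way to obtain the GLIBC version string to pass to parse_glibc.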

Method 3: Build from source

See Building Instructions.

License

HybridBackend is licensed under the Apache 2.0 License.

Community

  • Please see Contributing Guide before your first contribution.

  • Please register as an adopter if your organization is interested in adopting HybridBackend. We will discuss the roadmap with registered adopters in advance.

  • Please cite HybridBackend in your publications if it helps:

    @inproceedings{zhang2022picasso,
      title={PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems},
      author={Zhang, Yuanxing and Chen, Langshi and Yang, Siran and Yuan, Man and Yi, Huimin and Zhang, Jie and Wang, Jiamang and Dong, Jianbo and Xu, Yunlong and Song, Yue and others},
      booktitle={2022 IEEE 38th International Conference on Data Engineering (ICDE)},
      year={2022},
      organization={IEEE}
    }
    

Contact Us

If you would like to share your experiences with others, you are welcome to contact us on DingTalk.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp38-cp38-manylinux_2_31_x86_64.whl (62.2 MB)

Uploaded CPython 3.8 manylinux: glibc 2.31+ x86-64

hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp36-cp36m-manylinux_2_27_x86_64.whl (49.1 MB)

Uploaded CPython 3.6m manylinux: glibc 2.27+ x86-64
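
The wheel file names above follow the standard wheel naming convention: distribution, version, an optional build tag, Python tag, ABI tag and platform tag, separated by hyphens. A minimal sketch of splitting such a name into its parts follows; parse_wheel_filename is a hypothetical helper for illustration, and real tooling should prefer a maintained parser.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its hyphen-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components: " + filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename(
    "hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp38-cp38-manylinux_2_31_x86_64.whl"
)
```

Here the platform tag manylinux_2_31_x86_64 is what encodes the glibc 2.31+ requirement shown for the file.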

File details

Details for the file hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp38-cp38-manylinux_2_31_x86_64.whl.

File hashes

Hashes for hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp38-cp38-manylinux_2_31_x86_64.whl
Algorithm    Hash digest
SHA256       cdd28f4aa3232176d9872da1824756ee16f15e2c77a3818c7ccdd972abfc92eb
MD5          11cbf08c17daab29679e38eaf90dd054
BLAKE2b-256  017bc4107988470a5f09b9a76d056d63ea876cb821cdee70f69095bbbdb8dd66

File details

Details for the file hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp36-cp36m-manylinux_2_27_x86_64.whl.

File hashes

Hashes for hybridbackend_tf115_cu114-0.6.1b1.dev1661427296-cp36-cp36m-manylinux_2_27_x86_64.whl
Algorithm    Hash digest
SHA256       a61b5bcbce04202dae3060b37da397a10b7ea5a900c494e5e29b990240678343
MD5          b5336c5c83b323ec0252b8f556bb66b3
BLAKE2b-256  e63935c4c9ea5839f12d03e47268a75aa491922a4695e18573fe06c35f739e3c
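
A downloaded wheel can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard hashlib module; sha256_of and matches_published_hash are hypothetical helper names, and the file path is whatever location the wheel was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash(wheel_path, digest) returns True only when the file on disk matches the SHA256 value from the corresponding table.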
