HybridBackend

HybridBackend is a high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters.

Features

  • Memory-efficient loading of categorical data
  • GPU-efficient orchestration of embedding layers
  • Communication-efficient training and evaluation at scale
  • Easy to use with existing AI workflows

Usage

A minimal example:

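import tensorflow as tf
import hybridbackend.tensorflow as hb

# `filenames` and `batch_size` are placeholders for your Parquet files and batch size.
ds = hb.data.Dataset.from_parquet(filenames)
ds = ds.batch(batch_size)
# ...

# `weights` is the embedding variable; `input_ids` is a tf.SparseTensor of ids.
# TensorFlow 1.15 requires the `sp_weights` argument; None means unit weights.
with tf.device('/gpu:0'):
  embs = tf.nn.embedding_lookup_sparse(weights, input_ids, sp_weights=None)
  # ...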
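
To make the lookup in the example above concrete, here is a conceptual pure-Python sketch of what a sparse embedding lookup computes, assuming unit weights and the "mean" combiner. The function name and the list-based inputs are illustrative only and are not part of HybridBackend or TensorFlow; returning zeros for an example with no ids is also an assumption of this sketch.

```python
def embedding_lookup_sparse_mean(weights, ids_per_row):
    """Average the embedding rows selected by each example's ids.

    weights: list of embedding vectors (one list of floats per id).
    ids_per_row: one list of integer ids per example (may be empty).
    """
    dim = len(weights[0])
    out = []
    for ids in ids_per_row:
        if not ids:
            # Assumption of this sketch: an example with no ids maps to zeros.
            out.append([0.0] * dim)
            continue
        summed = [0.0] * dim
        for i in ids:
            for d in range(dim):
                summed[d] += weights[i][d]
        out.append([v / len(ids) for v in summed])
    return out


# Three ids with 2-dimensional embeddings; three examples with 2, 1 and 0 ids.
toy_weights = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
toy_result = embedding_lookup_sparse_mean(toy_weights, [[0, 2], [1], []])
print(toy_result)  # [[1.5, 1.0], [0.0, 1.0], [0.0, 0.0]]
```

The real operation runs on tensors and, in the example above, on the GPU; the sketch only shows the arithmetic per example.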

Please see documentation for more information.

Install

Method 1: Install from PyPI

pip install {PACKAGE}

{PACKAGE}                  Dependency       Python  CUDA  GLIBC   Data Opt.  Embedding Opt.  Parallelism Opt.
hybridbackend-tf115-cu121  TensorFlow 1.15  3.8     12.1  >=2.31
hybridbackend-tf115-cu100  TensorFlow 1.15  3.6     10.0  >=2.27
hybridbackend-tf115-cpu    TensorFlow 1.15  3.6     -     >=2.24
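
As a rough aid for choosing {PACKAGE}, the table can be encoded as data and queried by Python and CUDA version. This is only a sketch: `PACKAGES` and `pick_package` are hypothetical names introduced here, the rows are copied from the table above, and the glibc requirement is carried along but not checked.

```python
# Rows copied from the package table: (package, Python, CUDA or None for CPU-only, minimum glibc).
PACKAGES = [
    ("hybridbackend-tf115-cu121", "3.8", "12.1", "2.31"),
    ("hybridbackend-tf115-cu100", "3.6", "10.0", "2.27"),
    ("hybridbackend-tf115-cpu", "3.6", None, "2.24"),
]


def pick_package(python_version, cuda_version=None):
    """Return the package whose Python and CUDA versions match exactly, or None."""
    for name, py, cuda, _min_glibc in PACKAGES:
        if py == python_version and cuda == cuda_version:
            return name
    return None


print(pick_package("3.8", "12.1"))  # hybridbackend-tf115-cu121
print(pick_package("3.6"))          # hybridbackend-tf115-cpu
```

The returned name is what would replace {PACKAGE} in the pip command; a result of None means the table lists no matching build.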

Method 2: Build from source

See Building Instructions.

We also provide prebuilt Docker images for the latest DeepRec: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:1.0.0-deeprec-py3.6-cu114-ubuntu18.04

License

HybridBackend is licensed under the Apache 2.0 License.

Community

  • Please see the Contributing Guide before your first contribution.

  • Please register as an adopter if your organization is interested in adoption. We will discuss the roadmap with registered adopters in advance.

  • Please cite HybridBackend in your publications if it helps:

    @inproceedings{zhang2022picasso,
      title={PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems},
      author={Zhang, Yuanxing and Chen, Langshi and Yang, Siran and Yuan, Man and Yi, Huimin and Zhang, Jie and Wang, Jiamang and Dong, Jianbo and Xu, Yunlong and Song, Yue and others},
      booktitle={2022 IEEE 38th International Conference on Data Engineering (ICDE)},
      year={2022},
      organization={IEEE}
    }
    

Contact Us

If you would like to share your experiences with others, you are welcome to contact us on DingTalk.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distribution

hybridbackend_tf115_cpu-1.0.0.dev1708666397-cp36-cp36m-manylinux_2_24_x86_64.whl (34.9 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.24+ x86-64

File details

Details for the file hybridbackend_tf115_cpu-1.0.0.dev1708666397-cp36-cp36m-manylinux_2_24_x86_64.whl.

File metadata

  • Download URL: hybridbackend_tf115_cpu-1.0.0.dev1708666397-cp36-cp36m-manylinux_2_24_x86_64.whl
  • Upload date:
  • Size: 34.9 MB
  • Tags: CPython 3.6m, manylinux: glibc 2.24+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.9.6 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.10.1 urllib3/1.26.15 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.15

File hashes

Hashes for hybridbackend_tf115_cpu-1.0.0.dev1708666397-cp36-cp36m-manylinux_2_24_x86_64.whl
Algorithm    Hash digest
SHA256       353d374c47c4c886fa319eed94e11acf16bb9acb0a0133f56972ffec15b2d999
MD5          1b0f8f4718d47b998ff874d4b06975ab
BLAKE2b-256  c0014fd37c2d3c45e51a90921c0dd2e619fa3f5d50b509392851fc4287e0641a
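
A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_published_hash` are illustrative, and the path passed in is assumed to be wherever you saved the file.

```python
import hashlib

# SHA256 digest published above for the cp36 CPU wheel.
EXPECTED_SHA256 = "353d374c47c4c886fa319eed94e11acf16bb9acb0a0133f56972ffec15b2d999"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_published_hash` with the path of the downloaded wheel returns True only when the file is byte-for-byte identical to the published one.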
