Deep Learning Training Acceleration with Bagua and Lightning AI

Lightning ⚡ Bagua

Bagua is a deep learning training acceleration framework which supports multiple advanced distributed training algorithms including:

  • Gradient AllReduce for centralized synchronous communication, where gradients are averaged among all workers.
  • Decentralized SGD for decentralized synchronous communication, where each worker exchanges data with one or a few specific workers.
  • ByteGrad and QAdam for low-precision communication, where data is compressed to low precision before it is communicated.
  • Asynchronous Model Average for asynchronous communication, where workers are not required to stay synchronized in lockstep within the same iteration.
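
The communication patterns in the list above can be sketched in plain Python, with no GPUs or Bagua involved. This is only a toy illustration: the function names are hypothetical, the ring topology is one assumed choice of "specific workers", and the min-max 8-bit quantizer is a simple stand-in for low-precision compression rather than Bagua's actual implementation.

```python
from typing import List, Tuple

Vector = List[float]


def allreduce_average(grads: List[Vector]) -> List[Vector]:
    """Centralized synchronous step: every worker ends up with the global mean."""
    n = len(grads)
    mean = [sum(column) / n for column in zip(*grads)]
    return [list(mean) for _ in grads]


def ring_peer_average(params: List[Vector]) -> List[Vector]:
    """Decentralized step: worker i averages only with its ring neighbour i + 1."""
    n = len(params)
    return [
        [(a + b) / 2 for a, b in zip(params[i], params[(i + 1) % n])]
        for i in range(n)
    ]


def quantize_8bit(vec: Vector) -> Tuple[List[int], float, float]:
    """Compress floats to integer codes in 0..255 plus an offset and a scale."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale for constant vectors
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale


def dequantize_8bit(codes: List[int], lo: float, scale: float) -> Vector:
    """Approximate reconstruction on the receiving worker."""
    return [lo + c * scale for c in codes]


# Four simulated workers, each holding a two-element gradient.
worker_grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
synced = allreduce_average(worker_grads)   # identical on every worker
mixed = ring_peer_average(worker_grads)    # differs from worker to worker
codes, lo, scale = quantize_8bit(worker_grads[0])
```

After one `allreduce_average` call all workers agree exactly, whereas `ring_peer_average` only moves each worker toward its neighbour, so agreement emerges over repeated rounds. The quantizer trades a bounded rounding error for sending one byte per value instead of a full float.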

By default, Bagua uses the Gradient AllReduce algorithm, which is also the algorithm implemented in PyTorch DistributedDataParallel (DDP). Bagua can usually achieve higher training throughput, however, thanks to its backend written in Rust.

Installation

pip install -U lightning-bagua
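
To confirm that the install succeeded, one option is to query the package metadata from Python. The helper below is a small sketch using only the standard library; `installed_version` is a hypothetical name, not part of lightning-bagua.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "lightning-bagua") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "lightning-bagua is not installed")
```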

Usage

Simply set the strategy argument in the Trainer:

from lightning import Trainer

# train on 4 GPUs (using Bagua mode)
trainer = Trainer(strategy="bagua", accelerator="gpu", devices=4)

See Bagua Tutorials for more details on installation and advanced features.
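
To use one of the non-default algorithms listed earlier, the strategy can be passed as an object instead of a string. The sketch below assumes that the package exposes a `BaguaStrategy` class accepting an `algorithm` keyword, and that the algorithm names in `ASSUMED_ALGORITHMS` match those in the installed version; both should be checked against the Bagua Tutorials. `make_bagua_trainer` is a hypothetical helper, and the imports are deferred so the snippet loads even on a machine without Lightning or Bagua.

```python
# Algorithm names assumed from the Lightning/Bagua integration docs; verify
# them against the version you have installed.
ASSUMED_ALGORITHMS = (
    "gradient_allreduce",
    "bytegrad",
    "decentralized",
    "low_precision_decentralized",
    "qadam",
    "async",
)


def make_bagua_trainer(algorithm="gradient_allreduce", devices=4, **strategy_kwargs):
    """Build a Trainer that uses Bagua with the chosen algorithm (sketch)."""
    if algorithm not in ASSUMED_ALGORITHMS:
        raise ValueError(
            f"unknown algorithm {algorithm!r}; expected one of {ASSUMED_ALGORITHMS}"
        )

    # Deferred imports: these need lightning and lightning-bagua installed.
    from lightning import Trainer
    from lightning_bagua import BaguaStrategy  # assumed import path

    strategy = BaguaStrategy(algorithm=algorithm, **strategy_kwargs)
    return Trainer(strategy=strategy, accelerator="gpu", devices=devices)
```

On a machine with 4 GPUs, `make_bagua_trainer("bytegrad")` would then request low-precision gradient communication. Some algorithms may need extra setup, such as a dedicated optimizer or additional strategy arguments, so the helper forwards any extra keyword arguments unchanged.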
