Lightning support for Intel Habana accelerators
Lightning ⚡ Intel Habana
Habana® Gaudi® AI training processors (HPUs) are built on a heterogeneous architecture that combines a cluster of fully programmable Tensor Processing Cores (TPCs), together with their associated development tools and libraries, and a configurable Matrix Math engine.
Each TPC is a VLIW SIMD processor with an instruction set and hardware tailored to serve training workloads efficiently. The Gaudi memory architecture includes on-die SRAM and local memories in each TPC. Gaudi is also the first DL training processor with RDMA over Converged Ethernet (RoCE v2) engines integrated on-chip.
On the software side, the PyTorch Habana bridge interfaces between the framework and the SynapseAI software stack to enable the execution of deep learning models on the Habana Gaudi device.
Gaudi offers a substantial price/performance advantage, so you can do more deep learning training while spending less.
For more information, check out Gaudi Architecture and Gaudi Developer Docs.
Installation
pip install -U lightning lightning-habana
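After installing, it can be useful to confirm that both distributions are visible to the interpreter you plan to train with. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of either package.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report the state of the two distributions named in the install command.
for name in ("lightning", "lightning-habana"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```

A `not installed` line usually means `pip` ran against a different Python environment than the one executing the script.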
Usage
To enable PyTorch Lightning to use the HPU accelerator, pass accelerator="hpu" to the Trainer class.
from lightning import Trainer
# run on as many Gaudi devices as available by default
trainer = Trainer(accelerator="auto", devices="auto", strategy="auto")
# equivalent to
trainer = Trainer()
# run on one Gaudi device
trainer = Trainer(accelerator="hpu", devices=1)
# run on multiple Gaudi devices
trainer = Trainer(accelerator="hpu", devices=8)
# choose the number of devices automatically
trainer = Trainer(accelerator="hpu", devices="auto")
Setting devices>1 with HPUs enables the Habana accelerator for distributed training. It uses HPUParallelStrategy internally, which is based on the DDP strategy with the addition of Habana's collective communication library (HCCL) to support scale-up within a node and scale-out across multiple nodes.
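The scale-up and scale-out settings can be sketched as a small helper that builds the Trainer arguments. This is an illustration under stated assumptions, not the package's own API: `hpu_trainer_kwargs` and `world_size` are hypothetical names, `num_nodes` is a standard Lightning Trainer argument that the text above does not mention, and the process count assumes one process per device, as is usual for DDP-style strategies.

```python
def hpu_trainer_kwargs(devices_per_node: int = 8, num_nodes: int = 1) -> dict:
    """Build keyword arguments for lightning.Trainer targeting Gaudi devices.

    devices_per_node > 1 corresponds to scale-up within a node;
    num_nodes > 1 corresponds to scale-out across nodes.
    """
    if devices_per_node < 1 or num_nodes < 1:
        raise ValueError("devices_per_node and num_nodes must both be >= 1")
    return {
        "accelerator": "hpu",
        "devices": devices_per_node,
        "num_nodes": num_nodes,
    }


def world_size(kwargs: dict) -> int:
    """Total number of training processes, assuming one process per device."""
    return kwargs["devices"] * kwargs["num_nodes"]


# Two nodes with eight Gaudi devices each.
kwargs = hpu_trainer_kwargs(devices_per_node=8, num_nodes=2)
print(kwargs, "world size:", world_size(kwargs))

# On a machine with lightning and lightning-habana installed:
# from lightning import Trainer
# trainer = Trainer(**kwargs)
```

Keeping the arguments in a plain dictionary lets the same script run on a single device during debugging and on a full cluster later by changing only the two numbers.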
Hashes for lightning-habana-1.0.0rc1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 55d77f865b4867586a887597fd0a6b39f60036f69977a2c65cce1c01cba1ed63
MD5 | 4001a21761db5bb2baf60bd6c9a880fd
BLAKE2b-256 | c72efd3b984c6a9497c434dcfaac3881f3f26b4d5392729eeb69801e39a3cf60
Hashes for lightning_habana-1.0.0rc1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a9346b8424c5fc06ecf80ef77db796ef57e0ede8f2072037ce411b2b93ce7bca
MD5 | dccac9395001530969692e5e569662a1
BLAKE2b-256 | 2aba203896efbe913c7e576d3a2c864cc281c8ca15cd8a193d0449984b320181
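A downloaded file can be checked against the digests listed above with the standard library. The sketch below is one way to do it: the helper names `file_digest` and `matches` are hypothetical, and the local path is assumed to be the wheel file name in the current directory. BLAKE2b-256 is treated as BLAKE2b with a 32-byte digest.

```python
import hashlib
import hmac
import os

# SHA256 digest of the wheel, copied from the table above.
EXPECTED_WHEEL_SHA256 = "a9346b8424c5fc06ecf80ef77db796ef57e0ede8f2072037ce411b2b93ce7bca"


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    if algorithm == "blake2b-256":
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with an expected hex string."""
    return hmac.compare_digest(file_digest(path, algorithm), expected_hex.lower())


# Assumed local file name; adjust to wherever the wheel was downloaded.
wheel = "lightning_habana-1.0.0rc1-py3-none-any.whl"
if os.path.exists(wheel):
    print("SHA256 ok:", matches(wheel, EXPECTED_WHEEL_SHA256))
```

A mismatch means the file differs from the one the digest was published for and should not be installed.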