Heat
High-performance data analytics in Python, at scale.
Getting Started | Tutorials | Docs | Contributing
Why Heat?
Heat is a distributed tensor framework built on PyTorch and mpi4py. It provides highly optimized algorithms and data structures for tensor computations using CPUs, GPUs (CUDA/ROCm), and distributed cluster systems. It is designed to handle massive arrays that exceed the memory and computational limits of a single machine.
- Seamless integration: Port existing NumPy/SciPy code to multi-node clusters with minimal effort.
- Hardware-agnostic: Supports CPUs and GPUs (CUDA, ROCm, Apple MPS).
- Efficient scaling: Exploit the entire, cumulative RAM of your cluster for memory-intensive operations.
Requirements
- Python: >= 3.11
- MPI: OpenMPI, MPICH, or Intel MPI
- Frameworks: mpi4py >= 3.1, PyTorch >= 2.3
Installation
# Via pip (with optional I/O support)
pip install 'heat[hdf5,netcdf,zarr]'
# Via conda-forge
conda install -c conda-forge heat
# Via easybuild (for HPC systems)
eb heat-<version>.eb --robot
# Via spack (for HPC systems)
spack install py-heat
Distributed Example
Heat handles inter-node communication automatically. Define how your data is partitioned across the cluster using the DNDarray.split attribute. Push computations to your GPUs with the DNDarray.device attribute. Heat will take care of the rest, ensuring efficient data movement and synchronization across nodes.
Here is an example from our Linear Algebra tutorial:
1. Create your script (my_script.py):
import heat as ht

# Split axes: A is distributed along its rows, B along its columns.
split_A = 0
split_B = 1

M, N, K = 10000, 10000, 10000

A = ht.random.randn(M, N, split=split_A, device="gpu")
B = ht.random.randn(N, K, split=split_B, device="gpu")

# Distributed matrix multiplication; Heat performs the required
# communication between processes automatically.
C = ht.matmul(A, B)
print(C)
2. Run with MPI:
On your laptop, e.g. with OpenMPI:
mpirun -np 4 python my_script.py
On an HPC cluster with SLURM:
srun --nodes=2 --ntasks-per-node=2 --gpus-per-node=2 python my_script.py
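For intuition: with split=0, the 10000 rows of A are divided as evenly as possible among the MPI processes. The per-process row counts can be sketched in plain Python (this mirrors even chunking along the split axis, not Heat's internal API):

```python
def chunk_sizes(n_rows, n_procs):
    """Rows each process holds when an axis of length n_rows
    is split as evenly as possible across n_procs processes."""
    base, rem = divmod(n_rows, n_procs)
    # lower ranks absorb the remainder, one extra row each
    return [base + 1 if rank < rem else base for rank in range(n_procs)]

print(chunk_sizes(10000, 4))   # [2500, 2500, 2500, 2500]
print(chunk_sizes(10, 4))      # [3, 3, 2, 2]
```

With 4 processes in the example above, each rank therefore holds a 2500 x 10000 block of A in local memory.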
Contributing and Support
We welcome contributions from the community. Please see our Contribution Guidelines and the Code of Conduct.
For bug reports, feature requests, or general questions, please use GitHub Issues or Discussions.
Citations
Citations are essential for the sustainability of this project. If Heat supports your work, please cite our main paper:
Götz, M., et al. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). IEEE. DOI: 10.1109/BigData50022.2020.9378050.
BibTeX
@inproceedings{heatBigData2020,
  author={Götz, Markus and Debus, Charlotte and Coquelin, Daniel and Krajsek, Kai and Comito, Claudia and Knechtges, Philipp and Hagemeier, Björn and Tarnawa, Michael and Hanselmann, Simon and Siggel, Martin and Basermann, Achim and Streit, Achim},
  booktitle={2020 IEEE International Conference on Big Data (Big Data)},
  title={{HeAT} – a Distributed and GPU-accelerated Tensor Framework for Data Analytics},
  year={2020},
  pages={276-287},
  doi={10.1109/BigData50022.2020.9378050}
}
Acknowledgments
This work was funded by the Helmholtz Association Initiative and Networking Fund (Project ZT-I-0003, "Helmholtz Analytics Framework"); the Helmholtz AI platform grant; the European Space Agency (ESA, Programme 4000144045); the Helmholtz Association Science Serve call 2025 (Project DB002891, HeatHub); and the Google Summer of Code 2022 program.
License
Heat is distributed under the MIT license. See the LICENSE file for details.