
A framework for high-performance data analytics and machine learning.


Heat

High-performance data analytics in Python, at scale.

Getting Started | Tutorials | Docs | Contributing


Why Heat?

Heat is a distributed tensor framework built on PyTorch and mpi4py. It provides highly optimized algorithms and data structures for tensor computations using CPUs, GPUs (CUDA/ROCm), and distributed cluster systems. It is designed to handle massive arrays that exceed the memory and computational limits of a single machine.

  • Seamless integration: Port existing NumPy/SciPy code to multi-node clusters with minimal effort.
  • Hardware-agnostic: Supports CPUs and GPUs (CUDA, ROCm, Apple MPS).
  • Efficient scaling: Exploit the combined RAM of your entire cluster for memory-intensive operations.

Requirements

  • Python: >= 3.11
  • MPI: OpenMPI, MPICH, or Intel MPI
  • Frameworks: mpi4py >= 3.1, pytorch >= 2.3

Installation

# Via pip (with optional I/O support)
pip install 'heat[hdf5,netcdf,zarr]'

# Via conda-forge
conda install -c conda-forge heat

# Via easybuild (for HPC systems)
eb heat-<version>.eb --robot

# Via spack (for HPC systems)
spack install py-heat

Distributed Example

Heat handles inter-node communication automatically. Define how your data is partitioned across the cluster using the DNDarray.split attribute. Push computations to your GPUs with the DNDarray.device attribute. Heat will take care of the rest, ensuring efficient data movement and synchronization across nodes.

Here is an example from our Linear Algebra tutorial:


1. Create your script (my_script.py):

import heat as ht

# Distribute A along its rows (axis 0) and B along its columns (axis 1)
split_A = 0
split_B = 1
M, N, K = 10000, 10000, 10000

A = ht.random.randn(M, N, split=split_A, device="gpu")
B = ht.random.randn(N, K, split=split_B, device="gpu")

# Heat handles the inter-process communication of the distributed matmul
C = ht.matmul(A, B)
print(C)

2. Run with MPI:

On your laptop, e.g. with OpenMPI:

mpirun -np 4 python my_script.py

On an HPC cluster with SLURM:

srun --nodes=2 --ntasks-per-node=2 --gpus-per-node=2 python my_script.py

Contributing and Support

We welcome contributions from the community. Please see our Contribution Guidelines and the Code of Conduct.

For bug reports, feature requests, or general questions, please use GitHub Issues or Discussions.

Citations

Citations are essential for the sustainability of this project. If Heat supports your work, please cite our main paper:

Götz, M., et al. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). IEEE. DOI: 10.1109/BigData50022.2020.9378050.

BibTeX
@inproceedings{heatBigData2020,
  author={Götz, Markus and Debus, Charlotte and Coquelin, Daniel and Krajsek, Kai and Comito, Claudia and Knechtges, Philipp and Hagemeier, Björn and Tarnawa, Michael and Hanselmann, Simon and Siggel, Martin and Basermann, Achim and Streit, Achim},
  booktitle={2020 IEEE International Conference on Big Data (Big Data)},
  title={HeAT – a Distributed and GPU-accelerated Tensor Framework for Data Analytics},
  year={2020},
  volume={},
  number={},
  pages={276-287},
  keywords={Heating systems;Industries;Data analysis;Big Data;Parallel processing;Libraries;Arrays;HeAT;Tensor Framework;High-performance Computing;PyTorch;NumPy;Message Passing Interface;GPU;Big Data Analytics;Machine Learning;Dask;Model Parallelism;Parallel Application Frameworks},
  doi={10.1109/BigData50022.2020.9378050}}

Acknowledgments

This work was funded by the Helmholtz Association Initiative and Networking Fund (Project ZT-I-0003, "Helmholtz Analytics Framework"); the Helmholtz AI platform grant; the European Space Agency (ESA) (Programme 4000144045); the Helmholtz Association Science Serve call 2025 (Project DB002891, HeatHub); and the Google Summer of Code 2022 program.

License

Heat is distributed under the MIT license. See the LICENSE file for details.
