cutensor-cu12

NVIDIA cuTENSOR

These details have been verified by PyPI

Maintainers

JeremyWangNVDA leofang mhohnerbach_nvidia mtjrider

These details have not been verified by PyPI

Project links

Homepage

Project description

cuTENSOR is a high-performance CUDA library for tensor primitives.

Key Features

- Extensive mixed-precision support:
  - FP64 inputs with FP32 compute.
  - FP32 inputs with FP16, BF16, or TF32 compute.
  - Complex-times-real operations.
  - Conjugate (without transpose) support.
- Support for up to 64-dimensional tensors.
- Arbitrary data layouts.
- Trivially serializable data structures.
- Main computational routines:
  - Direct (i.e., transpose-free) tensor contractions.
  - Tensor reductions (including partial reductions).
  - Element-wise tensor operations:
    - Support for various activation functions.
    - Arbitrary tensor permutations.
    - Conversion between different data types.

Documentation

Please refer to https://docs.nvidia.com/cuda/cutensor/index.html for the cuTENSOR documentation.

Installation

The cuTENSOR wheel can be installed as follows:

pip install cutensor-cuXX

where XX is the CUDA major version (CUDA 11 and 12 are currently supported). The package cutensor (without the -cuXX suffix) is considered deprecated. If you have cutensor installed, remove it before installing cutensor-cuXX.
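The installation rule above can be sketched in Python as a small pre-install check. This is only an illustration: the helper names `wheel_name`, `installed_version`, and `install_plan` are hypothetical and not part of cuTENSOR, and the sketch only builds the pip command strings rather than running them.

```python
from importlib import metadata

# CUDA major versions named as supported in the text above.
SUPPORTED_CUDA_MAJORS = (11, 12)


def wheel_name(cuda_major):
    """Return the cutensor-cuXX package name for a CUDA major version."""
    if cuda_major not in SUPPORTED_CUDA_MAJORS:
        raise ValueError(f"unsupported CUDA major version: {cuda_major}")
    return f"cutensor-cu{cuda_major}"


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def install_plan(cuda_major):
    """List the pip commands to run, removing the deprecated package first."""
    commands = []
    if installed_version("cutensor") is not None:
        # The suffix-less package is deprecated and should be removed first.
        commands.append("pip uninstall cutensor")
    commands.append(f"pip install {wheel_name(cuda_major)}")
    return commands


if __name__ == "__main__":
    for command in install_plan(12):
        print(command)
```

Run as a script, this prints the commands to execute for CUDA 12; the uninstall step appears only when the deprecated cutensor package is found in the current environment.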

Project details

Release history

- 2.0.2 (Jul 9, 2024)
- 2.0.1 (Feb 8, 2024)
- 2.0.0 (Nov 21, 2023)
- 1.7.0 (Apr 6, 2023)
- 1.6.2 (Jan 5, 2023), this version
- 0.0.1.dev0 (Dec 9, 2022), pre-release, yanked. Reason this release was yanked: placeholder

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

cutensor_cu12-1.6.2-py3-none-manylinux2014_x86_64.whl (142.6 MB)

Uploaded Jan 5, 2023 Python 3

cutensor_cu12-1.6.2-py3-none-manylinux2014_aarch64.whl (138.3 MB)

Uploaded Jan 5, 2023 Python 3
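pip normally picks the matching wheel on its own. For a manual download, the platform can be read from the last tag of the wheel filename. The following is a minimal sketch, assuming the five-field wheel filename layout (name, version, Python tag, ABI tag, platform tag) with no optional build tag; `parse_wheel` and `pick_wheel` are hypothetical helper names.

```python
import platform

# The two built distributions listed above.
WHEELS = [
    "cutensor_cu12-1.6.2-py3-none-manylinux2014_x86_64.whl",
    "cutensor_cu12-1.6.2-py3-none-manylinux2014_aarch64.whl",
]


def parse_wheel(filename):
    """Split a wheel filename into its five dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    stem = filename[: -len(".whl")]
    # Assumes no build tag, so exactly five fields are expected.
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def pick_wheel(machine):
    """Return the listed wheel whose platform tag ends with the machine name."""
    for filename in WHEELS:
        if parse_wheel(filename)["platform"].endswith("_" + machine):
            return filename
    return None


if __name__ == "__main__":
    # platform.machine() typically reports "x86_64" or "aarch64" on Linux.
    print(pick_wheel(platform.machine()))
```

On a machine that is neither x86_64 nor aarch64, `pick_wheel` returns None, which matches the fact that only those two wheels are published for this release.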

Hashes for cutensor_cu12-1.6.2-py3-none-manylinux2014_x86_64.whl

Algorithm	Hash digest
SHA256	`7bb044f32c408b8fe9020dda606dc75f1a6eff09d35a0c35832400ab6cb7233c`
MD5	`bc58d8eff90ab70cf27378854d8213e4`
BLAKE2b-256	`58657210cebebe46dfc2faeae6441cdf1bdfe73bb968340f486097a4ebb544f2`

Hashes for cutensor_cu12-1.6.2-py3-none-manylinux2014_aarch64.whl

Algorithm	Hash digest
SHA256	`b05d797bf674d46e3bec0e134985dc5acbe98b3b0a0c249d3cb02132d350a46c`
MD5	`2059a0ec1205ed7e997f562a67df13ba`
BLAKE2b-256	`89c6283800aa459fa8eda985a7ba258965752c6cc3c69d51ec7c2a8f03e80d57`
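The published digests can be used to check a downloaded wheel before installing it. The sketch below uses only the standard `hashlib` module and the SHA256 values from the tables above; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local file path is whatever location the wheel was saved to.

```python
import hashlib

# SHA256 digests copied from the tables above.
EXPECTED_SHA256 = {
    "cutensor_cu12-1.6.2-py3-none-manylinux2014_x86_64.whl":
        "7bb044f32c408b8fe9020dda606dc75f1a6eff09d35a0c35832400ab6cb7233c",
    "cutensor_cu12-1.6.2-py3-none-manylinux2014_aarch64.whl":
        "b05d797bf674d46e3bec0e134985dc5acbe98b3b0a0c249d3cb02132d350a46c",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads avoid loading a wheel of over 100 MB into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Compare a local file's digest with the published one for filename."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```

For example, `matches_published_hash("downloads/cutensor_cu12-1.6.2-py3-none-manylinux2014_x86_64.whl", "cutensor_cu12-1.6.2-py3-none-manylinux2014_x86_64.whl")` should return True for an intact download, where the `downloads/` directory is only a placeholder.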