NVIDIA cuTENSOR
cuTENSOR is a high-performance CUDA library for tensor primitives.
Key Features
- Extensive mixed-precision support:
  - FP64 inputs with FP32 compute.
  - FP32 inputs with FP16, BF16, or TF32 compute.
  - Complex-times-real operations.
  - Conjugate (without transpose) support.
- Support for up to 64-dimensional tensors.
- Arbitrary data layouts.
- Trivially serializable data structures.
- Main computational routines:
  - Direct (i.e., transpose-free) tensor contractions.
  - Tensor reductions (including partial reductions).
  - Element-wise tensor operations:
    - Support for various activation functions.
    - Arbitrary tensor permutations.
    - Conversion between different data types.
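To make "direct (i.e., transpose-free) tensor contraction" concrete, here is a minimal pure-Python reference sketch of what such a contraction computes. It is not the cuTENSOR API and says nothing about how the library implements the operation; the function `contract`, its parameters, and the dict-based tensor representation are hypothetical names chosen for illustration. The point it shows is that each tensor is described by a list of mode labels, so an operand stored as `B[n,k]` can be used as-is without first transposing it to `B[k,n]`.

```python
from itertools import product


def contract(a, b, a_modes, b_modes, c_modes, extents):
    """Reference contraction: C[c_modes] = sum over shared modes of A * B.

    Hypothetical helper for illustration only (not a cuTENSOR call).
    Tensors are dicts that map index tuples to values; modes are
    single-character labels such as "m", "k", "n".
    """
    # Modes shared by both inputs but absent from the output are summed over.
    summed = [m for m in a_modes if m in b_modes and m not in c_modes]
    c = {}
    for c_idx in product(*(range(extents[m]) for m in c_modes)):
        env = dict(zip(c_modes, c_idx))
        total = 0.0
        for s_idx in product(*(range(extents[m]) for m in summed)):
            env.update(zip(summed, s_idx))
            # Each operand is read in its own stored mode order.
            a_key = tuple(env[m] for m in a_modes)
            b_key = tuple(env[m] for m in b_modes)
            total += a[a_key] * b[b_key]
        c[c_idx] = total
    return c


extents = {"m": 2, "k": 3, "n": 2}
a = {(i, k): float(i + k) for i in range(2) for k in range(3)}  # A[m,k]
b = {(j, k): float(j * k) for j in range(2) for k in range(3)}  # B[n,k]

# C[m,n] = sum_k A[m,k] * B[n,k]; B is consumed in its stored [n,k] order.
c = contract(a, b, "mk", "nk", "mn", extents)
print(c)
```

Because the mode labels carry the layout information, changing the stored order of an operand only changes the label string passed in, not the data.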
Documentation
Please refer to https://docs.nvidia.com/cuda/cutensor/index.html for the cuTENSOR documentation.
Built Distributions
Hashes for cutensor-1.6.0.3-py3-none-manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 03aa002df808785f879ca42511d41d61a997946ec553fb208a8c34bcd0ed143a
MD5 | 2c3feb83b64d506392450ebc95ae6f23
BLAKE2b-256 | b3d774e39dc3c13d6350c609e7b691de69ecc844b4f4723e1f95786f1f19e032
Hashes for cutensor-1.6.0.3-py3-none-manylinux2014_aarch64.whl
Algorithm | Hash digest
---|---
SHA256 | ab01a8712e9dc064898ba287d9357dcb5c2ba7c052995594492d5919c56d3e44
MD5 | 4e5c582ef8ef8e6b1cae17c7ffb98de9
BLAKE2b-256 | a9975bf36d3340fe8ae4e433afba219fde48d5f7cbbe6dbde7312f29c742ef0b
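The digests listed above can be used to check a downloaded wheel before installing it. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the sketch assumes the wheel has already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest against a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_digest` with the local path of `cutensor-1.6.0.3-py3-none-manylinux2014_x86_64.whl` and the SHA256 value from the first table should return `True` for an intact download. `pip` can perform the same check itself when a requirements file pins hashes and is installed with `--require-hashes`.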