Tensor contractions for AWS Trainium via NKI

trntensor

Einstein summation with contraction planning, plus CP and Tucker decompositions. trntensor expresses scientific tensor workloads directly as contractions instead of decomposing them into GEMM calls. Part of the trnsci scientific computing suite (github.com/trnsci).

Current phase

trntensor follows the trnsci 5-phase roadmap. Active work is tracked in phase-labeled GitHub issues.

Suite-wide tracker: trnsci/trnsci#1.

Install

pip install trntensor

Usage

import torch
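import trntensor

# Example inputs (illustrative shapes only; substitute your own tensors)
A, B = torch.randn(64, 32), torch.randn(32, 16)
B_i, B_j = torch.randn(8, 24), torch.randn(8, 24)
C_occ, eri = torch.randn(32, 8), torch.randn(32, 32, 24)
tensor = torch.randn(20, 20, 20)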

# Einsum — drop-in for torch.einsum with contraction planning
C = trntensor.einsum("ij,jk->ik", A, B)           # matmul
T = trntensor.einsum("ap,bp->ab", B_i, B_j)       # DF-MP2 contraction
X = trntensor.einsum("mi,mnP->inP", C_occ, eri)   # AO→MO transform

# Contraction planning
plan = trntensor.plan_contraction("ij,jk->ik", A, B)
flops = trntensor.estimate_flops("ij,jk->ik", A, B)

# CP decomposition (tensor hypercontraction)
factors, weights = trntensor.cp_decompose(tensor, rank=10)
reconstructed = trntensor.cp_reconstruct(factors, weights)

# Tucker decomposition (HOSVD)
core, factors = trntensor.tucker_decompose(tensor, ranks=(5, 5, 5))
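
To make the planning calls above more concrete, here is a minimal sketch of how a FLOP count can be derived from a subscript string and operand shapes. It is not trntensor's implementation: `naive_einsum_flops` is a hypothetical helper, and counting two FLOPs (one multiply, one add) per point of the full index space is an assumed convention that may differ from what `estimate_flops` reports.

```python
from math import prod


def naive_einsum_flops(subscripts, *shapes):
    """Estimate FLOPs as 2 * (product of the extents of every distinct index).

    Hypothetical helper for illustration only; not part of trntensor.
    """
    terms = subscripts.split("->")[0].split(",")
    if len(terms) != len(shapes):
        raise ValueError("number of operands does not match the subscripts")

    extents = {}
    for term, shape in zip(terms, shapes):
        if len(term) != len(shape):
            raise ValueError(f"subscript {term!r} does not match shape {shape}")
        for label, size in zip(term, shape):
            # The same label must have the same extent in every operand.
            if extents.setdefault(label, size) != size:
                raise ValueError(f"inconsistent extent for index {label!r}")

    # One multiply and one add per point of the full index space.
    return 2 * prod(extents.values())


matmul_flops = naive_einsum_flops("ij,jk->ik", (64, 32), (32, 16))      # 65536
ao_mo_flops = naive_einsum_flops("mi,mnP->inP", (32, 8), (32, 32, 24))  # 393216
```

The helper works on shapes alone, so it needs neither torch nor a Trainium device. A real planner would also weigh memory traffic and kernel dispatch, which this sketch ignores.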

Operations

Category       Operation         Description
Contraction    einsum            General tensor contraction
Contraction    multi_einsum      Multiple contractions (fusion-ready)
Planning       plan_contraction  Analyze and select strategy
Planning       estimate_flops    FLOPs for a contraction
Decomposition  cp_decompose      CP/PARAFAC via ALS
Decomposition  tucker_decompose  Tucker via HOSVD
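
For readers unfamiliar with "Tucker via HOSVD", the technique can be sketched in a few lines. The version below uses NumPy rather than torch or NKI so that it runs anywhere. The names `unfold`, `hosvd`, and `tucker_reconstruct` are hypothetical, and nothing here is claimed to match how `tucker_decompose` is implemented internally.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def hosvd(tensor, ranks):
    """Truncated HOSVD: one SVD per mode, then project the tensor onto the factors."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])  # leading left singular vectors

    core = tensor
    for mode, factor in enumerate(factors):
        # Contract axis `mode` of the core with factor^T, keeping the axis order.
        core = np.moveaxis(np.tensordot(factor.T, core, axes=(1, mode)), 0, mode)
    return core, factors


def tucker_reconstruct(core, factors):
    """Multiply the core by each factor along its mode."""
    out = core
    for mode, factor in enumerate(factors):
        out = np.moveaxis(np.tensordot(factor, out, axes=(1, mode)), 0, mode)
    return out


rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 6))
core, factors = hosvd(X, ranks=(2, 3, 2))
approx = tucker_reconstruct(core, factors)  # same shape as X, lower multilinear rank
```

With `ranks` equal to the full shape, the factors are orthogonal and the reconstruction is exact up to floating-point error; smaller ranks trade accuracy for a compact core.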

Status

  • Einsum with matmul/bmm/torch dispatch
  • Contraction planner
  • CP decomposition (ALS)
  • Tucker decomposition (HOSVD)
  • DF-MP2 einsum example
  • NKI fused contraction kernels (mp2_energy, ao_to_mo_transform)
  • XLA operand residency (to_xla / from_xla)
  • NKI CPU simulator + nki-simulator CI gate
  • Optimal contraction ordering — greedy path search for 3+ operands
  • Multi-contraction shared-operand XLA pre-pinning
  • Contraction plan cache (clear_plan_cache / plan_cache_info)
  • Alpha/beta scaling for einsum (cuTENSOR GEMM-style)
  • Input validation with descriptive errors
  • Tensor Train (TT) decomposition (TT-SVD)
  • Non-negative CP + warm-start CP
  • PEP 561 py.typed marker
  • Mixed precision / dtype override for bf16/fp16
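
The "greedy path search for 3+ operands" item in the list above can be illustrated with a small pure-Python sketch. It is an assumption-laden illustration, not the planner shipped in trntensor: `greedy_path` is a hypothetical function, and the cost model (product of the extents of all indices touched by a pair) is one common choice among several.

```python
from itertools import combinations
from math import prod


def greedy_path(terms, output, extents):
    """Pick, at every step, the pair of operands that is cheapest to contract.

    terms   : subscript strings, one per operand, e.g. ["ij", "jk", "kl"]
    output  : subscript string of the final result, e.g. "il"
    extents : dict mapping each index label to its size
    Returns (steps, total_flops); each step is a pair of positions in the
    operand list as it stands at that step (results are appended at the end).
    """
    operands = [set(t) for t in terms]
    steps, total_flops = [], 0
    while len(operands) > 1:
        best = None
        for i, j in combinations(range(len(operands)), 2):
            # Indices still needed later: in the output or in another operand.
            keep = set(output).union(
                *(op for k, op in enumerate(operands) if k not in (i, j))
            )
            involved = operands[i] | operands[j]
            cost = prod(extents[x] for x in involved)
            if best is None or cost < best[0]:
                best = (cost, i, j, involved & keep)
        cost, i, j, result = best
        steps.append((i, j))
        total_flops += 2 * cost
        # Pop the higher position first so the lower one stays valid.
        operands.pop(j)
        operands.pop(i)
        operands.append(result)
    return steps, total_flops


sizes = {"i": 10, "j": 1000, "k": 10, "l": 1000}
steps, flops = greedy_path(["ij", "jk", "kl"], "il", sizes)
# steps == [(0, 1), (0, 1)], flops == 400_000
```

For this chain, contracting all three operands in one pass would cost 2 * 10 * 1000 * 10 * 1000 = 200,000,000 FLOPs under the same convention, so the pairwise order matters a great deal. Greedy search is not guaranteed to find the optimal order; it is a cheap heuristic.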

Related Projects

Project    What
trnfft     FFT + complex ops
trnblas    BLAS operations
trnsolver  Linear solvers
trnrand    Random number generation
trnsparse  Sparse operations

License

Apache 2.0 — Copyright 2026 Scott Friedman

Disclaimer

trnsci is an independent open-source project. It is not sponsored by, endorsed by, or affiliated with Amazon.com, Inc., Amazon Web Services, Inc., or Annapurna Labs Ltd.

"AWS", "Amazon", "Trainium", "Inferentia", "NeuronCore", "Neuron SDK", and related identifiers are trademarks of their respective owners and are used here solely for descriptive and interoperability purposes. Use does not imply endorsement, partnership, or any other relationship.

All work, opinions, analyses, benchmark results, architectural commentary, and editorial judgments in this repository and on trnsci.dev are those of the project's contributors. They do not represent the views, positions, or commitments of Amazon, AWS, or Annapurna Labs.

Feedback directed at the Neuron SDK or Trainium hardware is good-faith ecosystem commentary from independent users. It is not privileged information, is not pre-reviewed by AWS, and should not be read as authoritative about product roadmap, behavior, or quality.

For official AWS guidance, see aws-neuron documentation and the AWS Trainium product page.
