Library for dense and sparse tensors built on the tensor algebra compiler
Tensora
A fast and easy-to-use dense/sparse tensor library for Python. Evaluate arbitrary tensor expressions with tensors in a variety of sparsity formats.
Tensors are n-dimensional generalizations of matrices. Instead of being confined to 1 or 2 dimensions, tensors may have 3, 4, or more dimensions. They are useful in a variety of applications. NumPy is the best known tensor library in Python; its central ndarray object is an example of a dense tensor.
In a dense tensor, each element is explicitly stored in memory. If the vast majority of elements are zero, then this is an inefficient layout, taking far more memory to store and far more time to operate on. There are many different sparse tensor formats, each one better or worse depending on which elements of the tensor are nonzero.
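The storage difference can be sketched in plain Python. This is only an illustration of the idea, not how Tensora stores tensors; the helper names `dense_storage` and `to_dok` are hypothetical.

```python
def dense_storage(dimensions):
    """Number of values a dense layout stores: one per element, zeros included."""
    count = 1
    for size in dimensions:
        count *= size
    return count


def to_dok(rows):
    """Collect only the nonzero entries of a dense matrix, keyed by coordinate."""
    return {
        (i, j): value
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if value != 0
    }


dense = [
    [0.0, -2.0, 0.0],
    [2.0, 0.0, 4.0],
]
sparse = to_dok(dense)

stored_dense = dense_storage((2, 3))  # every element is stored
stored_sparse = len(sparse)           # only the nonzeros are stored
```

For this tiny matrix the saving is small, but the gap grows with the fraction of zeros.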
Many tensor kernels (functions that perform a specific algebraic calculation between tensors with specific sparse formats) have been written to solve specific problems. Recently, the Tensor Algebra Compiler (TACO) was invented to automate the construction and optimization of tensor kernels for arbitrary algebraic expressions for arbitrary sparse formats. TACO takes an algebraic expression and description of the format of each tensor in the expression and returns a C function that efficiently evaluates the given expression for those tensor arguments.
Tensora is a Python wrapper around two tensor algebra compilers: the original TACO library and an independent Python implementation in Tensora. Tensora has a central class Tensor that simply holds a pointer to a taco tensor held in C memory, managed by the cffi package. Tensora exposes functions that take a string of an algebraic expression and return a Python function that performs that operation in fast C code. To do that, the string is parsed and passed to the tensor algebra compiler; the C code generated by the tensor algebra compiler is compiled "on the fly" by cffi and then wrapped by code that provides good error handling.
Tensora comes with a command line tool tensora, which provides the C code to the user for given algebraic expressions and tensor formats.
Getting started
Tensora can be installed with pip from PyPI:
pip install tensora
The class Tensor and the function evaluate together provide the porcelain interface for Tensora.
Here is an example of multiplying a sparse matrix in CSR format with a dense vector:
from tensora import Tensor, evaluate
elements = {
    (1,0): 2.0,
    (0,1): -2.0,
    (1,2): 4.0,
}
A = Tensor.from_dok(elements, dimensions=(2,3), format='ds')
x = Tensor.from_lol([0, -1, 2])
y = evaluate('y(i) = A(i,j) * x(j)', 'd', A=A, x=x)
assert y == Tensor.from_lol([2,8])
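To make the example above concrete, here is a plain-Python sketch of the kind of computation a CSR matrix-vector kernel performs. It does not use Tensora and is not the generated C code; the helper names `dok_to_csr` and `csr_matvec` and the array names `pos`, `crd`, and `vals` are assumptions chosen for illustration.

```python
def dok_to_csr(dok, dimensions):
    """Pack a dictionary of keys into CSR arrays (row pointers, column indexes, values)."""
    n_rows, _ = dimensions
    pos = [0] * (n_rows + 1)
    crd = []
    vals = []
    # Sorting the coordinates visits the entries row by row, column by column.
    for (i, j), value in sorted(dok.items()):
        pos[i + 1] += 1
        crd.append(j)
        vals.append(value)
    # Turn the per-row counts into cumulative offsets.
    for i in range(n_rows):
        pos[i + 1] += pos[i]
    return pos, crd, vals


def csr_matvec(pos, crd, vals, x):
    """Compute y(i) = A(i,j) * x(j), touching only the stored nonzeros of A."""
    y = [0.0] * (len(pos) - 1)
    for i in range(len(pos) - 1):
        for p in range(pos[i], pos[i + 1]):
            y[i] += vals[p] * x[crd[p]]
    return y


elements = {(1, 0): 2.0, (0, 1): -2.0, (1, 2): 4.0}
pos, crd, vals = dok_to_csr(elements, (2, 3))
y = csr_matvec(pos, crd, vals, [0, -1, 2])
```

The result matches the Tensora example: row 0 contributes -2.0 * -1 and row 1 contributes 2.0 * 0 + 4.0 * 2.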
Creating Tensors
Creating a Tensor is best done via the Tensor.from_* methods. These methods convert a variety of data types into a Tensor. Most of the conversion methods optionally take both dimensions and format to determine the dimensions and format of the resulting tensor.
from_lol: list of lists
Tensor.from_lol(lol, *,
    dimensions: tuple[int, ...] = None, format: Format | str = None)
Convert a dense list of lists to a Tensor.
- lol is a list of lists, possibly deeply nested. That is, lol is a float, a list[float], a list[list[float]], etc. to an arbitrary depth of lists. The values are read in row-major format, meaning the top-level list is the first dimension and the deepest list (the one containing actual scalars) is the last dimension. All lists at the same level must have the same length. For those familiar, this is identical to the NumPy behavior when constructing an array from lists of lists via numpy.array.
- dimensions has a default value that is inferred from the structure of lol. If provided, it must be consistent with the structure of lol. Providing the dimensions is typically only useful when one or more non-final dimensions may have size zero. For example, Tensor.from_lol([[], []]) has dimensions of (2,0), while Tensor.from_lol([[], []], dimensions=(2,0,3)) has dimensions of (2,0,3).
- format has a default value of all dense dimensions.
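The inference of dimensions from a list of lists can be sketched as follows. This is a minimal illustration of the rule described above, not Tensora's implementation; the function name `infer_dimensions` is hypothetical.

```python
def infer_dimensions(lol):
    """Infer the dimensions of a nested list, reading it in row-major order."""
    if not isinstance(lol, list):
        # A bare scalar is a tensor of order 0.
        return ()
    if len(lol) == 0:
        # Nothing below an empty list can be inferred, so the shape stops here.
        return (0,)
    inner = [infer_dimensions(item) for item in lol]
    if any(dims != inner[0] for dims in inner):
        raise ValueError("all lists at the same level must have the same length")
    return (len(lol), *inner[0])


scalar_dims = infer_dimensions(3.0)
vector_dims = infer_dimensions([0, -1, 2])
matrix_dims = infer_dimensions([[1, 2, 3], [4, 5, 6]])
empty_dims = infer_dimensions([[], []])
```

The last case shows why an explicit dimensions argument is useful: the structure `[[], []]` only reveals `(2, 0)`, so a trailing dimension such as the 3 in `(2, 0, 3)` has to be supplied by the caller.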
from_dok: dictionary of keys
Tensor.from_dok(dok: dict[tuple[int, ...], float], *,
    dimensions: tuple[int, ...] = None, format: Format | str = None)
Convert a dictionary of keys to a Tensor.
- dok is a Python dictionary where each key is the coordinate of one non-zero value and the value of the entry is the value of the tensor at that coordinate. All coordinates not mentioned are implicitly zero.
- dimensions has a default value that is the largest size in each dimension found among the coordinates.
- format has a default value of dense dimensions as long as the number of nonzeros is larger than the product of those dimensions, and sparse dimensions after that. The default value is subject to change with experience.
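One plausible reading of the default for dimensions is sketched below. It assumes zero-based coordinates, so the size of each dimension is the largest index seen plus one; the function name `default_dimensions` and the choice to reject an empty dictionary are assumptions of this sketch, not documented Tensora behavior.

```python
def default_dimensions(dok):
    """Smallest dimensions that contain every coordinate, assuming zero-based indexes."""
    coordinates = list(dok)
    if not coordinates:
        # Assumption of this sketch: nothing can be inferred without coordinates.
        raise ValueError("cannot infer dimensions from an empty dictionary")
    order = len(coordinates[0])
    return tuple(max(c[k] for c in coordinates) + 1 for k in range(order))


elements = {(1, 0): 2.0, (0, 1): -2.0, (1, 2): 4.0}
dims = default_dimensions(elements)
```

For the elements used in the getting-started example, this yields the same `(2, 3)` that was passed there explicitly.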
from_aos: array of structs
Tensor.from_aos(aos: Iterable[tuple[int, ...]], values: Iterable[float], *,
    dimensions: tuple[int, ...] = None, format: Format | str = None)
Convert a list of coordinates and a corresponding list of values to a Tensor.
- aos is an iterable of the coordinates of the non-zero values.
- values must be the same length as aos and each value is the non-zero value at the corresponding coordinate.
- dimensions has the same default as Tensor.from_dok, the largest size in each dimension.
- format has the same default as Tensor.from_dok, dense for as many dimensions as needed to fit the non-zeros.
from_soa: struct of arrays
Tensor.from_soa(soa: tuple[Iterable[int], ...], values: Iterable[float], *,
    dimensions: tuple[int, ...] = None, format: Format | str = None)
Convert lists of indexes for each dimension and a corresponding list of values to a Tensor.
- soa is a tuple of iterables, where each iterable is all the indexes of the corresponding dimension. All iterables must be the same length.
- values must be the same length as the iterables in soa and each value is the non-zero value at the corresponding coordinate.
- dimensions has the same default as Tensor.from_dok, the largest size in each dimension.
- format has the same default as Tensor.from_dok, dense for as many dimensions as needed to fit the non-zeros.
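The array-of-structs and struct-of-arrays layouts carry the same information as a dictionary of keys, just arranged differently. The sketch below converts between them in plain Python; the helper names `aos_to_soa`, `soa_to_aos`, and `to_dok` are hypothetical and are not part of Tensora.

```python
def aos_to_soa(aos):
    """Transpose coordinate tuples into one index list per dimension."""
    return tuple(list(axis) for axis in zip(*aos))


def soa_to_aos(soa):
    """Transpose per-dimension index lists back into coordinate tuples."""
    if len({len(axis) for axis in soa}) > 1:
        raise ValueError("all index lists must have the same length")
    return list(zip(*soa))


def to_dok(aos, values):
    """Pair each coordinate with its value."""
    aos = list(aos)
    values = list(values)
    if len(aos) != len(values):
        raise ValueError("aos and values must have the same length")
    return dict(zip(aos, values))


aos = [(1, 0), (0, 1), (1, 2)]
values = [2.0, -2.0, 4.0]

soa = aos_to_soa(aos)        # one list of row indexes, one of column indexes
round_trip = soa_to_aos(soa)
dok = to_dok(aos, values)
```

Note that an empty `aos` transposes to an empty tuple, which loses the order of the tensor. In that situation the dimensions would have to be given explicitly.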
from_numpy: convert a NumPy array
Tensor.from_numpy(array: numpy.ndarray, *,
    format: Format | str = None)
Convert a NumPy array to a Tensor.
- array is any numpy.ndarray. The resulting Tensor will have the same order, dimensions, and values as this array.
- format has a default value of all dense dimensions.
from_scipy_sparse: convert a SciPy sparse matrix
Tensor.from_scipy_sparse(data: scipy.sparse.spmatrix, *,
    format: Format | str = None)
Convert a SciPy sparse matrix to a Tensor.
- matrix is any scipy.sparse.spmatrix. The resulting Tensor will have the same order, dimensions, and values as this matrix. The tensor will always have order 2.
- format has a default value of ds for csr_matrix, d1s0 for csc_matrix, and ds for the other sparse matrix types, though that is subject to change as Tensora adds new format mode types.
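The format strings ds and d1s0 can be read as one mode letter per storage level, optionally followed by the dimension stored at that level. The parser below is a sketch of that reading only: the grammar (d for dense, s for sparse, digits for the ordering) is inferred from the examples on this page, and `parse_format` is a hypothetical helper, not Tensora's own parser.

```python
import re


def parse_format(text):
    """Split a format string into (mode, dimension) pairs, one per storage level."""
    if re.fullmatch(r"(?:[ds]\d*)*", text) is None:
        raise ValueError(f"unrecognized format string: {text!r}")
    pairs = re.findall(r"([ds])(\d*)", text)
    modes = [mode for mode, _ in pairs]
    digits = [digit for _, digit in pairs]
    if all(digit == "" for digit in digits):
        # No explicit ordering: levels follow the natural dimension order.
        ordering = list(range(len(modes)))
    elif all(digit != "" for digit in digits):
        ordering = [int(digit) for digit in digits]
    else:
        raise ValueError("either every level or no level may name a dimension")
    if sorted(ordering) != list(range(len(modes))):
        raise ValueError("the ordering must name each dimension exactly once")
    return list(zip(modes, ordering))


csr = parse_format("ds")
csc = parse_format("d1s0")
```

Under this reading, ds stores dimension 0 (rows) densely and dimension 1 (columns) sparsely, which is CSR, while d1s0 stores columns densely and rows sparsely, which is CSC.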
Evaluating expressions
Tensora generates kernels for algebraic expressions of tensors and wraps them in the evaluate function.
evaluate(assignment: str, output_format: str, **inputs: Tensor)
- assignment is parsable as an algebraic tensor assignment.
- output_format is the desired format of the output tensor.
- inputs are all the inputs to the expression. There must be one named argument for each variable name in assignment. The dimensions of the tensors in inputs must be consistent with assignment and with each other.
There are also evaluate_tensora and evaluate_taco, which have identical interfaces but use different tensor algebra compilers. evaluate is an alias for the default, which is currently evaluate_tensora.
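The consistency requirement on inputs can be illustrated with a small checker that binds each index name in the assignment to a size. This is a simplified sketch and not Tensora's parser: it only recognizes accesses of the form name(i,j,...), and the names `ACCESS` and `index_sizes` are hypothetical.

```python
import re

# Matches a tensor access such as A(i,j) and captures the name and index list.
ACCESS = re.compile(r"([A-Za-z_]\w*)\(([^()]*)\)")


def index_sizes(assignment, dimensions):
    """Bind each index on the right-hand side to a size, rejecting conflicts.

    dimensions maps each input tensor name to its tuple of dimensions.
    """
    _, right = assignment.split("=", 1)
    sizes = {}
    for name, indexes in ACCESS.findall(right):
        index_names = [part.strip() for part in indexes.split(",") if part.strip()]
        if name not in dimensions:
            raise KeyError(f"no input provided for {name}")
        dims = dimensions[name]
        if len(dims) != len(index_names):
            raise ValueError(f"{name} has order {len(dims)}, not {len(index_names)}")
        for index_name, size in zip(index_names, dims):
            # The first use of an index fixes its size; later uses must agree.
            if sizes.setdefault(index_name, size) != size:
                raise ValueError(f"index {index_name} has conflicting sizes")
    return sizes


sizes = index_sizes("y(i) = A(i,j) * x(j)", {"A": (2, 3), "x": (3,)})
output_dimensions = (sizes["i"],)
```

Here j is bound to 3 by A and confirmed by x, so the inputs are consistent and the output y(i) has dimensions (2,). Passing an x of length 4 would raise a ValueError in this sketch.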
Getting the C code
The tensora CLI tool emits the C code for a given algebraic expression, tensor formats, and kernel type. It can emit the C code it generates itself, or the C code generated by TACO. It comes installed with Tensora and can be run like this:
tensora 'y(i) = A(i,j) * x(j)' -f A:ds -t compute -o kernel.c
Here is the output of tensora --help for reference:
Usage: tensora [OPTIONS] ASSIGNMENT
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────╮
│ * assignment TEXT The assignment for which to generate code, e.g. y(i) = A(i,j) │
│ * x(j). │
│ [required] │
╰──────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────╮
│ --format -f TEXT A tensor and its format │
│ separated by a colon, e.g. │
│ A:d1s0 for CSC matrix. │
│ Unmentioned tensors are be │
│ assumed to be all dense. │
│ --type -t [assemble|compute|evaluate] The type of kernel that will │
│ be generated. Can be │
│ mentioned multiple times. │
│ [default: compute] │
│ --compiler -c [tensora|taco] The tensor algebra compiler │
│ to use to generate the │
│ kernel. │
│ [default: tensora] │
│ --output -o PATH The file to which the kernel │
│ will be written. If not │
│ specified, prints to standard │
│ out. │
│ [default: None] │
│ --install-completion [bash|zsh|fish|powershell|p Install completion for the │
│ wsh] specified shell. │
│ [default: None] │
│ --show-completion [bash|zsh|fish|powershell|p Show completion for the │
│ wsh] specified shell, to copy it │
│ or customize the │
│ installation. │
│ [default: None] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────╯