Library for dense and sparse tensors built on the tensor algebra compiler.

Tensora

A fast and easy-to-use dense/sparse tensor library for Python. Evaluate arbitrary tensor expressions with tensors in a variety of sparsity formats.

Tensors are n-dimensional generalizations of matrices. Instead of being confined to 1 or 2 dimensions, tensors may have 3, 4, or more dimensions. They are useful in a variety of applications. NumPy is the best known tensor library in Python; its central ndarray object is an example of a dense tensor.

In a dense tensor, each element is explicitly stored in memory. If the vast majority of elements are zero, then this is an inefficient layout, taking far more memory to store and far more time to operate on. There are many different sparse tensor formats, each one better or worse depending on which elements of the tensor are nonzero.
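
The difference in storage cost can be illustrated with plain Python containers. This is only a sketch of the idea, not how Tensora or taco store tensors internally; the names dense and sparse are illustrative.

```python
# Dense storage: every element, including the zeros, is stored explicitly.
dense = [
    [0.0, -2.0, 0.0],
    [2.0, 0.0, 4.0],
]

# Sparse storage (a dictionary of keys): only the nonzero elements are stored,
# each one keyed by its (row, column) coordinate.
sparse = {
    (row, col): value
    for row, row_values in enumerate(dense)
    for col, value in enumerate(row_values)
    if value != 0.0
}

dense_count = sum(len(row_values) for row_values in dense)
sparse_count = len(sparse)
print(dense_count, sparse_count)  # 6 3
```

For a matrix this small the saving is negligible, but the stored count of the sparse layout grows with the number of nonzeros rather than with the product of the dimensions.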

Many tensor kernels (functions that perform a specific algebraic calculation between tensors with specific sparse formats) have been written to solve specific problems. Recently, the Tensor Algebra Compiler (taco) was invented to automate the construction and optimization of tensor kernels for arbitrary algebraic expressions for arbitrary sparse formats. taco takes an algebraic expression and description of the format of each tensor in the expression and returns a C function that efficiently evaluates the given expression for those tensor arguments.

Tensora is a Python wrapper around taco. Tensora has a central class Tensor that simply has a pointer to a taco tensor held in C memory, managed by the cffi package. Tensora exposes functions that take a string of an algebraic expression and return a Python function that performs that operation in fast C code. In order to do that, the string is parsed and passed to taco; the C code generated by taco is compiled "on the fly" by cffi and then wrapped by code that provides good error handling.

This package is highly experimental. Do not trust the results without independently verifying the output for any particular problem. This is mostly because the underlying taco compiler is itself highly experimental. Much of Tensora's test suite is skipped because of known underlying bugs in the generated kernels. As the research on taco improves, the test suite of Tensora will be expanded and the documentation improved.

Getting started

Tensora can be installed with pip from PyPI:

pip install tensora

The class Tensor and the function evaluate together provide the porcelain interface for Tensora.

Here is an example of multiplying a sparse matrix in CSR format with a dense vector:

from tensora import Tensor, evaluate

elements = {
    (1,0): 2.0,
    (0,1): -2.0,
    (1,2): 4.0, 
}

A = Tensor.from_dok(elements, dimensions=(2,3), format='ds')
x = Tensor.from_lol([0, -1, 2])

y = evaluate('y(i) = A(i,j) * x(j)', 'd', A=A, x=x)

# y(0) = -2.0 * -1 = 2 and y(1) = 2.0 * 0 + 4.0 * 2 = 8
assert y == Tensor.from_lol([2, 8])
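
Given the package's own advice to verify results independently, the same product can be checked by hand without Tensora. The following is a minimal pure-Python sketch of the sum over j in y(i) = A(i,j) * x(j), using the same elements and vector as the example above.

```python
# Nonzero elements of the 2x3 matrix A, keyed by (i, j).
elements = {
    (1, 0): 2.0,
    (0, 1): -2.0,
    (1, 2): 4.0,
}
x = [0.0, -1.0, 2.0]

# y(i) = sum over j of A(i,j) * x(j); zero elements contribute nothing,
# so it is enough to loop over the stored nonzeros.
y = [0.0, 0.0]
for (i, j), value in elements.items():
    y[i] += value * x[j]

print(y)  # [2.0, 8.0]
```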

Creating Tensors

Creating a Tensor is best done via the Tensor.from_* methods. These methods convert a variety of data types into a Tensor. Most of the conversion methods optionally take both dimensions and format to determine the dimensions and format of the resulting tensor.

from_lol: list of lists

Tensor.from_lol(lol, *, 
                dimensions: Tuple[int, ...] = None, format: Union[Format, str] = None)

Convert a dense list of lists to a Tensor.

  • lol is a list of lists, possibly deeply nested. That is, lol is a float, a List[float], a List[List[float]], etc. to an arbitrary depth of Lists. The values are read in row-major order, meaning the top-level list is the first dimension and the deepest list (the one containing actual scalars) is the last dimension. All lists at the same level must have the same length. For those familiar, this is identical to the NumPy behavior when constructing an array from lists of lists via numpy.array.

  • dimensions has a default value that is inferred from the structure of lol. If provided, it must be consistent with the structure of lol. Providing the dimensions is typically only useful when one or more non-final dimensions may have size zero. For example, Tensor.from_lol([[], []]) has dimensions of (2,0), while Tensor.from_lol([[], []], dimensions=(2,0,3)) has dimensions of (2,0,3).

  • format has a default value of all dense dimensions.
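
The inference of dimensions from the structure of lol can be sketched as a short recursive function. This is an illustration of the rule described above, not Tensora's actual implementation; infer_dimensions is a hypothetical helper name.

```python
def infer_dimensions(lol):
    """Return the dimensions implied by a nested list of scalars."""
    if not isinstance(lol, list):
        return ()  # a bare scalar is a tensor of order 0
    if len(lol) == 0:
        return (0,)  # nothing deeper can be inferred from an empty list
    inner = [infer_dimensions(item) for item in lol]
    if any(dims != inner[0] for dims in inner):
        raise ValueError("all lists at the same level must have the same length")
    return (len(lol),) + inner[0]


print(infer_dimensions([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))  # (2, 3)
print(infer_dimensions([[], []]))  # (2, 0)
```

The second call shows why an explicit dimensions argument is sometimes needed: once a list is empty, any dimensions after it cannot be recovered from the data alone.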

from_dok: dictionary of keys

Tensor.from_dok(dok: Dict[Tuple[int, ...], float], *, 
                dimensions: Tuple[int, ...] = None, format: Union[Format, str] = None)

Convert a dictionary of keys to a Tensor.

  • dok is a Python dictionary where each key is the coordinate of one non-zero value and the value of the entry is the value of the tensor at that coordinate. All coordinates not mentioned are implicitly zero.

  • dimensions has a default value that is the largest size in each dimension found among the coordinates.

  • format defaults to dense for the leading dimensions, as long as the number of nonzeros is larger than the product of those dimensions, and sparse for the remaining dimensions. The default value is subject to change with experience.
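
One plausible reading of the default for dimensions is sketched below. It assumes that the inferred size of each dimension is the largest index seen in that dimension plus one, which is the smallest size that holds every coordinate; that interpretation and the helper name infer_dok_dimensions are assumptions, not taken from Tensora's source.

```python
def infer_dok_dimensions(dok):
    """Smallest dimensions that contain every coordinate in dok (assumed rule)."""
    coordinates = list(dok.keys())
    if not coordinates:
        raise ValueError("cannot infer dimensions from an empty dictionary")
    order = len(coordinates[0])
    # Indexes are zero-based, so the size is the largest index plus one.
    return tuple(max(c[k] for c in coordinates) + 1 for k in range(order))


elements = {(1, 0): 2.0, (0, 1): -2.0, (1, 2): 4.0}
print(infer_dok_dimensions(elements))  # (2, 3)
```

Passing dimensions explicitly is still needed whenever the tensor has trailing rows or columns that contain no nonzeros.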

from_aos: array of structs

Tensor.from_aos(aos: Iterable[Tuple[int, ...]], values: Iterable[float], *, 
                dimensions: Tuple[int, ...] = None, format: Union[Format, str] = None)

Convert a list of coordinates and a corresponding list of values to a Tensor.

  • aos is an iterable of the coordinates of the non-zero values.

  • values must be the same length as aos and each value is the non-zero value at the corresponding coordinate.

  • dimensions has the same default as Tensor.from_dok, the largest size in each dimension.

  • format has the same default as Tensor.from_dok, dense for as many dimensions as needed to fit the non-zeros.
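
The array-of-structs layout carries the same information as a dictionary of keys. A minimal sketch of the correspondence, in plain Python and independent of Tensora:

```python
# One coordinate tuple per nonzero, with the values in a parallel list.
aos = [(1, 0), (0, 1), (1, 2)]
values = [2.0, -2.0, 4.0]

if len(aos) != len(values):
    raise ValueError("aos and values must be the same length")

# Pairing each coordinate with its value gives the dictionary-of-keys form.
dok = dict(zip(aos, values))
print(dok)  # {(1, 0): 2.0, (0, 1): -2.0, (1, 2): 4.0}
```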

from_soa: struct of arrays

Tensor.from_soa(soa: Tuple[Iterable[int], ...], values: Iterable[float], *, 
                dimensions: Tuple[int, ...] = None, format: Union[Format, str] = None)

Convert lists of indexes for each dimension and a corresponding list of values to a Tensor.

  • soa is a tuple of iterables, where each iterable is all the indexes of the corresponding dimension. All iterables must be the same length.

  • values must be the same length as the iterables in soa and each value is the non-zero value at the corresponding coordinate.

  • dimensions has the same default as Tensor.from_dok, the largest size in each dimension.

  • format has the same default as Tensor.from_dok, dense for as many dimensions as needed to fit the non-zeros.
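
The struct-of-arrays layout is the transpose of the array-of-structs layout: one sequence of indexes per dimension instead of one tuple per nonzero. A short sketch of the relationship, using only the Python standard library:

```python
# One sequence of indexes per dimension; position n in each sequence
# belongs to the n-th nonzero.
soa = ([1, 0, 1], [0, 1, 2])
values = [2.0, -2.0, 4.0]

# zip(*soa) regroups the indexes into one coordinate tuple per nonzero.
aos = list(zip(*soa))
print(aos)  # [(1, 0), (0, 1), (1, 2)]

# zip(*aos) goes back the other way.
round_trip = tuple(list(indexes) for indexes in zip(*aos))
print(round_trip == soa)  # True
```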

from_numpy: convert a NumPy array

Tensor.from_numpy(array: numpy.ndarray, *, 
                  format: Union[Format, str] = None)

Convert a NumPy array to a Tensor.

  • array is any numpy.ndarray. The resulting Tensor will have the same order, dimensions, and values as this array.

  • format has a default value of all dense dimensions.

from_scipy_sparse: convert a SciPy sparse matrix

Tensor.from_scipy_sparse(data: scipy.sparse.spmatrix, *, 
                         format: Union[Format, str] = None)

Convert a SciPy sparse matrix to a Tensor.

  • data is any scipy.sparse.spmatrix. The resulting Tensor will have the same order, dimensions, and values as this matrix. The tensor will always have order 2.

  • format has a default value of ds for csr_matrix, d1s0 for csc_matrix, and ds for the other sparse matrix types, though that is subject to change as taco adds new format mode types.
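
The ds format (a dense first dimension followed by a sparse second dimension) corresponds to the conventional CSR layout. The sketch below builds the three usual CSR arrays from a dictionary of keys in plain Python. The names dok_to_csr, row_pointers, and column_indexes are illustrative; this is not a description of how taco lays out memory.

```python
def dok_to_csr(dok, dimensions):
    """Build conventional CSR arrays from a dictionary of (row, col) keys."""
    n_rows, _ = dimensions
    row_pointers = [0]
    column_indexes = []
    values = []
    for row in range(n_rows):
        # Collect the nonzeros of this row, ordered by column.
        entries = sorted((col, v) for (r, col), v in dok.items() if r == row)
        for col, v in entries:
            column_indexes.append(col)
            values.append(v)
        # Each row ends where the next one begins.
        row_pointers.append(len(values))
    return row_pointers, column_indexes, values


elements = {(1, 0): 2.0, (0, 1): -2.0, (1, 2): 4.0}
print(dok_to_csr(elements, (2, 3)))
# ([0, 1, 3], [1, 0, 2], [-2.0, 2.0, 4.0])
```

Every row gets an entry in row_pointers whether or not it has nonzeros (the dense dimension), while only the nonzero columns are stored within each row (the sparse dimension).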

Evaluating expressions

taco generates kernels for algebraic expressions of tensors. Tensora wraps this process with the evaluate function.

evaluate(assignment: str, output_format: str, **inputs: Tensor)

  • assignment is parsable as an algebraic tensor assignment.

  • output_format is the desired format of the output tensor.

  • inputs is all the inputs to the expression. There must be one named argument for each variable name in assignment. The dimensions of the tensors in inputs must be consistent with assignment and with each other.
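
The consistency requirement on inputs can be illustrated with a simplified check: every index name must refer to the same size wherever it appears. The sketch below uses a small regular expression rather than Tensora's real parser, handles only plain name(index, ...) terms, and takes the dimensions as a plain dictionary; index_sizes is a hypothetical helper, not part of Tensora.

```python
import re


def index_sizes(assignment, dimensions):
    """Map each index name on the right-hand side to its size."""
    _, right = assignment.split('=', 1)
    sizes = {}
    for name, indexes in re.findall(r'(\w+)\(([^)]*)\)', right):
        index_names = [s.strip() for s in indexes.split(',')]
        dims = dimensions[name]
        if len(index_names) != len(dims):
            raise ValueError(f"{name} has order {len(dims)}, not {len(index_names)}")
        for index, size in zip(index_names, dims):
            # setdefault records the first size seen and returns it afterwards.
            if sizes.setdefault(index, size) != size:
                raise ValueError(f"index {index} has inconsistent sizes")
    return sizes


sizes = index_sizes('y(i) = A(i,j) * x(j)', {'A': (2, 3), 'x': (3,)})
print(sizes)  # {'i': 2, 'j': 3}
```

Here j has size 3 in both A and x, so the inputs are consistent, and the output y takes its single dimension from i.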

Files for tensora, version 0.0.2
tensora-0.0.2.tar.gz (449.5 kB), source distribution
