
The Blis BLAS-like linear algebra library, as a self-contained C-extension.


This repository provides the Blis linear algebra routines as a self-contained Python C-extension.


Overview

You can install the package via pip, optionally specifying your machine’s architecture via an environment variable:

BLIS_ARCH=haswell pip install blis

If you’re using an Intel CPU, the build should autodetect haswell. The other available architectures are sandybridge, carrizo, piledriver, bulldozer, knl and reference. After installation, you can run a small matrix multiplication benchmark:

$ python -m blis.benchmark
Setting up data nO=384 nI=384 batch_size=2000. Running 1000 iterations
Blis...
Total: 11032014.6484
7.35 seconds
Numpy (Openblas)...
Total: 11032016.6016
16.81 seconds
Blis einsum ab,cb->ca
8.10 seconds
Numpy einsum ab,cb->ca
Total: 5510596.19141
83.18 seconds

The low numpy.einsum performance is expected, but the low numpy.dot performance is surprising. Linking numpy against MKL gives better performance:

Numpy (mkl_rt) gemm...
Total: 11032011.71875
5.21 seconds

These figures refer to performance on a Dell XPS 13 i7-7500U. Running the same benchmark on a 2015 MacBook Air gives:

Blis...
Total: 11032014.6484
8.89 seconds
Numpy (Accelerate)...
Total: 11032012.6953
6.68 seconds

Clearly the Dell’s numpy+OpenBLAS performance is the outlier, so it’s likely something has gone wrong in the compilation and architecture detection.
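To check which BLAS backend a given numpy build is using, and to get a rough timing for numpy.dot on its own, something like the following sketch can help. It is not the blis.benchmark implementation: the function name time_numpy_dot and the iteration count are illustrative assumptions, and only the matrix sizes (nO=384, nI=384, batch_size=2000) are taken from the benchmark output above.

```python
import time

import numpy as np


def time_numpy_dot(n_out=384, n_in=384, batch_size=2000, n_iter=10, seed=0):
    """Time repeated float32 matrix products with numpy.dot (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    X = rng.uniform(-1.0, 1.0, (batch_size, n_in)).astype("float32")
    W = rng.uniform(-1.0, 1.0, (n_out, n_in)).astype("float32")
    total = 0.0
    start = time.perf_counter()
    for _ in range(n_iter):
        Y = np.dot(X, W.T)  # (batch_size, n_in) x (n_in, n_out)
        total += float(Y.sum())
    elapsed = time.perf_counter() - start
    return Y.shape, total, elapsed


# Print the BLAS/LAPACK libraries this numpy build was linked against.
np.show_config()

shape, total, elapsed = time_numpy_dot()
print("Output shape: %s, %.2f seconds for 10 iterations" % (shape, elapsed))
```

If the reported library is not the one you expected, the timing difference is more likely a build or linking issue than a property of the BLAS implementation itself.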

Usage

Two APIs are provided: a high-level Python API, and direct Cython access. The best part of the Python API is the einsum function, which works like numpy’s, but with some restrictions that allow a direct mapping to Blis routines. Example usage:

from blis.py import einsum
from numpy import ndarray, zeros

dim_a = 500
dim_b = 128
dim_c = 300
arr1 = ndarray((dim_a, dim_b))
arr2 = ndarray((dim_b, dim_c))
out = zeros((dim_a, dim_c))

einsum('ab,bc->ac', arr1, arr2, out=out)
# Change dimension order of output
out = einsum('ab,bc->ca', arr1, arr2)
assert out.shape == (dim_c, dim_a)
# Matrix vector product, with transposed output
arr2 = ndarray((dim_b,))
out = einsum('ab,b->ba', arr1, arr2)
assert out.shape == (dim_b, dim_a)
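As a reference point, the same three equations can be run through numpy.einsum to see which output shape each one implies: the axes of the result follow the letters after the arrow. This sketch uses only numpy, not blis, and the variable names are chosen for illustration.

```python
import numpy as np

dim_a, dim_b, dim_c = 500, 128, 300
arr1 = np.zeros((dim_a, dim_b))
arr2 = np.zeros((dim_b, dim_c))
vec = np.zeros((dim_b,))

# Map each equation to the shape numpy.einsum produces for it.
shapes = {
    "ab,bc->ac": np.einsum("ab,bc->ac", arr1, arr2).shape,  # (500, 300)
    "ab,bc->ca": np.einsum("ab,bc->ca", arr1, arr2).shape,  # (300, 500)
    "ab,b->ba": np.einsum("ab,b->ba", arr1, vec).shape,     # (128, 500)
}

for equation, shape in shapes.items():
    print(equation, shape)
```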

The Einstein summation format is really awesome, so it’s always been disappointing that it’s so much slower than equivalent calls to tensordot in numpy. The blis.einsum function gives up the numpy version’s generality, so that calls can be easily mapped to Blis:

  • Only two input tensors

  • Maximum two dimensions

  • Dimensions must be labelled a, b and c

  • The first argument’s dimensions must be 'a' (for 1d inputs) or 'ab' (for 2d inputs).

With these restrictions, there are only 15 valid combinations, which correspond to all the things you would otherwise do with the gemm, gemv, ger and axpy functions. You can therefore forget about all the other functions and just use einsum. Here are the valid einsum strings, the calls they correspond to, and the numpy equivalents:

Equation      Maps to                                   Numpy

'a,a->a'      axpy(A, B)                                A+B
'a,b->ab'     ger(A, B)                                 outer(A, B)
'a,b->ba'     ger(B, A)                                 outer(B, A)
'ab,a->ab'    batch_axpy(A, B)                          A*B
'ab,a->ba'    batch_axpy(A, B, trans1=True)             (A*B).T
'ab,b->a'     gemv(A, B)                                dot(A, B)
'ab,a->b'     gemv(A, B, trans1=True)                   dot(A.T, B)
'ab,ac->bc'   gemm(A, B, trans1=True, trans2=False)     dot(A.T, B)
'ab,ac->cb'   gemm(B, A, trans1=True, trans2=True)      dot(B.T, A)
'ab,bc->ac'   gemm(A, B, trans1=False, trans2=False)    dot(A, B)
'ab,bc->ca'   gemm(B, A, trans1=False, trans2=True)     dot(B.T, A.T)
'ab,ca->bc'   gemm(A, B, trans1=True, trans2=True)      dot(A.T, B.T)
'ab,ca->cb'   gemm(B, A, trans1=False, trans2=False)    dot(B, A)
'ab,cb->ac'   gemm(A, B, trans1=False, trans2=True)     dot(A, B.T)
'ab,cb->ca'   gemm(B, A, trans1=False, trans2=True)     dot(B, A.T)
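The numpy equivalents of the eight matrix-matrix equations can be checked against numpy.einsum directly. The sketch below uses only numpy (it does not call blis), small arbitrary dimensions, and a hypothetical helper named check. The dot expressions are derived from the einsum semantics of each equation.

```python
import numpy as np

rng = np.random.RandomState(0)
dims = {"a": 3, "b": 4, "c": 5}  # arbitrary, distinct sizes for each label

# numpy.dot expression for each two-matrix equation.
equivalents = {
    "ab,ac->bc": lambda A, B: np.dot(A.T, B),
    "ab,ac->cb": lambda A, B: np.dot(B.T, A),
    "ab,bc->ac": lambda A, B: np.dot(A, B),
    "ab,bc->ca": lambda A, B: np.dot(B.T, A.T),
    "ab,ca->bc": lambda A, B: np.dot(A.T, B.T),
    "ab,ca->cb": lambda A, B: np.dot(B, A),
    "ab,cb->ac": lambda A, B: np.dot(A, B.T),
    "ab,cb->ca": lambda A, B: np.dot(B, A.T),
}


def check(equation):
    """Compare numpy.einsum with the dot expression for one equation."""
    inputs, _ = equation.split("->")
    labels1, labels2 = inputs.split(",")
    A = rng.uniform(-1.0, 1.0, tuple(dims[x] for x in labels1))
    B = rng.uniform(-1.0, 1.0, tuple(dims[x] for x in labels2))
    return bool(np.allclose(np.einsum(equation, A, B), equivalents[equation](A, B)))


results = {equation: check(equation) for equation in equivalents}
for equation, ok in results.items():
    print(equation, "matches" if ok else "MISMATCH")
```

Using distinct sizes for a, b and c means a wrong transpose shows up as a shape error or a mismatch rather than passing by accident.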

We also provide fused-type, nogil Cython bindings to the underlying Blis linear algebra library. Fused types are a simple template mechanism, allowing just a touch of compile-time generic programming:

from libc.stdlib cimport calloc
cimport blis.cy

# nN, nI, nO, nr_b0 and nr_b1 are integer sizes assumed to be defined by the caller.
A = <float*>calloc(nN * nI, sizeof(float))
B = <float*>calloc(nO * nI, sizeof(float))
C = <float*>calloc(nr_b0 * nr_b1, sizeof(float))
blis.cy.gemm(blis.cy.NO_TRANSPOSE, blis.cy.NO_TRANSPOSE,
             nO, nI, nN,
             1.0, A, nI, 1, B, nO, 1,
             1.0, C, nO, 1)
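For readers less familiar with BLAS conventions, a gemm call computes C := alpha * op(A) * op(B) + beta * C, where op optionally transposes its argument. The following numpy sketch illustrates that formula only. The name gemm_reference and its keyword arguments are illustrative assumptions, not part of the blis API, and it ignores the stride arguments passed in the Cython call above.

```python
import numpy as np


def gemm_reference(A, B, C, alpha=1.0, beta=1.0, trans_a=False, trans_b=False):
    """Reference for the BLAS gemm formula: alpha * op(A) @ op(B) + beta * C."""
    op_a = A.T if trans_a else A
    op_b = B.T if trans_b else B
    return alpha * np.dot(op_a, op_b) + beta * C


A = np.arange(6, dtype="float32").reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
B = np.ones((3, 4), dtype="float32")
C = np.ones((2, 4), dtype="float32")

out = gemm_reference(A, B, C)
print(out)
# Each row of A @ B holds the row sum of A (3 and 12); adding C gives 4 and 13.
```

With alpha=1.0 and beta=1.0, as in the Cython call, the product is accumulated into the existing contents of C rather than overwriting them.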

Bindings have been added as we’ve needed them. Please submit pull requests if the library is missing some functions you require.

Development

To build the source package, you should run the following command:

./bin/copy-source-files.sh

This populates the blis/_src folder for the various architectures, using the flame-blis submodule.
