The Blis BLAS-like linear algebra library, as a self-contained C-extension.

Project description

This repository provides the Blis linear algebra routines as a self-contained Python C-extension.

Currently, we only support single-threaded execution, as this is actually best for our workloads (ML inference).

Overview

You can install the package via pip:

pip install blis

Wheels should be available, so installation should be fast. If you want to install from source and you’re on Windows, you’ll need to install LLVM.

After installation, run a small matrix multiplication benchmark:

$ export OMP_NUM_THREADS=1 # Tell Numpy to only use one thread.
$ python -m blis.benchmark
Setting up data nO=384 nI=384 batch_size=2000. Running 1000 iterations
Blis...
Total: 11032014.6484
7.35 seconds
Numpy (Openblas)...
Total: 11032016.6016
16.81 seconds
Blis einsum ab,cb->ca
8.10 seconds
Numpy einsum ab,cb->ca
Total: 5510596.19141
83.18 seconds

The low numpy.einsum performance is expected, but the low numpy.dot performance is surprising. Linking numpy against MKL gives better performance:

Numpy (mkl_rt) gemm...
Total: 11032011.71875
5.21 seconds

These figures refer to performance on a Dell XPS 13 i7-7500U. Running the same benchmark on a 2015 Macbook Air gives:

Blis...
Total: 11032014.6484
8.89 seconds
Numpy (Accelerate)...
Total: 11032012.6953
6.68 seconds

Clearly the Dell’s numpy+OpenBLAS performance is the outlier, so it’s likely something has gone wrong in the compilation and architecture detection.
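
To check whether a given numpy build shows the same gap, the comparison can be roughly reproduced with numpy alone. The following is a minimal sketch, not the code behind `python -m blis.benchmark`: the helper names `time_call` and `compare` are made up for illustration, and the data layout (a batch of inputs labelled 'ab' and a weight matrix labelled 'cb') is an assumption based on the einsum string in the output above.

```python
import os

# Assumption: the BLAS behind NumPy reads OMP_NUM_THREADS when it is loaded,
# so the variable is set before importing numpy.
os.environ.setdefault("OMP_NUM_THREADS", "1")

import time

import numpy as np


def time_call(fn, iterations):
    """Run fn() repeatedly; return (sum of all outputs, elapsed seconds)."""
    start = time.perf_counter()
    total = 0.0
    for _ in range(iterations):
        total += float(fn().sum())
    return total, time.perf_counter() - start


def compare(n_out=384, n_in=384, batch_size=2000, iterations=10):
    """Time numpy.dot against numpy.einsum for the 'ab,cb->ca' product."""
    rng = np.random.default_rng(0)
    X = rng.standard_normal((batch_size, n_in)).astype("float32")  # 'ab'
    W = rng.standard_normal((n_out, n_in)).astype("float32")       # 'cb'
    return {
        "numpy.dot": time_call(lambda: np.dot(W, X.T), iterations),
        "numpy.einsum": time_call(
            lambda: np.einsum("ab,cb->ca", X, W), iterations
        ),
    }


if __name__ == "__main__":
    for name, (total, seconds) in compare().items():
        print(f"{name}: total={total:.4f} {seconds:.2f} seconds")
```

Calling `numpy.show_config()` prints which BLAS and LAPACK libraries numpy was built against, which helps when interpreting the timings.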

Usage

Two APIs are provided: a high-level Python API, and direct Cython access. The best part of the Python API is the einsum function, which works like numpy’s, but with some restrictions that allow a direct mapping to Blis routines. Example usage:

from blis.py import einsum
from numpy import ndarray, zeros

dim_a = 500
dim_b = 128
dim_c = 300
arr1 = ndarray((dim_a, dim_b))
arr2 = ndarray((dim_b, dim_c))
out = zeros((dim_a, dim_c))

einsum('ab,bc->ac', arr1, arr2, out=out)
# Change dimension order of output
out = einsum('ab,bc->ca', arr1, arr2)
assert out.shape == (dim_c, dim_a)
# Matrix vector product, with transposed output
arr2 = ndarray((dim_b,))
out = einsum('ab,b->ba', arr1, arr2)
assert out.shape == (dim_b, dim_a)

The Einstein summation format is really awesome, so it’s always been disappointing that it’s so much slower than equivalent calls to tensordot in numpy. The blis.einsum function gives up the numpy version’s generality, so that calls can be easily mapped to Blis:

  • Only two input tensors

  • Maximum two dimensions

  • Dimensions must be labelled a, b and c

  • The first argument’s dimensions must be 'a' (for 1d inputs) or 'ab' (for 2d inputs).
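
The four restrictions can be written down as a small checker. This is a sketch for illustration only: `check_equation` is a hypothetical helper, not part of the blis package and not the library's parser. It tests only the conditions listed above, which are necessary but not sufficient, so the table of valid strings remains the reference.

```python
def check_equation(equation):
    """Return the list of restrictions that `equation` violates (empty if none)."""
    problems = []
    inputs, arrow, output = equation.partition("->")
    if not arrow:
        problems.append("missing '->'")
    operands = inputs.split(",")
    if len(operands) != 2:
        problems.append("exactly two input tensors are required")
    for labels in operands + [output]:
        if len(labels) > 2:
            problems.append(f"{labels!r} has more than two dimensions")
        if set(labels) - set("abc"):
            problems.append(f"{labels!r} uses labels other than a, b and c")
    if operands[0] not in ("a", "ab"):
        problems.append("the first argument must be 'a' or 'ab'")
    return problems


if __name__ == "__main__":
    for eq in ("ab,bc->ac", "ij,jk->ik", "ba,bc->ac"):
        print(eq, check_equation(eq))
```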

With these restrictions, there are only 15 valid combinations, which correspond to all the things you would otherwise do with the gemm, gemv, ger and axpy functions. You can therefore forget about all the other functions and just use einsum. Here are the valid einsum strings, the calls they correspond to, and the numpy equivalents:

Equation      Maps to                                    Numpy
'a,a->a'      axpy(A, B)                                 A+B
'a,b->ab'     ger(A, B)                                  outer(A, B)
'a,b->ba'     ger(B, A)                                  outer(B, A)
'ab,a->ab'    batch_axpy(A, B)                           A*B
'ab,a->ba'    batch_axpy(A, B, trans1=True)              (A*B).T
'ab,b->a'     gemv(A, B)                                 A*B
'ab,a->b'     gemv(A, B, trans1=True)                    A.T*B
'ab,ac->bc'   gemm(A, B, trans1=True, trans2=False)      dot(A.T, B)
'ab,ac->cb'   gemm(B, A, trans1=True, trans2=False)      dot(B.T, A)
'ab,bc->ac'   gemm(A, B, trans1=False, trans2=False)     dot(A, B)
'ab,bc->ca'   gemm(B, A, trans1=True, trans2=True)       dot(B.T, A.T)
'ab,ca->bc'   gemm(A, B, trans1=True, trans2=True)       dot(A.T, B.T)
'ab,ca->cb'   gemm(B, A, trans1=False, trans2=False)     dot(B, A)
'ab,cb->ac'   gemm(A, B, trans1=False, trans2=True)      dot(A, B.T)
'ab,cb->ca'   gemm(B, A, trans1=False, trans2=True)      dot(B, A.T)
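
The ger, gemv and gemm rows can be sanity-checked without blis by comparing numpy's own einsum against a plain outer or dot expression. The sketch below does that; the expressions in `references` are written out from the index labels (with the matrix-vector products spelled as dot calls), and the names `dims`, `operand` and `check_all` are made up for illustration. The axpy rows are left out.

```python
import numpy as np

# Small, distinct sizes so that a wrong transpose shows up as a shape error.
dims = {"a": 5, "b": 3, "c": 4}
rng = np.random.default_rng(0)


def operand(labels):
    """Random array whose shape follows the dimension labels, e.g. 'cb'."""
    return rng.standard_normal(tuple(dims[label] for label in labels))


references = {
    "a,b->ab": lambda A, B: np.outer(A, B),
    "a,b->ba": lambda A, B: np.outer(B, A),
    "ab,b->a": lambda A, B: np.dot(A, B),
    "ab,a->b": lambda A, B: np.dot(A.T, B),
    "ab,ac->bc": lambda A, B: np.dot(A.T, B),
    "ab,ac->cb": lambda A, B: np.dot(B.T, A),
    "ab,bc->ac": lambda A, B: np.dot(A, B),
    "ab,bc->ca": lambda A, B: np.dot(B.T, A.T),
    "ab,ca->bc": lambda A, B: np.dot(A.T, B.T),
    "ab,ca->cb": lambda A, B: np.dot(B, A),
    "ab,cb->ac": lambda A, B: np.dot(A, B.T),
    "ab,cb->ca": lambda A, B: np.dot(B, A.T),
}


def check_all():
    """Map each equation to True if numpy.einsum agrees with its reference."""
    results = {}
    for equation, reference in references.items():
        first, second = equation.split("->")[0].split(",")
        A, B = operand(first), operand(second)
        results[equation] = bool(
            np.allclose(np.einsum(equation, A, B), reference(A, B))
        )
    return results


if __name__ == "__main__":
    for equation, ok in check_all().items():
        print(equation, "ok" if ok else "MISMATCH")
```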

We also provide fused-type, nogil Cython bindings to the underlying Blis linear algebra library. Fused types are a simple template mechanism, allowing just a touch of compile-time generic programming:

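from libc.stdlib cimport calloc
cimport blis.cy

# nN, nI, nO, nr_b0 and nr_b1 are dimensions defined elsewhere.
A = <float*>calloc(nN * nI, sizeof(float))
B = <float*>calloc(nO * nI, sizeof(float))
C = <float*>calloc(nr_b0 * nr_b1, sizeof(float))
# Each pointer is followed by its row stride and column stride.
blis.cy.gemm(blis.cy.NO_TRANSPOSE, blis.cy.NO_TRANSPOSE,
             nO, nI, nN,
             1.0, A, nI, 1, B, nO, 1,
             1.0, C, nO, 1)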
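
To make the argument order easier to read, here is a numpy reference model of a gemm call on flat buffers. It is a sketch under one assumption: that the two integers after each pointer are the row stride and column stride, counted in elements, as in BLIS's typed API. `strided_view` and `reference_gemm` are hypothetical names, not functions of the blis package.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def strided_view(buf, rows, cols, row_stride, col_stride):
    """View a flat buffer as a rows x cols matrix with element strides."""
    item = buf.itemsize
    return as_strided(
        buf, shape=(rows, cols), strides=(row_stride * item, col_stride * item)
    )


def reference_gemm(m, n, k, alpha, a, rsa, csa, b, rsb, csb, beta, c, rsc, csc):
    """Compute C = alpha * (A @ B) + beta * C in place, without transposes.

    A is m x k, B is k x n and C is m x n; each is a flat buffer that is
    addressed through its row and column stride.
    """
    A = strided_view(a, m, k, rsa, csa)
    B = strided_view(b, k, n, rsb, csb)
    C = strided_view(c, m, n, rsc, csc)
    # The right-hand side is fully evaluated before being written back to c.
    C[...] = alpha * (A @ B) + beta * C


if __name__ == "__main__":
    # Row-major buffers: the row stride is the number of columns.
    a = np.arange(8, dtype="float32")   # 2 x 4
    b = np.arange(12, dtype="float32")  # 4 x 3
    c = np.ones(6, dtype="float32")     # 2 x 3
    reference_gemm(2, 3, 4, 1.0, a, 4, 1, b, 3, 1, 1.0, c, 3, 1)
    print(c.reshape(2, 3))
```

With a row stride of 1 and a column stride equal to the number of rows, the same function reads the buffers as column-major, which is why a single pair of strides per matrix covers both layouts.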

Bindings have been added as we’ve needed them. Please submit pull requests if the library is missing some functions you require.

Development

To build the source package, you should run the following command:

./bin/copy-source-files.sh

This populates the blis/_src folder for the various architectures, using the flame-blis submodule.

Download files

Source Distribution

  • blis-0.2.0.dev0.tar.gz (1.5 MB): Source

Built Distributions

  • blis-0.2.0.dev0-cp37-cp37m-manylinux1_x86_64.whl (3.2 MB): CPython 3.7m

  • blis-0.2.0.dev0-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (3.0 MB): CPython 3.7m; macOS 10.10+ Intel (x86-64, i386); macOS 10.10+ x86-64; macOS 10.6+ Intel (x86-64, i386); macOS 10.9+ Intel (x86-64, i386); macOS 10.9+ x86-64

  • blis-0.2.0.dev0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (3.0 MB): CPython 3.6m; macOS 10.10+ Intel (x86-64, i386); macOS 10.10+ x86-64; macOS 10.6+ Intel (x86-64, i386); macOS 10.9+ Intel (x86-64, i386); macOS 10.9+ x86-64

  • blis-0.2.0.dev0-cp35-cp35m-manylinux1_x86_64.whl (3.2 MB): CPython 3.5m

  • blis-0.2.0.dev0-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (3.0 MB): CPython 3.5m; macOS 10.10+ Intel (x86-64, i386); macOS 10.10+ x86-64; macOS 10.6+ Intel (x86-64, i386); macOS 10.9+ Intel (x86-64, i386); macOS 10.9+ x86-64

  • blis-0.2.0.dev0-cp27-cp27mu-manylinux1_x86_64.whl (3.2 MB): CPython 2.7mu

  • blis-0.2.0.dev0-cp27-cp27m-manylinux1_x86_64.whl (3.2 MB): CPython 2.7m

  • blis-0.2.0.dev0-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (3.0 MB): CPython 2.7m; macOS 10.10+ Intel (x86-64, i386); macOS 10.10+ x86-64; macOS 10.6+ Intel (x86-64, i386); macOS 10.9+ Intel (x86-64, i386); macOS 10.9+ x86-64
